algorithms – Optimal order to traverse a grid with caching

Let’s say I want to evaluate a two-argument function using all possible combinations of elements from sets A and B as arguments.

For example, if the function is addition, we get this grid:

       A

     1 2 3 
   1 2 3 4
B  2 3 4 5
   3 4 5 6

Let’s also say that retrieving an element from a set is has a fixed cost, but we can remember up to N elements at a time.

Is there a known optimal order in which to consider the elements, in terms of cost?

For example, we could start with the first element of A, then consider every element of B. If the size of B is much larger than N then every new element of B will be a “cache” miss, making this a not very efficien strategy.