An interesting interview question

I ran across a problem today that I think would make an excellent interview question.

You work in a warehouse that has N storage compartments, labeled D-1 through D-N. Each storage compartment can hold one shipping crate. There’s also a temporary holding area, called D-T, that can hold a single crate. There also are N crates, labeled C-1 through C-N, each of which is in a storage compartment.

There is a forklift that can pick up a single crate, move it to an empty position, and put it down.

The owner, after touring the facility, decreed that the crates must be rearranged so that crate C-x is in space D-x (C-1 in D-1, C-2 in D-2, … C-N in D-N).

What algorithm would you employ to rearrange the crates in the minimum number of moves?

This is a fairly simple problem. If you pose it as a “balls and bins” problem, with a physical model to play with, most people will develop the optimum algorithm in a matter of minutes. But pose it as a programming problem and a lot of programmers have trouble coming up with the same algorithm.

If you’re interested in trying the problem yourself, cut five squares (crates) from a piece of paper, and label them C1 through C5. Draw six boxes on another sheet of paper, and label them D1 through D6. Now, arrange the crates on the spaces in this order: C4, C1, C5, C3, C2. Leave D6 empty. Finally, rearrange them. Remember, if you pick up a card you have to place it in an empty spot before you can pick up another card.

So if this problem is so simple, why is it such an interesting interview question and why do programmers have trouble with it?

One reason is that it’s not really a sorting problem. At least not in the traditional sense. You see, I’m not asking the candidate to write a sort. In fact, I’m not asking him to write a program at all. I’m asking him to develop an algorithm that will rearrange the crates in a specific order, given the constraints. The candidate’s first reaction tells me a lot about how he approaches a problem. Granted, I have to take into account that the candidate will expect a whiteboard programming problem, but if his immediate reaction is to stand up and start writing some kind of sort method, I can pretty much guarantee that he’s going down the wrong track.

The algorithm that uses the fewest moves is not one that you would typically write. It’s either computationally expensive, or it uses additional extra memory. And there’s a well known efficient solution to sorting sequential items. That algorithm works well in the computer, but is less than optimum when translated to the physical world where it takes time for a forklift to pick up and move an item.

The well known algorithm starts at the first storage compartment, D-1. If the item in that position is not crate C-1, it swaps the crate with the crate in the position that it needs to be in. It continues making swaps until C-1 is in D-1, and then moves on to compartment D-2 and does the same thing. So, for example, if you have five crates that are arranged in the order [4,1,5,3,2,x] (the x is for the empty spot, initially D-T) the sequence of swaps is:

    Swap 4 with 3 (the item that's in position D-4), giving [3,1,5,4,2,x]
    Swap 3 with 5, giving [5,1,3,4,2,x]
    Swap 5 with 2, giving [2,1,3,4,5,x]
    Swap 2 with 1, giving [1,2,3,4,5,x]

At which point the programmer says, “Done!”

The programmer who develops this solution and calls it done might think again if he re-reads the problem statement. Note that nowhere did the algorithm use the temporary location, D-T. Either the programmer will smell a rat, or he’ll dismiss that temporary location as a red herring.

As it turns out, it’s not a red herring. A swap is actually a sequence of three separate operations. Swapping the contents of D-1 and D-4, for example, results in:

  1. Move the contents of D-1 to a temporary location.
  2. Move the contents of D-4 to D-1.
  3. Move the contents of the temporary location to D-1.

The well known solution to the problem–the solution that is in many ways optimum on the computer–results in 12 individual moves:

  1. Move crate C-4 from D-1 to D-T.
  2. Move crate C-3 from D-4 to D-1.
  3. Move crate C-4 from D-T to D-4. (Crate C-4 is in position.)
  4. Move crate C-3 from D-1 to D-T.
  5. Move crate C-5 from D-3 to D-1.
  6. Move crate C-3 from D-T to D-3. (Crate C-3 is in position.)
  7. Move crate C-5 from D-1 to D-T.
  8. Move crate C-2 from D-5 to D-1.
  9. Move crate C-5 from D-T to D-5. (Crate C-5 is in position.)
  10. Move crate C-2 from D-1 to D-T.
  11. Move crate C-1 from D-2 to D-1. (Crate C-1 is in position.)
  12. Move crate C-2 from D-T to D-2. (Crate C-2 is in position.)

In the worst case, of which this is an example, the algorithm makes N-1 swaps, or 3N-3 moves.

At this point, it might be useful to have a discussion about what problem we’re really trying to solve. The problem asked for the minimum number of moves. A reminder that we’re talking about a physical process might be in order. To programmers who are used to things happening almost instantly, a few extra moves here and there don’t really make a difference. The solution presented above is asymptotically optimum in that the number of moves required is linearly proportional to the number of crates. That’s typically a good enough answer for a programming question. But, again, this isn’t really a programming question. One can assume that the client wants to make the minimum number of moves because he has to pay a forklift operator $15 per hour and it takes an average of five minutes for every move. So if he can reduce the number of moves from 12 to 6, he saves $30.

The solution that most people presented with the balls and bins problem develop does things a bit differently. Rather than starting by moving crate C-4 to its rightful place, the algorithm starts by emptying the first bin (i.e. moving C-4 to D-T), and then filling the empty location with the item that belongs there. Then, it fills the new empty slot with the element that belongs there, etc. Let’s see how it works, again starting with [4,1,5,3,2,x].

  1. Move crate C-4 from D-1 to D-T.
  2. Move crate C-1 from D-2 to D-1. (Crate C-1 is in position.)
  3. Move crate C-2 from D-5 to D-2. (Crate C-2 is in position.)
  4. Move crate C-5 from D-3 to D-5. (Crate C-5 is in position.)
  5. Move crate C-3 from D-4 to D-3. (Crate C-3 is in position.)
  6. Move crate C-4 from D-T to D-4. (crate C-4 is in position.)

That took six moves, or half of what the “optimum” solution took. That’s not quite a fair comparison, though, because that’s not the worst case for this algorithm. The worst case occurs when moves result in a cycle that leaves D-T empty before all of the items are in order. Consider the sequence [2,1,5,3,4,x]. Using our new algorithm, we start with:

  1. Move crate C-2 from D-1 to D-T.
  2. Move crate C-1 from D-2 to D-1. (Crate C-1 is in position.)
  3. Move crate C-2 from D-T to D-2. (Crate C-2 is in position.)
    At this point, our temporary spot is empty, so we need to make a new empty spot by moving the first out of place crate to D-T.
  4. Move crate C-5 from D-3 to D-T.
  5. Move crate C-3 from D-4 to D-3. (Crate C-3 is in position.)
  6. Move crate C-4 from D-5 to D-4. (Crate C-4 is in position.)
  7. Move crate C-5 from D-T to D-5. (Crate C-5 is in position.)

It’s interesting to note that the swapping algorithm requires three swaps (nine moves) to rearrange items given this initial arrangement.

As far as I’ve been able to determine, the worst case for this algorithm requires 3N/2 moves. It will never make more moves than the swapping algorithm, and often makes fewer.

The point of posing the problem to the candidate is to get him to develop an approach to the problem before he starts coding.

Another interesting thing about this question is that it allows for a follow-on question. Assuming that the candidate develops the optimum algorithm, ask him to write code that, given the starting position, outputs the moves required to rearrange the crates. In other words, implement the algorithm in code.

At this point, he’ll have to make a decision: use sequential search to locate specific crates, or create an index of some kind to hold the original layout so he can quickly look up an individual crate’s location. Again, he should be asking questions because the requirements are deliberately ambiguous. In particular, he might think that he can’t use the index because we only allowed him one extra swapping space for the crates. But again, the original problem statement doesn’t mention a computer program, and the follow-on question doesn’t place any restrictions on the memory consumption.

I’ll leave implementation of the algorithm as an exercise for those who are interested. It’s not difficult, just a bit odd for those who are used to writing traditional sorting algorithms.

Ultimately, the reason I find this problem a potentially useful interview question is because it forces the candidate to differentiate between the customer’s requirements (the optimum sequence of steps required to arrange his crates), and then internal implementation details. The extent to which the candidate can make that differentiation tells me a lot about how she will approach the real-world problems that we’ll present to her on a daily basis. In addition, the simple problem is rife with opportunities to have other discussions about approaches to problems and common coding tradeoffs.

It occurred to me as I was writing this that I should construct a balls and bins prop that I can trot out if the candidate doesn’t trip to the algorithm relatively quickly. It would be interesting to see how quickly the candidate, if given the prop, would discover the optimum algorithm.

I’m kind of excited about getting an opportunity to try this out.