This is something I worked on a year ago, so I'll keep it (relatively) brief.


There's a keypad on my apartment building which accepts 5-digit codes. One day on the way in, I started thinking about how long it would take to guess a working code by brute force. The most obvious answer is that with five digits, you have %10^5 = 100,000% possible codes; since each one of these is 5 digits long, you're looking at %500,000% button pushes to guarantee finding an arbitrary working code.


But then I realized you could be a little smarter about it. If you assume the keypad only keeps track of the most recent five digits, you can take a rolling approach that cuts down your workload. For example, say you press 12345 67890. If the keypad works as described, you're not just checking two codes there; you're checking six: 12345, 23456, 34567, 45678, 56789, 67890.
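To make the rolling idea concrete, here's a minimal sketch that lists every code a keypad would test for a given run of button presses. It assumes the keypad behaves as described (it only remembers the last five digits); `windows` is just a name made up for illustration.

```python
def windows(presses: str, n: int = 5) -> list[str]:
    """Every n-digit code tested while `presses` is typed, in order."""
    return [presses[i:i + n] for i in range(len(presses) - n + 1)]


print(windows("1234567890"))
# ['12345', '23456', '34567', '45678', '56789', '67890']
```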


The next natural question was how many button pushes this could actually save. After doing some work, I satisfied myself that you could cut it down by roughly a factor of %D%, where %D% is the number of digits in a code: a perfect string needs only %10^D + D - 1% pushes. So if you typed the right digits in the right order, instead of the %500,000% from before, you're only looking at %100,004% (one push per code, plus an extra 4 up front to fill out the first code).
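The arithmetic is simple enough to check in a few lines. This is only a sketch of the two counts discussed above; the function names are made up, and `symbols` is the number of distinct buttons (10 for a decimal keypad).

```python
def naive_presses(symbols: int, length: int) -> int:
    """Type every code separately: (number of codes) * (digits per code)."""
    return length * symbols ** length


def rolling_presses(symbols: int, length: int) -> int:
    """One press per code, plus length - 1 presses to fill the first window."""
    return symbols ** length + length - 1


naive = naive_presses(10, 5)
rolling = rolling_presses(10, 5)
print(naive, rolling, round(naive / rolling, 4))
# 500000 100004 4.9998
```

The ratio comes out just shy of 5, which is where the "roughly a factor of D" comes from.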


The obvious follow-up: how do you actually come up with that string of digits? It has to be perfect, in that it contains every possible 5-digit code without repeating any of them. As with almost any exploratory problem, the best approach is to simplify as much as possible. For instance, consider a binary code three digits long, which only has %2^3 = 8% different codes: %{000, 001, 010, 011, 100, 101, 110, 111}%.

My formula suggested you should be able to hit all eight of those in a %2^3 + 3 - 1 = 10% digit string, and it's easy enough to put one together by a little trial and error: 0 0 0 1 1 1 0 1 0 0. I found it was easiest to treat these strings as cyclical, though, which lets you drop the last two digits: in 0 0 0 1 1 1 0 1, the 0 1 at the end wraps around to the start to give you 0 1 0 and 1 0 0. As a bonus, any rotation of the cyclic string will work just as well.
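If you don't trust the trial and error, the cyclic claim is easy to check mechanically. The sketch below (the helper name `covers_all_cyclic` is made up) wraps the string around by gluing its first `n - 1` digits onto the end, then compares the windows it finds against the full list of codes.

```python
from itertools import product


def covers_all_cyclic(cycle: str, alphabet: str, n: int) -> bool:
    """True if every n-digit code over `alphabet` appears exactly once
    when `cycle` is read as a loop."""
    unrolled = cycle + cycle[:n - 1]  # wrap the start around to the end
    seen = [unrolled[i:i + n] for i in range(len(cycle))]
    wanted = {"".join(p) for p in product(alphabet, repeat=n)}
    return len(seen) == len(wanted) and set(seen) == wanted


print(covers_all_cyclic("00011101", "01", 3))  # True
print(covers_all_cyclic("11101000", "01", 3))  # True (a rotation of the same loop)
print(covers_all_cyclic("00011100", "01", 3))  # False (000 and 001 repeat; 010 and 101 are missing)
```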


As I scaled the problem up, however, a couple of things became clear. First, above a certain size, you start getting multiple viable optimal strings that are not simple transformations of one another. Second, finding an elegant way to generate these strings was not turning out to be easy.

I found one mechanical way of generating a valid string, but I didn't love it. You list all the combinations you have to cover, then slot each one into a buffer greedily, meaning at the earliest spot where it can fit (potentially overlapping what's already there), and it works out.

[Figure: mechanical approach]
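For a feel of what a mechanical, greedy construction looks like in code, here's a sketch of a different but well-known one, often called "prefer-one" in the binary case. To be clear, this is a stand-in and not the buffer-slotting procedure described above: it starts with `n` copies of the smallest digit, then keeps appending the largest digit that doesn't recreate a window it has already seen, and stops when no digit works. The name `prefer_largest` is made up for illustration.

```python
def prefer_largest(alphabet: str, n: int) -> str:
    """Greedy stand-in: never repeat an n-digit window, always try the
    largest digit first. Returns the unrolled (non-cyclic) string."""
    digits = sorted(alphabet, reverse=True)
    out = [min(alphabet)] * n          # start with n copies of the smallest digit
    seen = {"".join(out)}
    while True:
        tail = "".join(out[len(out) - (n - 1):])  # last n - 1 digits typed
        for d in digits:
            if tail + d not in seen:   # this press would test a new code
                seen.add(tail + d)
                out.append(d)
                break
        else:
            return "".join(out)        # every possible next press repeats a code


print(prefer_largest("01", 3))
# 0001110100
```

For the three-digit binary case this lands on the same loop as the hand-built 0 0 0 1 1 1 0 1, unrolled to ten digits.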


At some point I realized the generative process could be viewed as a directed graph: each node represents one N-length code, and its successors are the codes you can reach by typing one more digit. After a few attempts, I got a pretty clear-looking one down (the node labels 0-7 are standing in for their binary counterparts):

[Figure: directed graph]

As it turns out, you can start on any node, and if you trace a Hamiltonian cycle (touching each vertex exactly once and ending back at the start), the digits you pick up along the way form a valid optimal string. This approach also scales with the parameters of the problem, but requires a messier or multi-dimensional graph.
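Here's a rough sketch of that graph walk as a plain backtracking search. It assumes the graph described above (one node per code, with an edge to each code you get by dropping the first digit and typing a new one), and the function names are made up. Backtracking is fine for toy cases like the three-digit binary one, but it recurses once per node, so don't expect it to handle the full five-digit keypad as written.

```python
from itertools import product


def hamiltonian_cycle(alphabet: str, n: int) -> list[str]:
    """Find a cycle that visits every n-digit code exactly once."""
    nodes = ["".join(p) for p in product(alphabet, repeat=n)]
    start = nodes[0]
    path, visited = [start], {start}

    def extend() -> bool:
        if len(path) == len(nodes):
            # All nodes used: is there an edge from the last one back to start?
            return path[-1][1:] == start[:-1]
        for d in alphabet:
            nxt = path[-1][1:] + d     # drop the oldest digit, type a new one
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                if extend():
                    return True
                visited.discard(nxt)   # dead end: undo and try the next digit
                path.pop()
        return False

    if not extend():
        raise ValueError("no Hamiltonian cycle found")
    return path


def cycle_to_string(path: list[str]) -> str:
    """Each step of the walk contributes the one digit that was typed."""
    return "".join(node[-1] for node in path)


walk = hamiltonian_cycle("01", 3)
print(walk)
print(cycle_to_string(walk))
```

The result is a cyclic string, so it needs the same wrap-around treatment as before if you want to type it on a real keypad.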


Whenever I stumble into open-ended problems like this, I avoid Googling for the answer at first, because what's the fun in that? I was pretty sure this would be a solved problem, though, and after spending a while working this through myself, I looked for and found Wikipedia's page on De Bruijn sequences. As usual, I was beaten to the punch by somebody a hundred years earlier. Hilariously, however, in this case the results matched up better than expected. Check out the Wiki page, notably a) the graph on the right, and b) the first line of the section "Uses".


If you want to see the wild scratch pad which brought me from point A to point B, by all means, enjoy.