(or, "Yet another way to learn more about the underlying structure of the universe.")

Let us assume.

I was rereading my article on the distribution of computation space and going back over my results, and this time around a new potential implication took root in my imagination.

Let's assume for the moment that Turing-completeness is necessary and sufficient for computation of any general sort, and furthermore, that there are no mechanisms in existence that can compute on a fundamentally higher level. In other words, let us assume that every general-purpose computing device or mechanism is computationally equivalent to any other such device or mechanism.

Let's also assume that my experimental analysis of computation space was more or less correct, at least in the broad strokes. If someone else conducted a similar experiment using a substantially different instruction set or methodology, some of the specifics might turn out differently, but I suspect the more fundamental properties I found would still show up.

These aren't trivial assumptions, but they're at the very least plausible. So we proceed.

Scarcity.

Near the end of my article, I speculated that primes would probably be under-represented in a list of numbers generated by random code and sorted by frequency. This was partly an observation, but mostly a logical argument; by their nature, primes eschew association with simple patterns, and simple patterns are the best you're going to get out of a random program generator. I would not expect primes to start showing up in bulk until we reached the level of complexity where a prime sieve might reasonably form, and that's way past the scope of what we were dealing with so far. (Side note: it would be interesting to figure out the shortest possible prime generating program using that sort of basic instruction set.)

What the article doesn't include is my follow-up analysis on the subject, which indeed showed that primes seem to be relatively less frequent than other integers in their neighborhood. Here's a plot relating the set of integers (absolute values were taken) to the number of times each one appeared in the cumulative output of a billion random micro-programs.

[Plot: Integer Frequency through N=600]
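For the curious, here is a minimal sketch of the general shape of such an experiment. The instruction set below (INC, DEC, DBL, SQR on a single accumulator) is an invented stand-in, not the one behind the plot, so the exact contour it produces will differ; the point is only the procedure: generate random micro-programs, run them, and tally the absolute values of whatever they output.

```python
import random
from collections import Counter

# Toy stand-in for the micro-program experiment. The four-operation accumulator
# machine here is made up for illustration; the original instruction set and
# output model are not reproduced.
OPS = ["INC", "DEC", "DBL", "SQR"]

def run_random_program(length, rng):
    """Run a random straight-line program on one accumulator and return |result|."""
    acc = 0
    for _ in range(length):
        op = rng.choice(OPS)
        if op == "INC":
            acc += 1
        elif op == "DEC":
            acc -= 1
        elif op == "DBL":
            acc *= 2
        else:  # SQR
            acc *= acc
        acc = max(min(acc, 10**9), -10**9)  # clamp so squaring can't run away
    return abs(acc)

def frequency_table(num_programs=100_000, max_len=12, seed=0):
    """Histogram of outputs over many random programs (a far cry from a billion)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_programs):
        counts[run_random_program(rng.randint(1, max_len), rng)] += 1
    return counts

if __name__ == "__main__":
    counts = frequency_table()
    for n in range(1, 65):
        print(n, counts.get(n, 0))
```

Even a toy run like this should reproduce the broad strokes: small numbers dominate, and the cheap operations leave their fingerprints on the histogram.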

There is a lot in this plot that bears discussion, but let's stick to the highlights:

  • It is roughly logarithmic.
  • There are conspicuous spikes (and patterns of spikes), most notably around each %2^n%.
  • The red dots (you may have to click on the plot and zoom in to see them) are primes.

Conspicuous spikes.

If you haven't figured it out yourself, the spikes on the powers of two are a consequence of the binary assembly code in which these micro-programs are written. It's operationally trivial for computers to multiply or divide by two, so you'll see it a lot in a random distribution like this. And getting to %64% is not much more work than getting to %32% (after all, the infrastructure is already in place). The same goes for %128%, %256%, and so on, which is why you see these spikes and why they're barely even tapering off.

Red dots.

Closer examination of where these dots fall shows that they're almost always in a crevice, or at least on a slope, which is to say that the number in question has a notably smaller chance of being generated than one or both of its neighbors.

However, primes are way too slippery a bunch for this to hold every time. Take, for example, %n=31%. This is the first prime that's sitting up on a mesa. I argued before that primes are hard to generate, so what gives?

Well, for many purposes, the distribution of primes is effectively random. Because of this, some of them are doomed to land right next to a really popular neighbor, as with our poor %31%. As discussed, %32% is especially easy to generate, and as it happens, adding or subtracting %1% is also generally a piece of cake. It's just bad luck for %31% that its neighbor funnels so much traffic its way.

Degrees of introversion.

The example of %31% is particularly egregious, since those %2^n% spikes are the biggest bullies on the playground. But all primes are affected by this phenomenon to some extent. Since a prime by its nature can't be directly conjured via a loop (read: multiplication, exponentiation, hyper-operations), the only way to reach one (short of something like a prime sieve) is to land on some composite number close to it and then happen to add or subtract the right constant. The frequency of any prime is ultimately a reflection of its neighborhood.
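To make that concrete, here's a rough, hand-wavy illustration using an arbitrary toy operation set (not the one behind the plot): a breadth-first search for the fewest operations needed to reach each small integer. The pattern to look for is that a prime's cost is pinned to the cost of its neighbors, plus a step or two.

```python
from collections import deque

# Minimal-operation-count search over a toy operation set {+1, -1, x2, x3}.
# The operation set is arbitrary; it exists only to illustrate that primes are
# reached via a nearby composite plus or minus a small constant.
def min_ops(limit, start=1):
    """Breadth-first search: fewest operations needed to reach each 1 <= n <= limit."""
    cost = {start: 0}
    queue = deque([start])
    while queue:
        n = queue.popleft()
        for nxt in (n + 1, n - 1, n * 2, n * 3):
            if 0 < nxt <= limit and nxt not in cost:
                cost[nxt] = cost[n] + 1
                queue.append(nxt)
    return cost

if __name__ == "__main__":
    cost = min_ops(64)
    for n in (29, 30, 31, 32, 33):
        print(n, cost[n])
```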

There are lots of consequences to this arrangement. Turning it around, one good example is the case of twin primes, which are primes that differ by %2%. Aside from %(3, 5)%, the number between a twin prime pair is always divisible by %6%: it's even, and of any three consecutive integers one must be divisible by %3%, which here can't be either of the primes. Any multiple of %6%, being divisible by %2%, %3%, and %6%, is right up there as an easy target for random generation. The upshot is that when you have a twin prime pair, you will almost always see a single-value spike between them, since they're surrounding such a juicy target.
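If you'd rather see that divisibility fact checked than argued, a brute-force pass over the small twin primes does the job:

```python
def is_prime(n):
    """Trial-division primality check; fine for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Every twin prime pair above (3, 5) straddles a multiple of 6.
for p in range(5, 10_000):
    if is_prime(p) and is_prime(p + 2):
        assert (p + 1) % 6 == 0
print("midpoint of every twin prime pair up to 10,000 is divisible by 6")
```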

But enough fun with plots.

Dark integers.

This is a term I pulled out of my ass which I like so much that I refuse to Google it and find out it's already taken.

We've shown how certain integers are "popular" in terms of computational space frequency, like the powers of two; we have also shown that some tend in the other direction, such as the primes. A dark integer, according to me, is loosely defined as an integer that's especially hard to generate through an algorithm of any kind. I am hampered by the fact that I don't have a working theory as to how to strictly identify these numbers other than by brute force, but I have some idea of their properties.

They are not necessarily prime; in fact, they seem to prefer being complicated composites with many disparate factors, often including one or more large primes. It makes some sense that this would be the ideal setup for a dark integer: an erratic, informationally dense set of factors already makes for an unpopular number, and not being prime allows it to be bordered by, or sit near, one or more primes, which attract no traffic of their own and so have none to spill over onto it.
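As for the brute-force identification mentioned above, it's at least easy to state: run the frequency experiment and flag the integers that never (or almost never) show up. A sketch, assuming a frequency table shaped like the one from the toy experiment earlier:

```python
# `counts` is assumed to be a dict mapping integer -> number of times it was
# produced (e.g. the output of the toy frequency_table sketched earlier).
# Low-count integers are only *candidates* for darkness: at the bottom of the
# table, experimental noise dominates.
def candidate_dark_integers(counts, limit, threshold=0):
    """Integers in [1, limit] whose observed frequency is at or below `threshold`."""
    return [n for n in range(1, limit + 1) if counts.get(n, 0) <= threshold]

# Example with a made-up table; a real run would use billions of programs.
counts = {1: 900, 2: 850, 3: 500, 4: 700, 5: 300, 7: 120, 8: 400}
print(candidate_dark_integers(counts, 10))  # -> [6, 9, 10] for this toy table
```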

'Darkness' in this sense is a relative term, at least for the moment. Perhaps it will make sense to define a dark number as any integer %N% with a strictly larger Kolmogorov complexity than every %n < N%, although that's a property that's hard to establish for any particular number. At any rate, some numbers are darker than others, and while darkness should roughly correspond to which numbers sit lowest on a plot such as the one above, we have to remember that this is an experimentally derived data set and prone to noise, especially as frequency decreases.
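To spell that candidate definition out: call %N% dark if %K(N) > \max_{n < N} K(n)%, where %K% denotes Kolmogorov complexity relative to some fixed universal machine. The usual caveat applies: %K% is uncomputable in general, so this could only ever be approximated by experiments like the one above.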

That said, I would like to tip my hat to %3465%, which was the lowest number that did not appear a single time in my billion programs. Wolfram Alpha has this to say about %3465%:

  • %3465 = 3^2 \times 5 \times 7 \times 11%.
  • %3465% has the representation %3465 = 2^7 \times 3^3 + 9%.
  • %3465% divides %34^6-1%.
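All three are easy to verify directly, for what it's worth:

```python
# Sanity checks for the three Wolfram Alpha facts about 3465.
assert 3465 == 3**2 * 5 * 7 * 11       # prime factorization
assert 3465 == 2**7 * 3**3 + 9         # the 2^7 * 3^3 + 9 representation
assert (34**6 - 1) % 3465 == 0         # 3465 divides 34^6 - 1
print("all three identities check out")
```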

It's too soon to say whether any of that is significant.

Time to get crazy.

So, do you remember what we're assuming about computers? Basically, that a computer is a computer by any other name? It comes into play now.

  • Assumption: The universe is not magic. It plays by the same Turing-equivalent rules as everything in it. This means physics is ultimately deterministic, and if we had infinite time and infinite memory, we could simulate our universe on a Turing machine.
  • Assumption: The Big Bang happened. The initial configuration of energy at its inception was effectively random.

The universe seemingly operates by following physical laws, with all matter and energy having been kicked off by that hot mess of a Big Bang. This strikes me as loosely analogous to my Skynet micro-program generator, what with random initial conditions being run through clearly defined, step-by-step processes. More to the point, our earlier assumption posits that the two are mathematically interchangeable for our purposes.

Hypothesis: For any given countable set of anything in the universe, the quantity of that anything is relatively unlikely to be a dark integer, all things being equal.

To test the hypothesis:

  1. Identify dark integers. The more, the larger, the merrier.
  2. Identify anything that can be quantified.
  3. Quantify the anything.
  4. Repeat to within error bounds for confirmation or disproof.
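For what it's worth, here's a minimal sketch of what steps 1 through 4 might reduce to once you had a list of dark integers and a pile of measured quantities. Everything in it is a placeholder (a one-element dark set, fabricated 'observations'), and a real analysis would want proper error bounds, say a binomial test, rather than an eyeballed comparison of two rates.

```python
import random

# Placeholder comparison: how often do observed counts land on dark integers,
# versus how often they would by chance if counts were spread uniformly?
def dark_hit_rate(observed_counts, dark_set):
    """Fraction of observed counts that are dark integers."""
    return sum(1 for n in observed_counts if n in dark_set) / len(observed_counts)

def chance_hit_rate(dark_set, lo, hi):
    """Fraction of integers in [lo, hi] that are dark, i.e. the rate expected by chance."""
    return sum(1 for n in range(lo, hi + 1) if n in dark_set) / (hi - lo + 1)

if __name__ == "__main__":
    dark = {3465}                      # placeholder; a real list would be much longer
    rng = random.Random(0)
    observed = [rng.randint(3000, 4000) for _ in range(10_000)]  # stand-in for real measurements
    print("observed hit rate:", dark_hit_rate(observed, dark))
    print("chance hit rate:  ", chance_hit_rate(dark, 3000, 4000))
```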

There may, unfortunately, be some bounds on what the anything could actually be. Ideally, it would be as simple as catching a bunch of atoms of something over and over and tracking the result, but more likely it would have to be something more directly related to the Big Bang, such as star counts in galaxies.

If all of this were sound, conclusions could be drawn.

If dark numbers do turn out to be underrepresented, it may be further evidence of the truly random and arbitrary nature of creation, and more importantly, it may be strong evidence of a deterministic universe and all that that implies.

If there is no discrepancy when it comes to dark numbers, it may imply that there is a truly stochastic or otherwise bizarre layer to physics, or it may imply that the Big Bang was ordered more deliberately than one would think, or that there is some agency at work disrupting the "natural" order of things in one way or another.

Either way, we learn.


Addendum

The dark integers on the graph above are specific to the instruction set used to generate it. While there is reason to expect that most sets of operations will be similar inasmuch as a binary system seems to be the simplest approach, it is not a guarantee. One can imagine pathological instruction sets that will yield an entirely different contour, such as one that uses the primes as its base units.

Long story short, if the binary analysis doesn't turn up results, it might be worth investigating the possibility of reverse engineering an instruction set by identifying its specific dark integers and trying to tease out what sort of atomic operations would give rise to those holes.

While applying any of this to cosmology is wildly speculative, biology and the mechanism of evolution seem like perfect candidates for testing the viability of dark integers in general. Evolution is undeniably a finite computational force that lends itself well to the whole idea. Multi-cellular organisms are literally built from code, and it stands to reason that, on balance, evolutionary pressures will tend to restrict most quantifiable biological features to easier-to-compute quantities. More on this as I investigate.