I will try to give only the briefest overview of cellular automata here, as there are already more than enough excellent introductions to the topic on the World Wide Internet.

A cellular automaton is a model of computation wherein a cyclic or unbounded array of individual *cells* each contain an integral value, often constrained to 0 or 1. Every such model is governed by a specific ruleset and is referred to singularly as a *cellular automaton* (CA).

CAs are configured to operate in one or more spatial dimensions. While there is no limit on this, the majority of research is limited to 1-D and 2-D models. Some models extend arbitrarily far as needed along each dimension, although for some purposes, finitely long wrap-around spans of cells are used. There is always a timelike progression which drives the computation: the combination of the specific ruleset and the initial cell values at \( t_0 \) completely determine the behavior of the CA as time progresses.

Like anything else, CAs have areas of strength and areas of weakness, relative to other models of computation. One of their great strengths is an innate capacity for massively parallel calculations, which is appealing as we increasingly face hard physical limits on serial computation speeds, and perhaps more to the point, begins to reveal shared properties with the concept of Newtonian spacetime and other facets of the "real world."

But perhaps the greatest strength of CA models is their underlying simplicity. While they can be made arbitrarily complicated, they rarely are, and more importantly, don't need to be. The simplest species of CA are known as Elementary Cellular Automatons (ECAs), shorthand for the 256 simplest rulesets that aren't all trivial.

Every ECA consists of only one spatial dimension. Each cell takes only two possible values, and bases its behavior solely on its own value and those of its two immediate neighbors. The precise rule for updating that value is what distinguishes the 256 ECAs from one another. Every cell in a standard cellular automaton operates under precisely the same rules as every other cell—it is only the different values of their neighbor cells which allow variation in their behavior.

One of the simplest rules we might have would be to check the value (often called "color") of the cell to your right and then change your own value to match. On the next time step, the cell to your left will do the same thing you just did, and so that value will move a step further; indeed, every starting value will propagate at a constant rate to the left with this rule. When studying ECAs, time is typically depicted as moving down on the y-axis, yielding a 2-D snapshot of what actually takes place over every moment along that one line of cells. The example discussed here would present as a diagonal line crossing space and time in such a plot.

Many rules are along the lines of the straightforward propagation example we just gave, but Rule 170 fits that description most precisely. In this rule diagram, the lower middle cell denotes what the middle cell will become after one time step, based on the values of itself and its neighbors as depicted above it. For these simple CAs, there are only 8 such rules, which together completely define the behavior of the model; this is why there are only \(2 ^ {2 ^ 3} = 256 \) possible rules at this level of simplicity.

To be precise, the top line in each icon shows the given cells \( c_{i-1}, c_{i}, c_{i+1} \) at time \( t_k \), with the cell below showing the resulting value of \( c_i \) at \( t_{k+1} \). As discussed in our example, we see that the new center cell simply copies whatever color was previously on its right side.

This is a fairly boring ruleset and results in a fairly boring spacetime plot—scrolling down, you can find Rule 170 near the bottom right, sporting plenty of diagonal lines.

Note that a comparable rule that merely pushes everything to the right instead of the left is essentially the same thing. In fact, each of the 256 ECAs is identical in spirit to 1 or 3 other rulesets due to left-right mirroring and/or complementary value-inversion rules, leaving only 88 truly unique CAs. These are shown below with the identical rule numbers listed (see Wolfram codes) with the top, bolded number featured in the plot. The plots themselves are of a 128-cell wide cyclical configuration captured for 128 time steps.

Undoubtedly the most famous example of a CA is Mike Conway's Game of Life, a 2-D model. Despite extremely straightforward rules of operation, it yields surprising complexity; it has long since been proved Turing Complete, meaning it is able to carry out any computation that any other TC model can, and is therefore subject to a number of undecidability and incompleteness proofs that have been established by Turing, Godel, Rice, and others over the 20th century.

For example, these results mean that it is impossible, in general, to know if one arbitrary configuration will ever lead to some other specific configuration; in fact, for infinitely many starting configurations, it is impossible to say anything of consequence about their eventual behavior, including whether they will ever stop changing or repeat themselves. But I digress. Excellent overviews of such impossibility proofs and an enormous amount of material are readily available on the Game of Life, so we'll leave that be.

(ignore below)

hi-res shots of Rule 54 structure flowing from a single toggled bit

]]>On Facebook, in response to this article that's been circulating today for some inscrutable reason, one A. Melaragni said:

Apparently the guy is just talking about our own galaxy; there are BILLIONS AND BILLIONS of galaxies in the universe. Not to mention that even if other life doesn't exist in our galaxy now, that doesn't mean it never did or never will. There is just so much room out there that I think it would be bizarre if life DIDN'T exist elsewhere.

Which got me thinking. In the absence of proof positive one way or another about the existence of life elsewhere or elsewhen, we can still do some perfectly legitimate Bayesian reasoning based on what we've [not] observed, and how the universe seems to work.

While I don't really buy it, I'll concede there are some legitimate reasons to think we might be the only life anywhere. The only other plausible option is that there is an abundance of life: is, was, and will be.

My reasoning is that physics doesn't do one-offs. You don't see a new kind of particle exactly once and never again, you don't see one wildly unique type of stellar body sitting among trillions of others in our galaxy. In general, there are strong mathematical reasons why if anything happens once, it (or something similar enough) will most likely happen twice. And if there's anything less likely than someone happening just once, it's something happening just twice, which would make a mockery of probability.

So, I figure it's probably not just us out here, and if that's the case, it's *certainly* not going to be just us and one or two other planets of life. Let's consider the two main possibilities.

In the scenario where intelligent life happens, and where you accept my assertion that there will probably be a lot of it, we can draw some conclusions. Crucially, **we are unlikely to be unusual**, within the range of all life that eventually takes form. Barring a meddling God, the various salient characteristics of a lifeform—longevity, intelligence, size, high reliance on optical/EM sensory input, overall temperament—are gonna end up distributed as big fat Gaussians, and we're gonna be right near the big fat middle of them for most of these things. Yes, there will be truly alien and bizarre creatures out on the fringes, so that's fun, but it's not us.

This also applies to our timeline of development of technology relative to others. We are one species selected at random from all those who have or will exist. There are some cosmological reasons why we might be one of the earlier ones, but I feel that's heavily outweighed by the implicit probabilistic evidence under discussion; if there are to be a million different life-bearing planets, what are the odds we're the very first to start to get our shit together?

Then of course you run into your Fermi paradox, which on the whole is pretty ominous. Without getting too sidetracked in that, it strongly suggests that either we'll never make it to the stars, or when we do, it will be in a form or mode unrecognizable by present-day us.

To recap:

- If there's any life besides us, there's probably a shitload of it.
- If there's a shitload of it, lots of them probably have a huge head start.
- For whatever reason, they're all gone, unrecognizable, or (at best) undetectable.
*Without exception*. This implies that whatever the attractor at work might ultimately be, it is likely inevitable and undeniable.

Basically, for anyone who dreams of 50s-scifi-style cruising around the galaxy and meeting aliens, give it up. Either there aren't any out there, or there is some overwhelmingly strong reason not to make contact on those sorts of terms, or there'll be some pesky obstacle like inexorable self-annihilation in the way. If we ever do make it to the tipping point where we have the social and engineering infrastructure for interstellar flight, and start doing it up in earnest, I figure that's the nail in the coffin for there being any other life out there. So, the other scenario:

In this scenario we are, somehow, a black swan—most likely, there would still be an infinite number of aliens in our infinite universe, but they'd be so negligibly rare as to essentially guarantee none in our light cone, which means they effectively don't exist. There's not a whole lot to say on this other than to point out the silver lining—that the massive, invisible, all-subsuming agent watching us hungrily from behind the curtain of the Fermi paradox would suddenly become a non-issue.

It may be a lonely existence, but this scenario is one in which there's no especially good reason why we can't go exploring all over observable space and spread like cancer. There just won't be all that much more to do out there, at least until we say "okay, fuck it" and seed some new form of life deliberately. I think that, of the two scenarios, this is actually the better deal, as it sidesteps the many and sundry sinister explanations for the current deafening silence.

**Note to self**: there was something worth exploring there that I skipped by. Given certain kinds of systems governed by a relatively small set of rules but seeded with some level of random initial conditions or ongoing perturbations, is it in fact true that it can be less likely for something to happen twice than to happen once? And if so, how much must a thing have to happen before the probabilities pull even again? I think there could be depth there. Maybe more weight behind %0%, %1%, and %oo% being the only relevant quantities of things. Which is arguably already well established. And %oo% is just a gussied-up %0%.

But I'll go digress.

]]>Three new recent discoveries.

it turns out that you can generate a complete list of primes in the following fashion.

Using %n% where %p + q = 2n%, let %n = 2% and find any Goldbach partitions %p, q%; if %max(q in bb Q)% is not in set %bb P%, add that %q% to %bb P%. This will generate a set %bb P% such that %bb P% contains every prime up through %2n-3%, in order. Furthermore, the corresponding values of %p% will be small. I haven't determined exactly, but I am confident of %p <= sqrt(2n)% as an upper bound.

Let %q% be some prime not yet in %bb P%. The first possible inclusion of %q% in a Goldbach pair will be %3 + q = 2n%, and as soon as %n = (q + 3)/2%, you will have that %q%, and it will be the largest such %q% for that %n%.

There are repeats (corresponding to prime gaps) and even backtracks of the largest %q% (deserving further investigation), but they necessarily consist of primes already in %bb P%.

the Cappallo-Goldbach Conjecture, refining the domain of %p% and %q% to make a stronger assertion:

%sf"For any even " N > 2,sf" there exist primes " p, q sf" such that "% %p + q = N sf" and " sqrt(N) <= p <="N//2" - sqrt(n).%>

If true, this bound is an improvement on the original %2 <= p <= N//2 <= q <= N-2%.

Or, more simply: there will always be a valid %p% such that %p >= sqrt(N)%.

Note also that there need not be a %p < sqrt(N)%; the first such counter-example is found at %N = 98%, yielding %19 + 79% as the smallest of three partitions.

I have not proven why this should be, but confirmed it empirically through %N = 5000%. I find it suggestive that %N - sqrt(N) = sqrt(N)(sqrt(N) - 1)%, which is reminiscent of the binomial expressions which can also be used to identify primes.

For instance, it appears likely (see Lehmer's totient problem) that for any prime %p%, it must be possible to factor %p - 1% into an arbitrary number of terms of the form %(k^n - k^(n-1))%, where %k% is a unique prime and %n >= 1, n in NN%.

This also likely connects to results such as the %i%th row of Pascal's Triangle consisting only of multiples of %i " iff " i% is prime: or equivalently, that %p in bbb P iff {p:" " p | ((p),(i)) :}, 0 < i < p}%.

**the set of all Goldbach numbers have a bijective mapping to the set of all semiprimes.**

The numbers going across the top represent our %n%. Visually, if one drops a vertical line to meet the diagonal, and then travels on a down/left diagonal, any intersections crossed represent Goldbach pairs.

For example, the line dropped from %8% hits the intersection of the horizontal %11% line and the vertical %5% line, meaning that %5 + 11 = 2(8)% is a solution; the same applies to %3 + 13 = 2(8)%. Since every semiprime can be represented as intersections on this sort of diagram, and every Goldbach pair will likewise be generated once, when intersecting a semiprime, the mapping becomes apparent.

Alternatively, consider that any semiprime is the product of two specific primes, and conversely, any two primes form a unique semiprime. Likewise, any Goldbach pair consists of two specific primes, and while their sum may not be unique among Goldbach pairs, that pair will still only be used a single time in the set of all Goldbach pairs, providing our one-to-one map. As addition and multiplication are both commutative, ordering is irrelevant; it suffices to either always lead by the smaller/larger of the two operands, or allow both orderings when taking the sum or product.

(Under construction.)

For starters, it is known that there are infinite primes, thus infinite semiprimes, which in turn provides another proof of infinite Goldbach pairs. I also reason that since any given even number has a finite number of Goldbach pairs, there must be an infinite number of Goldbach sums—that is, even numbers with at least one Goldbach pair. If there were only a finite number of these sums, each with a finite number of Goldbach pairs, we would lose our bijection. Unfortunately, this is necessary but not sufficient to prove Goldbach's Conjecture, as the infinitude of sums says nothing about their frequency within %NN%.

There should be other low-hanging fruit here. For example, since it is ...

It is interesting that while the product of two primes is a unique integer, the sum of a Goldbach pair is typically not (e.g. %3+13=5+11=16%). However, because of our one-to-one mapping, we know that the number of semiprimes must equal the number of Goldbach sums.

Let's look at bounding %p% and %q%. If we choose some %C% such that %p <= q <= C%, we yield %(pi^2(C)+pi(C))/2% semiprimes / Goldbach pairs.

(Ignoring %2%s for the moment...)

Let %C=7%.

- %3 * 3 = 9%
- %3 * 5 = 15%
- %3 * 7 = 21%
- %5 * 5 = 25%
- %5 * 7 = 35%
- %7 * 7 = 49%
- %3 + 3 = 2(3) = 6%
- %3 + 5 = 2(4) = 8%
- %3 + 7 = 2(5) = 10%
- %5 + 5 = 2(5) = 10%
- %5 + 7 = 2(6) = 12%
- %7 + 7 = 2(7) = 14%

Let %C = 17%. We obtain the following semiprimes:

- %3 * 3 = 9%
- %3 * 5 = 15%
- %3 * 7 = 21%
- %3 * 11 = 33%
- %3 * 13 = 39%
- %3 * 17 = 51%
- %5 * 5 = 25%
- %5 * 7 = 35%
- %5 * 11 = 55%
- %5 * 13 = 65%
- %5 * 17 = 85%
- %7 * 7 = 49%
- %7 * 11 = 77%
- %7 * 13 = 91%
- %7 * 17 = 119%
- %11 * 11 = 121%
- %11 * 13 = 143%
- %11 * 17 = 187%
- %13 * 13 = 169%
- %13 * 17 = 221%
- %17 * 17 = 289%

All semiprimes will be unique, and they will be bounded above by %C^2% and below by %3^2%.

We obtain the following Goldbach partitions:

- %3 + 3 = 2(3) = 6%
- %3 + 5 = 2(4) = 8%
- %3 + 7 = 2(5) = 10%
- %3 + 11 = 2(7) = 14%
- %3 + 13 = 2(8) = 16%
- %3 + 17 = 2(10) = 20%
- %5 + 5 = 2(5) = 10%
- %5 + 7 = 2(6) = 12%
- %5 + 11 = 2(8) = 16%
- %5 + 13 = 2(9) = 18%
- %5 + 17 = 2(11) = 22%
- %7 + 7 = 2(7) = 14%
- %7 + 11 = 2(9) = 18%
- %7 + 13 = 2(10) = 20%
- %7 + 17 = 2(12) = 24%
- %11 + 11 = 2(11) = 22%
- %11 + 13 = 2(12) = 24%
- %11 + 17 = 2(14) = 28%
- %13 + 13 = 2(13) = 26%
- %13 + 17 = 2(15) = 30%
- %17 + 17 = 2(17) = 34%

All Goldbach pairs will be unique, and their sums bounded above by %2C% and below by %2*3%.

I have been working through this without knowing where I was going, but now that I re-read this last part, it strikes me that the Goldbach bounds are the derivatives of the semiprime bounds with respect to the bound.

I believe the first observation is essentially trivial number manipulation and contains no new useful truth. I have not yet formed an opinion about the utility of the second.

]]>*This is something I worked on a year ago, so I'll keep it (relatively) brief.*

There's a keypad on my apartment building which accepts 5-digit codes. One day on the way in, I started thinking about how long it would take to guess a working code by brute force. The most obvious answer is that with five digits, you have %10^5 = 100,000% possible codes; since each one of these is 5 digits long, you're looking at %500,000% button pushes to guarantee finding an arbitrary working code.

But then I realized you could be a little smarter about it. If you assume the keypad only keeps track of the most recent five digits, you can do a rolling approach that cuts down your workload. For example, say I hit **12345 67890**. If the keypad works as described, you're not just checking two codes there, you're checking 6: 12345, 23456, 34567, 45678, 56789, 67890.

The next natural question was how many button-pushes this could actually save. After doing some work, I satisfied myself that you could cut it down by a factor of %~D%, where %D% is the number of digits in a code. So if you typed the right digits in the right order, instead of the %500,000% before, you're only looking at %100,004% (you need an extra 4 to wrap the last few codes up).

The *next* natural question was: how do you actually come up with that string of digits? It has to be perfect, in that it contains every possible 5-digit combination without repeating any of them. As with almost any exploratory problem, the best approach is to simplify as much as possible. For instance, consider a binary code three digits long, which only has %2^3 = 8% different codes: %{000, 001, 010, 011, 100, 101, 110, 111}%.

My formula suggested you should be able to hit all eight of those in a %2^3 + 3 - 1 = 10% digit string, and it's easy enough to put one together by a little trial and error: **0 0 0 1 1 1 0 1**. I found it was easiest to treat these strings as cyclical, so the **0 1** at the end wrap around to give you **0 1 0** and **1 0 0**. As a bonus, any rotation of this string will work just as well.

As I scaled the problem up, however, more and more things became clear. First, above a certain point, you start getting multiple viable optimal strings that are not simple transformations of one another. Second, finding an elegant way to generate these strings was not turning out to be easy.

I found one mechanical way of generating a valid string that worked, but I didn't love it. If you list all the combinations you have to cover, and then slot each combination into a buffer greedily, meaning the earliest spot where it can fit (potentially with overlap), it works out.

E.g.:

At some point I realized the generative process could be viewed as a directed graph, the nodes representing an N-length code, its successors delineating alternatives for continuing the string. After a few attempts, I got a pretty clear-looking one down (the node labels 0-7 are standing in for their binary counterparts):

As it turns out, you can start on any node and if you trace a Hamiltonian cycle—touching each vertex only once and ending back at the start—the numbers you hit along the way form a valid optimal string. This approach also scales with the parameters of the problem, but requires a messier or multi-dimensional graph.

Whenever I stumble into open-ended problems like this, I avoid Googling for the answer at first, because what's the fun in that? I was pretty sure this would be a solved problem, though, and after spending a while working this through myself, I looked for and found Wikipedia's page on De Bruijn sequences. As usual, I was beaten to the punch by somebody a hundred years earlier. Hilariously, however, in this case the results matched up better than expected. Check out the Wiki page, notably a) the graph on the right, and b) the first line of the section "Uses".

If you want to see the wild scratch pad which brought me from point A to point B, by all means, enjoy.

]]>*As usual, the main ideas here are wildly speculative, written down as food for thought. I am not claiming to be a cosmetologist or astrologer.*

For the purposes of this post, let's suppose that the universe is ultimately discrete. By this, I mean that when you get down to its most fundamental level, there are "pixels" in one way or another, some ultimate underlying property of everything which is atomic and indivisible, digital and not analog. If the uncertainty inherent in quantum mechanics drives the deepest level, then the rest of this may not apply, but it is certainly possible that there are deeper underlying forces yet to be identified, so we don't know yet.

Note that the discreteness of the universe does not preclude space or time from being infinite (although it is arguably an infinity of a lower order). However, I'm about to suggest here that it *does* link the finite-ness of space and time inextricably: either they are both infinite, or both finite.

Consider a discrete universe with a finite volume, but lasting forever. Any such universe could be reasonably treated as an enormous finite-state machine. No matter how vast it might be, there would be only so many possible configurations it could take on before repeating itself.

If its physics are deterministic—configured such that any given arrangement of "pixels" (or "atoms" or "bits") necessarily defines its successor—it would use only a tiny fraction of all possible states. Even if there is some purely random influence at that level, and it could conceivably reach every single possible state, there would still be a limit to the number of possible states.

Granted, it would be enormous; for example, assuming a universe comprising %10^100% bits, there would be %2^(10^100)% possible configurations for it to be in at any given moment. Note that this is a big fucking number; we're talking %~30000...0% possible states, where the %...% is replaced by about %1,000,000,000,%%000,000,000,%%000,000,000,%%000,000,000,%%000,000,000,%%000,000,000,% %000,000,000,% %000,000,000,% %000,000,000,%%000,000,000,%%000,000,000% *digits*.

If we consider its progression over time, we're looking at all the permutations of those configurations, representing an upper bound on every possible narrative the universe could follow, of which there would be %2^(10^100) !% configurations. The number above, which we could not write because it has more digits than the number of electrons in the observable universe, is the number of *digits* of this new, incomprehensibly larger number. (Yet still finite, so overall, pretty small as numbers go.)

But I digress. So let's focus on the deterministic case, which seems likely to be the correct one in a discrete universe. If we have a finite number of bits, that means that sooner or later, we're bound to come back to an exact state that already occurred in the past. That point necessarily defines a set of states through which the universe will cycle, over and over, forever.

Each cycle will only take a finite amount of time. This means that time cannot truly be termed infinite, since there would be absolutely no reference point by which a state occurring in one cycle could be distinguished from the same state in the next cycle. Time would be a loop of finite length. Thus: *in a deterministic universe, finite space implies finite time*.

What was less clear to me is whether the converse follows, that finite time implies finite space. The symmetry of space and time biases me strongly towards thinking this is probably the case, but let's look at it.

In the finite-space situation above, I claim that time is effectively finite because there are a limited number of distinct states, rendering it meaningless to speak of any time beyond that necessary to differentiate each of them. In a finite-time universe containing infinite space, we might be tempted to look for the same general pattern; a finite cycle such that regardless of distance traveled (rather than time elapsed), there are a limited number of possibilities one can end up with.

Cue relativity.

Let's consider our infinite-space universe as an infinite number of bits spiraling out from a point-source observer (you). Pretend there is no speed of light limiting things in this universe. Even with a finite lifespan, the universe would still very much be spatially infinite, both in the sense of being able to observe every single bit in that infinite string, and secondarily the possibility of traveling an arbitrary distance within it so that you are now surrounded by an arbitrarily large amount of completely different information than wherever you started. This is clearly not analogous to the finite-space situation above.

How might we fix that asymmetry between time and space if we were designing a universe and wanted to keep things simple and balanced? Mix in relativity. With the speed of light governing things, any observer is now limited to a light cone demarcating a strict upper bound on the observable universe; this is equivalent to establishing a finite limit to the number of bits and consequently number of possible states perceivable for a finite interval of time.

In that sense, the invariant nature of the speed of light seems almost specifically tailored for the *purpose* of linking space and time together. All of the fun relativistic side effects you end up with are logical necessities conspiring to limit the information available starting at any point in space over a given period of time. Thus: *in a deterministic universe, finite time implies finite space—so long as you have relativity!*

Finally, the logical corollary to these assertions, taken together, is that infinite space implies infinite time and vice versa.

**Conclusion:** Maybe that's why relativity.

I've been trying to visualize the problem in many different ways, tables and graphs and geometry and anything else that seems plausible. Here's one I nailed down this morning that's concise and readily graspable for any coder.

First, ignore the %2%. It just gets in the way.

And then we get going. Starting with %3%, we'll build up a list of primes and I'll show how you can easily check for Goldbach numbers while you do. As you build a string of bits, on each new number, you reverse the bits you have so far and slide it over one.

Much easier understood with a demonstration, starting with %n=6 => n/2=3% and incrementing %n=n+2%, as it must be even.

```
n=6 3
prm? 1
rev 1
n=8 3 5
prm? 1 1
rev 1 1
n=10 3 5 7
prm? 1 1 1
rev 1 1 1
n=12 3 5 7 9
prm? 1 1 1 0
rev 0 1 1 1
n=14 3 5 7 9 11
prm? 1 1 1 0 1
rev 1 0 1 1 1
n=16 3 5 7 9 11 13
prm? 1 1 1 0 1 1
rev 1 1 0 1 1 1
n=18 3 5 7 9 11 13 15
prm? 1 1 1 0 1 1 0
rev 0 1 1 0 1 1 1
n=20 3 5 7 9 11 13 15 17
prm? 1 1 1 0 1 1 0 1
rev 1 0 1 1 0 1 1 1
```

All we're doing with rev is reversing the order of the "prime?" bits. If you watch from one to the next, you can see that this is actually resulting in a kind of ongoing shift each time. Observe:

```
prm? 1 1 1 0 1 1 0 1 1 0 1 0 0 1
n 3 5 7 9 11 13 15 17 19 21 23 25 27 29
6 1
8 1 1
10 1 1 1
12 0 1 1 1
14 1 0 1 1 1
16 1 1 0 1 1 1
18 0 1 1 0 1 1 1
20 1 0 1 1 0 1 1 1
22 1 1 0 1 1 0 1 1 1
24 0 1 1 0 1 1 0 1 1 1
26 1 0 1 1 0 1 1 0 1 1 1
28 0 1 0 1 1 0 1 1 0 1 1 1
30 0 0 1 0 1 1 0 1 1 0 1 1 1
32 1 0 0 1 0 1 1 0 1 1 0 1 1 1
```

Notice all we are doing, for each new line, is taking %prm?(n)%, sticking it at the beginning of the new line, and shoving over all the rest. Same as above, but easier to see here. Note also the prime bits read down vertically in the regular order.

The thing is, you can check for any and all Goldbach numbers by simply bitwise ANDing two corresponding strings together. (This may actually be an efficient way to code this, should such a need arise.) Take %n=28%:

```
3 5 7 9 11 13 15 17 19 21 23 25
prm? 1 1 1 0 1 1 0 1 1 0 1 0
rev 0 1 0 1 1 0 1 1 0 1 1 1
-----------------------------------------
& 0 1 0 0 1 0 0 1 0 0 0 1
```

The resulting bits identify which numbers are "Goldbach numbers," primes which will sum to %n%. In this case we get two pairs:

%5 + 23 = 28%

%11 + 17 = 28%

This process will give you all valid Goldbach numbers for a given %n%.

```
prm? 1 1 1 0 1 1 0 1 1 0 1 0 0 1
n 3 5 7 9 11 13 15 17 19 21 23 25 27 29
6 (1) 3+3=6
8 (1 1) 3+5=8
10 (1 (1) 1) 3+7=10 5+5=10
12 0 (1 1) 1 5+7=12
14 (1 0 (1) 1 1) 3+11=14 7+7=14
16 (1 (1 0 1 1) 1) 3+13=16 5+11=16
18 0 (1 (1 0 1) 1) 1 5+13=18 7+11=18
```

...and so on. I added the parentheses to emphasize the pairs, but it's just matching up by working from the outsides to the center. It did make me notice something I should have seen before, which is that if you don't want to bother with all the reversing and stuff, you accomplish the same thing by taking the prime bit string, counting in from both sides, and watching for matches that are both 1. Same exact thing.

The astute observer may point out that I'm not actually *doing* anything in this write-up, which is true. Especially viewed this most recent way, all we're doing is literally taking the Goldbach Conjecture by its definition, and looking manually for pairs of primes that add up. If there is something to be gained from this approach, I suspect it lies in study of the "sliding" nature of the reversal version, and seeing how it conspires not to leave holes.

This is almost certainly biased on my hunch that Goldbach's undoing lies in a thorough examination of the mechanism driving its avoidance of symmetry; to violate Goldbach, there would have to be an instance of a certain symmetry, which would in turn be representative of a periodic pattern to a degree verboten to primes.

But hey, that's just me.

]]>This is a quick story about today's thing that I discovered that is already in Wiki. I must be up in the triple digits at this point. That said, this is one of the more obvious ones.

I was reading a thread about which sorting algorithm could be considered "best," and someone mentioned a couple of algorithms which allegedly run in %O(n log log n)% time. This came as a shock, since I'd thought %n log n% was the brick wall for an honest sort, and I wondered if these other sorts were for real, whether that would imply the existence of a linear time sort.

I tried to imagine what the lower bound would be, and figured there must be a minimum number of comparisons required for some number of elements. Didn't take long to get from there to realizing that %n% integers can be arranged in %n!% different permutations, which I reasoned meant that one must gather enough information (read: take comparisons) from the data to uniquely identify which of %n!% different arrangements the sort is in.

That, in turn, screams out for a log, specifically %log_2(n!)%. If the sort permutation is truly random, then on average, we should expect to be able to identify it from %log_2(n!)% bits (read: comparisons, again.) To be a little more precise, I guess it'd be more like \[ \frac{\lceil{n \log_2 (n!)}\rceil}{n} . \]

I cheated a little here and plugged %lim_{n->\infty} [log_2(n!)]% into Wolfram Alpha, and it was clear the dominating factor is, surprise surprise, %n log n%. As for those mystery %n log log n% algorithms, they were tough to track down too, and there seemed to be a lack of final consensus on the issue. Due to the math described herein if nothing else, they must operate with some limitation or another on domain or input, assuming they do work.

Later, seeing the bottom of the Wikipedia page on sorting algorithms, I saw all of this done similarly, with the weird surprise that once you get to %n=12%, the maximum number of comparisons required suddenly goes up by one inexplicably, requiring %30% comparisons where we predict %29%. Sadly, the footnote links explaining those aberrations were in journals behind paywalls, but the gist seemed to be that it was a bit of an open (or at least highly non-trivial) question.

]]>So here's how I went down the information theory rabbit hole this morning.

**How much information is in a bit?**

You might say '%1% bit.' I am inclined to agree.

**How much information is in two bits?**

My answer: it depends.

Disregarding any other considerations, and *on average*, I would agree that two bits stores two bits of information. However, let's say that the first bit you receive from some black box is a %0%. (Or it could be a %1%, and after inverting the rest of the following argument, it would yield identical results, which perversely *almost* makes it seems as though that first bit carries no information after all.)

But I digress. So, you have your %0%. Now, out comes another %0% as the second bit.

A word on this black box. You don't know what's inside it. It could be a friend connected through the internet, it could be a random number generator, it could be sequential bits from a vacation picture on your hard drive, it could be aliens trying to make contact, you name it.

Now, the specific details of the type of distribution and manner of bit-generating agents that might be in there is actually an absolutely critical point which is not to be glossed over, and so I will cheerfully completely disregard it, except to say that for our purposes, we will allow in a hand-wavy way that all the possibilities listed each represent valid non-negligible possibilities, as that is the spirit of this thought experiment.

Moving on. *A priori*, the second bit (%b_2%) could have been a %0% or a %1%. To be able to estimate which was more likely, without knowing the details about what the probability distribution of possible bit-generating entities actually is, is arguably impossible, but with the barest of assumptions, this is no longer the case. I maintain that a second %0% was indeed the more likely bit to arrive, and by a significant margin, too.

Consider it through the lens of entropy. *Whatever* is in the box, it will be emitting bits following some kind of overarching guidelines, which you can consider ranging from "perfectly ordered" to "perfectly entropic", the latter being more or less synonymous with "completely random." If it is a true RNG in there, then your odds are 50/50 for each bit that comes out, including %b_2%.

However, if it is something highly ordered, say a simple computer program, it may well be printing an endless stream of zeros. As was strongly indicated by my previous excursion into the cartography of computation space, repeated streams of bits are the single easiest thing to generate with any kind of reasonable computational framework; note that computational framework here can refer to anything from a programming language to the behavior of an ant colony on down to physics itself. It simply takes less work to generate some numbers than others. So, all things being roughly equal, I submit that you'll find a second 0 more often than not. This represents the more ordered possibility, while %b_2=1% would be the more entropic possibility.

Because of these probabilities, the underlying information content of that second bit is changed. After all, if something very likely happens, you've gained very little information; if the sun rises tomorrow, you can't conclude much from it. If the sun were not to rise, well, that contains a LOT of information, information you'd get to enjoy for maybe a month or two before the atmosphere freezes over.

Let's try a more mathematical but equally stark demonstration of the same essential argument. Consider: you watch the box for days, and all it does it spit out %0%s, one after another. Millions of them. Each new zero says even less than the one before it (or, arguably, devalues all bits received retroactively—probably moot.)

And then, one day, you wake up and see a single %1% came through. Being a man or woman of science, you eventually think to count exactly how many zeros came out before the %1%, and to your amazement, there are %3,141,592,654% zeros, which you naturally immediately recognize as %\pi \times 10^9%. Even setting aside any deeper implications about what that might portend, it is now obvious that the odds are extremely good that there is an agent or process which is highly structured at work in the box.

All that from a single bit. Clearly that %3,141,592,655^(th)% bit contained more information than the millions before it. You might say it shares its information content among all the bits before it, but it has still, by itself, made an enormous net change to the Shannon entropy and Kolmogorov complexity of the data. As a lower bound, you might say that in such a pattern, where you have %n% %0%s followed by a single %1%, that %1% contains at least a ballpark figure of %log_2 n% bits of actual information, that being the amount of information inherent in the number generated. E.g., %0000 0000 1% gives you %8% zeros, and %log_2 8 = 3%, which is the number of bits you'd require to generate a number between %1% and %8% inclusive.

**So what?**

I don't know. Stuff is neat.

(Maybe I'll show how primes fit into all this tomorrow!)

]]>A couple of years back, I wrote a little program named Skynet. It uses a configurable set of assembly-like instructions to spawn tiny programs, cobbled together completely at random. The original idea was to be able to feed it a set of input and output integer lists representing some function you were trying to find, and have it then blindly fumble its way through millions or billions of those micro-programs until it hit on one that satisfied your conditions.

For example, if you wanted some code that would exponentiate, you might feed it a configuration file signifying

```
Input 1: 2 3
Output 1: 8
Input 2: 4 3
Output 2: 256
```

and it would run forever until it stumbled upon a solution, or you killed it.

And it *worked*, and it was fun to screw around with, if you like that sort of thing. Feel free to grab it from the link above, but it was a quicky that I wrote for myself, so it might be rough to figure out. It's Turing-complete, but it doesn't do anything very clever—there's no fancy machine learning algorithms here, just straight random goodness—but you can specify a few parameters to try to guide its efforts, such as minimum and maximum program length to try, or how many instructions to execute before giving up on a microprogram as a runaway and moving on. I think it also tries to spot and discard overtly obvious dead ends and NOOPs.

So yes, it works as intended, but the problem is that the realm of all possible computable functions is one of those seriously immense virtual universes that tend to pop up when dealing with combinatorics or math in general. You might have 16 instructions enabled at one time, but instructions at the assembly level don't accomplish a hell of a lot, and when you consider a branching factor that big, all of a sudden you're looking at %16^10% possible microprograms for even a modest 10-instruction snippet of code.

That's just over a trillion, many of which are likely to be non-halting and thus run several hundred instructions before bailing. The upshot is that to exhaustively try all of them, you're probably talking days or even months, depending on how optimized it all is. After quickly discovering the futility of systematically enumerating them, I said "fuck it" and resdesigned things with its current random approach, perfectly happy to try out 50-instruction programs if the pRNG requests it. Of course, if there were actually some program out in the ether that needed to be 50 instructions long, the odds of running into it through chance are almost nothing—you could expect to hit it around the same time the last protons and neutrons are decaying.

But I digress. A lot of simpler algorithms can be squeezed into a tractable space.

- Fibonacci sequence
- factorials
- doubling, cubing
- for loops
- adding pairs of input numbers on the stack
- finding the mean of input numbers
- determining divisibility of input numbers by 7
- reversing order of stack

- primality detection
- prime generation
- misc. other NP problems
- bootstrapping into sentience

Some of those successes were much easier than others. Fibonacci, for example, seems to be about the next easiest thing for it to do besides count. The other day I got GitLab CI going which I'm abusing mercilessly by having it spawn Skynet jobs instead of actually testing anything. But manually, this is how it rolls:

```
[12:00] ~/progs/Skynet> time ./4bit --file fib.skynet -r
fscanfs return 6111111111
low=4 up=8 exact=0 stopAfter=1 maxSteps=50

]]>Input 1:

Output 1: 1 1 2 3 5 8 13

File loaded successfully.

Randomizing instructions...

[ INS SWAB MULT PUSH POP INC ADD JNE NAND DEC PEEK SKIP IFF NOT LOCO_1 MOD ]

Satisficing program (length 5): INC PUSH ADD SWAB JNE -=- Tracing operation -=-

```
[init a=0 b=0 c=0 pc=0 sp=1] ( __ )
INC [1 a=1 b=0 c=0 pc=1 sp=1] ( __ )
PUSH [2 a=1 b=0 c=0 pc=2 sp=2] ( __ 1 )
ADD [3 a=1 b=0 c=0 pc=3 sp=2] ( __ 1 )
SWAB [4 a=0 b=1 c=0 pc=4 sp=2] ( __ 1 )
JNE [5 a=0 b=1 c=0 pc=0 sp=2] ( __ 1 )
INC [6 a=1 b=1 c=0 pc=1 sp=2] ( __ 1 )
PUSH [7 a=1 b=1 c=0 pc=2 sp=3] ( __ 1 1 )
ADD [8 a=2 b=1 c=0 pc=3 sp=3] ( __ 1 1 )
SWAB [9 a=1 b=2 c=0 pc=4 sp=3] ( __ 1 1 )
JNE [10 a=1 b=2 c=0 pc=0 sp=3] ( __ 1 1 )
INC [11 a=2 b=2 c=0 pc=1 sp=3] ( __ 1 1 )
PUSH [12 a=2 b=2 c=0 pc=2 sp=4] ( __ 1 1 2 )
ADD [13 a=4 b=2 c=0 pc=3 sp=4] ( __ 1 1 2 )
SWAB [14 a=2 b=4 c=0 pc=4 sp=4] ( __ 1 1 2 )
JNE [15 a=2 b=4 c=0 pc=0 sp=4] ( __ 1 1 2 )
INC [16 a=3 b=4 c=0 pc=1 sp=4] ( __ 1 1 2 )
PUSH [17 a=3 b=4 c=0 pc=2 sp=5] ( __ 1 1 2 3 )
ADD [18 a=7 b=4 c=0 pc=3 sp=5] ( __ 1 1 2 3 )
SWAB [19 a=4 b=7 c=0 pc=4 sp=5] ( __ 1 1 2 3 )
JNE [20 a=4 b=7 c=0 pc=0 sp=5] ( __ 1 1 2 3 )
INC [21 a=5 b=7 c=0 pc=1 sp=5] ( __ 1 1 2 3 )
PUSH [22 a=5 b=7 c=0 pc=2 sp=6] ( __ 1 1 2 3 5 )
ADD [23 a=12 b=7 c=0 pc=3 sp=6] ( __ 1 1 2 3 5 )
SWAB [24 a=7 b=12 c=0 pc=4 sp=6] ( __ 1 1 2 3 5 )
JNE [25 a=7 b=12 c=0 pc=0 sp=6] ( __ 1 1 2 3 5 )
INC [26 a=8 b=12 c=0 pc=1 sp=6] ( __ 1 1 2 3 5 )
PUSH [27 a=8 b=12 c=0 pc=2 sp=7] ( __ 1 1 2 3 5 8 )
ADD [28 a=20 b=12 c=0 pc=3 sp=7] ( __ 1 1 2 3 5 8 )
SWAB [29 a=12 b=20 c=0 pc=4 sp=7] ( __ 1 1 2 3 5 8 )
JNE [30 a=12 b=20 c=0 pc=0 sp=7] ( __ 1 1 2 3 5 8 )
INC [31 a=13 b=20 c=0 pc=1 sp=7] ( __ 1 1 2 3 5 8 )
PUSH [32 a=13 b=20 c=0 pc=2 sp=8] ( __ 1 1 2 3 5 8 13 )
ADD [33 a=33 b=20 c=0 pc=3 sp=8] ( __ 1 1 2 3 5 8 13 )
SWAB [34 a=20 b=33 c=0 pc=4 sp=8] ( __ 1 1 2 3 5 8 13 )
JNE [35 a=20 b=33 c=0 pc=0 sp=8] ( __ 1 1 2 3 5 8 13 )
INC [36 a=21 b=33 c=0 pc=1 sp=8] ( __ 1 1 2 3 5 8 13 )
PUSH [37 a=21 b=33 c=0 pc=2 sp=9] ( __ 1 1 2 3 5 8 13 21 )
ADD [38 a=54 b=33 c=0 pc=3 sp=9] ( __ 1 1 2 3 5 8 13 21 )
SWAB [39 a=33 b=54 c=0 pc=4 sp=9] ( __ 1 1 2 3 5 8 13 21 )
JNE [40 a=33 b=54 c=0 pc=0 sp=9] ( __ 1 1 2 3 5 8 13 21 )
INC [41 a=34 b=54 c=0 pc=1 sp=9] ( __ 1 1 2 3 5 8 13 21 )
PUSH [42 a=34 b=54 c=0 pc=2 sp=10] ( __ 1 1 2 3 5 8 13 21 34 )
...
```

Total=357252 Timeouts=206816 Redundant=6694

real 0m0.130s

user 0m0.127s

sys 0m0.002s

[12:00] ~/progs/Skynet>

```
</div>
As may or may not be clear, it spits out its working program at the top, and then walks you through a trace execution.
One of my discoveries was that program length matters a lot less than I would have guessed. You can set your program length bounds to look for stuff 30-50 instructions long, and it'll still pop out a Fibonacci program after a second. This is presumably because when you throw shit randomly together, you end up with an abundance of NOOPs (at least with my instruction set.)
To put that in slightly more concrete perspective, there are one or two 5-length versions of Fibonacci that exist, out of something like a million possible programs; put another way, you could say ~0.0002% of 5-lengthers are Fibonacci programs. But as it turns out, if you go out to 20 instructions, that percentage doesn't drop by much—in fact, for some things it climbs. The point I'm trying to make, which surprised me, is that regardless of the program length you're searching for, you'll hit anything provided it's simple enough to fit in that length or simpler, and these types of algorithms seem to occupy a pretty constant market share at any depth.
### Present day
(Well, yesterday. )
I discovered [PyPy](http://pypy.org/) and [Cython](http://cython.org/), things which I should have known about a long time ago but didn't. I wrote Skynet in C because it's second to none for this sort of thing, probably a 30x speed increase over the same thing in vanilla Python. But Cython lets you statically type variables, embed C code, and other wily stuff, while PyPy appears to be a magical JIT compiler that makes your pure Python run 5-10x faster (if it's this sort of project).
Anyway, I wanted to try them out, so last night I ported Skynet back to Python, yielding [skython](https://code.by.tc/machine_learning/skython), a name which I'm half proud of but made me die a little inside to use. After I got the basics working, I started to adapt it for Cython, but realized I would effectively be porting it awkwardly back to C-ish if I continued, so I've stuck with Python and am using PyPy, which seems to run it at maybe 25% of the speed of the C version. In other words, close enough to be an acceptable trade-off. It's not up to snuff yet, but it is working, and it's certainly faster to add features than it was with C.
To wit: I was thinking about the program-market-share concept I mentioned, and it led me to add a method which creates programs randomly *without* looking for any input or output, but rather keeping track of everything it churns out on its own. And this is where we actually rejoin the title and original thematic intent of this post, because the idea there was to explore the distribution of algorithmic information in computation space, by which I mean seeing just how often it opts for some sequences over others when selecting out of the vast set of all computable functions.
#### And?
And it *was* interesting. I gave it some pretty neutral parameters and had it run a billion microprograms. There's lots I could have done (and perhaps will do) to clean up and solidify the results, but there was enough to get a sense of things.
###### A few results
* Half the programs do nothing. Give or take.
* Most of the rest, percentage-wise, print out endless strings of 0s or 1s.
* The last few percent, though, starts to get interesting.
* When you get down to 0.5% or so, you start to get 0101010... and 123456... sequences.
* They're followed close behind by the 22222s and the 110110s and the 012012s... etc.
The first eye-catching result is down at 0.00272% computation-space-share (but this is still in the top percent of frequency):
```

0,1,3,7,15,31,63,127,255,511,1023,2047,4095,8191,16383,32767,65535,131071,262143,524287,1048575,2097151,4194303,8388607,16777215

```
The pattern's clear enough. A little ways on, down at 0.00045%, I spot a 1, 4, 7, 10, ... along with the first Fibonacci. Down further, I recognize a pattern that came up in some math problem I worked on months ago:
```

0,1,2,5,10,21,42,85,170,341,682,1365,2730

```
But as I went deeper, I found sequences that clearly had structure but were foreign to me:
```

1,3,8,20,49,119,288,696,1681,4059,9800,23660,57121,137903

```
I started looking [them](http://oeis.org/A048739) up on [OEIS](http://oeis.org), and found that most of the interesting-looking patterns were indeed in there, and at least frickin' half of them were related in some way to the Fibonacci sequence. Not that this is especially surprising or perplexing in hindsight, I'd just never given much thought to how fundamental the principle is upon which it's built. No wonder that it pops up in nature everywhere; God must have noticed the same thing. So yeah, there are some interesting patterns when you get deep, but I didn't find anything earth-shattering.
#### Zip 'em up!
When I got tired of that, I moved on to a test of algorithmic information theory. The expectation is that the more frequently a program appears, the less complexity it is apt to have. A quick'n'dirty test for complexity is compressibility—the more complex a thing, the less compressible, to the point where any data sufficiently close to random (which is 'perfectly' complex, in this sense) will only get bigger.
I had my big honkin' 88mb data file, sorted by output frequency, so first I split it into equal parts, going from least to most common outputs:
```

[13:17] ~/progs/skython> split -da 2 -n 25 1bil.txt

[13:18] ~/progs/skython> ls -l

-rw-r--r-- 1 tc tc 88516009 Aug 1 09:46 1bil.txt

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x00

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x01

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x02

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x03

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x04

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x05

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x06

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x07

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x08

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x09

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x10

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x11

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x12

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x13

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x14

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x15

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x16

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x17

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x18

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x19

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x20

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x21

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x22

-rw-r--r-- 1 tc tc 3540640 Aug 1 13:18 x23

-rw-r--r-- 1 tc tc 3540649 Aug 1 13:18 x24

```
And then I zipped 'em:
```

[13:18] ~/progs/skython> gzip -9 x??; ll

-rw-r--r-- 1 tc tc 88516009 Aug 1 09:46 1bil.txt

-rw-r--r-- 1 tc tc 851286 Aug 1 13:18 x00.gz

-rw-r--r-- 1 tc tc 850941 Aug 1 13:18 x01.gz

-rw-r--r-- 1 tc tc 851373 Aug 1 13:18 x02.gz

-rw-r--r-- 1 tc tc 852635 Aug 1 13:18 x03.gz

-rw-r--r-- 1 tc tc 849508 Aug 1 13:18 x04.gz

-rw-r--r-- 1 tc tc 857673 Aug 1 13:18 x05.gz

-rw-r--r-- 1 tc tc 853534 Aug 1 13:18 x06.gz

-rw-r--r-- 1 tc tc 852645 Aug 1 13:18 x07.gz

-rw-r--r-- 1 tc tc 854754 Aug 1 13:18 x08.gz

-rw-r--r-- 1 tc tc 854989 Aug 1 13:18 x09.gz

-rw-r--r-- 1 tc tc 855178 Aug 1 13:18 x10.gz

-rw-r--r-- 1 tc tc 848454 Aug 1 13:18 x11.gz

-rw-r--r-- 1 tc tc 846578 Aug 1 13:18 x12.gz

-rw-r--r-- 1 tc tc 844529 Aug 1 13:18 x13.gz

-rw-r--r-- 1 tc tc 849767 Aug 1 13:18 x14.gz

-rw-r--r-- 1 tc tc 849013 Aug 1 13:18 x15.gz

-rw-r--r-- 1 tc tc 846908 Aug 1 13:18 x16.gz

-rw-r--r-- 1 tc tc 852737 Aug 1 13:18 x17.gz

-rw-r--r-- 1 tc tc 847214 Aug 1 13:18 x18.gz

-rw-r--r-- 1 tc tc 821187 Aug 1 13:18 x19.gz

-rw-r--r-- 1 tc tc 760842 Aug 1 13:18 x20.gz

-rw-r--r-- 1 tc tc 752602 Aug 1 13:18 x21.gz

-rw-r--r-- 1 tc tc 727900 Aug 1 13:18 x22.gz

-rw-r--r-- 1 tc tc 692860 Aug 1 13:18 x23.gz

-rw-r--r-- 1 tc tc 605098 Aug 1 13:18 x24.gz

```
...and as expected, they *all* got smaller, because these were text dumps which are always gonna be highly compressible. But that aside, it's evident that gzip had a much easier time with the more common stuff. Again, no surprise, but cool. Note the implicit curve that drops off sharply until you get to the top 75 or 80% of them, where it levels off and stays effectively flat.
#### One last thing to check
I'm beginning to suspect that primes are inherently complex, not just from number theory but in a [Kolmogorov](https://en.wikipedia.org/wiki/Kolmogorov_complexity)-y, robust sense. More Linux tricks:
```

[10:24] ~/progs/skython> for i in {1..30}; do cut -s --delimiter=, -f $i 1bil.txt >> 1bil_vals.txt; done

[10:27] ~/progs/skython> sed 's/(\s|-)//g' 1bil_vals.txt | uniq -c | sort -gr > 1bil_sorted_abs_final.txt

```
which gives a sorted text file of how many times every individual term came up overall, ignoring patterns:
```

[10:30] ~/progs/skython> head -20 1bil_sorted_abs_final.txt

16245701

11338986 1

8543839 0

5732203 2

3037257 3

1661122 4

980248 5

838860 6

582802 7

457519 8

338796 9

289842 10

226633 12

208586 11

161242 13

154787 14

149209 15

140625 16

90908 18

90094 17

```
It's my suspicion that primes will be disproportionately low on this list—as they should be, given how computers work—and a first glance confirmed that somewhat, but then I stopped to write all this.
Okay. That's it. Here's the [big ugly data file](http://test.by.tc/1b_skython.txt.gz) if you want it. Maybe I'll write up my thoughts on an approach to %P=^{?}NP% via transfinite induction tomorrow. Peace out.
```

Shit, I made the mistake of thinking about them, and now I'm back on it.

We'll try the opposite tack this time, though: proving that it's impossible to solve in polynomial time.

Stream of consciousness begins:

- Consider the unconstrained case where %S% is the set of integers provided, of finite

Shit, I made the mistake of thinking about them, and now I'm back on it.

We'll try the opposite tack this time, though: proving that it's impossible to solve in polynomial time.

Stream of consciousness begins:

- Consider the unconstrained case where %S% is the set of integers provided, of finite but arbitrarily large length.
- Likewise, let each element be a randomly chosen finite integer from %\mathbb{Z}%.
- That opens shit up. If we content ourselves for the moment with only considering large sets of large numbers, we see that (well, I'm assuming that)...
- All possible subsets %S' \subseteq S% will sum to between %1% and %2^|S| - 1% unique values, inclusive.
- For example, %S = {0, 0, 0}% can only yield %0%, and %S = {1, 4, 10}% could yield %{1, 4, 5, 10, 11, 14, 15}%.

- Granted, this doesn't narrow things down much. But recall, we're looking at the big case. The question becomes:
- Given %S%, and from that, %|S|% and %max(S) - min(S)%, how can we approximate our expected number of unique sums?

- I don't know. But I
*do*know that the further the range of %S% outstrips the length of %S%, the number of unique sums will asymptotically approach %2^|S|%.- And that seems valuable, so long as we assume that our integers are randomly, normally distributed.

- All possible subsets %S' \subseteq S% will sum to between %1% and %2^|S| - 1% unique values, inclusive.

For example. Let us select a more well-defined %S% of length %10%, and declare that its elements are selected at random from the range %\pm 10^100%. It is then essentially a mathematical certainty that the set of all possible sums will be a set of length %~= 2^|S| - 1 \approx 1023%. Clearly still not able to make a statistically significant dent in our range, and since the numbers were by definition chosen randomly to begin with, their unique sums should also be random. [*N.B. This raises the side-investigation for later of how many times one would have to repeat this process, iteratively generating sets from sums of power sets, beginning with random input, before everything levels out...*]

I don't know if that necessarily follows, but it feels right, especially when dealing with absurdly large numbers. Moreover, if it doesn't follow, that fact likely has implications at least as interesting as anything we're heading toward now.

But let us suppose you can safely assume that the set of all subset sums here is itself truly random (or close enough so as not to matter..., well, I get the sense there's something there.

Ah. Yeah, I think I see it. If we were allowed to cheat and declare that as truly and necessarily random, we could go on to show that the problem becomes indistinguishable from being handed a sequence of random integers and being asked if any of them are a certain value, and doing it *more* efficiently than %O(N)%, which I take as self-evident to be impossible. Note that even with %O(N)% efficiency there, it back-equates to %O(2^N)% in our actual case. I think.

Okay, my reasoning must be flawed, because writing that big-O reminded me that the current best algorithm solves the general case in %O(2^(N//2))% time; thus, either I read that wrong on Wikipedia, or the 40-year-old well-established paper establishing that result is wrong, my understanding of the potential assumptions of the problem are wrong, or my logic chain in the past few paragraphs was wrong. Let's see...

Yep, never mind. I was wildly off about this. The last entry, too.

]]>I have spent time working on almost every impossible problem there is.

This is a lie, but it is rooted in a kernel of truth. I remember my very first impossible problem. When I was eight or nine, my dad gave me a word problem about firemen and houses or some bullshit, but what it boiled down to was:

Draw three points wherever you want on a paper. Label them %A, B, C%.

Draw three other points, also wherever you want. Label them %1, 2, 3%.

Try to draw lines so that %A, B,% and %C% each have three lines going out to %1, 2,% and %3%—and have none of them cross.

And I spent days drawing nonsense like that (thanks Google Images), even after he tried to convey with a straightforward topological proof why it was impossible. But I wasn't convinced until I tried and tried and tried. It was especially painful because as it turns out, you can get all but one last little line to connect.

I'm older now, and have some modest understanding of math, so one might think I no longer fall for traps like this. Not so. I still have to try. But at least now I know on a rational level that it's a lost cause.

The most recent example is a couple of nights ago when I was reading about the subset sum problem. It's about as easy to explain as the problem above; in layman's terms, it's "Figure out how to quickly look through a bunch of positive and negative numbers and see if you can pick some out that add up to 0."

In a slightly more rigorous sense, it's a problem known to be NP-complete (you would think *that* might tip me off). Given an arbitrary set of integers, determine whether or not there is any subset thereof that sums to %N%, where %N% is some given integer (but classically %0%.) The trick, of course, is that you have to do it in polynomial time; the most efficient algorithm offered on Wikipedia requires %O(2^(N//2))%.

After thinking on it, my essentially random approach was to mod everything. The logic went like this.

You have %N% numbers. A naive approach would be to check the sum of every possible combination of your numbers, which is a combinatoric nightmare on the order of %O(2^N * N)%. No good.

Well, if we're shooting for %0%, then we can divide our set into positives and negatives. Surely this is an improvement; now instead of dealing with the power set of %N%—that is, the number of subsets in a set of size %N%, which is %2^N%—we've made some progress, since we won't waste time trying sets that are all positive numbers, for example. But it turns out this by itself buys us very little.

Then I go, hey, what if you take every integer modulo %2%? Or, in other words, go through and mark down which numbers are even and which are odd.

Why? Because we know that our potential sum is going to be even (i.e. %0%). We already knew that we'd have to take one or more numbers from each the positives and the negatives. Put those truths together, and it means we know that any group that might work will have a certain property: if the sum of our positive numbers is odd, then the sum of our negatives will have to be odd too, or else the sum of both together would be odd, which would mean not zero. Likewise, if our positives sum is even, the negs have to be even too. Or,

% \forall_P {P \subseteq S, P_i \geq 0}, \sum\P \equiv x \ ("mod"\ 2) \leftrightarrow \forall_Q {Q \subseteq S, Q_i < 0}, \sum\Q \equiv x \ ("mod"\ 2) .%

This is surely good, since on average, half of all possible combinations won't meet that constraint and thus can be discarded out of hand.

But wait, it gets better. Why not take everything modulo %3%? The same principle applies, which is that our overall sum must equal %0 \ ("mod"\ 3)%. So if the sum of the positives %("mod"\ 3)% is %1%, we know we only have to check those groups of negatives which sum to %2 \ ("mod"\ 3)%. And now it seems like we're getting a threefold rather than twofold improvement on our deal.

But wait, it gets better. Why not throw everything so far into a big pot and see what happens? It's not too hard to show that if you take %mod 2%, note which possible combinations remain, and then do the same thing with %mod 3%, you'll generally multiplicatively reduce your potential viable combinations. Then continue this with a handful of ascending integers (maybe primes?) and with almost no work at all, you'll all but perfectly sieve out any invalid combinations.

And so far as I could tell, this is all true. The insidious problem, and this is where I stopped working on this, is that the long cold years have taught me that things are never easy, particularly extremely difficult mathematical things. I have absolutely no doubt that if I sat down and did the math carefully, I would find that I was using up at least as many operations in logistical overhead as I was saving in the sieving process.

That's how these deceptive problems work. You feel like you're making progress, but all you've really done is paint a horse like a zebra; the nature of the thing itself remains unchanged.

Primes are especially brutal about that. Watch out for primes.

]]>There's this longstanding observation that we live in a "fine-tuned universe," which is to say that several fundamental physical constants (e.g. strength of gravity, electromagnetism) appear to be set *perfectly* for us. If they were any different, even by one part in a billion, the universe would

There's this longstanding observation that we live in a "fine-tuned universe," which is to say that several fundamental physical constants (e.g. strength of gravity, electromagnetism) appear to be set *perfectly* for us. If they were any different, even by one part in a billion, the universe would have evolved very differently and failed to give rise to sufficient complexity to eventually enable life as we know it. Which, at first glance, seems awfully suspicious.

One proposed explanation for this is the **Anthropic Principle**, which tries to couch the whole thing as a selection effect. The claim is that obviously the cosmological constants had to be set that way, because if they weren't, nobody would be around to notice, and therefore no meaningful conclusions can be drawn. While the basic premise makes sense, arguing that there's nothing to be inferred strikes me as a load of horseshit.

If there were only one universe (ours), with one set of laws and constants, the probability of it just happening to work out like this is so close to 0 as to be negligible, barring the possibility that it was deliberately engineered that way (which is a whole different discussion.) So, it seems extremely likely to me that this should be taken as strong evidence in support of a multiverse of one form or another, consisting of a vast number (or more likely, an infinite number) of universes with varying laws of physics and initial conditions. This would neatly account for our seemingly extraordinarily unlikely circumstances, and incidentally explains and recolors Occam's Razor not just as a heuristic, but a de facto selection pressure in its own right.

So what?

Well, now let's look at consciousness and experience as computation. If you discount the notion of a soul, or other hand-wavy quantum mechanical effects, all we are is our own brain, fed appropriate input. The brain is presumably Turing-complete, essentially a computer— or more to the point, operates on principles that could be precisely modeled on a computer. You could be running on a PC right now, and if the model and coding were all correct, there'd be no subjective difference for you.

Of course, that's just The Matrix 101. But the problem is, things get weirder. It turns out that a huge variety of systems can be made to be Turing-complete, which is to say capable of carrying out any computation that a PC (or a brain) could execute. Bored CS students have designed Turing machines out of tinkertoys and Minecraft levels— hell, you could even have a hundred Tibetan monks pushing around beads on abacuses, and so long as they're conducting computation that can be isomorphically mapped bijectively to neural wetware, the end result is that you, your whole life and apparent consciousness and free will could be no more than a consequence of those beads getting pushed. Although this sounds a little out there, it all logically follows, and I think a growing consensus is emerging about it among people working in these fields.

So what's the first thing got to do with the second?

If we assume all possible universes exist and the level of algorithmic complexity of each universe is uniformly distributed over that infinite set (which seems like a plausible assumption), then universes dominated by the simplest of laws will be infinitely dense relative to more rarefied and complexity-heavy universes like our own. I expect that many of those universes will be appropriately configured such that they end up running a huge amount of computation, whether through a physical substrate (think an infinitely physically large universe consisting of random quantum fluctuations affecting clumps of molecules that tend to form NAND gates) or directly in some kind of even more abstract mathematical formalism.

If that's true, then every person that does exist or could exist and every life they could live would be inevitably simulated by just a single one of these Turing-verses. Which would mean, in turn, that it is essentially mathematically certain that you're not really here in the sense that you think, but are instead one of those simulations in a completely different and alien universe.

Now the question is how one could falsify this whole hypothesis, or better yet, if it turns out to be true, how one could arrange things in our universe such that we tunnel our way out into some control of the underlying Turing-verse.

And it's also conceivable that carrying out computation itself is not necessary to give rise to our subjective reality. It may be that just the pattern itself, in static form, is sufficient. That would be a beautiful thing, because it would open up the possibility that the only thing that exists in all of creation, across all universes and all time, is a single number encoding all things, in exactly the same way that any phone number, or book, or data representations of entire lives and worlds are contained within the digits of the number pi.

]]>*I discovered a new (to me) way of approaching the Goldbach problem; this is a brief overview.*

**Trying to prove there are always primes %p, q% such that %p+q = N% for any even %N > 2%, where %p <= q%.**

N.B. *Largely ignoring 2*

*I discovered a new (to me) way of approaching the Goldbach problem; this is a brief overview.*

**Trying to prove there are always primes %p, q% such that %p+q = N% for any even %N > 2%, where %p <= q%.**

N.B. *Largely ignoring 2 as a prime as it seems irrelevant here. There are a couple of 2-related edge cases (%N=4=2+2%) which I'm setting aside as trivial for now. Also disregarding meta-knowledge that this solution is too simple.*

*Update 2015.09.04*

For generating %R%, %\sqrt(N)% insufficient except for small %N%. %N/2% works as an upper bound, but conceivably something as low as %N/3%.

Other thoughts: setting %N//2% to a primorial may be useful for generating primes. For example, letting %N//2 = #(3) = 2 \cdot 3 \cdot 5 = 30% and running through the below with %R = {3,5,7}% gives you %S = {7, 13, 17, 19, 23, 29}% yielding %Q = {53, 47, 43, 41, 37, 31}%, all the primes in %[N//2, N]% save %59%.

I assert that you can obtain all values of %p% for any %N% by the following:

%"Let " S = "set of all primes" <= n 2.%

%"Let "R = "set of all primes" <= sqrt(n).% < p>

For each prime %r∈R%, create a set constrained within the bounds %[3, N//2 − 1]%, starting with the value %N mod r% and incrementing by %r%. Even and non-prime values may be omitted.

e.g. If %N=42%, %r=3%, then %42 -= 0" (mod 3)"%, and generate %{3, 6, 9, 12, 15, 18}%, or simply %{3}.%

Repeat this for every value of %r% in the domain.

In the above example, the only other value is %r=5%, yielding %42 -= 2" (mod 5)"%, and consequently %{7, 12, 17}% or just %{7, 17}%.

Once complete, remove the generated terms from %S%, your list of all primes in %2 < s < N//2%. We should have:

%S = {3, 5, 7, 11, 13, 17, 19}" \ "{3, 7, 17} = {5, 11, 13, 19}%.

And we have obtained all values for %p%:

%5+37=42%

%11+31=42%

%13+29=42%

%19+23=42%

I further assert that one can use this technique to prove Goldbach's Strong Theorem if it is possible to demonstrate that a) this algorithm is universally sound within its domain, and that b) it will always return at least one value.

(Except *maybe* the conclusion.)

For a), it seems to work and I'm taking it on faith for now pending further scrutiny.

For b), I've had several thoughts.

I suspect there may be a basic line of reasoning that is sufficient. As I understand it, the only time an %r% value can be removed is if %r | N%, as happened with %3% in the example. Attempts to remove things straightforwardly appear self-defeating, as in %N = 2*3*5*7 = 210%, which does remove %{3,5,7}% but lets in loads more terms.

Here are some example reference calculations, the terms (up to 50) that would be removed in each case with the given %r% and modulus.

%N mod "__"%

%0=>3% %1=>7, 13, 19, 31, 37, 43% %2=>5, 11, 17, 23, 29, 41, 47%

%0=>5% %1=>11, 31, 41% %2=>7, 17, 37, 47% %3=>3, 13, 23, 43% %4=>19, 29%

%0=>7% %1=>29, 43% %2=>23, 37% %3=>3, 17, 31% %4=>11% %5=>5, 19, 47% %6=>13, 41%

Each value has the same number of primes, a unique and full set! Let's see…

%3+2% gives %7%, %5+2 | 3% gives %4%, %7+3|5% gives %3%

3+2, 5+2, 7+3

5, 7, 10

10, 24, 38, 52, 66, 80, 94, 108, 122

3,5,7,11,17,23,29,31,37,41,47

3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47

3, 7, 13, 19, 31, 37, 43

7, 13, 19, 37, 43

13, 19, 43

6's, 10's, 14's

This all raises the question… if you take the primes from √N and offset each one however you like, can you hit all the primes up to N/2??

N √N Offense π(N∕2)−1 Defense Gaps

10 3 3 2 3, 5 1

12 3 3 2 3, 5 1

26 5 3, 5 5 3, 5, 7, 11, 13 1, 1, 2, 1

50 7 3, 5, 7 8 3, 5, 7, 11, 13, 1, 1, 2, 1, 17, 19, 23 2, 1, 2

3, 5, 7, 11, 13, 17, 19, 23

3, 7, 13, 19 = 2, 3, 3

5 eats 3, 13

7 eats 7 (or 19)

Surely it's too random. Contradiction proof available?

%3k+A%

%5k+B%

%7k+C%

They can never start on their own spot.

What if we go big?

Let N=1000.

10 attackers, 3 through 31.

94 p-side, 73 q-side.

Or start small:

%N = 10, 12%

%R={3}%

%S={3, 5}%

%N=14, 16, 18, 20%

%S={3, 5, 7}%

%N=22, 24%

%S={3, 5, 7, 11}%

%N=26%

%R={3, 5}%

%S={3, 5, 7, 11, 13}%

%S% terms are generated as the holes made by the %R% sieve.

How can you 'invert' the %R% offsets systematically to ensure ideal coverage of those previous holes? (And then prove that's not sufficient to get them all.)

%N=170%

%R={3, 5, 7, 11, 13}%

%S={3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83}%

%N=173% gives %... mod 5 -= 3%

%S={5, 7, 11, 17, 19, 29, 31, 37, 41, 47, 59, 61, 67, 71, 79}%

then 3

%S={7, 19, 31, 37, 61, 67}%

then 7

%S={7, 31, 37, 67}%

and 11 and 13 do nothing.

%N=122

R={3,5,7,11}%

S={3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61}

3@61 --> {7, 13, 19, 31, 37, 43, 61}

S={3,5,11,17,23,29,41,47,53,59}

5@59 --> {29, 59}

S={3,5,11,17,23,41,47,53}

7@57 --> {┤

11@53 --> {53}

S={3,5,11,17,23,41,47}

In order not to return some successful value, one would have to craft an %N% such that the union of any applicable sets chosen from the relatively limited domain in %R% cover every single prime up to %N//2%, which I am convinced is an impossibility but haven't yet proven.

If nothing else, there is a strong case to be made where the number of primes in %S% grows strictly faster than the maximum worst-case size of the removal sets (i.e. % |S| ~= (2/3) * (4/5) * (6/7) * ... * |S| % with increasing size of %R%, but seems to lose to growth of % pi (N // 2) % ).

]]>