It turns out there isn’t. Not to my satisfaction, anyway. The gist, in very broad terms, is that a sufficiently powerful system can use computation to figure out anything that’s figure-outable. There is also the notion of being **Turing complete**, which describes such systems. The **Church-Turing thesis**, which is widely believed to be true, says that Turing-complete systems are all interchangeable, in the sense that any one of them can do anything that any other could do. Informally, this includes people, although strictly speaking, people (probably) and computers are both finite state machines, which actually means they’re less powerful than a proper Turing-complete system.

That’s all well and good, but it seems like a vague way to answer the question of what computation actually *is*.

The best answer I can give is perhaps:

**Computation is the process of redefining a problem as multiple easier problems.**

Although perhaps ‘easier’ should just be ‘other’. Basically, you start with a question. You don’t have the answer to the question, but you wish you did. So you rephrase the problem in another way, ideally as one or more simpler problems. For those problems, you repeat the process and further subdivide them, until you either have a problem you *do* know the answer to, or give up on the whole thing.

If you’ve ever done any coding, you’ll recognize this as **recursion**, and for good reason. Computing is not exactly the same as recursion, but it’s goddamned close.
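To make that concrete, here’s a sketch in Python (my choice of language for illustration, nothing canonical about it): merge sort answers “how do I sort this list?” by rephrasing it as two smaller sorting problems plus a merge, recursing until it hits a problem it already knows the answer to.

```python
def merge_sort(items):
    """Sort a list by redefining the problem as two easier problems."""
    if len(items) <= 1:               # a problem we *do* know the answer to
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])    # easier subproblem 1
    right = merge_sort(items[mid:])   # easier subproblem 2
    # Combine the sub-answers back into the answer to the original question.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```

The base case is the “problem you *do* know the answer to”; everything else is rephrasing.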

There are plenty of different models of computation—that is, Turing-complete systems which can be used to compute stuff. By the Church-Turing thesis, all of these systems are effectively equivalent. By ‘equivalent’, I don’t mean they’re identical when it comes to stuff like speed or efficiency, just that *eventually* any one of these systems can do anything that another one of them can. And of course the contrapositive applies: anything that one of them fails at, they’ll all fail at.

**Turing machines** are the standard, invented by Alan Turing nearly a century ago. A very brief description is that a hypothetical Turing machine has an infinitely long tape which it can move back and forth, and read from and write to. It churns along reading and writing symbols and moving the tape in such a way that some computation gets done.

A little more precisely, it reads one symbol from its current tape location, and then chooses what symbol to change it to, whether to move the tape right or left, and what state it should be in. “State” here essentially just refers to some value that can be changed, a single integer variable if you like. The way a TM decides 1) which symbol to write, 2) how to move the tape, and 3) what state to be in, is by consulting a rule table, where it looks up the entry for its current state and the symbol read from the current tape location, and finds what to do in that situation.
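A rule table is easier to grok in code. Here’s a toy simulator (a sketch of my own, not any standard formulation; I move the head instead of the tape, which is the same thing mirrored). The sample table just flips every bit and halts at the first blank.

```python
def run_tm(tape, rules, state="scan", blank=" "):
    """Simulate a one-tape Turing machine.

    `rules` maps (state, symbol) -> (new_symbol, move, new_state),
    where move is +1 (head right) or -1 (head left).
    A missing table entry means the machine halts.
    """
    cells = dict(enumerate(tape))     # sparse tape; unwritten cells are blank
    pos = 0
    while (state, cells.get(pos, blank)) in rules:
        symbol = cells.get(pos, blank)
        new_symbol, move, state = rules[(state, symbol)]
        cells[pos] = new_symbol       # 1) write a symbol
        pos += move                   # 2) move
        # 3) the new state was set above by the table lookup
    return "".join(cells[i] for i in sorted(cells)).strip()

# A toy rule table: in state "scan", invert the bit and move right.
flip_rules = {
    ("scan", "0"): ("1", +1, "scan"),
    ("scan", "1"): ("0", +1, "scan"),
    # no entry for ("scan", blank): the machine halts there
}
```

Running `run_tm("0110", flip_rules)` yields `"1001"`: each step consults the table with (current state, symbol read) and does exactly the three things described above.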

But for a more in-depth description, just read Wikipedia. I won’t belabor Turing machines and their myriad variations because I don’t like them. They seem too complicated to me, both to explain and to use, and cumbersome to work with and reason about. Superficially, the whole idea of a “tape” and a “read/write head” strikes me as uncharmingly anachronistic.

For a much easier to understand approach, let’s try string rewriting systems.

These get my vote for the simplest model of computation to understand, explain, and use. The quick and dirty version is you keep doing find-and-replace on something until you can’t anymore, and then you’re done.

In slightly more detail:

1. You start with some input string of symbols (which can be bits, or letters, or whatever you want).

2. You make some list of substrings to search for, and what to replace them with, so you have a bunch of rules that say “replace A with B”.

3. You go through your list in order, one at a time, checking the current rule.

3a. If A is in your main string, you replace the first occurrence of it with B. Then you go back to the start of the list and repeat.

3b. If A isn’t in your main string, you move on to the next rule. You keep moving on until you find one you can use, or you run out of rules.

4. If you run out of rules, that is, none of the A’s you’re looking for are in your string, you’re done.
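Steps 1–4 translate almost line-for-line into code. A minimal sketch in Python (the rule set at the bottom is my own illustration, not anything from the literature): the single rule “replace `ba` with `ab`” sorts any string of a’s and b’s, which is a cute demonstration that find-and-replace really does compute.

```python
def rewrite(s, rules, max_steps=10_000):
    """Repeatedly apply the first applicable find-and-replace rule.

    `rules` is an ordered list of (find, replace) pairs. After every
    replacement we restart from the top of the list; we're done when
    no rule's `find` string appears anywhere in `s`.
    """
    for _ in range(max_steps):            # guard: rewriting may never halt
        for find, replace in rules:
            if find in s:
                s = s.replace(find, replace, 1)   # first occurrence only
                break                     # back to the start of the list
        else:
            return s                      # ran out of rules: we're done
    raise RuntimeError("gave up; possibly a non-halting rule set")

# One rule suffices to sort a string of a's and b's:
# every out-of-order pair "ba" gets swapped to "ab".
sort_rules = [("ba", "ab")]
```

For example, `rewrite("abba", sort_rules)` steps through `abab`, then `aabb`, and stops because no `ba` remains.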

This approach is similar to Markov algorithms, and the same family includes *m*-tag systems and cyclic tag systems.

In an *m*-tag system, you have a string you loop through, and a list to check. You look at the first symbol in your string and find a matching rule in your table. You add something to the end of the string, you remove the first *m* symbols from the front, and then you repeat. Note that *m* must be at least 2 for the system to be Turing-complete.

A great example is checking a value of the **Collatz conjecture**. With Collatz, you start with some number. If the number is even, divide it by 2; otherwise, multiply by 3 and add 1. Then repeat. The conjecture says that eventually you’ll get to 1 no matter what number you start with, but it is as yet unproven (and thought to be *extremely* difficult to solve).
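In code, a single Collatz trajectory is a few lines (Python, purely for illustration):

```python
def collatz(n):
    """Return the Collatz trajectory from n (conjecturally, it always reaches 1)."""
    seq = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq
```

So `collatz(6)` gives `[6, 3, 10, 5, 16, 8, 4, 2, 1]`. Nobody has proven the loop terminates for every starting value; it just always has so far.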

Collatz is easily implemented by a 2-tag system with the following rules:

* a -> bc

* b -> a

* c -> aaa

That’s it. You start with a string of a’s, and whenever you have a string of only a’s, it corresponds to a Collatz value. You can try it yourself or check Wikipedia for a little walkthrough, but it works a treat.
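A quick way to check it yourself is to simulate the tag system (a Python sketch; the halting condition, stopping when fewer than *m* symbols remain, is the standard one for tag systems). One caveat: this system computes the “shortcut” Collatz step for odd numbers, n → (3n+1)/2, which is why 3 jumps straight to 5 rather than 10.

```python
def tag_collatz(n, max_steps=100_000):
    """Run the 2-tag system a->bc, b->a, c->aaa starting from n copies of 'a'.

    Returns the lengths of the all-'a' strings encountered; these are
    the (shortcut) Collatz values reached from n.
    """
    rules = {"a": "bc", "b": "a", "c": "aaa"}
    s = "a" * n
    values = []
    for _ in range(max_steps):
        if set(s) == {"a"}:           # a pure run of a's encodes a Collatz value
            values.append(len(s))
        if len(s) < 2:                # fewer than m=2 symbols left: halt
            return values
        s = s[2:] + rules[s[0]]       # read head symbol, delete 2, append its word
    raise RuntimeError("did not halt within max_steps")
```

Starting from 3 a’s, the all-a strings that appear have lengths 3, 5, 8, 4, 2, 1: exactly the shortcut Collatz sequence.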

While Markov gets my vote for *easiest* model, the Lambda calculus gets my vote for *best*, in the sense of theoretical grounding and true formal simplicity. It’s closest to what I think I was looking for when I started trying to pin down computation.

Very roughly, the idea is, you have two different forms things can take:

**λx.M** is a function that returns M after replacing x, anywhere it shows up, with whatever you give it as input. So if your function is λx.M, and you pass it 4, you get back M with every x replaced by 4.

**(M N)** will pass N to M, assuming M is a function. It’s like saying M(N).

So the first form defines functions (this is called **abstraction**), and the second form uses them (this is called **application**). And there are technically three forms a term can have, if you count variables like x as a form. Everything is a mapping, and to specify a mapping, we can use mappings (abstraction) and we can plug things into mappings (application).

And this is a beautiful thing. Suppose you wanted to argue that everything is made of numbers. You could start to try, but even if you take the existence of the integers as given, there’s no next step since the only thing you have is numbers. You can’t say, like, “5 6”; it’s meaningless without the introduction of some sort of relation or operation. But if we take mappings as the fundamental basis comprising everything, it’s all we need to do anything.

What works for me is thinking of everything as arrows.

- Start with nothing.
- Let there be one arrow. An arrow must link two arrows, so the only arrow it can link to is itself.
- We’d like to have another arrow. An arrow must link two arrows, and there’s just the one arrow, so there’s nothing for it to link to or from. The only possible arrow already exists.
- But we need something else, so let’s back up. To start, just once, let’s allow an arrow to map from itself to nothing. Call it the null arrow. Very weird, but there it is.
- Now we can have a different arrow, one that maps to itself as before, call it the 1 arrow.
- Now we have enough complexity that we can start to build on it.

Oh man. Alternatively, say there’s an arrow from nothing to itself. That’s the primordial arrow. Since that’s impossible, reality is based on a contradiction, which is where Gödel’s incompleteness and Turing’s halting problem and everything else come from.

I just spent some time thinking this through, and eventually tried googling, and discovered that what I was trying to describe is called category theory and functors.

In a sense, computation is simplification. The program and algorithm that determines the output is an expression, and computation is the act of simplification until it can no longer be simplified. E.g. 2 * 3 + 5 = 11. You’re taking a tangled map and untwisting it and applying isomorphisms until you have a usably simple representation of whatever it is.

**(λx.x x)(λx.x x)** is a runaway computation. It repeatedly applies itself to itself; it is the lambda-calculus equivalent of an infinite loop. It is what it is.

Turns out this is Turing-complete, and although I’ll admit I found it all confusing, the payoff was worth it when it started to click. The actual computation really happens when you take some expression composed of these forms and attempt to simplify it. I say “attempt to” because, as with any Turing-complete system, it is possible that the computation will go on forever without yielding a result, but more on that later.

I find the syntax a little less than ideal. It’s annoying to type those lambdas, and the period seems like a strange choice to my modern sensibilities. I would much prefer something like:

and , with the understanding that in the first form should be a variable and in the second form should be a function. I’m a little unclear on what happens in general with violations of syntax, but I don’t think it matters for our purposes.

If we clean up currying too, we have e.g. , which means or FALSE, rewritten as .

I guess there’s no way to distinguish between meaning or .

Or better yet, plus is written as , which we write as .

Church numerals manage to give semantic grounding to the natural numbers. How do you define 3? You could say it’s the successor function applied 3 times to 0, but that uses 3 itself in the answer, not to mention 0 or successor. Church numerals say that 3-ness is, given f and x, f(f(f(x))).

If asked what 3 is, we’d like to point to 3 things and say “that”. But 3 what? So we can say if asked what 3 is, and given a thing to use, we point to 3 of the thing and say “that”.

In a sense, the concept of “mapping” is itself the notion of a unit. So given a function f and a thing x, 3 is f(f(f(x))), 1 is f(x), and 0 is just x.
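Since Python has first-class functions, you can play with this directly (the names here are my own, but the encodings are the standard Church numerals):

```python
# Church numerals: the numeral n means "apply f, n times, to x".
zero  = lambda f: lambda x: x            # discard f entirely: zero applications
one   = lambda f: lambda x: f(x)
three = lambda f: lambda x: f(f(f(x)))

def succ(n):
    """Successor: one more application of f than n has."""
    return lambda f: lambda x: f(n(f)(x))

def to_int(n):
    """Decode a Church numeral by counting with an ordinary integer."""
    return n(lambda k: k + 1)(0)
```

`to_int(three)` hands the numeral the “add one” function and the thing 0, so f(f(f(0))) comes out as 3. Everything here is a mapping; the numbers are just patterns of application.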

My take is that this captures the idea that, at the core of reality, everything can be thought of as mapping. It’s self-evident that some entities exist at some level, and they differ from one another somehow, or they’d be the same entity. Then, how can different entities combine in a useful way to become more entities? Well, there must be a way to relate them, and we can call that mapping. Then, since it’s wasteful to have there be a distinction between “things” and “mappings between things”, we let mappings themselves be the “things” which mappings map between. From there, the concept of mapping can bootstrap into the basis for the natural numbers, predicate logic, and the universe at large.

Let’s start over with the expectation that mappings are the only type of entity we’ll use. Start with nothing. Let there be something. What can there be? It’s a mapping from something to nothing. A mapping that, given ‘something’ and ‘nothing’ to work with, has something correspond to nothing. This is analogous to the Church numeral for 0, which is λf.λx.x. This is a function that maps from the function f and the thing x as inputs, to the thing x, discarding the function f.

More precisely, paying attention to the currying of the variables, it’s a function that maps from any function to the identity function. Given the function f, it returns λx.x. This is not the identity function itself, because the identity function, given f, would return f. Instead, this strips the functionality, the mapping. So while f(x) = 0 makes sense in everyday math, in this context, it’s sort of like the equivalent of writing “f(x) =” with nothing on the right side of the equation, thus mapping to nothing (as opposed to 0) or declaring it undefined.

This is in comparison to the Church numeral for 1, which is λf.λx.f x, sort of a higher-order identity function. So 1 is defined as “a function that maps from f to f”. This captures the concept of identity/unit-ness/self-reference and cements the nature of 1 as the unit within the numeral system.

Consider the example of trying to rewrite some propositional statement in conjunctive normal form so that you can easily evaluate it. There’s a map in the ether which untangles all possible propositional statements. There are arbitrarily many other statements that share a truth value with this statement, meaning any transformation done to them is the same as if it’s done to this one, in terms of the relative mappings it creates. The goal is to use conditionals to assess where in the space of all possible statements you are, in that each check you perform will narrow down the possibilities, until, when it’s finally in normal form, it can be simplified no further.

The syntax of Lambda calculus lets us set up an arbitrarily complex map. The computation itself isn’t really done until you apply **β-reduction**, which is basically a fancy name for trying to clean up all the search-and-replace operations to simplify things, or in other words, replacing variables in function bodies with the values supplied to them.

…under construction…

Or, why is it so big?

This is related to the Axiom of Choice. The question I have is whether one can really comment in any meaningful way on any given integer, when it was selected seemingly arbitrarily from the natural numbers. You might reason that 2 trillion galaxies seems pretty small compared to all possible numbers: but then, what number would you not say that about? ? That’s still pretty tiny. In fact, I don’t think you can meaningfully say anything in comparison between the two.

What if there were a dozen galaxies, maybe thirty total stars? (Somehow.) Would that be a small amount? Why?

I know, I was surprised too.

To be a little more precise, it is—regrettably—possible and probably inevitable to go through the process of *dying*, to have it seem like everything is ending… but it never actually will.

My argument is based on two key assumptions which I’ve held for a long time now, and which I don’t think require much of an act of faith.

If we set aside theories with no basis in observable fact (e.g. God, quantum mechanical “magic”), it appears that what we think of as life and consciousness is actually the result or byproduct of the computation going on in each of our brains, in one way or another.

Whether it be parallel universes in a many-worlds quantum mechanical sense, or every possible universe being created somehow from the outset, it seems like some form of multiverse has to be the case. There are several reasons for this, but the most compelling is the cosmological fine-tuning problem, the fact that various parameters are set perfectly for life to arise. The only real counterargument to this (other than “God/aliens made everything”) is the anthropic principle, but that isn’t really an argument. It just means that we’re either living in a multiverse where our universe was bound to exist, or we’re living in a one-off and got **insanely** lucky that it happened to be perfect for us, which is effectively impossible.

So, let’s accept the idea that consciousness is actually just the particular pattern known as you continuing to exist from moment to moment. Suppose you get nuked. Literally, you get vaporized by a nuclear blast. North Korea or whomever. The argument works in any case, but I think this one helps illustrate it best.

Your pattern—meaning, the specific arrangement of computation that thinks it’s you—is cheerfully existing right up until it suddenly ceases to exist. The ‘you’ that actually existed here, in the strictest sense, is indeed dead. However, there’s no ‘you’ to experience being dead. There is, however, presumably a virtually identical you in a parallel universe, or different version of ours, or however it’s set up. A version of you where everything was the same right up until the moment the bombs fell, but in this version, they don’t. Your consciousness—your pattern—doesn’t care if you view it as thirty years of existence in one universe and then thirty years of existence in another. So long as all the right computational frames were executed to form a smooth transition from moment to moment, the-pattern-that-is-you experiences nothing unusual at all when hopping between realities. So, if you die here, you’re bound not to have died somewhere else, and from your point of view, that’s where you will continue existing.

That doesn’t actually stop you from dying, it just stops you from ceasing to exist. The good news is that the experience of death is necessarily transient, while immortality is forever.


The bracketed string on the left expands as shown, with each element included until reaching and repeating from , indefinitely.

The first 25 terms of for reference:

ConFrac | Derivation form and notes | ||

1 | 1 | [a] Perfect squares () are trivial cases. | |

2 | 1 | [b] The form is the next simplest; the periodic part is just . | |

3 | 1 | [c] Iff , the periodic part is . | |

4 | 2 | [a] | |

5 | 2 | [b] | |

6 | 2 | [c] | |

7 | 2 | [d] | |

8 | 2 | [c] | |

9 | 3 | [a] | |

10 | [b] | ||

11 | [c] | ||

12 | [c] | ||

13 | |||

14 | [d] | ||

15 | [c] | ||

16 | 4 | [a] | |

17 | [b] | ||

18 | [c] | ||

19 | |||

20 | [c] | ||

21 | |||

22 | |||

23 | [d] | ||

24 | [c] | ||

25 | [a] |

To clarify forms [c] and [d], a couple quick examples should cut through the math:

At each perfect square, you just run through its factors. Take with :

(already covered by [b])

which gives us our middle 4 in

, which are co-prime, so you end up with a nasty

.

Then you stop at the next (being ) and work backwards:

(covered by [c])

and you subtract the constant 2 from your 5 to get your 3 in .

Likewise, working backwards from , you hit where

gives you . And same idea goes for for .

A long-known, famous and straightforward algorithm exists to find the terms of a periodic continued fraction; the process is essentially identical to Euclid’s algorithm, which runs in polynomial time in the worst case. Unfortunately, generating the terms of a periodic continued fraction with a closed-form expression appears to be an open problem. There *are* patterns—I’ve detailed the simplest ones I’ve found in the notes above.
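That standard algorithm fits in a few lines. Here’s a Python sketch of the classical recurrence (m, d, and a are the usual auxiliary quantities; the period is known to end exactly when a term equals 2·a₀):

```python
from math import isqrt

def sqrt_cf(n):
    """Return (a0, period) for the continued fraction of sqrt(n).

    Classical recurrence: m <- d*a - m, d <- (n - m*m)/d,
    a <- (a0 + m) // d. The periodic part ends at the term 2*a0.
    """
    a0 = isqrt(n)
    if a0 * a0 == n:
        return a0, []                 # perfect square: no periodic part
    m, d, a = 0, 1, a0
    period = []
    while a != 2 * a0:
        m = d * a - m
        d = (n - m * m) // d          # always divides exactly
        a = (a0 + m) // d
        period.append(a)
    return a0, period
```

For example, `sqrt_cf(7)` returns `(2, [1, 1, 1, 4])`, i.e. √7 = [2; 1, 1, 1, 4, 1, 1, 1, 4, …], and you can see the palindrome-plus-2a₀ shape directly in the output.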

A few more general observations:

- Every continued fraction representing √n has the form [a₀; P, 2a₀], with the “P, 2a₀” part repeating forever, where P is a possibly empty finite palindromic sequence.
- There is an upper bound on the number of terms; I’m not sure yet, but it seems to grow at something like .

- There can be an odd or even number of terms, but either way the terms are always palindromic (e.g. for ).
- Every term is an integer, and .
- Odd terms of any reduce the overall value of the expression, while the even ones increase it.
- Each consecutive term of has much less of an impact on the overall value than the previous term. If , its change on the overall value could only be equaled as .

- I strongly suspect that any sequence that occurs once occurs an infinite number of times as . For any given , it is possible to derive a quadratic equation yielding its representative values of .
- In the forms I analyzed, every had a discriminant . Those with a positive discriminant appeared to yield only composite numbers (possibly excepting its real roots), while those with negative discriminants seemed to have plenty of primes through arbitrarily large . However, the second-degree coefficient grows drastically in proportion to .

- It remains unclear what determines whether a given is valid for some . For example, is a valid for odd but not for even.

**Okay!** On to the fun speculation!

Another unsolved problem is devising a formula for creating an arbitrarily large prime. The current large-prime record is held by a Mersenne Prime with over twenty million digits. There is at least one extant quasi-formula (not totally closed-form) that is believed to generate only primes, but it is impractical for primes of any size, as it requires a lot of calculation for very little payoff (see Fortunate numbers). It’s unclear where exactly to draw the line between formula and algorithm, as in one sense, we could theoretically generate every last prime with a simple sieve if we had the computing cycles for it.
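For reference, the “simple sieve” is the Sieve of Eratosthenes, which really does produce every prime up to a bound, given the cycles:

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return every prime <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]            # 0 and 1 are not prime
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # cross off every multiple of p, starting at p*p
            is_prime[p * p :: p] = [False] * len(is_prime[p * p :: p])
    return [p for p, flag in enumerate(is_prime) if flag]

from math import isqrt  # noqa: E402  (used above at call time)
```

The catch, of course, is that the work grows with the bound: it enumerates primes, it doesn’t *jump* to an arbitrarily large one, which is the whole problem.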

In the course of investigating all this, I discovered a simple method that appears to generate arbitrarily large primes with no false positives. Before going on, I’ll sketch out the process:

- Begin with and proceed to step through odd integers. (This step could be heavily optimized.)
- Calculate the periodic continued fraction for .
- Keep track of the length of that period, i.e. . If the current period length is greater than or equal to the previous longest period (this check is also unnecessarily strict and can be optimized), update your longest period length and:
- if is even [this is speculative but not yet disproven], or
- if contains the term where … (N.B. this term will only appear once; it will be the pivot/center element of )

- … then is prime!
- this also appears to work for monotonically increasing period length when is even.
- in this case, is guaranteed **not** to appear. (maybe the same goes for the second-highest term also?)

- even and odd lengths have an interesting relationship. at first glance, it appears that the only composite terms appearing in even are those that , while anything with a term ends up in odd . However, this is no longer true as of , followed by and and others nearby. My only guess so far as to why is that and , but I haven’t been able to reason any further than that.
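None of this is proven, but it’s easy to generate the data the bullets above are describing. This sketch (my own tooling, using the standard continued-fraction recurrence and a naive primality check) tabulates each non-square k with its period length and primality, so the even/odd-length patterns can be eyeballed directly:

```python
from math import isqrt

def period_length(n):
    """Length of the periodic part of the continued fraction of sqrt(n)."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        return 0                      # perfect square: no period
    m, d, a, length = 0, 1, a0, 0
    while a != 2 * a0:                # the period always ends at the term 2*a0
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        length += 1
    return length

def is_prime(n):
    """Naive trial division; fine for exploratory ranges."""
    return n > 1 and all(n % p for p in range(2, isqrt(n) + 1))

def survey(limit):
    """(k, period length, prime?) for every non-square k up to limit."""
    return [(k, period_length(k), is_prime(k))
            for k in range(2, limit + 1) if isqrt(k) ** 2 != k]
```

For instance, 19 has the even-length period [2, 1, 3, 1, 2, 8] (length 6) and is prime, while 13 has the odd-length period [1, 1, 1, 1, 6] (length 5); scanning `survey(...)` output is how I’ve been poking at the conjectures above.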

The following issues/questions stand between this and substantive progress.

- Brute force checking through confirms that it holds for that interval, and the data suggest that it will probably continue to hold. However, this would need to be proven.
- Related questions:
- why does it work?
- why do the terms have to be arranged that way?
- why palindromic? clearly these are cancelling terms, with the exception of the center pivot term and the term…
- ah… so if IS an even-length period, that implies odd-length palindrome, implies pivot and term, but crucially!, as the period repeats, it alternates and they ALL “cancel” after full infinite regression… except they don’t really cancel, of course. so what are they doing??
- how can one calculate the relative value between consecutive terms? is that as simple as checking adjoining convergents?

- if not palindromic, you end up with terms outside of the radical.
- discovery! the random is not random after all. if you write the full form as a recursive fraction e.g. for , in order to make it work cleanly, you need to eliminate the leading constant . this can be done by noting that as is, the in the fraction is almost perfect, but it too has that extra constant term, so we get away with ignoring it by replacing the in the fraction with , and then subtract the constant also, yielding

- what’s with the quasi-geometric inverse progression starting from ? (early factors of and )
- it all has to do with fibonacci/golden ratio! the more 1s, the more fib-type numbers will populate, and the longer it will be besides.
- N.B. this gives us very loose bounds on ratios of successive periods… period length times somewhere between all 1s (yielding as a lower bound) to … well, it’s probably more like the sqrt(k), but as a generous upper bound, let’s say k was every integer somehow, yielding … ? maybe? right ballpark even? both near 0.62 or 0.63, right above the highest case I’ve seen (which was 0.61ish)

- are all even-length prime? why?
- what is the cutoff ratio (if any) which guarantees primes? (early guesses: or times maxLength)
- is there a closed-form expression for finding terms of an arbitrary ?
- if not, method is unlikely to be useful for generating appreciably large primes
- however, it is possible that there’d be a way to pick a large number at random, determine the continued fraction, and modify it suitably to find a qualifying prime
*investigate whether Gosper could be applicable here*

- it appears highly effective to take the smallest entry for each of even and odd length periods. try huge primes this way. the only remaining difficulty is really just figuring out how to reach a suitable period length.
- if you take just the odds, excepting 10 and 97, all prime through the first 5000
**if there is some way to calculate just the number of terms for a given , that may be sufficient to do what we want!**

- Another speculation/question: suppose you are given an arithmetical tool which will return the nth term of a PCF but gives you no other information.
- The goal is to determine what the period is as efficiently as possible.
- You know the PCF is the square root of a natural number.
- Although knowing the value of would give you a pretty tight upper bound (oscillating but absolutely converging) which I think goes on the order of , let’s pretend you don’t know that fact or can’t use it.
- You do, however, know and . If you didn’t know that or the period, it would be mathematically impossible to know if you ever truly found a period. Anyway, if you see either of them, you have reached a midpoint or endpoint of a period (or possibly multiples of a period.) From there, it’s straightforward to track down the answer.

- The naive way to do this is simply start from the first value and increment until you hit —or, if it’s an odd-length period, you would (potentially but not necessarily) have to wait for the full to be certain. At any rate, you’re looking at an expected lookup argument between and .
- My question: can this be improved? How?
- Note that all other considerations aside, it should be preferable to reveal those term indices which could probabilistically suggest a reflection, especially when taken with other pairs of terms around the same (?) midpoint.
- Note also that perfect squares have no period.
- If choose :
- k@1 -> k or 2k@2, you win.
- k@2 also win.
- Suppose then.

- Choosing 3 is madness. Choosing 4 seems plausible since it gives you a possible reflection about . But I’ll caution here that you can’t do anything as simple as stick to picking even numbers, as it’s completely possible that… wait, no it’s not. You’re guaranteed still to run into or . Picking only odd numbers would be a little hairier, but maybe still workable in practice up until the very end.
- Moving on, choosing 5 is what I’m hoping is somehow smart. If it turned out primes were good candidates for this, well, that’s why math is great. Let’s think…
- You could have at or , or and odd-length period with at 3 *and* 4 and at 7. The immediate downside to a 5 choice is that it’s co-prime to your previous guesses, so you can’t possibly glean any useful information from that; however, I’m harboring a suspicion here.
- The logic goes that realistically, you need to reveal either a or term (which don’t necessarily immediately solve it, but will almost automatically thereafter). These terms are (if period is even) present with collective frequency or (if period is odd and pivot repeats) . Note the latter happens *substantially* more rarely so far as I’ve seen. These PCFs seem to prefer having the single pivot card by a hefty margin.
- This appears to be because of all the composites with factors of 3, 7, 11, 19, 23 etc. which almost always lead to a single center card and even overall , while you only see primes and a few composites made up completely of 5, 13, 17, 29. (2s seem neutral.) The offset and fortuitous early gaps end up having a large impact on overall distribution (no idea whether it evens out asymptotically, but probably.)
- Back to the point, it seems the goal should therefore be to maximize terms with exposing its double or half as well. Incidentally, I think it would actually be sound just to pick as you’re guaranteed to hit (for even ) or (for odd ). Ooh, but wait… my investigations taught me that while a single center pivot term will often have as its value, this is not so (in fact, I think it’s impossible) for odd , so you won’t know when you hit it. Plus, then you’ll either over- or undershoot by one. I mean, you’ll still eventually catch up with it cause co-prime and cycles and all, but it’ll take forever and a day. And come to think of it, if you keep doubling, perhaps even that’s not guaranteed.
- That does bring to mind the point that it’s a valid strategy to pick any prime number out of the gate and just check every term. You’ll see after … guesses? Whatever, something like that.
- Anyway, this is all very beside-the-point. Staying on mission…


**N.B.**

even/odd | A diff | Notes | |

1 | odd | 2 | |

even | 1 | ||

2 | all | 12 (+8) | only legal form: |

3 | 1,1,1 | ||||||

4 | 1,2,1 | ||||||

2,2,2 | 1,1,2,1,1 | ||||||

2,4,2 | 1,1,4,1,1 | ||||||

3,1,3 | 1,2,1,2,1 | 2,1,2,1,2 | 1,1,1,2,1,1,1 | ||||

3,2,3 | 1,2,2,2,1 | ||||||

3,3,3 | 1,2,3,2,1 | 2,1,6,1,2 | 1,1,1,6,1,1,1 | ||||

3,4,3 | 1,2,4,2,1 | 2,1,1,1,2 | 1,1,1,1,1,1,1 | ||||

3,7,3 | 1,2,7,2,1 (!) | ||||||

4,2,4 | 1,3,2,3,1 | ||||||

5,3,5 | 2,4,4,2 | 6,1,1,6 | 1,4,3,4,1 | 2,2,6,2,2 | 1,1,4,4,1,1 | 1,5,1,1,5,1 | 1,1,2,6,2,1,1 |

5,4,5 | 1,4,4,4,1 | 2,1,1,1,1,1,2 | 1,1,1,1,1,1,1,1,1 | ||||

5,5,5 | 1,4,5,4,1 | ||||||

5,7,5 | 4,3,3,4 | 1,4,7,4,1 | 1,3,3,3,3,1 | 2,1,2,1,1,2,1,2 | 1,1,1,2,1,1,2,1,1,1 | ||

7,4,7 | 1,6,4,6,1 | 2,1,1,3,1,1,2 | 3,1,1,1,1,1,3 | 1,1,1,1,3,1,1,1,1 | 1,2,1,1,1,1,1,2,1 | ||

2,3,5,3,2 | 4,3,1,3,4 | 1,1,3,5,3,1,1 (!) | 1,3,3,1,3,3,1 | 2,1,1,1,1,2,1,1,1,1,2 | 1,1,1,1,1,1,2,1,1,1,1,1,1 | ||

5,1,1,1,1,5 | 1,4,1,1,1,1,4,1 |

So we have factorials, denoted with a “!” suffix, e.g. 3! = 3 × 2 × 1 = 6, or more generally n! = n × (n−1) × ⋯ × 2 × 1

Among many other things, n! represents the number of possible permutations of a set of n unique elements, that is, the number of different ways we can order a group of n things.

We’ve also got 2^n, the size of the “power set” of a set with n elements. If we have a set containing n items, 2^n is the total number of unique subsets that can be formed from it, including the null set. The notion of a power set plays a significant role in transfinite math.

ℵ₀ is the “smallest” of the infinities, representing the cardinality of the countable numbers (e.g. the set of integers ℤ). If we assume the Continuum Hypothesis/Axiom of Choice, the next smallest cardinality is the power set 2^ℵ₀ = ℵ₁, which corresponds to the cardinality of the real set ℝ. Under the generalized Continuum Hypothesis, there is believed to be no cardinality between consecutive power sets of aleph numbers.

And there’s your background. So, I wondered whether 2ⁿ or n! grows faster; it does not take too much thought to realize it’s the former that loses. While 2ⁿ is merely growing by a factor of 2 with each index, n! grows by an ever-increasing factor. In fact, it follows that even for an arbitrarily large constant c, n! still outgrows cⁿ eventually. (The limit also holds for division: cⁿ/n! → 0 as n → ∞.)
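A few lines of Python make the gap concrete (the crossover point depends on the constant, but the factorial always wins eventually):

```python
from math import factorial

def crossover(c):
    """Smallest n with n! > c**n: even huge constants lose to n! eventually."""
    n = 1
    while factorial(n) <= c ** n:
        n += 1
    return n
```

For c = 2 the factorial pulls ahead at n = 4 (24 > 16); for c = 10 you have to wait until n = 25, but it still happens, and past that point the ratio n!/cⁿ blows up without bound.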

But here is where my understanding falters. We’ve seen that n!, in the limit, is infinitely larger than 2ⁿ; I would think it follows that it is therefore a higher cardinality. But when you look at ℵ₀! vs. 2^ℵ₀, some obscure paper I just found (and also Wolfram Alpha) would have me believe they’re one and the same, and consequently both equal to the cardinality of the continuum.

Unfortunately, I can’t articulate exactly why this bothers me. If nothing else, it seems counter-intuitive that on the transfinite scale, permutations and subsets are effectively equivalent in some sense.

…but suddenly I realize I’m being dense. One could make the same mathematical argument for as for insofar as growing faster, and in any case, all of these operations are blatantly bijective with the natural numbers and therefore countable. Aha. Well, if there was anything to any of this, it was that bit about permutations vs. subsets, which seems provocative.

Well, next time, maybe I’ll put forth my interpretation of as a definition of . Whether it is *the* definition, or one of infinitely many differently-shaded definitions encodable in various reals (see ), well, I’m still mulling over that one…

Near as I can figure, the go-to framework for mathematics these days is Zermelo-Fraenkel set theory, with or without the Axiom of Choice (denoted as **ZFC** or **ZF**, respectively). The Axiom of Choice (**AC**) was contentious in its day, but the majority of mathematicians now accept it as a valid tool. It is independent of **ZF** proper, and many useful results require that you either adopt or reject it in your proof, but you’re allowed to do either.

The simplified version of **AC** is that it allows you to take one element from each of any collection of non-empty sets, thereby forming a new set. Where it goes beyond **ZF** is that it sanctions infinitely many such choices at once, even when no rule can be given for making them—something **ZF** alone cannot justify. Imagine you have an infinite number of buckets, each one containing an infinite number of identical marbles; **AC** finds it acceptable for you to posit, in the course of a proof, that you somehow select one marble from each of those buckets into a new infinite pile.
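For the record, the usual formal phrasing is in terms of a choice function—here is a sketch of one standard statement:

```latex
% One standard statement of AC: every family of non-empty sets
% indexed by a set I admits a choice function f picking one
% element from each set.
\forall I\; \forall \{S_i\}_{i \in I}
  \Bigl[ \bigl(\forall i \in I:\ S_i \neq \varnothing\bigr)
  \implies \exists f : I \to \textstyle\bigcup_{i \in I} S_i
  \ \text{such that}\ \forall i \in I:\ f(i) \in S_i \Bigr]
```

When $I$ is finite (or when the $S_i$ come with some distinguishing structure), plain **ZF** can build $f$ itself; it’s the arbitrary infinite case that needs the axiom.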

Unsurprisingly, this means that **AC** is equivalent to a number of other famous results, notably the well-ordering theorem, which counter-intuitively states that any set can be well-ordered given the right binary relation—even e.g. $\mathbb{R}$. Worse yet, it lets you (via the Banach–Tarski paradox) disassemble a tennis ball into a finite number of pieces and reassemble them into something the size of the sun. In other words, **AC** is best regarded as an abstract mathematical tool rather than a reflection of anything in reality.

Now, I could spend forever and a day discussing this sort of stuff, but let’s move along to my thought: what all of this has to do with the Big Bang.

As you all know, the Big Bang ostensibly kicked off our universe some 13.8 billion years ago. However, some of the details are still kind of hazy, especially the further back you look. Past the chronological blinder of the Planck epoch—the first $10^{-43}$ seconds of our universe—there is essentially a giant question mark. All of our physics models and equations break down past that point. Theory says that the Big Bang, at its moment of inception, was an infinitely dense zero-dimensional point, the mother of all singularities. One moment it’s chillin’, the next it explodes, and over time becomes our universe.

I don’t love this theory, but it’s the best fit we’ve got so far for what we’ve observed, particularly with red-shifting galaxies and the cosmic microwave background radiation, so we’ll run with it for the moment. So where did all of these planets and stars actually come from, originally? There is a long and detailed chronology getting from there to here, but let’s concern ourselves with looking at it from an information-theoretic point of view, or rather, ‘How did all of this order and structure come out of an infinitesimal point?’

It’s unclear, but the leading (though disputed) explanation is inflation, referring to an extremely rapid and sizable burst of expansion in the universe’s first moments. There is apparently observational data corroborating this phenomenon, but a lot of the explanation sounds hand-wavy to me, and as though it were made to fit the evidence. The actual large structure of the universe is supposed to have arisen out of scale-invariant quantum fluctuations during this inflation phase, which is a cute notion.

Note, by the way, that entropy was also rapidly increasing during this step. In fact, my gut feeling on the matter is that since entropy is expected to keep increasing until the end of time (maximum entropy), it makes plenty of sense that the Big Bang kernel would have had zero entropy—hell, it’s already breaking all the rules. While thermodynamic and information-theoretic entropy are ostensibly two different properties, they’re one and the same principle underneath. Unless I’m gravely mistaken, zero entropy would translate to complete order, or put another way, absolute uniformity.
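On the information-theoretic side at least, the “zero entropy = absolute uniformity” reading is literal: a message that is all one symbol carries no information. A toy sketch (illustrative only—this is Shannon entropy, not the thermodynamic quantity):

```python
from collections import Counter
from math import log2

def shannon_entropy(msg):
    """Bits per symbol, from the message's empirical symbol frequencies."""
    counts = Counter(msg)
    total = len(msg)
    return sum(-(c / total) * log2(c / total) for c in counts.values())

print(shannon_entropy("00000000"))  # absolute uniformity -> 0.0 bits/symbol
print(shannon_entropy("01101001"))  # even mix of two symbols -> 1.0 bits/symbol
```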

If that was indeed the case, its information content may have been nothing more than one infinitely-charged bit (or bits, if you like); and if *that* were the case, there must have been something between that first nascent moment and the subsequent arrival of complex structure that tore that perfect node of null information asunder. Whether it was indeed quantum fluctuations or some other phenomenon, this is an important point which we will shortly circle back to.

It’s far from settled, but a lot of folks in the know believe the universe to be spatially infinite. Our observable universe is currently limited to something just shy of 93 billion light years across; however, necessary but not sufficient for the Big Bang theory is the cosmological principle, which states that we should expect the universe to be overwhelmingly isotropic and homogeneous, particularly on a large scale (300+ million light years). This is apparently the case, with statistical variance shrinking down to a couple percent or much less, depending on what you’re examining.

That last bit is icing on the cake for us. The real victory took place during whatever moment it was when a uniform singularity became perturbed. (Worth noting that if the uniformity thing is true, it mandates that it really was an outside agency that perturbed it in order to spawn today’s superstructure—which of course makes about as little sense as anything else, what with there presumably being no ‘outside’ from which to act.)

So here’s the punchline. If you assume an infinite universe, that means that the energy at one time trapped featureless in that dimensionless point has since been split apart into an infinitude of pieces. “But it’s still one universe!” you say. Yes, but I think it’s equally valid to recognize that our observable universe is finite, and as such, could be represented by $\aleph_0$ numbers (if discrete), or $2^{\aleph_0}$ (if not), or $2^{2^{\aleph_0}}$ if things are crazier than we know. *Regardless*, it could be described mathematically, as could any of the infinitely many other light cones which probably exist, each cozy in its own observable (but creepily similar) universe.

Likewise, we could view each observable universe as a subset of the original Big Bang kernel, since that is from whence everything came. That kernel must be representable as a set of cardinality equal to or larger than the power set of all observable-universe pockets, and therefore *the act of splitting it into these subsets was a physical demonstration of the Axiom of Choice in reality!*

I’m not sure what exactly that implies, but I find it awfully spooky. I feel like it either means that some things thought impossible are possible, or that this violation happened when the Big Bang kernel was still in its pre-Planck state; but if that’s the case, not only do our physical models break down, our fundamental math would be shown not to apply in that realm either, which means it could have been an anti-logical ineffable maelstrom of quasi-reality which we have no hope of ever reasoning about in a meaningful way.
