
On generating primes with periodic continued fractions


Pardon the mess. I wrote this with a different markup tool and it looks like it doesn't want to play nice with Ghost.

I started writing this a few days ago, and have since ironed out some things which I hope to come back and flesh out. I leave the rest of the post as-is for now, half for my Dear Reader, and half as a note to self.

TL;DR: I'm virtually positive I'm right about most of this, and have come up with several different ways to theoretically exploit periodic continued fractions (PCFs) to generate primes of arbitrary size. Unfortunately, and naturally, PCFs are just as deeply nuanced as most prime-connected stuff, meaning same ol' story, different tune.

If it were possible to cheaply generate large PCFs, it would be a very different story, and I haven't completely ruled out the possibility yet. In fact, it would be sufficient merely to develop an algorithm which could spit out the length of the period of the PCF for a given radical, assuming it ran in reasonable time. What I did find in the literature about it suggests that it's an open problem, but there also wasn't very much literature compared to some of your more mainstream areas of number theory.
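For reference (this sketch is my own, names and all), here's the standard brute-force way to get that period length, via the textbook expansion of %\sqrt{n}%. The catch is exactly the problem above: it runs in time proportional to the period itself, which can grow roughly like %\sqrt{n}%, so it's nowhere near the cheap oracle I'd want.

import math

def cf_period_length(n):
    # Standard continued-fraction expansion of sqrt(n): returns the period length,
    # or 0 if n is a perfect square. It takes O(period) steps, so it is NOT the
    # "cheap" shortcut hoped for above.
    a0 = math.isqrt(n)
    if a0 * a0 == n:
        return 0
    m, d, a = 0, 1, a0
    length = 0
    while a != 2 * a0:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        length += 1
    return length

print([cf_period_length(n) for n in (2, 3, 7, 13, 61)])   # [1, 2, 4, 5, 11]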

If for whatever reason this doesn't load or looks messed up, try reading here (or here!)



Factorials vs. Power Sets

Wanted to jot this down as food for thought before I forgot. And so I did.

So we have factorials, denoted with a %!% suffix, e.g. %4! = 1 \times 2 \times 3 \times 4 = 24%, or more generally \[n! := \prod_{k=1}^{n} k = 1 \cdot 2 \cdot 3 \cdot \ldots \cdot (n-1) \cdot n.\] Among many other things, %n!% represents the number of possible permutations of a set of unique elements, that is, the number of different ways we can order a group of things.

We've also got %2^x%, which counts subsets: if we have a set %\mathbf{S}% containing %|\mathbf{S}|% items, its power set has %2^{|\mathbf{S}|}% elements, i.e. that's the total number of distinct subsets we could pick out of it, including the empty set. The notion of a power set plays a significant role in transfinite math.

%\aleph _ {0}% is the "smallest" of the infinities, representing the cardinality of any countably infinite set (e.g. the integers %\mathbb{Z}%). If we assume the Continuum Hypothesis, the next smallest cardinality is that of the power set, %2^{\aleph _ {0}} = \aleph _ {1}%, which is the cardinality of the reals %\mathbb{R}%. The generalized version, %\mathsf{GCH}%, asserts there is no cardinality strictly between %\aleph _ {k}% and its power set %2^{\aleph _ {k}}% for any %k%.

And there's your background. So, I wondered whether %n!% or %2^n% grows faster; it does not take too much thought to realize it's the former. While %2^n% merely grows by a factor of %2% with each step, %n!% grows by an ever-increasing factor. In fact, even for an arbitrarily large constant %C%, you still end up with %\lim _ {n\rightarrow\infty} n! - C^n = \infty%. (The limit also holds for division: %\lim _ {n\rightarrow\infty} \frac{n!}{2^n} = \infty%.)
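A quick sketch of why, once %n% is large relative to %C%: write the ratio as a product of %n% factors, all but the first few of which exceed %2%, so the whole thing blows up no matter how big %C% is:

\[ \frac{n!}{C^n} \;=\; \prod_{k=1}^{n} \frac{k}{C} \;\geq\; \left( \prod_{k=1}^{\lceil 2C \rceil} \frac{k}{C} \right) \cdot 2^{\,n - \lceil 2C \rceil} \;\longrightarrow\; \infty \quad \text{as } n \to \infty. \]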

But here is where my understanding falters. We've seen that %n!%, in the limit, is infinitely larger than %2^n%; I would think it follows that it is therefore a higher cardinality. But when you look at %2^{\aleph _ {k}}% vs. %\aleph _ {k} !%, some obscure paper I just found (and also Wolfram Alpha) would have me believe they're one and the same, and consequently (under %\mathsf{GCH}%) both equal to %\aleph _ {k+1}%.

Unfortunately, I can't articulate exactly why this bothers me. If nothing else, it seems counter-intuitive that on the transfinite scale, permutations and subsets are effectively equivalent in some sense.

...but suddenly I realize I'm being dense. One could make the same growing-faster argument for %3^n% as for %n!%, and in any case, all of these functions take natural numbers to natural numbers, so everything in sight remains blatantly countable. Aha. Well, if there was anything to any of this, it was that bit about permutations vs. subsets, which seems provocative.

Well, next time, maybe I'll put forth my interpretation of %e% as a definition of %\mathbb{Z}%. Whether it is the definition, or one of infinitely many differently-shaded definitions encodable in various reals (see %\pi%), well, I'm still mulling over that one...

Ideal combinatoric strings

This is something I worked on a year ago, so I'll keep it (relatively) brief.

Inspiration

There's a keypad on my apartment building which accepts 5-digit codes. One day on the way in, I started thinking about how long it would take to guess a working code by brute force. The most obvious answer is that with five digits, you have %10^5 = 100,000% possible codes; since each one of these is 5 digits long, you're looking at %500,000% button pushes to guarantee finding an arbitrary working code.

Realization

But then I realized you could be a little smarter about it. If you assume the keypad only keeps track of the most recent five digits, you can do a rolling approach that cuts down your workload. For example, say I hit 12345 67890. If the keypad works as described, you're not just checking two codes there, you're checking 6: 12345, 23456, 34567, 45678, 56789, 67890.
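A throwaway check of that count, assuming (hypothetically) a keypad that really does just track the last five presses:

s = "1234567890"
print([s[i:i + 5] for i in range(len(s) - 4)])
# ['12345', '23456', '34567', '45678', '56789', '67890']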

Foundation

The next natural question was how many button-pushes this could actually save. After doing some work, I satisfied myself that you could cut it down by a factor of roughly %D%, where %D% is the number of digits in a code. So if you type the right digits in the right order, instead of the %500,000% from before, you're only looking at %100,004 = 10^5 + 5 - 1% presses (the extra 4 wrap up the last few codes).

Experimentation

The next natural question was: how do you actually come up with that string of digits? It has to be perfect, in that it contains every possible 5-digit combination without repeating any of them. As with almost any exploratory problem, the best approach is to simplify as much as possible. For instance, consider a binary code three digits long, which only has %2^3 = 8% different codes: %\{000, 001, 010, 011, 100, 101, 110, 111\}%.

My formula suggested you should be able to hit all eight of those in a %2^3 + 3 - 1 = 10% digit string, and it's easy enough to put one together by a little trial and error: 0 0 0 1 1 1 0 1 (the last two of the ten digits come from wrapping back around to the start). I found it was easiest to treat these strings as cyclical, so the 0 1 at the end wraps around to give you 0 1 0 and 1 0 0. As a bonus, any rotation of this string works just as well.

Perspiration

As I scaled the problem up, however, more and more things became clear. First, above a certain point, you start getting multiple viable optimal strings that are not simple transformations of one another. Second, finding an elegant way to generate these strings was not turning out to be easy.

I found one mechanical way of generating a valid string that worked, but I didn't love it. If you list all the combinations you have to cover, and then slot each combination into a buffer greedily, meaning the earliest spot where it can fit (potentially with overlap), it works out.

E.g.: [image: mechanical approach]

Beautification

At some point I realized the generative process could be viewed as a directed graph: each node represents an N-length code, and its successors are the codes you could roll into next by appending one digit. After a few attempts, I got a pretty clear-looking one down (the node labels 0-7 are standing in for their binary counterparts):

[image: directed graph]

As it turns out, you can start on any node and if you trace a Hamiltonian cycle—touching each vertex only once and ending back at the start—the numbers you hit along the way form a valid optimal string. This approach also scales with the parameters of the problem, but requires a messier or multi-dimensional graph.
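For the curious, here's a sketch of a third route, neither the slotting trick nor the graph walk above: the greedy "prefer-largest" rule (start with N zeros, then keep appending the largest digit that doesn't repeat a window you've already used), which I believe is one of the standard constructions. The function name is mine. For the binary, length-3 case it reproduces the ten-digit version of the string from earlier, and it handles the original keypad parameters too.

def combo_string(k, n):
    # Greedy "prefer-largest" rule: start with n zeros, then repeatedly append the
    # largest digit whose resulting n-digit window hasn't been used yet; stop when
    # no digit gives a new window. Every one of the k**n codes appears exactly once.
    seq = [0] * n
    seen = {tuple(seq)}
    while True:
        tail = seq[-(n - 1):] if n > 1 else []
        for d in range(k - 1, -1, -1):
            window = tuple(tail + [d])
            if window not in seen:
                seen.add(window)
                seq.append(d)
                break
        else:
            return seq

print("".join(map(str, combo_string(2, 3))))   # 0001110100 -- ten digits, all eight codes
print(len(combo_string(10, 5)))                # 100004 -- the keypad case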

Confirmation

Whenever I stumble into open-ended problems like this, I avoid Googling for the answer at first, because what's the fun in that? I was pretty sure this would be a solved problem, though, and after spending a while working this through myself, I looked for and found Wikipedia's page on De Bruijn sequences. As usual, I was beaten to the punch by somebody a hundred years earlier. Hilariously, however, in this case the results matched up better than expected. Check out the Wiki page, notably a) the graph on the right, and b) the first line of the section "Uses".

Notation

If you want to see the wild scratch pad which brought me from point A to point B, by all means, enjoy.

Goldbach as bitfield

I've been trying to visualize the Goldbach conjecture in many different ways: tables and graphs and geometry and anything else that seems plausible. Here's one I nailed down this morning that's concise and readily graspable by any coder.


First, ignore the %2%. It just gets in the way.

And then we get going. Starting with %3%, we'll build up a list of odd numbers and mark which are prime, and I'll show how you can easily check for Goldbach pairs as you go. Each time you add a new number, you also write down the reversal of the bits you have so far; from one step to the next, that reversal effectively slides everything over by one.

It's much easier to understand with a demonstration, starting at %n=6% (so %n/2=3%) and incrementing %n% by %2% each time, since %n% must be even.

n=6   3
prm?  1
rev   1

n=8   3 5
prm?  1 1
rev   1 1

n=10  3 5 7
prm?  1 1 1
rev   1 1 1

n=12  3 5 7 9
prm?  1 1 1 0
rev   0 1 1 1

n=14  3 5 7 9 11
prm?  1 1 1 0 1
rev   1 0 1 1 1

n=16  3 5 7 9 11 13
prm?  1 1 1 0  1  1
rev   1 1 0 1  1  1

n=18  3 5 7 9 11 13 15
prm?  1 1 1 0  1  1  0
rev   0 1 1 0  1  1  1

n=20  3 5 7 9 11 13 15 17
prm?  1 1 1 0  1  1  0  1
rev   1 0 1 1  0  1  1  1

All we're doing with rev is reversing the order of the "prime?" bits. If you watch from one to the next, you can see that this is actually resulting in a kind of ongoing shift each time. Observe:

prm?   1  1  1  0  1  1  0  1  1  0  1  0  0  1
n      3  5  7  9 11 13 15 17 19 21 23 25 27 29
6      1
8      1  1
10     1  1  1
12     0  1  1  1
14     1  0  1  1  1
16     1  1  0  1  1  1
18     0  1  1  0  1  1  1
20     1  0  1  1  0  1  1  1
22     1  1  0  1  1  0  1  1  1
24     0  1  1  0  1  1  0  1  1  1
26     1  0  1  1  0  1  1  0  1  1  1
28     0  1  0  1  1  0  1  1  0  1  1  1
30     0  0  1  0  1  1  0  1  1  0  1  1  1
32     1  0  0  1  0  1  1  0  1  1  0  1  1  1

Notice that all we are doing for each new line is taking the primality bit of the newest odd number (that is, %prm?(n-3)%), sticking it at the beginning of the new line, and shoving all the rest over. Same as above, but easier to see here. Note also that the prime bits read down each column in their regular order.

So?

The thing is, you can check for any and all Goldbach numbers by simply bitwise ANDing two corresponding strings together. (This may actually be an efficient way to code this, should such a need arise.) Take %n=28%:

       3  5  7  9 11 13 15 17 19 21 23 25
prm?   1  1  1  0  1  1  0  1  1  0  1  0
rev    0  1  0  1  1  0  1  1  0  1  1  1
-----------------------------------------
  &    0  1  0  0  1  0  0  1  0  0  1  0

The resulting bits identify the "Goldbach numbers" for %n%: primes %p% for which %n - p% is also prime. In this case we get two pairs:

%5 + 23 = 28%
%11 + 17 = 28%

This process will give you all valid Goldbach numbers for a given %n%.
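For the coders, here's a minimal sketch of that same AND check; the is_prime helper is just naive trial division, standing in for whatever primality test you prefer:

def is_prime(m):
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def goldbach_pairs(n):
    # Primality bits over the odd numbers 3, 5, ..., n-3 (the 2 is ignored, as above).
    odds = list(range(3, n - 2, 2))
    prm = [1 if is_prime(m) else 0 for m in odds]
    rev = prm[::-1]                              # the reversed copy
    both = [p & r for p, r in zip(prm, rev)]     # bitwise AND, column by column
    # A surviving bit at m means m and n - m are both prime; keep each pair once.
    return [(m, n - m) for m, b in zip(odds, both) if b and m <= n - m]

print(goldbach_pairs(28))   # [(5, 23), (11, 17)]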

  prm? 1  1  1  0  1  1  0  1  1  0  1  0  0  1
n      3  5  7  9 11 13 15 17 19 21 23 25 27 29
6     (1)                                            3+3=6
8     (1  1)                                         3+5=8
10    (1 (1) 1)                                      3+7=10   5+5=10
12     0 (1  1) 1                                    5+7=12
14    (1  0 (1) 1  1)                                3+11=14  7+7=14
16    (1 (1  0  1  1) 1)                             3+13=16  5+11=16
18     0 (1 (1  0  1) 1) 1                           5+13=18  7+11=18

...and so on. I added the parentheses to emphasize the pairs, but it's just matching up by working from the outsides to the center. It did make me notice something I should have seen before, which is that if you don't want to bother with all the reversing and stuff, you accomplish the same thing by taking the prime bit string, counting in from both sides, and watching for matches that are both 1. Same exact thing.
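That inward-counting view makes for an even shorter sketch (reusing the is_prime helper from above):

def goldbach_pairs_inward(n):
    # Walk the odd numbers 3 .. n-3 inward from both ends at once; whenever both
    # ends are prime, they sum to n and give a Goldbach pair.
    odds = list(range(3, n - 2, 2))
    i, j = 0, len(odds) - 1
    pairs = []
    while i <= j:
        if is_prime(odds[i]) and is_prime(odds[j]):
            pairs.append((odds[i], odds[j]))
        i += 1
        j -= 1
    return pairs

print(goldbach_pairs_inward(28))   # [(5, 23), (11, 17)]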


The astute observer may point out that I'm not actually doing anything in this write-up, which is true. Especially viewed this most recent way, all we're doing is literally taking the Goldbach Conjecture by its definition, and looking manually for pairs of primes that add up. If there is something to be gained from this approach, I suspect it lies in study of the "sliding" nature of the reversal version, and seeing how it conspires not to leave holes.

This is almost certainly colored by my hunch that Goldbach's undoing lies in a thorough examination of the mechanism driving its avoidance of symmetry: to violate Goldbach, there would have to be an instance of a certain symmetry, which would in turn be representative of a periodic pattern to a degree verboten to primes.

But hey, that's just me.

Lower bound of big-O for sort

This is a quick story about today's instance of discovering something that is already on Wikipedia. I must be up in the triple digits on those at this point. That said, this is one of the more obvious ones.

I was reading a thread about which sorting algorithm could be considered "best," and someone mentioned a couple of algorithms which allegedly run in %O(n \log \log n)% time. This came as a shock, since I'd thought %O(n \log n)% was the brick wall for an honest sort, and I wondered, if these other sorts were for real, whether that would imply the existence of a linear-time sort.

I tried to imagine what the lower bound would be, and figured there must be a minimum number of comparisons required for a given number of elements. It didn't take long to get from there to realizing that %n% integers can be arranged in %n!% different permutations, which I reasoned meant that one must gather enough information (read: make enough comparisons) from the data to uniquely identify which of the %n!% different arrangements the input is actually in.

That, in turn, screams out for a log, specifically %\log_2(n!)%. If the input permutation is truly random, then on average we should expect to be able to identify it from %\log_2(n!)% bits (read: comparisons, again). To be a little more precise, I guess it'd be more like \[ \frac{\lceil n \log_2 (n!) \rceil}{n} . \]

I cheated a little here and plugged %\log_2(n!)% into Wolfram Alpha to see how it grows, and it was clear the dominating factor is, surprise surprise, %n \log n%. As for those mystery %n \log \log n% algorithms, they were tough to track down, and there seemed to be no final consensus on the issue. If nothing else, the math described here means they must operate with some limitation or another on their domain or input, assuming they work at all.
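For the record, the Wolfram Alpha step can be made honest with Stirling's approximation (a standard fact, stated here rather than derived):

\[ \log_2(n!) \;=\; \sum_{k=1}^{n} \log_2 k \;=\; n \log_2 n - \frac{n}{\ln 2} + O(\log n), \]

so the %n \log_2 n% term dominates, and no comparison sort can beat %O(n \log n)% on arbitrary input.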

Later, at the bottom of the Wikipedia page on sorting algorithms, I saw all of this worked out in much the same way, with the weird surprise that once you get to %n=12%, the worst-case number of comparisons required goes up by one beyond the bound, requiring %30% comparisons where we predict %29%. Sadly, the footnote links explaining that aberration were in journals behind paywalls, but the gist seemed to be that it's a bit of an open (or at least highly non-trivial) question.
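And a quick sanity check of the information-theoretic bound itself; the %29% for %n=12% falls right out (this snippet is mine, not from the Wikipedia page):

import math

def comparison_lower_bound(n):
    # A comparison sort must be able to distinguish all n! input orderings,
    # so in the worst case it needs at least ceil(log2(n!)) comparisons.
    return math.ceil(math.log2(math.factorial(n)))

for n in (4, 8, 12, 16):
    print(n, comparison_lower_bound(n))   # 5, 16, 29, 45 respectively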