This week’s been spent investigating properties of continued fractions, specifically the periodic species which arise when dealing with $\sqrt{n}$ where $n \in \mathbf{N}$.

Notation

$$[a_0;\overline{a_1,a_2,a_3,...,a_{i-1},a_i}] := a_0+\cfrac{1}{a_1+\cfrac{1}{a_2+\cfrac{1}{\ddots+\cfrac{1}{a_{i-1}+\cfrac{1}{a_i+\cfrac{1}{a_0+\cfrac{1}{\ddots}}}}}}}$$

The bracketed string on the left expands as shown, with each element $a_j$ included until reaching $i$ and repeating from $a_1$, indefinitely.

Values

The first 25 terms of $\sqrt{n}$ for reference:

$n$ $k := \lfloor\sqrt{n}\rfloor$ ConFrac Derivation form and notes
$1$ $1$ $[1]$ [a] Perfect squares ($\sqrt{n^2}$) are trivial cases.
$2$ $1$ $[1;\overline{2}]$ [b] The form $n^2+1$ is the next simplest; the periodic part is just $2k_n$.
$3$ $1$ $[1;\overline{1,2}]$ [c] Iff $(n-k_n^2) \mid 2k_n$, the periodic part is $\left\{\frac{2k_n}{n-k_n^2}, 2k_n\right\}$.
$4$ $2$ $[2]$ [a]
$5$ $2$ $[2;\overline{4}]$ [b]
$6$ $2$ $[2;\overline{2,4}]$ [c]
$7$ $2$ $[2;\overline{1,1,1,4}]$ [d] $((k_n+1)^2-n) \mid 2(k_n+1) \Leftrightarrow \left\{1,\frac{2(k_n+1)}{(k_n+1)^2-n}-2,1,2k_n\right\}.$
$8$ $2$ $[2;\overline{1,4}]$ [c]
$9$ $3$ $[3]$ [a]
$10$ $\vdots$ $[3;\overline{6}]$ [b]
$11$ $[3;\overline{3,6}]$ [c]
$12$ $[3;\overline{2,6}]$ [c]
$13$ $[3;\overline{1,1,1,1,6}]$
$14$ $[4;\overline{1,2,1,6}]$ [d]
$15$ $[4;\overline{1,6}]$ [c]
$16$ $4$ $[4]$ [a]
$17$ $[4;\overline{8}]$ [b]
$18$ $[4;\overline{4,8}]$ [c]
$19$ $[4;\overline{2,1,3,1,2,8}]$
$20$ $[4;\overline{2,8}]$ [c]
$21$ $[4;\overline{1,1,2,1,1,8}]$
$22$ $[4;\overline{1,2,4,2,1,8}]$
$23$ $[4;\overline{1,3,1,8}]$ [d]
$24$ $[4;\overline{1,8}]$ [c]
$25$ $5$ $[5]$ [a]

Examples

To clarify forms [c] and [d], a couple quick examples should cut through the math:

Form [c]

At each perfect square, you just run through its factors. Take $n=16$ with $k=4$:

$$ 2(4) = 8\\ \star\\ n=17 - 16 = 1 → 8 / 1 = 8 \textrm{(already covered by [b])}\\ \star\\ n=18 - 16 = 2 → 8 / 2 = 4\\ \textrm{which gives us our middle 4 in} [4;\overline{4,8}]\\ \star\\ n=19 - 16 = 3 → 8 / 3, \text{ which are co-prime, so you end up with a nasty } [4;\overline{2,1,3,1,2,8}]\\ \star\\ n=20 - 16 = 4 → 8 / 4 = 2 → [4;\overline{2,8}]\\ ...\\ n=24 - 16 = 8 → 8 / 8 = 1 → [4;\overline{1,8}]. $$

Form [d]

Then you stop at the next $k$ (being $5$) and work backwards:

$$ 2(5)=10\\ \star\\ n=25-24=1 \textrm{(covered by [c])}\\ \star\\ n=25-23=2 → 10 / 2 = 5\\ \textrm{and you subtract the constant 2 from your 5 to get your 3 in} [4;\overline{1,3,1,8}].\\ $$

Likewise, working backwards from $n=16$, you hit $n=14$ where $$2(4)=8 → 8/(16-14)=4 → 4-2=2$$ gives you $[4;\overline{1,2,1,6}]$. And same idea goes for for $n=7$.

Terms

A long-known, famous and straightforward algorithm exists to find the terms of a periodic continued fraction; the process is essentially identical to Euclid’s algorithm, which runs in polynomial time in the worst case. Unfortunately, generating the terms of a periodic continued fraction with a closed-form expression appears to be an open problem. There are patterns—I’ve detailed the simplest ones I’ve found in the notes above.

A few more general observations:

  • Every continued fraction representing $\sqrt{n}$ has the form $[k; \overline{S_{i ∈ I}, 2k}]$, where $S$ is a possibly empty finite palindromic sequence.
    • There is an upper bound on the number of terms; I’m not sure yet, but it seems to grow at something like $O\left(\left|S\right|\right)=C\sqrt{n}\log{n}$.
  • There can be an odd or even number of terms, but either way the terms are always palindromic (e.g. $\left\{2,1,3,1,2\right\}$ for $n=19$).
  • Every term $S_i$ is an integer, and $1 \le S_i \le k$.
  • Odd terms of any $S$ reduce the overall value of the expression, while the even ones increase it.
    • Each consecutive term of $S$ has much less of an impact on the overall value than the previous term. If $S_x=1$, its change on the overall value could only be equalled as $S_{x+1} → ∞$.
  • I strongly suspect that any sequence $S$ that occurs once occurs an infinite number of times as $n → \infty$. For any given $S$, it is possible to derive a quadratic equation $f(t)$ yielding its representative values of $n$.
    • In the forms I analyzed, every $f$ had a discriminant $b^2-4ac ∈ \left\{±1,±4\right\}$. Those with a positive discriminant appeared to yield only composite numbers (possibly excepting its real roots), while those with negative discriminants seemed to have plenty of primes through arbitrarily large $t$. However, the second-degree coefficient grows drastically in proportion to $\left|S\right|$.
  • It remains unclear what determines whether a given $S$ is valid for some $\sqrt{n}$. For example, $\left\{1,x,x,1\right\}$ is a valid $S$ for odd $x$ but not for even.

Prime generation

Okay! On to the fun speculation!

Another unsolved problem is devising a formula for creating an arbitrarily large prime. The current large-prime record is held by a Mersenne Prime with over twenty million digits. There is at least one extant quasi-formula (not totally closed-form) that is believed to generate only primes, but it is impractical for primes of any size, as it requires a lot of calculation for very little payoff (see Fortunate numbers). It’s unclear where exactly to draw the line between formula and algorithm, as in in one sense, we can theoretically generate every last prime with a simple sieve if we had the computing cycles for it.

In the course of investigating all this, I discovered a simple method that appears to generate arbitarily large primes with no false positives. Before going on, I’ll sketch out the process:

  • Begin with $n=3$ and proceed to step through odd integers. (This step could be heavily optimized.)
  • Calculate the periodic continued fraction for $\sqrt{n}$.
  • Keep track of the length of that period, i.e. $\left|S\right|$. If current period length is more than or equal to the previous longest period (this check is also unnecessarily strict, can be optimized), update your longest period length and:
    • if $\left|S\right|$ is even [this is speculative but not yet disproven], or
    • if $S$ contains the term $k$ where $k = \left\lfloor\sqrt{n}\right\rfloor = a_0$ … (N.B. this term will only appear once; it will be the pivot/center element of $S$)
  • … then $n$ is prime!
  • this also appears to work for monotonically increasing period length when $S$ is even.
    • in this case, $k$ is guaranteed not to appear. (maybe the same goes for second-highest term also?)
  • even $S$ and odd $S$ lengths have an interesting relationship. at first glance, it appears that the only composite terms appearing in even $S$ are those that $≡1 \pmod{4}$, while anything with a term $≡3 \pmod{4}$ ends up in odd $S$. However, this is no longer true as of $205=5⋅41$, followed by $221=13⋅17$ and $301=5⋅61$ and others nearby. My only guess so far as to why is that $205=p_4\#-5$ and $221=p_4\#+11$, but I haven’t been able to reaso9944234n any further than that.

The following issues/questions stand between this and substantive progress.

  • Brute force checking through $n=2×10^8$ confirms that it holds for that interval, and the data suggest that it will probably continue to hold. However, this would need to be proven.
  • Related questions:

    • why does it work?
      • why do the terms have to be arranged that way?
      • why palindromic? clearly these are cancelling terms, with the exception of the center pivot term and the $2k$ term…
        • ah… so if IS an even-length period, that implies odd-length palindrome, implies pivot and $2k$ term, but crucially!, as the period repeats, it alternates and they ALL “cancel” after full infinite regression… except they don’t really cancel, of course. so what are they doing??
        • how can one calculate the relative value between consecutive terms? is that as simple as checking adjoining convergents?
      • if not palindromic, you end up with terms outside of the radical.
        • discovery! the random $2k$ is not random after all. if you write the full $[k;S,2k]$ form as a recursive fraction e.g. $x=1+\cfrac{1}{1+\cfrac{1}{2+x}}$ for $\sqrt{3}$, in order to make it work cleanly, you need to eliminate the leading constant $k$. this can be done by noting that as is, the $x$ in the fraction is almost perfect, but it too has that extra constant term, so we get away with ignoring it by replacing the $x$ in the fraction with $x-1$, and then subtract the constant also, yielding $x-1=\cfrac{1}{1+\cfrac{1}{2+(x-1))}} → x=1+\cfrac{1}{1+\cfrac{1}{1+x}}=\sqrt{3}$
      • what’s with the quasi-geometric inverse progression starting from $k$? (early factors of $2/3$ and $1/2$)
        • it all has to do with fibonacci/golden ratio! the more 1s, the more fib-type numbers will populate, and the longer it will be besides.
          • N.B. this gives us very loose bounds on ratios of successive periods… period length times somewhere between all 1s (yielding $φ^p$ as a lower bound) to … well, it’s probably more like the sqrt(k), but as a generous upper bound, let’s say k was every integer somehow, yielding … $k^p$? maybe? right ballpark even? both near 0.62 or 0.63, right above the highest case I’ve seen (which was 0.61ish)
    • are all even-length $S$ prime? why?
    • what is the cutoff ratio (if any) which guarantees primes? (early guesses: $\varphi^{-1}=[\frac{1+\sqrt{5}}{2}]^{-1}$ or $1-1/e$ times maxLength)
    • is there a closed-form expression for finding terms of an arbitrary $n$?
      • if not, method is unlikely to be useful for generating appreciably large primes
      • however, it is possible that there’d be a way to pick a large number at random, determine the continued fraction, and modify it suitably to find a qualifying prime
        • investigate whether Gosper could be applicable here
    • it appears highly effective to take the smallest $k$ entry for each of even and odd length periods. try huge primes this way. the only remaining difficulty is really just figuring out how to reach a suitable period length.
      • if you take just the odds, excepting 10 and 97, all prime through the first 5000
    • if there is some way to calculate just the number of terms for a given $\sqrt{n}$, that may be sufficient to do what we want!

    • Another speculation/question: suppose you are given an arithmetical tool which will return term $n$ of a PCF but gives you no other information.

      • The goal is to determine what the period $d$ is as efficiently as possible.
        • You know the PCF is the square root of a natural number.
        • Although knowing the value of $k$ would give you a pretty tight upper bound (oscillating but absolutely converging) which I think goes on the order of $\sqrt{n}*\log{n}$, let’s pretend you don’t know that fact or can’t use it.
        • You do, however, know $k$ and $2k$. If you didn’t know that or the period, it would be mathematically impossible to know if you ever truly found a period. Anway, if you see either of them, you have reached a midpoint or endpoint of a period (or possibly multiples of a period.) From there, it’s straightforward to track down the answer.
      • The naive way to do this is simply start from the first value and increment until you hit $k$—or, if it’s an odd-length period, you would (potentially but not necessarily) have to wait for the full $2k$ to be certain. At any rate, you’re looking at an expected lookup argument between $d/2$ and $d$.
      • My question: can this be improved? How?
        • Note that all other considerations aside, it should be preferable to reveal those term indices which could probabilistically suggest a reflection, especially when taken with other pairs of terms around the same (?) midpoint.
        • Note also that perfect squares have no period.
          • If choose $n=2$:
            • k@1 -> k or 2k@2, you win.
            • k@2 also win.
            • Suppose $3 \le k \le 8 then.
          • Choosing 3 is madness. Choosing 4 seems plausible since it gives you a possible reflection about $n=3$. But I’ll caution here that you can’t do anything as simple as stick to picking even numbers, as it’s completely possible that… wait, no it’s not. You’re guaranteed still to run into $k$ or $2k$. Picking only odd numbers would be a little hairier, but maybe still workable in practice up until the very end.
          • Moving on, choosing 5 is what I’m hoping is somehow smart. If it turned out primes were good candidates for this, well, that’s why math is great. Let’s think…
            • You could have $(k, 2k)$ at $(3, 6)$ or $(4, 8)$, or and odd-length period with $k$ at 3 and 4 and $2k$ at 7. The immediate downside to a 5 choice is that it’s co-prime to your previous guesses so you can’t possibly glean any useful information from that; however, I’m harboring a suspicion here.
            • The logic goes that realistically, you need to reveal either a $k$ or $2k$ term (which don’t necessarily immediately solve it, but will almost automatically thereafter). These terms are (if period is even) present with collective frequency $2/d$ or (if period is odd and pivot $k$ repeats) $3/d$. Note the latter happens substantially more rarely so far as I’ve seen. These PCFs seem to prefer having the single pivot card by a hefty margin.
              • _This appears to be because of all the composites with factors of 3, 7, 11, 19, 23 etc. which in $n$ almost always lead to single center card and even overall $d$, while you only see primes and a few composites made up completely of 5, 13, 17, 29. (2s seem neutral.) The $n+2$ offset and fortuitous early gaps end up having a large impact on overall distribution (no idea whether it evens out asymptotically, but probably.)
            • Back to the point, it seems the goal should therefore be to maximize terms with exposing its double or half as well. Incidentally, I think it would actually be sound just to pick $n=1,2,4,8,16,...,$ as you’re guaranteed to hit $2k$ (for even $d$) or $k$ (for odd $d$). Ooh, but wait… my investigations taught me that while a single center pivot term will often have $k$ as its value, this is not so (in fact, I think it’s impossible) for odd $d$, so you won’t know when you hit it. Plus, then you’ll either over- or undershoot $2k$ by one. I mean, you’ll still eventually catch up with it cause co-prime and cycles and all, but it’ll take forever and a day. And come to think of it, if you keep doubling, perhaps even that’s not guaranteed.
              • That does bring to mind the point that it’s a valid strategy to pick any prime number out of the gate and just check evey $p^{th}$ term. You’ll see $2k$ after … $\lcm{p,2k} / p$ guesses? Whatever, something like that.
            • Anyway, this is all very beside-the-point. Staying on mission…

Mathematica dumps

*N.B. $∀A:A_i←\sqrt{A_i}$

$ S $ even/odd A diff
1 odd 2
even 1
------- ---------- -------- ----------
2 all 12 (+8) only legal form: $∀n∈\mathbf{N}:{2n, 2n}$
------- ---------- -------- ----------

Equivalencies

$S_α$ $≡S_β$
3 1,1,1
4 1,2,1
2,2,2 1,1,2,1,1
2,4,2 1,1,4,1,1
3,1,3 1,2,1,2,1 2,1,2,1,2 1,1,1,2,1,1,1
3,2,3 1,2,2,2,1
3,3,3 1,2,3,2,1 2,1,6,1,2 1,1,1,6,1,1,1
3,4,3 1,2,4,2,1 2,1,1,1,2 1,1,1,1,1,1,1
3,7,3 1,2,7,2,1 (!)
4,2,4 1,3,2,3,1
5,3,5 2,4,4,2 6,1,1,6 1,4,3,4,1 2,2,6,2,2 1,1,4,4,1,1 1,5,1,1,5,1 1,1,2,6,2,1,1
5,4,5 1,4,4,4,1 2,1,1,1,1,1,2 1,1,1,1,1,1,1,1,1
5,5,5 1,4,5,4,1
5,7,5 4,3,3,4 1,4,7,4,1 1,3,3,3,3,1 2,1,2,1,1,2,1,2 1,1,1,2,1,1,2,1,1,1
7,4,7 1,6,4,6,1 2,1,1,3,1,1,2 3,1,1,1,1,1,3 1,1,1,1,3,1,1,1,1 1,2,1,1,1,1,1,2,1
2,3,5,3,2 4,3,1,3,4 1,1,3,5,3,1,1 (!) 1,3,3,1,3,3,1 2,1,1,1,1,2,1,1,1,1,2 1,1,1,1,1,1,2,1,1,1,1,1,1
5,1,1,1,1,5 1,4,1,1,1,1,4,1