We can easily solve for the first appearance of a number in the 400000 range in less than four seconds:
Prelude Diatomic> firstDiatomic 400000 363490989 (0.03 secs, 26265328 bytes) Prelude Diatomic> map firstDiatomic [400000 .. 400100] [363490989,323659475,580472163,362981813,349334091,355685483,346478235,355707595 ,291165867,346344083,347155797,316314293,576398643,315265835,313171245,355183267 ,315444051,315970205,575509833,311741035,340569429,313223987,565355925,296441165 ,361911645,312104147,557145429,317106853,323637939,324425077,610613547,311579309 ,316037811,311744107,342436533,348992869,313382235,325406123,355818699,312128723 ,347230875,324752171,313178421,312841811,313215645,321754459,576114987,325793195 ,313148763,558545581,355294101,359224397,345462093,307583675,355677549,312120731 ,341404245,316298389,581506779,345401947,312109779,316315061,315987123,313447771 ,361540179,313878107,304788843,325765547,316036275,313731751,355635795,312035947 ,346756533,313873883,349358379,357393763,559244877,313317739,325364139,312128107 ,580201947,358182323,314944173,357403987,584291115,312158827,347448723,363246413 ,315935571,349386085,315929427,312137323,357247725,313207657,320121429,356954923 ,557139285,296392013,576042123,311726765,296408397] (2.45 secs, 3201358192 bytes)
The key to it is the Culkin-Wilf tree.
Starting with fraction 1/1 , it is constructed according to the rule that for a node with a/b share, its left child carries the fraction a/(a+b) , and its right child carries fractions (a+b)/b .
1/1 / \ / \ / \ 1/2 2/1 / \ / \ 1/3 3/2 2/3 3/1
etc .. The diatomic sequence (starting from index 1) is a sequence of numerators of fractions in the Calkin-Wilf tree, when this level passes through the level, each level from left to right.
If we look at the index tree
1 / \ / \ / \ 2 3 / \ / \ 4 5 6 7 / \ 8 9 ...
it is easy to verify that the node index in the index k in the Calkin-Wilf tree by induction carries fractions a[k]/a[k+1] .
This is obviously true for k = 1 ( a[1] = a[2] = 1 ), and from now on
for k = 2*j we have a left child element node with index j ; therefore, the fraction a[j]/(a[j]+a[j+1]) and a[k] = a[j] and a[k+1] = a[j] + a[j+1] are the governing equations of the sequence.
for k = 2*j+1 , we have the correct child node element with index j , so the fraction (a[j]+a[j+1])/a[j+1] and again a[k]/a[k+1] defining equations.
All positive reduced fractions occur exactly once in the Calkin-Wilf tree (on the left as an exercise for the reader), therefore all positive integers are found in a diatomic sequence.
We can find the node in the Calkin-Wilf tree from the index, following the binary representation of the index, from the most significant bit to the minimum, for 1-bit we go to the right child and for 0-bit to the left. (To do this, itβs nice to expand the Calkin-Wilf tree with node 0/1 , whose right child is 1/1 node, so we need a step for the most significant bit of the index set.)
Now this still does not help much to solve the problem.
But first, let's solve the related problem: for the reduced fraction p/q we define its index.
Suppose that p > q . Then we know that p/q is the right child, and its parent element is (pq)/q . If also pq > q , we again have a right child, whose parent is (p - 2*q)/q . Continuing if
p = a*q + b, 1 <= b < q
then we will reach the p/q node from the b/q node by going to the right child a time.
Now we need to find a node whose numerator is less than its denominator. This, of course, is the left child of his parent. Then the parent of b/q is b/(qb) . If a
q = c*b + d, 1 <= d < b
we need to go to the left child c times from node b/d to reach b/q .
And so on.
We can find the path from the root ( 1/1 ) to the p/q node using a continued fraction (here I consider only simple continued fractions) p/q decomposition. Let p > q and
p/q = [a_0, a_1, ..., a_r,1]
continued fractional extension p/q ending in 1 .
- If
r even, go to the right child a_r times, then left a_(r-1) times, then to the right child ... then a_1 times to the left, and finally a_0 times to the right. - If
r is odd, first go to the left child a_r times, then a_(r-1) times to the right ... then a_1 times to the left, and at the end a_0 times to the right.
For p < q , we must end the left movement, so we start the left movement for even r and begin the right movement for odd r .
Thus, we found a close relationship between the binary representation of the index and the continuation of the fractional fraction of the fraction carried by the node along the path from the root to node.
Let the coding of an index of length k be equal
[c_1, c_2, ..., c_j] (all c_i > 0)
i.e. the binary representation of k begins with c_1 units, followed by c_2 zeros, then c_3 , etc., and ends with c_j
- if
k is odd - therefore, j also odd; - zeros, if
k even, then j is even.
Then [c_j, c_(j-1), ..., c_2, c_1] is a continuation of the fractional part a[k]/a[k+1] , the length of which has the same parity as k (each rational has exactly two extended decompositions in fractions, one with an odd length, the other with an even length).
RLE passes the path from 0/1 node above 1/1 to a[k]/a[k+1] . Path length
- the number of bits required to represent
k , and - the sum of partial fractions in the decomposition of continued fractions.
Now, to find the index of the first occurrence n > 0 in the diatomic sequence, we first notice that the smallest index must be odd, since a[k] = a[k/2] for even k . Let the smallest index k = 2*j+1 . Then
- the length of RLE
k odd, - fraction in node with index
k is a[2*j+1]/a[2*j+2] = (a[j] + a[j+1])/a[j+1] , therefore, it is correct child.
Thus, the smallest index k with a[k] = n corresponds to the very last end of all shortest paths to node with numerator n .
The shortest paths correspond to decompositions of the continuation of fractions n/m , where 0 < m <= n is reciprocal to n [fraction should be reduced] with the smallest sum of partial fractions.
How long should we expect? Given the continuous fraction p/q = [a_0, a_1, ..., a_r] with a_0 > 0 and sum
s = a_0 + ... + a_r
the numerator p bounded by F(s+1) and the denominator q on F(s) , where F(j) is the number j -th Fibonacci. The boundaries are sharp, for a_0 = a_1 = ... = a_r = 1 fraction is F(s+1)/F(s) .
So, if F(t) < n <= F(t+1) , the sum of the partial relations of the decomposition of the continued fraction (either of the two) is >= t . Often there exists m such that the sum of the partial decomposition ratios along the fractions n/m is exactly t , but not always:
F(5) = 5 < 6 <= F(6) = 8
and the continued decompositions of fractions of two reduced fractions 6/m with 0 < m <= 6 are equal
6/1 = [6] (alternatively [5,1]) 6/5 = [1,4,1] (alternatively [1,5])
with the sum of the partial coefficients 6. However, the smallest possible sum of the partial coefficients is never much larger (the largest of which I know is t+2 ).
The ongoing decompositions of the fractions n/m and n/(nm) closely related. Suppose that m < n/2 , and let
n/m = [a_0, a_1, ..., a_r]
Then a_0 >= 2 ,
(nm)/m = [a_0 - 1, a_1, ..., a_r]
and since
n/(nm) = 1 + m/(nm) = 1 + 1/((nm)/m)
the continued fractional extension n/(nm) is
n/(nm) = [1, a_0 - 1, a_1, ..., a_r]
In particular, the sum of the partial relationships is the same for both.
Unfortunately, I do not know how to find m with the smallest sum of partial coefficients without brute force, therefore the algorithm (suppose n > 2
for 0 < m < n/2 coprime to n , find the continuation of the fractional fraction n/m , collecting units with the smallest sum of partial fractions (the usual algorithm creates expansions whose last partial coefficient is > 1 , suppose that).
Correct the found disparate decompositions of the fractions [the number of which is small] as follows:
- if CF
[a_0, a_1, ..., a_r] has an even length, convert it to [a_0, a_1, ..., a_(r-1), a_r - 1, 1] - otherwise use
[1, a_0 - 1, a_1, ..., a_(r-1), a_r - 1, 1]
(which selects a value between n/m and n/(nm) leading to a lower index)
undo continued fractions to get encodings of the execution length of the corresponding indices
Choose the smallest among them.
In step 1, it is useful to use the smallest amount found so far for a short cut.
Code (Haskell, as this is easiest):
module Diatomic (diatomic, firstDiatomic, fuscs) where import Data.List strip :: Int -> Int -> Int strip p = go where go n = case n `quotRem` p of (q,r) | r == 0 -> go q | otherwise -> n primeFactors :: Int -> [Int] primeFactors n | n < 1 = error "primeFactors: non-positive argument" | n == 1 = [] | n `rem` 2 == 0 = 2 : go (strip 2 (n `quot` 2)) 3 | otherwise = go n 3 where go 1 _ = [] go mp | m < p*p = [m] | r == 0 = p : go (strip pq) (p+2) | otherwise = go m (p+2) where (q,r) = m `quotRem` p contFracLim :: Int -> Int -> Int -> Maybe [Int] contFracLim = go [] where go acc lim nd = case n `quotRem` d of (a,b) | lim < a -> Nothing | b == 0 -> Just (a:acc) | otherwise -> go (a:acc) (lim - a) db fixUpCF :: [Int] -> [Int] fixUpCF [a] | a < 3 = [a] | otherwise = [1,a-2,1] fixUpCF xs | even (length xs) = case xs of (1:_) -> fixEnd xs (a:bs) -> 1 : (a-1) : bs | otherwise = case xs of (1:_) -> xs (a:bs) -> 1 : fixEnd ((a-1):bs) fixEnd :: [Int] -> [Int] fixEnd [a,1] = [a+1] fixEnd [a] = [a-1,1] fixEnd (a:bs) = a : fixEnd bs fixEnd _ = error "Shouldn't have called fixEnd with an empty list" cfCompare :: [Int] -> [Int] -> Ordering cfCompare (a:bs) (c:ds) = case compare ac of EQ -> cfCompare ds bs cp -> cp fibs :: [Integer] fibs = 0 : 1 : zipWith (+) fibs (tail fibs) toNumber :: [Int] -> Integer toNumber = foldl' ((+) . (*2)) 0 . concat . (flip (zipWith replicate) $ cycle [1,0]) fuscs :: Integer -> (Integer, Integer) fuscs 0 = (0,1) fuscs 1 = (1,1) fuscs n = case n `quotRem` 2 of (q,r) -> let (a,b) = fuscs q in if r == 0 then (a,a+b) else (a+b,b) diatomic :: Integer -> Integer diatomic = fst . fuscs firstDiatomic :: Int -> Integer firstDiatomic n | n < 0 = error "Diatomic sequence has no negative terms" | n < 2 = fromIntegral n | n == 2 = 3 | otherwise = toNumber $ bestCF n bestCF :: Int -> [Int] bestCF n = check [] estimate start where pfs = primeFactors n (step,ops) = case pfs of (2:xs) -> (2,xs) _ -> (1,pfs) start0 = (n-1) `quot` 2 start | even n && even start0 = start0 - 1 | otherwise = start0 eligible k = all ((/= 0) . (k `rem`)) ops estimate = length (takeWhile (<= fromIntegral n) fibs) + 2 check candidates lim k | k < 1 || n `quot` k >= lim = if null candidates then check [] (2*lim) start else minimumBy cfCompare candidates | eligible k = case contFracLim lim nk of Nothing -> check candidates lim (k-step) Just cf -> let s = sum cf in if s < lim then check [fixUpCF cf] s (k - step) else check (fixUpCF cf : candidates) lim (k-step) | otherwise = check candidates lim (k-step)