Base conversion as a stream operation

Is there a way to do arbitrary-size, arbitrary-base conversions in constant working space? That is, to convert a sequence of n numbers in the range [1,m] into a sequence of ceiling(n*log(m)/log(p)) numbers in the range [1,p], using a 1-to-1 mapping that (preferably, but not necessarily) preserves lexicographic order and gives sequential results?

I am particularly interested in solutions that are viable as a pipe function, i.e. that can process a data set larger than can be stored in RAM.

I have found a number of solutions that require "working space" proportional to the size of the input, but so far none that can get away with constant "working space".


Does dropping the sequential constraint make any difference? That is, allowing lexicographically sequential inputs to produce non-lexicographically-sequential outputs:

 F(1,2,6,4,3,7,8) -> (5,6,3,2,1,3,5,2,4,3)
 F(1,2,6,4,3,7,9) -> (5,6,3,2,1,3,5,2,4,5)

Some thoughts:

Could this work?

streamBase n -> convert(n, lcm(n,p)) -> convert(lcm(n,p), p) -> streamBase p

(where lcm is the least common multiple)

+7
math algorithm complexity-theory
3 answers

I do not think this is possible in the general case. If m is a power of p (or vice versa), or if they are both powers of a common base, you can do it, since each group of log_m(p) digits is then independent. In the general case, however, suppose you are converting the number a_1 a_2 a_3 ... a_n. Its value, which must then be written out in base p, is

sum(a_i * m^(i-1) for i in 1..n)

Once we have processed the first i digits, we have the i-th partial sum. To compute the (i+1)-th partial sum, we need to add a_(i+1) * m^i. In the general case, this number has nonzero digits in most positions, so we would have to modify all the digits we have produced so far. In other words, we have to process all the input digits before we know what the final output digits will be.

In the special case where m and p are both powers of a common base, or equivalently where log_m(p) is a rational number, m^i has only a few nonzero digits in base p, near the front, so we can safely output most of the digits computed so far.
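To make the easy case concrete, here is a minimal Python sketch of the simplest version of it, where the input base is an exact power p^k of the output base. The function name `expand_digits` and its parameters are my own, not from the answer, and digits are assumed to arrive most significant first. Each input digit turns into exactly k output digits, so the working space does not depend on the input length.

```python
from typing import Iterable, Iterator


def expand_digits(digits: Iterable[int], p: int, k: int) -> Iterator[int]:
    """Stream base-(p**k) digits as base-p digits, k outputs per input.

    Hypothetical helper for illustration: it only covers the case where
    the input base is an exact power of the output base.
    """
    m = p ** k
    for d in digits:
        if not 0 <= d < m:
            raise ValueError(f"digit {d} is out of range for base {m}")
        # Peel off the k base-p digits of d, least significant first...
        group = []
        for _ in range(k):
            d, r = divmod(d, p)
            group.append(r)
        # ...and emit them most significant first.
        yield from reversed(group)


if __name__ == "__main__":
    # Hex digits 1, 9, 2, F, E expanded to bits, four bits per digit.
    print(list(expand_digits([1, 9, 2, 15, 14], 2, 4)))
```

Because no carry ever crosses a group boundary here, each group can be written out as soon as it is read, which is exactly what fails when the bases are unrelated.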

+6

I think there is a way to do radix conversion in a streaming fashion, in lexicographic order. However, what I have come up with is not enough to actually do it, and it relies on a couple of assumptions:

  • The length of the positional numbers is already known.
  • The numbers described are integers. I have not considered what happens with the math for negative indices.

We have a sequence of values a of length p, where each value is in the range [0, m-1]. We want a sequence of values b of length q, each in the range [0, n-1]. We can work out the k-th digit of the output sequence b from a as follows:

b_k = floor[sum(a_i * m^i for i in 0 to p-1) / n^k] mod n

Let's rearrange that sum into two parts, splitting it at an arbitrary point z:

b_k = floor[(sum(a_i * m^i for i in z to p-1) + sum(a_i * m^i for i in 0 to z-1)) / n^k] mod n

Suppose that we do not yet know the values of a_i for i in [0, z-1], so we cannot compute the second term of the sum. We are left having to deal with ranges, but that still gives us information about b_k.

The minimum value of b_k is bounded by:

b_k >= floor[sum(a_i * m^i for i in z to p-1) / n^k] mod n

and the maximum value of b_k by:

b_k <= floor[(sum(a_i * m^i for i in z to p-1) + m^z - 1) / n^k] mod n

We should be able to carry out the process like this:

  1. Initialize z to p. We will count down from p as we receive each character of a.
  2. Initialize k to the index of the most significant value in b. If my brain is still working, that is ceil[log_n(m^p)].
  3. Read a value of a. Decrement z.
  4. Compute the min and max values for b_k.
  5. If the min and max are the same, output b_k and decrement k. Go to 4. (It may be that we already have enough information for several consecutive values of b_k.)
  6. If z != 0, then we expect more values of a. Go to 3.
  7. Hopefully, at this point, we are done.

I have not yet considered how to compute the range values efficiently, but I am reasonably confident that the sum over the incoming characters of a can be computed far more cheaply than by storing all of a. Without doing the math, though, I will not make any hard claims about it!
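The steps above can be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the name `stream_convert` is hypothetical, input digits arrive most significant first, output digits are indexed from 0 so the top index is q-1, and the bounds are compared before the mod n is applied (equal quotients pin the digit down, whereas equal remainders alone would not). It also leans on Python's arbitrary-precision integers for the running sum, so it does not achieve constant working space; it only shows when each output digit becomes known.

```python
from typing import Iterable, Iterator


def stream_convert(a: Iterable[int], p: int, m: int, n: int) -> Iterator[int]:
    """Yield base-n digits (most significant first) of a p-digit base-m number.

    Hypothetical sketch: `a` supplies the digits a_(p-1), ..., a_0 in that order.
    """
    # Smallest q with n**q >= m**p, i.e. the number of output digits.
    total = m ** p
    q = 1
    while n ** q < total:
        q += 1
    k = q - 1      # index of the next output digit to emit
    z = p          # digits a_z .. a_(p-1) have been read so far
    high = 0       # sum(a_i * m^i for i in z to p-1)
    for d in a:
        z -= 1
        high += d * m ** z
        slack = m ** z - 1   # largest possible value of the unread digits
        while k >= 0:
            lo = high // n ** k
            hi = (high + slack) // n ** k
            if lo != hi:
                break        # b_k still depends on digits not yet read
            yield lo % n
            k -= 1


if __name__ == "__main__":
    # Decimal 255 to binary, padded to 10 output digits.
    print(list(stream_convert([2, 5, 5], 3, 10, 2)))
```

For the decimal 255 example, only the leading zero can be emitted after the first input digit, and everything else has to wait for the last one. With hex input and binary output, each input digit releases four output digits immediately.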

+2

Yes, it is possible.

For every I character(s) you read in, you write out O character(s), where O is based on ceiling(Length * log(In) / log(Out)).

Allocate enough space for the output digits, then:

 Set x to 1
 Loop over digits from end to beginning
     # Horner's method
     Set a to x * digit
     Set t to O - 1
     Loop while a > 0 and t >= 0
         Set a to a + out digit at position t
         Set out digit at position t to a mod to base
         Set a to a / to base
         Set t to t - 1
     Set x to x * from base
 Return converted digit(s)

Thus, for base 16 to base 2 (which is easy), using "192FE", we read '1' and convert it, then repeat for '9', then '2', and so on, giving us '0001', '1001', '0010', '1111' and '1110'. Note that for bases that are not common powers, such as base 17 to base 2, this means reading 1 character and writing 5.
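Read one input character at a time, the idea can be sketched in Python as below. The names `out_width`, `encode_stream` and the `DIGITS` alphabet are my own assumptions for illustration. Note that when the bases are not common powers, this gives a fixed-width encoding of each input digit (still a 1-to-1 mapping) rather than the numeral of the same integer in the new base.

```python
from typing import Iterator

# Assumed digit alphabet, good for bases up to 36.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def out_width(from_base: int, to_base: int) -> int:
    """Smallest O with to_base**O >= from_base: ceiling(log(In) / log(Out))."""
    width = 1
    while to_base ** width < from_base:
        width += 1
    return width


def encode_stream(text: str, from_base: int, to_base: int) -> Iterator[str]:
    """Yield one fixed-width group of output digits per input character."""
    width = out_width(from_base, to_base)
    for ch in text:
        value = DIGITS.index(ch.upper())
        if value >= from_base:
            raise ValueError(f"{ch!r} is not a base-{from_base} digit")
        out = [0] * width
        t = width - 1
        # Fill the group from the least significant position backwards.
        while value > 0 and t >= 0:
            value, r = divmod(value, to_base)
            out[t] = r
            t -= 1
        yield "".join(DIGITS[d] for d in out)


if __name__ == "__main__":
    print(list(encode_stream("192FE", 16, 2)))
    print(out_width(17, 2))
```

The working space here is one group of O digits, independent of the input length, which is what makes it usable in a pipe.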

0
