How can a sequence of numbers be converted to a single number?

I want to convert a sequence of numbers into a single number that preserves both the individual values and their positions. For example, suppose the following sequence is given:

1,6,7,8,9,45,67

If I apply simple addition, i.e. 1 + 6 + 7 + 8 + 9 + 45 + 67, a single number is produced, but this doesn't work: we cannot recover the individual numbers and their order [i.e. 1,6,7,8,9, ...] from it.

Is there any way to achieve this without ambiguity [i.e. only one unique sequence of numbers can be extracted from the number]? Is there a mathematical function that would be useful for recovering the individual elements from this number?

+6
8 answers

You can treat the sequence as the digits of a base-N number, where N is greater than the largest value that will appear in your input sequence.
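As a rough illustration of the base-N idea, here is a minimal Python sketch. The function names are made up for this example, and it assumes every element is strictly between 0 and N: a leading 0 "digit" would be lost, so sequences containing zeros would also need their length stored somewhere.

```python
def encode_base_n(seq, base):
    """Treat seq as the digits of one base-`base` number (most significant first)."""
    value = 0
    for digit in seq:
        # Assumption of this sketch: digits are in 1..base-1, so no leading zeros.
        if not 0 < digit < base:
            raise ValueError("each element must satisfy 0 < element < base")
        value = value * base + digit
    return value


def decode_base_n(value, base):
    """Recover the digits by repeated division by the base."""
    seq = []
    while value > 0:
        value, digit = divmod(value, base)
        seq.append(digit)
    return seq[::-1]


number = encode_base_n([1, 6, 7, 8, 9, 45, 67], 68)
print(number, decode_base_n(number, 68))
```

For the example sequence the largest value is 67, so N = 68 is the smallest base that works.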

UPDATE

Based on various comments, I would like to propose an alternative solution that may be easier to implement. You can treat the sequence as a UTF-8 encoded string and use Huffman coding with a custom dictionary to get a compact representation.

A custom dictionary lets you store very common characters in a very small number of bits. For example, the sequence separator "," and the individual characters "0" .. "9" can be stored in as few as 3 bits each, and other numbers that you expect to be statistically likely can also be stored in a short sequence of bits: if you find that "42" is common, you can store "42" in just a few bits.

If you assign special codes only to "," and "0" .. "9", you will average less than 4 bits per character of the input string while still keeping the comma that separates the members of the sequence. Finding common multi-character substrings and adding them to the dictionary will only improve on this.

Using a custom dictionary also means that you do not need to store the dictionary in the header of the compressed data, since it is already known to you.
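To make the "less than 4 bits per character" point concrete, here is a small Python sketch with a fixed prefix code for the 11 symbols. The code table is hand-picked for illustration (a real Huffman table would be built from your symbol frequencies), and the leading 1 bit used to protect leading zero bits is an addition of this sketch, not part of the scheme described above. It assumes non-negative integers.

```python
# Hypothetical fixed prefix code: five 3-bit codes and six 4-bit codes.
# No code is a prefix of another, so the bit stream decodes unambiguously.
CODES = {
    ",": "000", "0": "001", "1": "010", "2": "011", "3": "100",
    "4": "1010", "5": "1011", "6": "1100", "7": "1101",
    "8": "1110", "9": "1111",
}
SYMBOLS = {bits: ch for ch, bits in CODES.items()}


def encode(seq):
    """Encode non-negative integers as one integer via the prefix code."""
    text = ",".join(str(n) for n in seq)
    bits = "".join(CODES[ch] for ch in text)
    # Leading 1 keeps any leading zero bits when converting to an integer.
    return int("1" + bits, 2)


def decode(number):
    """Invert encode(): strip the sentinel bit, then read codes greedily."""
    bits = bin(number)[3:]  # drop '0b' and the sentinel bit
    chars, current = [], ""
    for bit in bits:
        current += bit
        if current in SYMBOLS:
            chars.append(SYMBOLS[current])
            current = ""
    return [int(part) for part in "".join(chars).split(",")]


number = encode([1, 6, 7, 8, 9, 45, 67])
print(number.bit_length(), decode(number))
```

Which symbols get the 3-bit codes is a design choice; the shortest codes should go to whatever is most frequent in your data.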

I did something like this using SharpZipLib

http://www.icsharpcode.net/opensource/sharpziplib/

http://community.sharpdevelop.net/forums/p/8255/23219.aspx

It is also easy to do with zlib; see "Small data compression".
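For the zlib route, a minimal Python sketch might look like the following. Note that this uses DEFLATE with a preset dictionary (the `zdict` argument of Python's `zlib` module), which is related to, but not the same as, a hand-built Huffman table. The dictionary contents and function names here are placeholders, and the same dictionary must be supplied on both sides.

```python
import zlib

# Hypothetical preset dictionary: substrings expected to be common in the input.
PRESET_DICT = b"0123456789,42,"


def compress_sequence(seq):
    """Serialize as comma-separated text and deflate it with the preset dictionary."""
    text = ",".join(str(n) for n in seq).encode("ascii")
    compressor = zlib.compressobj(level=9, zdict=PRESET_DICT)
    return compressor.compress(text) + compressor.flush()


def decompress_sequence(blob):
    """Inflate with the same preset dictionary and parse the numbers back."""
    decompressor = zlib.decompressobj(zdict=PRESET_DICT)
    text = decompressor.decompress(blob) + decompressor.flush()
    return [int(part) for part in text.decode("ascii").split(",")]


blob = compress_sequence([1, 6, 7, 8, 9, 45, 67])
print(len(blob), decompress_sequence(blob))
```

For inputs this short the zlib framing overhead can outweigh the savings, so it is worth measuring on realistic data.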

+9

Mathematically, this can be done for finite sequences, but it is not very practical, because the required numbers grow very quickly: there are 67^7 (about 2^42) different length-7 sequences of integers from 1 to 67, not to mention longer sequences and larger integers.

For a simple example of such a function, map the sequence [1,6,7,8,9,45,67] to the value 2^1 * 3^6 * 5^7 * 7^8 * 11^9 * 13^45 * 17^67. The bases are the primes, and the exponents are the elements of the sequence.

The inverse mapping is computed by division: the number of times you can divide your value by 2 is the first element of the sequence, and so on. The largest prime factor of the value tells you how long the sequence is.

If you want to allow 0 in the sequence as well as positive numbers, add 1 to every element when you raise the primes to powers. Or use the power of 2 to store the length of the sequence, then start encoding the elements at 3.
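The prime-power mapping and its inverse can be sketched in Python, whose integers are arbitrary precision. This is only an illustration under the assumption that all elements are positive (the plain variant above, without the +1 adjustment); the helper names are invented for the example.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for short sequences)."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found):
            found.append(candidate)
            yield candidate
        candidate += 1


def godel_encode(seq):
    """Multiply p_i ** seq[i] over the first len(seq) primes."""
    value = 1
    for prime, exponent in zip(primes(), seq):
        value *= prime ** exponent
    return value


def godel_decode(value):
    """Count how often each successive prime divides the value."""
    seq = []
    for prime in primes():
        if value == 1:
            break
        exponent = 0
        while value % prime == 0:
            value //= prime
            exponent += 1
        seq.append(exponent)
    return seq


number = godel_encode([1, 6, 7, 8, 9, 45, 67])
print(number.bit_length(), godel_decode(number))
```

Printing the bit length shows how quickly the encoded value grows compared with the input.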

Gödel used such encodings in the proof of his incompleteness theorems.

As Kendall Frey says, it is impossible to define a function that maps every infinite sequence of integers to a distinct integer. This is a consequence of Cantor's proof that the power set of the natural numbers is uncountable: you cannot even inject all infinite sequences of elements of {true, false} into the integers, let alone all infinite sequences of integers.

For more practical approaches, think of encoding the sequence of integers as a sequence of bytes rather than as a number. A finite sequence of bytes can easily be regarded as a binary value, so it is a number; you just do not use it as one. A common representation of your example sequence is the sequence of bytes [1,6,7,8,9,45,67], as used for example in JSON. This is a 136-bit number. The mathematical function for the inverse mapping involves arithmetic modulo powers of 256, subtracting 48, multiplying by 10, and so on. :-)
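As a quick sketch of the "bytes are already a number" view, the JSON text can be converted to an integer and back in Python. The function names are illustrative, and the compact separators are chosen only so that the byte string matches the 17-byte form quoted above.

```python
import json


def encode(seq):
    """Serialize to compact JSON and read the bytes as one big-endian integer."""
    data = json.dumps(seq, separators=(",", ":")).encode("ascii")
    return int.from_bytes(data, "big")


def decode(number):
    """Turn the integer back into bytes and parse the JSON text."""
    length = (number.bit_length() + 7) // 8
    data = number.to_bytes(length, "big")
    return json.loads(data.decode("ascii"))


number = encode([1, 6, 7, 8, 9, 45, 67])
print(number, decode(number))
```

The first byte is always "[", which is non-zero, so no leading bytes are lost in the round trip.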

+4

Let's say your sequence is called s, and define len(n) as the number of digits in n.

Then the first digit of the result is len(s[0]), and the next len(s[0]) digits are the number s[0] itself; then you append len(s[1]) and s[1], and so on.

This works for numbers with no more than 9 digits.
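A minimal Python sketch of this length-prefixed scheme, assuming non-negative integers of at most 9 decimal digits; the handling of the empty sequence (encoded as 0) is a choice made for this example only.

```python
def encode(seq):
    """Concatenate <digit count><digits> for every element."""
    digits = ""
    for n in seq:
        text = str(n)
        if n < 0 or len(text) > 9:
            raise ValueError("elements must be non-negative with at most 9 digits")
        digits += str(len(text)) + text
    # The first digit is a length (1..9), so no leading zero is ever lost.
    return int(digits or "0")


def decode(number):
    """Read a length digit, then that many digits, until the input is used up."""
    if number == 0:
        return []
    digits = str(number)
    seq, pos = [], 0
    while pos < len(digits):
        width = int(digits[pos])
        seq.append(int(digits[pos + 1:pos + 1 + width]))
        pos += 1 + width
    return seq


print(encode([1, 6, 7, 8, 9, 45, 67]))  # 1116171819245267
print(decode(1116171819245267))
```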

+2

You cannot if the range of your numbers is infinite.

The power set of the natural numbers is uncountable. This means that you cannot provide a one-to-one mapping between sets of numbers and numbers.

What you could do, if your numbers are limited to, say, 32 bits, is concatenate them into one long binary number and store it as a sequence of bytes, perhaps as a BigNum.
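A sketch of the fixed-width idea in Python, whose built-in integers already behave like a BigNum. The leading 1 marker is an assumption added here so that leading zero elements are not lost; it is not part of the answer above.

```python
WIDTH = 32  # bits per element, as in the 32-bit example


def pack(seq):
    """Concatenate fixed-width fields into one integer, after a leading 1 marker."""
    value = 1  # marker bit (assumption of this sketch) preserves leading zeros
    for n in seq:
        if not 0 <= n < (1 << WIDTH):
            raise ValueError("element does not fit in WIDTH bits")
        value = (value << WIDTH) | n
    return value


def unpack(value):
    """Peel off WIDTH-bit fields from the low end until only the marker remains."""
    mask = (1 << WIDTH) - 1
    seq = []
    while value > 1:
        seq.append(value & mask)
        value >>= WIDTH
    return seq[::-1]


packed = pack([1, 6, 7, 8, 9, 45, 67])
data = packed.to_bytes((packed.bit_length() + 7) // 8, "big")
print(len(data), unpack(int.from_bytes(data, "big")))
```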

+1

Here is a basic PHP implementation of the Gödel numbering described above by Steve Jessop:

    <?php
    // Sequence to encode; each element v is stored as the exponent v+1 of the i-th prime.
    $sec = array(5, 9, 8, 4);
    $n = count($sec);
    $max = max($sec);

    $enc = encode($sec);
    $dec = decode($enc, $n, $max);

    echo "Input sequence: " . implode(",", $sec) . "\n";
    echo "Output sequence: " . implode(",", $dec) . "\n";
    echo "Godel number: " . $enc;
    // Large values overflow into floats, which makes the decoding unreliable.
    echo (PHP_INT_MAX / $enc < 20 ? " - too big to decode.\n" : "\n");

    function encode($sec) {
        $primes = array(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53);
        $enc = 1;
        $i = 0;
        foreach ($sec as $v) {
            // Exponent is v+1 so that a 0 in the sequence is still representable.
            $enc = $enc * pow($primes[$i], $v + 1);
            $i++;
        }
        return $enc;
    }

    function decode($enc, $n, $max) {
        $primes = array(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53);
        $sec = array();
        for ($i = 0; $i < $n; $i++) {
            // Find the first power of the i-th prime that no longer divides $enc.
            for ($v = 2; $v <= $max + 1; $v++) {
                if ($enc / pow($primes[$i], $v) != round($enc / pow($primes[$i], $v))) {
                    break;
                }
            }
            $sec[] = $v - 2;
        }
        return $sec;
    }
    ?>

This script only works for small numbers in short sequences. It cannot handle very large numbers, and I assume the same is true of any other implementation. It is simpler and more efficient to store the numbers by concatenating them as they are: [01,05,15,17] → 1051517.

+1

Updated to check case 0.1.

Separate the different numbers with 001.

To avoid confusion with 00 inside your numbers, every time a 0 appears in one of your numbers, replace it with 01.

To decode, split on 001, then replace every 01 with 0.
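Here is a small Python sketch of the separator scheme. It works on decimal digit strings rather than on an actual integer, because a sequence whose first element is 0 would start with a leading 0 that an integer type would drop. It assumes a non-empty sequence of non-negative integers; the names are illustrative.

```python
SEPARATOR = "001"


def encode(seq):
    """Escape every 0 as 01, then join the elements with the 001 separator."""
    return SEPARATOR.join(str(n).replace("0", "01") for n in seq)


def decode(digits):
    """Split on the separator, then undo the 0 -> 01 escaping."""
    return [int(part.replace("01", "0")) for part in digits.split(SEPARATOR)]


encoded = encode([10, 0, 1001])
print(encoded, decode(encoded))
```

After escaping, no element contains 00, so the first 001 found when scanning from the left is always a real separator.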

0

Use a universal code such as Elias omega coding (or any prefix code, but universal codes are prefix codes with some desirable properties). A prefix code encodes a sequence of bits (i.e., a number) with a prefix that provides the information needed to determine how many bits make up the rest of the number.

1) Use the code to represent the number of elements in the sequence.
2) Then use the code to represent each element.
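The two steps can be sketched in Python. For brevity this uses Elias gamma coding, a simpler universal code, instead of Elias omega. It assumes positive elements, stores the count as count + 1 so that an empty sequence is representable, and adds a leading 1 bit so the result survives conversion to an integer; those last two choices belong to this sketch only.

```python
def gamma(n):
    """Elias gamma code of n >= 1: (bit length - 1) zeros, then n in binary."""
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def encode(seq):
    """Step 1: code the element count. Step 2: code each element."""
    bits = gamma(len(seq) + 1) + "".join(gamma(n) for n in seq)
    return int("1" + bits, 2)  # sentinel bit protects leading zeros


def decode(number):
    bits = bin(number)[3:]  # drop '0b' and the sentinel bit
    pos = 0

    def read():
        """Read one gamma-coded value starting at pos."""
        nonlocal pos
        zeros = 0
        while bits[pos] == "0":
            zeros += 1
            pos += 1
        value = int(bits[pos:pos + zeros + 1], 2)
        pos += zeros + 1
        return value

    count = read() - 1
    return [read() for _ in range(count)]


number = encode([1, 6, 7, 8, 9, 45, 67])
print(number.bit_length(), decode(number))
```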

0

Another answer that occurred to me. Encode each number in balanced ternary with two bits per trit (for example, 0 = 00; +1 = 01; -1 = 10). The remaining bit pair (for example, 11) is the end-of-element marker, repeated to end the sequence. Con: less space-efficient than a prefix code when large values are expected. Pros: 1) more space-efficient with small values; 2) encoding/decoding is simpler; 3) directly represents negative values.
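A Python sketch of the balanced-ternary scheme, using the bit pairs from the example above. Two details are assumptions of this sketch: zero is written as a single 0 trit so that every element has at least one trit, and a leading 1 bit is added so that leading zero bits survive conversion to an integer.

```python
PAIR = {0: "00", 1: "01", -1: "10"}
TRIT = {bits: trit for trit, bits in PAIR.items()}
END = "11"  # end-of-element marker; a second END in a row ends the sequence


def to_trits(n):
    """Balanced-ternary digits of n, most significant first."""
    if n == 0:
        return [0]
    trits = []
    while n != 0:
        r = n % 3  # Python's % is non-negative for a positive modulus
        if r == 2:
            r = -1
        trits.append(r)
        n = (n - r) // 3
    return trits[::-1]


def encode(seq):
    bits = "".join("".join(PAIR[t] for t in to_trits(n)) + END for n in seq)
    return int("1" + bits + END, 2)  # sentinel bit, elements, final marker


def decode(number):
    bits = bin(number)[3:]  # drop '0b' and the sentinel bit
    seq, value, started = [], 0, False
    for i in range(0, len(bits), 2):
        pair = bits[i:i + 2]
        if pair == END:
            if not started:  # marker with no trits before it: end of sequence
                break
            seq.append(value)
            value, started = 0, False
        else:
            value = value * 3 + TRIT[pair]
            started = True
    return seq


number = encode([-5, 0, 3, 67])
print(number.bit_length(), decode(number))
```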

0

Source: https://habr.com/ru/post/924346/

