An alternative to packed bitmaps and wheels, and equally efficient in certain contexts, is to store the differences between consecutive primes. If you leave out the number 2, as usual, all differences are even. Storing difference/2 in byte-sized variables gets you up to the 2^40ish region (just before 1999066711391).
The primes up to 2^32 require only 194 MB, compared with 256 MB for an odds-only packed bitmap. Iterating over delta-stored primes is much faster than iterating over wheeled storage, which includes the modulo-2 wheel known as the odds-only bitmap.
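A minimal sketch of the idea in Python, not the code behind the figures above. The function names, the simple sieve used to obtain test data, and the choice of 1 as the base value (so the first stored byte is (3 - 1) / 2 = 1) are all assumptions made for illustration.

```python
def sieve_odd_primes(limit):
    """Return all odd primes below `limit` (plain sieve, for test data only)."""
    is_prime = bytearray([1]) * limit
    is_prime[0:2] = b"\x00\x00"
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Clear every multiple of n starting at n*n.
            is_prime[n * n::n] = bytearray(len(range(n * n, limit, n)))
    return [n for n in range(3, limit, 2) if is_prime[n]]


def encode_deltas(odd_primes, base=1):
    """Store one byte per odd prime: (gap to the previous value) / 2."""
    out = bytearray()
    prev = base
    for p in odd_primes:
        half_gap = (p - prev) // 2
        if half_gap > 255:
            raise ValueError("gap too large for one byte")
        out.append(half_gap)
        prev = p
    return bytes(out)


def iter_primes(deltas, base=1):
    """Yield 2, then rebuild the odd primes by summing the doubled deltas."""
    yield 2
    p = base
    for d in deltas:
        p += 2 * d
        yield p
```

Decoding is a single addition per prime, with no bit testing or wheel bookkeeping, which is why iteration over this layout can be so cheap.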
For ranges from 1999066711391 onwards, a larger cell size or variable-length storage is needed. The latter can be extremely efficient even with very simple schemes (for example, keep adding bytes until a byte < 255 has been added, as in LZ4-style compression), because gaps longer than 510/2 are extremely rare.
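One way to read the "keep adding until a byte < 255" scheme, sketched in Python. The function names are hypothetical, and the exact convention (a run of 255 bytes followed by a terminating byte below 255) is an assumption about the intended format.

```python
def encode_half_gaps(half_gaps):
    """Emit each value as zero or more 255 bytes followed by one byte < 255."""
    out = bytearray()
    for v in half_gaps:
        while v >= 255:
            out.append(255)
            v -= 255
        out.append(v)  # terminating byte, always in 0..254
    return bytes(out)


def decode_half_gaps(data):
    """Sum bytes until one below 255 is seen, then yield the total."""
    total = 0
    for b in data:
        total += b
        if b < 255:
            yield total
            total = 0
```

Under this convention almost every value still occupies a single byte, and only the rare large gaps pay for an extra byte.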
For efficiency, it is best to divide the range into sections (pages) and manage them in a B-Tree style.
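A rough sketch of the paging idea, in Python. A flat sorted list searched with `bisect` stands in for a real B-Tree here, and the page size, function names, and the choice of the page's first prime as its key are all assumptions for illustration.

```python
from bisect import bisect_right

PAGE_SIZE = 4  # primes per page; tiny on purpose, a real page would be far larger


def build_pages(odd_primes, page_size=PAGE_SIZE):
    """Split primes into pages; each page holds half-gaps relative to its first prime."""
    first_primes, pages = [], []
    for i in range(0, len(odd_primes), page_size):
        chunk = odd_primes[i:i + page_size]
        first_primes.append(chunk[0])
        pages.append(bytes((b - a) // 2 for a, b in zip(chunk, chunk[1:])))
    return first_primes, pages


def primes_from(first_primes, pages, lower):
    """Yield stored primes >= lower, starting at the page that may contain `lower`."""
    k = max(bisect_right(first_primes, lower) - 1, 0)
    for first, page in zip(first_primes[k:], pages[k:]):
        p = first
        if p >= lower:
            yield p
        for d in page:
            p += 2 * d
            if p >= lower:
                yield p
```

Because every page carries its own starting value, a lookup never has to decode the pages that come before the one it needs.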
Entropy coding the differences (Huffman or arithmetic coding) reduces permanent storage requirements by just under half, which is close to the theoretical optimum and better than lists or wheels compressed with the best available packers.
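To get a feel for how much an entropy coder could save on a given delta stream, the order-0 Shannon entropy of the stored bytes can be estimated as below. This is only a sketch: the function name is hypothetical, and it measures a simple per-symbol model, not any particular Huffman or arithmetic coder.

```python
from collections import Counter
from math import log2


def entropy_bits_per_symbol(symbols):
    """Order-0 Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())
```

Applied to a byte string of half-gaps, the result divided by 8 approximates the fraction of the one-byte-per-prime size that an ideal coder of this kind would need.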
Even if the data is stored uncompressed, it is still much more compact than files of binary or textual numbers, by an order of magnitude or more. With a B-Tree style index in place, it is easy to map sections into memory as needed and iterate over them at blazing speed.
DarthGizka, Nov 10 '14 at 17:38