Well, if MongoDB and CouchDB don't work for you, then you basically have one problem: not enough power.
Let's look at the laundry list:
It needs to scale to O(10^8) tokens.
How much RAM do you have? You are talking about hundreds of millions of tokens, and you are talking about streaming them out of a 7zip file. If you want to issue "increments" quickly, you need to be able to keep the entire data structure in memory, or the whole thing will go very slowly.
The final result needs to be queried very quickly!
How fast? Microseconds, milliseconds, hundreds of milliseconds? If you want to query 500M records on a machine with 8 GB of RAM, you're pretty much out of luck. The data just doesn't fit, and it doesn't matter which database you use.
Dataset > 2 TB
OK, let's say that your computer can average about 50 MB/s of sustained throughput and that your processor can actually decompress data at that rate. At that pace, you're talking about 11+ hours of processing just to stream the data (did you want to do this over a weekend?).
50 MB/s of throughput for 11 hours is not small potatoes; that takes a real drive. And if you try to write anything to disk while this is happening (or the OS swaps), it will degrade quickly.
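The 11-hour figure is just dataset size divided by throughput. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the function name is made up for illustration:

```python
def stream_hours(dataset_bytes: float, throughput_bytes_per_s: float) -> float:
    """Hours needed to stream the dataset once at a constant throughput."""
    return dataset_bytes / throughput_bytes_per_s / 3600


TB = 10**12  # decimal terabyte (assumption)
MB = 10**6   # decimal megabyte (assumption)

# 2 TB at a sustained 50 MB/s, the numbers used above
hours = stream_hours(2 * TB, 50 * MB)
print(f"{hours:.1f} hours")  # 11.1 hours
```

This is a lower bound for a single pass: any extra disk writes, swapping, or decompression stalls only add to it.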
From the database's point of view, MongoDB can handle both the front-end updating and the back-end querying. But it has to flush to disk every so often, and that will significantly extend your 11-hour run time.
That total run time will only get worse and worse unless you can hold the entire database and the entire stream in memory.
My point...
is pretty simple: you need more power.
If you are not running this operation with 24 GB+ of RAM, then everything you do will feel slow. If you don't have 24 GB+ of RAM, then your final dataset will not be "lightning fast"; at best it will be "200 ms fast". You cannot index 500M rows and expect to find an entry quickly unless you can keep the index in RAM.
If you are not running this operation with fast hard drives, then the operation will seem slow. I mean, you're talking about hours and hours of high-throughput, sustained reads (and probably writes).
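As a rough illustration of the RAM point, here is a back-of-the-envelope check of whether an index fits in memory. The 32 bytes per index entry is purely an assumed figure (key, pointer, and some overhead); real index sizes depend on the database and on key length, so measure your own before trusting this.

```python
GB = 10**9  # decimal gigabyte (assumption)


def index_bytes(rows: int, bytes_per_entry: int) -> int:
    """Very rough index size: one fixed-size entry per row."""
    return rows * bytes_per_entry


def fits_in_ram(rows: int, bytes_per_entry: int, ram_bytes: int) -> bool:
    """True if the estimated index alone fits in the given RAM."""
    return index_bytes(rows, bytes_per_entry) <= ram_bytes


ROWS = 500_000_000
BYTES_PER_ENTRY = 32  # hypothetical figure, not a measured value

print(index_bytes(ROWS, BYTES_PER_ENTRY) / GB)       # 16.0
print(fits_in_ram(ROWS, BYTES_PER_ENTRY, 8 * GB))    # False
print(fits_in_ram(ROWS, BYTES_PER_ENTRY, 24 * GB))   # True
```

Even when the check passes, the OS, the database process, and the data itself also need room, so treat it as a necessary condition rather than a guarantee.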
I know that you want help, and I know that you've put a bounty on this question, but it's really hard to fix the following problem:
I am trying to use CouchDB and MongoDB without too good results.
when it sounds like you haven't really put together the right gear to solve the problem.