Since you are taking 100,000 samples from a total of 10,000 lines, most lines will be sampled. Read the entire file into an array data structure, then sample the array at random. This lets you avoid seeking in the file entirely.
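A minimal sketch of that approach, assuming the whole file fits in memory and that sampling with replacement is acceptable. The helper names read_lines and sample_lines are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <random>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Read every line of the stream into memory (hypothetical helper).
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Draw `count` lines uniformly at random, with replacement (hypothetical helper).
std::vector<std::string> sample_lines(const std::vector<std::string>& lines,
                                      std::size_t count, std::mt19937& rng) {
    assert(!lines.empty());  // an empty file has nothing to sample
    std::uniform_int_distribution<std::size_t> pick(0, lines.size() - 1);
    std::vector<std::string> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(lines[pick(rng)]);
    }
    return out;
}
```

Pass an opened std::ifstream to read_lines, seed a std::mt19937 once, and call sample_lines with the number of samples you need.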
A more common case is to select only a small subset of the file's data. To do this, assuming the lines are of different lengths, seek to random points in the file, skip ahead to the next newline (for example, cin.ignore( numeric_limits< streamsize >::max(), '\n' )), and then parse the text that follows.
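The seek-and-skip idea might look roughly like the sketch below. It assumes a seekable stream (for a file, one opened in binary mode) whose length in bytes is already known. The function name random_line, the size parameter, and the max_tries retry limit are illustrative assumptions, not part of the original answer:

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <random>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>

// Seek to a random byte offset, discard the rest of that line, and read the
// next complete line into `line`. Returns false if no line was found within
// `max_tries` attempts (for example, when the file has a single line).
// `size` is the stream length in bytes, supplied by the caller.
bool random_line(std::istream& in, long long size, std::mt19937& rng,
                 std::string& line, int max_tries = 64) {
    assert(size > 0);
    std::uniform_int_distribution<long long> pick(0, size - 1);
    for (int attempt = 0; attempt < max_tries; ++attempt) {
        in.clear();  // reset eof/fail flags left over from a previous attempt
        in.seekg(static_cast<std::streamoff>(pick(rng)), std::ios_base::beg);
        // Skip the remainder of the line we landed in.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        if (std::getline(in, line)) {
            return true;
        }
        // Landed inside the last line, so nothing follows: try another offset.
    }
    return false;
}
```

Two caveats with this simple version: the first line of the file is never returned, and each line is chosen with probability proportional to the length of the line before it, so the sample is not uniform over lines.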
Potatoswatter