One approach I found faster than plain grep for searching one large file (especially when the patterns keep changing) is to combine split, grep, and xargs with its parallel flag (-P). For example:
Suppose the identifiers you want to find are in a file called my_ids.txt, and the large file is called bigfile.txt.
Use split to break the large file into parts, then use xargs to run grep over those parts in parallel.
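The split + grep + xargs idea could be sketched roughly as follows. This is a minimal sketch, not the exact commands from my run: the function name split_grep, the chunk size, the job count, and the temporary directory layout are all illustrative assumptions, and it assumes my_ids.txt holds one fixed-string identifier per line (hence grep -F -f). Each chunk's matches go to a separate file first, so that parallel greps never write into the same output at once.

```shell
#!/bin/sh
# Sketch: split a large file into chunks, then grep the chunks in parallel.
# Usage: split_grep IDS_FILE BIG_FILE OUT_FILE [LINES_PER_CHUNK] [JOBS]
# (split_grep and its arguments are hypothetical names for illustration.)
split_grep() {
    ids_file=$1
    big_file=$2
    out_file=$3
    lines_per_chunk=${4:-1000000}   # assumed chunk size; tune to your file
    jobs=${5:-4}                    # assumed job count; keep it at or below your core count

    # Keep the chunk files in a temporary directory so cleanup is easy.
    chunk_dir=$(mktemp -d) || return 1

    # split writes chunk_aa, chunk_ab, ... with lines_per_chunk lines each.
    split -l "$lines_per_chunk" "$big_file" "$chunk_dir/chunk_"

    # Run one grep per chunk, at most $jobs at a time (-P).
    # -F treats the IDs as fixed strings; -f reads all of them from ids_file.
    # grep exits 1 for a chunk with no match, which makes xargs return
    # non-zero; that is expected here, so the status is ignored.
    ls "$chunk_dir"/chunk_* |
        xargs -P "$jobs" -I {} sh -c 'grep -F -f "$1" "$2" > "$2.matches"' sh "$ids_file" {} || :

    # Concatenate per-chunk results in chunk order, which keeps the
    # matches in the same order as in the original file.
    cat "$chunk_dir"/chunk_*.matches > "$out_file"

    rm -rf "$chunk_dir"
}
```

With the files named above, a call might look like: split_grep my_ids.txt bigfile.txt matches.txt 1000000 8. There is no point in asking for more parallel greps than there are chunk files.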
In my case, this cut what would have been a 17-hour job down to 1 hour 20 minutes. I am sure there is some kind of bell curve in efficiency here, and obviously going over the number of available cores will not do you any good, but for my requirements, as stated above, this was a much better solution than any of the other suggestions. It also has the added benefit over the parallel script of using mostly native (Linux) tools.
user6504312 Jun 23 '16 at 12:59