I have a large text file (475,000,000 lines). I would like to quickly get the number of lines in a file without reading it.
fread from data.table actually prints out the number of rows (after about 10 seconds) before it goes on to read the whole file:
fread('D:/text_file.txt',select=1,colClasses="character")
Read 7.1% of 472933221 rows
Is there any way to extract this row count without reading the whole file afterwards? For the record, reading the entire file takes 36 seconds.
I tried countLines from R.utils, but it takes 53 seconds. The difference may be that fread can select only one column, while countLines reads everything.
R.utils::countLines("D:/text_file.txt")
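To make explicit what a full-scan count involves, here is a minimal sketch in base R that reads the file in raw chunks and counts newline bytes. The helper name count_newlines and the 64 MB chunk size are assumptions for illustration, not part of R.utils or data.table, and this is not necessarily how countLines is implemented. A final line without a trailing newline is not counted.

```r
# Hypothetical helper: count lines by scanning raw chunks for newline bytes (0x0a).
# The chunk size is an arbitrary assumption; tune it to the available memory.
count_newlines <- function(path, chunk_size = 64L * 1024L * 1024L) {
  con <- file(path, open = "rb")   # binary mode: no decoding or line splitting
  on.exit(close(con))
  n <- 0                           # double, so very large counts do not overflow
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_size)
    if (length(chunk) == 0L) break # end of file
    n <- n + sum(chunk == as.raw(0x0a))
  }
  n
}
```

It would be called as count_newlines("D:/text_file.txt"). It still touches every byte of the file, so it is a baseline for comparison rather than a way to avoid reading.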
I also tried other command-line methods on Windows, such as:
find /v /c "" "D:\text_file.txt"
grep "^" D:\text_file.txt | wc -l
These work, but they are not as fast as fread. I am on Windows.