Extract line count from fread without reading the whole file

I have a large text file (475,000,000 lines). I would like to quickly get the number of lines in the file without reading all of it.

fread from data.table actually prints the number of rows (after about 10 seconds) before it goes on to read the whole file:

fread('D:/text_file.txt',select=1,colClasses="character")
Read 7.1% of 472933221 rows #number of rows appears after 10 seconds

Is there any way to extract this row count without reading the whole file afterwards? For the record, reading the entire file takes 36 seconds.

I tried countLines from R.utils, but it takes 53 seconds. The difference may be that fread can select only one column, while countLines reads everything.

R.utils::countLines("D:/text_file.txt") #53 seconds
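
For comparison, here is a minimal base-R sketch that counts newline bytes while streaming the file in fixed-size binary chunks, so the file is never held in memory at once. It still has to scan every byte, and it has not been timed against the file above. The function name count_newlines and the chunk size are assumptions made for illustration only.

```r
# Hypothetical helper: count "\n" bytes by streaming the file in raw chunks.
count_newlines <- function(path, chunk_bytes = 64L * 1024L * 1024L) {
  con <- file(path, open = "rb")   # binary mode: no encoding or EOL translation
  on.exit(close(con))
  n <- 0                           # double, so very large counts cannot overflow
  repeat {
    chunk <- readBin(con, what = "raw", n = chunk_bytes)
    if (length(chunk) == 0L) break # end of file reached
    n <- n + sum(chunk == as.raw(10L))  # 10 is the byte value of "\n"
  }
  n
}
```

Note that a final line with no trailing newline is not counted by this approach.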

I also tried other Windows methods, such as:

find /v /c "" "D:\text_file.txt" #takes 1 minute 50 seconds
grep "^" D:\text_file.txt | wc -l #takes 2 minutes

These work, but they are not as fast as fread. I am on Windows.


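Following the comments from @d.b and @G., you can use wc, which is included in Rtools, from R on Microsoft Windows.

To do so, add C:\Rtools\bin to the Windows PATH.

Then call wc from R with system or shell: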

shell('wc -l "D:/text_file.txt"',intern =TRUE)
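
The call above returns the raw text printed by wc, which is the count followed by the file name. A small sketch of turning that into a number follows. The wrapper name count_lines_wc is hypothetical, and it assumes a wc executable (for example the one from Rtools) can be found on the PATH. It uses system2 rather than shell so the same code also runs outside Windows.

```r
# Hypothetical wrapper: run `wc -l` and return the line count as a number.
# Assumes a wc executable is available on the PATH.
count_lines_wc <- function(path) {
  out <- system2("wc", args = c("-l", shQuote(path)), stdout = TRUE)
  # wc prints "<count> <path>"; keep only the leading numeric field
  as.numeric(sub("^[[:space:]]*([0-9]+).*$", "\\1", out[1]))
}
```

It could then be called as count_lines_wc("D:/text_file.txt").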