I have a very large data file in R (several gigabytes). If I try to load it all at once, R runs out of memory.
I need to read the file line by line and do some analysis on each line. I found a previous question about reading a file n lines at a time and jumping to a specific clump with a read.clump function. I used the code from Nick Sabbe's answer and made some changes for my needs.
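For context, the clump-reading helper from that answer looks roughly like this (a minimal sketch reconstructed from the description, not necessarily Nick Sabbe's exact code): it uses read.csv's skip and nrows arguments to pull out one clump of lines at a time.

read.clump <- function(file, lines, clump) {
  if (clump > 1) {
    # Re-read just the header once to recover the column names.
    c.names <- names(read.csv(file, nrows = 1))
    # Skip the header line plus every row of the earlier clumps.
    read.csv(file, skip = 1 + (clump - 1) * lines, nrows = lines,
             header = FALSE, col.names = c.names)
  } else {
    # First clump: read.csv consumes the header itself.
    read.csv(file, nrows = lines)
  }
}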
Please note that I have the following test.csv file:
ABC
200 19 0.1
400 18 0.1
300 29 0.1
800 88 0.1
600 80 0.1
150 50 0.1
190 33 0.1
270 42 0.1
900 73 0.1
730 95 0.1
I want to read the contents of the file line by line and do my analysis, so I wrote the following reading loop based on the code posted by Nick Sabbe. I have two problems: 1) the header is printed every time I print a new line; 2) the "X" column that R adds is also printed, even though I delete that column.
Here is the code I'm using:
test <- function() {
  prev <- 0
  for (i in 1:100) {
    j <- i - prev                         # always 1 here, so each clump is one line
    test1 <- read.clump("file.csv", j, i)
    print(test1)
    prev <- i
  }
}
The output I get is:
  ABC
1 200 19 0.1
NA 1 1 1 1
2 400 18 0.1
NA 1 1 1 1
3 300 29 0.1
NA 1 1 1 1
4 800 88 0.1
NA 1 1 1 1
5 600 80 0.1
I want to get rid of these extra rows in the output:
NA 1 1 1 1
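For what it's worth, here is one way to silence both printing symptoms (a sketch, assuming the stray "X" column is the row-number column that write.csv adds when row.names = TRUE): print.data.frame always repeats the column names, so emitting the rows with write.table avoids the repeated header. Inside the loop, instead of print(test1):

test1$X <- NULL                          # drop the "X" row-number column, if present
write.table(test1, file = "",            # file = "" writes to the console
            row.names = FALSE, col.names = FALSE)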
Also, is there a way to make the for loop stop when it reaches the end of the file, like checking for EOF in other languages?
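In base R the usual idiom is to read through a connection and stop when readLines() returns a zero-length result, which plays the role of an EOF check. A minimal sketch (assuming the analysis can work on one parsed line at a time, and that the fields are space-separated as in test.csv above):

con <- file("test.csv", open = "r")
header <- readLines(con, n = 1)          # consume the header line once
repeat {
  line <- readLines(con, n = 1)
  if (length(line) == 0) break           # character(0) means end of file
  fields <- strsplit(line, " ")[[1]]     # split one record into fields
  # ... do the per-line analysis here ...
  print(fields)
}
close(con)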