Reading a single column of data into a data frame in R

I am currently using this code to read data from multiple files into R:

    library(foreign)
    setwd("/Users/ericbrotto/Desktop/A_Intel/")
    filelist <- list.files()
    # assuming semicolon-separated values with a header
    datalist = lapply(filelist, function(x) read.table(x, header=TRUE, sep=";", comment.char=""))
    # assuming the same header/columns for all files
    datafr = do.call("rbind", datalist)

The headers are as follows:

 TIME ;POWER SOURCE ;qty MONITORS ;NUM PROCESSORS ;freq of CPU Mhz ;SCREEN SIZE ;CPU LOAD ;BATTERY LEVEL ; KEYBOARD MVT ; MOUSE MVT ;BATTERY MWH ;HARD DISK SPACE ;NUMBER PROCESSES ;RAM ;RUNNING APPS ;FOCUS APP ;BYTES IN ;BYTES OUT ;ACTIVE NETWORKS ; IP ADDRESS ; NAMES OF FILES ; 

and the sample data is as follows:

  2010-09-11-19:28:34.680 ; BA ; 1 ; 2 ; 2000 ; 1440 : 900 ; 0.224121 ; 92 ; NO ; NO ; NULL ; 92.581558 ; 57 ; 196.1484375 ; +NULL ; loginwindow-#35 ; 5259 ; 4506 ; en1 : ; 192.168.1.3 ; NULL ; 

Instead of reading all the columns into the data frame, I would like to grab just one, say FOCUS APP.

3 answers

If you just want to read in a specific column from your files, then colClasses is the way to go. For example, suppose your data looked like this:

    a,b
    1,2
    3,4

Then

    ## Use colClasses to select columns
    ## "NULL" means skip the column
    ## "numeric" means that the column is numeric
    ## Other options are Date, factor - see ?read.table for more
    ## Use NA to let R decide
    data = read.table("/tmp/tmp.csv", sep=",",
                      colClasses=c("NULL", "numeric"),
                      header=TRUE)

gives only the second column.

    > data
      b
    1 2
    2 4
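Applied to the files in the question, the same idea might look like the sketch below. It assumes FOCUS APP is the 16th semicolon-separated field, as in the header shown in the question, and that the trailing ";" on each line produces one extra empty field. The temporary file and the names `tmp`, `keep` and `focus` are hypothetical and exist only to make the example self-contained.

```r
# Hypothetical sample file reproducing the header and the row from the question.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  " TIME ;POWER SOURCE ;qty MONITORS ;NUM PROCESSORS ;freq of CPU Mhz ;SCREEN SIZE ;CPU LOAD ;BATTERY LEVEL ; KEYBOARD MVT ; MOUSE MVT ;BATTERY MWH ;HARD DISK SPACE ;NUMBER PROCESSES ;RAM ;RUNNING APPS ;FOCUS APP ;BYTES IN ;BYTES OUT ;ACTIVE NETWORKS ; IP ADDRESS ; NAMES OF FILES ; ",
  "  2010-09-11-19:28:34.680 ; BA ; 1 ; 2 ; 2000 ; 1440 : 900 ; 0.224121 ; 92 ; NO ; NO ; NULL ; 92.581558 ; 57 ; 196.1484375 ; +NULL ; loginwindow-#35 ; 5259 ; 4506 ; en1 : ; 192.168.1.3 ; NULL ; "
), tmp)

# Count the fields per line so the colClasses vector has the right length.
n_cols <- max(count.fields(tmp, sep = ";", quote = "", comment.char = ""))

# Skip every column except the 16th (assumed to be FOCUS APP).
keep <- rep("NULL", n_cols)
keep[16] <- "character"

focus <- read.table(tmp, header = TRUE, sep = ";", quote = "",
                    comment.char = "", strip.white = TRUE,
                    colClasses = keep)
print(focus)
```

Selecting by position avoids depending on how `read.table` rewrites the header names, at the cost of breaking if the column order ever changes.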

Maybe just index the result of read.table by the column name, for example:

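    # check.names=FALSE keeps the header as "FOCUS APP"; by default read.table
    # would turn it into a syntactic name with a dot instead of the space
    datalist = lapply(filelist, function(x) read.table(x, header=TRUE, sep=";", comment.char="", check.names=FALSE)["FOCUS APP"])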
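Selecting by name depends on how `read.table` cleans up the header: with the default `check.names = TRUE`, spaces in column names are replaced by dots. A minimal sketch that sidesteps this by matching the name with a pattern is given below. The two temporary files use a shortened, hypothetical version of the question's header, the second data row is made up for illustration, and `read_focus` is a helper name introduced here, not part of any package.

```r
# Two hypothetical files with a shortened version of the question's header.
demo_dir <- tempfile("focus_demo_")
dir.create(demo_dir)
header <- " TIME ;FOCUS APP ;BYTES IN ; "
writeLines(c(header, " 2010-09-11-19:28:34.680 ; loginwindow-#35 ; 5259 ; "),
           file.path(demo_dir, "a.txt"))
writeLines(c(header, " 2010-09-11-19:29:34.680 ; Finder-#12 ; 6001 ; "),
           file.path(demo_dir, "b.txt"))

filelist <- list.files(demo_dir, full.names = TRUE)

# Read one file and keep only the column whose name looks like FOCUS APP.
# The "." in the pattern matches either a space or a dot.
read_focus <- function(path) {
  df <- read.table(path, header = TRUE, sep = ";", quote = "",
                   comment.char = "", strip.white = TRUE,
                   stringsAsFactors = FALSE)
  idx <- grep("FOCUS.APP", names(df))[1]
  df[idx]  # single-bracket indexing returns a one-column data frame
}

# Stack the one-column data frames from all files.
datafr <- do.call("rbind", lapply(filelist, read_focus))
print(datafr)
```

Note that this still parses every column of every file before discarding all but one, so for large files the `colClasses` approach is likely the cheaper option.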

If you only need to do this once, the colClasses answer is probably the best (it still reads through all the data, but it only processes one column). If you do this kind of thing often, you may want to use a database instead. Take a look at the RSQLite, sqldf, and SQLiteDF packages, as well as RODBC, for some options.
