Error writing to csv

I am trying to write a data frame to csv, but it seems to be complaining because the columns contain lists.

I want to be able to access this data frame and use it in R later. I don't care how it is done (saving as a text file, etc.). This is a fairly large data set (n = 182305). Any ideas on how to write it to a file that I can read fairly quickly in R? (I'm not married to a csv file.)

Data frame and the code I tried

 DF2 <- structure(list(
     word = c("3-D", "4-F", "4-H'er", "4-H", "A battery", "a bon march"),
     pos.code = c("AN", "N", "N", "A", "h", "v"),
     pos = list(c("A", "N"), "N", "N", "A", "h", "v"),
     noun = list(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
     plural = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     noun.phrase = list(FALSE, FALSE, FALSE, FALSE, TRUE, FALSE),
     verb.usually.participle = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     transitive.verb = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     intransitive.verb = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     adjective = list(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE),
     adverb = list(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE),
     conjunction = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     preposition = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     interjection = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     pronoun = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     definite.article = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     indefinite.article = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
     nominative = list(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)),
   .Names = c("word", "pos.code", "pos", "noun", "plural", "noun.phrase",
     "verb.usually.participle", "transitive.verb", "intransitive.verb",
     "adjective", "adverb", "conjunction", "preposition", "interjection",
     "pronoun", "definite.article", "indefinite.article", "nominative"),
   row.names = c(NA, 6L), class = "data.frame")

 write.table(DF2, file = "mobyPOS.csv", sep = " ", col.names = TRUE, qmethod = "double")

The error message I received is:

 > write.table(DF2, file = "mobyPOS.csv", sep = " ", col.names = TRUE, qmethod = "double")
 Error in write.table(x, file, nrow(x), p, rnames, sep, eol, na, dec, as.integer(quote), :
   unimplemented type 'list' in 'EncodeElement'
+7
3 answers

Try

 save(DF2, file = "mobyPOS.Rdata") 

Note that you do not have to use the "Rdata" extension, but it (or "RData") is the usual convention.

Then you can load the data with

 load("mobyPOS.Rdata") 

Note that this is different from reading an external file format, where you usually do something like

 your_object <- read.csv(...) 

The load command restores the object directly under its original name, so after running it your DF2 object is simply there; you do not assign the result to anything.
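As a minimal sketch of that round trip, here is a two-row stand-in for DF2 (hypothetical, not the real data) saved to a temporary file and restored. The saveRDS() / readRDS() pair at the end is not part of the answer above; it is a base R alternative that behaves more like read.csv, in that you assign the result to a name of your choosing.

```r
# Two-row stand-in for the question's DF2 (hypothetical sample data)
DF2 <- data.frame(word = c("3-D", "4-F"), stringsAsFactors = FALSE)
DF2$pos <- list(c("A", "N"), "N")        # a list column, as in the question

# save() stores the object together with its name
rdata_path <- file.path(tempdir(), "mobyPOS.Rdata")
save(DF2, file = rdata_path)

original <- DF2
rm(DF2)

# load() recreates DF2 in the workspace and returns the restored names
restored_names <- load(rdata_path)

# Alternative (an addition, not from the answer): store one object without its name
rds_path <- file.path(tempdir(), "mobyPOS.rds")
saveRDS(DF2, file = rds_path)
DF2_copy <- readRDS(rds_path)
```

Both formats keep the list column intact, which a csv file cannot do.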

+7

This is only meant to address the issue of list columns in data frames that came up in the comments.

In the specific case of your example data, the only place where a list is "required" is the first element of DF2$pos, which is a vector of length two. The list columns can be removed with the following code:

 DF2$pos[[1]] <- paste(DF2$pos[[1]], collapse = "")
 newDF <- as.data.frame(lapply(DF2, unlist))
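To see that the flattened data frame can then be written out, here is a small sketch on a three-row stand-in for DF2 (hypothetical sample data). The stringsAsFactors = FALSE arguments are my addition, so that strings stay character vectors on older versions of R as well.

```r
# Three-row stand-in for the question's DF2 (hypothetical sample data)
DF2 <- data.frame(word = c("3-D", "4-F", "A battery"), stringsAsFactors = FALSE)
DF2$pos  <- list(c("A", "N"), "N", "h")
DF2$noun <- list(TRUE, TRUE, FALSE)

# Collapse the only length-two element, then unlist every column
DF2$pos[[1]] <- paste(DF2$pos[[1]], collapse = "")
newDF <- as.data.frame(lapply(DF2, unlist), stringsAsFactors = FALSE)

# With only atomic columns left, a csv round trip works
csv_path <- file.path(tempdir(), "mobyPOS.csv")
write.csv(newDF, file = csv_path, row.names = FALSE)
back <- read.csv(csv_path, stringsAsFactors = FALSE)
```

Note that unlist only keeps the rows aligned if every list element has length one, which is why the paste step has to come first.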

The usual metaphor for a data frame is that rows correspond to cases or units of observation, and columns correspond to variables. The metaphor also assumes that each unit of observation has exactly one value for each variable. In this sense a data frame is like a matrix, except that its columns can be of different classes.

Obviously, R lets you break this metaphor, as you discovered. Whether doing so is a good idea depends on the domain and the data. Not every data set fits the data frame metaphor perfectly; sometimes you will have a variable where the "values" you measure are not easily collapsed into a single expression.

You have a choice to make. In your case, if you use newDF instead, you may need to parse strings (strsplit, etc.) every time you access that value. Sometimes this is inconvenient, and it may not match your mental model of your data.
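For example, getting the individual part-of-speech codes back out of a collapsed string might look like the sketch below. It assumes every code is a single character (as "AN" standing for "A" and "N" suggests); the vector pos_code is a hypothetical stand-in for newDF$pos.

```r
# Hypothetical stand-in for the collapsed newDF$pos column
pos_code <- c("AN", "N", "h")

# Split each string into single characters (assumes one character per code)
pos_parts <- strsplit(pos_code, split = "")

# Example query: which words can be nouns?
is_noun <- vapply(pos_parts, function(p) "N" %in% p, logical(1))
```

This is the kind of extra step you take on each access in exchange for a data frame that the rest of R handles without complaint.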

On the other hand, most of R is built around data frames that follow this metaphor. As you discovered with write.table, if you do not stick to these expectations, some parts (in fact, many parts) of R will not behave as you expect. That means additional work and awkwardness.
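If you are not sure which columns break those expectations, a quick check like this sketch lists the list columns before you try to write the file. The two-row data frame is a hypothetical stand-in for DF2.

```r
# Two-row stand-in for the question's DF2 (hypothetical sample data)
DF2 <- data.frame(word = c("3-D", "4-F"), stringsAsFactors = FALSE)
DF2$pos <- list(c("A", "N"), "N")

# Names of the columns that are lists rather than atomic vectors
list_cols <- names(DF2)[vapply(DF2, is.list, logical(1))]
```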

In my experience, it is usually best to sacrifice the purity of your preconceived idea of how your data should be structured, and instead do whatever it takes to fit it into the data frame structure somehow. At least, that route has meant less work for me. But nothing is ever perfect.

But, as I said at the beginning, this is highly specific to the data and the domain. YMMV.

+15

Could you convert the list column to character and then save it?

 DF2$pos <- as.character(DF2$pos)
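One caveat, sketched on a hypothetical two-element list: as.character() on a list turns an element of length greater than one into the text of an R expression (something like c("A", "N")) rather than joining its values. If that is not what you want, collapsing each element with paste gives a plain string; that variant is my suggestion, not part of the answer above.

```r
# Hypothetical stand-in for the DF2$pos list column
pos <- list(c("A", "N"), "N")

# as.character() deparses the length-two element instead of joining it
pos_chr <- as.character(pos)

# Alternative: join the values of each element into one string
pos_joined <- vapply(pos, paste, character(1), collapse = "")
```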

0
