The question is not as stupid as some people think. I know a lot of people who struggle with this difference and with what to use where. To summarize:
Lists are by far the most flexible data structure in R. They can be seen as a collection of elements without any restriction on the class, length or structure of each element. The only thing you need to take care of is that you don't give two elements the same name. That can cause a lot of confusion, and R gives no error for it:
    > X <- list(a=1,b=2,a=3)
    > X$a
    [1] 1
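As a small sketch of what happens to the shadowed element: it is still in the list, it just cannot be reached with `$`. The check with `anyDuplicated()` is only one possible way to catch the problem early, not something the language does for you.

```r
X <- list(a = 1, b = 2, a = 3)

X$a                       # returns the first match only: 1
X[[3]]                    # the second "a" is still reachable by position: 3
X[names(X) == "a"]        # a list holding both elements named "a"

# Index of the first duplicated name, or 0 if all names are unique
anyDuplicated(names(X))   # 3
```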
Data frames are also lists, but they have several limitations:
- you cannot use the same name for two different variables
- all elements of the data frame are vectors
- all elements of the data frame are of equal length.
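These points can be checked at the console. The following is a minimal sketch that assumes the default arguments of `data.frame()` (in particular `check.names = TRUE`); the column names are made up for illustration.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

is.list(df)      # TRUE: a data frame is a list of columns
length(df)       # 2: one list element per column
df[["y"]]        # list-style extraction of a column

# With the default check.names = TRUE, duplicate names are made unique
names(data.frame(a = 1, a = 2))   # "a" "a.1"

# Columns whose lengths cannot be recycled to a common length are rejected
res <- try(data.frame(x = 1:3, y = 1:2), silent = TRUE)
inherits(res, "try-error")        # TRUE
```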
Due to these restrictions and the resulting two-dimensional structure, data frames can mimic some of the behaviour of matrices. You can select rows and do operations on rows. You can't do that with lists, since a row is undefined there.
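To illustrate the matrix-like behaviour, here is a short sketch with made-up data. The list `L` deliberately has elements of different lengths, which is why the notion of a row does not apply to it.

```r
df <- data.frame(height = c(150, 160, 170), weight = c(50, 60, 70))

df[2, ]                  # select the second row (one observation)
df[df$height > 155, ]    # select rows by a condition
rowSums(df)              # operate across each row: 200 220 240

L <- list(height = c(150, 160, 170), weight = c(50, 60))
dim(L)                   # NULL: a plain list has no rows or columns
```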
All of this means that you should use a data frame for any data set that fits into this two-dimensional structure. In essence, you use data frames for any data set where each column corresponds to a variable and each row to a single observation, in the broad sense of the word. For all other structures, lists are the way to go.
Note that if you want a nested structure, you need to use lists. Since list items can be lists themselves, you can create very flexible structured objects.
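A nested object might look like the following sketch. The object `study` and all of its element names are hypothetical and chosen only for illustration.

```r
# Hypothetical example object; every name here is made up
study <- list(
  title   = "pilot",
  data    = data.frame(id = 1:3, score = c(2.5, 3.1, 4.0)),
  options = list(
    model = list(family = "gaussian", terms = c("id", "score")),
    seed  = 42
  )
)

study$options$model$family      # drill down through the levels with $
study[["data"]]$score           # a data frame stored as a list element
str(study, max.level = 2)       # compact overview of the nested structure
```

Note that a data frame can itself be one element of such a list, so the two structures combine naturally.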
Joris Meys Apr 09 '13 at 13:14