How to preserve labels when an SPSS file (.sav) is imported into pandas via rpy?

I am looking to work with SPSS files (.sav) using pandas. Since I don't have the SPSS program, here is what a typical file looks like when converted to .csv:

[image]

From examining the first two rows (I don't know SPSS), it seems that the first row contains the Labels and the second row contains the VarNames.

[image]

When I read the file into pandas this way:

    import pandas.rpy.common as com

    def savtocsv(filename):
        w = com.robj.r('foreign::read.spss("%s", to.data.frame=TRUE)' % filename)
        w = com.convert_robj(w)
        return w

and then run the head() command, the first row (Label) is missing:

[image]

How can I preserve the labels?

  • Link: Is there a Python module for opening SPSS files?
  • Python: 2.7.10
  • Pandas: 0.17.1
python pandas r rpy2 spss
1 answer

The labels in the .sav file are stored in the variable.labels attribute of the object returned by the read.spss function.

You can get variable labels with the following:

    import pandas.rpy.common as com

    def get_labels(filename):
        w = com.robj.r('attr(foreign::read.spss("%s"), "variable.labels")' % filename)
        w = com.convert_robj(w)
        return w
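One caveat with the snippet above: the filename is interpolated directly into the R expression, so a path containing backslashes (typical on Windows) or double quotes would produce an invalid or different R string. Below is a minimal sketch of one way to guard against that. The helper names `r_string_literal` and `read_spss_labels_expr` are hypothetical, not part of pandas or rpy2, and the sketch only builds the expression text; passing it to `com.robj.r(...)` is left as in the code above.

```python
def r_string_literal(value):
    """Return `value` as a double-quoted R string literal (hypothetical helper).

    Backslashes are escaped first, then double quotes, so that R reads
    back exactly the original characters.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def read_spss_labels_expr(filename):
    """Build the R expression that extracts variable.labels from a .sav file."""
    return 'attr(foreign::read.spss(%s), "variable.labels")' % r_string_literal(filename)
```

The resulting string can then be evaluated in place of the `%`-formatted expression, for example `com.robj.r(read_spss_labels_expr(filename))`. Using forward slashes in the path is another option that avoids the backslash problem.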

If you want to set labels as column names of your data frame:

    import pandas.rpy.common as com

    def savtocsv(filename):
        w = com.robj.r('foreign::read.spss("%s", to.data.frame=TRUE)' % filename)
        cols = list(com.robj.r("attr")(w, "variable.labels"))
        w = com.convert_robj(w)
        w.columns = cols
        return w
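Assigning the labels directly as column names works when every variable has a distinct, non-empty label. If some labels may be empty or repeated (an assumption about the data, not something the question states), the resulting columns become ambiguous. A minimal sketch of a fallback follows; `columns_from_labels` is a hypothetical helper that takes the variable names and labels as plain Python lists.

```python
def columns_from_labels(var_names, labels):
    """Pick one column name per variable (hypothetical helper).

    Uses the label when it is non-empty, falls back to the variable name
    otherwise, and appends the variable name to a label that was already used.
    """
    seen = {}
    columns = []
    for name, label in zip(var_names, labels):
        column = label.strip() or name  # empty label: keep the variable name
        count = seen.get(column, 0)
        seen[column] = count + 1
        if count:
            # Label already used by an earlier variable: disambiguate it.
            column = "%s (%s)" % (column, name)
        columns.append(column)
    return columns


print(columns_from_labels(["q1", "q2", "q3"], ["Age", "", "Age"]))
# ['Age', 'q2', 'Age (q3)']
```

In `savtocsv`, the variable names would be the data frame's existing columns, so the last assignment could become `w.columns = columns_from_labels(list(w.columns), cols)`.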