Convert Pandas DataFrame to R data frame using Rpy2

I have a pandas DataFrame that I convert to an R data frame using the convert_to_r_dataframe method from pandas.rpy.common. I set it up like this:

    self.event = pd.read_csv('C://' + self.event_var.get() + '.csv')
    final_products = pd.DataFrame({'Product': self.event.Product,
                                   'Size': self.event.Size,
                                   'Order': self.event.Order})
    r.assign('final_products', com.convert_to_r_dataframe(final_products))
    r.assign('EventName', self.event_var.get())
    r.assign('EventTime', self.eventtime_var.get())
    r.source('application.r')

where self.event_var.get() retrieves user input from the GUI (I am building the application with Tkinter). Product, Size and Order are columns from the CSV file.

Since rpy2 sets up the R environment inside Python, I would expect the final_products R data frame to be understood by the R environment. Unfortunately, when the R script runs, it does not give the correct results (I create graphs in the R script, but they are simply empty when the program ends). The EventName and EventTime variables do work, however. Is there something I'm missing here? Any ideas why the R data frame created in Python is misread by the R environment?

Error received:

    Exception in Tkinter callback
    Traceback (most recent call last):
      File "C:\Python27\lib\lib-tk\Tkinter.py", line 1470, in __call__
        return self.func(*args)
      File "G:\Development\workspace\GUI\GUI.py", line 126, in evaluate
        r.source('application.r')
      File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 86, in __call__
        return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
      File "C:\Python27\lib\site-packages\rpy2\robjects\functions.py", line 35, in __call__
        res = super(Function, self).__call__(*new_args, **new_kwargs)
python r dataframe rpy2
2 answers

Unfortunately, this will be tricky: the Python-to-R conversion is better than it used to be but still not perfect, and it remains awkward on Windows, which appears to be what you are using.

This is a bit hacky, but you could try setting the name and time variables in the same step where you build the pd.DataFrame, before you convert the DataFrame to R.

After that, on the R side, you will need to use R functions to work with the data frame rather than your Python functions. Even your getters and setters will have to be moved into the R environment, so that it looks more like this:

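    import rpy2.robjects as robjects

    # Define the R function f and evaluate f(3); myfunct holds the result as an R vector.
    myfunct = robjects.r('''
        f <- function(r, verbose=FALSE) {
            if (verbose) {
                cat("I am calling f().\n")
            }
            2 * pi * r
        }
        f(3)
        ''')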

(This example is taken from the rpy2 documentation.)

To check that the DataFrame conversion itself is correct, you can start your debugging with this:

    import pandas as pd
    import numpy as np
    import pandas.rpy.common as com
    from datetime import datetime

    n = 10
    df = pd.DataFrame({
        "timestamp": [datetime.now() for t in range(n)],
        "value": np.random.uniform(-1, 1, n)
    })
    r_dataframe = com.convert_to_r_dataframe(df)
    print(r_dataframe)

Does this produce something that looks like R's printed output for a data frame, such as:

    >>>             timestamp        value
    0 2014-06-03 15:02:20 -0.36672....
    1 2014-06-03 15:02:20 -0.89136....
    2 2014-06-03 15:02:20 0.509215....
    3 2014-06-03 15:02:20 0.862909....
    4 2014-06-03 15:02:20 0.389879....
    5 2014-06-03 15:02:20 -0.80607....
    6 2014-06-03 15:02:20 -0.97116....
    7 2014-06-03 15:02:20 0.376419....
    8 2014-06-03 15:02:20 0.848243....
    9 2014-06-03 15:02:20 0.446798....

The example is adapted from examples published elsewhere.
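Before looking at the R side, it can also help to confirm that the pandas DataFrame itself holds what you expect, since an empty or badly parsed frame would produce empty plots too. The following is only a sketch under assumptions: the helper name check_frame, the inline CSV text and the sample values are hypothetical, and only the column names Product, Size and Order come from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file that the question reads from disk.
CSV_TEXT = """Product,Size,Order
Widget,10,1
Gadget,20,2
"""

# Column names taken from the question.
REQUIRED_COLUMNS = ["Product", "Size", "Order"]


def check_frame(df, required=REQUIRED_COLUMNS):
    """Return a list of problems found in df (an empty list means none found)."""
    problems = []
    missing = [col for col in required if col not in df.columns]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    if df.empty:
        problems.append("frame has no rows")
    else:
        # A column made up entirely of missing values often signals a parsing problem.
        all_nan = [col for col in df.columns if df[col].isna().all()]
        if all_nan:
            problems.append("columns with only missing values: " + ", ".join(all_nan))
    return problems


event = pd.read_csv(io.StringIO(CSV_TEXT))
final_products = event[REQUIRED_COLUMNS]

# Inspect the dtypes and any detected problems before converting to R.
print(final_products.dtypes)
print(check_frame(final_products))
```

If check_frame reports nothing and the dtypes look sensible, the problem is more likely in the conversion step or in the R script than in the CSV loading.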


Great answer, @Mittenchops. Since convert_to_r_dataframe is deprecated, here is the above example updated to use the rpy2 interface:

    from rpy2.robjects import pandas2ri
    pandas2ri.activate()

    import pandas as pd
    import numpy as np
    from datetime import datetime

    n = 10
    df = pd.DataFrame({
        "timestamp": [datetime.now() for t in range(n)],
        "value": np.random.uniform(-1, 1, n)
    })
    r_dataframe = pandas2ri.py2ri(df)
    print(r_dataframe)