Pandas - cdecimal.ConversionSyntax invalidOperation on read_sql_query ()

I had a problem with the Pandas read_sql_query () function when working with decimal database data types. I have no problem with the identical code below when working with varchar or integer types.

Version Information:

 CentOS 6.6 Python 2.7.10 :: Anaconda 2.3.0 (64-bit) # packages in environment at /opt/anaconda: pandas 0.16.2 np19py27_0 cdecimal 2.3 py27_0 pyodbc 3.0.10 py27_0 sqlalchemy 1.0.8 py27_0 

Below, my code has been reduced as much as possible to replicate the error. I also tried through sqlalchemy and got the same error. (There is no sqlalchemy mechanism for netezza, so it still needs to rely on pyodbc.)

 import pyodbc import pandas as pd connection = pyodbc.connect("Driver={NetezzaSQL};servername=nzserver;database=MASTER;username=USER_GUY;password=****") sql = "select cast(0.0 as decimal(6,2)) as testing " data = pd.io.sql.read_sql_query(sql, connection, index_col=None, coerce_float=True) #Also tried this, same error data = pd.io.sql.read_sql_query(sql, connection, index_col=None, coerce_float=False) --------------------------------------------------------------------------- InvalidOperation Traceback (most recent call last) <ipython-input-217-ba167303e6b2> in <module>() 1 ----> 2 data = pd.io.sql.read_sql_query(sql, connection, index_col=None, coerce_float=True) 3 # InvalidOperation: [<class 'cdecimal.ConversionSyntax'>] 4 
+3
python pandas
Sep 29 '15 at 14:53
source share
2 answers

OK, so I got Pandas read_sql_query() to work with the Netezza decimal data type. I did this by switching from pyodbc to pypyodbc (described here ).

I noticed some strange things with unicode in the results, so I passed a few extra parameters to the connection object in pypyodbc , and now everything works correctly. I needed to change unicode_results=False and ansi=true (for DB output / messages) ... but I think I'm fine with this, since my data should fit in ASCII, and I'm not even sure how much NZ supports unicode anyway.

The working code below is to get a request from data.frame :

 import pypyodbc import pandas as pd connection = \ pypyodbc.connect( "Driver=NetezzaSQL;SERVERNAME=nzserver;DATABASE=MASTER;PORT=5480;USERNAME=user_guy;PASSWORD=******", ansi=True, unicode_results=False) sql = """select cast(0.0 as decimal(6,2)) as decimal_test, cast(0 as integer) as int_test, cast('aosenuth' as varchar(5)) as varchar_test , current_timestamp as timestamp_test """ data = pd.io.sql.read_sql_query(sql, connection, index_col=None, coerce_float=True) print data.shape 
0
Sep 29 '15 at 15:55
source share

This seems to be a known issue with pyodbc. There is a patch that you can try. There's even a comment from netezza.

+2
Sep 29 '15 at 15:06
source share



All Articles