I am using Perl with DBI / DBD::ODBC to retrieve data from a SQL Server database and have some problems with character encoding.
The database has a default collation of SQL_Latin1_General_CP1_CI_AS, so the data in varchar columns is encoded in Microsoft's version of Latin-1, AKA windows-1252.
There seems to be no way to handle this transparently in DBI / DBD::ODBC. The data I get back is still encoded as windows-1252; for example, €“” comes back as the bytes 0x80, 0x93 and 0x94. When I write these to a UTF-8 encoded XML file without decoding them first, they are written as the Unicode characters U+0080, U+0093 and U+0094 instead of U+20AC, U+201C and U+201D, which is clearly wrong.
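To make the problem concrete, here is a minimal sketch that does not touch the database at all. The byte string is a made-up sample standing in for a fetched varchar value; it only shows how the code points differ before and after decoding.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Sample bytes standing in for a varchar value from the driver (assumption,
# not real query output): the windows-1252 bytes for the euro sign and the
# left/right double quotation marks.
my $raw = "\x80\x93\x94";

# Decode the windows-1252 bytes into a Perl character string.
my $decoded = decode('windows-1252', $raw);

# Collect the code points of each string for comparison.
my @before = map { sprintf 'U+%04X', ord($_) } split //, $raw;
my @after  = map { sprintf 'U+%04X', ord($_) } split //, $decoded;

print "undecoded: @before\n";   # undecoded: U+0080 U+0093 U+0094
print "decoded:   @after\n";    # decoded:   U+20AC U+201C U+201D
```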
My current workaround is to call $val = Encode::decode('windows-1252', $val) on every column after every fetch. This works, but it hardly seems like the right way to do it.
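Roughly, the workaround looks like the sketch below. The helper name decode_row is hypothetical, and the row is hard-coded sample data in place of what $sth->fetchrow_arrayref would return, so that the snippet runs without a database.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return a decoded copy of a fetched row.
# NULL columns (undef) are passed through unchanged.
sub decode_row {
    my ($row, $encoding) = @_;
    return [ map { defined($_) ? decode($encoding, $_) : undef } @$row ];
}

# Sample row standing in for $sth->fetchrow_arrayref (assumption, not real data).
my $row     = [ "\x80 100", undef, "\x93quoted\x94" ];
my $decoded = decode_row($row, 'windows-1252');

# Write the result as UTF-8 so the characters come out correctly.
binmode(STDOUT, ':encoding(UTF-8)');
print join(' | ', map { defined($_) ? $_ : 'NULL' } @$decoded), "\n";
```

In the real script this would be called once per fetched row, which is exactly the per-row overhead I would like to avoid.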
Is there no way to tell DBI or DBD::ODBC to do this conversion for me?
I use ActivePerl (5.12.2 Build 1202) with DBI (1.616) and DBD::ODBC (1.29), both provided by ActivePerl and updated using ppm. The script runs on the same server as the database (SQL Server 2008 R2).
My connection string:
dbi:ODBC:Driver={SQL Server Native Client 10.0};Server=localhost;Database=$DB_NAME;Trusted_Connection=yes;
Thanks in advance.
sql-server perl dbi odbc
mscha