How to save Unicode data in Oracle?

I am trying to save Unicode (Greek) data in an Oracle database (10g). I created a simple table:

(screenshot of the table definition not preserved)

I understand that NVARCHAR2 always uses UTF-16 encoding, so it should be suitable for all (human) languages.

Then I try to insert a row into the database. I hardcoded the string ("How are you?" in Greek) in the code. Then I try to read it back from the database and display it.

    // The using directives are assumed; the original post omitted them.
    using System;
    using System.Data.OracleClient;
    using System.Windows.Forms;

    class Program
    {
        static string connectionString = "<my connection string>";

        static void Main(string[] args)
        {
            string textBefore = "Τι κάνεις;";
            DeleteAll();
            SaveToDatabase(textBefore);
            string textAfter = GetFromDatabase();

            string beforeData = String.Format("Before: {0}, ({1})", textBefore, ToHex(textBefore));
            string afterData = String.Format("After: {0}, ({1})", textAfter, ToHex(textAfter));

            Console.WriteLine(beforeData);
            Console.WriteLine(afterData);
            MessageBox.Show(beforeData);
            MessageBox.Show(afterData);
            Console.ReadLine();
        }

        static void DeleteAll()
        {
            using (var oraConnection = new OracleConnection(connectionString))
            {
                oraConnection.Open();
                var command = oraConnection.CreateCommand();
                command.CommandText = "delete from UNICODEDATA";
                command.ExecuteNonQuery();
            }
        }

        static void SaveToDatabase(string stringToSave)
        {
            using (var oraConnection = new OracleConnection(connectionString))
            {
                oraConnection.Open();
                var command = oraConnection.CreateCommand();
                command.CommandText = "INSERT into UNICODEDATA (ID, UNICODESTRING) Values (11, :UnicodeString)";
                // The parameter is passed as a plain object, without an explicit Oracle type.
                command.Parameters.Add(":UnicodeString", stringToSave);
                command.ExecuteNonQuery();
            }
        }

        static string GetFromDatabase()
        {
            using (var oraConnection = new OracleConnection(connectionString))
            {
                oraConnection.Open();
                var command = oraConnection.CreateCommand();
                command.CommandText = "Select * from UNICODEDATA";
                var erpReader = command.ExecuteReader();
                string s = String.Empty;
                while (erpReader.Read())
                {
                    string text = erpReader.GetString(1);
                    s += text + ", ";
                }
                return s;
            }
        }

        // Formats each UTF-16 code unit of the string as four hex digits.
        static string ToHex(string input)
        {
            string bytes = String.Empty;
            foreach (var c in input)
                bytes += ((int)c).ToString("X4") + " ";
            return bytes;
        }
    }

Here are the different results:

Text before sending to the database, in the message box: (screenshot not preserved)

Text after fetching from the database, in the message box: (screenshot not preserved)

Console output: (screenshot not preserved)

Could you please tell me what I am doing wrong here?

+6
c# oracle oracle10g unicode
6 answers

I see five potential areas for problems:

  1. How do you actually get the text into your .NET application? If it is hard-coded in a string literal, are you sure that the compiler assumes the correct encoding for your source file?

  2. There may be a problem in the way you submit it to the database.

  3. There may be a problem with how it is stored in the database.

  4. There may be a problem with the way you fetch it from the database.

  5. There may be a problem with the way you display it again later.

Now, areas 2-4 sound as if they are less likely than problems 1 and 5. How do you display the text afterwards? Are you actually retrieving it from a database in .NET, or are you using Toad or something like that to try to see this?
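To make area 1 concrete, here is a minimal sketch of what happens when bytes saved in one encoding are read back in another. It does not involve the compiler or Oracle at all; the class and method names (`SourceEncodingDemo`, `DecodeUtf8BytesAsLatin1`) are made up for illustration, and ISO-8859-1 simply stands in for whatever wrong encoding a tool might assume.

```csharp
using System;
using System.Text;

static class SourceEncodingDemo
{
    // Encodes text as UTF-8 and then decodes those bytes as ISO-8859-1,
    // imitating a tool that assumes the wrong encoding for a file.
    public static string DecodeUtf8BytesAsLatin1(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }

    static void Main()
    {
        // \u escapes keep this literal independent of the source file's encoding.
        string greek = "\u03A4\u03B9 \u03BA\u03AC\u03BD\u03B5\u03B9\u03C2;";
        string misread = DecodeUtf8BytesAsLatin1(greek);

        Console.WriteLine(greek.Length);   // 10 UTF-16 code units
        Console.WriteLine(misread.Length); // 18: each Greek letter became two characters
    }
}
```

Writing the literal with `\u` escapes, as in `Main` above, is one way to rule out the source file encoding while you are troubleshooting.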

If you are displaying it again from .NET, I suggest you skip the database completely: if you just display the string itself, what do you see?

I have an article that might be useful for troubleshooting Unicode issues. In particular, look at every place where the encoding could go wrong, and make sure that whenever you "display" the string, you dump the exact Unicode characters (as integers) so that you can check them, rather than just whatever your current font chooses to display.

EDIT: OK, so the database is involved in the problem somewhere.

I strongly recommend that you take things like ASP and HTML out of the equation. Write a simple console application that does nothing but insert a string and fetch it back out again. Dump the individual Unicode characters (as integers) before and after. Then try to see what is in the database (for example, using Toad). I don't know the Oracle functions for converting strings into sequences of individual Unicode characters and then converting those characters to integers, but that would quite possibly be the next thing I would try.
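For the "dump the characters as integers" step, something along these lines would do. It is only a sketch: `UnicodeDump` and `ToCodePoints` are hypothetical names, and the one thing it adds over printing each `char` is that surrogate pairs are reported as a single code point.

```csharp
using System;
using System.Collections.Generic;

static class UnicodeDump
{
    // Returns the code points of a string as "U+XXXX" tokens.
    // A valid surrogate pair is combined into one code point; a lone
    // surrogate is reported as-is instead of throwing.
    public static string ToCodePoints(string input)
    {
        var parts = new List<string>();
        for (int i = 0; i < input.Length; i++)
        {
            int codePoint;
            if (char.IsSurrogatePair(input, i))
            {
                codePoint = char.ConvertToUtf32(input, i);
                i++; // skip the low surrogate that was just consumed
            }
            else
            {
                codePoint = input[i];
            }
            parts.Add("U+" + codePoint.ToString("X4"));
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        // "How are you?" in Greek, written with escapes.
        string before = "\u03A4\u03B9 \u03BA\u03AC\u03BD\u03B5\u03B9\u03C2;";
        Console.WriteLine(ToCodePoints(before));
    }
}
```

Run it on the string before the insert and on the string that comes back. If the Greek letters (U+03A4, U+03B9, ...) have turned into U+003F, the text was replaced by question marks somewhere on the way, and no font or console setting will bring it back.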

EDIT: Two more suggestions (good to see the console application, by the way).

  1. Specify the data type for the parameter instead of just giving it an object. For example:

     command.Parameters.Add(":UnicodeString", OracleType.NVarChar).Value = stringToSave;

  2. Try using Oracle's own driver instead of the one built into .NET. You may want to do this anyway, as I believe it is generally considered faster and more robust.

+6

You can determine which character set the database uses for NCHAR data with this query:

    SQL> SELECT VALUE
      2  FROM nls_database_parameters
      3  WHERE parameter = 'NLS_NCHAR_CHARACTERSET';

    VALUE
    ------------
    AL16UTF16

To verify that the database configuration is correct, you can run the following in SQL*Plus:

    SQL> CREATE TABLE unicodedata (ID NUMBER, unicodestring NVARCHAR2(100));

    Table created

    SQL> INSERT INTO unicodedata VALUES (11, 'Τι κάνεις;');

    1 row inserted

    SQL> SELECT * FROM unicodedata;

            ID UNICODESTRING
    ---------- ---------------------------------------------------
            11 Τι κάνεις;
+2

One more remark.

If you are using the Oracle client and want to include Unicode characters in the CommandText, you must add the following line at the start of your application:

 System.Environment.SetEnvironmentVariable("ORA_NCHAR_LITERAL_REPLACE", "TRUE"); 

This will allow you, if necessary, to use the following syntax:

 command.CommandText = "INSERT into UNICODEDATA (ID, UNICODESTRING) Values (11, N'Τι κάνεις;')"; 
+1

After some investigation, here we go:

    // Assumes: using System.Data.OracleClient;
    // Table kuku with a single column kuku NVARCHAR2(100).
    string input = "•";
    char s = input[0];
    string connString = "your connection";

    // Clean the table.
    using (OracleConnection cn = new OracleConnection(connString))
    {
        cn.Open();
        OracleCommand cmd = new OracleCommand("delete from kuku ", cn);
        cmd.ExecuteNonQuery();
        cn.Close();
    }

    // Insert WITH parameter binding: the Unicode character is saved.
    using (OracleConnection cn = new OracleConnection(connString))
    {
        cn.Open();
        OracleCommand cmd = new OracleCommand("insert into kuku (kuku) values(:UnicodeString)", cn);
        cmd.Parameters.Add(":UnicodeString", OracleType.NVarChar).Value = input + " OK";
        cmd.ExecuteNonQuery();
        cn.Close();
    }

    // Insert WITHOUT parameter binding: the Unicode character is NOT saved.
    using (OracleConnection cn = new OracleConnection(connString))
    {
        cn.Open();
        OracleCommand cmd = new OracleCommand("insert into kuku (kuku) values('" + input + " WRONG')", cn);
        cmd.ExecuteNonQuery();
        cn.Close();
    }

    // Fetch the result and inspect the first character.
    using (OracleConnection cn = new OracleConnection(connString))
    {
        cn.Open();
        OracleCommand cmd = new OracleCommand("select kuku from kuku", cn);
        OracleDataReader dr = cmd.ExecuteReader();
        if (dr.Read())
        {
            string output = (string)dr[0];
            char sa = output[0];
        }
        cn.Close();
    }

PL/SQL view: (screenshot not preserved)

+1

When reading records, try:

    Encoding utf = Encoding.Default;
    // odatareader is an OracleDataReader
    var utfBytes = odatareader.GetOracleString(0).GetNonUnicodeBytes();
    Console.WriteLine(utf.GetString(utfBytes));
0

Solution: set NLS_LANG!

In more detail: I had the same problem, and exactly the situation described in Sergei Bazarnik's investigation. With bind variables it works, and without them it does not.

The SOLUTION is to set NLS_LANG in the right place. Since I have a Windows server, I set it in the Windows registry under HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\ORACLE\KEY_OraClient11g_home1

Please note that the location in the registry may vary, so the easiest way is to search the registry for the string "ORACLE_HOME". Other systems such as Linux or Unix may set it differently (export NLS_LANG ...).

In my case, I set "NLS_LANG"="CROATIAN_CROATIA.UTF8". Since this variable had not been set before, the client had fallen back to a default value. After changing the registry, you must restart the process. In my case, I restarted IIS.

As for why it works with bind variables: it may be because the conversion then actually happens on the server side rather than on the client side. So even though the database can store the correct values, before the statement gets there the client makes unwanted conversions, because it believes it should. This is because, without NLS_LANG, the client assumes a simpler code page by default. Instead of being helpful, this creates a problem that (as the investigation above shows) is hard to understand.
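To picture what such an unwanted client-side conversion does to the text, here is a small sketch that never touches Oracle. ASCII merely stands in for whichever "simpler" character set the client assumes when NLS_LANG is missing, and the names `LossyConversionDemo` and `RoundTripThroughAscii` are invented for the example.

```csharp
using System;
using System.Text;

static class LossyConversionDemo
{
    // Round-trips text through a character set that cannot represent it.
    // Encoding.ASCII replaces every character it cannot encode with '?'.
    public static string RoundTripThroughAscii(string text)
    {
        byte[] narrowed = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(narrowed);
    }

    static void Main()
    {
        // "How are you?" in Greek, written with escapes.
        string greek = "\u03A4\u03B9 \u03BA\u03AC\u03BD\u03B5\u03B9\u03C2;";
        Console.WriteLine(RoundTripThroughAscii(greek)); // prints "?? ??????;"
    }
}
```

Once the characters have been replaced like this, the database only ever sees the question marks, which would explain why the stored value is wrong even though the NVARCHAR2 column itself can hold Greek.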

If you have several versions of Oracle installed, be sure to fix all of them in the registry (in my case, Oracle 10 had a valid configuration, but Oracle 11 did not have NLS_LANG set at all).

0