Why are my Chinese characters not displayed correctly in a C# string?

I store Chinese and English text in a SQL Server 2005 database and display it on a web page, but Chinese is not displayed correctly. I read about the subject and did the following:

  • used N before the text in the INSERT statement
  • set field type to nvarchar
  • set page encoding to UTF-8

Chinese characters appear correctly when I type them directly into the page, that is, when they do not come from the database.

These are the characters that should be displayed: 全澳甲流确诊病例已破100

This is what is displayed when text is extracted from the database: å ... ¨æ¾³ç "²æμ确诊ç -... ä¾ <å · ²ç '1001

This seems to be related to how strings are handled in C#, because the Chinese text can be retrieved and displayed correctly in classic ASP.

Is there anything else I need to do to get the data from the database into a string and output it correctly on an .aspx page?

+4
5 answers

Information so far:

  • You are using a direct SQL INSERT script to insert into the database.
  • The data shows up in the database.

The problem could be in one of two places:

  • In the INSERT statement, do you prefix the inserted value with N?

    INSERT INTO #tmp VALUES (N'全澳甲流确诊病例已破100')

  • If you do prefix the value with N, does the String object contain the correct data?

    String sql = "INSERT INTO #tmp VALUES (N'" + value + "')";

Here I assume that the value is a String object.

Does this String object contain the correct Chinese characters?

Try printing its value and see.
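
As a hedged alternative to building the statement by concatenation, the value can be passed as an NVARCHAR parameter, which removes the need for the N prefix and for quoting. This is only a sketch: the class and method names are hypothetical, the table `temp` with a `data` column is an assumed schema, and `System.Data.SqlClient` is assumed to be available (on newer .NET versions it is a separate package).

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class InsertSketch
{
    // Hypothetical helper: builds an INSERT command that sends the text
    // as an NVARCHAR parameter instead of an N'...' literal.
    public static SqlCommand BuildInsert(SqlConnection connection, string value)
    {
        // Table "temp" and column "data" are assumptions for illustration.
        SqlCommand command = new SqlCommand(
            "INSERT INTO temp (data) VALUES (@data)", connection);

        // SqlDbType.NVarChar makes the provider send the value as Unicode.
        command.Parameters.Add("@data", SqlDbType.NVarChar, 100).Value = value;
        return command;
    }

    public static void Main()
    {
        // No connection is opened here; the command is only constructed.
        SqlCommand command = BuildInsert(null, "全澳甲流确诊病例已破100");
        Console.WriteLine(command.Parameters["@data"].SqlDbType); // NVarChar
    }
}
```

To actually run the insert, pass an open SqlConnection and call ExecuteNonQuery() on the returned command.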

Update

Suppose the INSERT query is built as follows:

    String sql = "INSERT INTO #tmp VALUES (N'" + value + "')";

I assume value contains the Chinese characters.

Did you assign the Chinese characters to value directly, like this?

    String value = "全澳甲流确诊病例已破100";

The above code should work. However, if you did any intermediate processing, that could cause the problem.

I worked on a Traditional Chinese (TC) localization project before; the previous architect had added several encoding conversions that are required in ASP but cause problems in .NET:

    String value = "全澳甲流确诊病例已破100";
    Encoding tc = Encoding.GetEncoding("BIG5");
    byte[] bytes = tc.GetBytes(value);
    value = Encoding.Unicode.GetString(bytes);

The above conversions are not needed. In .NET, direct assignment just works:

  String value = "全澳甲流确诊病例已破100"; 

This is because String constants and the String object itself are Unicode compatible.
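
To see why such a conversion does harm, here is a minimal sketch that encodes a string with one encoding and decodes the bytes with another. It uses UTF-8 in place of BIG5 (an assumption made so that the example runs without extra code-page support); the class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class TranscodeDemo
{
    // Encodes with one encoding and decodes with a different one,
    // in the same spirit as the legacy conversion shown above.
    public static string MismatchedRoundTrip(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);   // stand-in for BIG5
        return Encoding.Unicode.GetString(bytes);       // bytes read as UTF-16
    }

    public static void Main()
    {
        string value = "全澳甲流确诊病例已破100";
        string converted = MismatchedRoundTrip(value);

        // The converted text no longer matches the original string.
        Console.WriteLine(converted == value); // False
    }
}
```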

Framework libraries such as file I/O convert the external encoding to Unicode when they read a file that is not Unicode encoded; in other words, the framework does this dirty work for you. You no longer need to convert encodings manually.
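
As a rough illustration of the framework doing the decoding, the sketch below hands externally encoded bytes to a StreamReader together with the encoding they were written in. UTF-8 is used as the external encoding only because it is available everywhere; a legacy code page obtained through Encoding.GetEncoding would be passed the same way, and on newer .NET versions it may need a code-page provider to be registered first. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderDemo
{
    // Lets StreamReader convert externally encoded bytes into a .NET string.
    public static string Decode(byte[] external, Encoding encoding)
    {
        using (MemoryStream stream = new MemoryStream(external))
        using (StreamReader reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Simulates file content stored in an external encoding.
        byte[] external = Encoding.UTF8.GetBytes("全澳甲流");
        string text = Decode(external, Encoding.UTF8);
        Console.WriteLine(text == "全澳甲流"); // True
    }
}
```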

Update: I understand that ASP is used to insert the data into SQL Server.

I wrote a small ASP page to insert some Chinese characters into a SQL database, and it works.

I have a database called "trans" and I created a table "temp" inside. The ASP page is encoded in UTF-8.

    <html>
    <head title="Untitled">
      <meta http-equiv="content-type" content="text/html; charset=utf-8">
    </head>
    <body>
    <script language="vbscript" runat="server">
      If Request.Form("Button1") = "Submit" Then
        SqlQuery = "INSERT INTO trans..temp VALUES (N'" + Request.Form("Text1") + "')"
        Set cn = Server.CreateObject("ADODB.Connection")
        cn.Provider = "sqloledb"
        cn.Properties("Data Source").Value = *********
        cn.Properties("Initial Catalog").Value = "TRANS"
        cn.Properties("User ID").Value = "sa"
        cn.Properties("Password").Value = **********
        cn.Properties("Persist Security Info").Value = False
        cn.Open
        cn.Execute(SqlQuery)
        cn.Close
        Set cn = Nothing
        Response.Write SqlQuery
      End If
    </script>
    <form name="form1" method="post" action="input.asp">
      <input name="Text1" type="text" />
      <input name="Button1" value="Submit" type="submit" />
    </form>
    </body>
    </html>

The table is defined as follows in my database:

  create table temp (data NVARCHAR(100)) 

After submitting the ASP page a few times, my table contains the correct Chinese data:

    select * from trans..temp

    data
    ----------------
    test测试全澳甲流确诊病例已破100

Hope this helps.

+6

How do the characters get into the database? Do you insert them through a stored procedure? Make sure the parameters of your stored procedure are also nvarchar, and so are the parameters of the command object you call the procedure from.

Update: The consensus in this thread is that the database does not contain properly encoded NVARCHAR content. Here is my latest theory: the database holds UTF-8 bytes. These bytes pass through untouched when ASP outputs them, whereas ASP.NET takes the UTF-8 bytes and interprets them as single-byte characters.

Try reading the bytes from the database and decoding them as UTF-8, for example:

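    // Note: the byte[] cast assumes the query returns binary data (for example
    // a varbinary column); an nvarchar column comes back as a string instead.
    SqlCommand command = new SqlCommand("SELECT zhtext FROM TestTable", connection);
    byte[] byteArray = (byte[])command.ExecuteScalar();
    lblText.Text = Encoding.UTF8.GetString(byteArray);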
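
The theory can be tried out without a database. The sketch below first reproduces the suspected damage (UTF-8 bytes read as single-byte characters) and then reverses it. ISO-8859-1 stands in for whatever single-byte code page is actually involved, which is an assumption; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Assumption: a single-byte encoding that maps every byte to one character.
    private static readonly Encoding SingleByte = Encoding.GetEncoding("ISO-8859-1");

    // Reproduces the suspected fault: UTF-8 bytes treated as single-byte text.
    public static string Garble(string text)
    {
        return SingleByte.GetString(Encoding.UTF8.GetBytes(text));
    }

    // Reverses it: recover the raw bytes, then decode them as UTF-8.
    public static string Repair(string garbled)
    {
        return Encoding.UTF8.GetString(SingleByte.GetBytes(garbled));
    }

    public static void Main()
    {
        string original = "全澳甲流确诊病例已破100";
        string garbled = Garble(original);

        Console.WriteLine(garbled == original);         // False
        Console.WriteLine(Repair(garbled) == original); // True
    }
}
```

If text read from the database into a string can be repaired this way, that would support the idea that the column holds UTF-8 bytes rather than proper Unicode text.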
+1

This is definitely a string encoding problem somewhere on the round trip from the database to the C# string, but from the sound of it you are doing everything right.

In our database, we store Unicode data in NVARCHAR() columns and then read it into regular C# strings; no text conversions are required. Which data access objects are you using (e.g. DataSets, a plain DataReader, LINQ to SQL)?

In our application, we read the results of the stored procedure using FetchDataSet, and then call DataBinder.Eval() to assign a string, which ultimately becomes the label text.

0

Have you installed "support for oriental languages" in Windows? Is it XP? If so, your data may be fine and it is only SQL Server Management Studio that does not show it properly. (TrueType fonts display correctly even without Chinese language support, but system fonts do not.)

0

My summary is as follows:

  • characters are displayed in ASP
  • display characters in SSMS
  • display characters in ASP.Net

Conclusion: the data in the database is not encoded correctly, and you need to convert it to Unicode in order to work with it in C#, just as Ryan suggested.

0