Can I recover international characters mistakenly stored in the varchar field?

Question

My client has an old MS SQL 2000 database that uses varchar(50) fields to store names. He tried to use this database to capture some data via a web form. Some of the people filling in the form were from other countries, and the varchar fields went nuts when they entered their names. Is there any way to recover the data? Maybe by guessing what each character should be, based on what it was turned into in ASCII/varchar and the country the person is from? Some of the data:

Name / Country / First or Last Name?
JiÅ™Ã / CZE / F
TorbjÃ¶rn / FIN / F
HuszÃ¡r / HUN / L
JÃ¼rgen / DEU / F
MÃ¼ller / CHE / L
BumbÃ¡lkovÃ¡ / CZE / L
DoleÅ¾al / CZE / L
LoÃ¯c / DEU / L

By the way, the web form specified this content type:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

+5

sql-server unicode utf-8 sql-server-2000

Chris Oct 28 '08 at 12:43

4 answers

. - ?

http://forums.thedailywtf.com/forums/p/7156/133456.aspx

+1

Windows programmer Oct 28 '08 at 1:28

libiconv, UTF8.

, . WikiPedia.

: , .

0

staticsan Oct 28 '08 at 1:15

In addition to Richard's comments: if the web page containing the form specifies a character set (e.g. ISO 10646 == Unicode) and an encoding (e.g. utf-8), then a standards-compliant browser should send the form data using that character set and encoding. If your web pages are declared as Unicode, then you should not have to deal with random Microsoft code pages in the data; it should all be Unicode.
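To illustrate how correctly submitted UTF-8 can still end up looking like the sample data, here is a minimal Python sketch. It assumes the stored bytes were later interpreted as Windows-1252 (a guess; the actual code page depends on the database collation), and the helper name `garble_as_cp1252` is made up for this example.

```python
def garble_as_cp1252(name: str) -> str:
    """Encode a name as UTF-8, then misread those bytes as Windows-1252."""
    utf8_bytes = name.encode("utf-8")
    # Assumption: the single-byte column is read back with code page 1252.
    return utf8_bytes.decode("cp1252")


if __name__ == "__main__":
    original = "Müller"
    # Show the UTF-8 bytes the browser would send for this name.
    print(" ".join(f"{b:02X}" for b in original.encode("utf-8")))
    # Show what those bytes look like when read as Windows-1252.
    print(garble_as_cp1252(original))
```

Under that assumption, the two bytes C3 BC that encode ü show up as the two separate characters Ã and ¼.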

0

Frentos Oct 28 '08 at 3:42

Richard A · Accepted Answer · 2008-10-28T01:17:29+0000

Working with the 5th example:

Ã is extended-ASCII #195 (C3). ¼ is extended-ASCII #188 (BC).

I guess MÃ¼ller should be Müller.

If it's UTF-8, then based on http://en.wikipedia.org/wiki/UTF-8#Description:

We have C3 BC = 1100 0011 1011 1100

Applying the UTF-8 mapping:

(110) 0 0011 (10) 11 1100

0000 0000 1111 1100

00FC, which is Unicode ü

U+00FC (see http://en.wikipedia.org/wiki/Latin_characters_in_Unicode)
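The bit manipulation above can be sketched in Python as follows. This is only an illustration of the two-byte case worked through here; the function name `decode_two_byte_utf8` is hypothetical, and real code would normally rely on a built-in decoder instead.

```python
def decode_two_byte_utf8(lead: int, cont: int) -> int:
    """Return the code point encoded by a two-byte UTF-8 sequence."""
    # The lead byte must look like 110xxxxx, the continuation like 10xxxxxx.
    if lead & 0b11100000 != 0b11000000:
        raise ValueError("not a two-byte UTF-8 lead byte")
    if cont & 0b11000000 != 0b10000000:
        raise ValueError("not a UTF-8 continuation byte")
    # Keep the 5 payload bits of the lead and the 6 of the continuation.
    return ((lead & 0b00011111) << 6) | (cont & 0b00111111)


if __name__ == "__main__":
    code_point = decode_two_byte_utf8(0xC3, 0xBC)
    print(f"U+{code_point:04X}", chr(code_point))  # U+00FC ü
```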

Trying the first example, JiÅ™Ã: after the plain letters Ji, the remaining bytes are:

C5 99 C3 AD

(110) 0 0101 (10) 01 1001 (110) 0 0011 (10) 10 1101

0159 00ED

U+0159 is ř and U+00ED is í, which gives Jiří.

As a sanity check, google Jiří (http://www.google.com/search?q=Ji%C5%99%C3%AD&ie=utf-8&oe=utf-8).

Likewise, TorbjÃ¶rn becomes Torbjörn.
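The same byte-level reasoning can be automated. The sketch below is written in Python rather than T-SQL and assumes the garbled text came from UTF-8 bytes read as Windows-1252. The function name `repair_mojibake` is hypothetical, and values that do not fit the assumption are returned unchanged rather than guessed at.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were misread as Windows-1252, if possible."""
    try:
        # Turn each displayed character back into its original byte,
        # then decode the byte sequence as the UTF-8 it really was.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The value does not match the assumption (for example, it is
        # already correct), so leave it alone.
        return text


if __name__ == "__main__":
    # "\u00ad" is the invisible soft hyphen that byte AD maps to in cp1252.
    samples = ["TorbjÃ¶rn", "HuszÃ¡r", "JÃ¼rgen", "JiÅ™Ã\u00ad"]
    for stored in samples:
        print(repr(stored), "->", repair_mojibake(stored))
```

Returning the input on failure keeps already-correct names intact, so the function can be run over a whole column. Characters that were replaced outright when stored (for example with a question mark) cannot be recovered this way.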

, , , , .
