How to encode Japanese characters

I need to develop a program that implements an encoding scheme.

I have this Japanese text:

ใค ใ‚Œ ใฅ ใ‚Œ ใช ใ‚‹ ใพ ใ‚ ใซ, ๆ—ฅๆšฎ ใ‚‰ ใ—, ็กฏ ใซ ใ‚€ ใ‹ ใฒ ใฆ, ๅฟƒ ใซ ใ† ใค ใ‚Š ใ‚† ใ ใ‚ˆ ใ— ใช ใ— ไบ‹ ใ‚’, ใ ใ“ ใฏ ใ‹ ใจ ใช ใ ๆ›ธ ใ ใค ใ ใ‚Œ ใฐ, ใ‚ใ‚„ ใ— ใ† ใ“ ใ ใ‚‚ ใฎ ใ ใ‚‹ ใป ใ— ใ‘ ใ‚Œ

I want to encode this string as follows:

%26%2312388%3B%26%2312428%3B%26%2312389%3B%26%2312428%3B%26%2312394%3B%26%2312427%3B%26%2312414%3B%26%2312445%3B%26%2312395%3B%26%2312289%3B%26%2326085%3B%26%2326286%3B%26%2312425%3B%26%2312375%3B%26%2312289%3B%26%2330831%3B%26%2312395%3B%26%2312416%3B%26%2312363%3B%26%2312402%3B%26%2312390%3B%26%2312289%3B%26%2324515%3B%26%2312395%3B%26%2312358%3B%26%2312388%3B%26%2312426%3B%26%2312422%3B%26%2312367%3B%26%2312424%3B%26%2312375%3B%26%2312394%3B%26%2312375%3B%26%2320107%3B%26%2312434%3B%26%2312289%3B%26%2312381%3B%26%2312371%3B%26%2312399%3B%26%2312363%3B%26%2312392%3B%26%2312394%3B%26%2312367%3B%26%2326360%3B%26%2312365%3B%26%2312388%3B%26%2312367%3B%26%2312428%3B%26%2312400%3B%26%2312289%3B%26%2312354%3B%26%2312420%3B%26%2312375%3B%26%2312358%3B%26%2312371%3B%26%2312381%3B%26%2312418%3B%26%2312398%3B%26%2312368%3B%26%2312427%3B%26%2312411%3B%26%2312375%3B%26%2312369%3B%26%2312428%3B%26%2312290%3B

How can I do this?

+4
3 answers

I believe you are looking for HttpUtility.UrlEncode, although I cannot work out which encoding would produce exactly the output you show.

    // HttpUtility lives in the System.Web namespace.
    var testString = "つれづれなるまゝに、日暮らし、硯にむかひて、心にうつりゆくよしなし事を、そこはかとなく書きつくれば、あやしうこそものぐるほしけれ。";
    var encodedUrl = HttpUtility.UrlEncode(testString, Encoding.UTF8);

You might want to reword your question: what you need is not a conversion from Unicode to ASCII, which is not possible. You probably want percent-encoding, also known as URL encoding.

[EDIT]

I figured it out:

    var testString = "つれづれなるまゝに、日暮らし、硯にむかひて、心にうつりゆくよしなし事を、そこはかとなく書きつくれば、あやしうこそものぐるほしけれ。";
    // Turn every char into a decimal HTML entity (requires System.Linq), then URL-encode the result.
    var htmlEncoded = string.Concat(testString.Select(arg => string.Format("&#{0};", (int)arg)));
    var result = HttpUtility.UrlEncode(htmlEncoded);

The result matches the encoding you provided, except that HttpUtility.UrlEncode writes the hex digits in lowercase (%3b rather than %3B). Step by step:

    var inputChar = 'つ';
    var charValue = (int)inputChar;                        // 12388
    var htmlEncoded = "&#" + charValue + ";";              // &#12388;
    var urlEncoded = HttpUtility.UrlEncode(htmlEncoded);   // %26%2312388%3b
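
The sample output in the question uses uppercase hex digits (%3B). If the case matters, one option is WebUtility.UrlEncode from System.Net, which emits uppercase hex. The following is only a minimal sketch of that variant, together with the reverse direction; the class and method names (EntityUrlCodec, Encode, Decode) are made up for illustration, and swapping in WebUtility is my assumption about what fits the question, not something the asker confirmed.

```csharp
using System;
using System.Linq;
using System.Net;

// Hypothetical helper, not part of any library.
public static class EntityUrlCodec
{
    // Turn each UTF-16 code unit into a decimal HTML entity, then percent-encode the result.
    public static string Encode(string text)
    {
        string entities = string.Concat(text.Select(c => "&#" + (int)c + ";"));
        return WebUtility.UrlEncode(entities);
    }

    // Reverse the two steps: undo the percent-encoding, then resolve the entities.
    public static string Decode(string encoded)
    {
        return WebUtility.HtmlDecode(WebUtility.UrlDecode(encoded));
    }
}
```

With this sketch, EntityUrlCodec.Encode("つれ") should yield %26%2312388%3B%26%2312428%3B, and passing that string to Decode should return つれ again.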
+8

It's impossible. Unicode is much larger than ASCII, so you cannot map every Unicode character to an ASCII character. ASCII has only 128 characters (including control characters), while Unicode has well over a hundred thousand.
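
To illustrate the loss, here is a minimal sketch (the class name AsciiLossDemo is hypothetical). It relies on the default behavior of .NET's Encoding.ASCII, which substitutes '?' for any character outside the 7-bit range.

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of any library.
public static class AsciiLossDemo
{
    // Encode to ASCII bytes and decode again; characters ASCII cannot
    // represent come back as '?', so the original text is not recoverable.
    public static string RoundTrip(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(bytes);
    }
}
```

AsciiLossDemo.RoundTrip("abc") returns "abc" unchanged, while AsciiLossDemo.RoundTrip("つれ") returns "??".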

+3

Here is a function that works:

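    // Requires: using System.Text; using System.Web;
    public static string UrlDoubleEncode(string text)
    {
        if (text == null) return null;
        StringBuilder sb = new StringBuilder();
        // Each char (UTF-16 code unit) becomes a decimal HTML entity.
        foreach (int i in text)
        {
            sb.Append('&');
            sb.Append('#');
            sb.Append(i);
            sb.Append(';');
        }
        // Percent-encode the entity string.
        return HttpUtility.UrlEncode(sb.ToString());
    }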
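
One caveat: the loop above walks UTF-16 code units, so a character outside the Basic Multilingual Plane (an emoji, or a rare kanji) would be written as two surrogate values rather than one code point. The text in the question contains no such characters, so this only matters for other input. A possible variant that works per code point is sketched below; the names CodePointEncoder and UrlDoubleEncodeCodePoints are hypothetical, and it uses WebUtility.UrlEncode (uppercase hex) in place of HttpUtility.UrlEncode, which is my substitution.

```csharp
using System;
using System.Net;
using System.Text;

// Hypothetical helper, not part of any library.
public static class CodePointEncoder
{
    public static string UrlDoubleEncodeCodePoints(string text)
    {
        if (text == null) return null;
        var sb = new StringBuilder();
        for (int i = 0; i < text.Length; i++)
        {
            // ConvertToUtf32 reads one full code point, combining a surrogate pair
            // if one starts at index i. It throws on an unpaired surrogate.
            int codePoint = char.ConvertToUtf32(text, i);
            if (char.IsHighSurrogate(text[i])) i++; // skip the low surrogate just consumed
            sb.Append("&#").Append(codePoint).Append(';');
        }
        return WebUtility.UrlEncode(sb.ToString());
    }
}
```

For input within the Basic Multilingual Plane, such as つ, this produces the same entity numbers as the function above (%26%2312388%3B); for U+1F600 it produces %26%23128512%3B instead of two surrogate entities.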
+1