How to convert a Unicode string to its Unicode escapes?

Say I have the text "Բարև Hello Здравствуй". (I store this text in a QString, but if you know another way to store it in C++ code, you are welcome to suggest it.) How can I convert this text to Unicode escapes like this: "\u1330\u1377\u1408\u1415 Hello \u1047\u1076\u1088\u1072\u1074\u1089\u1090\u1074\u1091\u1081" (see here)?

+4
6 answers

I solved the problem with this code:

Improved version: (I do not want to convert Latin characters to Unicode escapes, because that would take extra space without any advantage for my problem. As a reminder, I want to generate Unicode RTF.)

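    #include <QApplication>
    #include <QMessageBox>
    #include <QString>
    #include <QTextCodec>

    int main(int argc, char *argv[])
    {
        QApplication app(argc, argv);
        // Qt 4: treat string literals passed to tr() as UTF-8.
        QTextCodec::setCodecForTr(QTextCodec::codecForName("UTF-8"));

        QString str(QWidget::tr("Բարև (1-2+3/15,69_) Hello {} [2.63] "));
        QString strNew;
        QString isAscii;
        QString tmp;

        foreach (QChar cr, str) {
            if (cr.toAscii() != QChar(0)) {
                // The character has a one-byte representation: copy it as is.
                isAscii = static_cast<QString>(cr.toAscii());
                strNew += isAscii;
            } else {
                // Otherwise write \u followed by the decimal code unit value.
                tmp.setNum(cr.unicode());
                tmp.prepend("\\u");
                strNew += tmp;
            }
        }

        QMessageBox::about(0, "Unicode escapes!", strNew);
        return app.exec();
    }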

Thanks to @Daniel Earwicker for the algorithm and of course +1.

By the way, the source file must be saved as UTF-8 in your text editor.
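
Because the goal here is RTF, it may help to see the escape in the form that format expects: the \uN control word takes N as a signed 16-bit decimal number and is normally followed by one fallback character for readers without Unicode support. The following is only a minimal sketch in standard C++ rather than Qt. The function name toRtfEscapes and the '?' fallback are assumptions for illustration, and control characters such as newlines are not handled.

```cpp
#include <string>

// Hypothetical helper, not part of Qt or of the answer above.
// ASCII passes through (with '\', '{' and '}' backslash-escaped, since
// they are special in RTF); every other UTF-16 code unit becomes \uN?
// where N is the signed 16-bit decimal value and '?' is the fallback.
std::string toRtfEscapes(const std::u16string &text)
{
    std::string out;
    for (char16_t unit : text) {
        if (unit < 0x80) {
            if (unit == u'\\' || unit == u'{' || unit == u'}')
                out += '\\';
            out += static_cast<char>(unit);
        } else {
            // Values above 32767 are written as negative numbers.
            int value = unit;
            if (value > 32767)
                value -= 65536;
            out += "\\u" + std::to_string(value) + "?";
        }
    }
    return out;
}
```

Under these assumptions, the Armenian letters Բ and ա (U+0532 and U+0561) come out as \u1330? and \u1377?, which matches the decimal numbers in the question.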

+1

    #include <cstdio>
    #include <QtCore/QString>
    #include <QtCore/QTextStream>

    int main()
    {
        QString str = QString::fromWCharArray(L"Բարև Hello ");
        QString escaped;
        // Worst case: every character becomes a 6-character \uXXXX escape.
        escaped.reserve(6 * str.size());

        for (QString::const_iterator it = str.begin(); it != str.end(); ++it) {
            QChar ch = *it;
            ushort code = ch.unicode();
            if (code < 0x80) {
                // ASCII is copied unchanged.
                escaped += ch;
            } else {
                // Everything else becomes \u plus four hexadecimal digits.
                escaped += "\\u";
                escaped += QString::number(code, 16).rightJustified(4, '0');
            }
        }

        QTextStream stream(stdout);
        stream << escaped << '\n';
    }

Note that this loops over the UTF-16 code units, not the actual code points.
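
To illustrate the difference between code units and code points, here is a small sketch in standard C++ (not Qt) that combines UTF-16 surrogate pairs into single code points. The name toCodePoints is hypothetical, and unpaired surrogates are simply passed through unchanged, which is an assumption rather than a recommendation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turns UTF-16 code units into Unicode code points.
std::vector<char32_t> toCodePoints(const std::u16string &text)
{
    std::vector<char32_t> points;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char32_t unit = text[i];
        bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHigh && i + 1 < text.size()) {
            char32_t low = text[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                // A high surrogate followed by a low one encodes a single
                // code point above U+FFFF.
                points.push_back(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00));
                ++i; // the low surrogate has been consumed
                continue;
            }
        }
        points.push_back(unit); // BMP character or unpaired surrogate
    }
    return points;
}
```

A character such as U+1F600 occupies two code units (0xD83D, 0xDE00) in a UTF-16 string, so a loop over code units emits two \uXXXX escapes for it. Whether that is acceptable depends on the target format.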

+5

I assume that you are generating code (JavaScript, maybe?)

A QString is like a collection of QChar. Loop through the contents, and on each QChar call the unicode method to get a ushort (16-bit integer) value.

Then format each value with a format string such as "\\u%04X", i.e. \u followed by a 4-digit hexadecimal value.
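
The two steps above can be sketched in standard C++ as follows. This is an illustration only: std::u16string stands in for QString, and the function name escapeAll is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: writes every UTF-16 code unit as \uXXXX.
std::string escapeAll(const std::u16string &text)
{
    std::string out;
    out.reserve(6 * text.size()); // each unit becomes exactly 6 characters
    for (char16_t unit : text) {
        char buf[7]; // backslash, 'u', four hex digits, terminating null
        std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(unit));
        out += buf;
    }
    return out;
}
```

With this sketch the two-character string "A" followed by Բ (U+0532) becomes \u0041\u0532. Note that this is hexadecimal, as JavaScript-style escapes use, while the numbers in the question are decimal.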

NB: You may need to swap the two bytes (each byte is two hexadecimal digits) to get the correct result, depending on the platform you are working on.

+3
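
    #include <cstddef>
    #include <cwchar>
    #include <string>

    // Escapes str_len wide characters from input as \uXXXX.
    // Assumes every value fits in 16 bits (four hex digits).
    std::wstring escape(const wchar_t *input, std::size_t str_len)
    {
        std::wstring output;
        for (std::size_t i = 0; i < str_len; i++) {
            wchar_t code[7]; // backslash, 'u', four hex digits, terminating null
            swprintf(code, 7, L"\\u%04X", static_cast<unsigned>(input[i]));
            output += code;
        }
        return output;
    }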
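
One caveat with the wchar_t approach: on many non-Windows platforms wchar_t is 32 bits wide, so a single element can hold a value above 0xFFFF that no longer fits in four hexadecimal digits. The sketch below shows one possible way to handle this, by splitting such code points into a UTF-16 surrogate pair before escaping. It uses std::u32string instead of wchar_t, and the name escapeCodePoints is hypothetical.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escapes full code points, keeping ASCII as is and
// writing anything above U+FFFF as two \uXXXX escapes (a surrogate pair).
std::string escapeCodePoints(const std::u32string &text)
{
    std::string out;
    char buf[7]; // backslash, 'u', four hex digits, terminating null
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp <= 0xFFFF) {
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(cp));
            out += buf;
        } else {
            // Split into high and low surrogate as UTF-16 does.
            unsigned v = static_cast<unsigned>(cp) - 0x10000;
            std::snprintf(buf, sizeof buf, "\\u%04X", 0xD800 + (v >> 10));
            out += buf;
            std::snprintf(buf, sizeof buf, "\\u%04X", 0xDC00 + (v & 0x3FF));
            out += buf;
        }
    }
    return out;
}
```

Whether surrogate-pair escapes are the right output depends on what will read them. JavaScript and JSON accept them, but other formats may expect something else.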
+2

You must first determine which encoding is used for the text "Բարև Hello Здравствуй". It looks like Russian, so it may be Windows code page 1251, or UTF-8, or something else. Then call the Windows function MultiByteToWideChar with the required arguments, such as the code page and the source string.

Hope this helps.

0

My solution:

    // input is assumed to be a null-terminated wide string (const wchar_t *).
    std::wstring output;
    QString result;
    QTextCodec::setCodecForLocale( QTextCodec::codecForName( "UTF-8" ) );

    for( uint i = 0; wcslen( input ) > i; ++i )
    {
        if( isascii( input[ i ] ) )
        {
            // ASCII characters are copied unchanged.
            output.reserve( output.size() + 1 );
            output += input[ i ];
        }
        else
        {
            wchar_t code[ 7 ];
            swprintf( code, 7, L"\\u%0.4X", input[ i ] );
            output.reserve( output.size() + 7 ); // "\u"(2) + 5(uint max digits capacity)
            output += code;
        }
    }

    result.reserve( output.size() );
    result.append( QString::fromStdWString( output ) );

It works correctly with Russian. It converts

 hello привет

into

 hello \\u043F\\u0440\\u0438\\u0432\\u0435\\u0442 
0

Source: https://habr.com/ru/post/1315174/

