Reading Unicode from a redirected STDOUT (C++, Win32 API, Qt)

I have a C++ application that dynamically loads plug-in DLLs. The DLLs send text output through std::cout and std::wcout. The Qt-based UI must capture all text output from the DLLs and display it. Replacing the stream buffers does not fully work, because the DLLs may have their own instances of cout/wcout when they are built against different runtime libraries. So I applied Windows-specific STDOUT redirection as follows:

    StreamReader::StreamReader(QObject *parent) : QThread(parent)
    {
        // void
    }

    void StreamReader::cleanUp()
    {
        // restore stdout
        SetStdHandle(STD_OUTPUT_HANDLE, oldStdoutHandle);

        CloseHandle(stdoutRead);
        CloseHandle(stdoutWrite);
        CloseHandle(oldStdoutHandle);

        hConHandle = -1;
        initDone = false;
    }

    bool StreamReader::setUp()
    {
        if (initDone)
        {
            if (this->isRunning())
                return true;
            else
                cleanUp();
        }

        do
        {
            // save stdout
            oldStdoutHandle = ::GetStdHandle(STD_OUTPUT_HANDLE);
            if (INVALID_HANDLE_VALUE == oldStdoutHandle)
                break;

            if (0 == ::CreatePipe(&stdoutRead, &stdoutWrite, NULL, 0))
                break;

            // redirect stdout, stdout now writes into the pipe
            if (0 == ::SetStdHandle(STD_OUTPUT_HANDLE, stdoutWrite))
                break;

            // new stdout handle
            HANDLE lStdHandle = ::GetStdHandle(STD_OUTPUT_HANDLE);
            if (INVALID_HANDLE_VALUE == lStdHandle)
                break;

            hConHandle = ::_open_osfhandle((intptr_t)lStdHandle, _O_TEXT);
            FILE *fp = ::_fdopen(hConHandle, "w");
            if (!fp)
                break;

            // replace stdout with pipe file handle
            *stdout = *fp;

            // unbuffered stdout
            ::setvbuf(stdout, NULL, _IONBF, 0);

            hConHandle = ::_open_osfhandle((intptr_t)stdoutRead, _O_TEXT);
            if (-1 == hConHandle)
                break;

            return initDone = true;
        } while (false);

        cleanUp();
        return false;
    }

    void StreamReader::run()
    {
        if (!initDone)
        {
            qCritical("Stream reader is not initialized!");
            return;
        }

        qDebug() << "Stream reader thread is running...";

        QString s;
        DWORD nofRead = 0;
        DWORD nofAvail = 0;
        char buf[BUFFER_SIZE+2] = {0};

        for (;;)
        {
            PeekNamedPipe(stdoutRead, buf, BUFFER_SIZE, &nofRead, &nofAvail, NULL);

            if (nofRead)
            {
                if (nofAvail >= BUFFER_SIZE)
                {
                    while (nofRead >= BUFFER_SIZE)
                    {
                        memset(buf, 0, BUFFER_SIZE);
                        if (ReadFile(stdoutRead, buf, BUFFER_SIZE, &nofRead, NULL) && nofRead)
                        {
                            s.append(buf);
                        }
                    }
                }
                else
                {
                    memset(buf, 0, BUFFER_SIZE);
                    if (ReadFile(stdoutRead, buf, BUFFER_SIZE, &nofRead, NULL) && nofRead)
                    {
                        s.append(buf);
                    }
                }

                // Since textReady must emit only complete lines,
                // watch for LFs
                if (s.endsWith('\n')) // may be emitted
                {
                    emit textReady(s.left(s.size()-2));
                    s.clear();
                }
                else // last line is incomplete, hold emitting
                {
                    if (-1 != s.lastIndexOf('\n'))
                    {
                        emit textReady(s.left(s.lastIndexOf('\n')-1));
                        s = s.mid(s.lastIndexOf('\n')+1);
                    }
                }

                memset(buf, 0, BUFFER_SIZE);
            }
        }

        // clean up on thread finish
        cleanUp();
    }
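The "emit only complete lines" step in run() can be looked at separately from the pipe handling. Below is a minimal sketch of that step, assuming plain std::string instead of QString so that it compiles without Qt. The name takeCompleteLines is hypothetical and not part of the code above, and unlike the original it returns the lines one by one rather than as a single joined string.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original code): removes every complete
// line from `pending` and returns the lines without their terminators
// ("\n" or "\r\n"). Whatever follows the last '\n' stays in `pending`
// until more data arrives.
std::vector<std::string> takeCompleteLines(std::string& pending) {
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type lf = pending.find('\n', start);
        if (lf == std::string::npos)
            break;  // no further complete line in the buffer
        std::string::size_type end = lf;
        if (end > start && pending[end - 1] == '\r')
            --end;  // drop the CR of a CRLF pair
        lines.emplace_back(pending, start, end - start);
        start = lf + 1;
    }
    pending.erase(0, start);  // keep only the incomplete tail
    return lines;
}
```

Checking for an optional '\r' before each '\n' avoids hard-coding the terminator length, which matters because a pipe opened in text mode may deliver either CRLF or a bare LF.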

However, this solution seems to have an obstacle: the C runtime library, which is locale-dependent. Any output sent to wcout does not reach my buffer, because the C runtime truncates strings at the non-printable ASCII characters present in a UTF-16 encoded string. Calling setlocale() demonstrates that the C runtime re-encodes the strings. setlocale() does not help me, because I have no knowledge of the language or locale of the text: the plug-in DLLs come from outside the system, and different languages may be mixed. After giving it some thought, I decided to drop this solution and go back to replacing the cout/wcout buffers, requiring the DLLs to call an initialization method, for two reasons: UTF-16 did not get through to my buffer, and even if it did, I would still have to work out the encoding of the data in the buffer. However, I am still curious whether there is a way to get UTF-16 strings through the C runtime into the pipe "as is", without a locale-dependent conversion?

P.S. Any suggestions for redirecting cout/wcout to the user interface, other than the two approaches mentioned, are welcome too :)

Thank you in advance!

+4
3 answers

The problem is that the conversion from wchar_t to char is done entirely inside the plug-in DLL, by whatever cout/wcout implementation it happens to use (which, as you say, may not be the one used by the main application). So the only way to make it behave differently is to intercept that mechanism somehow, for example by replacing the streambuf.

However, as you imply, any code that you write in the main application will not necessarily be compatible with the implementation of the library that the DLL uses. For example, if you implement a stream buffer in the main application, it will not necessarily use the same ABI as the stream buffers in the DLL. So this is risky.

I suggest you implement a wrapper DLL that uses the same version of the C++ library as the plug-in, so that it is guaranteed to be compatible, and have this wrapper DLL perform the necessary intervention in cout/wcout. It can load the plug-in dynamically, so it can be reused with any plug-in that uses that version of the library. Alternatively, you can create some reusable source code that is compiled specifically for each plug-in, thereby producing a sanitized version of each plug-in.

Once the DLL is wrapped, you can replace the stream buffer of cout/wcout with one that stores the data in memory, as I think you originally planned, and you should not need to deal with file handles at all.
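As an illustration of the in-memory stream buffer idea, here is a minimal sketch using only the standard library. The class names CapturingWideBuf and ScopedWideRedirect are hypothetical, and the sketch assumes it is compiled against the same C++ runtime as the code that writes to wcout, which is exactly the compatibility condition discussed above.

```cpp
#include <iostream>
#include <streambuf>
#include <string>

// Hypothetical helper: a wide stream buffer that keeps everything written
// to it in memory instead of passing it on to a file.
class CapturingWideBuf : public std::wstreambuf {
public:
    const std::wstring& text() const { return text_; }

protected:
    // Called for single characters, since no put area is set up.
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            text_.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }

    // Called for bulk writes; appending here avoids per-character calls.
    std::streamsize xsputn(const wchar_t* s, std::streamsize n) override {
        text_.append(s, static_cast<std::size_t>(n));
        return n;
    }

private:
    std::wstring text_;
};

// Hypothetical RAII guard: installs a buffer on a stream and restores the
// previous buffer when it goes out of scope.
class ScopedWideRedirect {
public:
    ScopedWideRedirect(std::wostream& os, std::wstreambuf* buf)
        : os_(os), old_(os.rdbuf(buf)) {}
    ~ScopedWideRedirect() { os_.rdbuf(old_); }

    ScopedWideRedirect(const ScopedWideRedirect&) = delete;
    ScopedWideRedirect& operator=(const ScopedWideRedirect&) = delete;

private:
    std::wostream& os_;
    std::wstreambuf* old_;
};
```

With a ScopedWideRedirect constructed on std::wcout, anything streamed into wcout lands in the CapturingWideBuf as wide characters, so no locale-dependent narrowing takes place on the way. A real UI would hand the captured text to the display thread instead of only storing it.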

P.S. If you ever need a wide stream that converts to and from UTF-8, I recommend Boost's utf8_codecvt_facet as a very neat way to do it. It is easy to use, and the documentation has sample code. (In this scenario you would have to compile Boost specifically for the library version that the plug-in uses, which is not necessary in the general case.)

+1

I don't know if this is possible, but perhaps you can run the DLL in a separate process and capture that process's output through the Windows equivalent of a pipe (whatever that is; Qt's QProcess should take care of it for you). This would be similar to how Firefox runs plug-ins out of process (the default in 3.6.6, although it had been done for some time with 64-bit Firefox and the 32-bit Flash plug-in). You would have to come up with a way to communicate with the DLL in the separate process, for example through shared memory, but it should be possible. Not necessarily pretty, but possible.

0

Try:

 std::wcout.imbue(std::locale("en_US.UTF-8")); 

This is stream-specific, which is better than using the global C library setlocale().

However, you may need to tweak the locale name according to what your runtime supports.
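Since std::locale throws std::runtime_error when the runtime does not recognize a name, one way to do that tweaking is to try several candidate names in turn. This is a minimal sketch; the helper name imbueFirstAvailable is hypothetical, and which names actually work depends entirely on your platform and runtime.

```cpp
#include <initializer_list>
#include <iostream>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: tries each candidate locale name in order and imbues
// the stream with the first one the runtime accepts.
// Returns the accepted name, or an empty string if none of them worked.
std::string imbueFirstAvailable(std::wostream& os,
                                std::initializer_list<const char*> names) {
    for (const char* name : names) {
        try {
            os.imbue(std::locale(name));
            return name;
        } catch (const std::runtime_error&) {
            // The runtime does not know this name; try the next candidate.
        }
    }
    return std::string();
}
```

A call such as imbueFirstAvailable(std::wcout, {"en_US.UTF-8", "C"}) leaves the stream untouched for every name that is rejected, so the caller can decide what to do if the returned string is empty.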

0

Source: https://habr.com/ru/post/1314935/