Problem with std::codecvt_utf8 facet

Here is a piece of code that uses the std::codecvt_utf8<> facet to convert from wchar_t to UTF-8. With Visual Studio 2012, my expectations are not met (see the condition at the end of the code). Are my expectations wrong? If so, why? Or is this a Visual Studio 2012 library issue?

#include <locale>
#include <codecvt>
#include <cstdlib>
#include <cwchar>   // std::mbstate_t

int main ()
{
    std::mbstate_t state = std::mbstate_t ();
    std::locale loc (std::locale (), new std::codecvt_utf8<wchar_t>);
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
    codecvt_type const & cvt = std::use_facet<codecvt_type> (loc);

    wchar_t ch = L'\u5FC3';
    wchar_t const * from_first = &ch;
    wchar_t const * from_mid = &ch;
    wchar_t const * from_end = from_first + 1;

    char out_buf[1];
    char * out_first = out_buf;
    char * out_mid = out_buf;
    char * out_end = out_buf + 1;

    std::codecvt_base::result cvt_res
        = cvt.out (state, from_first, from_end, from_mid,
            out_first, out_end, out_mid);

    // This is what I expect:
    if (cvt_res == std::codecvt_base::partial
        && out_mid == out_end
        && state != 0)
        ;
    else
        abort ();
}

I expect out() to produce the UTF-8 encoding one byte at a time, but with Visual Studio 2012 the middle part of the if condition above is false.

UPDATE

The conditions that fail are out_mid == out_end and state != 0. Basically, I expect at least one byte to be produced, and the information needed to produce the next byte of the UTF-8 sequence to be stored in the state variable.


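The standard's description of partial as a return value of codecvt::do_out says just this.

In Table 83:

partial: not all source characters converted

In 22.4.1.4.2 [locale.codecvt.virtuals]/5:

Returns: An enumeration value, as summarized in Table 83. A return value of partial, if (from_next==from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced.

In your case not all (in fact, zero) source characters were converted, which strictly speaking says nothing about the contents of the output sequence (the "if" clause of that sentence does not apply). Generally, though, "the destination sequence has not absorbed all the available destination elements" refers to complete multibyte characters: those are the elements of the sequence that codecvt_utf8 produces.

More explicit wording in the standard would be welcome, but there are two pieces of circumstantial evidence:

First: the C function std::wcsrtombs (whose locale-specific variants are usually what existing implementations of codecvt::do_out call for system-supplied locales) is specified as follows:

Conversion stops [...] when the next multibyte character would exceed the limit of len total bytes to be stored into the array pointed to by dst.

Second, the existing implementations of codecvt_utf8: you have already looked at Microsoft's, and here is libc++: its codecvt_utf8::do_out calls ucs2_to_utf8 on Windows and ucs4_to_utf8 on other systems, and ucs2_to_utf8 does the following (comments mine):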

        else if (wc < 0x0800)
        {
            // not relevant
        }
        else // if (wc <= 0xFFFF)
        {
            if (to_end-to_nxt < 3)
                return codecvt_base::partial; // <- look here
            *to_nxt++ = static_cast<uint8_t>(0xE0 |  (wc >> 12));
            *to_nxt++ = static_cast<uint8_t>(0x80 | ((wc & 0x0FC0) >> 6));
            *to_nxt++ = static_cast<uint8_t>(0x80 |  (wc & 0x003F));
        }

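That is, nothing is written if the output sequence cannot hold the complete multibyte character that results from consuming one wide character.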
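The space check in the excerpt above can be tried in isolation. The following is only a minimal sketch under stated assumptions: encode_bmp_utf8 and encode_result are hypothetical names invented for this illustration, not part of any standard library, and the function handles only code points up to U+FFFF. It shows the same all-or-nothing rule: either the whole UTF-8 sequence fits and is written, or partial is returned and the output is left untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type, loosely mirroring std::codecvt_base::result.
enum class encode_result { ok, partial, error };

// Hypothetical helper (not a library function): encodes one code point of
// the Basic Multilingual Plane as UTF-8. Like the excerpt above, it checks
// the free space first and writes nothing unless the whole sequence fits.
encode_result encode_bmp_utf8(std::uint32_t wc, char* to, char* to_end,
                              char*& to_next)
{
    assert(to <= to_end);
    to_next = to;  // report "nothing written" unless we succeed below

    if (wc > 0xFFFF || (wc >= 0xD800 && wc <= 0xDFFF))
        return encode_result::error;    // outside the BMP, or a surrogate

    std::ptrdiff_t const needed = wc < 0x80 ? 1 : wc < 0x800 ? 2 : 3;
    if (to_end - to < needed)
        return encode_result::partial;  // no room: output left untouched

    if (needed == 1) {
        *to_next++ = static_cast<char>(wc);
    } else if (needed == 2) {
        *to_next++ = static_cast<char>(0xC0 | (wc >> 6));
        *to_next++ = static_cast<char>(0x80 | (wc & 0x3F));
    } else {
        *to_next++ = static_cast<char>(0xE0 | (wc >> 12));
        *to_next++ = static_cast<char>(0x80 | ((wc >> 6) & 0x3F));
        *to_next++ = static_cast<char>(0x80 | (wc & 0x3F));
    }
    return encode_result::ok;
}
```

Called with U+5FC3 and a one-byte buffer, this sketch returns partial with to_next equal to to, which corresponds to the behavior the question observes; with a three-byte buffer it writes the bytes E5 BF 83.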


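Although I have no direct reference for it, I think this is the most logical behavior for std::codecvt::out. Consider the following scenario:

  • You call std::codecvt::out the way you did, converting no characters (possibly without knowing it) into out_buf.
  • You then want to convert another string into out_buf (again with std::codecvt::out), appending it to what is already there.
  • To do that, you use buf_mid, since you know it points just past the string converted in the first step.
  • If std::codecvt::out worked the way you expect (buf_mid pointing past the first byte), the first character of out_buf would never be written, which is hardly what you would want or expect.

In essence, extern_type*& to_next (the last parameter of std::codecvt::out) tells you where the conversion left off, so that you know where to continue; in your case that is indeed the same position as where you started (the extern_type* to parameter).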
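The role of to_next as a resume position can be sketched as follows. This is an illustration under assumptions rather than a definitive pattern: convert_pieces is a hypothetical helper, the buffer size of 64 is arbitrary, and the code uses the codecvt<wchar_t, char, mbstate_t> facet of the classic "C" locale, so only plain ASCII input is expected to convert. Each call to out starts writing exactly where the previous call reported it stopped; when a call converts nothing, to_next equals to and the next call simply starts at the same place.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper: converts each piece in turn into one shared buffer,
// resuming every call at the position the previous call left in to_next.
std::string convert_pieces(wchar_t const* const* pieces, std::size_t count)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
    codecvt_type const& cvt =
        std::use_facet<codecvt_type>(std::locale::classic());

    char out_buf[64];                       // arbitrary size for the sketch
    char* write_pos = out_buf;              // where the next piece starts
    char* const out_end = out_buf + sizeof out_buf;
    std::mbstate_t state = std::mbstate_t();

    for (std::size_t i = 0; i < count; ++i) {
        wchar_t const* from = pieces[i];
        wchar_t const* from_end = from + std::wcslen(from);
        wchar_t const* from_next = from;
        char* to_next = write_pos;

        std::codecvt_base::result res = cvt.out(
            state, from, from_end, from_next, write_pos, out_end, to_next);

        // to_next never moves backwards and never passes the buffer end.
        assert(write_pos <= to_next && to_next <= out_end);
        if (res == std::codecvt_base::error)
            break;                          // stop at the first bad character

        write_pos = to_next;                // resume here on the next call
    }
    return std::string(out_buf, write_pos);
}
```

For the pieces L"ab", L"" and L"cd", the empty piece converts nothing, leaves the write position unchanged, and the last piece is appended directly after "ab".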
