Searching for a UTF-8 string with the Gmail X-GM-RAW IMAP command

The Gmail IMAP extension command X-GM-RAW lets me search as long as the query string is ASCII. If the query contains UTF-8 characters, the server returns a BAD response.

https://developers.google.com/google-apps/gmail/imap_extensions#extension_of_the_search_command_x-gm-raw

How should the UTF-8 input string be encoded so that the X-GM-RAW search works? I do not want to lose the flexibility of searching in a specific field, for example "subject" or "rfc822msgid".

thanks

+2
2 answers

Specify CHARSET UTF-8 and send the UTF-8 search query as a literal. For example, to search for 你好, which is 6 bytes long when encoded in UTF-8:

    a SEARCH CHARSET UTF-8 X-GM-RAW {6}
    + go ahead
    你好
    * SEARCH 15
    a OK SEARCH completed (Success)

In this example, you send the actual 6-byte UTF-8 encoding of 你好 on the third line.

This will work for any SEARCH key that accepts an astring, including SUBJECT and TEXT.
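
To make the byte counting concrete, here is a minimal Python sketch of how the two pieces of such a command could be assembled. The helper name build_raw_search is made up for illustration and is not part of any library; the sketch only builds the bytes and does not talk to a server.

```python
def build_raw_search(tag: str, query: str) -> tuple:
    """Return (command line, literal payload) for a UTF-8 X-GM-RAW search.

    Hypothetical helper for illustration only.
    """
    payload = query.encode("utf-8")
    # The number in braces is the size of the literal in bytes, not characters.
    command = f"{tag} SEARCH CHARSET UTF-8 X-GM-RAW {{{len(payload)}}}\r\n"
    # The payload goes out only after the server's "+" continuation response;
    # the CRLF after it terminates the command.
    return command.encode("ascii"), payload + b"\r\n"


command, literal = build_raw_search("a", "你好")
print(command)
print(len("你好"), "characters,", len("你好".encode("utf-8")), "bytes")
```

A real client would send the first value, wait for the continuation line, and then send the second. Field-specific Gmail queries such as a subject search should work the same way, since the whole query string is just the literal payload.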

+1

IMAP is not 8-bit clean, so it has to use various encodings to represent any 8-bit data.

For things like folders and labels, IMAP4 uses Modified UTF-7 to represent these characters. Conveniently, ASCII data encodes to itself in modified UTF-7 (apart from "&", which is written as "&-"), so usually nothing special is required.
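
As a rough illustration of what modified UTF-7 does to a folder name, here is a small Python sketch of the encoding direction only. The function name encode_modified_utf7 is my own, and this is a simplified reading of the rules in RFC 3501, not the IMAPClient implementation.

```python
import base64


def encode_modified_utf7(name: str) -> str:
    """Encode a mailbox name as modified UTF-7 (illustrative sketch)."""
    out = []
    pending = []  # run of characters that cannot be sent as plain ASCII

    def flush():
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no "=" padding, and "," instead of "/".
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            # A literal ampersand is escaped as "&-".
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


print(encode_modified_utf7("INBOX"))
print(encode_modified_utf7("~peter/mail/台北/日本語"))
```

Plain ASCII names pass through unchanged, which is why most folder names need no special handling.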

For message headers (including subjects), the text is encoded as MIME encoded-words.
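
For comparison, this is how a non-ASCII subject looks as a MIME encoded-word, using Python's standard email.header module. It only illustrates the header encoding and is not a way of forming search queries.

```python
from email.header import Header, decode_header

subject = "你好"

# Encode the subject as an RFC 2047 encoded-word (pure ASCII on the wire).
encoded = Header(subject, "utf-8").encode()
print(encoded)

# Decode it back to check the round trip.
raw, charset = decode_header(encoded)[0]
print(raw.decode(charset))
```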

Finally, attachments are usually encoded as Base64 or Quoted-Printable.

My best guess is that Gmail uses modified UTF-7 for its X-GM-RAW queries. The best reference implementation of modified UTF-7 I have found is in the Python IMAPClient library.

Hope this helps!

0