"missing word in phrase: charset not supported" when using the mail package

I am trying to parse emails with the mail package and get errors like the ones below. Is this a bug in the mail package, or something I have to handle myself?

missing word in phrase: charset not supported: "gb18030"

charset not supported: "koi8-r"

missing word in phrase: charset not supported: "ks_c_5601-1987"

How can I fix them? I think I should use the charset package, but I'm not sure how. Here is what the email header looks like:

    Received: from smtpbg303.qq.com ([184.105.206.26]) by mx-ha.gmx.net (mxgmxus001)
      with ESMTPS (Nemesis) id 0MAOx2-1X2yNC2ZFC-00BaVU
      for <sormester@lobbyist.com>; Sat, 14 Jun 2014 18:11:48 +0200
    DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qq.com; s=s201307;
      t=1402762305; bh=imEvSr8IPsqWTXU63xUHRv+wuQG+Tcz2mPP9ai4rrE4=;
      h=X-QQ-FEAT:X-QQ-SSF:X-HAS-ATTACH:X-QQ-BUSINESS-ORIGIN:
      X-Originating-IP:In-Reply-To:References:X-QQ-STYLE:X-QQ-mid:From:To:Subject:Mime-Version:Content-Type:Content-Transfer-Encoding:Date:
      X-Priority:Message-ID:X-QQ-MIME:X-Mailer:X-QQ-Mailer:
      X-QQ-ReplyHash:X-QQ-SENDSIZE:X-QQ-FName:X-QQ-LocalIP;
      b=QXs4CveboS8nG6htN9W6amC3X+F7X3ZtFrt6jrjWI+RmbvqBuTCVmX9IlaqCX84H8
      n14x2Wp7x4kDYcNRqhe+HjTpf715TTQXc4d40b9e38frC/5qIhpMtYNsD8iEJwRzHW
      U3xi8Yq7OCIB303fIpytx8tOjexQpZKSHbJ7ecX0=
    X-QQ-FEAT: zaIfg0hwV2pIDflZYPQUsuPPXG5wtRVHJU6PiOYLBBA=
    X-QQ-SSF: 00010000000000F000000000000000L
    X-HAS-ATTACH: no
    X-QQ-BUSINESS-ORIGIN: 2
    X-Originating-IP: 180.155.99.102
    In-Reply-To: <trinity-b7c6d611-52fd-4afa-b739-2deb243532a6-1402761364579@3capp-mailcom-lxa05>
    References: <97e07dab7c2d1a005ed928c4350690e0@hotels-desk.co.uk>,
      <tencent_105D3DC11702F53465C0025D@qq.com>
      <trinity-b7c6d611-52fd-4afa-b739-2deb243532a6-1402761364579@3capp-mailcom-lxa05>
    X-QQ-STYLE:
    X-QQ-mid: webmail474t1402762303t356131
    From: "=?gb18030?B?08bTzg==?=" <38438nx@qq.com>
    To: "=?gb18030?B?V2lsaGVsbSBLdW1tZXI=?=" <sormester@lobbyist.com>
    Subject: =?gb18030?B?u9i4tKO6ILvYuLSjulBhbGFjZSBXZXN0bWluc3Rl?=
      =?gb18030?B?cjogMDEtMDctMjAxNCAtIDA0LTA3LTIwMTQ=?=
    Mime-Version: 1.0
    Content-Type: multipart/alternative;
      boundary="----=_NextPart_539C743F_08A07490_0157E268"
    Content-Transfer-Encoding: 8Bit
    Date: Sun, 15 Jun 2014 00:11:43 +0800
    X-Priority: 3
    Message-ID: <tencent_573A737E73016B9F5A3D10C1@qq.com>
    X-QQ-MIME: TCMime 1.0 by Tencent
    X-Mailer: QQMail 2.x
    X-QQ-Mailer: QQMail 2.x
    X-QQ-ReplyHash: 170675637
    X-QQ-SENDSIZE: 520
    X-QQ-FName: 7B2EFFAD16B8462B84D3499A4CC7DDEF
    X-QQ-LocalIP: 163.177.66.155
    Envelope-To: <sormester@lobbyist.com>
    X-GMX-Antispam: 0 (Mail was not recognized as spam); Detail=V3;
    X-GMX-Antivirus: 0 (no virus found)

Edit:

I tried using the charset package, but it has no effect. I still get the same error in the same messages.

    import "code.google.com/p/go-imap/go1/imap"

    // Raw header bytes as returned by the IMAP server.
    header := imap.AsBytes(rsp.MessageInfo().Attrs["RFC822.HEADER"])

    // Attempt: wrap the whole header in a charset reader before parsing.
    r, err := charset.NewReader("UTF-8", bytes.NewReader(header))
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("new char is %v", r)

    msg, err := mail.ReadMessage(r)
    if err != nil {
        log.Fatal(err) // exits, so no return is needed here
    }

    // This is the call that fails with "charset not supported".
    mg.From, err = msg.Header.AddressList("From")
    if err != nil {
        log.Errorf("NO FROM msg %s, err %v", header, err)
        return
    }

The mail package seems to be able to decode only RFC 2047, but the charset package does not support it:

 character set "rfc2047" not found 

Could mahonia solve the problem?

2 answers

I hope this helps someone who is considering Go for processing email (for example, developing client applications). It seems the Go standard library is not mature enough to handle email. It does not handle multipart messages, different character sets, etc. After almost two days of trying different hacks and packages, I decided to just throw away the Go code and use the good old JavaMail solution.


Alexey Vasiliev's MIT-licensed http://github.com/le0pard/go-falcon/ includes a parser package that applies whichever encoding package is needed to decode the headers (the meat is in utils.go).

    package main

    import (
        "bufio"
        "bytes"
        "fmt"
        "net/textproto"

        "github.com/le0pard/go-falcon/parser"
    )

    // A folded Subject header followed by the blank line that ends the header block.
    var msg = []byte("Subject: =?gb18030?B?u9i4tKO6ILvYuLSjulBhbGFjZSBXZXN0bWluc3Rl?=\r\n" +
        " =?gb18030?B?cjogMDEtMDctMjAxNCAtIDA0LTA3LTIwMTQ=?=\r\n" +
        "\r\n")

    func main() {
        tpr := textproto.NewReader(bufio.NewReader(bytes.NewBuffer(msg)))
        mh, err := tpr.ReadMIMEHeader()
        if err != nil {
            panic(err)
        }
        // Decode every value of every header and print it.
        for name, vals := range mh {
            for _, val := range vals {
                val = parser.MimeHeaderDecode(val)
                fmt.Print(name, ": ", val, "\n")
            }
        }
    }

It looks like parser.FixEncodingAndCharsetOfPart is what the package uses to decode and convert the content, albeit with a few extra allocations caused by converting the body []byte to and from a string. If the API does not work for you, you can at least read the code to see how it can be done.

Found via the godoc.org "... and is imported by 3 packages" link from encoding/simplifiedchinese. Hooray for godoc.org!
