How to convert from a legacy encoding to UTF-8 in Go?

I am working on a project where I need to convert text from a legacy encoding (e.g. Windows-1256 Arabic) to UTF-8.

How can I do this in Go?

+7
go unicode
2 answers

You can use the golang.org/x/text/encoding packages. Windows-1256 is supported through the golang.org/x/text/encoding/charmap package (in the example below, import this package and use charmap.Windows1256 instead of japanese.ShiftJIS).

Here is a short example that encodes a Japanese UTF-8 string to ShiftJIS and then decodes the ShiftJIS string back to UTF-8. Unfortunately, this does not work on the playground, as the "x" packages are not available there.

    package main

    import (
        "bytes"
        "fmt"
        "io"
        "strings"

        "golang.org/x/text/encoding/japanese"
        "golang.org/x/text/transform"
    )

    func main() {
        // the string we want to transform
        s := "今日は"
        fmt.Println(s)

        // --- Encoding: convert s from UTF-8 to ShiftJIS
        // declare a bytes.Buffer b and an encoder which will write into this buffer
        var b bytes.Buffer
        wInUTF8 := transform.NewWriter(&b, japanese.ShiftJIS.NewEncoder())
        // encode our string
        wInUTF8.Write([]byte(s))
        wInUTF8.Close()
        // print the encoded bytes
        fmt.Printf("%#v\n", b.Bytes())
        encS := b.String()
        fmt.Println(encS)

        // --- Decoding: convert encS from ShiftJIS to UTF-8
        // declare a decoder which reads from the string we have just encoded
        rInUTF8 := transform.NewReader(strings.NewReader(encS), japanese.ShiftJIS.NewDecoder())
        // decode our string (io.ReadAll replaces the deprecated ioutil.ReadAll)
        decBytes, err := io.ReadAll(rInUTF8)
        if err != nil {
            panic(err)
        }
        decS := string(decBytes)
        fmt.Println(decS)
    }

Here's a more complete example on the Japanese StackOverflow site. The text is Japanese, but the code should be clear: https://ja.stackoverflow.com/questions/6120

+9

Use the packages from golang.org/x/text. In your case, it will be something like this:

    // requires: import "golang.org/x/text/encoding/charmap"
    b := []byte{ /* Win1256 bytes here. */ }
    dec := charmap.Windows1256.NewDecoder()
    // Take more space just in case some characters need
    // more bytes in UTF-8 than in Win1256.
    bUTF := make([]byte, len(b)*3)
    // atEOF is true because b holds the complete input.
    n, _, err := dec.Transform(bUTF, b, true)
    if err != nil {
        panic(err)
    }
    bUTF = bUTF[:n]
+2