How to write UTF-8 in a CSV file

I am trying to create a CSV text file from a PyQt4 QTableWidget. I want to write UTF-8 encoded text, as it contains special characters. I am using the following code:

    import codecs
    ...
    myfile = codecs.open(filename, 'w', 'utf-8')
    ...
    f = result.table.item(i, c).text()
    myfile.write(f + ";")

It works until the cell contains a special character. I also tried

    myfile = open(filename, 'w')
    ...
    f = unicode(result.table.item(i, c).text(), "utf-8")

But it also stops when a special character appears. I have no idea what I'm doing wrong.

+69
python encoding csv utf-8 pyqt4
Sep 12 '13 at 14:25
7 answers

From shell startup:

 pip2 install unicodecsv 

Then, assuming you are using Python's built-in csv module, change
import csv to
import unicodecsv as csv in your code.
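A minimal sketch of the swap (the file name and data are made up; the sketch falls back to the Python 3 stdlib csv, which is already Unicode-aware, when unicodecsv is not installed):

```python
import io

# unicodecsv keeps the csv API; only the import changes.
try:
    import unicodecsv as csv
    buf = io.BytesIO()       # unicodecsv writes encoded bytes
except ImportError:
    import csv
    buf = io.StringIO()      # Python 3 stdlib csv writes text

writer = csv.writer(buf, delimiter=';')
writer.writerow([u'café', u'naïve'])
print(buf.getvalue())
```

The rest of the code (csv.writer, writerow, writerows) stays exactly as it was.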

+87
Jul 26 '15 at 21:19

It is very simple for Python 3.x (docs).

    import csv

    with open('output_file_name', 'w', newline='', encoding='utf-8') as csv_file:
        writer = csv.writer(csv_file, delimiter=';')
        writer.writerow(['my_utf8_string'])

For Python 2.x, look here.
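A slightly fuller Python 3 sketch of the same approach: write rows containing special characters and read them back to confirm nothing is mangled (file name and sample data are made up for illustration):

```python
import csv
import os
import tempfile

rows = [['Łukasz', 'Kraków'], ['café', '10€']]

path = os.path.join(tempfile.gettempdir(), 'demo_utf8.csv')

# newline='' lets the csv module control line endings;
# encoding='utf-8' handles the special characters.
with open(path, 'w', newline='', encoding='utf-8') as f:
    csv.writer(f, delimiter=';').writerows(rows)

with open(path, 'r', newline='', encoding='utf-8') as f:
    read_back = list(csv.reader(f, delimiter=';'))

print(read_back)  # round-trips intact
```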

+51
May 21 '16 at 14:50

Use this package; it just works: https://github.com/jdunck/python-unicodecsv .

+14
Mar 24 '14 at 11:07

The Python documentation examples show how to write Unicode CSV files: http://docs.python.org/2/library/csv.html#examples

(code is not copied here because it is protected by copyright)

+2
Sep 12 '13 at 16:47

For me, the UnicodeWriter class from the Python 2 csv module documentation didn't really work, as it breaks the csv.writer.writerow() interface.

For example:

    csv_writer = csv.writer(csv_file)
    row = ['The meaning', 42]
    csv_writer.writerow(row)

It works, but:

    csv_writer = UnicodeWriter(csv_file)
    row = ['The meaning', 42]
    csv_writer.writerow(row)

will throw AttributeError: 'int' object has no attribute 'encode'.

Since UnicodeWriter explicitly expects all column values to be strings, we can convert the values ourselves and simply use the default csv module:

    def to_utf8(lst):
        return [unicode(elem).encode('utf-8') for elem in lst]
    ...
    csv_writer.writerow(to_utf8(row))

Or we could even monkey-patch csv_writer to add a write_utf8_row function; the exercise is left to the reader.
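On Python 3 the same coercion idea can be sketched with a small helper (writerow_coerced is a hypothetical name; Python 3's csv already stringifies numbers, so this only illustrates the pattern):

```python
import csv
import io

def writerow_coerced(writer, row):
    # Coerce non-string cells to str before the writer sees them,
    # mirroring the to_utf8 helper from the answer above.
    writer.writerow([cell if isinstance(cell, str) else str(cell)
                     for cell in row])

buf = io.StringIO()
csv_writer = csv.writer(buf)
writerow_coerced(csv_writer, ['The meaning', 42])
print(buf.getvalue())
```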

+2
Sep 27 '17 at 15:11

For Python 2 you can use this code before csv_writer.writerows(rows).
Note that this code will NOT convert integers to UTF-8 strings:

    def encode_rows_to_utf8(rows):
        encoded_rows = []
        for row in rows:
            encoded_row = []
            for value in row:
                if isinstance(value, basestring):
                    value = unicode(value).encode("utf-8")
                encoded_row.append(value)
            encoded_rows.append(encoded_row)
        return encoded_rows
0
Jan 29 '19 at 11:11

A very simple hack is to use the json module instead of csv. For example, instead of csv.writer, simply do the following:

    fd = codecs.open(tempfilename, 'wb', 'utf-8')
    for c in whatever:
        fd.write(json.dumps(c)[1:-1])  # json.dumps writes ["a", ...]
        fd.write('\n')
    fd.close()

Basically, given the list of fields in the correct order, the formatted JSON string is identical to the CSV string, except for the [ and ] at the beginning and end, respectively. And json seems reliable for UTF-8 in Python 2.*
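A quick Python 3 check of that claim (sample data made up): with compact separators, json.dumps of a row minus its brackets matches what csv produces when every field is quoted. Note the equivalence breaks for fields containing double quotes, since JSON backslash-escapes them while CSV doubles them.

```python
import csv
import io
import json

row = ['café', 'naïve', 'a,b']

# JSON line with the surrounding [ and ] stripped; compact separators
# drop the space json.dumps normally inserts after each comma.
json_line = json.dumps(row, ensure_ascii=False, separators=(',', ':'))[1:-1]

# The equivalent CSV line with every field quoted.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_ALL).writerow(row)
csv_line = buf.getvalue().strip()

print(json_line)
print(csv_line)
```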

-1
Jan 15 '17 at 13:38


