Python 3 UnicodeDecodeError: ASCII default encoding under Apache WSGI

    >>> import locale
    >>> prefered_encoding = locale.getpreferredencoding()
    >>> prefered_encoding
    'ANSI_X3.4-1968'

I use the INGInious framework, which uses web.py to render its templates.

 web.template.render(os.path.join(root_path, dir_path), globals=self._template_globals, base=layout_path) 

Rendering works on my localhost, but not on my staging server.

Both of them run Python 3. As far as I can see, web.py applies UTF-8 encoding only under Python 2 (which is out of my hands):

    def __str__(self):
        self._prepare_body()
        if PY2:
            return self["__body__"].encode('utf-8')
        else:
            return self["__body__"]

Here is the stack trace:

        t = self._template(name)
      File "/lib/python3.5/site-packages/web/template.py", line 1028, in _template
        self._cache[name] = self._load_template(name)
      File "/lib/python3.5/site-packages/web/template.py", line 1016, in _load_template
        return Template(open(path).read(), filename=path, **self._keywords)
      File "/lib64/python3.5/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 83: ordinal not in range(128)

My HTML does include Hebrew characters. A small example:

    <div class="modal-content">
      <div class="modal-header">
        <button type="button" class="close" data-dismiss="modal">&times;</button>
        <h4 class="modal-title feedback-modal-title">
          חישוב האיברים הראשונים בסדרה של איבר ראשון חיובי ויחס שלילי:
          <span class="red-text">אי הצלחה</span>

and I open it like this:

 open('/path/to/feedback.html').read() 

and the line on which decoding fails contains Hebrew characters.
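The failure can be reproduced without Apache. The following is a minimal sketch, assuming a temporary file stands in for feedback.html (the file and its content are made up for illustration). In Python 3, open() without an encoding argument falls back to locale.getpreferredencoding(False), so passing encoding="ascii" explicitly imitates a process whose preferred encoding is ANSI_X3.4-1968:

```python
import os
import tempfile

# Hypothetical stand-in for feedback.html: a small UTF-8 file with Hebrew text.
html = '<span class="red-text">\u05d0\u05d9 \u05d4\u05e6\u05dc\u05d7\u05d4</span>\n'
fd, path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write(html)

# Imitate a process whose preferred encoding is ASCII.
try:
    with open(path, encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError as exc:
    ascii_failed = True
    print("ascii decode failed:", exc.reason)

# Naming the encoding makes the read independent of the locale.
with open(path, encoding="utf-8") as f:
    text = f.read()

os.remove(path)
print(text == html)
```

If the first read fails and the second succeeds, the file itself is fine and the problem lies in the encoding the process picks by default.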

I tried to set some environment variables in ~/.bashrc :

    export PYTHONIOENCODING=utf8
    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8
    export LANGUAGE=en_US.UTF-8

as the centos user.

The INGInious framework is installed with pip into the Python 3.5 packages, and it is served by the Apache server under the apache user.

I tried setting the environment variables in the code (during application initialization) so that the Apache WSGI process would know about them:

    import os

    os.environ['LC_ALL'] = 'en_US.UTF-8'
    os.environ['LANG'] = 'en_US.UTF-8'
    os.environ['LANGUAGE'] = 'en_US.UTF-8'

I also edited /etc/httpd/conf/httpd.conf using the SetEnv directive:

    SetEnv LC_ALL en_US.UTF-8
    SetEnv LANG en_US.UTF-8
    SetEnv LANGUAGE en_US.UTF-8
    SetEnv PYTHONIOENCODING utf8

and restarted with sudo service httpd restart, but still no luck.
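To check what the Apache-hosted interpreter actually sees, as opposed to a login shell, a small diagnostic may help. Below is a sketch of a standalone WSGI application; the name application is the conventional entry point, and mounting it at some test URL is an assumption, not part of the original setup. It prints the preferred encoding and looks for the locale variables in both os.environ and the per-request WSGI environ, since as far as I know mod_wsgi passes SetEnv values through the latter, while ~/.bashrc is only read by that user's interactive shells.

```python
import locale
import os
import sys


def application(environ, start_response):
    """Report the encodings and locale variables this process actually sees."""
    lines = [
        "preferred encoding: %s" % locale.getpreferredencoding(False),
        "filesystem encoding: %s" % sys.getfilesystemencoding(),
    ]
    for name in ("LANG", "LC_ALL", "LC_CTYPE", "PYTHONIOENCODING"):
        # Process environment, as inherited when the server was started.
        lines.append("os.environ[%s] = %r" % (name, os.environ.get(name)))
        # Per-request WSGI environment dictionary.
        lines.append("wsgi environ[%s] = %r" % (name, environ.get(name)))
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the reported preferred encoding is still ANSI_X3.4-1968, the variables never reached the process locale, whatever the shell shows.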

My question is: what is the best solution to this problem? I understand that there are hacks for this, but I want to understand the underlying reason as well as how to solve it.

Thanks!

2 answers

I finally found the answer. The code that reads the file changed from

 open('/path/to/feedback.html').read() 

to

    import codecs

    with codecs.open(file_path, 'r', encoding='utf8') as f:
        text = f.read()

If someone has a more general approach that works, I will accept that answer.


A solution that works on both Python 2 and Python 3 would be:

    import io

    with io.open(file_path, mode='r', encoding='utf8') as f:
        text = f.read()

See the io.open documentation.
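On Python 3, io.open is the same object as the built-in open, so what matters is the explicit encoding argument rather than which module the function comes from. A quick sketch to check this, assuming Python 3 and a throwaway temporary file in place of file_path (the sample text is made up):

```python
import io
import os
import tempfile

# On Python 3 the built-in open() is an alias for io.open().
same_function = io.open is open
print(same_function)

# Hypothetical sample file standing in for file_path.
sample = "\u05e9\u05dc\u05d5\u05dd"  # a short Hebrew string
fd, file_path = tempfile.mkstemp()
with os.fdopen(fd, "w", encoding="utf8") as f:
    f.write(sample)

# Both calls decode the file as UTF-8 regardless of the process locale.
with io.open(file_path, mode="r", encoding="utf8") as f:
    via_io = f.read()
with open(file_path, mode="r", encoding="utf8") as f:
    via_builtin = f.read()

os.remove(file_path)
print(via_io == via_builtin == sample)
```

On Python 2 the built-in open does not accept an encoding argument, which is why io.open is the portable spelling.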
