GAE blobstore filename UTF-8 encoding problem

I have a problem with file name encoding in the GAE blobstore. Here is my upload handler:

    class UploadHandler(blobstore_handlers.BlobstoreUploadHandler):
        def post(self):
            upload_files = self.get_uploads('file')
            blob_info = upload_files[0]
            # Problem right here
            decoded_filename = blob_info.filename.decode("utf-8")
            File_info = Fileinfo(
                key_name=str(blob_info.key()),
                filename=decoded_filename,
            )
            File_info.put()
            self.redirect("/")

When I run it locally, it works fine in the SDK console, but after deploying to GAE the file name is saved as an undecoded string such as "=?UTF-8?B?54Wn54mH5pel5pyfIDIwMTAtMDgtMDM=?=" or "=?Big5?B?v8O59afWt9MgMjAxMC0xMi0wMiA=?=".

I suspect the best solution might be to stop using Chinese characters in file names ...

All suggestions are welcome :)

3 answers

This is a known open issue with how the Blobstore handler encodes uploaded data; it is tracked here.


The file name in BlobInfo is MIME-encoded by Google. I do not know why Google does this.

This breaks things for people who use multibyte character sets.
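To illustrate what "MIME-encoded" means here, this is a minimal sketch that takes apart the UTF-8 example from the question by hand. It assumes Python 3 and only the standard library, it handles only the Base64 ("B") form, and the helper name `split_encoded_word` is made up for this illustration.

```python
import base64


def split_encoded_word(word):
    """Split '=?charset?encoding?payload?=' into its three fields.

    Hypothetical helper for illustration only; it does not validate
    the charset or handle the quoted-printable ("Q") form.
    """
    if not (word.startswith("=?") and word.endswith("?=")):
        raise ValueError("not a MIME encoded-word: %r" % word)
    charset, encoding, payload = word[2:-2].split("?", 2)
    return charset, encoding.upper(), payload


charset, encoding, payload = split_encoded_word(
    "=?UTF-8?B?54Wn54mH5pel5pyfIDIwMTAtMDgtMDM=?="
)

# "B" means the payload is Base64; decoding it gives the raw bytes,
# which are then interpreted using the declared charset.
raw_bytes = base64.b64decode(payload)
filename = raw_bytes.decode(charset)
print(filename)
```

The encoded string itself is plain ASCII, which would explain why calling `.decode("utf-8")` on it returns the same text unchanged.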

You can recover the correct file name, whatever character encoding was used, as shown below:

    import email.header

    for blob_info in self.get_uploads('file'):
        filename_mime = blob_info.filename
        # decode_header expects a byte string, so encode unicode first
        if isinstance(filename_mime, unicode):
            filename_mime_utf8 = filename_mime.encode('utf-8')
        else:
            filename_mime_utf8 = filename_mime
        # Only the first decoded segment is used
        filename_encoded, encoding = email.header.decode_header(filename_mime_utf8)[0]
        if encoding is not None:
            filename_unicode = filename_encoded.decode(encoding)
            filename_utf8 = filename_unicode.encode('utf-8')
            # Overwrite the stored file name via the name-mangled private attribute
            blob_info._BlobInfo__entity['filename'] = filename_utf8

Here is ENDOH takanao's solution wrapped in a function that you can call for each file_info object:

    import email.header

    def get_filename_from_file_info(file_info):
        filename_mime = file_info.filename
        # decode_header expects a byte string, so encode unicode first
        if isinstance(filename_mime, unicode):
            filename_mime_utf8 = filename_mime.encode('utf-8')
        else:
            filename_mime_utf8 = filename_mime
        filename_encoded, encoding = email.header.decode_header(filename_mime_utf8)[0]
        if encoding is not None:
            # MIME-encoded: decode with the declared charset, return UTF-8 bytes
            filename_unicode = filename_encoded.decode(encoding)
            filename_utf8 = filename_unicode.encode('utf-8')
            return filename_utf8
        # Not MIME-encoded: return the name unchanged
        return filename_mime_utf8
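The helper above is written for Python 2 and uses only the first decoded segment (`[0]`). For comparison, here is a rough Python 3 sketch of the same idea that joins every segment `decode_header` returns. The function name `decode_mime_filename` is hypothetical, the fallback to ASCII for segments without a charset is an assumption, and unknown charset names are not handled.

```python
from email.header import decode_header


def decode_mime_filename(raw_filename):
    """Return a text file name from a possibly MIME-encoded header value.

    Sketch only: a charset name Python does not know would raise LookupError.
    """
    parts = []
    for value, charset in decode_header(raw_filename):
        if isinstance(value, bytes):
            # Encoded segments arrive as bytes plus their declared charset;
            # segments without a charset are assumed to be ASCII here.
            parts.append(value.decode(charset or "ascii", errors="replace"))
        else:
            # A header with no encoded-words comes back as a plain str.
            parts.append(value)
    return "".join(parts)


decoded = decode_mime_filename("=?UTF-8?B?54Wn54mH5pel5pyfIDIwMTAtMDgtMDM=?=")
plain = decode_mime_filename("report.txt")
```

A name that is not MIME-encoded, such as `report.txt`, passes through unchanged, so the function can be applied to every upload.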
