Remove the ^L character from a log file

I want to remove every "^L" character that I encounter when reading a file. I tried running each line through this function as I read it:

 def cleanString(self, s):
     if isinstance(s, str):
         s = unicode(s, "iso-8859-1", "replace")
     s = unicodedata.normalize('NFD', s)
     return s.encode('ascii', 'ignore')

But it does not remove this character. Does anyone know how to do this?

I also tried the replace function, but that works no better:

 s = line.replace("\^L","") 

Thank you for your responses.

+7
python unicode
3 answers

You probably do not have the literal characters ^ and L in the file, but a single character that is displayed as ^L.

That would be the form feed character.

So: s = line.replace('\x0C', '')
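
To illustrate why the attempt with "\^L" changes nothing, here is a minimal sketch. It assumes Python 3 syntax (the question itself uses Python 2), and the sample line is made up for the demonstration:

```python
# Hypothetical log line containing a form feed, which many editors show as ^L.
line = "page one\x0cpage two"

# "\\^L" is three literal characters (backslash, caret, L); none of them
# occur in the line, so replace() finds nothing to remove.
unchanged = line.replace("\\^L", "")

# "\x0c" (also written "\f") is the single form feed character itself.
cleaned = line.replace("\x0c", "")

print(repr(unchanged))
print(repr(cleaned))
```

The first result is identical to the input, while the second has the form feed removed.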

+3

^L (code point 0x0C) is itself an ASCII character, so encoding to ASCII leaves it untouched. You can filter out all control characters with a small regular expression (and, while you're at it, everything non-ASCII as well):

 import re
 import unicodedata

 def cleanString(self, s):
     if isinstance(s, str):
         s = unicode(s, "iso-8859-1", "replace")
     s = unicodedata.normalize('NFD', s)
     # remove non-ASCII/nonprintables (this also strips \t, \r and \n)
     s = re.sub(r"[^\x20-\x7f]+", "", s)
     return str(s)  # no encoding necessary
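
If only the control characters should go and line endings should survive, a narrower pattern may be preferable. The following is a sketch under that assumption, written for Python 3; the name `strip_control_chars` is hypothetical and not part of the answer above:

```python
import re

# ASCII control characters (0x00-0x1f and 0x7f), deliberately excluding
# tab (0x09), line feed (0x0a) and carriage return (0x0d), which are kept.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]+")

def strip_control_chars(text):
    """Remove control characters such as form feed, keeping \\t, \\n and \\r."""
    return _CONTROL_CHARS.sub("", text)

print(repr(strip_control_chars("first page\x0csecond page\n")))
```

Unlike the function above, this variant leaves non-ASCII text such as accented letters in place.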
+2

You had almost everything right; you just need a different representation for ^L:

 s = line.replace("\x0c", "") 

Here is a function that returns the control character for a given letter:

 def cc(ch):
     return chr(ord(ch) & 0x1f)

 >>> cc('L')
 '\x0c'

Some control characters have alternative escape sequences; the common ones are '\r' for ^M and '\n' for ^J. They are listed in the escape sequence table in the documentation for string literals, under the names given in the ASCII code chart.
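
These equivalences can be checked directly. The sketch below assumes Python 3; `caret` is a hypothetical helper added here to show the reverse direction, from a control character back to its ^X notation:

```python
def cc(ch):
    """Return the control character that caret notation writes as ^ch."""
    return chr(ord(ch) & 0x1f)

def caret(c):
    """Hypothetical inverse: return the caret notation for a control character."""
    return "^" + chr(ord(c) ^ 0x40)

# Named escapes and hex escapes refer to the same characters.
assert cc("L") == "\x0c" == "\f"   # form feed
assert cc("M") == "\x0d" == "\r"   # carriage return
assert cc("J") == "\x0a" == "\n"   # line feed
assert cc("I") == "\x09" == "\t"   # horizontal tab

print(caret("\x0c"))
```

Clearing the upper bits with `& 0x1f` maps a letter to its control code, and flipping bit 0x40 maps it back.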

+2
