Notepad++: how to remove all non-ASCII characters with regex?

I have searched a lot, but nowhere does it say how to remove non-ASCII characters in Notepad++.

I need to know what to type in Search and Replace (a picture would be great):

  • if I want to whitelist and bookmark all the ASCII words/lines, so that only the non-ASCII lines are left unmarked

  • if the file is quite large, so I cannot select all the ASCII lines and just want to select the lines containing non-ASCII characters.

+82
regex notepad++ non-ascii-characters
Jan 02 '14 at 19:08
7 answers

This expression matches runs of non-ASCII characters:

[^\x00-\x7F]+

Set "Search Mode" to "Regular expression" and click "Find Next".

Source: Regex any ascii character
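
For anyone who wants to try the pattern outside the editor, here is a minimal Python sketch. It assumes Python's `re` module, which is a different regex engine from the one in Notepad++, although this simple character class should behave the same way in both. The helper names `find_non_ascii` and `remove_non_ascii` are made up for illustration.

```python
import re

# Same character class as above: one or more characters outside 0x00-0x7F.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")

def find_non_ascii(text):
    """Return (offset, matched_run) pairs for each run of non-ASCII characters."""
    return [(m.start(), m.group()) for m in NON_ASCII.finditer(text)]

def remove_non_ascii(text):
    """Delete every run of non-ASCII characters (replace with nothing)."""
    return NON_ASCII.sub("", text)

sample = "caf\u00e9 na\u00efve \u2013 ok"
print(find_non_ascii(sample))
print(remove_non_ascii(sample))
```

Stepping through the returned offsets is roughly what repeated "Find Next" clicks do in the editor.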

+152
Jan 02 '14 at 19:11

In Notepad++, if you go to:

Search | Find characters in range | Non-ASCII characters (128-255)

you can step through the document to each non-ASCII character.

+33
May 16 '14 at 22:55

To remove all non-ASCII characters, search for the following expression and replace it with nothing: [^\x00-\x7F]+

(Screenshot: Removing non-ASCII)

To highlight the characters instead, I recommend the Mark function in the search dialog: it highlights the non-ASCII characters and puts a bookmark on every line that contains one of them.

(Screenshot: Highlighting non-ASCII)

If you want to highlight and bookmark ASCII characters instead, you can use regex [\x00-\x7F] for this.
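
As a rough scripted counterpart to bookmarking, the sketch below lists the lines that contain at least one non-ASCII character. It is written in Python rather than done inside Notepad++, and the function name `lines_with_non_ascii` is hypothetical.

```python
import re

# A single non-ASCII character is enough to flag a line.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

def lines_with_non_ascii(text):
    """Return 1-based numbers of lines containing a non-ASCII character."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if NON_ASCII.search(line)
    ]

sample = "plain line\nna\u00efve line\nanother plain line\nprice: 5\u20ac"
print(lines_with_non_ascii(sample))
```

Inverting the condition (`if not NON_ASCII.search(line)`) would list the pure-ASCII lines instead, which is the whitelist case from the question.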

Greetings

+14
Jun 21 '16 at 7:04

In addition to ProGM's answer: if you see characters displayed as blocks labeled NUL or ACK and want to get rid of them, those are ASCII control characters (0 to 31). You can find them with the following expression and delete them:

 [\x00-\x1F]+ 

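To remove all non-ASCII characters and all ASCII control characters at once, delete everything that matches the regular expression below. It keeps only the printable range 0x20 to 0x7E, so tabs and line breaks are removed as well:

 [^\x20-\x7E]+ 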
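
To make the difference between the two classes concrete, here is a small Python sketch. It assumes that "control characters" should also include DEL (0x7F), so the second pattern keeps only printable ASCII, 0x20 to 0x7E. The function names are illustrative only.

```python
import re

CONTROL = re.compile(r"[\x00-\x1F]+")         # ASCII control characters 0-31
NOT_PRINTABLE = re.compile(r"[^\x20-\x7E]+")  # anything outside printable ASCII

def strip_control(text):
    """Remove control characters 0-31 only; non-ASCII and DEL are kept."""
    return CONTROL.sub("", text)

def keep_printable_ascii(text):
    """Remove control characters, DEL, and all non-ASCII characters."""
    return NOT_PRINTABLE.sub("", text)

sample = "ab\x00c\x06 d\u00e9f\x7f\tg"
print(repr(strip_control(sample)))
print(repr(keep_printable_ascii(sample)))
```

Both patterns also delete tabs, carriage returns, and line feeds, since those are control characters too.
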
+13
Jan 17 '15 at 16:33

To preserve line breaks:

  • First choose a placeholder character for the line break. I used #.
  • Select the Extended search mode.
  • Enter \n in "Find what" and # in "Replace with".
  • Hit Replace All.

Next:

  • Select the Regular expression search mode.
  • Enter this in "Find what": [^\x20-\x7E]+
  • Leave "Replace with" empty.
  • Hit Replace All.

Finally, switch the search mode back to Extended and replace # with \n.

:) now you have a clean ASCII file;)
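
The same three passes can be sketched in Python. This is only an illustration of the procedure, not something Notepad++ runs; `clean_keep_newlines` and its `placeholder` parameter are hypothetical names, and the sketch assumes the placeholder character does not already occur in the text.

```python
import re

def clean_keep_newlines(text, placeholder="#"):
    """Strip everything outside printable ASCII while keeping line feeds.

    Assumes `placeholder` does not already appear in `text`; any existing
    occurrence would be turned into a line feed in the last step.
    """
    # Pass 1: protect line feeds with the placeholder.
    protected = text.replace("\n", placeholder)
    # Pass 2: delete everything outside 0x20-0x7E (this also drops \r and tabs).
    cleaned = re.sub(r"[^\x20-\x7E]+", "", protected)
    # Pass 3: turn the placeholder back into line feeds.
    return cleaned.replace(placeholder, "\n")

sample = "one\u00e9\r\ntwo\x00\nthree"
print(repr(clean_keep_newlines(sample)))
```

In a script the placeholder can be avoided by excluding the line feed from the class directly, for example `[^\x20-\x7E\n]+`.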

+4
Feb 17 '16 at 20:01

Another good trick is to switch your editor to UTF-8 encoding so that you can see the odd characters and delete them by hand.

+1
30 Oct '16 at 16:49

Another way...

  • Install the TextFX plugin if you don't already have it.
  • Go to the TextFX menu → "zap all non-printable characters to #". It replaces each invalid character with three # characters.
  • Go to Find / Replace and search for ###. Replace it with a space.

This is good if you cannot remember the regular expression or do not want to search for it. But the regex mentioned by others is also a good solution.

+1
Dec 14 '16 at 18:05