Removing bullet points from a text file using Perl

I am writing a Perl script to process a text file. I need to remove the bullet points from the file and create a new file without bullets. When I look at the file in a hex viewer, the bullet is stored as the Unicode bullet U+2022 (UTF-8 bytes 0xE2 0x80 0xA2). How can I remove a bullet from a string?

I tried the following code:

open($filehandle, '<:encoding(UTF-8)', $filename)
or die "Could not open file '$filename' $!";
while ($row = <$filehandle>) 
{
   @txt_str = split(/\•/, $row);
   $row = join(" ",@txt_str);
}
2 answers

A backslash will not help you, since a bullet is not a special character in regular expressions.

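Because you open the file with the UTF-8 encoding layer, the input is decoded into characters, so the pattern has to contain the bullet as a character (U+2022) rather than as raw bytes. To do this, add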

use utf8;

and save the script as UTF-8; or use

\N{BULLET}
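To illustrate why decoding matters, here is a minimal sketch (not part of the original answer) that starts from the three bytes seen in the hex dump. The variable names are made up for the example, and `decode` from the core `Encode` module stands in for what the `:encoding(UTF-8)` layer does when reading a file.

```perl
use strict;
use warnings;
use charnames ':full';    # needed for \N{BULLET} on Perl older than 5.16
use Encode qw(decode);

# The three UTF-8 bytes from the hex dump, followed by some text.
my $bytes = "\xE2\x80\xA2 item";

# Decoding turns the three bytes into the single character U+2022.
my $chars = decode('UTF-8', $bytes);

# Both spellings name the same character and match the decoded string.
my $by_name = $chars =~ /\N{BULLET}/ ? 1 : 0;
my $by_code = $chars =~ /\x{2022}/   ? 1 : 0;

# The undecoded byte string contains no U+2022, so the pattern fails.
my $raw = $bytes =~ /\N{BULLET}/ ? 1 : 0;

printf "by_name=%d by_code=%d raw=%d\n", $by_name, $by_code, $raw;
```

With `use utf8;` and the script saved as UTF-8, a literal `•` in the pattern should behave like the two spellings above.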

In your case, the split and join can be replaced by a single substitution that turns each bullet into a space:

while (<$filehandle>) {
    s/\N{BULLET}/ /g; # or s/•/ /g under utf8
    print;            # <-- this was missing in your code
}
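Since the question asks for a new file without bullets, the loop above could be wrapped up roughly as follows. This is only a sketch: `strip_bullets` and the file names `bulleted.txt` and `unbulleted.txt` are hypothetical, and the demo writes to a temporary directory so that it runs on its own.

```perl
use strict;
use warnings;
use charnames ':full';    # needed for \N{BULLET} on Perl older than 5.16
use File::Temp qw(tempdir);

# Copy $in_path to $out_path, replacing every U+2022 bullet with a space.
# strip_bullets is a hypothetical helper name, not from the original post.
sub strip_bullets {
    my ($in_path, $out_path) = @_;
    open(my $in, '<:encoding(UTF-8)', $in_path)
        or die "Could not open file '$in_path': $!";
    open(my $out, '>:encoding(UTF-8)', $out_path)
        or die "Could not open file '$out_path': $!";
    while (my $row = <$in>) {
        $row =~ s/\N{BULLET}/ /g;
        print {$out} $row;
    }
    close($in);
    close($out) or die "Could not close file '$out_path': $!";
    return;
}

# Demo on a throwaway input file.
my $dir = tempdir(CLEANUP => 1);
my ($src, $dst) = ("$dir/bulleted.txt", "$dir/unbulleted.txt");

open(my $fh, '>:encoding(UTF-8)', $src)
    or die "Could not create file '$src': $!";
print {$fh} "\N{BULLET}first\n\N{BULLET}second\n";
close($fh) or die "Could not close file '$src': $!";

strip_bullets($src, $dst);

# Read the result back to confirm the bullets are gone.
open(my $check, '<:encoding(UTF-8)', $dst)
    or die "Could not open file '$dst': $!";
my $result = do { local $/; <$check> };
close($check);
```

Opening the output with the same `:encoding(UTF-8)` layer keeps any other non-ASCII characters in the file intact and avoids "Wide character" warnings on print.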

Why not simply run s/•//g on each line, print the result ($row) to stdout, and redirect it to the "unbulleted" file? sed could do the same job.
