How can I combine a Perl split command with a space trim?

Reposting from PerlMonks for a coworker:

I wrote a Perl script to split long email lists that are separated by semicolons. What I would like to do is combine the split with a whitespace trim, so that I don't need two arrays. Is it possible to trim while loading the first array? The output is a sorted list of names. Thanks.

    #!/pw/prod/svr4/bin/perl
    use warnings;
    use strict;

    my $file_data =
      'Builder, Bob ;Stein, Franklin MSW; Boop, Elizabeth PHD Cc: Bear, Izzy';
    my @email_list;

    $file_data =~ s/CC:/;/ig;
    $file_data =~ s/PHD//ig;
    $file_data =~ s/MSW//ig;

    my @tmp_data = split( /;/, $file_data );

    foreach my $entry (@tmp_data) {
        $entry =~ s/^[ \t]+|[ \t]+$//g;
        push( @email_list, $entry );
    }

    foreach my $name ( sort(@email_list) ) {
        print "$name \n";
    }
+4
7 answers

You don't have to do both operations in one go with a single function. Sometimes performing the steps separately is clearer. That is, split first, then strip the whitespace from each element (and then sort the result):

 @email_list = sort( map { s/\s*(\S+)\s*/\1/; $_ } split ';', $file_data ); 

EDIT: Stripping more than one part of a string at once can lead to pitfalls, for example the one Sinan points out below about trailing spaces being left on the "Elizabeth" portion. I wrote that snippet assuming a name would not contain internal spaces, which is plainly wrong, and I would have spotted it had I thought about it: \S+ matches only the first run of non-whitespace characters, so the substitution never reaches the end of the string. The improved (and more readable) code is below:

    @email_list = sort(
        map {
            s/^\s+//;    # strip leading spaces
            s/\s+$//;    # strip trailing spaces
            $_           # return the modified string
        } split ';', $file_data
    );
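
To make the difference between the single-pattern version and the two-step version concrete, here is a small comparison on one entry shaped like the question's data. It is only a sketch: the helper names trim_first_token and trim_both_ends are made up for illustration, and the first helper uses $1 where the original snippet used \1.

```perl
use strict;
use warnings;

# Hypothetical helper: the original one-pattern attempt.
sub trim_first_token {
    my ($s) = @_;
    $s =~ s/\s*(\S+)\s*/$1/;    # only touches the first run of non-space characters
    return $s;
}

# Hypothetical helper: the two-step version from the edit.
sub trim_both_ends {
    my ($s) = @_;
    $s =~ s/^\s+//;    # strip leading whitespace
    $s =~ s/\s+$//;    # strip trailing whitespace
    return $s;
}

my $entry = ' Boop, Elizabeth  ';
print '[', trim_first_token($entry), "]\n";    # prints [Boop,Elizabeth  ]
print '[', trim_both_ends($entry),   "]\n";    # prints [Boop, Elizabeth]
```

The one-pattern version also eats the space after the comma, because the trailing \s* in the pattern matches right after the first token.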
+10

If you don't need to trim the start of the first element and the end of the last one, this will do the trick:

 @email_list = split /\s*;\s*/, $file_data; 

If you do need to trim the first and last elements, trim $file_data first and then split as above. :-P
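
Putting the two steps together might look like the sketch below. The helper name split_trimmed and the shortened sample string are assumptions for illustration, not part of the question's script.

```perl
use strict;
use warnings;

# Hypothetical helper: trim the whole string, then split on semicolons
# together with any whitespace around them.
sub split_trimmed {
    my ($data) = @_;
    $data =~ s/^\s+|\s+$//g;          # trim both ends of the whole string
    return split /\s*;\s*/, $data;    # separators absorb the inner whitespace
}

my @fields     = split_trimmed(' Builder, Bob ;Stein, Franklin ; Bear, Izzy ');
my @email_list = sort @fields;

print "$_\n" for @email_list;
# prints:
# Bear, Izzy
# Builder, Bob
# Stein, Franklin
```

The result is stored in @fields before sorting on purpose: writing sort split_trimmed(...) directly would make Perl treat split_trimmed as the comparison routine for sort.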

+11

Well, you can do what Chris suggested, but that does not handle leading and trailing whitespace in $file_data.

You can add handling for that as follows:

 $file_data =~ s/\A\s+|\s+\z//g; 

Also note that using a second array is not required. Check this:

    my $file_data =
      'Builder, Bob ;Stein, Franklin MSW; Boop, Elizabeth PHD Cc: Bear, Izzy';
    my @email_list;

    $file_data =~ s/CC:/;/ig;
    $file_data =~ s/PHD//ig;
    $file_data =~ s/MSW//ig;

    my @tmp_data = split( /;/, $file_data );

    foreach my $entry (@tmp_data) {
        $entry =~ s/^[ \t]+|[ \t]+$//g;
    }

    foreach my $name ( sort(@tmp_data) ) {
        print "$name \n";
    }
+2
 my @email_list = map { s/^[ \t]+|[ \t]+$//g; $_ } split /;/, $file_data; 

or, more elegantly:

    use Algorithm::Loops "Filter";
    my @email_list = Filter { s/^[ \t]+|[ \t]+$//g } split /;/, $file_data;
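
If an extra module is not an option, Perl 5.14 and later offer the non-destructive /r substitution flag, which likewise removes the need for a trailing $_ in the map block. This is an alternative to the code above rather than part of the answer, and the sample string is shortened for illustration.

```perl
use strict;
use warnings;
use 5.014;    # s///r needs Perl 5.14 or newer

my $file_data = 'Builder, Bob ;Stein, Franklin ; Boop, Elizabeth';

# With /r, s/// returns the modified copy instead of the substitution count.
my @email_list = map { s/^[ \t]+|[ \t]+$//gr } split /;/, $file_data;

print "[$_]\n" for @email_list;
# prints:
# [Builder, Bob]
# [Stein, Franklin]
# [Boop, Elizabeth]
```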
+1

See "How do I strip blank space from the beginning/end of a string?" in the FAQ (perlfaq4).

 @email_list = sort map { s/^\s+//; s/\s+$//; $_ } split ';', $file_data; 

Note also that a for loop aliases each element of the array, so

    @email_list = sort split ';', $file_data;
    for (@email_list) {
        s/^\s+//;
        s/\s+$//;
    }

will also work.
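
Aliasing means the loop variable is the array element itself rather than a copy, so a substitution on $_ changes the array in place. The sketch below uses made-up sample data and moves the sort after the trim, since sorting first compares the untrimmed strings and any leading whitespace then affects the order.

```perl
use strict;
use warnings;

my $file_data  = 'Stein, Franklin ; Boop, Elizabeth;Bear, Izzy ';
my @email_list = split /;/, $file_data;

# $_ is an alias for each element, so s/// edits @email_list directly.
for (@email_list) {
    s/^\s+//;
    s/\s+$//;
}

# Sort only after trimming, so whitespace cannot influence the order.
@email_list = sort @email_list;

print "[$_]\n" for @email_list;
# prints:
# [Bear, Izzy]
# [Boop, Elizabeth]
# [Stein, Franklin]
```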

+1

My turn:

 my @fields = grep { $_ } split m/\s*(?:;|^|$)\s*/, $record; 

It also trims the first and last elements. If grep is overkill just to get rid of the empty first element:

 my ( undef, @fields ) = split m/\s*(?:;|^|$)\s*/, $record; 

works if you know there is leading whitespace, but that is unlikely to be guaranteed, so

    my @fields = split m/\s*(?:;|^|$)\s*/, $record;
    shift @fields unless $fields[0];

is the surest way to do this.
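
As a quick check of how this behaves with and without surrounding whitespace, here is a sketch that wraps the pattern in a helper. The name split_record and the sample records are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the split pattern shown above.
sub split_record {
    my ($record) = @_;
    my @fields = split m/\s*(?:;|^|$)\s*/, $record;

    # Leading whitespace yields an empty first field; drop it if present.
    shift @fields if @fields && !length $fields[0];
    return @fields;
}

print join( '|', split_record(' Bear, Izzy ; Boop, Elizabeth ') ), "\n";
print join( '|', split_record('Bear, Izzy;Boop, Elizabeth') ),     "\n";
# both lines print: Bear, Izzy|Boop, Elizabeth
```

The sketch tests length rather than plain truth, so a legitimate first field consisting of the single character 0 would not be discarded.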

0

Barring some minor syntax error, this should do all the work for you. Oh, list operations, how beautiful you are!

    print join( " \n", sort map { s/^[ \t]+|[ \t]+$//g; $_ } split( /;/, $file_data ) );
-1
