What does the Perl split function return when there is no value between tokens?

I am trying to split a string using the split function, but there is not always a value between tokens.

Example: ABC, 123 ,,, XYZ

However, I do not want to skip the empty tokens, because the values sit at specific positions in the line. When I split and then step through the resulting array, I get "Use of uninitialized value" warnings.

I tried comparing the value with $splitvalues[x] eq "" and I tried defined($splitvalues[x]), but I can't for the life of me figure out what the split function puts into my array when there is no value between tokens.

Here is a snippet of my code (now with more detail):

    my @matrixDetail = ();

    # some other processing happens here that is based on matching data from the
    # @oldDetail array with the first field of the @matrixLine array. If it does
    # match, then I do the split
    if ($IHaveAMatch) {
        @matrixDetail = split(',', $matrixLine[1]);
    }
    else {
        @matrixDetail = ('', '', '', '', '', '', '');
    }

    my $newDetailString =
          (($matrixDetail[0] eq '') ? $oldDetail[0] : $matrixDetail[0])
        . (($matrixDetail[1] eq '') ? $oldDetail[1] : $matrixDetail[1])
        . . .
        . (($matrixDetail[6] eq '') ? $oldDetail[6] : $matrixDetail[6]);

Because these are just snippets, I left out some of the other logic, but the if statement is inside a sub that technically returns the @matrixDetail array. If I don't find a match in my matrix and set the array to a list of empty strings manually, I get no warnings. The warnings appear only when split populates @matrixDetail.

Also, I should mention that I have been writing code for almost 15 years, but only recently needed to work with Perl. The logic in my script is sound (or at least it works); I'm just being picky about cleaning up my warnings and trying to figure out this little nuance.

+4
4 answers
    #!perl
    use warnings;
    use strict;
    use Data::Dumper;

    my $str = "ABC,123,,,,,,XYZ";
    my @elems = split ',', $str;
    print Dumper \@elems;

This gives:

    $VAR1 = [
              'ABC',
              '123',
              '',
              '',
              '',
              '',
              '',
              'XYZ'
            ];

split puts an empty string in each of those positions.

Edit: Note that the documentation for split() states that "by default, empty leading fields are preserved, and empty trailing ones are deleted." Thus, if your string is ABC,123,,,,,,XYZ,,,, then the returned list will be the same as in the example above, but if your string is ,,,,ABC,123 then you will get a list with four empty strings in elements 0 through 3 (in addition to 'ABC' and '123').
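To see that rule in action, here is a minimal sketch using only core Perl. The sample strings are made up for illustration. It shows that trailing empty fields shorten the returned list, which is one way an index such as 6 can end up undefined even though the line had enough commas.

```perl
use strict;
use warnings;

# Hypothetical sample lines, chosen only to illustrate the rule.
my @middle   = split ',', 'ABC,123,,,XYZ';   # empty fields in the middle
my @trailing = split ',', 'ABC,123,,,';      # empty fields at the end
my @leading  = split ',', ',,ABC,123';       # empty fields at the start

# Middle and leading empty fields are kept as '' (defined, zero length).
# Trailing empty fields are dropped, so the array is shorter than the
# number of commas suggests, and indexing past its end yields undef.
printf "middle:   %d elements\n", scalar @middle;
printf "trailing: %d elements\n", scalar @trailing;
printf "leading:  %d elements\n", scalar @leading;
print  "element 3 of trailing is ",
    (defined $trailing[3] ? "defined" : "undef"), "\n";
```

Comparing an undefined element with eq '' is what triggers the "uninitialized value" warning; the empty strings in the middle do not.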

Edit 2: Try dumping @matrixDetail and @oldDetail. One of them is probably not as long as you think. You might also check the number of elements in both lists before using them, to make sure you have as many as you expect.

+4

I suggest using Text::CSV from CPAN. It is a ready-made solution that already covers all the strange edge cases of parsing CSV files.

+1

Delimiters with nothing between them yield empty strings when split. Empty strings evaluate as false in a boolean context.

If you know that your "detail" input will never contain "0" (or another scalar that evaluates to false), this should work:

    use feature 'say';  # say is not enabled by default

    my @matrixDetail = split(',', $matrixLine[1]);
    die if @matrixDetail > @oldDetail;
    my $newDetailString = "";
    for my $i (0..$#oldDetail) {
        $newDetailString .= $matrixDetail[$i] || $oldDetail[$i];  # thanks canSpice
    }
    say $newDetailString;

(There may be other scalars besides the empty string and zero that evaluate to false, but I couldn't name them off the top of my head.)

TMTOWTDI:

    $matrixDetail[$_] ||= $oldDetail[$_] for 0..$#oldDetail;
    my $newDetailString = join("", @matrixDetail);

edit: the for loops now run from 0 to $#oldDetail instead of $#matrixDetail, since split does not return the trailing ",," fields.

edit2: if you cannot be sure that the actual input will never evaluate as false, you can always just check the length of each split element. It's definitely safer, although perhaps less elegant ^_^
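A minimal sketch of that length check, assuming made-up contents for @oldDetail and the input line (the array names follow the question; the data is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical data standing in for the question's arrays.
my @oldDetail    = ('a', 'b', 'c', 'd');
my @matrixDetail = split ',', '0,,X';    # yields ('0', '', 'X')

my $newDetailString = '';
for my $i (0 .. $#oldDetail) {
    my $new = $matrixDetail[$i];
    # Keep the new value only if it exists and is non-empty. A literal "0"
    # passes this test, whereas || would treat it as false and discard it.
    $newDetailString .= (defined $new && length $new) ? $new : $oldDetail[$i];
}
print "$newDetailString\n";
```

Testing defined first also covers indexes past the end of @matrixDetail, so no "uninitialized value" warning is raised for fields that split dropped.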

0

An empty field in the middle comes back as ''. Empty fields at the end are omitted unless you pass split a sufficiently large third argument (the limit), or -1 to keep them all.
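As a rough illustration of the third argument, the sketch below splits the same hypothetical line with and without a limit of -1:

```perl
use strict;
use warnings;

# Hypothetical input line with two empty trailing fields.
my $line = 'ABC,123,,XYZ,,';

my @default = split /,/, $line;       # trailing empty fields dropped
my @all     = split /,/, $line, -1;   # negative limit keeps them

printf "default: %d fields, with -1: %d fields\n",
    scalar @default, scalar @all;
```

With -1 every position in the line gets an element, so fixed-position lookups such as $all[5] return '' rather than undef.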

0