Perl - Code Enhancement

I just started coding in Perl, and just want to know if it is possible to make the code below more efficient or to do it in fewer lines.

I did a little research on the Win32::OLE and Text::CSV modules, but it was like what I read so far.

This question is basically a beginner asking a senior: "Hey, how do I become a better Perl progambler?"

The purpose of the code is to obtain data from the specified ranges in the specified sheets of the Excel workbook and write the contents of these ranges to CSV files.

In addition, I know that I need to perform general checks, for example, to make sure my $cellValue defined before adding it to the array, etc., but I look more at the general structure. How is there a way to smooth out a loop by putting the whole whole string in an array at once, or the whole range in an array or reference, or something like that?

thanks

 use strict; use warnings; use Spreadsheet::XLSX; my $excel = Spreadsheet::XLSX -> new ('C:\scott.xlsm',); my @sheets = qw(Fund_Data GL_Data); foreach my $sheet (@sheets) { my $worksheet = $excel->Worksheet($sheet); my $cell = $worksheet->get_cell(25,0); if ($cell) { # make sure cell value isn't blank my $myFile = "C:/$sheet.csv"; open NEWFILE, ">$myFile" or die $!; # write all cells from Range("A25:[MaxColumn][MaxRow]") to a csv file my $maxCol = $worksheet->{MaxCol}; my $maxRow = $worksheet->{MaxRow}; my @arrRows; my $rowString; # loop through each row and column in defined range and string together each row and write to file foreach my $row (24 .. $maxRow) { foreach my $col (0 .. $maxCol) { my $cellValue = $worksheet->{Cells} [$row] [$col]->Value(); if ($rowString) { $rowString = $rowString . "," . $cellValue; } else { $rowString = $cellValue; } } print NEWFILE "$rowString\n"; undef $rowString; } } } 
+7
source share
4 answers

Mark's offer is excellent. Another minor improvement would be to replace โ€œDo a bunch of nested logic if $cell ,โ€ do nothing unless $cell - this way you have a bit more readable code (remove 1 extra indent / nested block; you need to worry about what happens, if $ cell is empty.

 # OLD foreach my $sheet (@sheets) { my $worksheet = $excel->Worksheet($sheet); my $cell = $worksheet->get_cell(25,0); if ($cell) { # make sure cell value isn't blank # All your logic in the if } } # NEW foreach my $sheet (@sheets) { my $worksheet = $excel->Worksheet($sheet); next unless $worksheet->get_cell(25,0); # You don't use $cell, so dropped # All your logic that used to be in the if } 

As you noted, Text::CSV would be nice to consider, depending on when your data should ever be specified based on the CSV standard (for example, contains spaces, commas, quotes, etc.). If it may be necessary to quote, do not reinvent the wheel, and instead use Text::CSV to print. An unconfirmed example would be something like this:

 # At the start of the script: use Text::CSV; my $csv = Text::CSV->new ( { } ); # Add error handler! # In the loop, when the file handle $fh is opened foreach my $row (24 .. $maxRow) { my $cols = [ map { $worksheet->{Cells}[$row][$_] } 0 .. $maxCol) ]; my $status = $csv->print ($fh, $cols); # Error handling } 
+6
source

There is no reason to have this inner loop:

 print NEWFILE join(",", map { $worksheet->{Cells}[$row][$_] } 0 .. $maxCol), "\n"; 

Also, make sure your indexes are correct. I'm not familiar with Spreadsheet :: XLSX, so make sure max col and row are zero, like the rest of your code. If this is not the case, you need to iterate over 0 .. $maxCol-1 .

+6
source

I would advise against hard coding file names ... especially in small projects like this, I'm used to passing file names through GetOpt::Long . If you usually do this with all of your small projects, itโ€™s much easier to remember to do it right when it relies on a larger project.

Your code is well structured and readable, and you expected problems with your loop statements, you used warnings and strict ones, and you usually use libraries correctly.

+4
source

As others have said, your code is clear and well structured. But I think it can be improved with a little more Perlishness.

The following items come to mind

  • Use lexical file descriptors and a three-parameter open form ( open my $newfile, '>', $myFile )

  • Iterate over the values โ€‹โ€‹of the hash or array (or fragments thereof), and not their keys or indexes, unless you really need keys for the loop body

  • Retrieve pointers to data substructures in the loop if this is the focus of the loop ( my $rows = $worksheet->{Cells} )

  • Indicate where you use a loop to convert one list to another, and use map instead

I hope I didnโ€™t bounce the gun by writing a solution using Text::CSV , as you suggested. With luck, this is instructive for you.

 use strict; use warnings; use Spreadsheet::XLSX; use Text::CSV; my $csv = Text::CSV->new; my $excel = Spreadsheet::XLSX->new('C:\scott.xlsm',); foreach my $sheet (qw/ Fund_Data GL_Data /) { my $worksheet = $excel->Worksheet($sheet); next unless $worksheet->get_cell(25,0); my $myFile = "C:\\$sheet.csv"; open my $newfile, '>', $myFile or die $!; my $rows = $worksheet->{Cells}; # Write all cells from row 25 onwards to the CSV file foreach my $row (@{$rows}[24..$#{$rows}]) { my @values = map $_ ? $_->Value : '', @$row; $csv->print($newfile, \@values); print $newfile "\n"; } } 
+4
source

All Articles