What is the best way to read a huge CSV file using Perl?

Requirements

  • I have a very large CSV file to read (about 3 GB).
  • I don't need all the records; there are some conditions I can use to filter them, for example, keep a record only if the third CSV column contains "XXXX" and the 4th column is "999".

Question: Can I use these conditions to improve the reading process? If so, how can I do this with Perl?

I need an example (a Perl script) in your answer.

-2
5 answers

Here's the solution:

#!/usr/bin/env perl
use warnings;
use strict;
use Text::CSV_XS;
use autodie;
my $csv = Text::CSV_XS->new();
open my $FH, "<", "file.txt";
while (<$FH>) {
    $csv->parse($_);
    my @fields = $csv->fields;
    # for the question's filter this could be, e.g.:
    #   next unless $fields[2] eq "XXXX" && $fields[3] eq "999";
    next unless $fields[1] =~ /something I want/;
    # do the stuff to the fields you want here
}
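
For very large files it can also be convenient to let Text::CSV_XS read the records itself with getline(), which additionally copes with quoted fields containing embedded commas or newlines. A minimal sketch, assuming the same file name and the question's filter on the third and fourth columns:

#!/usr/bin/env perl
use warnings;
use strict;
use autodie;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new({ binary => 1 });
open my $FH, "<", "file.txt";

# getline() parses one record per call and returns an array reference,
# or undef at end of file.
while (my $row = $csv->getline($FH)) {
    next unless $row->[2] eq "XXXX" && $row->[3] eq "999";
    # do the stuff to the matching record here
}
close $FH;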
+13

Let me restate the question as I understand it:

You want to read only those records where the third CSV column is "XXXX" and the 4th column is "999", correct?

Can you use the fact that the third column is "XXXX" and the fourth is "999" to speed up the reading itself? Not really. (DBD::CSV lets you write SQL with a WHERE clause, but because a CSV file is just a flat text file, the driver still has to read and parse every line to decide whether it matches; there is no index to let it skip ahead.)

The only way to avoid reading part of the file is to know something extra about its structure, for example: 1) "all the rows I want come first", or 2) "the rows I want start at offset nnn".
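
If case 2) ever applies, here is a rough sketch of skipping ahead with seek(); the offset is made up and would have to come from somewhere you trust (an index file you built earlier, for example):

#!/usr/bin/env perl
use warnings;
use strict;
use autodie;

# Hypothetical: the rows we care about are known to start after this byte offset.
my $offset = 1_500_000_000;

open my $FH, "<", "file.txt";
seek $FH, $offset, 0;   # jump past the part of the 3 GB file we don't need
<$FH>;                  # throw away the partial line we probably landed inside
while (my $line = <$FH>) {
    # parse and filter only the remaining lines
}
close $FH;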

+5

Text::CSV is the obvious module for this. Another option is DBD::CSV, which offers a different interface: going through DBI lets you run SQL queries against the CSV file and fetch back just the rows and columns you ask for, much as you would with a real database.

An example:

#!/usr/bin/perl

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect ("DBI:CSV:f_dir=/home/joe/csvdb")
    or die "Cannot connect: $DBI::errstr";

my $sth = $dbh->prepare ("SELECT id, name FROM info.txt WHERE id > 1 ORDER BY id");
$sth->execute;

my ($id,$name);
$sth->bind_columns (\$id, \$name);
while ($sth->fetch) {
    print "Found result row: id = $id, name = $name\n";
}
$sth->finish;

I would use Text::CSV for this task if you do not plan to talk to other kinds of databases, but in Perl TIMTOWTDI, and it helps to know your options.
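
To push the question's own filter into the SQL, something along these lines should work; it assumes a file data.csv with a header row whose third and fourth columns are named code and status (adjust to your real file and column names):

#!/usr/bin/perl

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect ("DBI:CSV:f_dir=/home/joe/csvdb")
    or die "Cannot connect: $DBI::errstr";

# The WHERE clause expresses the question's condition, but note that the
# driver still scans the whole file to evaluate it.
my $sth = $dbh->prepare ("SELECT * FROM data.csv WHERE code = 'XXXX' AND status = '999'");
$sth->execute;

while (my @row = $sth->fetchrow_array) {
    print "Matching row: @row\n";
}
$sth->finish;
$dbh->disconnect;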

+4

Use a module like Text::CSV; however, if you know that your data will not have embedded commas and is in a simple CSV format, then a simple loop is enough to iterate over the file:

while (<>){
  chomp;
  my @s = split /,/;   # naive split: only safe when fields never contain commas
  if ( $s[2] eq "XXXX" && $s[3] eq "999" ){
    # do something;
  }
}
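
On a 3 GB file most lines will fail that test, so it can also pay to reject obviously non-matching lines with a cheap substring check before paying for the split. A rough sketch, assuming a hypothetical file name and the same columns:

#!/usr/bin/env perl
use warnings;
use strict;
use autodie;

open my $FH, "<", "file.txt";   # hypothetical file name
while (my $line = <$FH>) {
    # cheap pre-filter: a line without "XXXX" anywhere can never match
    next if index($line, "XXXX") < 0;

    chomp $line;
    my @s = split /,/, $line;
    if ( @s > 3 && $s[2] eq "XXXX" && $s[3] eq "999" ) {
        # do something with the matching record
    }
}
close $FH;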
+3
