What is the best way to read a huge CSV file using Perl?

Requirements

  • I have a very large CSV file to read (about 3 GB).
  • I don't need all the records; there are some conditions I can use to filter them, for example, keep a record only if the third CSV column contains "XXXX" and the 4th column is "999".

Question: Can I use these conditions to improve the reading process? If so, how can I do this with Perl?

I need an example (a Perl script) in your answer.

-2
5 answers

Here's the solution:

#!/usr/bin/env perl
use warnings;
use strict;
use Text::CSV_XS;
use autodie;
my $csv = Text::CSV_XS->new();
open my $FH, "<", "file.txt";
while (<$FH>) {
    $csv->parse($_);
    my @fields = $csv->fields;
    # for the question's filter this could be, e.g.:
    #   next unless $fields[2] eq "XXXX" && $fields[3] eq "999";
    next unless $fields[1] =~ /something I want/;
    # do the stuff to the fields you want here
}
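
For very large files it can also be convenient to let Text::CSV_XS read the records itself with getline(), which additionally copes with quoted fields containing embedded commas or newlines. A minimal sketch, assuming the same file name and the question's filter on the third and fourth columns:

#!/usr/bin/env perl
use warnings;
use strict;
use autodie;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new({ binary => 1 });
open my $FH, "<", "file.txt";

# getline() parses one record per call and returns an array reference,
# or undef at end of file.
while (my $row = $csv->getline($FH)) {
    next unless $row->[2] eq "XXXX" && $row->[3] eq "999";
    # do the stuff to the matching record here
}
close $FH;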
+13

Let me restate the question as I understand it:

You want to read only those records where the third CSV column is "XXXX" and the 4th column is "999", correct?

Can you use the fact that the third column is "XXXX" and the fourth is "999" to speed up the reading itself? Not really. (DBD::CSV lets you write SQL with a WHERE clause, but because a CSV file is just a flat text file, the driver still has to read and parse every line to decide whether it matches; there is no index to let it skip ahead.)

The only way to avoid reading part of the file is to know something extra about its structure, for example: 1) "all the rows I want come first", or 2) "the rows I want start at offset nnn".
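
If case 2) ever applies, here is a rough sketch of skipping ahead with seek(); the offset is made up and would have to come from somewhere you trust (an index file you built earlier, for example):

#!/usr/bin/env perl
use warnings;
use strict;
use autodie;

# Hypothetical: the rows we care about are known to start after this byte offset.
my $offset = 1_500_000_000;

open my $FH, "<", "file.txt";
seek $FH, $offset, 0;   # jump past the part of the 3 GB file we don't need
<$FH>;                  # throw away the partial line we probably landed inside
while (my $line = <$FH>) {
    # parse and filter only the remaining lines
}
close $FH;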

+5

Text::CSV is the obvious module for this. Another option is DBD::CSV, which offers a different interface: going through DBI lets you run SQL queries against the CSV file and fetch back just the rows and columns you ask for, much as you would with a real database.

An example:

#!/usr/bin/perl

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect ("DBI:CSV:f_dir=/home/joe/csvdb")
    or die "Cannot connect: $DBI::errstr";

my $sth = $dbh->prepare ("SELECT id, name FROM info.txt WHERE id > 1 ORDER BY id");
$sth->execute;

my ($id,$name);
$sth->bind_columns (\$id, \$name);
while ($sth->fetch) {
    print "Found result row: id = $id, name = $name\n";
}
$sth->finish;

I would use Text::CSV for this task if you do not plan to talk to other kinds of databases, but in Perl TIMTOWTDI, and it helps to know your options.
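
To push the question's own filter into the SQL, something along these lines should work; it assumes a file data.csv with a header row whose third and fourth columns are named code and status (adjust to your real file and column names):

#!/usr/bin/perl

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect ("DBI:CSV:f_dir=/home/joe/csvdb")
    or die "Cannot connect: $DBI::errstr";

# The WHERE clause expresses the question's condition, but note that the
# driver still scans the whole file to evaluate it.
my $sth = $dbh->prepare ("SELECT * FROM data.csv WHERE code = 'XXXX' AND status = '999'");
$sth->execute;

while (my @row = $sth->fetchrow_array) {
    print "Matching row: @row\n";
}
$sth->finish;
$dbh->disconnect;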

+4

Use a module like Text::CSV; however, if you know that your data will not have embedded commas and is in a simple CSV format, then a simple loop is enough to iterate over the file:

while (<>){
  chomp;
  my @s = split /,/;   # naive split: only safe when fields never contain commas
  if ( $s[2] eq "XXXX" && $s[3] eq "999" ){
    # do something;
  }
}
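
On a 3 GB file most lines will fail that test, so it can also pay to reject obviously non-matching lines with a cheap substring check before paying for the split. A rough sketch, assuming a hypothetical file name and the same columns:

#!/usr/bin/env perl
use warnings;
use strict;
use autodie;

open my $FH, "<", "file.txt";   # hypothetical file name
while (my $line = <$FH>) {
    # cheap pre-filter: a line without "XXXX" anywhere can never match
    next if index($line, "XXXX") < 0;

    chomp $line;
    my @s = split /,/, $line;
    if ( @s > 3 && $s[2] eq "XXXX" && $s[3] eq "999" ) {
        # do something with the matching record
    }
}
close $FH;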
+3
