Perl: iterate over each line in a file and append each line of another file

I have two text files containing the following:

FILE1.TXT

    dog
    cat
    antelope

file2.txt

    1
    2
    Barry

The result that I want to achieve is as follows:

    dog1
    dog2
    dogBarry
    cat1
    cat2
    catBarry
    antelope1
    antelope2
    antelopeBarry

How I did it:

    open (FILE1, "<File1.txt") || die $!;
    open (FILE2, "<File2.txt") || die $!;

    my @animals      = (<FILE1>); # each line of the file into an array
    my @otherStrings = (<FILE2>); # each line of the file into an array

    close FILE1 || die $!;
    close FILE2 || die $!;

    my @bothTogether;
    foreach my $animal (@animals) {
        chomp $animal;
        foreach my $otherString (@otherStrings) {
            chomp $otherString;
            push (@bothTogether, "$animal$otherString");
        }
    }
    print @bothTogether;

This works, but I'm sure it's not the best way to do it, especially if the files can contain thousands of lines.

What would be the best way to do this, perhaps using a hash?

2 answers

Your approach will work just fine for files with thousands of lines; at that size, memory really isn't a concern. For millions of lines, though, it could become a problem.

However, you could reduce the memory usage in your code by simply reading one file in memory, and print the results immediately, rather than storing them in an array:

    use warnings;
    use strict;

    open my $animals,  '<', 'File1.txt' or die "Can't open animals: $!";
    open my $payloads, '<', 'File2.txt' or die "Can't open payloads: $!";

    my @payloads = <$payloads>; # each line of the file into an array
    close $payloads or die "Can't close payloads: $!";

    while (my $line = <$animals>) {
        chomp $line;
        print $line . $_ foreach (@payloads);
    }
    close $animals or die "Can't close animals: $!";

With two huge files of roughly equal size, this will use approximately a quarter of the memory of your original code.

Update: I've also edited the code to incorporate Simbabque's good suggestions for improving it.

Update 2: As others have noted, you could avoid reading either file into memory by looping through the payload file line by line for each line of the animal file. However, that would be much slower and should be avoided unless absolutely necessary. The approach I suggested runs in about the same time as your original code.


Apart from some more modern Perl idioms (like the three-argument open), your code is pretty straightforward.

The only improvement I see is that you could move the inner chomp out into a separate loop, or better, chomp the lines as you read the files; as written, the inner loop re-chomps the same payload lines for every animal. That saves a little time. But in general, if you want to combine each row of one data set with each row of another, you are doing it right.

You should use or die instead of || die because of operator precedence. Also note that the final output will be one long string, because the array elements no longer have newlines.

Update: @FrankB made a good suggestion in his comment above: if your files are huge and you are struggling with memory, don't slurp them into two arrays; instead, read the first file line by line, and for each of its lines open and read through the second file. That takes a lot more time but saves a ton of memory. You can then print the results directly rather than pushing them into a result array.

