Comparing two unsorted files

I have two whitespace-delimited files (see examples below):

File 1

 Java RAJ
 PERL ALEX
 PYTHON MAurice

(etc.)

File 2

 ALEX 3.4
 SAM 8.9
 PEPPER 9.0

Now, for example, if ALEX (column 2 of file 1) is also found in file 2 (it is not guaranteed that ALEX is found), I should get a third file that looks like this:

 PERL ALEX 3.4 

The code should look up every value in column 2 of file 1 in file 2.

Any suggestions for a bash script?

4 answers

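You want to use join for this. First you need to sort both files on the join field; the -o option then prints the columns in the order the question asks for:

 join -1 2 -2 1 -o 1.1,1.2,2.2 <(sort -k2,2 file1) <(sort -k1,1 file2) 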
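For shells without process substitution, the same sort-then-join idea can be sketched with temporary files. The function name join_by_name and the demo directory are assumptions made for illustration; the sample data is taken from the question. LC_ALL=C is set so that sort and join agree on collation order.

```shell
# Sketch: sort + join using temporary files instead of <(...).
# join_by_name is a hypothetical helper name, not part of the original answer.
join_by_name() {
    f1_sorted=$(mktemp) || return 1
    f2_sorted=$(mktemp) || return 1
    # LC_ALL=C keeps sort and join in agreement about collation order.
    LC_ALL=C sort -k2,2 "$1" > "$f1_sorted"   # file1 ordered by column 2
    LC_ALL=C sort -k1,1 "$2" > "$f2_sorted"   # file2 ordered by column 1
    # -o 1.1,1.2,2.2 prints: file1 col 1, file1 col 2, file2 col 2.
    LC_ALL=C join -1 2 -2 1 -o 1.1,1.2,2.2 "$f1_sorted" "$f2_sorted"
    rm -f "$f1_sorted" "$f2_sorted"
}

# Demo with the sample data from the question.
demo_dir=$(mktemp -d)
printf '%s\n' 'Java RAJ' 'PERL ALEX' 'PYTHON MAurice' > "$demo_dir/file1"
printf '%s\n' 'ALEX 3.4' 'SAM 8.9' 'PEPPER 9.0' > "$demo_dir/file2"
join_by_name "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

With the sample data this prints the single line PERL ALEX 3.4. Note that the output follows the sorted order of the join field, not the original order of file1.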
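
An awk one-liner that needs no sorting: load file2 into an array keyed by name, then look up column 2 of every file1 line:

 awk 'NR==FNR {val[$1]=$2; next} $2 in val {print $0, val[$2]}' file2 file1 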
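The awk one-liner above relies on the NR==FNR idiom: the condition is true only while awk reads the first file on its command line. A commented sketch of the same logic, wrapped in a hypothetical function lookup_by_name that takes file1 and then file2 as arguments, might look like this:

```shell
# Sketch: the awk lookup as a reusable function.
# lookup_by_name is a hypothetical name; arguments are file1, then file2.
lookup_by_name() {
    awk '
        # First file on the command line (file2): remember name -> value.
        NR == FNR { val[$1] = $2; next }
        # Second file (file1): print the row plus the value when column 2 is known.
        $2 in val { print $1, $2, val[$2] }
    ' "$2" "$1"
}

# Demo with the sample data from the question.
demo_dir=$(mktemp -d)
printf '%s\n' 'Java RAJ' 'PERL ALEX' 'PYTHON MAurice' > "$demo_dir/file1"
printf '%s\n' 'ALEX 3.4' 'SAM 8.9' 'PEPPER 9.0' > "$demo_dir/file2"
lookup_by_name "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

The demo prints PERL ALEX 3.4. Because file1 is streamed line by line, matches come out in file1 order, and only file2 has to fit in memory.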

Is a Perl one-liner also acceptable? It works without sorting. Assuming your files are called f1 and f2:

 perl -e 'open(F1, shift); open(F2, shift); $ls = $/;undef $/;$f2 = <F2>;$/ = $ls; while(<F1>) { ($t1, $t2) = $_ =~ /^(\w+)\s+(\w+)$/; if($t1) { ($t3) = $f2 =~ /^$t2\s+(.+)$/m; print "$t1 $t2 $t3 \n" if ($t3); } }' f1 f2 

With f1:

 Java RAJ
 PERL ALEX
 PYTHON Maurice

And f2:

 ALEX 3.4
 SAM 8.9
 PEPPER 9.0

Results in:

 PERL ALEX 3.4 

You got great answers using join and awk, so I decided to post a pure bash one:

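 #!/bin/bash
 declare -A name2prog
 declare -A name2num
 # file1 lines are "language name": map name -> language
 while read -r a b; do name2prog[$b]=$a; done < file1
 # file2 lines are "name number": map name -> number
 while read -r a b; do name2num[$a]=$b; done < file2
 # print every name that appears in both files
 for i in "${!name2num[@]}"
 do
     if [[ ${name2prog[$i]} ]]; then
         echo "${name2prog[$i]} $i ${name2num[$i]}"
     fi
 done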

outputs:

 $ ./try.sh
 PERL ALEX 3.4
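Bash does not guarantee any particular order when iterating over the keys of an associative array, so the script above may list several matches in an arbitrary order. If the order of file1 matters, one possible variant keeps only file2 in a map and streams file1. This is a sketch that assumes bash 4.0 or later; the function name match_in_order is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pure bash (4.0+ for associative arrays), output in file1 order.
# match_in_order is a hypothetical function name; arguments are file1, then file2.
match_in_order() {
    local file1=$1 file2=$2 lang name num
    local -A name2num=()
    # Load file2: name -> number.
    while read -r name num; do
        [[ -n $name ]] || continue          # skip blank lines
        name2num[$name]=$num
    done < "$file2"
    # Stream file1 and print only the rows whose name was seen in file2.
    while read -r lang name; do
        [[ -n $name ]] || continue          # skip blank or one-column lines
        if [[ -n ${name2num[$name]+set} ]]; then
            printf '%s %s %s\n' "$lang" "$name" "${name2num[$name]}"
        fi
    done < "$file1"
    return 0
}

# Demo with the sample data from the question.
demo_dir=$(mktemp -d)
printf '%s\n' 'Java RAJ' 'PERL ALEX' 'PYTHON MAurice' > "$demo_dir/file1"
printf '%s\n' 'ALEX 3.4' 'SAM 8.9' 'PEPPER 9.0' > "$demo_dir/file2"
match_in_order "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

For the sample files this prints PERL ALEX 3.4, the same result as the script above.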