Check whether one file is contained in another file

I have a file, a.txt, with a number on each line. I also have another file, b.txt, which also has a number on each line.
How can I check whether all lines of a.txt are included in b.txt?

+4
7 answers

You can use the diff command to compare two files.

Example usage:

$ seq 1 5 > a.txt
$ seq 1 5 > b.txt
$ diff a.txt b.txt
$
$ seq 1 6 > b.txt
$ diff a.txt b.txt
5a6
> 6

EDIT

You can also try something like

$ seq 1 5 > a.txt
$ seq 1 5 > b.txt
$ diff a.txt b.txt > /dev/null  && echo files are same || echo files are not same
files are same
$ seq 1 6 > b.txt
$ diff a.txt b.txt > /dev/null  && echo files are same || echo files are not same
files are not same
+3

Try the following:

awk '
    NR==FNR{arr[$0]++;next}
    {print ($0 in arr) ? $0 " in both files" : $0 " *not* in both files"}
' b.txt a.txt

With diff:

 $ diff -a b.txt a.txt
2c2
< 3
---
> 2
6d5
< 7
+1

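If the order of the lines does not matter, you can concatenate both files, pipe the result through sort and uniq, and count the remaining lines.

Example: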

>> cat a.txt
1
2
8
5
>> cat b.txt
1
2
5
3
8
>> cat a.txt b.txt | sort | uniq | wc -l
5

That is the same as the number of lines in b.txt, so every line of a.txt is in b.txt!
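
The count comparison can also be scripted instead of read off by eye. The following is only a sketch, assuming a POSIX sh; the function name is_contained is made up for illustration, and sort -u is used in place of sort | uniq. It compares the number of distinct lines in both files together with the number of distinct lines in b.txt alone, so duplicate lines in b.txt do not distort the result.

```shell
#!/bin/sh
# is_contained A B: exit status 0 if every line of A also occurs in B.
# Idea: adding the lines of A to B must not increase the number of distinct lines.
# (Hypothetical helper name; not part of any standard tool.)
is_contained() {
    # Distinct lines of A and B together; tr strips padding some wc versions add
    union_count=$(sort -u "$1" "$2" | wc -l | tr -d ' ')
    # Distinct lines of B alone
    b_count=$(sort -u "$2" | wc -l | tr -d ' ')
    [ "$union_count" -eq "$b_count" ]
}
```

It could then be called as: is_contained a.txt b.txt && echo "a.txt is contained in b.txt"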

+1
awk 'FNR==NR{b[$0];next}
            {if($0 in b){print $0" is present in b.txt"}
             else{print $0" is not present in b.txt"}
            }' b.txt a.txt
0

Perl:

#!/usr/bin/perl
use strict;
use warnings;
use List::Compare;

# Read a.txt, one array element per line
my @atxt;
open(my $fh, "<", "a.txt") or die $!;
while (<$fh>) {
    push @atxt, $_;
}
close($fh);

# Read b.txt, one array element per line
my @btxt;
open(my $fh2, "<", "b.txt") or die $!;
while (<$fh2>) {
    push @btxt, $_;
}
close($fh2);

my $lc = List::Compare->new(\@atxt, \@btxt);

print $lc->get_intersection;  # lines found in both files
print $lc->get_union;         # lines found in either file
print $lc->get_unique;        # lines found only in a.txt
print $lc->get_complement;    # lines found only in b.txt

For more details, see the documentation: http://search.cpan.org/~jkeenan/List-Compare-0.39/lib/List/Compare.pm

0

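If I understand correctly, you want to know whether a.txt is contained in b.txt regardless of line order, that is:

Does every line of a.txt also occur in b.txt?

If so, the order of the lines does not matter. For example: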

a.txt:

5
7
3

b.txt:

9
5
3
7

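Here a.txt is contained in b.txt, even though the order differs.

The idea is to put every line of b.txt into a set. Then, for each line of a.txt, if it is not in the set built from b.txt, return false, because a.txt is not contained. If you reach the end of a.txt, return true.

In pseudocode: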

ContentSet = {}
for each element b of b.txt
    add b into ContentSet

for each element a of a.txt
    if a is not in ContentSet then return false

return true

If ContentSet is implemented as a hash set, each insertion and lookup takes O(1) time on average.
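
The pseudocode above maps fairly directly onto awk, whose associative arrays can serve as the set. This is a sketch rather than a definitive implementation: the function name all_lines_in and the variable names are invented for illustration, and a POSIX sh and awk are assumed. The check FILENAME == ARGV[1] is used instead of the common NR == FNR test so that an empty b.txt is handled correctly.

```shell
#!/bin/sh
# all_lines_in A B: exit status 0 if every line of A occurs in B, 1 otherwise.
# (Hypothetical helper name, shown only to illustrate the set-based approach.)
all_lines_in() {
    awk '
        # First file given to awk (B): add every line to the set
        FILENAME == ARGV[1] { content_set[$0]; next }
        # Second file (A): stop at the first line that is not in the set
        !($0 in content_set) { missing = 1; exit }
        # END also runs after the early exit; it sets the final exit status
        END { exit (missing + 0) }
    ' "$2" "$1"
}
```

For example, all_lines_in a.txt b.txt && echo true || echo false reports the result of the check.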

0

You can use comm for this.

If a.txt and b.txt are already sorted (comm requires sorted input), run

comm -23 a.txt b.txt

or, to get only the number of such lines,

comm -23 a.txt b.txt | wc -l

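If nothing is printed (or wc -l prints "0"), every line of a.txt is in b.txt (-2 suppresses the lines that appear only in b.txt, -3 suppresses the lines that appear in both files).

If the files are not sorted, sort them on the fly for comm: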

comm -23 <(sort a.txt) <(sort b.txt)

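<(COMMAND) runs COMMAND and connects its output to a FIFO or to a file under /dev/fd (depending on what the system supports). <(COMMAND) is then replaced by the name of that file.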

This really compares lines, so if a number occurs twice in a.txt but only once in b.txt, the extra copy from a.txt is printed. If you don't care about duplicates, use sort -u FILE instead of sort FILE (or sort FILE | uniq if your sort doesn't have a switch for unique sorting).
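
For shells without <(COMMAND), the same check can be sketched with a temporary file and comm reading one input from standard input. The function name missing_lines is hypothetical, and the sketch assumes that mktemp is available (it is common but not required by POSIX). It uses sort -u, so duplicate lines are ignored.

```shell
#!/bin/sh
# missing_lines A B: print the distinct lines of A that do not occur in B.
# (Hypothetical helper name; assumes mktemp exists on the system.)
missing_lines() {
    sorted_b=$(mktemp) || return 2
    sort -u "$2" > "$sorted_b"
    # "-" tells comm to read its first input (the sorted lines of A) from stdin;
    # -23 keeps only the lines that are unique to that first input
    sort -u "$1" | comm -23 - "$sorted_b"
    rm -f "$sorted_b"
}
```

If the function prints nothing, a.txt is contained in b.txt, which can be tested with: [ -z "$(missing_lines a.txt b.txt)" ] && echo "a.txt is contained in b.txt"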

0