How to remove duplicate lines in a text file in unix bash?

I have a .txt file with multiple lines, and I would like to delete the duplicate lines without sorting the file. Which command can I use in a Unix Bash shell?

sample .txt file

orangejuice;orange;juice_apple
pineapplejuice;pineapple;juice_pineapple
orangejuice;orange;juice_apple

sample output:

orangejuice;orange;juice_apple
pineapplejuice;pineapple;juice_pineapple

One way is with awk:

awk '!a[$0]++' file.txt
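The pattern works because `a[$0]` starts at 0 (false) for each new line, so `!a[$0]++` is true the first time a line is seen; the post-increment then records it. A minimal sketch of the idea, using a throwaway file name (`sample.txt` is illustrative):

```shell
# Build the sample input from the question, one record per line.
printf 'orangejuice;orange;juice_apple\npineapplejuice;pineapple;juice_pineapple\norangejuice;orange;juice_apple\n' > sample.txt

# !a[$0]++ is true only on a line's first occurrence, so awk's
# default action (print) fires once per unique line, order preserved.
awk '!a[$0]++' sample.txt
# → orangejuice;orange;juice_apple
# → pineapplejuice;pineapple;juice_pineapple
```

To overwrite the original file, redirect to a temporary file first (awk cannot edit in place): `awk '!a[$0]++' sample.txt > tmp && mv tmp sample.txt`.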

You can use Perl for this:

perl -ne 'print unless $seen{$_}++' file.txt

The -n switch causes Perl to process the file line by line. Each line ( $_ ) is used as a key in the hash %seen. Because ++ is a post-increment, it returns the old value before incrementing: the expression is 0 (false) the first time a line is encountered, so `print unless` fires, and true on every later occurrence, so duplicates are skipped.

