How do you filter out all unique lines in a file?

Is there a way to filter a file down to its unique lines using command-line tools without sorting the lines? I would essentially be doing this:

sort -u myFile 

but without the performance cost of sorting.

1 answer

Remove duplicate lines:

 awk '!a[$0]++' file 
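For illustration, here is the command run on a hypothetical sample file (the contents are assumed for this example). Duplicates are dropped and the original order of first occurrences is preserved:

 $ cat myFile
 b
 a
 b
 c
 a
 $ awk '!a[$0]++' myFile
 b
 a
 c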

This is the famous awk one-liner. There are many explanations of it on the internet; here is one:

This one-liner is very idiomatic. It records each line it sees in the associative array "a" (arrays in awk are always associative) and at the same time tests whether it has seen that line before. If the line was seen before, then a[line] > 0 and !a[line] == 0. Any expression that evaluates to false is a no-op, and any expression that evaluates to true triggers the default action, {print}.
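A spelled-out equivalent (not from the original answer, just a sketch of the same logic) makes the implicit default action explicit:

 awk '{ if (a[$0] == 0) print; a[$0]++ }' file

Here a[$0] == 0 is true only the first time a line is seen, so each line is printed once; a[$0]++ then records it so later occurrences are skipped.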
