How do you filter out all unique lines in a file?

Is there a way to filter a file down to its unique lines using command-line tools without sorting the lines? I would essentially be doing this:

sort -u myFile 

but without the performance cost of sorting.

1 answer

Remove duplicate lines:

 awk '!a[$0]++' file 
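For illustration, here is the command run on a hypothetical sample file (the contents are assumed for this example). Duplicates are dropped and the original order of first occurrences is preserved:

 $ cat myFile
 b
 a
 b
 c
 a
 $ awk '!a[$0]++' myFile
 b
 a
 c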

This is the famous awk one-liner. There are many explanations of it on the internet; here is one:

This one-liner is very idiomatic. It records each line it sees in the associative array "a" (arrays in awk are always associative) and at the same time tests whether it has seen that line before. If the line was seen before, then a[line] > 0 and !a[line] == 0. Any expression that evaluates to false is a no-op, and any expression that evaluates to true triggers the default action, {print}.
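A spelled-out equivalent (not from the original answer, just a sketch of the same logic) makes the implicit default action explicit:

 awk '{ if (a[$0] == 0) print; a[$0]++ }' file

Here a[$0] == 0 is true only the first time a line is seen, so each line is printed once; a[$0]++ then records it so later occurrences are skipped.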
