How to remove duplicate rows based on column value?

Given the following table

 123456.451 entered-auto_attendant
 123456.451 duration:76 real:76
 139651.526 entered-auto_attendant
 139651.526 duration:62 real:62
 139382.537 entered-auto_attendant

In a bash shell script on Linux, I would like to remove the rows that repeat a value in column 1 (the long number), keeping one row per value. The number is not fixed; it varies from row to row.

I tried with

awk '{a[$3]++}!(a[$3]-1)' file

 sort -u | uniq 

But neither gives the result I want, which is to compare all the values of the first column, delete the duplicates, and show this:

 123456.451 entered-auto_attendant
 139651.526 entered-auto_attendant
 139382.537 entered-auto_attendant
+7
linux bash awk delete-row
4 answers

You did not state the expected output clearly. Does this work for you?

  awk '!a[$1]++' file 

with your data, output:

 123456.451 entered-auto_attendant
 139651.526 entered-auto_attendant
 139382.537 entered-auto_attendant
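The `!a[$1]++` idiom is terse, so here is a longer form that should behave the same way. It is only a sketch: the temporary file and the names `seen`, `first_seen` and `short_form` are chosen for illustration, and the data is the sample from the question.

```shell
# Write the sample data from the question to a temporary file.
file=$(mktemp)
cat > "$file" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# Long form: print a line only the first time its first field appears,
# then remember that field so later lines with the same value are skipped.
first_seen=$(awk '{
    if (!($1 in seen)) {   # column 1 value not seen before
        print
    }
    seen[$1] = 1           # record the value for the following lines
}' "$file")

# Short form from the answer, for comparison.
short_form=$(awk '!a[$1]++' "$file")

printf '%s\n' "$first_seen"
rm -f "$file"
```

Both forms keep the first line for each column 1 value and preserve the input order.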

And this one prints only the lines whose column 1 value occurs exactly once:

  awk '{a[$1]++;b[$1]=$0}END{for(x in a)if(a[x]==1)print b[x]}' file 

output:

 139382.537 entered-auto_attendant 
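One caveat with the `END` loop: awk does not define the order in which `for (x in a)` visits keys, so with several unique values the lines may come out in a different order than in the input. A possible alternative, assuming the input is a regular file that can be read twice, is the two-pass idiom sketched here (the names `count` and `unique_only` are illustrative):

```shell
# Write the sample data from the question to a temporary file.
file=$(mktemp)
cat > "$file" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# The file is passed twice. While NR == FNR awk is still reading the first
# copy, so it only counts column 1 values. On the second copy it prints the
# lines whose value was counted exactly once, in their original order.
unique_only=$(awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$file" "$file")

printf '%s\n' "$unique_only"
rm -f "$file"
```

This does not work on piped input, since a pipe cannot be read a second time.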
+6

uniq compares entire lines by default. Since your lines are not identical as a whole, none of them is removed.

You can use sort to conveniently sort by the first field, and also remove duplicates of it:

 sort -t ' ' -k 1,1 -u file 
  • -t ' ' : fields are separated by spaces
  • -k 1,1 : look only at the first field
  • -u : remove duplicates
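To see the difference on the sample data from the question, here is a small sketch. The variable names are illustrative. Which of the lines sharing a key survives `sort -u` can depend on the sort implementation, so only the first column of its output is inspected.

```shell
# Write the sample data from the question to a temporary file.
file=$(mktemp)
cat > "$file" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# Whole-line comparison: every line differs somewhere, so all 5 remain.
whole_line_count=$(sort -u "$file" | uniq | awk 'END { print NR }')

# Key-only comparison: one line per distinct first field remains.
# cut keeps only column 1 so the result does not depend on which line is kept.
keys=$(sort -t ' ' -k 1,1 -u "$file" | cut -d ' ' -f 1)

printf 'lines after sort -u | uniq: %s\n' "$whole_line_count"
printf '%s\n' "$keys"
rm -f "$file"
```

Note that the `sort` result is ordered by the key, so 139382.537 comes before 139651.526 here, unlike in the input. If the original order matters, the awk approach is the better fit.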

Also, you may have seen the awk '!a[$0]++' trick for deduplicating whole lines. You can apply the same deduplication to the first column only with awk '!a[$1]++' .

+2

Using awk:

 awk '!($1 in a){a[$1]++; next} $1 in a' file

This skips the first line for each column 1 value and prints the repeated ones:

 123456.451 duration:76 real:76
 139651.526 duration:62 real:62
+1

Try this command, which prints only the first two fields of the first line seen for each column 1 value:

 awk '!x[$1]++ { print $1, $2 }' file 
+1