Compare consecutive rows and multiple columns in awk and randomly select one of the repeating rows

I read the question "Compare consecutive lines in awk / (or python) and randomly select one of the repeating lines". Now I have a follow-up question: how do I change the code if I want to make this comparison not only for the x value, but also for the y value, or for more columns? Maybe something like

if ($1 != prev) && ($2 != prev) ??? 

In other words: I want to check whether the x value AND the y value of the current line both match those of the next consecutive lines.

Data:

    #xyz
    1 1 11
    10 10 12
    10 10 17
    4 4 14
    20 20 15
    20 88 16
    20 99 17
    20 20 22
    5 5 19
    10 10 20

The result should look like this:

    #xyz
    1 1 11
    10 10 17
    4 4 14
    20 20 15
    20 88 16
    20 99 17
    20 20 22
    5 5 19
    10 10 20

or (due to random selection)

    #xyz
    1 1 11
    10 10 12
    4 4 14
    20 20 15
    20 88 16
    20 99 17
    20 20 22
    5 5 19
    10 10 20

Code from the question linked above, which handles the x values but does NOT also check the y values in an AND condition:

    $ cat tst.awk
    function prtBuf(   idx) {
        if (cnt > 0) {
            idx = int((rand() * cnt) + 1)
            print buf[idx]
        }
        cnt = 0
    }

    BEGIN { srand() }

    $1 != prev { prtBuf() }

    { buf[++cnt]=$0; prev=$1 }

    END { prtBuf() }
1 answer

This should do it:

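    function prtBuf(   idx) {
        if (cnt > 0) {
            # pick a random index in 1..cnt and print that buffered row
            idx = int((rand() * cnt) + 1)
            print buf[idx]
        }
        cnt = 0
    }

    BEGIN { srand() }

    # a run ends as soon as either column differs from the previous row
    $1 != prev1 || $2 != prev2 { prtBuf() }

    { buf[++cnt]=$0; prev1=$1; prev2=$2 }

    END { prtBuf() }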
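Note that the test uses `||` rather than `&&`: two rows belong to the same run only if x AND y both match, so the run is broken when x OR y differs. For more than two columns, one option is to fold the compared fields into a single key instead of adding more `prevN` variables. The following is only a sketch and not part of the original answer; the shell function name `pick_one_per_run` and the awk variable `n` (the number of leading columns to compare) are made up for illustration.

```shell
# pick_one_per_run N: hypothetical helper, not from the original answer.
# Reads rows on stdin; for each run of consecutive rows whose first N
# fields are all equal, prints one row of that run chosen at random.
pick_one_per_run() {
    awk -v n="${1:-2}" '
    function prtBuf(   idx) {
        if (cnt > 0) {
            idx = int(rand() * cnt) + 1   # random index in 1..cnt
            print buf[idx]
        }
        cnt = 0
    }
    BEGIN { srand() }
    {
        # Build one composite key from the first n fields.
        key = ""
        for (i = 1; i <= n; i++)
            key = key SUBSEP $i
        if (key != prev)      # some key column changed: the run has ended
            prtBuf()
        buf[++cnt] = $0
        prev = key
    }
    END { prtBuf() }
    '
}
```

It would be called as `pick_one_per_run 2 < file` for the x and y columns, or with `3` to include z as well. One difference to be aware of: the composite key is compared as a string, so `10` and `10.0` count as different values here, whereas comparing the fields directly may treat them as equal numbers. Also, `srand()` without an argument seeds from the time of day, so two runs started within the same second can make the same picks.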