I have genomics files of the following type:
$ cat test-file_long.txt
2 41647 AG
2 45895 AG
2 45953 TC
2 224919 AG
2 230055 CG
2 233239 AG
2 234130 TG
2 23454 TC
When I use the following short AWK script, it does not return all elements that are larger than the element used in the if statement:
{ a[$2] } END{ for (i in a){ if(i > 45895) print i } }
The script returns this:
$ awk -f practice.awk test-file_long.txt
45953
However, when I modify the if statement to use the int() function, it returns all the values that are actually larger, which is what I want:
{ a[$2] } END{ for (i in a){ if(int(i) > 45895) print i } }
Result:
$ awk -f practice.awk test-file_long.txt
233239
230055
234130
224919
45953
It seems that awk compares only the first digit and, if those are equal, moves on to the next digit, rather than comparing the values as integers. Can someone explain what this says about the internal mechanism of the associative array, such that it does not make a numerical > / < comparison unless I specify that I want int() of the array element? What if my array elements were floats and int() was not an option?
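For reference, here is a minimal sketch of one possible workaround. It assumes the cause is that awk stores array subscripts as strings, so `i > 45895` is compared character by character. Adding 0 to the subscript coerces it to a number without truncating it, so it should also work for floats. The `result` variable and the `sort -n` step are only there for illustration, because `for (i in a)` visits the keys in an unspecified order.

```shell
# Feed the same eight records as in the sample file to awk on stdin.
result=$(printf '%s\n' \
    '2 41647 AG' '2 45895 AG' '2 45953 TC' '2 224919 AG' \
    '2 230055 CG' '2 233239 AG' '2 234130 TG' '2 23454 TC' |
  awk '
    { a[$2] }                    # subscripts are stored as strings
    END {
      for (i in a)
        if (i + 0 > 45895)       # i + 0 forces a numeric comparison
          print i
    }' | sort -n)                # sort only to make the order predictable

echo "$result"
```

With the sample data this should print the five positions above 45895, the same set that the int() version returns.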