The problem seems to be specific to the awk that ships with Cygwin.
I tried several different things, and it seems that awk silently replaces \r\n with \n in the input.
If we simply ask awk to repeat the text unchanged, it will "sanitize" the carriage return without being asked:
$ echo -e "line1\r\nline2" | od -a
0000000   l   i   n   e   1  cr  nl   l   i   n   e   2  nl
0000015
$ echo -e "line1\r\nline2" | awk '{ print $0; }' | od -a
0000000   l   i   n   e   1  nl   l   i   n   e   2  nl
0000014
However, it leaves other carriage returns untouched:
$ echo -e "Test\rTesting\r\nTester\rTested" | awk '{ print $0; }' | od -a
0000000   T   e   s   t  cr   T   e   s   t   i   n   g  nl   T   e   s
0000020   t   e   r  cr   T   e   s   t   e   d  nl
0000033
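The two observations above can also be checked without reading od dumps, by comparing byte counts before and after awk. This is only a minimal sketch: bytes_dropped is a hypothetical helper name, not part of awk or Cygwin, and it assumes printf, wc, and tr are available.

```shell
# Hypothetical helper: print how many bytes disappear when awk echoes its input.
# $1 is a string with backslash escapes; it should end in \n so that awk's own
# output newline does not skew the count.
bytes_dropped() {
    # Byte count of the raw input (tr strips the padding some wc versions add).
    before=$(printf '%b' "$1" | wc -c | tr -d ' ')
    # Byte count after awk prints every record unchanged.
    after=$(printf '%b' "$1" | awk '{ print $0 }' | wc -c | tr -d ' ')
    echo $((before - after))
}

bytes_dropped 'line1\r\nline2\n'   # CR directly before the newline
bytes_dropped 'Test\rTesting\n'    # lone CR in the middle of a line
```

On an awk that behaves as described here, the first call should report one dropped byte and the second none. On an awk that passes input through untouched, both calls should report 0.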
Using a custom record separator of _ also left the lone carriage return intact:
$ echo -e "Testing\r_Tested" | awk -v RS="_" '{ print $0; }' | od -a
0000000   T   e   s   t   i   n   g  cr  nl   T   e   s   t   e   d  nl
0000020  nl
0000021
The most striking example includes \r\n in the data, but not as a record separator:
$ echo -e "Testing\r\nTested_Hello_World" | awk -v RS="_" '{ print $0; }' | od -a
0000000   T   e   s   t   i   n   g  nl   T   e   s   t   e   d  nl   H
0000020   e   l   l   o  nl   W   o   r   l   d  nl  nl
0000034
awk blindly converts \r\n to \n in the input, even though we didn't ask for it.
This replacement seems to occur before record separation is applied, which explains why RS="\r\n" never matches anything. By the time awk searches for \r\n, it has already replaced it with \n in the input.
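A quick way to see whether RS="\r\n" can ever match on a given system is to count the records awk reports. This is a sketch under assumptions: count_crlf_records is a hypothetical helper name, and a multi-character RS is an extension (supported by gawk and mawk, unspecified by POSIX), so other awk implementations may count differently.

```shell
# Hypothetical helper: count the records awk sees when RS is CR LF.
count_crlf_records() {
    # The input contains two CRLF-terminated records, "a" and "b".
    printf 'a\r\nb\r\n' | awk -v RS='\r\n' 'END { print NR }'
}

count_crlf_records
```

If awk has already turned \r\n into \n before splitting, the separator never matches and the whole input is read as a single record. An awk that sees the original bytes and supports a multi-character RS should find the separator and report two records.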
Mr. Llama