I have a text file that contains two non-ASCII bytes (0xFF and 0xFE):
??58832520.3,ABC
348384,DEF
Hex dump of this file:
FF FE 35 38 38 33 32 35 32 30 2E 33 2C 41 42 43 0A 33 34 38 33 38 34 2C 44 45 46
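For anyone who wants to reproduce this, the following is a minimal sketch that rebuilds the sample file from the hex dump above and prints its bytes. The temporary path and the variable names (`tmpdir`, `sample`, `dump`) are assumptions made for illustration; only `printf`, `od`, and `tr` are used.

```shell
# Recreate the sample file byte for byte (octal 377 376 = hex FF FE).
# The path below is a hypothetical scratch location, not the original test.csv.
tmpdir=$(mktemp -d)
sample="$tmpdir/test.csv"
printf '\377\376%s\n%s' '58832520.3,ABC' '348384,DEF' > "$sample"

# od -An -tx1 prints each byte as two hex digits without the offset column.
# tr squeezes the line breaks and repeated spaces into single spaces.
dump=$(od -An -tx1 "$sample" | tr -s ' \n' ' ')
dump=${dump# }   # drop the leading space
dump=${dump% }   # drop the trailing space
echo "$dump"

rm -rf "$tmpdir"
```

The output is lowercase, but it should list the same 27 bytes as the dump above.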
It looks like FF and FE are leading bytes (they occur throughout my file, although they always seem to be at the beginning of a line).
I am trying to remove these bytes with sed, but nothing I try seems to match them.
$ sed 's/[^a-zA-Z0-9\,]//g' test.csv
??588325203,ABC
348384,DEF
$ sed 's/[a-zA-Z0-9\,]//g' test.csv
??.
The main question is: how do I remove these bytes? Bonus question: the two regular expressions are direct negations of each other, so one of them should logically filter out these bytes, right? Why do both of these regular expressions leave the 0xFF and 0xFE bytes in place?
Update: The direct approach of removing a range of hex bytes (suggested by the two answers below) seems to strip the first "legitimate" byte from each line and leave the bytes that I'm trying to get rid of:
$ sed 's/[\x80-\xff]//' test.csv
??8832520.3,ABC
48384,DEF
FF FE 38 38 33 32 35 32 30 2E 33 2C 41 42 43 0A 34 38 33 38 34 2C 44 45 46 0A
Note the absence of "5" and "3" at the beginning of each line, and that a new 0A is added at the end of the file.
Bigger update: This problem seems to be system-dependent. It was observed on OS X, but the suggestions (including my original sed expression above) work as I expect on NetBSD.
Solution: the same task turns out to be quite simple in Perl:
$ perl -pe 's/^\xFF\xFE//' test.csv
58832520.3,ABC
348384,DEF
However, I will leave this question open, as this is only a workaround and does not explain what the problem is with sed.
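One possible explanation, which nothing above confirms, is that sed on OS X decodes its input according to the current locale, and that 0xFF and 0xFE are not valid characters in a UTF-8 locale, so no bracket expression can match them. Under that assumption, forcing the C locale makes sed treat every byte as a character. The sketch below builds the sed script with `printf` so that the raw bytes appear in the pattern, because `\x` escapes are not supported by every sed. The temporary file and the variable names (`tmpdir`, `sample`, `script`, `cleaned_sed`, `cleaned_tr`) are assumptions made for illustration.

```shell
# Recreate the sample data in a hypothetical scratch file.
tmpdir=$(mktemp -d)
sample="$tmpdir/test.csv"
printf '\377\376%s\n%s\n' '58832520.3,ABC' '348384,DEF' > "$sample"

# Build the sed script with the literal bytes 0xFF 0xFE (octal 377 376),
# then run sed in the C locale so that it works byte by byte.
script=$(printf 's/^\377\376//')
cleaned_sed=$(LC_ALL=C sed "$script" "$sample")
echo "$cleaned_sed"

# Alternative: tr deletes the two bytes wherever they occur,
# not only at the beginning of a line.
cleaned_tr=$(LC_ALL=C tr -d '\377\376' < "$sample")
echo "$cleaned_tr"

rm -rf "$tmpdir"
```

If the locale really is the cause, both commands should print the two lines without the leading bytes. Note that the `tr` variant removes these bytes anywhere in the file, which is only safe if they never occur as legitimate data.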