Matching $ against a Windows newline in a Python bytes regex

$ matches at the end of a line, which is defined as either the end of the string or any location followed by a newline character.

However, a Windows newline consists of two characters, '\r\n'. How can I make '$' recognize '\r\n' as a newline in a bytes pattern?

Here is what I have:

# Python 3.4.2
import re

input = b'''
//today is a good day \r\n
//this is Windows newline style \r\n
//unix line style \n
...other binary data... 
'''

L = re.findall(rb'//.*?$', input, flags=re.DOTALL | re.MULTILINE)
for item in L:
    print(item)

Current output:

b'//today is a good day \r'
b'//this is Windows newline style \r'
b'//unix line style '

Expected output:
b'//today is a good day '
b'//this is Windows newline style '
b'//unix line style '
+4
3 answers

It is not possible to redefine the behavior of the $ anchor.

To match // followed by any number of characters other than CR and LF, use a negated character class [^\r\n] with a * quantifier:

L = re.findall(rb'//[^\r\n]*', input)

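Note that this approach needs neither re.M nor re.S.

Alternatively, add \r? before $ and wrap that part in a positive lookahead (you will also need the lazy *? quantifier with .):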
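As a quick check of the negated character class, here is a minimal sketch. The sample bytes in `data` are a hypothetical stand-in for the question's input, not the asker's real data.

```python
import re

# Hypothetical sample mixing Windows (\r\n) and Unix (\n) line endings.
data = (b'//today is a good day \r\n'
        b'//this is Windows newline style \r\n'
        b'//unix line style \n')

# [^\r\n]* stops before the first CR or LF, so no flags are required.
comments = re.findall(rb'//[^\r\n]*', data)
for item in comments:
    print(item)
```

Because the class excludes both bytes, the trailing \r never ends up in a match.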

rb'//.*?(?=\r?$)'

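The point of using a lookahead is that $ is itself a kind of lookahead: it does not consume the \n. So it can safely go inside a lookahead together with an optional \r.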
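The lookahead variant can be tried the same way. This is a sketch that assumes the question's flags (re.M and re.S) and uses hypothetical sample bytes in `data`.

```python
import re

# Hypothetical sample mixing Windows (\r\n) and Unix (\n) line endings.
data = (b'//today is a good day \r\n'
        b'//this is Windows newline style \r\n'
        b'//unix line style \n')

# re.M lets $ match before each \n; re.S lets . cross line breaks,
# so the lazy *? is what stops the match at the first line end.
pattern = re.compile(rb'//.*?(?=\r?$)', re.M | re.S)
comments = pattern.findall(data)
for item in comments:
    print(item)

# A lone \r in the middle of a line is kept, because only a \r that is
# directly followed by the line end satisfies the lookahead.
stray = pattern.findall(b'//a\rb\r\n')
```

One difference from the negated class: here `stray` keeps the embedded \r, whereas `//[^\r\n]*` would stop at it.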

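This may not be directly pertinent since it comes from MSDN, but I believe the same holds for Python:

Note that $ matches \n but does not match \r\n (the combination of carriage return and newline characters, or CR/LF). To match the CR/LF character combination, include \r?$ in the regular expression pattern.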
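The behavior described in that note can be checked against Python's re module with a small sketch; the two sample strings below are made up for illustration.

```python
import re

windows_line = b'good day\r\nnext'
unix_line = b'good day\nnext'

# $ only looks for \n, so the \r sitting before it blocks the match.
win_plain = re.search(rb'day$', windows_line, re.M)
unix_plain = re.search(rb'day$', unix_line, re.M)

# An optional \r in front of $ covers both conventions.
win_opt = re.search(rb'day\r?$', windows_line, re.M)
unix_opt = re.search(rb'day\r?$', unix_line, re.M)

print(win_plain)           # no match on the Windows line
print(unix_plain.group())  # plain $ works for the Unix line
print(win_opt.group())     # the \r is consumed by \r?
print(unix_opt.group())
```

Note that without a lookahead the \r becomes part of the match, which is why the pattern above wraps \r?$ in (?=...).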

In PCRE, you can use (*ANYCRLF), (*CR) and (*ANY) to change what $ treats as a line break, but not in Python.

+3

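Another option is an explicit lookahead:

re.findall(rb'//.*?(?=\r|\n|(?!.))', input, re.DOTALL | re.MULTILINE)

This replaces $ with a lookahead for \r, \n, or the end of the input. Note the rb prefix: a str pattern cannot be applied to a bytes object.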
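A minimal sketch of this explicit lookahead, assuming a bytes pattern (rb prefix) because the input is bytes. The sample `data` is hypothetical and ends without a newline to exercise the end-of-input branch.

```python
import re

# Hypothetical sample; the last comment has no trailing newline.
data = (b'//today is a good day \r\n'
        b'//this is Windows newline style \r\n'
        b'//unix line style \n'
        b'//no trailing newline')

# With re.DOTALL, (?!.) succeeds only at the very end of the input,
# so the lookahead reads: next byte is \r, or \n, or there is none.
pattern = re.compile(rb'//.*?(?=\r|\n|(?!.))', re.DOTALL)
comments = pattern.findall(data)
for item in comments:
    print(item)
```

re.MULTILINE has no effect on this pattern, since it contains neither ^ nor $.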

+1

I think you could also use \v, which in PCRE is a vertical-space class that matches [\n\cK\f\r\x85\x{2028}\x{2029}]. (Python's re module treats \v as the vertical tab character only.)

To keep it out of the output, use a lookahead: //.*(?=\v|$)

Test at regex101.com
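In Python's re module, \v is only the vertical tab, so the same idea needs an explicit class. The sketch below assumes an ASCII-only subset of the PCRE set (LF, VT, FF, CR); \x85, U+2028 and U+2029 are left out because their byte representation depends on the encoding. `VERTICAL` and `data` are names made up for this example.

```python
import re

# Hypothetical sample mixing Windows (\r\n) and Unix (\n) line endings.
data = (b'//today is a good day \r\n'
        b'//this is Windows newline style \r\n'
        b'//unix line style \n')

# In re, \v is a single character (0x0b), not a class of line breaks.
vt_only = re.fullmatch(rb'\v', b'\x0b') is not None
crlf_hit = re.search(rb'\v', b'\r\n')

# Assumed ASCII subset of PCRE's \v: LF, VT, FF, CR.
VERTICAL = rb'[\n\x0b\x0c\r]'
pattern = re.compile(rb'//.*?(?=' + VERTICAL + rb'|\Z)', re.DOTALL)
comments = pattern.findall(data)
for item in comments:
    print(item)
```

\Z stands in for $ here so that the end-of-input case does not depend on re.MULTILINE.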

+1