According to the BRE/ERE Bracket Expression section of the POSIX regular expression specification:
- [...] The right-bracket ( ']' ) shall lose its special meaning and represent itself in a bracket expression if it occurs first in the list (after an initial circumflex ( '^' ), if any). Otherwise, it shall terminate the bracket expression, unless it appears in a collating symbol (such as "[.].]" ) or is the ending right-bracket for a collating symbol, equivalence class, or character class. The special characters '.' , '*' , '[' , and '\' (period, asterisk, left-bracket, and backslash, respectively) shall lose their special meaning within a bracket expression.
and
- [...] If a bracket expression specifies both '-' and ']' , the ']' shall be placed first (after the '^' , if any) and the '-' last within the bracket expression.
Therefore, your regular expression should be:
echo "fdsl[]" | grep -Eo "[][ a-z]+"
Note the -E flag, which selects ERE, where the + quantifier is supported. The + quantifier is not supported in BRE (the default mode).
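The ordering rule quoted above (']' first, '-' last) can be checked with a small sketch. The sample string and the variable names are made up for illustration, LC_ALL=C is only there so that the a-z range is unambiguous, and -o is assumed to be available (it is common, but not required by POSIX grep):

```shell
# Hypothetical sample input, not taken from the question.
sample='x-y]z 123'

# ']' comes first and '-' comes last, so both are literal members of the set.
matched=$(printf '%s\n' "$sample" | LC_ALL=C grep -Eo '[]a-z-]+')

printf '%s\n' "$matched"
```

Only the leading run of letters, '-' and ']' should be reported; the space and the digits are not in the set.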
The solution in Mike Holt's answer, "[][a-z ]\+" with an escaped + , works because it is run on GNU grep, which extends the grammar so that \+ means "repeat one or more times". According to the POSIX standard this is actually undefined behavior (which means that an implementation can give it a meaningful behavior and document it, produce a syntax error, or do something else).
If you are fine with the assumption that your code will only run in a GNU environment, then it is perfectly fine to use Mike Holt's answer. Taking sed as an example, you are stuck with BRE when using POSIX sed (there is no flag to switch to ERE), and it is cumbersome to write even a simple regular expression in POSIX BRE, where the only defined quantifiers are * and the interval expression \{m,n\} .
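For the portable case, "one or more" can be spelled in BRE without \+ . The following is a minimal sketch, assuming a POSIX sed; the variable names and the <...> markers around the match are only for illustration. It shows two spellings: one mandatory item followed by * , and the BRE interval expression \{1,\} .

```shell
input='fdsl[]'

# "One or more" written as one mandatory bracket expression followed by '*'.
star=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[][a-z ][][a-z ]*/<&>/')

# The same thing written with a BRE interval expression.
interval=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[][a-z ]\{1,\}/<&>/')

printf '%s\n%s\n' "$star" "$interval"
```

Both commands should wrap the whole line in <...> , since every character of the input is in the set.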
Original regex
Note that grep consumes the input file line by line, and then checks whether each line matches the regular expression. Therefore, even if you use the -P flag with your original regular expression, \n is always redundant, since the regular expression cannot match across lines.
While it is possible to match a horizontal tab without the -P flag, I think it is more natural to use the -P flag for this task.
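A quick way to see the line-by-line behavior, as a sketch that assumes GNU grep built with -P support (the two-line input is made up):

```shell
# Two lines of hypothetical input; the pattern tries to span the line break.
# grep exits with status 1 when nothing matches, hence the '|| true'.
count=$(printf 'a\nb\n' | grep -Pc 'a\nb' || true)

printf '%s\n' "$count"
```

The count of matching lines should be 0, because each line is tested on its own and never contains a newline.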
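For comparison, here is one way to match a tab without -P : put a literal tab character into the bracket expression through a shell variable. This is only a sketch; the variable names and the shortened input are assumptions for illustration, and -o is assumed to be available.

```shell
# Store a literal tab; command substitution only strips trailing newlines.
tab=$(printf '\t')

# ']' first, then '[', the literal tab, and the a-z range.
matched=$(printf 'fds\tl[]<>\n' | LC_ALL=C grep -Eo "[][${tab}a-z]+")

printf '%s\n' "$matched"
```

The match should stop before '<' , which is not in the set.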
Given this input:
$ echo -e "fds\tl[]kSAJD<>?,./:\";'{}|[]\\ !@ #$%^&*()_+-=~\`89"
fds l[]kSAJD<>?,./:";'{}|[]\ !@ #$%^&*()_+-=~`89
The original regex in the question works with a slight modification (unescaping the + at the end):
$ echo -e "fds\tl[]kSAJD<>?,./:\";'{}|[]\\ !@ #$%^&*()_+-=~\`89" | grep -Po "[ \[\]\t\na-zA-Z\/:\.0-9_~\"'+,;*\=()$\ !@ #&?-]+"
fds l[]kSAJD
?,./:";'
[]
!@ #$
&*()_+-=~
89
However, we can remove \n (since it is redundant, as explained above) and several other unnecessary escapes:
$ echo -e "fds\tl[]kSAJD<>?,./:\";'{}|[]\\ !@ #$%^&*()_+-=~\`89" | grep -Po "[ \[\]\ta-zA-Z/:.0-9_~\"'+,;*=()$\ !@ #&?-]+"
fds l[]kSAJD
?,./:";'
[]
!@ #$
&*()_+-=~
89