Ignoring spaces in a regex match

I need to match 8 or more digits, the sequence of which may contain spaces.

For example, all of the following should be valid matches:

12345678
1 2345678
12 3 45678
1234 5678
12 34567 8
1 2 3 4 5 6 7 8

At the moment I have \d{8,} , but this only matches a contiguous block of 8 or more digits.
[\d\s]{8,} will not work either, since I do not want the whitespace to count toward the number of matched characters.

+7
c# regex
3 answers

Way late, but this really needs a proper answer, with reasoning. Who knew this question could have such a complex answer, right? Lol. But there are a lot of considerations around whitespace in a regular expression.

First: never put a bare space in a regular expression. It makes the regex unreadable and unmaintainable. Memories of using the mouse to highlight a space to make sure it really is only one space come to mind. An accidental double space typed as a literal will break your regular expression, but a doubled space inside a character class will not, because repetition inside a character class is ignored. And if you need an exact number of spaces, you can state it visibly with a character class: [ ]{3} . Compare that with the accident waiting to happen without a character class: three literal spaces followed by {3} actually look for 5 spaces, because the quantifier applies only to the last one. Whoops!
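To make the five-spaces accident concrete, here is a minimal sketch. The sample strings and variable names are made up for illustration, and the spaces are built with new string(' ', n) so they can be counted reliably.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical inputs: "a" and "b" separated by 3 and by 5 spaces.
string three = "a" + new string(' ', 3) + "b";
string five  = "a" + new string(' ', 5) + "b";

// Character class: exactly three spaces, visibly.
var classPattern = new Regex(@"^a[ ]{3}b$");

// Three literal spaces followed by {3}: the quantifier binds only to the
// last space, so the pattern wants 2 + 3 = 5 spaces.
var literalPattern = new Regex("^a" + new string(' ', 3) + "{3}b$");

Console.WriteLine(classPattern.IsMatch(three));   // True
Console.WriteLine(classPattern.IsMatch(five));    // False
Console.WriteLine(literalPattern.IsMatch(three)); // False
Console.WriteLine(literalPattern.IsMatch(five));  // True
```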

Second: keep the free-spacing (?x) option in mind, which lets you lay out your regex with whitespace and comments. You should not have to worry that someone turning this option on will break your regular expression because you put literal spaces into it. Note that (?x) does not ignore a literal space when it is inside a character class, for example [ ] . It is therefore safer to use character classes for your literal spaces.
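A small sketch of that difference in .NET, using the inline (?x) flag. The patterns and inputs are illustrative only, not taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Under (?x), unescaped whitespace in the pattern is ignored,
// so this is effectively ^\d\d$ .
var bareSpace = new Regex(@"(?x) ^ \d \d $");

// Inside a character class the space is still literal,
// so this is effectively ^\d[ ]\d$ .
var classSpace = new Regex(@"(?x) ^ \d [ ] \d $");

Console.WriteLine(bareSpace.IsMatch("12"));    // True
Console.WriteLine(bareSpace.IsMatch("1 2"));   // False
Console.WriteLine(classSpace.IsMatch("1 2"));  // True
Console.WriteLine(classSpace.IsMatch("12"));   // False
```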

Third: try not to use \s in this scenario. As Omagosh points out, it also includes line breaks ( \r and \n ), which is not what you want in the scenario you described. However, as Omagosh also points out, you may want more than just the plain space character. So you can use [ ] , [\s-[\r\n]] or [\f\t\v\u00A0\u2028\u2029\u0020] depending on what you prefer. The last two are close to equivalent (in .NET, \s also matches further Unicode separators, such as \u0085 and the rest of \p{Z} ), but character class subtraction only works in .NET and a few other flavors.
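As a rough illustration of why \s is risky here, the sketch below runs the three variants against two made-up inputs: four digits on each of two lines, and eight digits split by a tab.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical inputs, not from the question.
string twoLines = "1234\n5678";  // two separate four-digit lines
string tabbed   = "1234\t5678";  // eight digits split by a tab

// \s also matches \n, so the two lines are glued into one "number".
bool slashS = Regex.IsMatch(twoLines, @"(?:\d\s*){8,}");

// [ ] matches only the space character, so neither input matches.
bool spaceOnLines = Regex.IsMatch(twoLines, @"(?:\d[ ]*){8,}");
bool spaceOnTab   = Regex.IsMatch(tabbed, @"(?:\d[ ]*){8,}");

// .NET class subtraction: any whitespace except \r and \n.
bool subOnLines = Regex.IsMatch(twoLines, @"(?:\d[\s-[\r\n]]*){8,}");
bool subOnTab   = Regex.IsMatch(tabbed, @"(?:\d[\s-[\r\n]]*){8,}");

Console.WriteLine(slashS);        // True
Console.WriteLine(spaceOnLines);  // False
Console.WriteLine(spaceOnTab);    // False
Console.WriteLine(subOnLines);    // False
Console.WriteLine(subOnTab);      // True
```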

Fourth: this is generally an overcomplicated pattern: (\s*...\s*)* . It doesn't make any sense. It is the same as (\s*\s*...)* or (\s*\s*\s*\s*...)* , because the pattern repeats. The only argument against what I'm saying is that it guarantees capturing the spaces before the ... But that has never been a requirement. At worst, you might write this: \s*(...\s*)*

Omagosh had the closest answer, but this is the shortest correct answer:

 Regex.Match(input, @"(?:\d[ ]*){8,}").Groups[0].Value; 

Or the following, if we take the example literally as six candidates in the same text on separate lines:

 Regex.Match(input, @"(?m)^(?:\d[ ]*){8,}$").Groups[0].Value; 

Or the following, if it is part of a larger regular expression and needs a group:

 Regex.Match(input, @"...((?:\d[ ]*){8,})...").Groups[1].Value; 

And feel free to replace [ ] with the .NET character class subtraction or the explicit whitespace class:

 @"(?:\d[\s-[\r\n]]*){8,}"
 // Or...
 @"(?:\d[\f\t\v\u00A0\u2028\u2029\u0020]*){8,}"
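As a quick sanity check, here is a sketch that runs the [ ] variant against the six sample inputs from the question. The variable names and the seven-digit counterexample are my own additions.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The six examples from the question.
string[] samples =
{
    "12345678", "1 2345678", "12 3 45678",
    "1234 5678", "12 34567 8", "1 2 3 4 5 6 7 8"
};

var pattern = new Regex(@"(?:\d[ ]*){8,}");

foreach (string s in samples)
{
    string value = pattern.Match(s).Groups[0].Value;
    // Count only the digits; spaces are matched but do not count toward the 8.
    int digits = value.Count(c => char.IsDigit(c));
    Console.WriteLine($"'{value}' has {digits} digits");
}

// Same six examples as one multi-line text, using the anchored variant.
string block = string.Join("\n", samples);
int lineMatches = Regex.Matches(block, @"(?m)^(?:\d[ ]*){8,}$").Count;
Console.WriteLine(lineMatches);

// Counterexample (made up): only seven digits, so no match.
bool sevenDigits = pattern.IsMatch("1234 567");
Console.WriteLine(sevenDigits);
```

Note that the match value can include trailing spaces after the last digit, since [ ]* follows every \d .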
+1
 (\d *){8,} 

It matches eight or more occurrences of a digit followed by zero or more spaces. Change it to

 ( *\d *){8,} # there is a space before the first asterisk

to match strings with leading spaces. Or

 (\s*\d\s*){8,} 

to match tabs and other whitespace characters (including line feeds).

Finally, make it a non-capturing group with ?: . It then becomes (?:\s*\d\s*){8,}
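To see what the ?: changes in .NET, here is a minimal sketch on one of the question's inputs. Both forms match the same text; the capturing form just carries an extra group, which ends up holding only the last repetition.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "1234 5678";

Match capturing    = Regex.Match(input, @"(\s*\d\s*){8,}");
Match nonCapturing = Regex.Match(input, @"(?:\s*\d\s*){8,}");

// The overall match is identical.
Console.WriteLine(capturing.Value == nonCapturing.Value);  // True

// Group 0 is the whole match; the capturing form adds group 1,
// which holds only the last iteration of the repeated group.
Console.WriteLine(capturing.Groups.Count);     // 2
Console.WriteLine(capturing.Groups[1].Value);  // 8
Console.WriteLine(nonCapturing.Groups.Count);  // 1
```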

+13
 (\d{8,}\s+)*\d{8,} 

should work

0