Why doesn't the range [01-12] work as expected?

I am trying to use the range pattern [01-12] in a regular expression to match a two-digit month (mm), but it does not work as expected.

+76
regex
Jun 30 '10 at 10:14
8 answers

It sounds like you have misunderstood how character class definitions work in regular expressions.

To match any of the strings 01, 02, 03, 04, 05, 06, 07, 08, 09, 10, 11 or 12, something like this works:

 0[1-9]|1[0-2] 
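As a quick check, the pattern can be tried against a few sample strings. This is a minimal sketch: the helper name isMonth and the ^(?:...)$ wrapper are assumptions added for illustration, so that the pattern has to match the whole string rather than a substring.

```javascript
// Hypothetical helper: true only if the whole string is a month from 01 to 12.
// The ^(?: ... )$ wrapper is an assumption added so that the alternation
// must cover the entire input.
const monthPattern = /^(?:0[1-9]|1[0-2])$/;

function isMonth(text) {
  return monthPattern.test(text);
}

for (const sample of ["01", "09", "10", "12", "00", "13", "1", "123"]) {
  console.log(sample, isMonth(sample));
}
```

Only 01, 09, 10 and 12 from this sample list are accepted.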

Explanation

A character class, by itself, matches one and exactly one character from the input string. [01-12] actually defines [012], a character class that matches one character from the input against any of the 3 characters 0, 1 or 2.

The - is a range definition that here goes from 1 to 1, which includes only 1. By contrast, something like [1-9] includes 1, 2, 3, 4, 5, 6, 7, 8, 9.
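The equivalence can be checked directly. The sketch below (variable names are illustrative only) runs both classes over every single digit and compares what they accept.

```javascript
// Compare [01-12] with [012] on every single digit.
const original = /^[01-12]$/;
const equivalent = /^[012]$/;

const digits = "0123456789".split("");
const acceptedByOriginal = digits.filter((d) => original.test(d));
const acceptedByEquivalent = digits.filter((d) => equivalent.test(d));

console.log(acceptedByOriginal.join(""));   // 012
console.log(acceptedByEquivalent.join("")); // 012

// A character class consumes one character, so a two-character string
// cannot match the anchored class at all.
console.log(original.test("12")); // false
```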

Beginners often make the mistake of defining things like [this|that]. This does not "work". This character class is the same as [this|a], i.e. it matches one character from the input against any of the 6 characters t, h, i, s, | or a. More than likely, (this|that) is what was intended.
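A small sketch of the difference between the character class and the group (the anchors are added here so that each pattern must cover the whole input):

```javascript
const charClass = /^[this|that]$/;   // one character out of: t h i s | a
const alternation = /^(this|that)$/; // the whole word "this" or "that"

console.log(charClass.test("this"));   // false: four characters, the class matches one
console.log(charClass.test("|"));      // true: the bar is a literal inside [...]
console.log(charClass.test("a"));      // true
console.log(alternation.test("this")); // true
console.log(alternation.test("that")); // true
console.log(alternation.test("t"));    // false
```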

How ranges are defined

So it is now obvious that a pattern like between [24-48] hours does not "work". The character class in this case is equivalent to [248].

That is, the - in a character class definition does not define a numeric range in the pattern. Regex engines do not really "understand" numbers in the pattern, with the exception of finite repetition syntax (for example, a{3,5} matches between 3 and 5 a).

Instead, a range definition uses the ASCII / Unicode encoding of the characters to define the range. The character 0 is encoded in ASCII as decimal 48; 9 is 57. Thus, the character class [0-9] includes all characters whose encoded values are between decimal 48 and 57. By design, these are exactly the characters 0, 1, ..., 9.
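The code-point view can be reproduced with a short sketch that prints the endpoints of the range and then collects every ASCII character the class accepts (variable names are illustrative):

```javascript
// Code points of the range endpoints.
console.log("0".charCodeAt(0)); // 48
console.log("9".charCodeAt(0)); // 57

// Collect every character in the ASCII range 0..127 that [0-9] accepts.
const digitClass = /^[0-9]$/;
const matched = [];
for (let code = 0; code < 128; code++) {
  const ch = String.fromCharCode(code);
  if (digitClass.test(ch)) matched.push(ch);
}
console.log(matched.join("")); // 0123456789
```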

Another example: A to Z

Let's look at another common character class definition, [a-zA-Z].

In ASCII:

  • A = 65, Z = 90
  • a = 97, z = 122

It means that:

  • [a-zA-Z] and [A-Za-z] are equivalent
  • In most flavors, [a-Z] is likely to be an illegal character range
    • because a (97) is "greater than" Z (90)
  • [A-z] is legal, but also includes these six characters:
    • [ (91), \ (92), ] (93), ^ (94), _ (95), ` (96)
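Both points can be observed in JavaScript. The sketch below lists the non-letter ASCII characters accepted by the range A-z, then shows that the out-of-order range a-Z is rejected when the pattern is compiled; other regex flavors may report the error differently.

```javascript
// List the non-letter ASCII characters that [A-z] accepts.
const wide = /^[A-z]$/;
const letters = /^[A-Za-z]$/;
const extras = [];
for (let code = 0; code < 128; code++) {
  const ch = String.fromCharCode(code);
  if (wide.test(ch) && !letters.test(ch)) extras.push(ch);
}
console.log(extras.join(" ")); // [ \ ] ^ _ `

// [a-Z] is out of order; in JavaScript the RegExp constructor throws.
let rejected = false;
try {
  new RegExp("[a-Z]");
} catch (err) {
  rejected = err instanceof SyntaxError;
}
console.log(rejected); // true
```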

Related questions

  • Is the regular expression [a-Z] valid, and if so, is it the same as [a-zA-Z]?
+166
Jun 30 '10 at 10:15

A character class in regular expressions, denoted by the [...] syntax, specifies the rules for matching a single character in the input. Everything you write between the brackets describes how to match that one character.

Thus, your pattern [01-12] breaks down as follows:

  • 0: match the single digit 0
  • or 1-1: match a single digit in the range from 1 to 1
  • or 2: match the single digit 2

So basically all you are matching is 0, 1 or 2.

To do the matching you want, that is, matching two digits that range from 01 to 12 as numbers, you need to think about how they look as text.

You have:

  • 01-09 (i.e. the first digit is 0, the second digit is 1-9)
  • 10-12 (i.e., the first digit is 1, the second digit is 0-2)

Then you will need to write a regular expression for it, which might look like this:

   +-- a 0 followed by 1-9
   |
   |      +-- a 1 followed by 0-2
   |      |
 <-+--> <-+-->
 0[1-9]|1[0-2]
       ^
       |
       +-- vertical bar, this roughly means "OR" in this context

Please note that trying to combine the two alternatives into a shorter expression will fail, because the result produces false positives for invalid input.

For example, the pattern [0-1][0-9] matches the numbers 00-19, which is a bit more than you want.
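To see how much more, the sketch below enumerates every two-digit string from 00 to 99 and counts what each pattern accepts. The anchors and variable names are assumptions added for the comparison.

```javascript
// Enumerate all two-digit strings 00..99 and keep the ones each pattern accepts.
const loose = /^[0-1][0-9]$/;
const strict = /^(?:0[1-9]|1[0-2])$/;

const twoDigit = Array.from({ length: 100 }, (_, n) => String(n).padStart(2, "0"));
const looseHits = twoDigit.filter((s) => loose.test(s));
const strictHits = twoDigit.filter((s) => strict.test(s));

console.log(looseHits.length, looseHits[0], looseHits[looseHits.length - 1]);     // 20 00 19
console.log(strictHits.length, strictHits[0], strictHits[strictHits.length - 1]); // 12 01 12
```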

I tried to find a specific source with more information about character classes, but for now all I can give you is a Google query for regex character classes. Hopefully you can find more information there.

+22
Jun 30 '10 at 10:21

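This also works when the months 1 to 9 are written without a leading zero:

^([1-9]|[0-1][0-2])$

[1-9] matches a single digit from 1 to 9.

[0-1][0-2] matches the two-digit strings 10 to 12, but it also matches 00, 01 and 02; use 1[0-2] to restrict it to 10 to 12.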

There are some good examples here.

+7
Jun 30 '10 at 10:27

As polygenelubricants says, your pattern would look for 0|1-1|2 rather than what you want, because character classes (the things in []) match characters, not strings.

+1
Jun 30 '10 at 10:17

[] in a regular expression denotes a character class. If no ranges are specified, it implicitly ORs every character inside it together. Thus [abcde] is the same as (a|b|c|d|e), except that it does not capture anything; it matches any one of a, b, c, d or e. All a range does is indicate a set of characters; [ac-eg] says "match any one of: a; any character between c and e; or g". So your pattern says: "match any one of: 0; any character between 1 and 1 (i.e., just 1); or 2".

Your goal, evidently, is to specify a range of numbers: any number between 01 and 12, written with two digits. In this particular case you can match it with 0[1-9]|1[0-2]: either a 0 followed by any digit between 1 and 9, or a 1 followed by any digit between 0 and 2. In general, you can convert any range of numbers into a valid regular expression in the same way. However, there may be a better option than regular expressions, or an existing function or module that can build the regular expression for you. It depends on your language.
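One such non-regex option, sketched in JavaScript as an assumption about what "a better option" could look like: use the regular expression only to check the shape of the input, then compare the value numerically. The helper name isMonthNumeric is hypothetical.

```javascript
// Hypothetical helper: accept exactly two digits, then compare numerically.
function isMonthNumeric(text) {
  if (!/^[0-9]{2}$/.test(text)) return false; // shape check only: two digits
  const value = Number.parseInt(text, 10);    // explicit radix 10
  return value >= 1 && value <= 12;
}

console.log(isMonthNumeric("08")); // true
console.log(isMonthNumeric("12")); // true
console.log(isMonthNumeric("00")); // false
console.log(isMonthNumeric("13")); // false
console.log(isMonthNumeric("8"));  // false: not two digits
```

The numeric bounds are easier to change than a hand-built alternation, at the cost of a second step after the match.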

+1
Jun 30 '10 at 10:20

Use this:

 0?[1-9]|1[012] 
  • 07: valid
  • 7: valid
  • 0: does not match
  • 00: does not match
  • 13: does not match
  • 21: does not match
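The results listed above hold when the pattern has to cover the whole string. The sketch below compares the bare pattern with an anchored version; the ^(?:...)$ wrapper is an assumption about how the pattern is meant to be used.

```javascript
// Unanchored: the pattern may match any substring of the input.
const unanchored = /0?[1-9]|1[012]/;
// Anchored: the pattern must match the entire input (assumed intent).
const anchored = /^(?:0?[1-9]|1[012])$/;

for (const sample of ["07", "7", "0", "00", "13", "21"]) {
  console.log(sample, unanchored.test(sample), anchored.test(sample));
}
```

Without anchors, 13 and 21 pass the test because a single digit inside them (1 and 2 respectively) already satisfies 0?[1-9]; with anchors both are rejected.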

To validate a value such as 07/2018, use this:

 /^(0?[1-9]|1[012])\/([2-9][0-9]{3})$/ 

(Date range 01/2000 to 12/9999)
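A brief sketch of this month/year pattern in use, with a few sample inputs chosen for illustration:

```javascript
const monthYear = /^(0?[1-9]|1[012])\/([2-9][0-9]{3})$/;

// The two capture groups hold the month and the year.
const match = monthYear.exec("07/2018");
console.log(match[1], match[2]); // 07 2018

console.log(monthYear.test("7/2018"));  // true: the leading 0 is optional
console.log(monthYear.test("13/2018")); // false: 13 is not a month
console.log(monthYear.test("07/1999")); // false: the year must start with 2-9
console.log(monthYear.test("00/2018")); // false
```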

0
Jan 23 '18 at 7:24

This is an IP range regular expression builder.
It is free and automatically generates the regular expression needed for a range of IP addresses.

https://www.analyticsmarket.com/freetools/ipregex/

0
Jun 20 '19 at 17:17

To solve this problem you can use /^[0-1][0-9]$/, and if you want only 01 to 12 you need to check two more conditions.

Reject the value 00 with an if:

 if (thevalue == "00") {
   // message to user... not allowed
 }

and reject values of 13 or more:

 if (thevalue >= 13) {
   // message to user... not allowed
 }

Sample code in JavaScript:

 function CheckMonth(txtBox) {
   var ex = /^[0-1][0-9]$/;
   var value = txtBox.value.trim();
   if (value != "") {
     // Reject 00, anything that is not two digits in 00-19, and 13 or above.
     if (value == "00" || ex.test(value) == false || parseInt(value, 10) >= 13) {
       alert('Please enter valid numbers.');
       txtBox.value = "";
       txtBox.focus();
     }
   }
 }
-3
Oct 27 '14 at 11:22