How to determine floating point number using regex

Question

What is a good regular expression for handling floating point numbers (e.g. a Java Float)?

The answer should match the following targets:

1) 1.
2) .2
3) 3.14
4) 5e6
5) 5e-6
6) 5E+6
7) 7.e8
8) 9.0E-10
9) .11e12

So it should:

ignore any preceding sign
require the first digit to the left of the decimal point to be non-zero
allow 0 or more digits on either side of the decimal point
allow a number without a decimal point
allow scientific notation
allow an upper- or lowercase 'e'
allow a positive or negative exponent

For those who are wondering, yes, this is a homework problem. We got it as an assignment in my graduate CS class on compilers. I have already turned in my answer for the class and will post it as an answer to this question.

[Epilogue] My solution did not get full credit because it did not handle more than 1 digit to the left of the decimal point. The assignment mentioned handling Java floats, even though none of the examples had more than 1 digit to the left of the decimal point. I will post the accepted answer in its own post.

+12

floating-point regex

Kelly S. French Feb 19 '10 at 2:48


7 answers

Just make both the decimal point and the E-then-exponent part optional:

 [1-9][0-9]*\.?[0-9]*([Ee][+-]?[0-9]+)?

I don't see why you don't want a leading [+-]? to capture a possible sign, but whatever! ;-)

Edit: in fact, there may be no digits to the left of the decimal point (in which case I believe there must be a decimal point and 1+ digits after it!), so a vertical bar (alternation) is needed:

 (([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))([Ee][+-]?[0-9]+)?
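One way to check the second pattern is to run it against the nine targets from the question. The sketch below uses Python's `re.fullmatch`; the choice of Python and the names `PATTERN`, `TARGETS` and `matches` are assumptions made for illustration, not part of the answer.

```python
import re

# The second pattern from this answer, copied verbatim.
PATTERN = re.compile(r"(([1-9][0-9]*\.?[0-9]*)|(\.[0-9]+))([Ee][+-]?[0-9]+)?")

# The nine targets listed in the question.
TARGETS = ["1.", ".2", "3.14", "5e6", "5e-6", "5E+6", "7.e8", "9.0E-10", ".11e12"]


def matches(text):
    """Return True when PATTERN matches the whole string."""
    return PATTERN.fullmatch(text) is not None


for target in TARGETS:
    print(target, matches(target))

# A bare "." and a leading zero are both rejected by this pattern.
print(matches("."), matches("0.5"))
```

Because `fullmatch` anchors both ends, no `^` or `$` is needed in the pattern itself.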

+23

Alex Martelli Feb 19 '10 at 2:53


http://www.regular-expressions.info/floatingpoint.html

+4

user230952 Feb 19 '10 at 4:27


Here is what I turned in.

 (([1-9]+\.[0-9]*)|([1-9]*\.[0-9]+)|([1-9]+))([eE][-+]?[0-9]+)?

To simplify the discussion, I will name the sections:

 ( ([1-9]+ \. [0-9]* ) | ( [1-9]* \. [0-9]+ ) | ([1-9]+)) ( [eE] [-+]? [0-9]+ )?
 -------------------------------------------------------- ----------------------
                            A                                       B

A: matches everything up to the e/E
B: matches the scientific notation

Breaking down A, we get three parts:

 ( ([1-9]+ \. [0-9]* ) | ( [1-9]* \. [0-9]+ ) | ([1-9]+) )
   ---------1---------   ---------2----------   ---3----

Part 1: allows 1 or more digits from 1 to 9, a decimal point, and 0 or more digits after the decimal point (target 1)
Part 2: allows 0 or more digits from 1 to 9, a decimal point, and 1 or more digits after the decimal point (target 2)
Part 3: allows 1 or more digits from 1 to 9 without a decimal point (see target 4)

Breaking down B, we get 4 basic parts:

 ( [eE] [-+]? [0-9]+ )?
   --1- --2-- --3--- -4-

Part 1: requires an upper- or lowercase 'e' for scientific notation (e.g. targets 8 and 9)
Part 2: allows an optional positive or negative sign for the exponent (e.g. targets 4, 5 and 6)
Part 3: allows 1 or more digits for the exponent (target 8)
Part 4: allows the scientific notation to be optional as a group (target 3)
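To see where this pattern falls short, it can be run against the nine targets plus a couple of multi-digit inputs. This is a minimal sketch in Python; the names `SUBMITTED`, `TARGETS` and `accepts`, and the extra inputs `12.5` and `10.5`, are assumptions chosen for illustration.

```python
import re

# The submitted pattern from this answer, copied verbatim.
SUBMITTED = re.compile(
    r"(([1-9]+\.[0-9]*)|([1-9]*\.[0-9]+)|([1-9]+))([eE][-+]?[0-9]+)?"
)

# The nine targets listed in the question.
TARGETS = ["1.", ".2", "3.14", "5e6", "5e-6", "5E+6", "7.e8", "9.0E-10", ".11e12"]


def accepts(text):
    """Return True when SUBMITTED matches the whole string."""
    return SUBMITTED.fullmatch(text) is not None


for target in TARGETS:
    print(target, accepts(target))

# [1-9]+ cannot consume a 0, so any zero to the left of the
# decimal point makes the whole match fail.
print(accepts("12.5"), accepts("10.5"))
```

All nine targets pass, which is consistent with the gap only showing up on inputs the examples did not cover.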

+2

Kelly S. French Feb 19 '10 at 2:57


 '([-+])?\d*(\.)?\d+(([eE]([-+])?)?\d+)?'

This is a regular expression I came up with when trying to solve this kind of problem in Matlab. It will not correctly detect numbers like (1.), but some additional changes may solve that... well, maybe the following would fix it:

 '([-+])?(\d+(\.)?\d*|\d*(\.)?\d+)(([eE]([-+])?)?\d+)?'
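The difference between the two patterns can be checked directly. The sketch below assumes Python's `re` module rather than Matlab, and the names `FIRST` and `SECOND` are made up for illustration; the patterns themselves are copied from this answer without the surrounding quotes.

```python
import re

# First attempt: requires at least one digit after the optional point.
FIRST = re.compile(r"([-+])?\d*(\.)?\d+(([eE]([-+])?)?\d+)?")

# Revised attempt: digits may appear on either side of the point.
SECOND = re.compile(r"([-+])?(\d+(\.)?\d*|\d*(\.)?\d+)(([eE]([-+])?)?\d+)?")

for text in ("1.", "7.e8", "-3.14"):
    # Print whether each pattern matches the whole string.
    print(
        text,
        FIRST.fullmatch(text) is not None,
        SECOND.fullmatch(text) is not None,
    )
```

Inputs with a trailing decimal point, such as `1.` and `7.e8`, fail against the first pattern and pass against the second.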

+1

Ivan Nov 27 '13 at 14:43


@Kelly S. French: the sign is not included because the parser adds it via the unary minus (negation), so it does not need to be detected as part of the float.

+1

Heiko Schäfer Apr 15 '14 at 20:03


@Kelly S. French, this regex matches all your test cases.

 ^[+-]?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$

Source: perldoc perlretut

+1

Ram Chandra Giri 05 Oct '17 at 10:41


Kelly S. French · Accepted Answer · 2010-03-24 17:04

[This is the answer from the professor]

Definition:

N = [1-9]
D = 0 | N
E = [eE] [+-]? D+
L = 0 | ( N D* )

Then floating point numbers can be matched with:

( ( L . D* | . D+ ) E? ) | ( L E )

It was also acceptable to use D+ rather than L, and to prepend [+-]?.

A common mistake was to write D* . D*, but this can match just ".".
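The definitions N, D, E and L can be composed into a single pattern by plain string substitution. The following is a sketch in Python, assuming the "." in the grammar denotes a literal decimal point; the names `FLOAT` and `MISTAKE` are hypothetical and only used here.

```python
import re

N = r"[1-9]"            # non-zero digit
D = r"[0-9]"            # any digit, i.e. 0 | N
E = rf"[eE][+-]?{D}+"   # exponent part
L = rf"(?:0|{N}{D}*)"   # left of the point: a lone 0, or no leading zero

# ( ( L . D* | . D+ ) E? ) | ( L E )
FLOAT = re.compile(rf"(?:(?:{L}\.{D}*|\.{D}+)(?:{E})?)|(?:{L}{E})")

# The common mistake D* . D*, which accepts a bare decimal point.
MISTAKE = re.compile(rf"{D}*\.{D}*")

for text in ("1.", ".2", "10.5", "5e6", ".", "5"):
    # Print whether the composed grammar matches the whole string.
    print(repr(text), FLOAT.fullmatch(text) is not None)

print(MISTAKE.fullmatch(".") is not None)
```

Under these definitions a bare integer such as `5` is not a float: a number needs either a decimal point or an exponent. Unlike the patterns built on `[1-9]+`, L also accepts multi-digit values such as `10.5`.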

[Edit]
Someone asked about a leading sign; I should have asked him why it was excluded, but never got the chance. Since this was part of the lecture on grammars, my guess is that either it made the problem easier (not likely), or there is a small detail in parsing where you divide up the problem so that the floating point value, regardless of sign, is the focus (possible).

If you are parsing an expression, e.g.

-5.04e-10 + 3.14159E10

the sign of a floating point value is part of the operation applied to the value, and not an attribute of the number itself. In other words,

subtract (5.04e-10)
add (3.14159E10)

to form the result of the expression. Although I'm sure mathematicians can argue about this, remember that this was from a parsing lecture.
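The idea that the sign belongs to the operation rather than the number can be illustrated with a tiny tokenizer. This is only a sketch in Python: the unsigned-float pattern and the names `UNSIGNED_FLOAT`, `TOKEN` and `tokenize` are assumptions for illustration, not something taken from the lecture.

```python
import re

# Hypothetical unsigned-float pattern, assumed for this sketch only.
# It deliberately has no leading [+-]?.
UNSIGNED_FLOAT = r"(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?"

# A token is either an unsigned float or a standalone + or - operator.
TOKEN = re.compile(rf"{UNSIGNED_FLOAT}|[+-]")


def tokenize(expression):
    """Return the floats and sign operators found in the expression.

    Characters that match neither alternative (such as spaces) are skipped.
    """
    return TOKEN.findall(expression)


print(tokenize("-5.04e-10 + 3.14159E10"))
# prints ['-', '5.04e-10', '+', '3.14159E10']
```

The leading minus comes out as its own token, while the minus inside `e-10` stays attached to the exponent, which is the split described above.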
