Wednesday, February 9, 2011

How can I check if at least one of two subexpressions in a regular expression matches?

I am trying to match floating-point decimal numbers with a regular expression. There may or may not be digits before the decimal point, the decimal point may or may not be present, and if it is present it may or may not have digits after it. (For this application, a leading +/- or a trailing exponent such as "E123" is not allowed.) I have written this regex:

/^([\d]*)(\.([\d]*))?$/

This correctly matches the following:

1
1.
1.23
.23

However, it also matches the empty string and a string consisting of just a decimal point, neither of which I want.

Currently I am checking after running the regex that $1 or $3 has length greater than 0. If not, it is not valid. Is there a way I can do this directly in the regex?
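For reference, the post-match check described above could look roughly like the sketch below. It is written in Python rather than the Perl-style syntax of the question, and the function name `is_valid_number` is made up for illustration. Note that Python reports a group that did not participate in the match as None, so the sketch treats that case as an empty string.

```python
import re

# The pattern from the question, unchanged apart from Python's raw-string syntax.
PATTERN = re.compile(r"^([\d]*)(\.([\d]*))?$")

def is_valid_number(text):
    """Hypothetical helper: match first, then require digits on at least one side."""
    m = PATTERN.match(text)
    if m is None:
        return False
    int_part = m.group(1)
    # Group 3 is None when there is no decimal point at all.
    frac_part = m.group(3) or ""
    return len(int_part) > 0 or len(frac_part) > 0

for sample in ["1", "1.", "1.23", ".23", "", "."]:
    print(repr(sample), is_valid_number(sample))
```

The answers below fold this second step into the pattern itself.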

  • I think this will do what you want. It either starts with a digit, in which case the decimal point and digits after it are optional, or it starts with a decimal point, in which case at least one digit is mandatory after it.

    /^(\d+(\.\d*)?|\.\d+)$/
    
    Patrick McElhaney : What sorry regex engine doesn't support +? I took the liberty of changing the code to what you would have had if you knew for sure + was supported.
    Glomek : I was thinking of some versions of grep, sed, and ed, but the original poster is probably using something more recent. It's one of those habits that you'll see in people who started with Unix systems that didn't use GNU tools. I also use zcat instead of the -z option to tar.
    From Glomek
  • Create a regular expression for each case and OR them. Then you only need to test whether the expression matches.

    /^((\d+(\.\d*)?)|(\d*\.\d+))$/
    
    From tvanfosson
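One detail worth checking when two alternatives are combined with ^ and $: alternation has the lowest precedence in a regex, so the alternatives need to sit inside a group for both anchors to apply to each branch. The sketch below illustrates this in Python (an assumption; the thread itself uses Perl-style syntax). The names `GROUPED`, `UNGROUPED`, and `matches` are hypothetical, and `search` is used to mimic an unanchored match operator.

```python
import re

# Alternatives wrapped in a non-capturing group, so ^ and $ apply to both branches.
GROUPED = re.compile(r"^(?:\d+(?:\.\d*)?|\.\d+)$")

# Without the group, this reads as (^\d+(\.\d*)?) OR (\.\d+$):
# each anchor constrains only one branch.
UNGROUPED = re.compile(r"^\d+(\.\d*)?|\.\d+$")

def matches(pattern, text):
    """Hypothetical helper: True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

for sample in ["1", "1.", "1.23", ".23", "", ".", "1abc", "abc.5"]:
    print(repr(sample), matches(GROUPED, sample), matches(UNGROUPED, sample))
```

With the grouped form, "1abc" and "abc.5" are rejected. The ungrouped form accepts both, because only one anchor applies to each branch.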
