Case-insensitive regular expressions

What is the best way to use regular expressions with options (flags) in Haskell?

I'm using Text.Regex.PCRE. The documentation lists some interesting options, such as compCaseless, compUTF8, ... But I don't know how to use them with (=~).

3 answers

All the Text.Regex.* modules make heavy use of type classes. These exist for extensibility and "overloading"-like behavior, but they make usage less obvious from the types alone.

You have probably started from the basic =~ matcher.

    (=~)  :: ( RegexMaker Regex CompOption ExecOption source
             , RegexContext Regex source1 target )
          => source1 -> source -> target
    (=~~) :: ( RegexMaker Regex CompOption ExecOption source
             , RegexContext Regex source1 target, Monad m )
          => source1 -> source -> m target

To use =~ , there must be an instance of RegexMaker ... for the RHS (the pattern), and an instance of RegexContext ... for the LHS (the text being matched) and the result.

    class RegexOptions regex compOpt execOpt | ...
          | regex -> compOpt execOpt
          , compOpt -> regex execOpt
          , execOpt -> regex compOpt
    class RegexOptions regex compOpt execOpt
          => RegexMaker regex compOpt execOpt source
             | regex -> compOpt execOpt
             , compOpt -> regex execOpt
             , execOpt -> regex compOpt
      where
        makeRegex :: source -> regex
        makeRegexOpts :: compOpt -> execOpt -> source -> regex

A valid instance of all these classes (for example, regex=Regex , compOpt=CompOption , execOpt=ExecOption and source=String ) means it is possible to compile a regex with options compOpt,execOpt from some form of source . (Also, given some regex type, there is exactly one compOpt,execOpt pair that goes with it. Many different source types are fine, though.)

    class Extract source
    class Extract source
          => RegexLike regex source
    class RegexLike regex source
          => RegexContext regex source target
      where
        match :: regex -> source -> target
        matchM :: Monad m => regex -> source -> m target

A valid instance of all these classes (e.g. regex=Regex , source=String , target=Bool ) means it is possible to match a source against a regex to produce a target . (Other valid target types for this particular regex and source are Int , MatchResult String , MatchArray , etc.)
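To make the role of target concrete, here is a minimal sketch that matches the same text against the same pattern at several result types. The sample text and pattern are made up for illustration. The String and triple targets are not named in the answer above; they are other RegexContext instances provided by regex-base. The sketch assumes the regex-pcre package is installed.

```haskell
import Text.Regex.PCRE ((=~))

main :: IO ()
main = do
  let text = "foo bar foo"  -- illustrative input, not from the answer
      pat  = "fo+"          -- illustrative pattern
  -- Bool: is there any match at all?
  print (text =~ pat :: Bool)
  -- Int: how many non-overlapping matches are there?
  print (text =~ pat :: Int)
  -- String: the text of the first match (empty if none)
  print (text =~ pat :: String)
  -- Triple: (text before the match, the match, text after the match)
  print (text =~ pat :: (String, String, String))
```

The only thing that changes between the four lines is the type annotation; the RegexContext instance selected by that annotation decides what =~ returns.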

Put these together and it becomes clear that =~ and =~~ are simply convenience functions

    source1 =~ source
      = match (makeRegex source) source1
    source1 =~~ source
      = matchM (makeRegex source) source1

and also that =~ and =~~ leave no room for passing options to makeRegexOpts .

You can make your own

    (=~+) ::
       ( RegexMaker regex compOpt execOpt source
       , RegexContext regex source1 target )
       => source1 -> (source, compOpt, execOpt) -> target
    source1 =~+ (source, compOpt, execOpt)
      = match (makeRegexOpts compOpt execOpt source) source1
    (=~~+) ::
       ( RegexMaker regex compOpt execOpt source
       , RegexContext regex source1 target, Monad m )
       => source1 -> (source, compOpt, execOpt) -> m target
    source1 =~~+ (source, compOpt, execOpt)
      = matchM (makeRegexOpts compOpt execOpt source) source1

which can be used as

    "string" =~+ ("regex", compCaseless + compUTF8, execBlank) :: Bool

or replace =~ and =~~ with versions that can accept options

    -- The two instances below overlap, so this likely needs GHC extensions
    -- (multi-parameter classes, flexible, undecidable and overlapping instances).
    import Text.Regex.PCRE hiding ((=~), (=~~))

    class RegexSourceLike regex source where
        makeRegexWith :: source -> regex
    instance RegexMaker regex compOpt execOpt source
             => RegexSourceLike regex source where
        makeRegexWith = makeRegex
    instance RegexMaker regex compOpt execOpt source
             => RegexSourceLike regex (source, compOpt, execOpt) where
        makeRegexWith (source, compOpt, execOpt)
          = makeRegexOpts compOpt execOpt source

    source1 =~ source
      = match (makeRegexWith source) source1
    source1 =~~ source
      = matchM (makeRegexWith source) source1

or you can just use match , makeRegexOpts , etc. directly where needed.
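As a rough sketch of that last option, the following calls makeRegexOpts and match directly, with no custom operator. The helper name matchesCaseless and the sample strings are hypothetical. It assumes the regex-pcre package, and combines flags with .|. from Data.Bits, relying on CompOption being a bit-flag newtype with a Bits instance.

```haskell
import Data.Bits ((.|.))
import Text.Regex.PCRE

-- Hypothetical helper (not part of regex-pcre): does the pattern match
-- the text when case is ignored?
matchesCaseless :: String -> String -> Bool
matchesCaseless pat text = match regex text
  where
    -- Start from the library defaults and add the caseless flag.
    regex :: Regex
    regex = makeRegexOpts (defaultCompOpt .|. compCaseless) defaultExecOpt pat

main :: IO ()
main = do
  -- Compiled with compCaseless, so "foo" matches "FOO".
  print (matchesCaseless "(foo)" "FOO bar")   -- True
  -- Plain =~ uses the default options, which are case-sensitive.
  print ("FOO bar" =~ "(foo)" :: Bool)        -- False
```

Starting from defaultCompOpt rather than compBlank keeps whatever the library enables by default and only adds the extra flag.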


I believe you cannot use (=~) if you want a compOpt other than defaultCompOpt .

Something like this works:

 match (makeRegexOpts compCaseless defaultExecOpt "(Foo)" :: Regex) "foo" :: Bool 

The following two articles will help you:

Real World Haskell, chapter 8. Effective file processing, regular expressions and file name matching

Haskell Regular Expression Tutorial


I don't know anything about Haskell, but if you are using a regex library based on PCRE, you can use mode modifiers inside the regular expression itself. To match "caseless" in case-insensitive mode, you can use this regular expression in PCRE:

 (?i)caseless 

The mode modifier (?i) overrides any case sensitivity or case insensitivity option that was set outside the regular expression. It also works with operators that do not let you set any options.

Similarly, (?s) turns on "single line" mode, which makes the dot match line breaks, (?m) turns on "multi-line" mode, which makes ^ and $ match at line breaks, and (?x) turns on free-spacing mode (unescaped spaces and line breaks outside character classes are insignificant). You can combine letters: (?ismx) turns on everything. A hyphen turns options off: (?-i) makes the regex case sensitive, and (?x-i) turns on free-spacing mode while turning off case insensitivity.
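Applied to the Haskell question, this means the plain =~ operator can be used unchanged, with the flags carried inside the pattern string. A minimal sketch, assuming the regex-pcre package; the sample inputs are made up for illustration:

```haskell
import Text.Regex.PCRE ((=~))

main :: IO ()
main = do
  -- (?i) makes the rest of the pattern case-insensitive.
  print ("CaseLess" =~ "(?i)caseless" :: Bool)  -- True
  -- Without the modifier the match is case-sensitive.
  print ("CaseLess" =~ "caseless" :: Bool)      -- False
  -- (?s) lets the dot match a line break.
  print ("a\nb" =~ "(?s)a.b" :: Bool)           -- True
  -- By default the dot does not match a line break.
  print ("a\nb" =~ "a.b" :: Bool)               -- False
```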
