Regular expression - matching word only once per line

Question


Given these two lines:

ehello goodbye hellot hello goodbye
ehello goodbye hello hello goodbye

I want to match line 1 (it contains the word “hello” only once) and I do NOT want to match line 2 (it contains “hello” more than once).

I tried using a negative lookahead and the like, without any real success.

+7

regex regex-negation

user1135229 Jan 6 '12 at 21:39


3 answers

A generic regex, which matches only if no word in the string is repeated, would be:

 ^(?:\b(\w+)\b\W*(?!.*?\b\1\b))*\z

Although it could be cleaner to invert the result of this match, which finds a repeated word:

 \b(\w+)\b(?=.*?\b\1\b)

This works by matching a word and capturing it, and then using a lookahead with a backreference to check whether the same word does (or does not) appear again anywhere later in the line.
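As a rough illustration, here is how the two patterns above might be tried in Python. The question names no language, so Python is an assumption, and the function names are made up for this sketch. Note that both patterns test whether any word repeats, not "hello" specifically, and that Python's `re` module spells the end-of-string anchor `\Z`.

```python
import re

# Inverted form: finds a word that occurs again later in the same line.
REPEATED_WORD = re.compile(r"\b(\w+)\b(?=.*?\b\1\b)")

# Direct form: matches only if no word is followed by a later copy of itself.
# Python's re module writes the end-of-string anchor as \Z.
ALL_UNIQUE = re.compile(r"^(?:\b(\w+)\b\W*(?!.*?\b\1\b))*\Z")


def has_repeated_word(line):
    """True if some word appears at least twice in the line."""
    return REPEATED_WORD.search(line) is not None


def all_words_unique(line):
    """True if every word in the line appears exactly once."""
    return ALL_UNIQUE.match(line) is not None


print(has_repeated_word("hello world"))        # no word repeats here
print(has_repeated_word("hello world hello"))  # "hello" repeats here
```

On the sample lines from the question both helpers would report a repeat, because "goodbye" occurs twice in each line.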

+2

Qtax Jan 6 '12 at 21:51


Since you are only concerned with words (i.e., tokens separated by whitespace), you can simply split on whitespace and count how often "hello" appears. Since you did not specify a language, here is an implementation in Perl:

 use strict;
 use warnings;

 my $a1 = "ehello goodbye hellot hello goodbye";
 my $a2 = "ehello goodbye hello hello goodbye";

 # split each line into whitespace-separated tokens
 my @arr1 = split(/\s+/, $a1);
 my @arr2 = split(/\s+/, $a2);

 # grab the number of times that "hello" appears
 my $num_hello1 = scalar(grep { $_ eq "hello" } @arr1);
 my $num_hello2 = scalar(grep { $_ eq "hello" } @arr2);

 print "$num_hello1, $num_hello2\n";

Output:

 1, 2

+1

user554546 Jan 6 '12 at 21:48


Kobi · Accepted Answer · 2012-01-06T21:45:33+0000

A simple option is this (using the multiline flag, and not dot-all):

^(?!.*\bhello\b.*\bhello\b).*\bhello\b.*$

First, make sure you don't have “hello” twice, and then check that you have it at least once. There are other ways to test the same thing, but I think this one is pretty simple.
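A minimal sketch of how this pattern could be applied in Python, assuming the input is one string with a line per row (the variable names are illustrative, and the sample text reuses the two strings from the Perl answer). `re.MULTILINE` makes `^` and `$` anchor at each line, and leaving out `re.DOTALL` keeps `.` from crossing a newline.

```python
import re

# The negative lookahead rejects lines with two whole-word "hello"s;
# the rest of the pattern requires at least one.
ONCE = re.compile(r"^(?!.*\bhello\b.*\bhello\b).*\bhello\b.*$", re.MULTILINE)

text = (
    "ehello goodbye hellot hello goodbye\n"
    "ehello goodbye hello hello goodbye"
)

# The pattern has no capture groups, so findall returns each matching line.
lines_with_one_hello = ONCE.findall(text)
print(lines_with_one_hello)
```

Only the first line should come back: "ehello" and "hellot" do not count because of the `\b` word boundaries, so that line contains the word exactly once.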

Of course, you can just match for \bhello\b and count the number of matches ...
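The counting alternative could be sketched like this, again in Python as an assumption; `has_exactly_one` is a hypothetical helper name.

```python
import re


def has_exactly_one(line, word="hello"):
    # re.escape guards against regex metacharacters in `word`;
    # \b on both sides restricts the count to whole-word occurrences.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, line)) == 1


print(has_exactly_one("ehello goodbye hellot hello goodbye"))
print(has_exactly_one("ehello goodbye hello hello goodbye"))
```

This trades the single-regex solution for a small amount of code around the match, which may be easier to read and to adapt to other counts.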
