C# - The fastest way to find any one of many strings in another string

I need to check if a string contains any abusive words.

Following the advice of another question here, I created a HashSet containing the words:

HashSet<string> swearWords = new HashSet<string>() { "word_one", "word_two", "etc" };

Now I need to see if any of the values in swearWords are in my string.

I saw how this was done the other way around, for example:

swearWords.Contains(myString)

But this returns false unless myString is exactly one of the words.

What is the fastest way to check if any of the words in a HashSet are in myString?

NB: I suppose I could use a foreach loop to check each word in turn and break when a match is found; I'm just wondering if there is a faster way.

+5
5 answers

One option is a single regular expression that alternates over the words:

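// Escape each word so regex metacharacters in the list match literally (Select needs System.Linq).
Regex rx = new Regex("(" + string.Join("|", swearWords.Select(Regex.Escape)) + ")");
bool containsSwears = rx.IsMatch(myString);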
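
A slightly fuller sketch of the same regex idea, in case whole-word, case-insensitive matching is wanted. The class name `SwearRegex` and method `Build` are hypothetical, and the `\b` word boundaries and `RegexOptions` flags are assumptions about the desired behaviour, not something the answer specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SwearRegex
{
    // Builds one alternation pattern such as \b(?:word_one|word_two|etc)\b.
    // Assumes the word list is not empty.
    public static Regex Build(IEnumerable<string> words)
    {
        // Regex.Escape keeps characters like '.' or '+' inside a word literal.
        string alternation = string.Join("|", words.Select(Regex.Escape));
        // \b restricts matches to whole words; IgnoreCase handles "Word_One".
        return new Regex(@"\b(?:" + alternation + @")\b",
                         RegexOptions.IgnoreCase | RegexOptions.Compiled);
    }
}
```

With the word list from the question, `SwearRegex.Build(swearWords).IsMatch("fetch the ball")` is false, because "etc" only occurs inside "fetch", while a string containing "Word_Two" as a separate word matches. Building the Regex once and reusing it avoids recompiling the pattern for every input.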
+6

Using the IEnumerable<T> extension method Any:

var containsSwears = swearWords.Any(w => myString.Contains(w));

This works because HashSet<T> implements IEnumerable<T>.
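
As a rough illustration, the Any call can be wrapped in a helper that also ignores case. The names `SwearScan` and `ContainsAny` are hypothetical, and the choice of `StringComparison.OrdinalIgnoreCase` is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SwearScan
{
    // True if any listed word occurs anywhere in text, ignoring case.
    // Any() stops at the first hit, like a foreach loop with a break.
    public static bool ContainsAny(string text, IEnumerable<string> words)
    {
        return words.Any(w =>
            text.IndexOf(w, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

This is still a substring search, so a listed word such as "etc" is also found inside "fetch". The cost is one scan of the text per word in the worst case; the hashing in HashSet is not used here, only its enumeration.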

+9
+6

If you turned "myString" into an IEnumerable of words, couldn't you use "Overlaps"?

http://msdn.microsoft.com/en-us/library/bb355623(v=vs.90).aspx
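
A minimal sketch of that suggestion, assuming the input can be tokenized by splitting on a fixed set of separator characters. `SwearOverlap`, `ContainsAny`, and the separator list are hypothetical; `HashSet<T>.Overlaps` is the method the link describes.

```csharp
using System;
using System.Collections.Generic;

public static class SwearOverlap
{
    // Assumption: words are separated by these characters only.
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?' };

    public static bool ContainsAny(string text, HashSet<string> swearWords)
    {
        string[] tokens = text.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
        // Overlaps is true if the set shares at least one element with tokens;
        // each token is looked up by hash using the set's own comparer.
        return swearWords.Overlaps(tokens);
    }
}
```

Creating the set as `new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ... }` makes the lookup case-insensitive. Unlike a substring search, this only matches whole tokens, so "fetch" does not trigger on "etc".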

(P.S. ...)

: .

+3

, , .

  • , , input.Contains, ; "" , .
  • ( ..).
  • , , : ?

, , , , , .

var words = Regex.Split(myString, @"[^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Pc}\p{Lm}]");

The regular expression above is the standard \W class, modified so that digits no longer count as word characters and therefore also act as separators; for more information see http://msdn.microsoft.com/en-us/library/20bw873z.aspx . For other approaches, see this question and possibly the CodeProject link provided in the accepted answer.

Once the input string is split, you can iterate over words and replace the ones that match something in your list (use swearWords.Contains(word) to check), or just determine whether there are any matches with

var anySwearWords = words.Intersect(swearWords).Any();
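
For the replacement case, one possible sketch matches runs of the same word characters and masks those found in the set. `SwearMask` and `Mask` are hypothetical names, and replacing with asterisks is only an assumed policy.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SwearMask
{
    // Runs of the word characters used in the split pattern
    // (letters and connector punctuation such as '_', no digits).
    private static readonly Regex Word =
        new Regex(@"[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Pc}\p{Lm}]+");

    // Replaces every word found in swearWords with asterisks of the same length.
    public static string Mask(string text, HashSet<string> swearWords)
    {
        return Word.Replace(text, m =>
            swearWords.Contains(m.Value) ? new string('*', m.Value.Length) : m.Value);
    }
}
```

Matching words in place, rather than splitting and rejoining, keeps the original spacing and punctuation intact. Each word costs a single hash lookup, and the set's comparer decides whether the check is case-sensitive.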
+3