Matching substrings from a dictionary against another string: suggestions?

Question

Hello Stack Overflow. I would like some suggestions on the following problem. I am using Java.

I have array #1 containing a number of strings. For example, two of the strings could be: "An apple fell on Newton’s head" and "Apples grow on trees."

On the other hand, I have another array #2 with terms such as (Fruits => Apple, Orange, Peach; Items => Pen, Book; ...). I'll call this array my "dictionary".

By comparing the strings in array #1 against the dictionary, I need to see which category from #2 each string from #1 falls under. For example, both strings from #1 would fall under Fruits.

My most important consideration is speed. I need to do these operations quickly. A structure providing constant-time lookup would be good.

I read about HashSet and its contains() method, but it does not match substrings. I also tried running a regex (apple|orange|peach|... etc.) with the case-insensitive flag, but I read that it will not be fast as the number of terms increases (a minimum of 200 is expected). Finally, I searched around and am considering an ArrayList with indexOf(), but I don't know about its performance. I also need to know which of the terms actually matched, so in this case it would be "Apple".
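To make the HashSet limitation concrete, here is a minimal sketch, assuming a small hypothetical term list (the names `NaiveScan` and `findTerms` are illustrative only). `HashSet.contains` tests whole-element equality, so it cannot find a term inside a sentence. A plain linear scan with `String.contains` does find it and reports which term matched, at the cost of one pass over the sentence per term.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class NaiveScan {
    // Returns every term that occurs as a substring of the sentence, ignoring case.
    // Assumption: the terms passed in are already lowercase.
    static List<String> findTerms(String sentence, List<String> lowercaseTerms) {
        String haystack = sentence.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String term : lowercaseTerms) {
            if (haystack.contains(term)) {
                hits.add(term);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("apple", "orange", "peach");
        Set<String> termSet = new HashSet<>(terms);
        String sentence = "Apples grow on trees.";

        // Whole-element equality: the sentence is not itself a term in the set.
        System.out.println(termSet.contains(sentence));
        // Substring scan: looks for each term inside the lower-cased sentence.
        System.out.println(findTerms(sentence, terms));
    }
}
```

With around 200 terms this baseline is simple and may well be fast enough, but its cost grows with the number of terms times the sentence length.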

Please provide your opinions, ideas and suggestions on this issue.

Aho-Corasick [...]
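For context, Aho-Corasick builds a trie of all dictionary terms, adds failure links between trie nodes, and then scans each sentence once, reporting every term it passes through. The following is a minimal sketch of that idea and not the implementation the asker used; the class name `AhoCorasickSketch` and the term list are hypothetical, and lower-casing both terms and text is an assumption made to get case-insensitive matching.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Queue;

class AhoCorasickSketch {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;                                   // longest proper suffix that is also a trie path
        final List<String> outputs = new ArrayList<>(); // terms ending at this node
    }

    private final Node root = new Node();

    AhoCorasickSketch(List<String> terms) {
        // Step 1: build a trie of the lower-cased terms.
        for (String term : terms) {
            Node node = root;
            for (char c : term.toLowerCase(Locale.ROOT).toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.outputs.add(term);
        }
        // Step 2: breadth-first pass to set failure links.
        Queue<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            for (Map.Entry<Character, Node> e : current.next.entrySet()) {
                Node child = e.getValue();
                Node f = current.fail;
                while (f != root && !f.next.containsKey(e.getKey())) {
                    f = f.fail;
                }
                Node target = f.next.get(e.getKey());
                child.fail = (target != null) ? target : root;
                // Inherit terms that end at the failure target.
                child.outputs.addAll(child.fail.outputs);
                queue.add(child);
            }
        }
    }

    // Returns every dictionary term found in the text, in order of where each match ends.
    List<String> search(String text) {
        List<String> hits = new ArrayList<>();
        Node node = root;
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            while (node != root && !node.next.containsKey(c)) {
                node = node.fail;
            }
            node = node.next.getOrDefault(c, root);
            hits.addAll(node.outputs);
        }
        return hits;
    }

    public static void main(String[] args) {
        AhoCorasickSketch matcher =
                new AhoCorasickSketch(List.of("Apple", "Orange", "Peach", "Pen", "Book"));
        System.out.println(matcher.search("Apples grow on trees."));
    }
}
```

The automaton is built once for the whole dictionary. Each sentence is then scanned in a single pass, however many terms there are.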

[...] Qaru people! :)

+5

java nlp

Inf.S, 06 Jan '10 15:30

3

Nathan Hughes · Answer 1 · 2010-01-06T15:55:12+0000

Google [...] ([...] {"Fruits" => [Apple]}, {"Apple" => ["Fruits"]}). [...]
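The two map shapes in this answer are inverses of each other: one maps a category to its terms, the other maps a term to its categories. A minimal sketch of that inversion follows, assuming the dictionary is held as a `Map<String, List<String>>`. The class and method names are hypothetical, and lower-casing the keys is an assumption made here to allow case-insensitive lookups.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class InvertedDictionary {
    // Turns category -> terms into (lower-cased) term -> categories.
    static Map<String, Set<String>> invert(Map<String, List<String>> categoryToTerms) {
        Map<String, Set<String>> termToCategories = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : categoryToTerms.entrySet()) {
            for (String term : entry.getValue()) {
                termToCategories
                        .computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new LinkedHashSet<>())
                        .add(entry.getKey());
            }
        }
        return termToCategories;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dictionary = new LinkedHashMap<>();
        dictionary.put("Fruits", List.of("Apple", "Orange", "Peach"));
        dictionary.put("Items", List.of("Pen", "Book"));

        Map<String, Set<String>> inverted = invert(dictionary);
        // Look up the categories for a term that was found in a sentence.
        System.out.println(inverted.get("apple"));
    }
}
```

Once a term has been found in a sentence, a single hash lookup on the inverted map returns its categories.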

MattK · Answer 2 · 2010-01-06T15:41:22+0000

[...]? O(m), where m is [...]; O(n²) [...]. [...] BioJava [...].

[...] O(n²). [...]

Hans-peter störr · Answer 3 · 2010-01-06T16:04:28+0000

200 [...] #1 [...]

[...]: #2 [...] #1.

(Regular expressions are compiled into a state machine: for each character of the input string, it simply looks up the next state in a table. If the regular expression is complex, you might get backtracking that increases the running time, but your regular expression has a very simple structure.)
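As a rough illustration of the regex approach, the sketch below compiles all terms into one case-insensitive alternation and reports which term matched, together with its category. The class name `RegexTermMatcher`, the term-to-category map and its contents are assumptions for the example. It expects a non-empty map with lowercase keys, and `Pattern.quote` guards against terms that contain regex metacharacters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexTermMatcher {
    private final Map<String, String> termToCategory;
    private final Pattern pattern;

    // Assumption: keys are lowercase terms, values are category names.
    RegexTermMatcher(Map<String, String> lowercaseTermToCategory) {
        this.termToCategory = lowercaseTermToCategory;
        String alternation = lowercaseTermToCategory.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Compile once and reuse for every sentence.
        this.pattern = Pattern.compile(alternation, Pattern.CASE_INSENSITIVE);
    }

    // Returns "term=category" for each match, in order of appearance.
    List<String> match(String sentence) {
        List<String> hits = new ArrayList<>();
        Matcher m = pattern.matcher(sentence);
        while (m.find()) {
            String term = m.group().toLowerCase(Locale.ROOT);
            hits.add(term + "=" + termToCategory.get(term));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> terms = new LinkedHashMap<>();
        terms.put("apple", "Fruits");
        terms.put("orange", "Fruits");
        terms.put("pen", "Items");

        RegexTermMatcher matcher = new RegexTermMatcher(terms);
        System.out.println(matcher.match("Apples grow on trees."));
    }
}
```

Compiling the pattern once, outside the loop over sentences, matters more for speed than the number of alternatives at this dictionary size.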
