Hello, I have a 32 MB file. It is a simple dictionary file encoded in Windows-1250, with 2.8 million lines. Each line contains a single unique word:
cat
dog
god
...
I want to use Lucene to find every anagram of a given word in the dictionary. For instance:
I want to find every anagram of the word dog, and Lucene should search my dictionary and return dog and god. In my webapp, I have a Word entity:
public class Word {
    private Long id;
    private String word;
    private String baseLetters;
    private String definition;
}
Here baseLetters holds the word's letters sorted alphabetically and is used to look up anagrams [the words god and dog have the same baseLetters: dgo]. I have already managed to find anagrams in my database using baseLetters in a different service, but I am having trouble creating an index from my dictionary file. I know which fields I need to add:
word and baseLetters, but I have no idea how to do it :( Can someone point me in the right direction?
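For illustration, the baseLetters key described above can be computed by sorting a word's characters. This is a minimal sketch, not the code from the webapp: the class name AnagramKey and the lowercasing step are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper (not part of the original code): builds the
// sorted-letter key that all anagrams of a word share.
final class AnagramKey {
    private AnagramKey() {}

    /** Returns the word's letters in sorted order; "dog" and "god" both give "dgo". */
    static String baseLetters(String word) {
        // Lowercasing is an assumption, so that "Dog" and "god" get the same key.
        char[] letters = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(letters); // sorts by UTF-16 code unit, which is consistent for every word
        return new String(letters);
    }
}
```

Two words are anagrams of each other exactly when their keys are equal, so an exact-match lookup on this key is enough to find them.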
So far I only have this:
public class DictionaryIndexer {

    private static final Logger logger = LoggerFactory.getLogger(DictionaryIndexer.class);

    @Value("${dictionary.path}")
    private String dictionaryPath;

    @Value("${lucene.search.indexDir}")
    private String indexPath;

    public void createIndex() throws CorruptIndexException, LockObtainFailedException {
        try {
            IndexWriter indexWriter = getLuceneIndexer();
            createDocument();
        } catch (IOException e) {
            logger.error(e.getMessage(), e);
        }
    }

    private IndexWriter getLuceneIndexer() throws CorruptIndexException, LockObtainFailedException, IOException {
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_36);
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_36, analyzer);
        indexWriterConfig.setOpenMode(OpenMode.CREATE_OR_APPEND);
        Directory directory = new SimpleFSDirectory(new File(indexPath));
        return new IndexWriter(directory, indexWriterConfig);
    }

    private void createDocument() throws FileNotFoundException {
        File sjp = new File(dictionaryPath);
        Reader reader = new FileReader(sjp);
        Document dictionary = new Document();
        dictionary.add(new Field("word", reader));
    }
}
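The code above hands the whole file to a single Document. Since the goal is one entry per word, the file would more likely need to be read line by line. The sketch below shows only that reading step, using the standard library; the class name DictionaryLines and the method forEachWord are hypothetical, and the handler is the place where a Lucene Document with the word and baseLetters fields could be built and passed to IndexWriter.addDocument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.function.BiConsumer;

// Hypothetical helper (not part of the original code): streams a
// one-word-per-line dictionary and reports each word with its sorted-letter key.
final class DictionaryLines {
    private DictionaryLines() {}

    /** Calls handler.accept(word, baseLetters) once per non-empty line. */
    static void forEachWord(Reader source, BiConsumer<String, String> handler) {
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();
                if (word.isEmpty()) {
                    continue; // skip blank lines
                }
                char[] letters = word.toCharArray();
                Arrays.sort(letters);
                handler.accept(word, new String(letters));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a Windows-1250 file, the Reader could be created as new InputStreamReader(new FileInputStream(dictionaryPath), Charset.forName("windows-1250")), whereas FileReader uses the platform default charset. Processing one line at a time also avoids holding all 2.8 million words in memory.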
PS: One more question. If I register DictionaryIndexer as a Spring bean, will the index be created or appended to every time I redeploy my webapp? And will the same apply to the future DictionarySearcher?