I am writing a Java application that reads only plain-text files (.txt). These files can contain more than 120,000 words.
The application needs to store all 120,000+ words. It should label them word_1, word_2, and so on, and it needs to access these words in order to call various methods on them.
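To make the storage requirement concrete, here is a minimal sketch of one way it could look with a single ArrayList. The class name `WordStore` and the methods `load`, `word` and `size` are hypothetical names chosen for illustration. The sketch assumes that words are separated by whitespace, that punctuation is left attached to the words, and that word_n simply maps to list index n - 1.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class WordStore {
    // All words in file order; word_1 lives at index 0.
    private final List<String> words = new ArrayList<>();

    /** Reads whitespace-separated tokens from the source, keeping their order. */
    public static WordStore load(Reader source) throws IOException {
        WordStore store = new WordStore();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                for (String token : line.trim().split("\\s+")) {
                    // A blank line splits into one empty token; skip it.
                    if (!token.isEmpty()) {
                        store.words.add(token);
                    }
                }
            }
        }
        return store;
    }

    /** Returns word_n using the 1-based numbering from the question. */
    public String word(int n) {
        return words.get(n - 1);
    }

    public int size() {
        return words.size();
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the .txt file here (assumption for the demo).
        WordStore store = WordStore.load(
                new StringReader("the quick brown fox\njumps over the lazy dog"));
        System.out.println(store.size());  // prints 9
        System.out.println(store.word(4)); // prints fox
    }
}
```

For a real file, a `java.io.FileReader` or `java.nio.file.Files.newBufferedReader` could be passed to `load` in place of the `StringReader`.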
All of the methods operate on strings. For example, one method will report how many letters are in word_80. Another will report which specific letters occur in word_2200.
In addition, some methods will compare two words. For example, one method will compare word_80 with word_2200 and return the one that has more letters. Another will compare word_80 with word_2200 and return the specific letters that differ between the two words.
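The four operations described above might be sketched roughly as follows. The class `WordOps` and all of its method names are hypothetical. Two assumptions are made: letters are compared case-insensitively, and "letters that differ" is read as the letters that occur in exactly one of the two words (other readings are possible).

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class WordOps {
    /** Counts the letters in a word, ignoring digits and punctuation. */
    public static int letterCount(String word) {
        int count = 0;
        for (int i = 0; i < word.length(); i++) {
            if (Character.isLetter(word.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    /** Returns the distinct letters of a word, lowercased and sorted. */
    public static Set<Character> letters(String word) {
        Set<Character> result = new TreeSet<>();
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (Character.isLetter(c)) {
                result.add(c);
            }
        }
        return result;
    }

    /** Returns the word with more letters; on a tie, returns the first. */
    public static String longer(String a, String b) {
        return letterCount(b) > letterCount(a) ? b : a;
    }

    /** Returns the letters found in exactly one of the two words (assumed meaning). */
    public static Set<Character> differingLetters(String a, String b) {
        Set<Character> onlyInA = letters(a);
        onlyInA.removeAll(letters(b));
        Set<Character> onlyInB = letters(b);
        onlyInB.removeAll(letters(a));
        onlyInA.addAll(onlyInB);
        return onlyInA;
    }

    public static void main(String[] args) {
        System.out.println(letterCount("apple"));              // prints 5
        System.out.println(letters("apple"));                  // prints [a, e, l, p]
        System.out.println(longer("apple", "plum"));           // prints apple
        System.out.println(differingLetters("apple", "plum")); // prints [a, e, m, u]
    }
}
```

None of these methods depends on how the words are stored; they only need the two strings, so they would work the same with any collection type.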
My question is: since I work almost exclusively with Strings, is it better to store these words in one big ArrayList? A few small ArrayLists? Or should I use one of the many other storage options, such as Vectors, HashSets, or LinkedLists?
My two main concerns are: 1) access speed and 2) having the largest possible number of built-in methods at my disposal.
Thank you for your help!
Wow! Thanks to everyone for the quick answers to my question. All of your suggestions have really helped me. I am thinking through all the options presented in your comments.
Please forgive me for any vagueness, and let me address your questions:
Q) English?
A) The text files are actually books written in English. A word in another language would be rare, but not impossible. I'd put the percentage of non-English words in the text files at .0001%.
Q) Homework?
A) I look at my question now and smile. Yes, it does look like a school assignment. But no, this is not homework.
Q) Duplicates?
A) Yes. Probably every fifth word or so, considering conjunctions, articles, etc.
Q) Access?
A) Both random and sequential. It is certainly possible that a method will look up a word at random. It is equally possible that a method will want to search sequentially for a matching word between word_1 and word_120000. Which leads to the last question...
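The two access patterns could look roughly like this on a `List<String>`. The class `WordAccess` and the method `firstPosition` are hypothetical names, and the sketch assumes the 1-based word_n numbering and a case-insensitive match. With an `ArrayList`, `get` runs in constant time, whereas `LinkedList.get` has to walk the list to reach the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WordAccess {
    /**
     * Sequentially searches word_from through word_to (inclusive, 1-based)
     * and returns the position of the first match, or -1 if none is found.
     */
    public static int firstPosition(List<String> words, String target, int from, int to) {
        for (int n = from; n <= to; n++) {
            if (words.get(n - 1).equalsIgnoreCase(target)) {
                return n;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(
                Arrays.asList("the", "cat", "and", "the", "dog"));

        // Random access: word_3 is at index 2.
        System.out.println(words.get(2));                         // prints and

        // Sequential search, starting after word_1.
        System.out.println(firstPosition(words, "the", 2, 5));    // prints 4
        System.out.println(firstPosition(words, "bird", 1, 5));   // prints -1
    }
}
```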
Q) Iterate over the whole list?
A) Yes.
In addition, I plan to extend this program to run many other methods on the words. Again, I apologize for my vagueness. (Details make a world of difference, right?)
Cheers!
java storage
user63157