I am trying to store a large list of strings in a compressed form so that they can be quickly analyzed or executed.
A target acyclic word graph (DAWG) is ideal for this purpose. However, I do not have a list of strings to include in the first place, so it should be built in stages. Also, when I look at it for a string, I need to return the data associated with the result (and not just a Boolean statement, if present).
I found the DAWG modification information for tracking string data here: http://www.pathcom.com/~vadco/adtdawg.html This looks extremely, extremely complicated, and I'm not sure that I am able to write it.
I also found several research papers that describe incremental construction algorithms, although I found that research papers are generally not very useful.
I donβt think I'm advanced enough to combine both of these algorithms myself. Is there documentation of an algorithm that already has these features, or an alternative algorithm with good memory usage and speed?
source
share