edit: many thanks for all the answers. Here are the results after applying the optimizations so far:
- Switching to sorting characters and encoding path lengths - new size 42 MB
- Deleting Indexes in Bulers - New DBM 33M Size
The very nice part is that it did not require any changes to the iphone code
I have an iphone application with a large dictionary stored in sqlite format (read-only). I am looking for ideas to reduce the size of a DB file, which is currently very large.
Here is the number of records and the resulting sqlite DB size:
franks-macbook:DictionaryMaker frank$ ls -lh dictionary.db -rw-r--r-- 1 frank staff 59M 8 Oct 23:08 dictionary.db franks-macbook:DictionaryMaker frank$ wc -l dictionary.txt 453154 dictionary.txt
... an average of about 135 bytes per record.
Here is my DB schema:
create table words (word text primary key, sowpods boolean, twl boolean, signature text) create index sowpods_idx on words(sowpods) create index twl_idx on words(twl) create index signature_idx on words(signature)
Here are some sample data:
photoengrave|1|1|10002011000001210101010000 photoengraved|1|1|10012011000001210101010000 photoengraver|1|1|10002011000001210201010000 photoengravers|1|1|10002011000001210211010000 photoengraves|1|1|10002011000001210111010000 photoengraving|1|1|10001021100002210101010000
The last field is the frequency of letters to search for anagram (each position is in the range 0..9). Two booleans are sub dictionaries.
I need to make queries such as:
select signature from words where word = 'foo' select word from words where signature = '10001021100002210101010000' order by word asc select word from words where word like 'foo' order by word asc select word from words where word = 'foo' and (sowpods='1' or twl='1')
One of my ideas is more efficient coding of letter frequencies, for example. binaries encode them as blob (possibly with RLE, since there are many zeros?). Any ideas on how best to achieve this, or other ideas to reduce size? I create a database in ruby and read it over the phone in lens C.
Also, is there a way to get database statistics so that I can see what uses the most space?
ruby sqlite iphone compression
frankodwyer
source share