First, some background:
My Android application has a DB table with many four-column rows. It sends requests to the server, and the server responds only when all four of these values are "valid." Several thousand users reported that something was not working for them (they did not receive results from the server). I tried to find out what was causing the problem, and it turned out that the only possible reason was undetected DB corruption.
The ACRA logs contain some SQL error messages, but they are about the application being unable to open the file because it is corrupt. This gave me a hint, but I was still not sure that this was the problem. So I wrote a very simple Python script that changes random bytes in a DB file and checks how SQLite handles this:
    import random
    import sqlite3

    # Load the intact database file as a mutable sequence of bytes.
    with open('db', 'rb') as f:
        db = bytearray(f.read())

    # Reference rows from the intact database.
    # Note: TABLE is an SQL keyword, so substitute the real table name here.
    ta = list(sqlite3.connect('db').execute('SELECT * FROM table ORDER BY _id'))

    # results[0]: connections opened, results[1]: integrity checks reporting 'ok',
    # results[2]: SELECTs that ran without error, results[3]: SELECTs returning altered data
    results = [0, 0, 0, 0]
    tries = 1000

    for i in range(tries):
        work = db[:]
        # Overwrite 1 to 5 random bytes; repeat until the copy really differs.
        while work == db:
            for j in range(random.randint(1, 5)):
                work[random.randint(0, len(db) - 1)] = random.randint(0, 255)

        with open('outdb', 'wb') as f:
            f.write(work)

        try:
            c = sqlite3.connect('outdb')
            results[0] += 1
            for r in c.execute('PRAGMA integrity_check;'):
                results[1] += 1 if r[0] == 'ok' else 0
        except Exception:
            continue

        try:
            rows = list(c.execute('SELECT * FROM table ORDER BY _id'))
            results[3] += 1 if rows != ta else 0
            results[2] += 1
        except Exception:
            c.close()
            continue

    print('Results for ' + str(tries) + ' tests:')
    print('Creating connection failed ' + str(tries - results[0]) + ' times')
    print('Integrity check failed ' + str(results[0] - results[1]) + ' times')
    print('Running a SELECT * query failed ' + str(results[1] - results[2]) + ' times')
    print('Data was succesfully altered ' + str(results[3]) + ' times')
The results showed that "editing" table data in this way is quite possible:
    Results for 1000 tests:
    Creating connection failed 0 times
    Integrity check failed 503 times
    Running a SELECT * query failed 289 times
    Data was succesfully altered 193 times
In general, it is interesting that query execution failed for more than half of the changes that the integrity check did not detect. For me, though, the most interesting thing is that something can replace random bytes in my database, which makes my application useless for some of my users.
I have read about the possible causes of corruption on the SQLite website and on StackOverflow, and I know that, for example, force-closing the application can damage the database. I would just like to know whether it is possible to implement a fast and reliable database integrity check.
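For reference, the built-in check that the script relies on could be wrapped for use at startup roughly like this. It is only a minimal sketch: `database_looks_ok` is a hypothetical helper name, and the demo table `items` is made up for illustration.

```python
import os
import sqlite3
import tempfile


def database_looks_ok(path):
    """Return True only if SQLite can read the file and integrity_check reports 'ok'."""
    try:
        conn = sqlite3.connect(path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not recognised as a database.
        return False
    # A healthy database yields exactly one row containing the string 'ok'.
    return rows == [("ok",)]


# Demo on a throwaway database file (hypothetical schema).
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE items (_id INTEGER PRIMARY KEY, name TEXT)")
conn.commit()
conn.close()
print(database_looks_ok(path))
```

As the experiment shows, passing this check does not prove that the stored values are unaltered, so it can only serve as a baseline.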
At startup I read one column of the whole table (for autocomplete), so I thought about computing a hash over all of its values. I think this would work well, since some hash functions are designed precisely for integrity checking, but maybe there is a simpler, faster, and better solution. So I am asking whether you know of one.
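To make the hashing idea concrete, here is a minimal sketch in Python. The names `column_digest`, `items`, and `name` are hypothetical, and SHA-256 is just one possible choice of hash. The digest is computed while the column is being read anyway, then compared with a value saved after the last legitimate write.

```python
import hashlib
import sqlite3


def column_digest(conn, table, column, key="_id"):
    """Return (values, hex digest) for one column, read in a stable order."""
    h = hashlib.sha256()
    values = []
    # Identifiers cannot be bound as parameters, so they are interpolated;
    # this assumes the names come from the app itself, never from user input.
    query = 'SELECT "{0}" FROM "{1}" ORDER BY "{2}"'.format(column, table, key)
    for (value,) in conn.execute(query):
        values.append(value)
        encoded = repr(value).encode("utf-8")
        # Length-prefix each value so ("ab", "c") and ("a", "bc") hash differently.
        h.update(str(len(encoded)).encode("ascii") + b":" + encoded)
    return values, h.hexdigest()


# Demo with an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items (name) VALUES (?)", [("alpha",), ("beta",)])
values, stored = column_digest(conn, "items", "name")

# Simulate one value being silently altered.
conn.execute("UPDATE items SET name = 'bet4' WHERE _id = 2")
_, current = column_digest(conn, "items", "name")
print(stored == current)  # prints False
```

The catch with this approach is that the saved digest has to be recomputed on every legitimate write and kept somewhere that is unlikely to be damaged together with the data, for example in a separate file (an assumption, not something I have tested).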