Specifically, this is about confidence in the various replication solutions: knowing that you could fail over to another server without data loss, or, in a master-master situation, that you would find out within a reasonable amount of time if one of the databases had fallen out of sync.
Are there any tools for this, or do people generally rely on the replication system itself to warn of inconsistencies? I'm currently most familiar with PostgreSQL WAL shipping in a master-standby setup, but I'm considering a master-master setup with something like PgPool. However, since that solution is a bit further removed from PostgreSQL itself (my basic understanding is that it provides the connection the application uses, intercepting the various SQL statements and sending them to every server in its pool), it got me thinking more seriously about actually verifying data consistency.
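To make the kind of check I have in mind concrete, here is a rough sketch in Python with psycopg2. The connection strings and table names are just placeholders, and it naively hashes each whole table in primary-key order on both servers and compares the results:

    import psycopg2

    # Hypothetical table names -- in reality this list would come from
    # information_schema or a config file.
    TABLES = ["accounts", "orders"]

    # Hash every row (cast to text) in primary-key order, so both servers
    # hash the same logical content regardless of physical row order.
    # Assumes each table has an integer "id" primary key.
    HASH_SQL = "SELECT md5(string_agg(t::text, '|' ORDER BY t.id)) FROM {table} t"

    def table_hashes(dsn):
        hashes = {}
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            for table in TABLES:
                cur.execute(HASH_SQL.format(table=table))
                hashes[table] = cur.fetchone()[0]
        return hashes

    # Placeholder connection strings for the two servers in the pool.
    server_a = table_hashes("host=db1 dbname=app")
    server_b = table_hashes("host=db2 dbname=app")

    for table in TABLES:
        status = "OK" if server_a[table] == server_b[table] else "MISMATCH"
        print(table, status)

That obviously won't scale to my table sizes as written, which is part of what I'm asking about.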
Specific Requirements:
I'm not talking about just table structure. I want to know that the actual row data is the same, so that I'd know if records were corrupted or missed (in which case I would re-initialize the bad database from a recent backup plus WAL files before bringing it back into the pool).
The databases are around 30-50 GB, so I doubt that brute-force SELECT queries would work very well (see the chunked sketch after this list).
I don't see the need for real-time checking (though it would, of course, be nice). Hourly or even daily would be better than nothing.
Block-level checking won't work; these would be two databases with independent storage.
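For the size problem, what I was imagining is something like hashing rows in primary-key ranges, so the comparison can run incrementally on a schedule and a mismatch is narrowed down to one range instead of a whole 30-50 GB table. Again just a sketch with placeholder names:

    import psycopg2

    # Placeholder table/column names: a hypothetical "orders" table with an
    # integer "id" primary key.
    CHUNK_SQL = """
        SELECT md5(string_agg(t::text, '|' ORDER BY t.id))
        FROM orders t
        WHERE t.id >= %s AND t.id < %s
    """

    def chunk_hashes(dsn, max_id, chunk_size=100000):
        """Return {range_start: hash} for each primary-key range."""
        hashes = {}
        with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
            for start in range(0, max_id + 1, chunk_size):
                cur.execute(CHUNK_SQL, (start, start + chunk_size))
                hashes[start] = cur.fetchone()[0]
        return hashes

    # Placeholder DSNs; max_id would really be read from the table itself.
    a = chunk_hashes("host=db1 dbname=app", max_id=5000000)
    b = chunk_hashes("host=db2 dbname=app", max_id=5000000)

    bad_ranges = [start for start in a if a[start] != b[start]]
    print("mismatched id ranges:", bad_ranges or "none")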
Or is this type of verification simply not realistic?
David ackerman