Client-server database synchronization

I am looking for some general strategies for synchronizing data on a central server with client applications that are not always online.

In my specific case, I have an Android application with a SQLite database and a PHP web application with a MySQL database.

Users will be able to add and edit information in both the phone application and the web application. I need to make sure that changes made in one place are reflected everywhere, even when the phone cannot immediately contact the server.

I am not concerned with how to transfer data from the phone to the server or vice versa. I mention my specific technologies only because I cannot use, for example, the replication features available for MySQL.

I know that the problem of client-server data synchronization has been around for a long time, and I would like pointers (articles, books, advice, etc.) on patterns for solving it. I would like to learn about common synchronization strategies so that I can compare their strengths, weaknesses and trade-offs.

+61
sql database design-patterns client-server data-synchronization
Aug 04 2018-10-15T00:
4 answers

The first thing you need to decide is a general policy on which side is considered “authoritative” in the event of conflicting changes.

For example: suppose record #125 is changed on the server on January 5 at 10 pm, and the same record is changed on one of the phones (call it Client A) on January 5 at 11 pm. The last sync was on January 3. The user then reconnects on, say, January 8.

Determining what needs to be changed is “easy” in the sense that both the client and the server know the date of the last synchronization, so anything created or updated since then (see below for more on this) needs to be reconciled.

So, suppose that the only changed record is #125. You either decide that one of the two versions automatically “wins” and overwrites the other, or you support a reconciliation phase in which the user decides which version (server or client) is correct, overwriting the other.
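
The decision described so far can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function names `classify_changes` and `resolve`, the policy strings and the year used in the dates are all assumptions made for the example (Python is used only for brevity). It also assumes the timestamps on both sides are comparable, which in practice requires care because device clocks can drift.

```python
from datetime import datetime

def classify_changes(server_rows, client_rows, last_sync):
    """Split record ids by which side changed them since the last sync.

    server_rows and client_rows map record id -> last-modified datetime.
    """
    server_changed = {rid for rid, ts in server_rows.items() if ts > last_sync}
    client_changed = {rid for rid, ts in client_rows.items() if ts > last_sync}
    return {
        "pull": server_changed - client_changed,      # copy server -> client
        "push": client_changed - server_changed,      # copy client -> server
        "conflict": server_changed & client_changed,  # changed on both sides
    }

def resolve(policy, server_version, client_version):
    """Pick a winner by fiat, or return None to hand the conflict to the user."""
    if policy == "server_wins":
        return server_version
    if policy == "client_wins":
        return client_version
    return None  # "manual": the caller shows both versions to the user

# The scenario from the text; the year is arbitrary.
last_sync = datetime(2018, 1, 3)
server = {125: datetime(2018, 1, 5, 22), 126: datetime(2018, 1, 2)}
client = {125: datetime(2018, 1, 5, 23), 126: datetime(2018, 1, 2)}
plan = classify_changes(server, client, last_sync)
```

Here record #125 ends up in `plan["conflict"]` because both sides touched it after January 3, and `resolve` is where the "authoritative side" policy would be applied.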

This decision is extremely important, and you must weigh the "role" of the clients, especially if conflicts can arise not only between a client and the server, but also between different clients that can change the same record(s).

[Assuming that #125 could also be modified by a second client (Client B), there is a chance that Client B, which has not yet synchronized, will supply yet another version of the same record, making the earlier conflict resolution moot]

Regarding the "created or updated" point above: how can you correctly identify a record if it was created on one of the clients (assuming this makes sense in your problem domain)? Suppose your application manages a list of business contacts. If Client A says that you have to add a newly created John Smith, and the server already has a John Smith created yesterday by Client D, do you create two records because you cannot be sure that they are not different people? Do you ask the user to reconcile this conflict as well?
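
One common way to handle the identifier half of this problem, sketched here under stated assumptions, is to let each client generate a UUID for every new record so that offline inserts never collide. The `contacts` table and the helper names `new_contact` and `possible_duplicates` are hypothetical. Note that this only keeps the keys apart; whether the two John Smiths are the same person remains a question for the user or for a matching rule.

```python
import sqlite3
import uuid

def new_contact(conn, name):
    """Insert a contact under a client-generated UUID and return that id."""
    contact_id = str(uuid.uuid4())
    conn.execute("INSERT INTO contacts (id, name) VALUES (?, ?)", (contact_id, name))
    return contact_id

def possible_duplicates(conn, name):
    """Return ids of records sharing a name, as candidates for user review."""
    rows = conn.execute("SELECT id FROM contacts WHERE name = ? ORDER BY id", (name,))
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
id_a = new_contact(conn, "John Smith")  # e.g. created offline on Client A
id_d = new_contact(conn, "John Smith")  # e.g. created earlier by Client D
candidates = possible_duplicates(conn, "John Smith")
```

Both rows are stored without a key conflict, and `candidates` lists them so that a reconciliation step can ask whether they should be merged.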

Do clients have "ownership" of a subset of the data? That is, if Client B is configured as the "authority" on the data for Region #5, can Client A modify or create records for Region #5 or not? (This would make conflict resolution easier, but may not be feasible in your situation.)

To summarize, the main problems are:

  • How to define "identity", given that a detached client may not have been able to reach the server before creating a new record.
  • In the previous situation, whatever solution you choose, however sophisticated, can still lead to duplicated data, so you must plan how to resolve duplicates periodically and how to inform clients that what they consider "Record #675" has actually been merged with or replaced by Record #543.
  • Decide whether conflicts will be resolved by fiat (for example, "the server version always trumps the client's if the former has been updated since the last synchronization") or by manual intervention.
  • In the case of fiat, especially if you decide that the client takes precedence, you must also decide how to handle other, not yet synchronized clients, which may have further changes pending.
  • The preceding points do not take into account the granularity of your data (to keep the description simple). Suffice it to say that instead of reasoning at the "record" level, as in my example, you may find it more appropriate to track changes at the field level, or to work on a set of records at once (for example, Person record + Address record + Contacts record), treating their aggregate as a kind of "meta-record".
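
The field-level option mentioned in the last point can be illustrated with a three-way merge. This is only a sketch: it assumes each side keeps the version of the record as it stood at the last synchronization (`base`), which the text above does not prescribe, and the function name `merge_fields` is made up for the example.

```python
def merge_fields(base, server, client):
    """Three-way merge of one record, field by field.

    base is the version both sides had at the last synchronization.
    Returns (merged, conflicts); conflicts lists the fields that were
    changed to different values on both sides.
    """
    merged, conflicts = {}, []
    for field in sorted(set(base) | set(server) | set(client)):
        b, s, c = base.get(field), server.get(field), client.get(field)
        if s == c:
            merged[field] = s   # both sides agree (or neither changed it)
        elif s == b:
            merged[field] = c   # only the client changed it
        elif c == b:
            merged[field] = s   # only the server changed it
        else:
            merged[field] = s   # placeholder: keep the server value until resolved
            conflicts.append(field)
    return merged, conflicts

base = {"name": "John Smith", "phone": "555-0100", "city": "Oslo"}
server = dict(base, phone="555-0199")  # server edited the phone number
client = dict(base, city="Bergen")     # client edited the city
merged, conflicts = merge_fields(base, server, client)
```

With record-level comparison these two edits would count as a conflict; at field level they merge cleanly, and only fields edited differently on both sides still need a policy or a user decision.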

Bibliography:

  • More on this, of course, on Wikipedia.

  • A simple synchronization algorithm by Vdirsyncer

  • OBJC Data Sync Article

  • SyncML®: Synchronizing and Managing Your Mobile Data (Safari O'Reilly Book)

  • Conflict-Free Replicated Data Types

  • Optimistic Replication, Yasushi Saito (HP Laboratories) and Marc Shapiro (Microsoft Research Ltd.), ACM Computing Surveys, Vol. V, No. N, 3 2005.

  • Alexander Traud, Jürgen Nagler-Ihlein, Frank Kargl, and Michael Weber. 2008. Cyclic Data Synchronization through Reusing SyncML. In Proceedings of the Ninth International Conference on Mobile Data Management (MDM '08). IEEE Computer Society, Washington, DC, USA, 165-172. DOI=10.1109/MDM.2008.10 http://dx.doi.org/10.1109/MDM.2008.10

  • Lam, F., Lam, N., and Wong, R. 2002. Efficient synchronization for mobile XML data. In Proceedings of the Eleventh International Conference on Information and Knowledge Management (McLean, Virginia, USA, November 4-9, 2002). CIKM '02. ACM, New York, NY, 153-160. DOI= http://doi.acm.org/10.1145/584792.584820

  • Cunha, P. R. and Maibaum, T. S. 1981. Resource = abstract data type + synchronization - A methodology for message oriented programming. In Proceedings of the 5th International Conference on Software Engineering (San Diego, California, USA, March 9-12, 1981). International Conference on Software Engineering. IEEE Press, Piscataway, NJ, 263-272.

(The last three are from the ACM Digital Library; I don't know whether you are a member or can get them through other channels.)

From the Dr. Dobb's website:

  • Creating Applications with SQL Server CE and SQL RDA, by Bill Wagner, May 19, 2004 (recommendations for developing an application for both desktop and mobile PC - Windows / .NET)

From arxiv.org:

  • A Conflict-Free Replicated JSON Datatype - This paper describes a JSON CRDT implementation (conflict-free replicated data types, or CRDTs, are a family of data structures that support concurrent modification and guarantee convergence of such concurrent updates).
+69
Aug 6 2018-10-06T00:

I would recommend having a timestamp column in every table and, each time you insert or update, setting the timestamp of every affected row. Then you iterate over all the tables, checking whether each row's timestamp is newer than the one in the target database. If it is newer, check whether you need to insert or update.
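
As a rough sketch of this approach, the following uses two in-memory SQLite databases as stand-ins for the phone and the server. The `items` table, its `updated_at` column (an integer timestamp) and the function names are assumptions made for the example; a real server would be MySQL reached over the network.

```python
import sqlite3

def make_db():
    """Create a stand-in database with one synchronized table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT, updated_at INTEGER NOT NULL)"
    )
    return conn

def copy_newer_rows(source, target):
    """Insert or update rows in target where the source row is newer."""
    copied = 0
    for row_id, name, updated_at in source.execute(
        "SELECT id, name, updated_at FROM items"
    ):
        existing = target.execute(
            "SELECT updated_at FROM items WHERE id = ?", (row_id,)
        ).fetchone()
        if existing is None:
            # Row is unknown on the target: insert it.
            target.execute("INSERT INTO items VALUES (?, ?, ?)", (row_id, name, updated_at))
            copied += 1
        elif existing[0] < updated_at:
            # Row exists but the source copy is newer: update it.
            target.execute(
                "UPDATE items SET name = ?, updated_at = ? WHERE id = ?",
                (name, updated_at, row_id),
            )
            copied += 1
    target.commit()
    return copied

phone, server = make_db(), make_db()
phone.execute("INSERT INTO items VALUES ('a', 'new name', 200)")
phone.execute("INSERT INTO items VALUES ('b', 'only on phone', 150)")
server.execute("INSERT INTO items VALUES ('a', 'old name', 100)")
copied = copy_newer_rows(phone, server)
```

Running the same function in the opposite direction would bring server-side changes to the phone. Note that a plain "newer timestamp wins" rule silently overwrites the older side when both changed the same row.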

Observation 1: be aware of physical deletes, since the rows are removed from the source database and you must do the same in the server database. You can solve this by avoiding physical deletes, or by logging every deletion in a timestamped table. Something like this: DeletedRows = (id, table_name, pk_column, pk_column_value, timestamp). You then read all new rows of the DeletedRows table and execute the deletes on the server using table_name, pk_column and pk_column_value.
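
A minimal sketch of the DeletedRows idea, again with in-memory SQLite databases standing in for both sides. The column layout follows the tuple given above; the helper names, the `items` table and the integer timestamps are assumptions. Because table and column names are interpolated into SQL here, a real implementation should check them against a known list of synchronized tables first.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE DeletedRows (id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " table_name TEXT, pk_column TEXT, pk_column_value TEXT, timestamp INTEGER)"
    )
    return conn

def delete_with_log(conn, table, pk_column, pk_value, now):
    """Delete a row and log the deletion in the same transaction."""
    with conn:
        conn.execute(f'DELETE FROM "{table}" WHERE "{pk_column}" = ?', (pk_value,))
        conn.execute(
            "INSERT INTO DeletedRows (table_name, pk_column, pk_column_value, timestamp)"
            " VALUES (?, ?, ?, ?)",
            (table, pk_column, str(pk_value), now),
        )

def replay_deletions(source, target, since):
    """Apply deletions logged on source after `since` to target."""
    rows = source.execute(
        "SELECT table_name, pk_column, pk_column_value FROM DeletedRows"
        " WHERE timestamp > ? ORDER BY timestamp",
        (since,),
    ).fetchall()
    with target:
        for table, pk_column, pk_value in rows:
            target.execute(f'DELETE FROM "{table}" WHERE "{pk_column}" = ?', (pk_value,))
    return len(rows)

phone, server = make_db(), make_db()
for db in (phone, server):
    db.execute("INSERT INTO items VALUES ('a', 'to be removed')")
delete_with_log(phone, "items", "id", "a", now=300)
replayed = replay_deletions(phone, server, since=250)
```

After the replay, the row deleted on the phone is gone from the server as well, and `since` would be advanced to the time of this synchronization.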

Observation 2: be aware of foreign keys, since inserting data into a table that references another table may fail. You must deactivate every foreign key before synchronizing the data.
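
On the SQLite side this could look like the sketch below, which switches enforcement off with `PRAGMA foreign_keys`, applies a batch in arbitrary order, switches it back on and then runs `PRAGMA foreign_key_check` to confirm nothing was left dangling. The tables and the helper name are hypothetical. On MySQL the corresponding switch is the `foreign_key_checks` session variable.

```python
import sqlite3

# Autocommit mode: PRAGMA foreign_keys has no effect inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE person (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE address (id TEXT PRIMARY KEY,"
    " person_id TEXT NOT NULL REFERENCES person(id))"
)

def apply_batch_without_fk_checks(conn, statements):
    """Apply synced statements with FK enforcement off, then verify integrity."""
    conn.execute("PRAGMA foreign_keys = OFF")
    try:
        for sql, params in statements:
            conn.execute(sql, params)
    finally:
        conn.execute("PRAGMA foreign_keys = ON")
    # Each returned row describes one dangling reference; empty means consistent.
    return conn.execute("PRAGMA foreign_key_check").fetchall()

batch = [
    ("INSERT INTO address VALUES (?, ?)", ("addr-1", "p-1")),  # child arrives first
    ("INSERT INTO person VALUES (?)", ("p-1",)),               # parent arrives second
]
violations = apply_batch_without_fk_checks(conn, batch)
```

An alternative that avoids disabling checks altogether is to apply tables in dependency order (parents before children), at the cost of having to know that order.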

+5
Apr 24 '12 at 18:30

If someone is dealing with a similar design issue and needs to synchronize changes across multiple Android devices, I recommend checking out Google Cloud Messaging for Android (GCM).

I am working on a solution in which changes made on one client must propagate to the other clients. I have just implemented a proof of concept (server and client), and it works like a charm.

Basically, each client sends delta changes to the server. For example: the resource with ID ABCD1234 has changed its value from 100 to 99.

The server validates these delta changes against its database and either approves the change (the client is in sync) and updates its database, or rejects the change (the client is out of sync).

If the change is approved by the server, the server then notifies other clients (except the one who sent the delta change) through GCM and sends a multicast message with the same delta change. Clients process this message and update their database.
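
The approve-or-reject step might be sketched as follows. This is an illustration under assumptions, not the answerer's actual code: the `SyncServer` class and its method are invented for the example, the server state is a plain dictionary, and the push delivery itself is left out (the returned list is simply who would be notified).

```python
class SyncServer:
    def __init__(self, values, clients):
        self.values = dict(values)   # resource id -> current value
        self.clients = set(clients)  # ids of registered clients

    def submit_delta(self, sender, resource_id, old_value, new_value):
        """Apply a delta if the sender's old value matches the server's.

        Returns (approved, clients_to_notify).
        """
        if self.values.get(resource_id) != old_value:
            # The sender worked from stale data and must resynchronize.
            return False, []
        self.values[resource_id] = new_value
        # Everyone except the sender should receive the same delta.
        return True, sorted(self.clients - {sender})

server = SyncServer({"ABCD1234": 100}, {"A", "B", "C"})
first = server.submit_delta("A", "ABCD1234", 100, 99)  # A is in sync
stale = server.submit_delta("B", "ABCD1234", 100, 98)  # B still thinks it is 100
```

Client A's change is approved and would be forwarded to B and C, while Client B's change is rejected because the server value is already 99.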

The nice part is that these changes propagate almost instantly, provided the devices are connected to the network, and I don't need to implement any polling mechanism on the clients.

Keep in mind that if a device has been offline for too long and more than 100 messages are waiting in the GCM queue for delivery, GCM will discard those messages and send a special message when the device comes back online. In that case, the client must perform a full synchronization with the server.

Also check out this tutorial to get started with the GCM client implementation.

+3
Jan 16 '15 at 12:12

This is for developers who use the Xamarin framework (see https://stackoverflow.com/a/167189/ )

A very simple way to achieve this with the Xamarin framework is to use Azure's Offline Data Sync, as it lets you push and pull data from the server on demand. Read operations are performed locally, and write operations are pushed on demand; if the network connection drops, write operations are queued until the connection is restored and then executed.
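
To make the "local reads, queued writes" behaviour concrete, here is a small generic sketch. It does not show how the Azure SDK is implemented; the `OfflineStore` class, the `send` callback and the fake server dictionary are all assumptions made for illustration.

```python
from collections import deque

class OfflineStore:
    def __init__(self, send):
        self.local = {}         # local copy; all reads come from here
        self.pending = deque()  # writes not yet accepted by the server
        self.send = send        # callable(key, value); raises ConnectionError when offline

    def write(self, key, value):
        """Apply the write locally and queue it for the server."""
        self.local[key] = value
        self.pending.append((key, value))

    def read(self, key):
        return self.local.get(key)

    def push(self):
        """Send queued writes in order; stop at the first failure and keep the rest."""
        sent = 0
        while self.pending:
            key, value = self.pending[0]
            try:
                self.send(key, value)
            except ConnectionError:
                break
            self.pending.popleft()
            sent += 1
        return sent

online = False
remote = {}  # stand-in for the server

def send(key, value):
    if not online:
        raise ConnectionError("offline")
    remote[key] = value

store = OfflineStore(send)
store.write("foo-1", "draft")
sent_offline = store.push()  # connection is down: the write stays queued
online = True
sent_online = store.push()   # connection restored: the queue drains
```

The write is readable locally right away, stays in the queue while the connection is down, and reaches the server on the first push after the connection returns.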

The implementation is pretty simple:

1) Create a mobile app on the Azure portal (you can try it for free here: https://tryappservice.azure.com/ )

2) Connect your client to the mobile app. https://azure.microsoft.com/en-us/documentation/articles/app-service-mobile-xamarin-forms-get-started/

3) Code to set up a local repository:

    const string path = "localrepository.db";

    // Create our Azure mobile app client
    this.MobileService = new MobileServiceClient("the api address as setup on Mobile app services in azure");

    // Set up our local SQLite store and define the Foo table on it
    var repository = new MobileServiceSQLiteStore(path);
    repository.DefineTable<Foo>();

    // Initialize repository synchronisation
    await this.MobileService.SyncContext.InitializeAsync(repository);
    var fooTable = this.MobileService.GetSyncTable<Foo>();

4) Then push and pull your data to make sure that we have the latest changes:

    // Send queued local writes, then fetch the latest Foo rows from the server
    await this.MobileService.SyncContext.PushAsync();
    await fooTable.PullAsync("allFoos", fooTable.CreateQuery());

https://azure.microsoft.com/en-us/documentation/articles/app-service-mobile-xamarin-forms-get-started-offline-data/

+2
Oct 21 '16 at 4:19


