I think a good solution is to use virtual shards. You can start with a single server and point all of the virtual shards at it. Use the modulo of an incremental ID to distribute rows evenly across the virtual shards. Amazon RDS gives you the ability to promote a replica to master, so before changing the sharding configuration (assigning some of the virtual shards to the new server), promote the replica to master, update your configuration file, and then delete from the new master every row whose ID modulo falls outside the virtual-shard range you assigned to the new instance.
You also need to delete the corresponding rows from the original server, but from now on all new data whose ID modulo falls within the reassigned virtual-shard ranges goes to the new server. This way you never have to move data; you just use the Amazon RDS promotion feature.
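A minimal sketch of the idea in Python; the shard count, hostnames, and table name are illustrative assumptions, not part of the original setup:

```python
# Hypothetical virtual-shard map: all virtual shards start on one physical server.
NUM_VIRTUAL_SHARDS = 4096  # Pinterest-style fixed shard count

# Virtual-shard ranges -> physical server (names are made up).
shard_map = {
    range(0, 4096): "db-master-1.example.com",
}

def virtual_shard_for(record_id: int) -> int:
    """Distribute rows evenly across virtual shards via modulo."""
    return record_id % NUM_VIRTUAL_SHARDS

def server_for(record_id: int) -> str:
    """Resolve which physical server owns a row's virtual shard."""
    shard = virtual_shard_for(record_id)
    for shard_range, server in shard_map.items():
        if shard in shard_range:
            return server
    raise LookupError(f"no server owns virtual shard {shard}")

# After promoting the RDS replica to master, split the map in two:
shard_map = {
    range(0, 2048): "db-master-1.example.com",
    range(2048, 4096): "db-master-2.example.com",  # the promoted replica
}

# Cleanup: each server then deletes the rows it no longer owns, e.g. on the
# new master (assuming a `users` table):
#   DELETE FROM users WHERE id % 4096 < 2048;
# and on the original server:
#   DELETE FROM users WHERE id % 4096 >= 2048;
```

The point of the fixed virtual-shard count is that rebalancing only ever moves range assignments in the map; the modulo of an ID never changes.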
Then you can create a replica from the source server again, ready for the next split. For identifiers, you compose a unique ID out of shard ID + table type ID + an incremental number, so when you run a query you know from the ID itself which shard to go to for the data.
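A hedged sketch of such a composite ID in Python; the 64-bit layout and bit widths below are assumptions for illustration (similar in spirit to Pinterest's published scheme), not a standard:

```python
# Composite ID = shard ID | table-type ID | local incremental number.
SHARD_BITS, TYPE_BITS, LOCAL_BITS = 16, 10, 36  # assumed widths, 62 bits total

def make_id(shard_id: int, type_id: int, local_id: int) -> int:
    """Pack the three parts into a single integer ID."""
    assert shard_id < (1 << SHARD_BITS)
    assert type_id < (1 << TYPE_BITS)
    assert local_id < (1 << LOCAL_BITS)
    return (shard_id << (TYPE_BITS + LOCAL_BITS)) | (type_id << LOCAL_BITS) | local_id

def parse_id(composite: int) -> tuple[int, int, int]:
    """Recover (shard_id, type_id, local_id) so a query knows which shard to hit."""
    local_id = composite & ((1 << LOCAL_BITS) - 1)
    type_id = (composite >> LOCAL_BITS) & ((1 << TYPE_BITS) - 1)
    shard_id = composite >> (TYPE_BITS + LOCAL_BITS)
    return shard_id, type_id, local_id

# Example: row 42 of table type 1 (say, users) on virtual shard 3051.
uid = make_id(3051, 1, 42)
assert parse_id(uid) == (3051, 1, 42)
```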
I do not know whether this can be done with RavenDB, but it works very well with Amazon RDS, because Amazon already provides the replication and promotion features for you.
I agree that their solution should offer seamless scaling from the very beginning instead of leaving the developer to sort out the problems once they happen. I have also found that many NoSQL solutions that distribute data evenly across shards need to run in a low-latency cluster, so you must take that into account. I tried Couchbase on two separate EC2 machines (not in a dedicated Amazon cluster), and data rebalancing was very slow; it also increases the overall cost.
I also want to add that Pinterest solved its scalability problems the same way, using 4,096 virtual shards.
You also need to look into paging issues, as with many NoSQL databases: with this approach you can page through the data, but perhaps not in the most efficient way, because you may need to query multiple databases. Another problem is schema changes. Pinterest solved this by storing all the data in a JSON blob in MySQL. When you want to add a new column, you create a new table holding the new column's data plus an ID key, and you put an index on that column. If you need to query data by, say, email, you create another table mapping email + ID and put the index on the email column. Atomic counters are another problem; it is better to pull those counters out of the JSON and store them in a real column so you can increment their value atomically.
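A rough sketch of that layout, runnable in Python; the table and column names are made up for illustration, and sqlite3 stands in for MySQL so the example is self-contained:

```python
import json
import sqlite3  # stand-in for MySQL so the sketch runs as-is

conn = sqlite3.connect(":memory:")

# Main table: all fields live in a JSON blob, with counters pulled out into
# a real column so they can be incremented atomically.
conn.execute("""
    CREATE TABLE users (
        id          INTEGER PRIMARY KEY,
        data        TEXT NOT NULL,               -- JSON blob with all user fields
        login_count INTEGER NOT NULL DEFAULT 0   -- atomic counter, outside the JSON
    )
""")

# "New column" pattern: instead of ALTER TABLE, add a lookup table mapping
# the new value to the row ID, and index the value column.
conn.execute("""
    CREATE TABLE user_emails (
        email   TEXT NOT NULL,
        user_id INTEGER NOT NULL
    )
""")
conn.execute("CREATE INDEX idx_user_emails_email ON user_emails (email)")

# Insert a user and the email mapping.
conn.execute("INSERT INTO users (id, data) VALUES (?, ?)",
             (1, json.dumps({"name": "Alice", "email": "alice@example.com"})))
conn.execute("INSERT INTO user_emails (email, user_id) VALUES (?, ?)",
             ("alice@example.com", 1))

# Atomic counter update, deliberately not touching the JSON blob.
conn.execute("UPDATE users SET login_count = login_count + 1 WHERE id = ?", (1,))

# Query by email via the indexed lookup table.
(user_id,) = conn.execute("SELECT user_id FROM user_emails WHERE email = ?",
                          ("alice@example.com",)).fetchone()
print(user_id)  # -> 1
```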
There are great off-the-shelf solutions out there, but at the end of the day you will find that they can be very expensive. I preferred to spend the time building my own sharding solution and prevent the headaches later. If you take the other path, there are plenty of companies waiting for you to get into trouble, ready to charge quite a lot of money to solve your problems, because at the moment you need them, they know you will pay anything to get your project working again. From my own experience, that is why I racked my brains to build my own sharding solution using this approach, which is also much cheaper.
Another option is to use MySQL middleware such as ScaleBase or DBshards. You can keep working with plain MySQL, and when the time comes to scale, they handle it well, and the cost can be much lower than the alternatives.
One more tip: when building the shard configuration, give each shard a write_lock attribute that accepts true or false. When it is true, no data is written to that shard, so when you fetch the list of shards for a given table type (for example, users), writes only go to the other shards of the same type. This is also handy for backups: you can show a friendly error to visitors while you write-lock all the shards and take snapshots of every shard for a consistent backup. Although I think with Amazon RDS you can issue a global request to snapshot all the databases and rely on scheduled backups.
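A minimal sketch of that configuration in Python, assuming write_lock=True means the shard is skipped for writes; the hostnames, table type, and helper names are illustrative:

```python
# Hypothetical shard config with a per-shard write_lock flag.
SHARDS = {
    "users": [
        {"host": "db-1.example.com", "range": range(0, 2048),    "write_lock": False},
        {"host": "db-2.example.com", "range": range(2048, 4096), "write_lock": True},
    ],
}

def writable_shards(table_type: str) -> list[dict]:
    """Return only the shards of this table type that currently accept writes."""
    return [s for s in SHARDS[table_type] if not s["write_lock"]]

def lock_all(table_type: str) -> None:
    """Write-lock every shard, e.g. before snapshotting all of them for backup."""
    for s in SHARDS[table_type]:
        s["write_lock"] = True

print([s["host"] for s in writable_shards("users")])  # -> ['db-1.example.com']
```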
The fact is that most companies will not spend time on a DIY sharding solution; they will prefer to pay for something like ScaleBase. Those who build their own are mostly individual developers who cannot afford to pay for a scaling solution from the start but want to be sure they will have one by the time they reach the level where they need it. Just look at the prices out there and you will understand it will cost you a lot. I will be happy to share my code with you once I am done. In my opinion you are on the right path; it all depends on your application logic. I model my database to be simple, with no complex aggregation queries, and this solves many of my problems. In the future you can use MapReduce to handle those big-data queries.