While dealing with some larger systems, I saw a custom internal application that consolidated queries across the servers for use in common company-wide applications.
e.g. select * from t1 would get converted into:
select * from db1.t1 union select * from db2.t1
and so on.
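As a side note on that rewrite: when the shards are disjoint, plain union also forces a duplicate-elimination pass, so a variant of that consolidated query (same hypothetical db1/db2 shards) that skips it would be:

select * from db1.t1
union all
select * from db2.t1;
-- union all skips the de-duplication step that plain union performs,
-- which is worth avoiding at millions of rows if the shards can't overlap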
The main problem is that once you have cross-server queries, on large systems with millions of rows they can hit the network hard and take a long time to process.
Say, for example, that you are doing network analysis and you need a join between tables to work out the "links" between user attributes.
You can end up with some odd queries that look something like this (forgive the syntax):
select db1.user.boss, db1.user.name, db2.user.name, db2.user.boss from db1.user inner join db2.user on db1.user.name = db2.user.name
(for example, to get a person's boss, their boss's boss, a friend of a friend, etc.)
It can be an awesome PITA when you want good data for those chain-type queries, but for simple statistics like sums, averages, etc., what worked best for these guys was a nightly job that aggregated the statistics into a table on each server (for example, nightlystats), something like: select countif(user.datecreated>yesterday,1,0) as dailyregistered, sumif(user.quitdate)... into (the new nightly record).
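countif/sumif there are pseudocode, so here is a minimal sketch of that nightly rollup in plain SQL, assuming a hypothetical user table with datecreated and quitdate columns and a nightlystats summary table:

-- per-server nightly rollup (hypothetical schema)
insert into nightlystats (statdate, dailyregistered, dailyquit, totalusers)
select current_date,
       sum(case when datecreated >= current_date - interval 1 day then 1 else 0 end),
       sum(case when quitdate >= current_date - interval 1 day then 1 else 0 end),
       count(*)
from user;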
This made the daily statistics pretty trivial: for totals you just sum the column across servers, and for averages you multiply each server's value by that server's row count, sum those, and divide by the total row count, and you end up with a pretty fast high-level dashboard.
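To make the average step concrete, you weight each server's average by its row count; a sketch, assuming each server contributed one row with hypothetical avg_value and row_count columns to a combined all_nightlystats table:

-- fold per-server averages into one global average
select sum(avg_value * row_count) / sum(row_count) as global_avg
from all_nightlystats;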
We ended up doing a lot of indexing and optimization, and tricks like keeping small local tables of commonly used information around were useful for speeding up queries.
For larger queries, the db guy would just dump a full copy of the system onto the backup server, and we would run against that locally throughout the day so as not to hammer the network.
There are several tricks that can reduce the pain, for example replicating common tables everywhere (e.g. base user tables and other rarely-changing data) so that you do not need to spend time gathering them across servers.
Another one that is really useful in practice is rolling the counts and totals for simple queries up into those nightly tables.
The last interesting bit is that the workaround for the bandwidth problem was a configurable "delay" built into the in-house "query aggregator": it watched the response time for each batch of records, and if responses started to lag it requested fewer records per batch and added latency between requests (since this was reporting, not real-time work, that was fine).
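That aggregator was application code, but the idea is simple enough to sketch as a MySQL stored procedure; the table names src/dst, the 2-second threshold, and the batch/pause numbers below are all made up for illustration:

delimiter //
create procedure copy_with_backoff()
begin
  declare batch int default 10000;   -- rows requested per round trip
  declare pause double default 0;    -- seconds of added latency
  declare last_id bigint default 0;
  declare copied int default 1;
  declare t0 datetime(6);
  while copied > 0 do
    set t0 = now(6);
    -- pull the next batch by primary key (id assumed auto-increment)
    insert into dst
      select * from src where id > last_id order by id limit batch;
    set copied = row_count();
    select coalesce(max(id), last_id) into last_id from dst;
    -- if the batch took more than ~2 seconds, shrink the batch
    -- and back off a little before asking again
    if timestampdiff(microsecond, t0, now(6)) > 2000000 then
      set batch = greatest(1000, batch div 2);
      set pause = least(5, pause + 0.5);
    end if;
    do sleep(pause);
  end while;
end //
delimiter ;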
There are a few SQL setups out there that handle this kind of splitting automatically, and I recently read an article about tools (not PHP though) that will do some of it for you; I think they were associated with cloud VM providers.
There are also some tools and ideas in this thread: MySQL sharding approaches?
If NoSQL is an option, you might want to survey the db systems out there before going down this route.
The NoSQL approach can be easier to scale, depending on what you are looking for.