Running a parallel query on multiple database servers (running Microsoft SQL Server)

Question

Is it possible to configure several database servers (all hosting the same database) to simultaneously execute one query?

I am not asking about executing a query on several processors at the same time; I know that is possible.

UPDATE

What I mean is something like this:

There are two servers: Server1 and Server2
Both servers host database Foo, and both instances of Foo are identical
I connect to Server1 and submit a complex query (many joins, many calculations)
Server1 decides that some calculations should be done on Server2, and that some data should also be read from that server; the corresponding parts of the query are sent to Server2
Both servers read data and perform the necessary calculations.
Finally, Server1 and Server2 results are merged and returned to the client.

All of this should happen automatically, without the need to refer explicitly to Server1 or Server2. This is what I mean by parallel query execution. Is it possible?
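A rough sketch of the flow described above, to make the idea concrete. It does not use SQL Server: two in-memory SQLite databases stand in for Server1 and Server2, and the `orders` table, the id ranges, and all function names are assumptions made up for illustration. A coordinator splits the work, runs both halves at the same time, and merges the partial results.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (id, amount) for ids 1..100.
ROWS = [(i, i * 10) for i in range(1, 101)]


def make_server():
    """Create one stand-in 'server': an in-memory SQLite copy of the same data."""
    # check_same_thread=False lets a worker thread use a connection opened here.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", ROWS)
    conn.commit()
    return conn


def partial_sum(conn, lo, hi):
    """Run one part of the query on one server: sum and count for ids in [lo, hi]."""
    return conn.execute(
        "SELECT COALESCE(SUM(amount), 0), COUNT(*) FROM orders "
        "WHERE id BETWEEN ? AND ?",
        (lo, hi),
    ).fetchone()


def parallel_average(server1, server2):
    """Coordinator: split the id range, run both halves concurrently, merge."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(partial_sum, server1, 1, 50)
        second = pool.submit(partial_sum, server2, 51, 100)
        (sum1, count1), (sum2, count2) = first.result(), second.result()
    # Merge step: an average must be rebuilt from sums and counts,
    # not by averaging the two partial averages.
    return (sum1 + sum2) / (count1 + count2)
```

Here the split is hard-coded in the coordinator. The question asks for the database engine to make that decision by itself, which is the part that is not sketched here.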

UPDATE 2

Thanks for the advice, John and wuputah.

I am exploring alternatives for increasing both the availability and the capacity of a MOSS database. So I'm looking for some kind of off-the-shelf SQL Server load balancing solution that is transparent to the application, because I cannot change the application. I think SQL Server does not have such a feature (whereas Oracle, as I understand it, does: the RAC mentioned by wuputah).

UPDATE 3

Quote from Best Tips for Clustering SQL Server:

Let's start by debunking a common misconception. You use an MSCS cluster for high availability, not for load balancing. In addition, SQL Server does not have a built-in, automatic load balancing capability. You have to load balance through the physical design of your application.

+4

database sql-server

Marek Grzenkowicz Feb 16 '09 at 8:46

3 answers

Yes, I think this is possible, in a sense. Let me explain.

You should look into distributed queries. A distributed query runs across several servers and is typically used to reference data that is not stored locally.

http://msdn.microsoft.com/en-us/library/ms191440.aspx

For example, server A may hold my Customers table and server B my Orders table. A distributed query can reference both server A and server B, with each server handling the processing of its own local data (which may include the use of parallelism).
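A minimal sketch of that Customers/Orders split. In SQL Server the two machines would typically be connected as linked servers; here two in-memory SQLite databases stand in for server A and server B, and the table layouts, sample rows, and function names are all assumptions for illustration. Each server filters or aggregates its own table, and the caller joins the two partial results.

```python
import sqlite3


def make_server_a():
    """Stand-in for server A, which holds only the customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Alice", "US"), (2, "Bob", "DE"), (3, "Carol", "US")],
    )
    return conn


def make_server_b():
    """Stand-in for server B, which holds only the orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(10, 1, 100), (11, 1, 50), (12, 2, 70), (13, 3, 30)],
    )
    return conn


def totals_by_customer(server_a, server_b, country):
    """Order totals per customer name for one country, using both servers."""
    # Step 1: server A filters its local customers table.
    customers = dict(
        server_a.execute(
            "SELECT id, name FROM customers WHERE country = ?", (country,)
        ).fetchall()
    )
    if not customers:
        return {}
    # Step 2: server B aggregates its local orders for just those customers.
    placeholders = ",".join("?" * len(customers))
    rows = server_b.execute(
        f"SELECT customer_id, SUM(total) FROM orders "
        f"WHERE customer_id IN ({placeholders}) GROUP BY customer_id",
        list(customers),
    ).fetchall()
    # Step 3: the caller joins the two partial results by customer id.
    return {customers[cid]: total for cid, total in rows}
```

Only the matching customer ids and the aggregated totals cross between the two servers, which keeps the transfer small. Customers without any orders are simply left out of the result in this sketch.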

Now, in theory, you could store exactly the same data on each server and design your queries so that specific tables are referenced only on specific servers, thereby distributing the query load. However, this is not true parallel processing in the CPU sense.

If your goal is to balance the processing load of your application, then a typical SQL Server approach is to use replication to distribute data processing across multiple servers. This method should also not be confused with parallel processing.

http://databases.about.com/cs/sqlserver/a/aa041303a.htm

I hope this helps, but of course, feel free to ask any questions you may have.

+2

John Sansom Feb 16 '09 at 9:03

An interesting question, but I struggle to see how this would be beneficial for a multi-user system.

If I'm the only user, having half of my query run on Server1 and the other half on Server2 sounds cool :)

If there are two simultaneous users (say, with queries of the same complexity), then I struggle to see how this helps :(

I could have the same data on both servers and load balance, so that I get Server1 and my colleague gets Server2. Or I could have half of the data on Server1 and the other half on Server2, so that each server optimizes and caches only its own data, which spreads the load. But whenever a merge is needed to complete a query, the limiting factor becomes the bandwidth between the servers.

This is basically federated database servers. Instead of having all my Customers on one server and all my Orders on another, I could, say, have my US customers and their orders on one server and my European customers and their orders on the other. A merge step is needed only if my query spans both.
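A small sketch of that regional split, again with in-memory SQLite databases standing in for the two servers. The region keys, the `orders` table, the sample rows, and the function names are hypothetical. A query for one region touches one server; only a query spanning both regions needs the merge step.

```python
import sqlite3


def make_region_server(rows):
    """Stand-in for one federated server holding a single region's orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, total INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


# Hypothetical layout: each region's customers and orders live on one server.
SERVERS = {
    "US": make_region_server([("Alice", 100), ("Carol", 30)]),
    "EU": make_region_server([("Bob", 70), ("Dana", 20)]),
}


def order_total(regions):
    """Sum order totals for the given regions, contacting only their servers."""
    partials = []
    for region in regions:
        (subtotal,) = SERVERS[region].execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders"
        ).fetchone()
        partials.append(subtotal)
    # Merge step: trivial for one region, a real combine only when spanning several.
    return sum(partials)
```

For example, `order_total(["US"])` never contacts the EU server, while `order_total(["US", "EU"])` reads one subtotal from each server and adds them.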

+1

Kristen Feb 16 '09 at 12:28

wuputah · Accepted Answer · 2009-02-16T11:48:24+0000

What you are talking about is a clustering solution. It looks like SQL Server and Oracle have such solutions, but I don't know anything about them. I would guess that they are very expensive to buy and implement.

Possible alternative suggestions include the following:

Use master-slave replication and run complex read queries on the slave. All writes must go to the master, which then forwards them to the slaves, so everything stays in sync. This helps speed things up because the slave only has to deal with writes coming from the master, whose order has already been determined for it (no deadlocks, etc.). If you want to use multiple servers, this is the first place I would start.
Use master-master replication. This means that writes to either server are sent to the other, so they stay in sync (at least in theory). This has some of the same advantages as master-slave, but you don't have to worry about writes going to one server and not the other. The more common use of master-master replication is to support failover; master-slave is really better suited for performance.
Use the feature that John Sansom talked about. I know little about it, but it seems to be based on splitting the database into tables on different servers, which has advantages as well as disadvantages. The big problem is that, since the two systems cannot share memory, they will have to exchange a lot of data over the network to compute complex joins.

Hope this helps!

RE Update 1:

If you cannot change the application, there is hope, but it can be a little complicated. If you set up master-slave replication, you can configure a proxy server to send read queries to the slave(s) and write queries to the master. I have seen this done with MySQL, but not with SQL Server. That is a bit of a problem unless you want to write the proxy yourself.
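The routing decision such a proxy makes can be sketched as follows. This is only the classification step, not a working proxy: it does not speak any database wire protocol, and the class name, server names, and the "starts with SELECT" rule are assumptions for illustration.

```python
import itertools


class ReadWriteRouter:
    """Pick a target server for each SQL statement (hypothetical helper)."""

    def __init__(self, master, replicas):
        self.master = master
        # Rotate through the read replicas; fall back to the master if none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        # Naive rule: plain SELECT statements are reads, everything else is a write.
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.master
```

A real proxy would need more care than this rule gives: a statement such as `SELECT ... INTO` writes data, reads inside a transaction usually have to stay on the master, and a replica may lag behind the master, so a read issued right after a write can return stale data.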

This was discussed earlier on SO, so you can find more information there.

RE Update 2:

Microsoft's clustering may not be designed for performance, but that is Microsoft's shortcoming. It is still the level of complexity you are talking about. If they say that it does not help, then your options are limited to those listed above and to what you can do with your application (for example, sharding, splitting into several databases, etc.).
