T-SQL: divide a result set equally into groups and update them

Question

I have my database with three tables:

The order table has the following data:

 OrderID  OperatorID  GroupID  OrderDesc      Status  Cash  ...
 --------------------------------------------------------------
 1        1           1        small order    1       100
 2        1           1        another order  2       0
 3        1           2        xxxxxxxxxxx    2       1000
 5        2           2        yyyyyyyyyyy    2       150
 9        5           1        xxxxxxxxxxx    1       0
 10       NULL        2        xxxxxxxxxxx    1       10
 11       NULL        3        xxxxxxxxxxx    1       120

Operator table:

 OperatorID  Name  GroupID  Active
 ---------------------------------
 1           John  1        1
 2           Kate  1        1
 4           Jack  2        1
 5           Will  1        0
 6           Sam   3        1

Group table:

 GroupID  Name
 -------------
 1        G1
 2        G2
 3        X1

As you can see, John has 3 orders, Kate has 1, Will has 1, and Jack and Sam have none.

Now I would like to assign operators to orders based on some conditions:

order must have cash > 0
order must have status = 1
order must be in group 1 or 2
operator must be active (active = 1)
operator must be in group 1 or 2

This is the result I would like to get:

 OrderID  OperatorID  GroupID  OrderDesc      Status  Cash  ...
 --------------------------------------------------------------
 1        1           1        small order    1       100   < change
 2        1           1        another order  2       0
 3        2           2        xxxxxxxxxxx    2       1000  < change
 5        4           2        yyyyyyyyyyy    2       150   < change
 9        5           1        xxxxxxxxxxx    1       0
 10       4           2        xxxxxxxxxxx    1       10    < change
 11       NULL        3        xxxxxxxxxxx    1       120

I would like to shuffle the orders and update the operator ID so that every time I run this script the operators are assigned at random, but each operator ends up with an equal number of orders (or close to equal: for example, with 7 orders one person will get 3 and the rest 2).

I can use NTILE to distribute orders among groups, but I need to assign an operatorID to this group.

I think I need to do something like this:

 SELECT NTILE(2) OVER (ORDER BY orderID DESC) AS newID, *
 FROM orders (NOLOCK)

This gives me the orders table split into equal parts. I need to know the number of rows in the operators table (to pass it as the parameter to NTILE); after that I could join my results to the operators (using row_number()).

Is there a better solution?

To restate my question: how can I divide a result set equally into groups and update that record set using data from another table?

EDIT: This is my code: http://sqlfiddle.com/#!3/39849/25

EDIT 2: I updated my question and added additional conditions.

I would like to assign operators to orders based on some conditions:

order must have cash > 0
order must have status = 1
order must be in group 1 or 2
operator must be active (active = 1)
operator must be in group 1 or 2

I am building this query as a stored procedure.
The first step creates the new assignments in a temporary table; after final approval, a second step updates the main table from that temporary table.

I have two more questions:

1. Would it be better to first select all orders and all operators that satisfy the conditions into temporary tables and then do the shuffling, or to do all of this in one big query?
2. I would like to pass an array of groups as a parameter to my procedure. Which option is best for passing an array to a stored procedure (SQL Server 2005)?

I know that this has been asked many times, but I would like to know whether it is better to create a separate function that splits a comma-separated string into a table ( http://www.sommarskog.se/arrays-in-sql-2005.html ) or to put everything in one big fat procedure? :)

FINAL ANSWER: available at http://sqlfiddle.com/#!3/afb48/2

 SELECT o.*, op.operatorName AS NewOperator, op.operatorID AS NewOperatorId
 FROM (SELECT o.*,
              (ROW_NUMBER() over (ORDER BY newid()) % numoperators) + 1 AS randseqnum
       FROM Orders o
       CROSS JOIN (SELECT COUNT(*) AS numoperators
                   FROM operators
                   WHERE operators.active=1) op
       WHERE o.cash>0 and o.status in (1,3)
      ) o
 JOIN (SELECT op.*, ROW_NUMBER() over (ORDER BY newid()) AS seqnum
       FROM Operators op
       WHERE op.active=1
      ) op
   ON o.randseqnum = op.seqnum
 ORDER BY o.orderID

This is based on Gordon Linoff's answer. Thanks!

sql tsql sql-server-2005

Misiu Aug 21 '12 at 16:20

2 answers

Sorry - I don’t think you can get away from counting records ...

 DECLARE @myCount int
 SELECT @myCount = Count(*) FROM Operators

 SELECT a.OrderID, a.description, b.operatorName
 FROM (SELECT NTILE(@myCount) OVER (ORDER BY NEWID()) AS newID, orderID, description
       FROM orders (NOLOCK)
      ) a
 INNER JOIN (SELECT NTILE(@myCount) OVER (ORDER BY NEWID()) AS newID, OperatorName, OperatorID
             FROM Operators
            ) b
   ON a.NewID = b.NewID
 ORDER BY a.OrderID

Chains Aug 21 '12 at 17:46

Gordon Linoff · Accepted Answer · 2012-08-21T17:57:45+0000

I wasn't sure whether you really wanted an UPDATE query or a SELECT query. The following query returns a new operator for each order, subject to your conditions:

 /*
 with orders as (select 1 as orderId, 'order1' as orderDesc, 1 as OperatorId),
      operators as (select 1 as operatorID, 'John' as name)
 */
 select o.*, op.name as NewOperator, op.operatorID as NewOperatorId
 from (select o.*,
              (ROW_NUMBER() over (order by newid()) % numoperators) + 1 as randseqnum
       from Orders o
       cross join (select COUNT(*) as numoperators from operators) op
      ) o
 join (select op.*, ROW_NUMBER() over (order by newid()) as seqnum
       from Operators op
      ) op
   on o.randseqnum = op.seqnum
 order by orderid

Basically, it assigns a new id to the rows for the join. Each row of the orders table gets a randomly assigned value from 1 to the number of operators. That value is then joined to a sequence number on the operators.
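To see why the modulo step gives a near-equal split, here is a small self-contained sketch. The table variables and their contents (7 orders, 3 operators) are made up for illustration, and the syntax is kept to what SQL Server 2005 accepts.

```sql
-- Hypothetical sample data: 7 orders and 3 operators.
DECLARE @Orders TABLE (orderID int PRIMARY KEY);
DECLARE @Operators TABLE (operatorID int PRIMARY KEY, name varchar(20));

INSERT INTO @Orders (orderID)
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7;

INSERT INTO @Operators (operatorID, name)
SELECT 1, 'John' UNION ALL SELECT 2, 'Kate' UNION ALL SELECT 4, 'Jack';

-- Count how many orders each operator receives in one random run.
SELECT op.operatorID, op.name, COUNT(*) AS ordersAssigned
FROM (SELECT o.orderID,
             -- row numbers 1..7 modulo 3, plus 1, cycle through buckets 1..3
             (ROW_NUMBER() OVER (ORDER BY NEWID()) % n.numoperators) + 1 AS randseqnum
      FROM @Orders o
      CROSS JOIN (SELECT COUNT(*) AS numoperators FROM @Operators) n
     ) o
JOIN (SELECT op.operatorID, op.name,
             -- each operator gets a distinct random position 1..3
             ROW_NUMBER() OVER (ORDER BY NEWID()) AS seqnum
      FROM @Operators op
     ) op
  ON o.randseqnum = op.seqnum
GROUP BY op.operatorID, op.name;
```

With 7 orders and 3 operators, one operator receives 3 orders and the other two receive 2 each; which operator gets the extra order changes from run to run.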

If you need to update, you can do something like:

 with toupdate as (
     <above query, without its final ORDER BY, which is not allowed inside a CTE>
 )
 update orders
 set operatorid = newoperatorid
 from toupdate
 where toupdate.orderid = orders.orderid
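Written out in full with the conditions from the question, the update might look like the sketch below. The table and column names (Orders, Operators, cash, status, groupID, active) are assumed from the question's sample data, and the NULLIF guard is an addition that is not in the original query.

```sql
-- Sketch only: names are assumed from the question's sample tables.
-- If other statements precede this one in the batch, the previous
-- statement must be terminated with a semicolon.
WITH toupdate AS (
    SELECT o.orderID, op.operatorID AS newoperatorid
    FROM (SELECT o.orderID,
                 -- NULLIF avoids a divide-by-zero error when no operator
                 -- qualifies; the join then matches nothing.
                 (ROW_NUMBER() OVER (ORDER BY NEWID())
                      % NULLIF(n.numoperators, 0)) + 1 AS randseqnum
          FROM Orders o
          CROSS JOIN (SELECT COUNT(*) AS numoperators
                      FROM Operators
                      WHERE active = 1 AND groupID IN (1, 2)) n
          WHERE o.cash > 0 AND o.status = 1 AND o.groupID IN (1, 2)
         ) o
    JOIN (SELECT op.operatorID,
                 ROW_NUMBER() OVER (ORDER BY NEWID()) AS seqnum
          FROM Operators op
          -- must be the same filter as in the count above
          WHERE op.active = 1 AND op.groupID IN (1, 2)
         ) op
      ON o.randseqnum = op.seqnum
)
UPDATE Orders
SET operatorID = toupdate.newoperatorid
FROM toupdate
WHERE toupdate.orderID = Orders.orderID;
```

The operator filter appears twice, once for the count and once for the numbering. The two copies have to stay identical, otherwise some bucket numbers will have no matching operator and those orders will be left unassigned.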

Your two questions:

Would it be better to first select all orders and all operators that satisfy the conditions into temporary tables and then do the shuffling, or to do all of this in one big query?

The use of temporary tables is a matter of performance and of the application's requirements. If the data is being updated rapidly, then yes, using a temporary table is a big win. If you run the randomization many times over the same data, it could also be a win, particularly if the tables are too big to fit in memory. Otherwise, there is unlikely to be a big performance gain on a one-off run, provided you place the conditions in the innermost subqueries. If performance is an issue, you can test the two approaches.
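For comparison, a temporary-table version of the same idea could be sketched as follows. The table and column names are again assumed from the question's sample data, the temp table names are hypothetical, and the NULLIF guard is an addition.

```sql
-- Stage the eligible rows once, so each condition is written in one place.
SELECT orderID
INTO #EligibleOrders
FROM Orders
WHERE cash > 0 AND status = 1 AND groupID IN (1, 2);

SELECT operatorID
INTO #EligibleOperators
FROM Operators
WHERE active = 1 AND groupID IN (1, 2);

-- Shuffle and stage the proposed assignments for review.
-- NULLIF avoids a divide-by-zero error when no operator qualifies;
-- in that case no assignments are produced.
SELECT o.orderID, op.operatorID AS newoperatorid
INTO #NewAssignments
FROM (SELECT eo.orderID,
             (ROW_NUMBER() OVER (ORDER BY NEWID())
                  % NULLIF(n.numoperators, 0)) + 1 AS randseqnum
      FROM #EligibleOrders eo
      CROSS JOIN (SELECT COUNT(*) AS numoperators FROM #EligibleOperators) n
     ) o
JOIN (SELECT eop.operatorID,
             ROW_NUMBER() OVER (ORDER BY NEWID()) AS seqnum
      FROM #EligibleOperators eop
     ) op
  ON o.randseqnum = op.seqnum;

-- After approval, apply the staged assignments to the main table.
UPDATE o
SET operatorID = na.newoperatorid
FROM Orders o
JOIN #NewAssignments na ON na.orderID = o.orderID;

DROP TABLE #NewAssignments;
DROP TABLE #EligibleOperators;
DROP TABLE #EligibleOrders;
```

Besides any performance effect, staging fixes the random assignment in place, so the rows reviewed in #NewAssignments are exactly the rows that get applied. Note that a # temp table only lives for the session (or procedure) that created it, so both steps have to run in the same scope.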

I would like to pass an array of groups as a parameter to my procedure. Which option is best for passing an array to a stored procedure (SQL Server 2005)?

Hmmm, switch to 2008, which has table-valued parameters. Here is a frequently referenced article on the subject by Erland Sommarskog: http://www.sommarskog.se/arrays-in-sql-2005.html .
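As a rough illustration of the table-valued parameter route, here is a minimal sketch. It requires SQL Server 2008 or later, so it does not run on 2005; the type name dbo.GroupIdList and the procedure name dbo.ListEligibleOrders are hypothetical, and the Orders columns are assumed from the question.

```sql
-- Requires SQL Server 2008 or later. GO is a batch separator understood by
-- tools such as SSMS and sqlcmd, not a T-SQL statement.
CREATE TYPE dbo.GroupIdList AS TABLE (groupID int NOT NULL PRIMARY KEY);
GO

CREATE PROCEDURE dbo.ListEligibleOrders
    @groups dbo.GroupIdList READONLY   -- table-valued parameters must be READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- The list of groups is joined like any other table.
    SELECT o.orderID, o.groupID
    FROM Orders o
    JOIN @groups g ON g.groupID = o.groupID
    WHERE o.cash > 0 AND o.status = 1;
END
GO

-- Usage: fill a variable of the table type and pass it to the procedure.
DECLARE @g dbo.GroupIdList;
INSERT INTO @g (groupID) VALUES (1), (2);
EXEC dbo.ListEligibleOrders @groups = @g;
```

On SQL Server 2005 this is not available, and the linked article covers the alternatives, such as splitting a comma-separated string into a table.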
