MySQL: counting gains and losses under several conditions

Here is some dummy data from the call records table.

Here is a sample:

| call_id    | customer     | company     | call_start          |
|------------|--------------|-------------|---------------------|
| 1411482360 | 001143792042 | 08444599175 | 2014-07-31 13:55:03 |
| 1476992122 | 001143792042 | 08441713191 | 2014-07-31 14:05:10 |

The customer and company fields represent their phone numbers.

  • The requirement is to calculate the total "gain" and total "lost" values based on the following logic:

EDIT:

- Customer A calls company A.
- If customer A then calls company B, company B gets +1 gain and company A gets +1 lost.
- If customer A then calls company C, company C gets +1 gain and company B gets +1 lost.
- If customer A calls company C again, this does not affect gain/lost.
- Gain/lost only come into play from the customer's 2nd call onward.

- If the customer calls the companies in this order: A, B, B, C, A, A, C, B, D, the process should be as follows:

    A -> B : B +1 gain, A +1 lost
    B -> B : no change
    B -> C : C +1 gain, B +1 lost
    C -> A : A +1 gain, C +1 lost
    A -> A : no change
    A -> C : C +1 gain, A +1 lost
    C -> B : B +1 gain, C +1 lost
    B -> D : D +1 gain, B +1 lost

After the above process, we should have totals like:

    Company   Total gain   Total lost
    A         1            2
    B         2            2
    C         2            2
    D         1            0
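As a cross-check of these totals, here is a minimal Python sketch of the rule. It assumes one customer's calls are already listed in call order; the function name `gain_lost` is only illustrative.

```python
from collections import Counter

def gain_lost(companies):
    """Apply the gain/lost rule to one customer's calls, given in call order."""
    gain, lost = Counter(), Counter()
    for prev, cur in zip(companies, companies[1:]):
        if cur != prev:       # a repeat call to the same company changes nothing
            gain[cur] += 1    # the newly called company gains
            lost[prev] += 1   # the previously called company loses
    return gain, lost

gain, lost = gain_lost(list("ABBCAACBD"))
for company in sorted(set(gain) | set(lost)):
    print(company, gain[company], lost[company])
```

Running it on the sequence A, B, B, C, A, A, C, B, D prints the same four rows as the table above.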

I started working on this, but it is wrong. It is just an idea, and it does not give me separate gain and lost values based on the above conditions:

    DROP TABLE IF EXISTS GetTotalGainAndLost;

    CREATE TEMPORARY TABLE IF NOT EXISTS GetTotalGainAndLost AS (
        SELECT SUM(count) AS 'TotalGainAndLost', `date`, DAY(`date`) AS 'DAY'
        FROM (
            SELECT COUNT(*) AS 'count', customer, `date`
            FROM (
                SELECT customer, company, COUNT(*) AS 'count',
                       DATE_FORMAT(`call_end`, '%Y-%m-%d') AS 'date'
                FROM calls
                WHERE `call_end` LIKE CONCAT(2014, '-', RIGHT(CAST(CONCAT('0', 01) AS CHAR), 2), '-%')
                GROUP BY customer, company, DAY(`call_end`)
                ORDER BY `call_end` ASC
            ) AS tbl1
            GROUP BY customer, `date`
            HAVING COUNT(*) > 1
        ) AS tbl2
        GROUP BY `date`
    );

    SELECT * FROM GetTotalGainAndLost;

    DROP TABLE GetTotalGainAndLost;

This query does not show any results.

  • The desired result will look like this:

There should be one row per company and date (total gained and lost calls per day, for example, in January):

| company     | totalGain | totalLost | date       | DAY |
|-------------|-----------|-----------|------------|-----|
| 08444599175 | 17        | 6         | 2014-07-01 | 1   |
| 08444599175 | 12        | 10        | 2014-07-02 | 2   |
| 08444599175 | 3         | 6         | 2014-07-03 | 3   |
| 08444599175 | ...       | ...       | ...        | ... |
| 08444599175 | 7         | 6         | 2014-07-31 | 31  |
+5
4 answers

Simplification

Let N be the number of times a company appears. We can simplify the logic to three rules:

  • The first company to appear has N - 1 gains and N losses.
  • A company in the middle has N gains and N losses.
  • The last company to appear has N gains and N - 1 losses.

Testing

In your example:

  • Company A is the first company and appears 3 times.
  • Company B appears 3 times.
  • Company C appears 2 times.
  • Company D is the last company and appears 1 time.

Result

    Company   Gain   Lost
    A         2      3
    B         3      3
    C         2      2
    D         1      0
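The three rules can be sketched in Python to reproduce this table. This is only an illustration under the assumption of a single customer and a single day; `gain_lost_by_counts` is a made-up name. Note that the rules count every appearance, so the repeat calls B, B and A, A are counted as well. That is why A and B come out one higher here (2/3 and 3/3) than in the question's expected totals (1/2 and 2/2).

```python
from collections import Counter

def gain_lost_by_counts(companies):
    """Shortcut rules: gain = N and lost = N for every company, minus one
    gain for the first company and minus one loss for the last company."""
    n = Counter(companies)
    gain, lost = dict(n), dict(n)
    gain[companies[0]] -= 1    # the first call cannot be a gain
    lost[companies[-1]] -= 1   # the last call cannot be a loss
    return gain, lost

gain, lost = gain_lost_by_counts(list("ABBCAACBD"))
for company in sorted(gain):
    print(company, gain[company], lost[company])
```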

Translate to SQL

First, we count the number of calls to each company per day.

    SELECT company, COUNT(*) AS gain, COUNT(*) AS lost, DATE(call_start) AS date
    FROM calls
    GROUP BY DATE(call_start), company

Then we count how many times each company is the first one called by a customer on a given day.

    SELECT company, -COUNT(*) AS gain, 0 AS lost, DATE(call_start) AS `date`
    FROM calls
    INNER JOIN (
        SELECT MIN(call_id) AS call_id
        FROM calls
        GROUP BY DATE(call_start), customer
    ) AS t ON (calls.call_id = t.call_id)
    GROUP BY DATE(call_start), calls.company

Likewise, we count how many times each company is the last one called by a customer on a given day.

    SELECT company, 0 AS gain, -COUNT(*) AS lost, DATE(call_start) AS `date`
    FROM calls
    INNER JOIN (
        -- No space between MAX and "(": MySQL rejects "MAX (" by default
        SELECT MAX(call_id) AS call_id
        FROM calls
        GROUP BY DATE(call_start), customer
    ) AS t ON (calls.call_id = t.call_id)
    GROUP BY DATE(call_start), calls.company

Merging the SQL

Finally, we can merge all the queries with UNION ALL and then group once more.

    SELECT company, SUM(gain) AS gain, SUM(lost) AS lost, `date`
    FROM (
        (
            SELECT company, COUNT(*) AS gain, COUNT(*) AS lost, DATE(call_start) AS `date`
            FROM calls
            GROUP BY DATE(call_start), company
        )
        UNION ALL
        (
            SELECT company, -COUNT(*) AS gain, 0 AS lost, DATE(call_start) AS `date`
            FROM calls
            INNER JOIN (
                SELECT MIN(call_id) AS call_id
                FROM calls
                GROUP BY DATE(call_start), customer
            ) AS t ON (calls.call_id = t.call_id)
            GROUP BY DATE(call_start), calls.company
        )
        UNION ALL
        (
            SELECT company, 0 AS gain, -COUNT(*) AS lost, DATE(call_start) AS `date`
            FROM calls
            INNER JOIN (
                SELECT MAX(call_id) AS call_id
                FROM calls
                GROUP BY DATE(call_start), customer
            ) AS t ON (calls.call_id = t.call_id)
            GROUP BY DATE(call_start), calls.company
        )
    ) AS t
    GROUP BY `date`, company

Explanation

The above query assumes that each day is independent. For instance:

  • Customer calls company A (Day 1)
  • Customer calls company B (Day 1): B gains 1, A loses 1
  • Customer calls company C (Day 1): C gains 1, B loses 1
  • Customer calls company D (Day 2)
  • Customer calls company E (Day 2): E gains 1, D loses 1

Result will be

    COM  G  L  DAY
    ----------------
    A    0  1  1
    B    1  1  1
    C    1  0  1
    D    0  1  2
    E    1  0  2
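The per-day behaviour can be imitated in a short Python sketch that follows the same three steps as the merged query: count every call, subtract one gain for each customer's first call of the day, and subtract one loss for the last. The rows and the customer name `cust` are hypothetical. The rows are assumed to be sorted by day, customer and call order, which `itertools.groupby` needs.

```python
from collections import Counter
from itertools import groupby

# Hypothetical sample rows: (day, customer, company), already in call order.
calls = [
    (1, "cust", "A"), (1, "cust", "B"), (1, "cust", "C"),
    (2, "cust", "D"), (2, "cust", "E"),
]

gain, lost = Counter(), Counter()
for (day, _customer), group in groupby(calls, key=lambda c: (c[0], c[1])):
    companies = [company for _, _, company in group]
    for company in companies:            # step 1: count every call
        gain[(day, company)] += 1
        lost[(day, company)] += 1
    gain[(day, companies[0])] -= 1       # step 2: first call of the day is no gain
    lost[(day, companies[-1])] -= 1      # step 3: last call of the day is no loss

for day, company in sorted(gain):
    print(company, gain[(day, company)], lost[(day, company)], day)
```

Because the grouping key includes the day, the call to D on day 2 starts a fresh sequence and C does not lose anything.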
+5

This should work:

CTEGains counts how many times a company appears for each customer on a given date.

CTEFirst finds out whether the company was the first contact for the customer that day.

CTELast finds out whether the company was the last contact for the customer that day.

The final query then follows the logic you specified.

    -- Note: Convert(date, ...), [bracketed] aliases and DatePart are SQL Server
    -- syntax; they need porting before this runs on MySQL.
    CREATE TEMPORARY TABLE CTEGains (RNo int, customer varchar(14), company varchar(16), startdate date, gains int);
    CREATE TEMPORARY TABLE CTEFirst (customer varchar(14), call_start date, company varchar(16));
    CREATE TEMPORARY TABLE CTELast (customer varchar(14), call_start date, company varchar(16));

    Insert into CTEGains
    Select ROW_NUMBER() over (partition by customer order by Customer) Rno,
           customer, company, Convert(date, call_start) startdate, count(company) gains
    from calls
    group by customer, company, Convert(date, call_start), call_start;

    Insert into CTEFirst
    Select customer, min(Convert(date, call_start)) call_start, min(company) company
    from calls
    group by customer, Convert(date, call_start);

    Insert into CTELast
    Select customer, max(Convert(date, call_start)) call_start, max(company) company
    from calls
    group by customer, Convert(date, call_start);

    Select c1.company,
           SUM(gains)
           - case when exists (Select * from CTEGains c2
                               where c2.customer = max(c1.customer)
                                 and max(c1.Rno) = c2.Rno - 1
                                 and c1.company = c2.company
                                 and c1.startdate = c2.startdate)
                  then 1 else 0 end  --Didn't gain as same company called
           - case when exists (select * from CTEFirst c2
                               where c2.company = c1.company
                                 and c2.call_start = c1.startdate)
                  then 1 else 0 end TotalGain  -- Didn't gain as first company
           , SUM(gains)
           - case when exists (Select * from CTEGains c2
                               where c2.customer = max(c1.customer)
                                 and max(c1.Rno) = c2.Rno - 1
                                 and c1.company = c2.company
                                 and c1.startdate = c2.startdate)
                  then 1 else 0 end  --Didn't lose as same company as last called
           - case when exists (select * from CTELast c2
                               where c2.company = c1.company
                                 and c2.call_start = c1.startdate)
                  then 1 else 0 end TotalLost  -- didn't lose as last company
           , startdate [date]
           , DatePart(DAY, startdate) [Day]
    from CTEGains c1
    group by c1.company, c1.startdate;

    Drop Table CTEFirst;
    Drop Table CTEGains;
    Drop Table CTELast;
+3

I think the easiest way to do this is with two queries. First, we can get the total gain by counting each call a customer makes to a different company than before:

    select g.company company, count(g.call_id) gain
    from calls c
    join calls g
      on c.customer = g.customer
     and c.company <> g.company
     and c.call_start < g.call_start
    left join calls m
      on g.customer = m.customer
     and g.company <> m.company
     and g.call_start > m.call_start
     and m.call_start > c.call_start
    where m.call_id is null
    group by g.company;

The left join is needed so that no extra gain is counted when the customer calls several different companies in a row (i.e. if the customer calls companies a, b and c in that order, company c gets only one gain, not two).

The total lost is computed with the same approach:

    select l.company company, count(l.call_id) lost
    from calls c
    join calls l
      on c.customer = l.customer
     and c.company <> l.company
     and c.call_start > l.call_start
    left join calls m
      on l.customer = m.customer
     and l.company <> m.company
     and c.call_start > m.call_start
     and l.call_start < m.call_start
    where m.call_id is null
    group by l.company;

Here is a small fiddle demonstrating the solution: http://sqlfiddle.com/#!2/3236ab/7

+3

First, a few definitions:

  • Non-first call: any call that is not the first call of the customer who made it.
  • Non-last call: any call that is not the last call of the customer who made it.

We introduced the concepts of first and last, which means we need to define a total order on our set of calls. We can follow any rule we want, but for the purpose of this explanation I assume that calls are ordered by start time and, for equal start times, by id. In other words:

  • If callA.startTime < callB.startTime, then callA < callB
  • If callA.startTime = callB.startTime and callA.id < callB.id, then callA < callB
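This ordering by start time and then by id is a plain lexicographic comparison, which can be sketched in Python with a tuple sort key. The sample calls and the helper name `call_order` are hypothetical.

```python
from datetime import datetime

# Hypothetical calls as (call_id, start_time) pairs, in arbitrary order.
calls = [
    (3, datetime(2014, 7, 31, 14, 5, 10)),
    (2, datetime(2014, 7, 31, 13, 55, 3)),
    (1, datetime(2014, 7, 31, 14, 5, 10)),
]

def call_order(call):
    """Sort key: start time first, call id as the tie-breaker."""
    call_id, start_time = call
    return (start_time, call_id)

ordered = sorted(calls, key=call_order)
ids = [call_id for call_id, _ in ordered]
print(ids)  # [2, 1, 3]
```

Call 2 starts earliest, and calls 1 and 3 share a start time, so the id decides between them.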

Note that we can get all the non-first calls of our set with the following query:

    SELECT *
    FROM calls AS non_first_calls
    RIGHT JOIN calls
      ON non_first_calls.customer = calls.customer
     AND non_first_calls.call_start >= calls.call_start
     AND non_first_calls.call_id > calls.call_id
    WHERE non_first_calls.call_id IS NOT NULL

(The query output has duplicates, that is, a call can appear more than once.)

Similarly, we can get all the non-last calls as follows:

    SELECT *
    FROM calls AS non_last_calls
    RIGHT JOIN calls
      ON non_last_calls.customer = calls.customer
     AND non_last_calls.call_start <= calls.call_start
     AND non_last_calls.call_id < calls.call_id
    WHERE non_last_calls.call_id IS NOT NULL

Business logic

A company gets +1 gain every time a customer calls it after having made another call. This means that, for any given company, its gain equals the number of non-first calls it received. Likewise, a company's loss equals the number of non-last calls it received.

The mighty query

Thus, we only need to count, for each company, how many non-first calls and how many non-last calls it received.

The "for each company" part means that we need a complete list of companies. We can get it with this query:

 SELECT DISTINCT company FROM calls 

Putting it all together:

    SELECT
        -- The company
        companies.company
        -- How many non-first calls (gains) it has received
        ,(SELECT COUNT(DISTINCT non_first_calls.call_id) gains
          FROM calls AS non_first_calls
          RIGHT JOIN calls
            ON non_first_calls.customer = calls.customer
           AND non_first_calls.call_start >= calls.call_start
           AND non_first_calls.call_id > calls.call_id
          WHERE non_first_calls.company = companies.company
         ) gains
        -- How many non-last calls (losses) it has received
        ,(SELECT COUNT(DISTINCT non_last_calls.call_id) losses
          FROM calls AS non_last_calls
          RIGHT JOIN calls
            ON non_last_calls.customer = calls.customer
           AND non_last_calls.call_start <= calls.call_start
           AND non_last_calls.call_id < calls.call_id
          WHERE non_last_calls.company = companies.company
         ) losses
    -- From the set of all companies
    FROM (SELECT DISTINCT company FROM calls) companies

Performance

I am not sure that the performance of this query will be acceptable on a large amount of data.

At a minimum you need a composite index on (customer, call_start) (in that order) and another index on (company). This is the result I got after running EXPLAIN on this query with the mentioned indexes and the provided sample data:

Output of EXPLAIN

+2

Source: https://habr.com/ru/post/1211511/

