SQL find duplicates and assign group number

The situation
On Microsoft SQL Server 2008, I have a table with about 2 million rows (this should never have happened, but we inherited the situation). A sample follows:

usernum  |  phone  |  email
1        |  123    |  user1@local.com
2        |  123    |  user2@local.com
3        |  245    |  user3@local.com
4        |  678    |  user3@local.com

Goal
I would like to create a table that looks like this. The idea is that rows sharing the same “phone” or the same “email” are assigned the same group number.

groupnum | usernum |  phone  |  email
1        |  1      |  123    |  user1@local.com
1        |  2      |  123    |  user2@local.com
2        |  3      |  245    |  user3@local.com
2        |  4      |  678    |  user3@local.com
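
One way to read the goal is as a connected-components problem: every usernum, phone and email is a node, each row links its usernum to its phone and to its email, and a group is everything reachable through those links. The sketch below uses a small union-find for that. It is only an illustration under assumptions: the function name assign_groups is made up, rows are assumed to already be in memory as (usernum, phone, email) tuples, NULL values are assumed never to link rows, and group numbers are handed out in order of first appearance.

```python
def assign_groups(rows):
    """rows: iterable of (usernum, phone, email). Returns {usernum: groupnum}."""
    parent = {}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    rows = list(rows)
    for usernum, phone, email in rows:
        user = ("user", usernum)
        parent.setdefault(user, user)
        # Keys are tagged so a phone "123" never collides with an email "123".
        for kind, value in (("phone", phone), ("email", email)):
            if value is None:  # assumption: NULLs do not link rows together
                continue
            node = (kind, value)
            parent.setdefault(node, node)
            union(user, node)

    # Number the groups in order of first appearance.
    group_of_root = {}
    result = {}
    for usernum, _, _ in rows:
        root = find(("user", usernum))
        result[usernum] = group_of_root.setdefault(root, len(group_of_root) + 1)
    return result


sample = [
    (1, "123", "user1@local.com"),
    (2, "123", "user2@local.com"),
    (3, "245", "user3@local.com"),
    (4, "678", "user3@local.com"),
]
print(assign_groups(sample))  # {1: 1, 2: 1, 3: 2, 4: 2}
```

Because links are followed transitively, a later row that shares a phone with one group and an email with another merges the two groups into one. Whether that is the desired behaviour depends on how "the same" is meant to chain.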


python script, :
- usernum
-
- ,

- , usernum ( - )


Python script , . , , 10 000 , 2 . , t-sql, , python script, pyodbc.
, , sql.


, , . , , , . , , ( ), ( ):

-- one group per distinct (phone, email) pair;
-- assumes yourGroupsTable.groupNum is an identity column
insert into yourGroupsTable (phone, email)
select distinct phone, email
from yourUserTable;

-- assign group numbers to rows that match on phone AND email
update u
set groupNum = g.groupNum
from yourUserTable u
join yourGroupsTable g on u.phone = g.phone
    and u.email = g.email;

, , GroupsTable - . , , ( ) - :

:

groupnum | usernum |  phone  |  email
1        |  1      |  123    |  user1@local.com
1        |  2      |  123    |  user2@local.com
?        |  3      |  245    |  user3@local.com
?        |  4      |  678    |  user3@local.com
?        |  5      |  245    |  user7@local.com
?        |  6      |  678    |  user7@local.com  

?


python script, ... mysql, ,

  THEN groupnum groupnum ...    groupnum

,

,

5 | 678 | user1@local.com

?

, [ ] , groupnum.

, mysql...
