What is wrong with this self-join on a table?

Question

I have a table called TempAllAddresses with the following columns: ID, Address, State. I want to populate a new table with Address, State and Count. Count should represent how many records in the TempAllAddresses table have an address that matches this address followed by a wildcard. If that doesn't make sense, here is an example to illustrate. Let's say I have an entry like this:

    ID     Address     State
    12345  13 Phoenix  NY

What I want to do is insert a new record into a new table called AddressCount, which has 13 Phoenix for Address, NY for State, and, for Count, the number of records in the table that have NY as State and an Address LIKE '13 Phoenix%'.

I want to accomplish this with an inner join of TempAllAddresses on itself. This is what I tried, but it doesn't seem to accomplish what I'm looking for:

    SELECT t1.Address, t1.State, COUNT(t2.address) AS NumEntities
    FROM TempAllAddresses t1
    INNER JOIN TempAllAddresses t2
        ON t1.state = t2.state
        AND t2.Address LIKE t1.address + '%'
    GROUP BY t1.State, t1.Address

The count is definitely off. It should be equivalent to running "SELECT COUNT(*) FROM TempAllAddresses WHERE State=thisRecordsState and Address LIKE thisRecordsAddress + '%'". How can I do this? What am I doing wrong?

Edit:

The count seems to be off as follows: if I have a record like the one mentioned above, and then two more records that also have NY as the state and the addresses "13 Phoenix Road" and "13 Phoenix Rd", then I want to get this entry in the final result:

 13 Phoenix NY 3

Instead, I seem to get:

 13 Phoenix NY 9

I'm not quite sure what's going on here ... some kind of Cartesian product? Permutations ...? Can anyone explain this?

Edit 2: Another edit, as I seem to be misunderstood (and really need a solution :( ) ... Here is a query with a correlated subquery that accomplishes what I'm looking for. I would like to do the same thing with an inner join of the table to itself, not a subquery.

    SELECT Address, State,
        (SELECT COUNT(*)
         FROM TempAllAddresses innerQry
         WHERE innerQry.address LIKE outerQry.address + '%'
           AND innerQry.state = outerQry.state) AS NumEntities
    FROM TempAllAddresses outerQry

Basically, for each record I want to get the number of records in the table that have the same state and an address that starts with (or is equal to) this record's address. I want the record's own address to be included in the count.
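For reference, here is a minimal sketch of that correlated subquery run against a small sample. It assumes SQLite via Python's `sqlite3` module rather than the original database, so `||` replaces `+` for string concatenation, and SQLite's `LIKE` is case-insensitive for ASCII by default. The helper name `count_prefix_matches` and the sample rows are made up for illustration.

```python
import sqlite3


def count_prefix_matches(addresses):
    """Return (Address, State, NumEntities) for every source row, in insert order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TempAllAddresses ("
        "ID INTEGER PRIMARY KEY, Address TEXT NOT NULL, State TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO TempAllAddresses (Address, State) VALUES (?, ?)", addresses
    )
    # Correlated subquery: one count per outer row, the row itself included.
    rows = conn.execute(
        """
        SELECT Address, State,
               (SELECT COUNT(*)
                FROM TempAllAddresses innerQry
                WHERE innerQry.Address LIKE outerQry.Address || '%'
                  AND innerQry.State = outerQry.State) AS NumEntities
        FROM TempAllAddresses outerQry
        ORDER BY outerQry.ID
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample: three NY variants plus the same street in another state.
sample = [
    ("13 Phoenix", "NY"),
    ("13 Phoenix Road", "NY"),
    ("13 Phoenix Rd", "NY"),
    ("13 Phoenix", "NJ"),
]

for row in count_prefix_matches(sample):
    print(row)
```

On this sample the NY row for "13 Phoenix" gets a count of 3, and every other row gets 1, which is the behaviour the join version should reproduce.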

+4

sql join self-join

froadie Dec 13 '10 at 9:35

7 answers

Jeremy pridemore · Answer 1 · 2010-12-31T22:18:58+0000

Here are two solutions: one uses CROSS APPLY and the other uses INNER JOIN, as you originally wanted. Hope this helps. :)

    -- Sample data
    DECLARE @TempAllAddresses TABLE
    (
        ID INT PRIMARY KEY IDENTITY(1, 1) NOT NULL
        , [Address] VARCHAR(250) NOT NULL
        , [State] CHAR(2) NOT NULL
    )

    INSERT INTO @TempAllAddresses
    VALUES ('13 Phoenix', 'NY')
        , ('13 Phoenix St', 'NY')
        , ('13 Phoenix Street', 'NY')
        , ('1845 Test', 'TN')
        , ('1337 Street', 'WA')
        , ('1845 T', 'TN')

    -- Solution 1: CROSS APPLY
    SELECT TempAddresses.ID
        , TempAddresses.[Address]
        , TempAddresses.[State]
        , TempAddressesCounted.AddressCount
    FROM @TempAllAddresses TempAddresses
    CROSS APPLY
    (
        SELECT COUNT(*) AS AddressCount
        FROM @TempAllAddresses TempAddressesApply
        WHERE TempAddressesApply.[Address] LIKE (TempAddresses.[Address] + '%')
            AND TempAddressesApply.[State] = TempAddresses.[State]
    ) TempAddressesCounted

    -- Solution 2: INNER JOIN, grouped by ID so each source row keeps its own count
    SELECT TempAddresses.ID
        , TempAddresses.[Address]
        , TempAddresses.[State]
        , COUNT(*) AS AddressCount
    FROM @TempAllAddresses TempAddresses
    INNER JOIN @TempAllAddresses TempAddressesJoin
        ON TempAddressesJoin.[Address] LIKE (TempAddresses.[Address] + '%')
        AND TempAddressesJoin.[State] = TempAddresses.[State]
    GROUP BY TempAddresses.ID
        , TempAddresses.[Address]
        , TempAddresses.[State]

Guillem vicens · Answer 2 · 2010-12-13T09:40:15+0000

Try this instead:

    SELECT Orig_Address, State, COUNT(Similar_Address)
    FROM (
        SELECT t1.Address Orig_Address, t1.State State, t2.address Similar_Address
        FROM TempAllAddresses t1
        INNER JOIN TempAllAddresses t2
            ON t1.state = t2.state
            AND t2.Address LIKE t1.address + '%'
            AND t1.address <> t2.address
    ) AS Pairs  -- alias added: SQL Server requires one on a derived table
    GROUP BY State, Orig_Address

EDIT: I forgot to include the inequality between t1.address and t2.address, as @Spiny Norman said, since you probably don't want to compare an address with itself.

HTH

Spiny norman · Answer 3 · 2010-12-13T09:57:29+0000

EDIT: [snip old stuff]

Try the following:

    SELECT t1.Address, t1.State, COUNT(DISTINCT t2.id) AS NumEntities
    FROM TempAllAddresses t1
    INNER JOIN TempAllAddresses t2
        ON t1.state = t2.state
        AND t2.Address LIKE t1.address + '%'
    GROUP BY t1.State, t1.Address

lijie · Answer 4 · 2010-12-13T10:55:45+0000

QUERY A:

    SELECT t1.Address, t1.State, COUNT(t2.address) AS NumEntities
    FROM TempAllAddresses t1
    INNER JOIN TempAllAddresses t2
        ON t1.state = t2.state
        AND t2.Address LIKE t1.address + '%'
    GROUP BY t1.State, t1.Address

is not equivalent to

QUERY B:

    SELECT Address, State,
        (SELECT COUNT(*)
         FROM TempAllAddresses innerQry
         WHERE innerQry.address LIKE outerQry.address + '%'
           AND innerQry.state = outerQry.state) AS NumEntities
    FROM TempAllAddresses outerQry

because B produces one row for each row in the source table (TempAllAddresses), while A merges source rows that have the same state and address into a single group, so their join matches are added together. To fix this, GROUP BY t1.ID, t1.State, t1.Address.
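The difference can be sketched on a tiny data set. This assumes SQLite through Python's `sqlite3` (so `||` stands in for `+`), and the three identical rows are only one hypothetical input that yields the 9 reported in the question; the asker's real data is not known. The function name `compare_groupings` is made up for this sketch.

```python
import sqlite3


def compare_groupings(addresses):
    """Run the self-join grouped by (State, Address) and grouped by ID as well."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TempAllAddresses ("
        "ID INTEGER PRIMARY KEY, Address TEXT NOT NULL, State TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO TempAllAddresses (Address, State) VALUES (?, ?)", addresses
    )
    join = """
        SELECT t1.Address, t1.State, COUNT(t2.Address) AS NumEntities
        FROM TempAllAddresses t1
        INNER JOIN TempAllAddresses t2
            ON t1.State = t2.State
            AND t2.Address LIKE t1.Address || '%'
    """
    # Query A: duplicate source rows collapse into one group and their matches add up.
    by_address = conn.execute(
        join + " GROUP BY t1.State, t1.Address ORDER BY t1.Address"
    ).fetchall()
    # Suggested fix: one group per source row.
    by_id = conn.execute(
        join + " GROUP BY t1.ID, t1.State, t1.Address ORDER BY t1.ID"
    ).fetchall()
    conn.close()
    return by_address, by_id


# Hypothetical input: the same address stored three times.
duplicates = [("13 Phoenix", "NY")] * 3

by_address, by_id = compare_groupings(duplicates)
print(by_address)  # one merged row: 3 source rows x 3 matches each
print(by_id)       # three rows, each with its own count
```

Each of the three rows matches all three rows, so grouping by address alone adds 3 + 3 + 3, while grouping by ID keeps three separate counts of 3.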

user533832 · Answer 5 · 2010-12-13T13:50:43+0000

The count is inflated when several rows have exactly the same address, because each duplicate is counted once per copy.

Try:

    SELECT t1.Address, t1.State, COUNT(t2.address) AS NumEntities
    FROM (SELECT DISTINCT Address, State FROM TempAllAddresses) t1
    INNER JOIN TempAllAddresses t2
        ON t1.state = t2.state
        AND t2.Address LIKE t1.address + '%'
    GROUP BY t1.State, t1.Address
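A quick way to check this is to run it on a sample that contains a duplicate. The sketch below assumes SQLite via Python's `sqlite3` (hence `||` instead of `+`); the function name `count_from_distinct` and the sample rows are invented for illustration.

```python
import sqlite3


def count_from_distinct(addresses):
    """Self-join whose left side is de-duplicated before counting matches."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TempAllAddresses ("
        "ID INTEGER PRIMARY KEY, Address TEXT NOT NULL, State TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO TempAllAddresses (Address, State) VALUES (?, ?)", addresses
    )
    rows = conn.execute(
        """
        SELECT t1.Address, t1.State, COUNT(t2.Address) AS NumEntities
        FROM (SELECT DISTINCT Address, State FROM TempAllAddresses) t1
        INNER JOIN TempAllAddresses t2
            ON t1.State = t2.State
            AND t2.Address LIKE t1.Address || '%'
        GROUP BY t1.State, t1.Address
        ORDER BY t1.Address
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample with "13 Phoenix" stored twice.
sample_with_duplicate = [
    ("13 Phoenix", "NY"),
    ("13 Phoenix", "NY"),
    ("13 Phoenix Road", "NY"),
    ("13 Phoenix Rd", "NY"),
]

for row in count_from_distinct(sample_with_duplicate):
    print(row)
```

Here "13 Phoenix" is counted as 4 (all four stored rows start with it) instead of the 8 that the plain self-join grouped by address would report. Note that the output has one row per distinct address, not one per source row.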

kevpie · Answer 6 · 2010-12-14T10:11:27+0000

Nested GROUP BY:

The subquery finds the shortest matching address for each distinct address.
This does not take case sensitivity into account.
Then the distinct addresses that fall under each shortest address are counted.

SQL:

    SELECT Address, State, COUNT(1) AS NumEntities
    FROM (
        SELECT MIN(t1.Address) AS Address, t1.State
        FROM TempAllAddresses t1
        INNER JOIN TempAllAddresses t2
            ON t1.state = t2.state
            AND t2.Address LIKE t1.address + '%'
        GROUP BY t1.State, t2.Address
    ) AS Shortest  -- alias added: SQL Server requires one on a derived table
    GROUP BY State, Address
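To see what this nested grouping returns, here is a small sketch, assuming SQLite via Python's `sqlite3` (with `||` in place of `+`, and SQLite's default binary ordering for `MIN`). The function name `count_by_shortest_prefix` and the sample rows are hypothetical.

```python
import sqlite3


def count_by_shortest_prefix(addresses):
    """Nested GROUP BY: map each distinct address to its shortest prefix, then count."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TempAllAddresses ("
        "ID INTEGER PRIMARY KEY, Address TEXT NOT NULL, State TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO TempAllAddresses (Address, State) VALUES (?, ?)", addresses
    )
    rows = conn.execute(
        """
        SELECT Address, State, COUNT(1) AS NumEntities
        FROM (
            -- One row per distinct t2 address, labelled with its smallest prefix.
            SELECT MIN(t1.Address) AS Address, t1.State AS State
            FROM TempAllAddresses t1
            INNER JOIN TempAllAddresses t2
                ON t1.State = t2.State
                AND t2.Address LIKE t1.Address || '%'
            GROUP BY t1.State, t2.Address
        ) AS Shortest
        GROUP BY State, Address
        ORDER BY State, Address
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample: a duplicated NY address and two TN variants.
nested_sample = [
    ("13 Phoenix", "NY"),
    ("13 Phoenix", "NY"),
    ("13 Phoenix Road", "NY"),
    ("13 Phoenix Rd", "NY"),
    ("1845 Test", "TN"),
    ("1845 T", "TN"),
]

for row in count_by_shortest_prefix(nested_sample):
    print(row)
```

On this sample only the shortest addresses appear in the output ("13 Phoenix" with 3 and "1845 T" with 2), and the duplicated "13 Phoenix" row is counted once, so the result counts distinct addresses rather than stored rows.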

Rapidcoder · Answer 7 · 2010-12-31T15:39:50+0000

Have you tried analytic functions? They are often the easiest solution. I am not familiar with your table structure, but it should be something like this:

    -- GROUP BY dropped: a window function cannot use ungrouped t2 columns
    SELECT DISTINCT
        t1.Address,
        t1.State,
        COUNT(t2.Address) OVER (PARTITION BY t1.State, t1.Address) AS NumEntities
    FROM TempAllAddresses t1
    INNER JOIN TempAllAddresses t2
        ON t1.state = t2.state
        AND t2.Address LIKE t1.address + '%'

You can even add ORDER BY to the OVER clause. See Oracle FAQ for an explanation.
