SQL query that yields different results across multiple columns

Sorry, I could not come up with a better title for my problem, as I am completely new to SQL. I am looking for a SQL query that solves the problem below.

Suppose the following table:

  DOCUMENT_ID |  TAG
 ----------------------------
    1 |  tag1
    1 |  tag2
    1 |  tag3
    2 |  tag2
    3 |  tag1
    3 |  tag2
    4 |  tag1
    5 |  tag3

Now I want to select every distinct document ID that has one or more specified tags, where a document only matches if it has all of the specified tags. For example, selecting all document_id values with tag1 and tag2 should return 1 and 3 (but not 4, for example, since it does not have tag2).

What would be the best way to do this?

Regards, Kai

+4
4 answers
    SELECT document_id
    FROM table
    WHERE tag = 'tag1' OR tag = 'tag2'
    GROUP BY document_id
    HAVING COUNT(DISTINCT tag) = 2

Edit:

Updated to account for the lack of constraints ...

+14

This assumes that (DocumentID, Tag) together form the primary key.

Edit: Changed the HAVING clause to count DISTINCT tags, so it no longer matters what the primary key is.

Test Data

    -- Populate Test Data
    CREATE TABLE #table (
        DocumentID varchar(8) NOT NULL,
        Tag varchar(8) NOT NULL
    )

    INSERT INTO #table VALUES ('1','tag1')
    INSERT INTO #table VALUES ('1','tag2')
    INSERT INTO #table VALUES ('1','tag3')
    INSERT INTO #table VALUES ('2','tag2')
    INSERT INTO #table VALUES ('3','tag1')
    INSERT INTO #table VALUES ('3','tag2')
    INSERT INTO #table VALUES ('4','tag1')
    INSERT INTO #table VALUES ('5','tag3')
    INSERT INTO #table VALUES ('3','tag2') -- Edit: test duplicate tags

Query

    -- Return Results
    SELECT DocumentID
    FROM #table
    WHERE Tag IN ('tag1','tag2')
    GROUP BY DocumentID
    HAVING COUNT(DISTINCT Tag) = 2

Results

    DocumentID
    ----------
    1
    3
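
To see why counting DISTINCT tags matters, here is a small sketch that replays the test data above, including the duplicate ('3','tag2') row, and compares the two HAVING clauses. It is only an illustration under some assumptions: it uses Python's sqlite3 module instead of SQL Server, and the table name doc_tags and the helper matching_docs are made up for this example.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the #table temp table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_tags (DocumentID TEXT NOT NULL, Tag TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO doc_tags VALUES (?, ?)",
    [
        ("1", "tag1"), ("1", "tag2"), ("1", "tag3"),
        ("2", "tag2"),
        ("3", "tag1"), ("3", "tag2"),
        ("4", "tag1"),
        ("5", "tag3"),
        ("3", "tag2"),  # duplicate row, as in the test data above
    ],
)


def matching_docs(having_clause):
    """Run the tag1/tag2 query with the given HAVING condition (hypothetical helper)."""
    sql = (
        "SELECT DocumentID FROM doc_tags "
        "WHERE Tag IN ('tag1', 'tag2') "
        "GROUP BY DocumentID "
        f"HAVING {having_clause} "
        "ORDER BY DocumentID"
    )
    return [row[0] for row in conn.execute(sql)]


# COUNT(*) counts rows, so document 3 (three matching rows) is wrongly dropped.
with_count_star = matching_docs("COUNT(*) = 2")
# COUNT(DISTINCT Tag) counts different tags, so the duplicate row is harmless.
with_count_distinct = matching_docs("COUNT(DISTINCT Tag) = 2")

print(with_count_star)      # ['1']
print(with_count_distinct)  # ['1', '3']
```

With a primary key on (DocumentID, Tag) the duplicate row could not exist and both variants would agree.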
+7
    select DOCUMENT_ID
    from table
    where TAG in ('tag1', 'tag2', ... 'tagN')
    group by DOCUMENT_ID
    having count(distinct TAG) = N

Adjust N and tag list as necessary.

+1
    Select document_id
    from {TABLE}
    where tag in ('tag1','tag2')
    group by document_id
    having count(distinct tag) >= 2

How you generate a list of tags in a where clause depends on the structure of your application. If you dynamically generate a query as part of your code, you can simply build the query as a large, dynamically generated string.
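
As a rough sketch of the "build the query in code" approach, the snippet below generates one placeholder per tag instead of pasting tag values into the string, which avoids SQL injection. Everything named here is an assumption for illustration: Python with sqlite3, a table called document_tags, and a helper called docs_with_all_tags.

```python
import sqlite3


def docs_with_all_tags(conn, tags):
    """Return the IDs of documents that carry every tag in `tags` (hypothetical helper)."""
    wanted = sorted(set(tags))  # drop duplicates so the count below is the distinct count
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)  # one "?" per requested tag
    sql = (
        "SELECT document_id FROM document_tags "
        f"WHERE tag IN ({placeholders}) "
        "GROUP BY document_id "
        "HAVING COUNT(DISTINCT tag) = ? "
        "ORDER BY document_id"
    )
    # The tag values and the required count are bound as parameters.
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


# Assumption: an in-memory table holding the rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document_tags (document_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO document_tags VALUES (?, ?)",
    [(1, "tag1"), (1, "tag2"), (1, "tag3"), (2, "tag2"),
     (3, "tag1"), (3, "tag2"), (4, "tag1"), (5, "tag3")],
)

print(docs_with_all_tags(conn, ["tag1", "tag2"]))  # [1, 3]
```

Only the number of placeholders is generated dynamically; the tag values themselves never become part of the SQL text.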

We have always used stored procedures to query data. In this case, we pass the tag list in as an XML document. Such a procedure might look something like one of the following, where the input argument would be:

    <tags>
      <tag>tag1</tag>
      <tag>tag2</tag>
    </tags>

    CREATE PROCEDURE [dbo].[GetDocumentIdsByTag]
        @tagList xml
    AS
    BEGIN
        declare @tagCount int

        -- Number of distinct tags in the input list
        select @tagCount = count(distinct tags.value('.','varchar(20)'))
        from @tagList.nodes('tags/tag') R(tags)

        -- Documents that have every requested tag
        SELECT documentid
        FROM {TABLE}
        JOIN @tagList.nodes('tags/tag') R(tags)
            ON {TABLE}.tag = tags.value('.','varchar(20)')
        group by documentid
        having count(distinct tag) >= @tagCount
    END

OR

    CREATE PROCEDURE [dbo].[GetDocumentIdsByTag]
        @tagList xml
    AS
    BEGIN
        declare @tagCount int

        -- Number of tags in the input list (assumes the list has no duplicates)
        select @tagCount = count(*)
        from @tagList.nodes('tags/tag') R(tags)

        -- Documents that have every requested tag
        SELECT documentid
        FROM {TABLE}
        WHERE tag in (
            SELECT tags.value('.','varchar(20)')
            FROM @tagList.nodes('tags/tag') R(tags)
        )
        group by documentid
        having count(distinct tag) >= @tagCount
    END

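
The same "join against the tag list" idea can be tried outside SQL Server. The sketch below swaps the XML argument for a temporary table of wanted tags, so it is not the procedure above, just the same join-and-count logic. It assumes Python with sqlite3, and the names document_tags, wanted_tags and docs_with_all_tags_via_join are invented for this example.

```python
import sqlite3


def docs_with_all_tags_via_join(conn, tags):
    """Join the documents against a temp table of wanted tags (hypothetical helper)."""
    # The temp table plays the role of the XML tag list; the primary key
    # plus INSERT OR IGNORE removes duplicate tags from the input.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted_tags (tag TEXT PRIMARY KEY)")
    conn.execute("DELETE FROM wanted_tags")
    conn.executemany(
        "INSERT OR IGNORE INTO wanted_tags VALUES (?)", [(t,) for t in tags]
    )
    sql = """
        SELECT d.document_id
        FROM document_tags AS d
        JOIN wanted_tags AS w ON w.tag = d.tag
        GROUP BY d.document_id
        HAVING COUNT(DISTINCT d.tag) = (SELECT COUNT(*) FROM wanted_tags)
        ORDER BY d.document_id
    """
    return [row[0] for row in conn.execute(sql)]


# Assumption: an in-memory table holding the rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document_tags (document_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO document_tags VALUES (?, ?)",
    [(1, "tag1"), (1, "tag2"), (1, "tag3"), (2, "tag2"),
     (3, "tag1"), (3, "tag2"), (4, "tag1"), (5, "tag3")],
)

print(docs_with_all_tags_via_join(conn, ["tag1", "tag2"]))  # [1, 3]
```

Because the required count is read from the tag table itself, the query text stays fixed no matter how many tags are passed in.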

-1
