DISTINCT on only one column with SPARQL

I want to use SPARQL to get a list of Italian cities with a population of more than 100,000, and I am using the following query:

    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?city ?name ?pop
    WHERE {
      ?city a dbo:Settlement .
      ?city foaf:name ?name .
      ?city dbo:populationTotal ?pop .
      ?city dbo:country ?country .
      ?city dbo:country dbpedia:Italy .
      FILTER (?pop > 100000)
    }

In the results I get, for example, two different rows that represent the same resource but with different names:

    http://dbpedia.org/resource/Bologna    "Bologna"@en              384038
    http://dbpedia.org/resource/Bologna    "Comune di Bologna"@en    384038

How can I apply SELECT DISTINCT only to the ?city column while still keeping the other columns in the output?

1 answer

You can use GROUP BY to group by a specific column, and then use the SAMPLE() aggregate to select one of the values from each of the other columns, for example:

    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbpedia: <http://dbpedia.org/resource/>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?city (SAMPLE(?name) AS ?cityName) (SAMPLE(?pop) AS ?cityPop)
    WHERE {
      ?city a dbo:Settlement .
      ?city foaf:name ?name .
      ?city dbo:populationTotal ?pop .
      ?city dbo:country ?country .
      ?city dbo:country dbpedia:Italy .
      FILTER (?pop > 100000)
    }
    GROUP BY ?city

Grouping by ?city gives you exactly one row for each city. However, because the results are grouped by ?city, you cannot directly select variables that are not grouping variables.
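To illustrate the restriction, here is a sketch of the failing case, reusing the patterns from the question. It assumes the usual DBpedia and FOAF namespaces for the dbpedia: and foaf: prefixes. Under SPARQL 1.1 a query of this shape is an error, and a conforming endpoint should reject it:

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Invalid: ?name and ?pop are projected, but they are neither listed
# in GROUP BY nor wrapped in an aggregate.
SELECT ?city ?name ?pop
WHERE {
  ?city a dbo:Settlement .
  ?city foaf:name ?name .
  ?city dbo:populationTotal ?pop .
  ?city dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
GROUP BY ?city
```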

Instead, you should use the SAMPLE() aggregate to select one of the values for each of the non-grouping variables that you want in the final results. This picks one of the ?name values and one of the ?pop values for each city and returns them as ?cityName and ?cityPop respectively.
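Keep in mind that SAMPLE() returns an arbitrary value from each group, so which name you get is not guaranteed. If a reproducible choice matters, one option is to use an order-based aggregate. The following is only a sketch of that variation, not part of the original answer. It assumes the usual DBpedia and FOAF namespaces for the dbpedia: and foaf: prefixes, and it picks the alphabetically first name and the largest population value for each city:

```sparql
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# MIN/MAX give a deterministic pick per group, unlike SAMPLE.
# STR() strips the language tag so the names compare as plain strings.
SELECT ?city (MIN(STR(?name)) AS ?cityName) (MAX(?pop) AS ?cityPop)
WHERE {
  ?city a dbo:Settlement .
  ?city foaf:name ?name .
  ?city dbo:populationTotal ?pop .
  ?city dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
GROUP BY ?city
```

The trade-off is that ?cityName loses its language tag, and "alphabetically first" is not necessarily the most natural name; it is simply a stable one.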


Source: https://habr.com/ru/post/1215146/

