DISTINCT only one value with SPARQL

Question

I want to get a list of Italian cities with a population of more than 100 thousand with SPARQL, and I use the following query:

PREFIX dbo: <http://dbpedia.org/ontology/>

SELECT ?city ?name ?pop
WHERE {
  ?city a dbo:Settlement .
  ?city foaf:name ?name .
  ?city dbo:populationTotal ?pop .
  ?city dbo:country ?country .
  ?city dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}

In the results I get, for example, two different rows that represent the same resource but with different names:

http://dbpedia.org/resource/Bologna  "Bologna"@en            384038
http://dbpedia.org/resource/Bologna  "Comune di Bologna"@en  384038

How can I apply SELECT DISTINCT to the ?city column only, while still returning the other columns in the output?

rdf dbpedia linked-data sparql

drstein Mar 11 '15 at 14:03

1 answer

Robv · Accepted Answer · 2015-03-11T14:22:06+0000

You can use GROUP BY to group by a specific column, and then use the SAMPLE() aggregate to pick one value from each of the other columns. For example:

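PREFIX dbo: <http://dbpedia.org/ontology/>
# foaf: and dbpedia: are typically predefined on the public DBpedia endpoint;
# they are declared here so the query is self-contained.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/resource/>

SELECT ?city (SAMPLE(?name) AS ?cityName) (SAMPLE(?pop) AS ?cityPop)
WHERE {
  ?city a dbo:Settlement .
  ?city foaf:name ?name .
  ?city dbo:populationTotal ?pop .
  ?city dbo:country ?country .
  ?city dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
GROUP BY ?city   # one result row per distinct ?city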

Grouping by ?city gives you exactly one row per city. Because the results are grouped by ?city, you cannot directly select variables that are not grouping variables.

Instead, use the SAMPLE() aggregate to select one of the values for each non-group variable that you want in the final results. It picks one of the ?name values and one of the ?pop values in each group, with no guarantee as to which, and returns them as ?cityName and ?cityPop respectively.
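Since SAMPLE() may return any value from the group, other aggregates can be swapped in when you want a more predictable result. The following is only a sketch of that variation, not part of the original answer: it uses the standard SPARQL 1.1 aggregates GROUP_CONCAT and MAX to keep every name in one row and to report the largest population figure. The output names ?cityNames and ?cityPop and the " | " separator are arbitrary choices, and the prefix declarations assume the usual DBpedia and FOAF namespaces.

```sparql
PREFIX dbo:     <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>

SELECT ?city
       # Join all distinct names of the city into a single string.
       # STR() strips the language tag so the values concatenate as plain strings.
       (GROUP_CONCAT(DISTINCT STR(?name); separator=" | ") AS ?cityNames)
       # If several population values exist, report the largest one.
       (MAX(?pop) AS ?cityPop)
WHERE {
  ?city a dbo:Settlement ;
        foaf:name ?name ;
        dbo:populationTotal ?pop ;
        dbo:country dbpedia:Italy .
  FILTER (?pop > 100000)
}
GROUP BY ?city
```

With this variant, a city such as Bologna should still appear in a single row, with both of its names listed in ?cityNames instead of one being chosen arbitrarily.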
