How to access Wikidata SPARQL interface with Java?

Question

I am trying to request all instances of an object from Wikidata. I found out that currently the only way to do this is to use the SPARQL API.

I found an example query that does what I want, and successfully executed it in the web interface. Unfortunately, I cannot execute it from my Java code. I am using the openRDF SPARQL library. Here is my code:

 SPARQLRepository sparqlRepository = new SPARQLRepository(
         "https://query.wikidata.org/");
 SPARQLConnection sparqlConnection = new SPARQLConnection(
         sparqlRepository);
 String query = "SELECT ?s ?desc ?authorlabel (COUNT(DISTINCT ?sitelink) as ?linkcount) WHERE {"
         + "?s wdt:P31 wd:Q571 ."
         + "?sitelink schema:about ?s ."
         + "?s wdt:P50 ?author"
         + "OPTIONAL { ?s rdfs:label ?desc filter (lang(?desc) = \"en\"). }"
         + "OPTIONAL {"
         + "?author rdfs:label ?authorlabel filter (lang(?authorlabel) = \"en\")."
         + "}"
         + "} GROUP BY ?s ?desc ?authorlabel ORDER BY DESC(?linkcount)";
 TupleQuery tupleQuery = sparqlConnection.prepareTupleQuery(
         QueryLanguage.SPARQL, query);
 System.out.println("Result for tupleQuery" + tupleQuery.evaluate());

And here is the answer I get:

 Exception in thread "main" org.openrdf.query.QueryEvaluationException: <html>
 <head><title>405 Not Allowed</title></head>
 <body bgcolor="white">
 <center><h1>405 Not Allowed</h1></center>
 <hr><center>nginx/1.9.4</center>
 </body>
 </html>
     at org.openrdf.repository.sparql.query.SPARQLTupleQuery.evaluate(SPARQLTupleQuery.java:59)
     at main.Test.main(Test.java:72)
 Caused by: org.openrdf.repository.RepositoryException: <html>
 <head><title>405 Not Allowed</title></head>
 <body bgcolor="white">
 <center><h1>405 Not Allowed</h1></center>
 <hr><center>nginx/1.9.4</center>
 </body>
 </html>
     at org.openrdf.http.client.HTTPClient.handleHTTPError(HTTPClient.java:953)
     at org.openrdf.http.client.HTTPClient.sendTupleQueryViaHttp(HTTPClient.java:718)
     at org.openrdf.http.client.HTTPClient.getBackgroundTupleQueryResult(HTTPClient.java:602)
     at org.openrdf.http.client.HTTPClient.sendTupleQuery(HTTPClient.java:367)
     at org.openrdf.repository.sparql.query.SPARQLTupleQuery.evaluate(SPARQLTupleQuery.java:52)
     ... 1 more

I would usually assume that this means I need an API key, but the Wikidata API looks completely open. Did I make a mistake when establishing the connection?

+5

java sparql wikidata sesame wikidata-api

Andreas Hartmann May 23 '16 at 19:08

2 answers

When I go to https://query.wikidata.org/ and look at Tools > SPARQL REST endpoint, I see (emphasis mine):

SPARQL endpoint
SPARQL queries can be sent directly to the SPARQL endpoint with a GET request to https://query.wikidata.org/sparql?query={SPARQL} (POST and other method requests will be rejected with "403 Forbidden"). The result is returned as XML by default, or as JSON if either the request parameter format=json or the header Accept: application/sparql-results+json is provided.

It looks like you are using a different URL (yours is missing the final /sparql), so you are probably not actually hitting this endpoint.

Also, since you can visit the URL you are using in a browser (presumably with a GET), it sounds like your API call may be doing a POST, so you might also check how the query is actually being sent over the network.
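To make the GET form of the request concrete, here is a minimal sketch using only the JDK's java.net.http client (Java 11+) rather than openRDF or Jena. The class and method names (WikidataGet, buildUri, buildRequest, fetch) and the User-Agent value are made up for illustration; only the endpoint URL, the query parameter and the Accept header come from the endpoint description quoted above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class WikidataGet {
    // Endpoint as given in the quoted description (note the trailing /sparql).
    static final String ENDPOINT = "https://query.wikidata.org/sparql";

    // The SPARQL text goes into the "query" parameter, percent-encoded.
    static URI buildUri(String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(ENDPOINT + "?query=" + encoded);
    }

    // An explicit GET asking for JSON results via the Accept header.
    static HttpRequest buildRequest(String sparql) {
        return HttpRequest.newBuilder(buildUri(sparql))
                .header("Accept", "application/sparql-results+json")
                // Assumption: a descriptive User-Agent; this value is hypothetical.
                .header("User-Agent", "example-sparql-client/0.1 (hypothetical)")
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body.
    static String fetch(String sparql) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(sparql), HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        String sparql = "SELECT ?s WHERE { ?s wdt:P31 wd:Q571 } LIMIT 3";
        System.out.println(buildUri(sparql));
        // Only touch the network when asked to: java WikidataGet --send
        if (args.length > 0 && args[0].equals("--send")) {
            System.out.println(fetch(sparql));
        }
    }
}
```

Printing the URI first makes it easy to paste the same request into a browser and compare it with what the library sends.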

There is an example of using this endpoint from Jena in the question "Use Jena to query wikidata". The OP of that question actually had the same problem you encountered (wrong query URL).
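Since the endpoint description says results come back as XML by default, the sketch below shows one way to read such a response with the JDK's DOM parser, without any RDF library. It assumes the response follows the W3C SPARQL Query Results XML Format; the class name SparqlXmlResults and the sample document in main are hand-written for illustration and are not real Wikidata output.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SparqlXmlResults {
    // Namespace of the SPARQL Query Results XML Format.
    static final String NS = "http://www.w3.org/2005/sparql-results#";

    // Returns one map per <result>, keyed by variable name,
    // holding the text of the bound <uri> or <literal>.
    static List<Map<String, String>> parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<Map<String, String>> rows = new ArrayList<>();
        NodeList results = doc.getElementsByTagNameNS(NS, "result");
        for (int i = 0; i < results.getLength(); i++) {
            Element result = (Element) results.item(i);
            Map<String, String> row = new LinkedHashMap<>();
            NodeList bindings = result.getElementsByTagNameNS(NS, "binding");
            for (int j = 0; j < bindings.getLength(); j++) {
                Element binding = (Element) bindings.item(j);
                row.put(binding.getAttribute("name"), binding.getTextContent().trim());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Hand-written sample document, not an actual server response.
        String sample = "<?xml version=\"1.0\"?>"
                + "<sparql xmlns=\"" + NS + "\">"
                + "<head><variable name=\"s\"/><variable name=\"desc\"/></head>"
                + "<results><result>"
                + "<binding name=\"s\"><uri>http://example.org/book1</uri></binding>"
                + "<binding name=\"desc\"><literal xml:lang=\"en\">Example Book</literal></binding>"
                + "</result></results></sparql>";
        for (Map<String, String> row : parse(sample)) {
            System.out.println(row);
        }
    }
}
```

Unbound OPTIONAL variables simply have no binding element in a result, so the corresponding key is absent from that row's map.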

+2

Joshua Taylor May 23 '16 at 20:59

Jeen Broekstra · Accepted Answer · 2016-05-24T05:29:45+0000

The correct endpoint URL for Wikidata is https://query.wikidata.org/sparql - you are missing the last bit.

In addition, I noticed a few glitches in your code. First of all, you do this:

 SPARQLConnection sparqlConnection = new SPARQLConnection(sparqlRepository);

It should be as follows:

 RepositoryConnection sparqlConnection = sparqlRepository.getConnection();

Always retrieve your connection object from the Repository object using getConnection() - this means that resources are shared, and Repository can close dangling connections if necessary.

Secondly: you cannot print the result of the query as follows:

 System.out.println("Result for tupleQuery" + tupleQuery.evaluate());

If you want to print the result to System.out, you should do something like this:

 tupleQuery.evaluate(new SPARQLResultsTSVWriter(System.out));

Or (if you want to customize the output a little more):

 for (BindingSet bs : QueryResults.asList(tupleQuery.evaluate())) {
     System.out.println(bs);
 }

For what it's worth: with the above changes, the query request is executed, but it seems that your query is too "heavy" for Wikidata; at least, I got a timeout error from the server. Try a simpler query and you will see that the code works.
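As a rough idea of what a simpler query could look like, the sketch below drops the sitelink count, the author lookup and the grouping, and caps the result with LIMIT. The class name SimplerQuery and the choice of limit are made up for illustration. It also joins the query lines with newlines: in the code from the question, "?s wdt:P50 ?author" + "OPTIONAL {" concatenates to ?authorOPTIONAL, which is presumably not what was intended.

```java
public class SimplerQuery {
    // Builds a reduced version of the query from the question.
    // String.join with "\n" keeps tokens on adjacent lines separated.
    static String build(int limit) {
        return String.join("\n",
                "SELECT ?s ?desc WHERE {",
                "  ?s wdt:P31 wd:Q571 .",
                "  OPTIONAL { ?s rdfs:label ?desc FILTER (lang(?desc) = \"en\") }",
                "}",
                "LIMIT " + limit);
    }

    public static void main(String[] args) {
        // Print the query so it can be pasted into the web interface first.
        System.out.println(build(10));
    }
}
```

The resulting string can be passed to prepareTupleQuery in place of the original query; the wdt:, wd: and rdfs: prefixes are left undeclared here, as in the question, on the assumption that the endpoint predefines them.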
