Ruby RDF query - extracting simple data from Seq and Bag elements

I receive XML-serialized RDF (as part of XMP descriptions of media files, in case that is relevant) and am processing it in Ruby. I am trying to work with the rdf gem, although I am happy to look at other solutions.

I have managed to load and query the most basic data, but I am stuck trying to build a query for elements that contain sequences and bags.

Example RDF/XML:

    <rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
      <rdf:Description rdf:about='' xmlns:dc='http://purl.org/dc/elements/1.1/'>
        <dc:date>
          <rdf:Seq>
            <rdf:li>2013-04-08</rdf:li>
          </rdf:Seq>
        </dc:date>
      </rdf:Description>
    </rdf:RDF>

My best attempt at constructing a query:

    require 'rdf'
    require 'rdf/rdfxml'
    require 'rdf/vocab/dc11'

    graph = RDF::Graph.load( 'test.rdf' )
    date_query = RDF::Query.new( :subject => { RDF::DC11.date => :date } )
    results = date_query.execute(graph)
    results.map { |result| { result.subject.to_s => result.date.inspect } }
    => [{"test.rdf"=>"#<RDF::Node:0x3fc186b3eef8(_:g70100421177080)>"}]

I get the impression that my results at this point ("query solutions"?) are a reference to the rdf:Seq container, but I am lost as to how to progress. For the example above, I would expect to end up with the array ["2013-04-08"].

When the input has no rdf:Seq and rdf:li containers, I can extract the strings that I want using RDF::Query, following the examples in http://rdf.rubyforge.org/RDF/Query.html - unfortunately I cannot find examples of more complex queries or RDF structures processed in Ruby.

Edit: Also, when I look for suitable methods on the RDF::Node object, I see no way to explore any further relationships that it might have:

    results[0].date.methods - Object.methods
    => [:original, :original=, :id, :id=, :node?, :anonymous?, :unlabeled?, :labeled?, :to_sym, :resource?, :constant?, :variable?, :between?, :graph?, :literal?, :statement?, :iri?, :uri?, :valid?, :invalid?, :validate!, :validate, :to_rdf, :inspect!, :type_error, :to_ntriples]
    # None of the above leads AFAICS to more data in the graph

I know how to get the same data with XPath (at least provided that we always get the same paths in the serialization), but I feel that it is not the best query language for this case. It is my backup plan, however, if the RDF query solution turns out to be too complicated to implement.

1 answer

I think you are right when you say that your results at this point ("query solutions"?) are a reference to the rdf:Seq container. RDF/XML is a really awful serialization format; think of the data as a graph instead. Here is a picture of an rdf:Bag. An rdf:Seq works the same way, and #students in the example is analogous to #date in your case.

[Image: rdf:Bag example; rdf:Seq is the same]

So, to get to the date literal, you need to hop one node further in the graph. I am not familiar with the syntax of this Ruby library, but something like:

    require 'rdf'
    require 'rdf/rdfxml'
    require 'rdf/vocab/dc11'

    graph = RDF::Graph.load( 'test.rdf' )
    date_query = RDF::Query.new({
      :yourThing => { RDF::DC11.date => :dateSeq },
      :dateSeq => { RDF.type => RDF.Seq, RDF._1 => :dateLiteral }
    })
    date_query.execute(graph).each do |solution|
      puts "date=#{solution.dateLiteral}"
    end
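To see why two hops are needed, it may help to write out the triples that the example document expands to. In RDF/XML each rdf:li becomes a numbered membership property (rdf:_1, rdf:_2, and so on) on the container node. The sketch below is plain Ruby with no gem involved: arrays stand in for triples, the blank node label `_:seq` and the helper `objects` are made up for illustration, and none of it is the rdf gem's API.

```ruby
RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
DC_NS  = 'http://purl.org/dc/elements/1.1/'

# Each entry is [subject, predicate, object].
# '_:seq' is a made-up label for the blank node that represents the rdf:Seq.
TRIPLES = [
  ['test.rdf', "#{DC_NS}date", '_:seq'],
  ['_:seq', "#{RDF_NS}type", "#{RDF_NS}Seq"],
  ['_:seq', "#{RDF_NS}_1", '2013-04-08']
].freeze

# Hypothetical helper: all objects for a given subject and predicate.
def objects(triples, subject, predicate)
  triples.select { |s, p, _| s == subject && p == predicate }.map { |_, _, o| o }
end

# Hop 1: the document's dc:date points at the container node, not at a date.
container = objects(TRIPLES, 'test.rdf', "#{DC_NS}date").first

# Hop 2: the container's rdf:_1 property holds the actual literal.
first_date = objects(TRIPLES, container, "#{RDF_NS}_1").first
puts first_date
```

The first hop is what the query in the question already returns (the RDF::Node); the second hop is the part that was missing.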

Of course, if you expect the Seq to actually contain several dates (otherwise it would be pointless to have a Seq), you will have to match them with RDF._1 => :dateLiteral1, RDF._2 => :dateLiteral2, RDF._3 => :dateLiteral3, etc.

Or, for a more general solution, match all properties and objects of dateSeq with:

 :dateSeq => { :property => :dateLiteral } 

and then filter out the case where :property is rdf:type and :dateLiteral is therefore not a date but rdf:Seq. Perhaps the library also has a dedicated method for retrieving all of a Seq's contents.
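A minimal sketch of that filtering step, again in plain Ruby rather than the rdf gem's API: it keeps only the numbered membership properties (rdf:_1, rdf:_2, ...), which drops rdf:type automatically, and sorts by the number so the Seq order is preserved. The helper name `container_items`, the blank node label `_:seq`, and the second date are made up for illustration.

```ruby
RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'

# Matches container membership properties such as rdf:_1 and captures the index.
MEMBER = /\A#{Regexp.escape(RDF_NS)}_(\d+)\z/

# Hypothetical helper: ordered items of a container node.
# triples is an array of [subject, predicate, object] arrays.
def container_items(triples, container)
  triples
    .select { |s, _, _| s == container }
    .map { |_, p, o| (m = MEMBER.match(p)) && [m[1].to_i, o] }
    .compact              # drops rdf:type and any other non-membership property
    .sort_by(&:first)     # restore sequence order by index
    .map(&:last)
end

# Illustrative data, deliberately out of order; the second date is invented.
triples = [
  ['_:seq', "#{RDF_NS}type", "#{RDF_NS}Seq"],
  ['_:seq', "#{RDF_NS}_2", '2013-04-09'],
  ['_:seq', "#{RDF_NS}_1", '2013-04-08']
]

dates = container_items(triples, '_:seq')
p dates
```

With the rdf gem, the same idea would presumably apply to the solutions of the generic pattern above: keep those whose :property is a membership property and order them by its index.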
