My English is not good enough to describe the problem formally, so let me explain it with an example. In the table below, the rows are already grouped by "subject" and "predicate".
The rows that share the same "subject" define a set. Now I want to merge any two sets that contain the same predicates, sum the "count" of each shared "predicate", and count the number of subjects that have the same set.
    subject  predicate  count
    -----------------------------
    s1       p1         1
    s1       p2         2
    s2       p1         3
    s3       p1         2
    s3       p2         2
Therefore, two sets are required from this table:
{2, (p1, 3), (p2, 4)}, {1, (p1, 3)}
Here the 2 in the first set indicates that two subjects (s1 and s3) have this set, and (p1, 3) is the sum of (s1, p1, 1) and (s3, p1, 2).
So how can I get these sets and store them in Java?
Can it be done with SPARQL?
Or, if I first load these triples into Java, how can I get the sets using Java?
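For the pure-Java route, here is a minimal sketch, assuming the rows have already been read into memory as (subject, predicate, count) values. The names `PredicateSetMerger`, `Row` and `MergedSet` are made up for illustration and are not part of any library. It first collects the predicates of each subject, then uses the set of predicates as a map key so that subjects with the same set land in the same entry.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class PredicateSetMerger {

    /** One input row (subject, predicate, count). Hypothetical helper type. */
    public static final class Row {
        final String subject;
        final String predicate;
        final long count;

        public Row(String subject, String predicate, long count) {
            this.subject = subject;
            this.predicate = predicate;
            this.count = count;
        }
    }

    /** One merged set: number of subjects sharing it and the summed count per predicate. */
    public static final class MergedSet {
        public int subjects = 0;
        public final Map<String, Long> counts = new TreeMap<>();

        @Override
        public String toString() {
            return "{" + subjects + ", " + counts + "}";
        }
    }

    public static Map<Set<String>, MergedSet> merge(List<Row> rows) {
        // Step 1: collect predicate -> count for every subject.
        Map<String, Map<String, Long>> bySubject = new HashMap<>();
        for (Row r : rows) {
            bySubject.computeIfAbsent(r.subject, k -> new TreeMap<>())
                     .merge(r.predicate, r.count, Long::sum);
        }

        // Step 2: subjects with an equal predicate set share one map entry.
        Map<Set<String>, MergedSet> result = new HashMap<>();
        for (Map<String, Long> perSubject : bySubject.values()) {
            Set<String> key = new TreeSet<>(perSubject.keySet());
            MergedSet merged = result.computeIfAbsent(key, k -> new MergedSet());
            merged.subjects++;
            perSubject.forEach((p, c) -> merged.counts.merge(p, c, Long::sum));
        }
        return result;
    }
}
```

For the five rows of the example table, the entry for the key {p1, p2} ends up with 2 subjects and the sums p1 = 3, p2 = 4, and the entry for {p1} has 1 subject and p1 = 3. Each row is visited once, so this should scale roughly linearly with the number of rows, as long as everything fits in memory.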
One solution might be to concatenate the predicates and the counts:

    SELECT (COUNT(?s) AS ?distinct) ?propset
           (group_concat(?count; separator = "\t") AS ?counts)
    {
      SELECT ?s
             (group_concat(?p; separator = " ") AS ?propset)
             (group_concat(?c; separator = " ") AS ?count)
      { ?s ?p ?c }
      GROUP BY ?s
      ORDER BY ?s
    }
    GROUP BY ?propset
    ORDER BY ?propset
Then the concatenated counts can be split and added up per predicate. It works well on a small dataset, but it takes a lot of time on anything larger.
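The split-and-add step could look roughly like the sketch below. It assumes the concatenated string uses a tab between subjects and a single space between the counts of one subject, and that every subject lists its counts in the same predicate order. SPARQL does not guarantee the order of values inside group_concat, so that last assumption has to be checked against the store being used. `CountSummer` and `sumCounts` are illustrative names only.

```java
public class CountSummer {

    /**
     * Sums the counts position by position.
     * Input format (assumed): subjects separated by tabs, counts by single spaces.
     */
    public static long[] sumCounts(String counts) {
        String[] perSubject = counts.split("\t");
        long[] totals = new long[perSubject[0].split(" ").length];
        for (String subjectCounts : perSubject) {
            String[] parts = subjectCounts.split(" ");
            if (parts.length != totals.length) {
                throw new IllegalArgumentException(
                        "Subjects in one group must have the same number of counts");
            }
            for (int i = 0; i < parts.length; i++) {
                totals[i] += Long.parseLong(parts[i]);
            }
        }
        return totals;
    }
}
```

With the example data, the group for "p1 p2" would carry the string "1 2", a tab, then "2 2", and the sums per position are 3 and 4.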
I think I will give up on this strange problem. Thank you very much for your replies.