How to build a query that matches exactly a vector of refs in DataScript?

Setup: Consider the following DataScript database of films and their casts, with data borrowed from learndatalogtoday.org. The code below can be executed in a JVM/Clojure REPL or a ClojureScript REPL, as long as project.clj contains [datascript "0.15.0"] as a dependency.

    (ns user
      (:require [datascript.core :as d]))

    (def data
      [["First Blood" ["Sylvester Stallone" "Brian Dennehy" "Richard Crenna"]]
       ["Terminator 2: Judgment Day" ["Linda Hamilton" "Arnold Schwarzenegger" "Edward Furlong" "Robert Patrick"]]
       ["The Terminator" ["Arnold Schwarzenegger" "Linda Hamilton" "Michael Biehn"]]
       ["Rambo III" ["Richard Crenna" "Sylvester Stallone" "Marc de Jonge"]]
       ["Predator 2" ["Gary Busey" "Danny Glover" "Ruben Blades"]]
       ["Lethal Weapon" ["Gary Busey" "Mel Gibson" "Danny Glover"]]
       ["Lethal Weapon 2" ["Mel Gibson" "Joe Pesci" "Danny Glover"]]
       ["Lethal Weapon 3" ["Joe Pesci" "Danny Glover" "Mel Gibson"]]
       ["Alien" ["Tom Skerritt" "Veronica Cartwright" "Sigourney Weaver"]]
       ["Aliens" ["Carrie Henn" "Sigourney Weaver" "Michael Biehn"]]
       ["Die Hard" ["Alan Rickman" "Bruce Willis" "Alexander Godunov"]]
       ["Rambo: First Blood Part II" ["Richard Crenna" "Sylvester Stallone" "Charles Napier"]]
       ["Commando" ["Arnold Schwarzenegger" "Alyssa Milano" "Rae Dawn Chong"]]
       ["Mad Max 2" ["Bruce Spence" "Mel Gibson" "Michael Preston"]]
       ["Mad Max" ["Joanne Samuel" "Steve Bisley" "Mel Gibson"]]
       ["RoboCop" ["Nancy Allen" "Peter Weller" "Ronny Cox"]]
       ["Braveheart" ["Sophie Marceau" "Mel Gibson"]]
       ["Mad Max Beyond Thunderdome" ["Mel Gibson" "Tina Turner"]]
       ["Predator" ["Carl Weathers" "Elpidia Carrillo" "Arnold Schwarzenegger"]]
       ["Terminator 3: Rise of the Machines" ["Nick Stahl" "Arnold Schwarzenegger" "Claire Danes"]]])

    (def conn
      (d/create-conn
        {:film/cast  {:db/valueType   :db.type/ref
                      :db/cardinality :db.cardinality/many}
         :film/name  {:db/unique      :db.unique/identity
                      :db/cardinality :db.cardinality/one}
         :actor/name {:db/unique      :db.unique/identity
                      :db/cardinality :db.cardinality/one}}))

    (def all-datoms
      (mapcat (fn [[film actors]]
                (into [{:film/name film}]
                      (map #(hash-map :actor/name %) actors)))
              data))

    (def all-relations
      (mapv (fn [[film actors]]
              {:db/id     [:film/name film]
               :film/cast (mapv #(vector :actor/name %) actors)})
            data))

    (d/transact! conn all-datoms)
    (d/transact! conn all-relations)

Description: In a nutshell, there are two kinds of entities in this database, films and actors (a word intended to be ungendered), and three kinds of datoms:

  • film entity: :film/name (a unique string)
  • film entity: :film/cast (multiple refs)
  • actor entity: :actor/name (a unique string)

Question: I would like to build a query that asks: in which films did these N actors, and only these N actors, appear as the sole stars, for N >= 2?

For example, RoboCop starred Nancy Allen, Peter Weller, and Ronny Cox, but no film starred only the first two of them, Allen and Weller. Therefore, I would expect the following query to produce the empty set:

    (d/q '[:find ?film-name
           :where
           [?film :film/name ?film-name]
           [?film :film/cast ?actor-1]
           [?film :film/cast ?actor-2]
           [?actor-1 :actor/name "Nancy Allen"]
           [?actor-2 :actor/name "Peter Weller"]]
         @conn)
    ; => #{["RoboCop"]}

However, the query is flawed because I don't know how to express that a match must exclude any film whose cast contains actors other than Allen and Weller. I want to find the films in which only Allen and Weller appeared, without any other actors, so the query above should be adapted to produce the empty set. How can I adjust this query to enforce that requirement?

2 answers

Since DataScript has no negation (as of May 2016), I don't believe this is possible with a single static query in "pure" Datalog.

My approach would be to:

  • build the query programmatically, adding N clauses which state that the cast must contain the N actors
  • add a predicate function which, given a movie, the database, and the set of actor ids, uses the EAVT index to check that the movie has no actor outside that set.

Here's a basic implementation:

    ;; True when every cast member of `movie` is in `actors` (a set of entity ids).
    (defn only-those-actors? [db movie actors]
      (->> (d/datoms db :eavt movie :film/cast)
           (every? (fn [[_ _ actor]] (contains? actors actor)))))

    (defn find-movies-with-exact-cast [db actors-names]
      (let [;; resolve the actor names to entity ids
            actors (set (d/q '[:find [?actor ...]
                               :in $ [?name ...]
                               :where [?actor :actor/name ?name]]
                             db actors-names))
            ;; one [?movie :film/cast <id>] clause per actor, then the predicate,
            ;; which is passed in as a query input
            query  {:find  '[[?movie ...]]
                    :in    '[$ ?actors ?db ?only-those-actors?]
                    :where (concat
                             (for [actor actors]
                               ['?movie :film/cast actor])
                             '[[(?only-those-actors? ?db ?movie ?actors)]])}]
        (d/q query db actors db only-those-actors?)))
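To see the two steps work end to end, here is a minimal, self-contained sketch on a trimmed three-film subset of the question's data. The names `schema`, `films`, `films-with-exact-cast` and the `?only?` input are introduced here purely for illustration. The sketch assumes that a predicate can be passed as a query input and called with `$` as the database argument, which is the pattern used in the DataScript test quoted in the other answer. It also returns film names instead of entity ids, and returns the empty set when one of the names is unknown.

```clojure
(ns user
  (:require [datascript.core :as d]))

(def schema
  {:film/cast  {:db/valueType   :db.type/ref
                :db/cardinality :db.cardinality/many}
   :film/name  {:db/unique :db.unique/identity}
   :actor/name {:db/unique :db.unique/identity}})

;; Trimmed subset of the question's data, for illustration only.
(def films
  [["RoboCop" ["Nancy Allen" "Peter Weller" "Ronny Cox"]]
   ["Braveheart" ["Sophie Marceau" "Mel Gibson"]]
   ["Mad Max Beyond Thunderdome" ["Mel Gibson" "Tina Turner"]]])

;; Same two-step load as in the question: entities first, then cast refs.
(def db
  (-> (d/empty-db schema)
      (d/db-with (mapcat (fn [[film actors]]
                           (into [{:film/name film}]
                                 (map #(hash-map :actor/name %) actors)))
                         films))
      (d/db-with (mapv (fn [[film actors]]
                         {:db/id     [:film/name film]
                          :film/cast (mapv #(vector :actor/name %) actors)})
                       films))))

(defn only-those-actors?
  "True when every :film/cast value of `movie` is in `actors` (a set of entity ids)."
  [db movie actors]
  (every? (fn [datom] (contains? actors (:v datom)))
          (d/datoms db :eavt movie :film/cast)))

(defn films-with-exact-cast
  "Names of the films whose cast is exactly the actors named in `actor-names`."
  [db actor-names]
  (let [actors (set (d/q '[:find [?actor ...]
                           :in $ [?name ...]
                           :where [?actor :actor/name ?name]]
                         db actor-names))]
    (if (not= (count actors) (count (set actor-names)))
      #{} ; at least one name is not in the database
      (let [query {:find  '[[?film-name ...]]
                   :in    '[$ ?actors ?only?]
                   :where (vec (concat
                                 '[[?movie :film/name ?film-name]]
                                 ;; the cast must contain every requested actor ...
                                 (for [actor actors] ['?movie :film/cast actor])
                                 ;; ... and nobody else
                                 '[[(?only? $ ?movie ?actors)]]))}]
        (set (d/q query db actors only-those-actors?))))))

(films-with-exact-cast db ["Nancy Allen" "Peter Weller"])
;; => #{}
(films-with-exact-cast db ["Nancy Allen" "Peter Weller" "Ronny Cox"])
;; => #{"RoboCop"}
```

The generated pattern clauses let the index narrow the candidates to films that contain all requested actors, so the predicate only runs on those few films.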

You can combine a predicate function with d/entity to filter datoms on the :film/cast attribute of an entity. This approach looks much simpler until DataScript supports negation (not etc.).

Look at the line (= a (:age (d/entity db e))) in this DataScript test case:

    [{:db/id 1 :name "Ivan" :age 10}
     {:db/id 2 :name "Ivan" :age 20}
     {:db/id 3 :name "Oleg" :age 10}
     {:db/id 4 :name "Oleg" :age 20}]
    ...
    (let [pred (fn [db e a] (= a (:age (d/entity db e))))]
      (is (= (q/q '[:find ?e
                    :in $ ?pred
                    :where [?e :age ?a]
                           [(?pred $ ?e 10)]]
                  db pred)
             #{[1] [3]})))

In your case, the body of the predicate might look something like this:

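    ;; `actors` is a set of actor names; (:film/cast entity) is a set of entities,
    ;; so compare names, and use = because subset? would admit extra cast members
    (= actors (set (map :actor/name (:film/cast (d/entity db e)))))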

In terms of performance, each d/entity call is cheap because reading an attribute is an index lookup.
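As a rough, self-contained sketch of this predicate approach applied to the films question: the tiny database with explicit entity ids and the names `exact-cast?` and `films-starring-exactly` are made up for illustration. The predicate compares the set of requested names with the names of the film's whole cast, so a film with extra cast members does not match.

```clojure
(ns user
  (:require [datascript.core :as d]))

;; Hypothetical miniature database with explicit entity ids.
(def db
  (d/db-with
    (d/empty-db {:film/cast {:db/valueType   :db.type/ref
                             :db/cardinality :db.cardinality/many}})
    [{:db/id 1 :actor/name "Nancy Allen"}
     {:db/id 2 :actor/name "Peter Weller"}
     {:db/id 3 :actor/name "Ronny Cox"}
     {:db/id 4 :actor/name "Sophie Marceau"}
     {:db/id 5 :actor/name "Mel Gibson"}
     {:db/id 10 :film/name "RoboCop"    :film/cast [1 2 3]}
     {:db/id 11 :film/name "Braveheart" :film/cast [4 5]}]))

(defn exact-cast?
  "True when the names of the film's cast are exactly the set `names`."
  [db film-id names]
  (= names
     (set (map :actor/name (:film/cast (d/entity db film-id))))))

(defn films-starring-exactly
  "Relation of film names whose cast is exactly the actors in `names`."
  [db names]
  (d/q '[:find ?film-name
         :in $ ?exact-cast? ?names
         :where [?film :film/name ?film-name]
                [(?exact-cast? $ ?film ?names)]]
       db exact-cast? (set names)))

(films-starring-exactly db ["Nancy Allen" "Peter Weller"])
;; => #{}
(films-starring-exactly db ["Nancy Allen" "Peter Weller" "Ronny Cox"])
;; => #{["RoboCop"]}
```

Note that this version calls the predicate once for every film in the database. Adding one [?film :film/cast ...] clause per requested actor before the predicate would narrow the candidates first.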
