Datomic aggregations: counting related entities without losing results with zero-count - datomic

I have a domain model made of Questions, each Question being associated with a number of Comments and Affirmations.
I would like to make a Datalog query which extracts a bunch of content attributes for each Question, as well as the number of associated Comments and Affirmations, including when these relationships are empty (e.g. some Question has no Comment or no Affirmation), in which case the returned count should be 0.
I have seen the following Gist, which shows how to use (sum ...) and (or-join) combined with a 'weight' variable to get a zero-count when the relationship is empty.
However, I do not see how to make this work when there are 2 relationships. I tried the following, but the returned counts are not correct:
(def query '[:find (sum ?comment-weight) (sum ?affirmation-weight) ?text ?time ?source-identifier ?q
:with ?uniqueness
:where
[?q :question/text ?text]
[?q :question/time ?time]
[?q :question/source-identifier ?source-identifier]
(or-join [?q ?uniqueness ?comment-weight ?affirmation-weight]
(and [?comment :comment/question ?q]
[?affirmation :affirmation/question ?q]
['((identity ?comment) (identity ?affirmation)) ?uniqueness]
[(ground 1) ?comment-weight]
[(ground 1) ?affirmation-weight])
(and [?comment :comment/question ?q]
[(identity ?comment) ?uniqueness]
[(ground 1) ?comment-weight]
[(ground 0) ?affirmation-weight])
(and [?affirmation :affirmation/question ?q]
[(identity ?affirmation) ?uniqueness]
[(ground 1) ?affirmation-weight]
[(ground 0) ?comment-weight])
(and [(identity ?q) ?uniqueness]
[(ground 0) ?comment-weight]
[(ground 0) ?affirmation-weight]))])
Originally asked on the Clojurians Slack.

So to summarize, the trick is to consider each Question, Comment and Affirmation as a 'data point', which has a 'weight' of 0 or 1 for each count, and is identified uniquely (so that Datalog counts correctly, see :with). In particular, each Question has a zero weight for all counts.
The posted code was almost correct; the only change needed is to remove the first clause of the (or-join ...), which creates 'artificial' data points (Comment + Affirmation tuples) that pollute the counts.
The following should work:
[:find (sum ?comment-weight) (sum ?affirmation-weight) ?text ?time ?source-identifier ?q
 :with ?uniqueness
 :where
 [?q :question/text ?text]
 [?q :question/time ?time]
 [?q :question/source-identifier ?source-identifier]
 (or-join [?q ?uniqueness ?comment-weight ?affirmation-weight]
   (and
     [?comment :comment/question ?q]
     [(identity ?comment) ?uniqueness]
     [(ground 1) ?comment-weight]
     [(ground 0) ?affirmation-weight])
   (and
     [?affirmation :affirmation/question ?q]
     [(identity ?affirmation) ?uniqueness]
     [(ground 1) ?affirmation-weight]
     [(ground 0) ?comment-weight])
   (and
     [(identity ?q) ?uniqueness]
     [(ground 0) ?comment-weight]
     [(ground 0) ?affirmation-weight]))]
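To see why dropping the first clause fixes the counts, the 'data point' idea can be simulated in plain Clojure, without a database. This is only a sketch under assumptions: the maps `comments` and `affirmations`, and the functions `data-points`, `cross-rows` and `sum-weights`, are hypothetical names made up for illustration, and a Clojure set stands in for the way the query keeps distinct (?q, ?uniqueness, weights) tuples before summing.

```clojure
;; Hypothetical in-memory data: question id -> ids of related entities.
(def comments     {:q1 [:c1 :c2], :q2 []})
(def affirmations {:q1 [:a1],     :q2 []})

(defn data-points
  "One row per Comment, one per Affirmation, plus one zero-weight row
  for the Question itself (mirrors the three or-join clauses)."
  [q]
  (concat
    (for [c (comments q)]     {:q q :uniq c :cw 1 :aw 0})
    (for [a (affirmations q)] {:q q :uniq a :cw 0 :aw 1})
    [{:q q :uniq q :cw 0 :aw 0}]))

(defn cross-rows
  "The 'artificial' Comment + Affirmation rows produced by the removed clause."
  [q]
  (for [c (comments q), a (affirmations q)]
    {:q q :uniq [c a] :cw 1 :aw 1}))

(defn sum-weights
  "Sums both weights over the distinct rows."
  [rows]
  (let [rows (set rows)]
    {:comments     (reduce + 0 (map :cw rows))
     :affirmations (reduce + 0 (map :aw rows))}))

(sum-weights (data-points :q1))
;; => {:comments 2, :affirmations 1}
(sum-weights (data-points :q2))
;; => {:comments 0, :affirmations 0}
(sum-weights (concat (data-points :q1) (cross-rows :q1)))
;; => {:comments 4, :affirmations 3}   ; inflated by the extra rows
```

The zero-weight row for the Question is what keeps a Question with no Comments and no Affirmations in the result at all.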

You can call arbitrary functions within a query, so consider using datomic.api/datoms directly (or a subquery, or an arbitrary function of your own) to count the comments and affirmations.
Using datomic.api/datoms:
'[:find ?comment-count ?affirmation-count ?text ?time ?source-identifier ?q
:where
[?q :question/text ?text]
[?q :question/time ?time]
[?q :question/source-identifier ?source-identifier]
[(datomic.api/datoms $ :vaet ?q :comment/question) ?comments]
[(count ?comments) ?comment-count]
[(datomic.api/datoms $ :vaet ?q :affirmation/question) ?affirmations]
[(count ?affirmations) ?affirmation-count]]
Or using a subquery:
'[:find ?comment-count ?affirmation-count ?text ?time ?source-identifier ?q
:where
[?q :question/text ?text]
[?q :question/time ?time]
[?q :question/source-identifier ?source-identifier]
[(datomic.api/q [:find (count ?comment) .
:in $ ?q
:where [?comment :comment/question ?q]]
$ ?q) ?comment-count-or-nil]
[(clojure.core/or ?comment-count-or-nil 0) ?comment-count]
[(datomic.api/q [:find (count ?affirmation) .
:in $ ?q
:where [?affirmation :affirmation/question ?q]]
$ ?q) ?affirmation-count-or-nil]
[(clojure.core/or ?affirmation-count-or-nil 0) ?affirmation-count]]
Or using a custom function:
(defn count-for-question [db question kind]
(let [dseq (case kind
:comments (d/datoms db :vaet question :comment/question)
:affirmations (d/datoms db :vaet question :affirmation/question))]
(reduce (fn [x _] (inc x)) 0 dseq)))
'[:find ?comment-count ?affirmation-count ?text ?time ?source-identifier ?q
:where
[?q :question/text ?text]
[?q :question/time ?time]
[?q :question/source-identifier ?source-identifier]
[(user/count-for-question $ ?q :comments) ?comment-count]
[(user/count-for-question $ ?q :affirmations) ?affirmation-count]]

Related

Pull expression in query with limit/default?

How can I use limit/default in a pull expression in a query? Given a cardinality-many attribute, how can I control how many of its values are returned (the default is max 1000 values!)?
(Found it hard to figure out the correct syntax from the documentation/examples)
Limit (for cardinality-many attributes)
Return max 2 values of cardinality-many attribute :ns/ints:
[:find (pull ?a [(limit :ns/ints 2)])
:where [?a :ns/str ?b]]
Return all values of cardinality-many attribute :ns/ints:
[:find (pull ?a [(limit :ns/ints nil)])
:where [?a :ns/str ?b]]
Default
Return default value 2000 if attribute :ns/ints has no value:
[:find (pull ?a [(default :ns/ints 2000)])
:where [?a :ns/str ?b]]
Return default values 2000 and 2001 if cardinality-many attribute :ns/ints has no values:
[:find (pull ?a [(default :ns/ints [2000 2001])])
:where [?a :ns/str ?b]]

Macros that generate code from a for-loop

This example is a little contrived. The goal is to create a macro that loops over some values and programmatically generates some code.
A common pattern in Python is to initialize the properties of an object at calling time as follows:
(defclass hair [foo bar]
(defn __init__ [self]
(setv self.foo foo)
(setv self.bar bar)))
This correctly translates with hy2py to
class hair(foo, bar):
    def __init__(self):
        self.foo = foo
        self.bar = bar
        return None
I know there are Python approaches to this problem including attr.ib and dataclasses. But as a simplified learning exercise I wanted to approach this with a macro.
This is my non-working example:
(defmacro self-set [&rest args]
(for [[name val] args]
`(setv (. self (read-str ~name)) ~val)))
(defn fur [foo bar]
(defn __init__ [self]
(self-set [["foo" foo] ["bar" bar]])))
But this doesn't expand to the original pattern. hy2py shows:
from hy.core.language import name
from hy import HyExpression, HySymbol
import hy
def _hy_anon_var_1(hyx_XampersandXname, *args):
    for [name, val] in args:
        HyExpression([] + [HySymbol('setv')] + [HyExpression([] + [HySymbol
            ('.')] + [HySymbol('self')] + [HyExpression([] + [HySymbol(
            'read-str')] + [name])])] + [val])
hy.macros.macro('self-set')(_hy_anon_var_1)
def fur(foo, bar):
    def __init__(self, foo, bar):
        return None
What am I doing wrong?
for forms always return None. So, your loop is constructing the (setv ...) forms you request and then throwing them away. Instead, try lfor, which returns a list of results, or gfor, which returns a generator. Note also in the below example that I use do to group the generated forms together, and I've moved a ~ so that the read-str happens at compile-time, as it must in order for . to work.
(defmacro self-set [&rest args]
  `(do ~@(gfor
           [name val] args
           `(setv (. self ~(read-str name)) ~val))))
(defclass hair []
  (defn __init__ [self]
    (self-set ["foo" 1] ["bar" 2])))
(setv h (hair))
(print h.bar) ; 2
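For comparison, the same shape (build a sequence of forms at macro-expansion time, then splice it into a single do) can be sketched in Clojure. This is an analogy, not Hy code: `set-all!` and `state` are hypothetical names, and an atom holding a map stands in for the attributes of `self`. Clojure's `for` returns a lazy sequence of the generated forms, so it plays the role that gfor plays above.

```clojure
(defmacro set-all!
  "Expands to one (swap! target assoc k v) form per [name value] pair,
  grouped in a do. The keyword is computed at macro-expansion time."
  [target & pairs]
  `(do
     ~@(for [[k v] pairs]
         `(swap! ~target assoc ~(keyword k) ~v))
     ~target))

(def state (atom {}))

(set-all! state ["foo" 1] ["bar" 2])

@state
;; => {:foo 1, :bar 2}
```

As in the Hy version, the name-to-identifier conversion (`keyword` here, read-str there) sits behind a single unquote so that it runs while the macro expands rather than at run time.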

Don't Loop, Iterate! (Common Lisp)

I'm having trouble switching to an iterate version of some loop code:
(defun get-bound-?vars-1 (tree)
(loop for item in tree
when (consp item)
if (member (car item) '(exists forall doall))
nconc (delete-if-not #'?varp
(alexandria:flatten (second item)))
else nconc (get-bound-?vars item)))
My corresponding iterate translation:
(defun get-bound-?vars-2 (tree)
(iter (for item in tree)
(when (consp item)
(if (member (car item) '(exists forall doall))
(nconc (delete-if-not #'?varp
(alexandria:flatten (second item))))
(nconc (get-bound-?vars item))))))
As test case:
(defparameter *tree*
'(if (exists (?t transmitter)
(and (connecting ?t ?connector)
(bind (color ?t $hue))))
(if (not (exists ((?t1 ?t2) transmitter)
(and (connecting ?t1 ?connector)
(connecting ?t2 ?connector)
(bind (color ?t1 $hue1))
(bind (color ?t2 $hue2))
(not (eql $hue1 $hue2)))))
(activate-connector! ?connector $hue))))
Then loop OK:
(get-bound-?vars-1 *tree*) => (?T ?T1 ?T2)
But iterate not OK:
(get-bound-?vars-2 *tree*) => NIL
Thanks for any pointers.

Clojure nested doseq loop

I'm new to Clojure and I have a question regarding nested doseq loops.
I would like to iterate through a sequence and get a subsequence, and then get some keys to apply a function over all the sequence elements.
The given sequence has a structure more or less like this, but with hundreds of books, shelves and many libraries:
([:state/libraries {6 #:library {:name "MUNICIPAL LIBRARY OF X" :id 6
:shelves {3 #:shelf {:name "GREEN SHELF" :id 3 :books
{45 #:book {:id 45 :name "NECRONOMICON" :pages {...},
{89 #:book {:id 89 :name "HOLY BIBLE" :pages {...}}}}}}}}])
Here is my code:
(defn my-function [] (let [conn (d/connect (-> my-system :config :datomic-uri))]
(doseq [library-seq (read-string (slurp "given-sequence.edn"))]
(doseq [shelves-seq (val library-seq)]
(library/create-shelf conn {:id (:shelf/id (val shelves-seq))
:name (:shelf/name (val shelves-seq))})
(doseq [books-seq (:shelf/books (val shelves-seq))]
(library/create-book conn (:shelf/id (val shelves-seq)) {:id (:book/id (val books-seq))
:name (:book/name (val books-seq))})
)))))
The thing is that I want to get rid of that nested doseq mess but I don't know what would be the best approach, since in each iteration keys change. Using recur? reduce? Maybe I am thinking about this completely the wrong way?
Like Carcigenicate says in the comments, presuming that the library/... functions are only side effecting, you can just write this in a single doseq.
(defn my-function []
(let [conn (d/connect (-> my-system :config :datomic-uri))]
(doseq [library-seq (read-string (slurp "given-sequence.edn"))
shelves-seq (val library-seq)
:let [_ (library/create-shelf conn
{:id (:shelf/id (val shelves-seq))
:name (:shelf/name (val shelves-seq))})]
books-seq (:shelf/books (val shelves-seq))]
(library/create-book conn
(:shelf/id (val shelves-seq))
{:id (:book/id (val books-seq))
:name (:book/name (val books-seq))}))))
I would separate "connecting to the db" from "slurping a file" from "writing to the db" though. Together with some destructuring I'd end up with something more like:
(defn write-to-the-db [conn given-sequence]
  (doseq [library-seq given-sequence
          shelves-seq (val library-seq)
          :let [{shelf-id :shelf/id,
                 shelf-name :shelf/name
                 books :shelf/books} (val shelves-seq)
                _ (library/create-shelf conn {:id shelf-id, :name shelf-name})]
          ;; books is a map of id -> book, so destructure its values, not its entries
          {book-id :book/id, book-name :book/name} (vals books)]
    (library/create-book conn shelf-id {:id book-id, :name book-name})))
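The flattening can be tried without a database. The sketch below is only an illustration under assumptions: `libraries` is a cut-down, hypothetical version of the data (the elided :pages are left out), and an atom named `events` records what would otherwise be calls to library/create-shelf and library/create-book.

```clojure
;; Hypothetical sample data, shaped like the structure in the question.
(def libraries
  {6 #:library{:id 6
               :name "MUNICIPAL LIBRARY OF X"
               :shelves {3 #:shelf{:id 3
                                   :name "GREEN SHELF"
                                   :books {45 #:book{:id 45 :name "NECRONOMICON"}
                                           89 #:book{:id 89 :name "HOLY BIBLE"}}}}}})

;; Stand-in for the side effects on the database.
(def events (atom []))

(doseq [library (vals libraries)
        shelf   (vals (:library/shelves library))
        ;; :let runs once per shelf, before the books of that shelf are visited
        :let [{shelf-id :shelf/id, books :shelf/books} shelf
              _ (swap! events conj [:shelf shelf-id])]
        {book-id :book/id} (vals books)]
  (swap! events conj [:book shelf-id book-id]))

@events
;; => [[:shelf 3] [:book 3 45] [:book 3 89]]
```

Because the shelf side effect lives in a :let binding, it still runs for a shelf that has no books, while the doseq body is simply skipped for it.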

Updating value with cardinality many

I have a schema like this:
[{:db/id #db/id[:db.part/db]
:db/ident :person/name
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db/doc "A person's name"
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :person/roles
:db/valueType :db.type/keyword
:db/cardinality :db.cardinality/many
:db/doc "A person's role"
:db.install/_attribute :db.part/db}]
And code like this:
;; insert new person
;; d/transact returns a future, so deref it before reading :tempids
(def new-id (-> @(d/transact conn [{:db/id (d/tempid :db.part/user)
                                    :person/name "foo"
                                    :person/roles #{:admin}}])
                :tempids
                vals
                first))
(defn get-roles
  [db eid]
  (d/q '[:find [?roles ...]
         :in $ ?eid
         :where [?eid :person/roles ?roles]]
       db eid))
(get-roles (d/db conn) new-id) ;; => [:admin]
;; update a person
(d/transact conn [{:db/id new-id
:person/roles #{:client}}])
(get-roles (d/db conn) new-id) ;; => [:admin :client]
It seems the default behaviour is to simply add the new value to the existing ones.
How can I get this result, after doing the updating transaction:
(get-roles (d/db conn) new-id) ;; => [:client]
If what you want is to "reset" the list of roles to a new value (an 'absolute' operation, in contrast to the 'relative' operations of just adding or removing roles), you'll have to use a transaction function to perform a diff and retract the values that need to be retracted.
Here's a basic generic implementation:
{:db/id (d/tempid :db.part/user),
:db/ident :my.fns/reset-to-many,
:db/fn
(d/function
{:lang :clojure,
:requires '[[datomic.api :as d]],
:params '[db e attr new-vals],
:code
'(let [ent (or (d/entity db e)
(throw (ex-info "Entity not found"
{:e e :t (d/basis-t db)})))
entid (:db/id ent)
old-vals (get ent attr)]
(into
[{:db/id (:db/id ent)
;; adding the new values
attr new-vals}]
;; retracting the old values
(comp
(remove (set new-vals))
(map (fn [v]
[:db/retract entid attr v])))
old-vals)
)})}
;; Usage
(d/transact conn [[:my.fns/reset-to-many new-id :person/roles #{:client}]])
Here is a solution once suggested by Robert Stuttaford:
(defn many-delta-tx
"Produces the transaction necessary to have
`new` be the value for `entity-id` at `attr`
`new` is expected to be a set."
[db entity-id attr new]
(let [current (into #{} (map :v)
(d/datoms db :eavt
entity-id
(d/entid db attr)))]
(concat (for [id (clojure.set/difference new current)]
[:db/add entity-id attr id])
(for [id (clojure.set/difference current new)]
[:db/retract entity-id attr id]))))
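The diff at the heart of this function can be checked on its own, without a database. The following is a minimal sketch, assuming the current values have already been read into a set; `delta-ops` is a hypothetical helper name, and 42 is a made-up entity id.

```clojure
(require '[clojure.set :as set])

(defn delta-ops
  "Given the current and the desired set of values, returns the
  :db/add and :db/retract operations that turn one into the other."
  [entity-id attr current new]
  (concat (for [v (set/difference new current)]
            [:db/add entity-id attr v])
          (for [v (set/difference current new)]
            [:db/retract entity-id attr v])))

(delta-ops 42 :person/roles #{:admin} #{:client})
;; => ([:db/add 42 :person/roles :client] [:db/retract 42 :person/roles :admin])

(delta-ops 42 :person/roles #{:client} #{:client})
;; => ()
```

Values present in both sets produce no operation at all, so resetting to the roles an entity already has yields an empty transaction.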
For ease of testing, I would like to slightly modify part of the example from the original question.
Schema change: I add :db/unique to :person/name.
[{:db/id #db/id[:db.part/db]
:db/ident :person/name
:db/valueType :db.type/string
:db/cardinality :db.cardinality/one
:db/unique :db.unique/identity
:db/doc "A person's name"
:db.install/_attribute :db.part/db}
{:db/id #db/id[:db.part/db]
:db/ident :person/roles
:db/valueType :db.type/keyword
:db/cardinality :db.cardinality/many
:db/doc "A person's role"
:db.install/_attribute :db.part/db}]
get-roles
(defn get-roles
[db eid]
(d/q '[:find [?roles ...]
:in $ ?eid
:where [?eid :person/roles ?roles]] db eid))
My testing example
;; (many-delta-tx (d/db conn) [:person/name "foo"] :person/roles #{:sales :client})
;; => ([:db/add [:person/name "foo"] :person/roles :sales] [:db/retract [:person/name "foo"] :person/roles :admin])

Resources