Is it possible to pass a wildcard _ into a parameterized query? Something like this:
(d/q '[:find ?e
       :in $ ?type
       :where [?e :type ?type]] db _)
When I tried this as written above it threw an error. Is there a way to do this?
I know that I can get everything with a query that looks like this:
(d/q '[:find ?e
       :where [?e :type]] db)
But my goal is to avoid building separate queries when I don't want to filter results by :type. The use case is, e.g., an API endpoint that may or may not filter results.
If I understand you correctly, you should be able to type:
(d/q '[:find ?e
       :in $
       :where [?e :type]] db)
In Datomic, any unspecified values are considered to be wildcards. The above query will return a list of all entities that have the :type attribute, regardless of value.
Update
Datomic's query is designed to accept a plain value like 5 or :awesome to be substituted into the ?type variable. A symbol like _ (or the quoted version '_) does not fit the pattern expected by Datomic.
Just for fun, I tried several variations and could not get Datomic to accept the symbol '_ for the ?type variable in the way you proposed. I think you'll have to write a separate query for the wildcard case.
Essentially, the wildcard _ is a special symbol (aka "reserved word") in the Datomic query syntax just like $. Datomic also enforces that query variables begin with a ? like ?e or ?type. These requirements are a part of the Datomic DSL that you can't change.
The only workaround besides hand-writing separate queries would be to dynamically compose the query vector from a base-part and add-on parts. Whether that is easier or harder than hand-writing the different queries depends on your specific situation.
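A minimal sketch of that composition approach, assuming the map form of a Datomic query. The helper name entities-query and its option map are hypothetical, not part of Datomic. The function only builds data: a base query plus an optional add-on clause and input.

```clojure
;; Hypothetical helper: returns a query map and the extra inputs for it.
;; The base part matches any entity that has a :type attribute.
;; When a :type value is supplied, an add-on input and clause are appended.
(defn entities-query
  [{type-value :type}]
  (cond-> {:query '{:find  [?e]
                    :in    [$]
                    :where [[?e :type]]}
           :args  []}
    (some? type-value)
    (-> (update-in [:query :in] conj '?type)
        (update-in [:query :where] conj '[?e :type ?type])
        (update :args conj type-value))))

;; Usage, assuming d is an alias for datomic.api and db is a database value:
;; (let [{:keys [query args]} (entities-query {:type :awesome})]
;;   (apply d/q query db args))
```

Calling (entities-query {}) yields the unfiltered query with no extra arguments, so the endpoint can use one code path for both cases.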
Related
Is there a way to express this kind of logic purely inside a query?
(def e-top
  (let [res (d/q '[:find ?e (count ?p)
                   :where [?p :likes ?e]] db)]
    (first (apply max-key last res))))
If you need to work within one query, then aggregate of aggregates problems are best tackled with subquery (a nested call to query inside query). See this answer on the Datomic mailing list which includes a similar (not identical) query on the results of an aggregate against mbrainz:
(d/q '[:find ?track ?count
       :where [(datomic.api/q '[:find ?track (count ?artist)
                                :where [?track :track/artists ?artist]] $) [[?track ?count]]]
              [(> ?count 1)]]
     (d/db conn))
For your case (assuming work stays in Clojure), apply will be faster and simpler. Subqueries that only need to do something simple (e.g. get something associated with the max value) tend to make more sense if you're using the REST API or some other client wrapping around Datomic where you don't have the perf benefits associated with the Peer library being in process.
I am using filter queries with Solr 4.10.0 / Lucene 4.10.0 and have the strange situation that while
fq=areas:Finanz- & Rechnungswesen and
fq=areas:"Finanz- & Rechnungswesen"
yield the same set of documents,
fq=areas:E-Commerce & Neue Medien and
fq=areas:"E-Commerce & Neue Medien"
don't – in the latter case, the set of results is empty.
I executed the queries in the Solr admin UI and checked in the Solr log that the filters correctly translate to the query params
fq=areas:Finanz-+%26+Rechnungswesen
fq=areas:"Finanz-+%26+Rechnungswesen"
fq=areas:E-Commerce+%26+Neue+Medien
fq=areas:"E-Commerce+%26+Neue+Medien"
respectively. Only in the last case, the result set is empty. Can anyone explain why this is the case? Unfortunately, Spring Data Solr quotes multi-word filters, so it gives a wrong result in that case.
Without seeing the data in your index it's hard to diagnose exactly why you get different numbers of results. However, your queries may not be behaving as you expect because of the field syntax.
The filter query areas:Finanz- & Rechnungswesen will be parsed as:
areas:Finanz- {default_field}:rechnungswesen where {default_field} is whatever has been configured as your default field when one has not been supplied.
To debug these queries more easily, have a look at the results with debugQuery=true. This can also be done in the Solr Admin UI's query interface.
To make sure that all terms are limited to your areas field, use parentheses, e.g.:
areas:(Finanz- & Rechnungswesen)
For more details, have a look at the Solr query parser syntax: https://wiki.apache.org/solr/SolrQuerySyntax#Default_QParserPlugin:_LuceneQParserPlugin
Assume I have an author entity with many related book entities.
What's the query to fetch the author with the largest number of books?
OK. Since I found an answer myself, I am posting it here in case somebody else searches for it.
The solution is to build two Datomic queries, passing the output of the first one to the second one.
(->>
(d/q '[:find (count ?b) ?a :where [?a :author/books ?b]] db)
(d/q '[:find (max ?count) ?a :in $ [?count ?a]] db))
As far as I understand, this is the common way to work with less trivial queries in Datomic: split them into several queries and chain them together, letting the DB do its job.
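As an alternative sketch, the final step can also be done in plain Clojure on the result of the count query. The tuples below are made up and only stand in for what that query might return; the names sample-counts and top-author are hypothetical.

```clojure
;; Made-up [count author-id] tuples standing in for the result of
;; (d/q '[:find (count ?b) ?a :where [?a :author/books ?b]] db)
(def sample-counts #{[3 101] [7 102] [5 103]})

;; Pick the tuple with the largest count, then take the author id from it.
(def top-author
  (second (apply max-key first sample-counts)))
```

With this sample data, top-author is 102, the id paired with the count 7.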
I have a bunch of records containing business names and I wish to do a query to find all the duplicates. How can this be done?
{:business/name "<>"}
If you're trying to enforce uniqueness on the attribute value you should look at the :db/unique schema attribute instead.
To find the duplicated values and how often they repeat, use:
(->> (d/datoms db :aevt :business/name)
(map :v)
(frequencies)
(filter #(> (second %) 1)))
This uses the datomic.api/datoms API to access the raw AEVT index and stream the :business/name attribute values, calculates their frequencies, and keeps only the values that occur more than once. You can also achieve the same result using datalog and aggregation functions:
(->> (d/q '[:find (frequencies ?v)
:with ?e
:in $ ?a
:where [?e ?a ?v]]
db :business/name)
(ffirst)
(filter #(> (second %) 1)))
To find the entities with duplicated attribute values, use:
(->> (d/datoms db :aevt :business/name)
(group-by :v)
(filter #(> (count (second %)) 1))
(mapcat second)
(map :e))
which also leverages the d/datoms API to accomplish it. For a full code sample, including datalog implementations, see https://gist.github.com/a2ndrade/5641681
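To see what the two pipelines produce without a database, here is a small sketch on made-up data. The maps only mimic the :e and :v keys of a datom, and the names sample-datoms, duplicate-values and duplicate-entities are hypothetical.

```clojure
;; Made-up data: each map mimics the :e and :v keys of a datom.
(def sample-datoms
  [{:e 1 :v "Acme"}
   {:e 2 :v "Globex"}
   {:e 3 :v "Acme"}
   {:e 4 :v "Initech"}])

;; Duplicated values together with how often they occur.
(def duplicate-values
  (->> sample-datoms
       (map :v)
       (frequencies)
       (filter #(> (second %) 1))))

;; Entity ids whose value is shared with at least one other entity.
(def duplicate-entities
  (->> sample-datoms
       (group-by :v)
       (filter #(> (count (second %)) 1))
       (mapcat second)
       (map :e)))
```

With this sample data, duplicate-values contains the single pair ["Acme" 2] and duplicate-entities contains the ids 1 and 3.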
Being new to Lucene I'd like to find documents where a certain field is either within a given range or entirely absent. That is I'd like to combine the results of these two queries:
q=something AND field:[lower TO upper]
q=something AND -field:[* TO *]
Either query gives me the desired result but when I try to combine the two I get nothing:
q=something AND (field:[lower TO upper] OR -field:[* TO *])
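Here, something can be a more complex query. My query will actually be a Solr query issued from within a Java program, in case that makes a difference. How can this be done?
This should work as well. A purely negative clause such as -field:[* TO *] matches nothing when it is nested inside parentheses on its own, so each branch here pairs it with a positive clause: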
q=( (+something -field:[* TO *]) OR (+something +field:[lower TO upper]) )