Pull all entity attributes and values given a set of entity IDs in Datomic - datomic

I have a hashset of Entity IDs:
#{1234 5678 9012 4864 ...}
How can I return a collection of maps of each entity's attributes and values? I am guessing this is done with the Pull API?

Sure, for instance with pull-many:
(require '[datomic.api :as d])
(d/pull-many db '[*] (seq #{1234 5678 9012 4864}))
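If pull-many isn't convenient, the same result can be obtained by mapping d/pull over the set. This is just a sketch, assuming db is a current database value and the ids above exist in it; the attribute names in the result depend entirely on your schema.
;; Sketch: pull each entity individually and collect the maps.
;; Order follows the set's iteration order.
(def entity-ids #{1234 5678 9012 4864})

(mapv #(d/pull db '[*] %) entity-ids)
;; => [{:db/id 1234, ...} {:db/id 5678, ...} ...]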

Related

Ruby convert array of active records or objects into array of hashes

I have a Person object (an ActiveRecord model) with some fields like :name, :age, etc.
Person has a 1:1 relationship with something called Account, where every person has an account.
I have some code that does:
Account.create!(person: current_person)
where current_person is a specified existing Person active record object.
Note: the Account table has a person_id field,
and both models declare has_one for each other.
Now I believe we could do something like the following for bulk creation:
Account.create!([{person: person3}, {person: person2} ....])
I have an array of persons but am not sure of the best way to convert it to an array of hashes all having the same key.
Basically the reverse of Convert array of hashes to array is what I want to do.
Why not just loop over your array of objects?
[person1, person2].each{|person| Account.create!(person: person)}
But if any of the items you loop over fails in Account.create!, you may be left in a bad state, so you may want to wrap this in an ActiveRecord transaction.
ActiveRecord::Base.transaction do
[person1, person2].each{|person| Account.create!(person: person)}
end
The create method actually persists each hash individually, as shown in the source code, so it's probably not what you are looking for. Either way, the following code would do the job:
Account.create!(persons.map { |person| Hash[:person_id, person.id] })
If you need to create all records in the same database operation and are using Rails 6+, you could use the insert_all method.
Account.insert_all(persons.map { |person| Hash[:person_id, person.id] })
For previous versions of Rails you should consider using the activerecord-import gem.
# combination(1).to_a converts [1, 2, 3] to [[1], [2], [3]]
Account.import [:person_id], persons.pluck(:id).combination(1).to_a

Selecting entities with the highest value of some attribute

Suppose I have one million article entities in my backend with an inst attribute called date, or one million player entities with an int attribute called points. What's a good way to select the 10 latest articles or top-scoring players?
Do I need to fetch all million entities to the peer and then sort and drop from there?
Until getting hold of the reverse index becomes a Datomic feature, you could manually define one.
For example, for a :db.type/instant attribute, create an additional attribute of type :db.type/long which you would fill with
(- Long/MAX_VALUE (.getTime date))
and the latest 10 articles could be fetched with
(take 10 (d/index-range db reverse-attr nil nil))
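For concreteness, here is a sketch of what that could look like; the attribute name :article/date-reverse is made up for illustration.
;; Hypothetical derived attribute (name is illustrative), indexed so index-range can use it.
{:db/id #db/id[:db.part/db]
 :db/ident :article/date-reverse
 :db/valueType :db.type/long
 :db/cardinality :db.cardinality/one
 :db/index true
 :db.install/_attribute :db.part/db}

;; When asserting an article, also assert the derived value.
(let [date (java.util.Date.)]
  [{:db/id (d/tempid :db.part/user)
    :article/date date
    :article/date-reverse (- Long/MAX_VALUE (.getTime date))}])

;; Latest 10 articles: the smallest reverse values come first in the AVET index.
(take 10 (d/index-range db :article/date-reverse nil nil))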
Yes, you would need to fetch all the data, since there's no index that would help you out here.
I would have created my own "index" and normalized this data. You can have a separate set of N entities where you keep as many as you'd like. You could start with 10, or consider storing 100 to trade some (possibly negligible) speed for more flexibility. This index can be stored on a separate "singleton" entity that you add as part of your schema.
;; The attribute that stores the index
{:db/id #db/id[:db.part/db]
 :db/ident :indexed-articles
 :db/valueType :db.type/ref
 :db/cardinality :db.cardinality/many
 :db.install/_attribute :db.part/db}

;; The named index entity.
{:db/id #db/id[:db.part/db]
 :db/ident :articles-index}
You can have a database function that does this. Every time you insert a new entity that you want to "index", call this function.
[[:db/add tempid :article/title "Foo"]
[:db/add tempid :article/date ....]
[:index-article tempid 10]]
The implementation of index-article could look like this:
{:db/id #db/id[:db.part/user]
 :db/ident :index-article
 :db/fn #db/fn {:lang "clojure"
                :params [db article-id idx-size]
                :code (concat
                       ;; retract the "overflow" articles from the index set
                       (map (fn [article]
                              [:db/retract
                               (datomic.api/entid db :articles-index)
                               :indexed-articles
                               (:db/id article)])
                            (->> (:indexed-articles (datomic.api/entity db :articles-index))
                                 (sort-by (fn [article] ... implement me ... ))
                                 (drop (dec idx-size))))
                       ;; then add the new article to the index
                       [[:db/add (datomic.api/entid db :articles-index) :indexed-articles article-id]])}}
Disclaimer: I haven't actually tested this function, so it probably contains errors :) The general idea is that we remove any "overflow" entities from the set, and add the new one. When idx-size is 10, we want to ensure that only 9 items are in the set, and we add our new item to it.
Now you have an entity, :articles-index, that you can look up by ident, and the 10 most recent articles can be read from it (all refs are indexed) without causing a full database read.
;; "indexed" set of articles.
(d/entity db :articles-index)
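To read the indexed articles back out of that entity, a sketch like the following should work; d/touch just realizes all attributes of each referenced article.
;; Sketch: the :indexed-articles refs come back as a set of entities.
(->> (d/entity db :articles-index)
     :indexed-articles
     (map d/touch))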
I've been looking into this and think I have a slightly more elegant answer.
Declare your attribute as indexed with :db/index true
{:db/id #db/id[:db.part/db -1]
 :db/ident :ocelot/number
 :db/valueType :db.type/long
 :db/cardinality :db.cardinality/one
 :db/doc "An ocelot number"
 :db/index true
 :db.install/_attribute :db.part/db}
This ensures that the attribute is included in the AVET index.
Then the following gives you access to the "top ten", albeit using the low-level datoms call.
(take-last 10 (d/datoms (d/db conn) :avet :ocelot/number))
Obviously if you need to do any further filtering ("who are the top ten scorers in this club?") then this approach won't work, but at that point you have a much smaller amount of data in hand and shouldn't need to worry about indexing.
I did look extensively at the aggregation functions available in Datalog and had trouble getting my head around them; I am also uncertain whether e.g. max would use this index rather than doing a full scan of the data. Similarly, the (index-range ...) function almost certainly does use this index, but requires you to know the start and/or end values.
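For reference, when you do know a bound, index-range over the same attribute looks roughly like this (the value 100 is an arbitrary example; nil means no bound on that side):
;; Sketch: all :ocelot/number datoms with value >= 100, ascending by value.
(d/index-range (d/db conn) :ocelot/number 100 nil)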

Query to list all partitions in Datomic

What is a query to list all partitions of a Datomic database?
This should return
[[:db.part/db] [:db.part/tx] [:db.part/user] .... ]
where .... is all of the user defined partitions.
You should be able to get a list of all partitions in the database by searching for all entities associated with the :db.part/db entity via the :db.install/partition attribute:
(ns myns
  (:require [datomic.api :as d]))

(defn get-partitions [db]
  (d/q '[:find ?ident
         :where
         [:db.part/db :db.install/partition ?p]
         [?p :db/ident ?ident]]
       db))
Note
The current version of Datomic (build 0.8.3524) has a shortcoming: :db.part/tx and :db.part/user (two of the three built-in partitions) are treated specially and aren't actually associated with :db.part/db via :db.install/partition, so the result of the above query function won't include those two.
This problem is going to be addressed in a future build of Datomic. In the meantime, you should take care of including :db.part/tx and :db.part/user in the result set yourself, as sketched below.
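A sketch of one way to do that, reusing the get-partitions function above (the name all-partitions is hypothetical):
;; Union the query result with the two built-in partitions the query misses.
(defn all-partitions [db]
  (into #{[:db.part/tx] [:db.part/user]}
        (get-partitions db)))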
1st method - using query
=> (q '[:find ?i
        :where
        [:db.part/db :db.install/partition ?p]
        [?p :db/ident ?i]]
      (db conn))
2nd method - from db object
(filter #(instance? datomic.db.Partition %) (:elements (db conn)))
The second method returns a sequence of datomic.db.Partition objects, which may be useful if we want to get additional info about the partition.
Both methods have a known bug/inconsistency: they don't return the :db.part/tx and :db.part/user built-in partitions.

What is the compatible version of this query for NDB?

Maybe it's wrong but I always use this query for my app:
cme_only = Comput.all().filter('__key__ =', cid.key())
What is the compatible version of this query for NDB?
The metadata queries are very different.
Edit:
cid is an entity and cme_only is an iterable that I'm sure has only one value
cid = Comput.get_by_id(int(self.request.get('id')))
cme_only = Comput.all().filter('__key__ =', cid.key())
and then in template:
{{ for Comput in cme_only }}
I do not like it but it was enough
There's no need for metadata queries. The NDB way to spell a query on __key__ is as follows:
ModelClass.query(ModelClass._key == key_value)
That is, just like querying for property foo is done by filtering on ModelClass.foo == value, ModelClass._key is a pseudo-property representing the key.
The other posters are correct that if you just want a single entity given its full key, using the get() method on the Key object is better (faster and cheaper). Also, if e is an entity (model instance), in NDB the key is not e.key() but e.key (or e._key; yes, that's the same _key attribute I mentioned above, and it works as a class attribute and as an instance attribute).
And indeed, if you have a urlsafe key (e.g. 'agFfcg4LEghFbXBsb3llZRgDDA') the way to turn it into a Key object is ndb.Key(urlsafe='agFfcg4LEghFbXBsb3llZRgDDA').
Good luck!
If cid is your entity then you could do this:
from google.appengine.ext import ndb
cme_only = ndb.Key(Comput, cid.key.id()).get()
This will return basically the same entity you started with (cid), but in general this is one way of querying by key.
You can check more on how to construct keys in the docs.

Case insensitive where clause in gql query for StringProperty

Using the Google App Engine datastore, is there a way to perform a GQL query that specifies a WHERE clause on a StringProperty datatype that is case insensitive? I am not always sure what case the value will be in. The docs specify that the WHERE comparison is case sensitive for my values; is there a way to make it insensitive?
For instance, the db Model would be this:
from google.appengine.ext import db

class Product(db.Model):
    id = db.IntegerProperty()
    category = db.StringProperty()
and the data looks like this:
id category
===================
1 cat1
2 cat2
3 Cat1
4 CAT1
5 CAT3
6 Cat4
7 CaT1
8 CAT5
I would like to say
gqlstring = "WHERE category = '{0}'".format('cat1')
returnvalue = Product.gql(gqlstring)
and have returnvalue contain
id category
===================
1 cat1
3 Cat1
4 CAT1
7 CaT1
I don't think there is an operator like that in the datastore.
Do you control the input of the category data? If so, you should choose a canonical form to store it in (all lowercase or all uppercase). If you need to keep the original case for some reason, you could store two properties: one with the original value, one with the standardized one. That way you can do a normal WHERE clause.
The datastore doesn't support case insensitive comparisons, because you can't index queries that use them (barring an index that transforms values). The solution is to store a normalized version of your string in addition to the standard one, as Peter suggests. The property classes in the AETycoon library may prove helpful, in particular, DerivedProperty.
This thread was helpful and prompted me to contribute a similar approach that makes partial matching possible. I add one more field to the datastore kind, save each word of the normalized phrase as a set, and then use an IN filter to match. This example is in Clojure. The normalization part should translate easily to Java at least (thanks to #raek on #clojure), while the datastore interaction should be convertible to any language:
(use '[clojure.contrib.string :only [split lower-case]])
(use '[appengine-magic.services.datastore :as ds])

; initialize datastore kind entity
(ds/defentity AnswerTextfield [value, nvalue, avalue])

; normalize and lowercase a string
(defn normalize [string-to-normalize]
  (lower-case
   (apply str
          (remove #(= (Character/getType %) Character/NON_SPACING_MARK)
                  (java.text.Normalizer/normalize string-to-normalize
                                                  java.text.Normalizer$Form/NFKD)))))

; save original value, normalized value and splitted normalized value
(defn textfield-save! [value]
  (ds/save!
   (let [nvalue (normalize value)]
     (ds/new* AnswerTextfield [value nvalue (split #" " nvalue)]))))

; normalized search
(defn search-normalized [value]
  (ds/query :kind AnswerTextfield
            :filter [(= :nvalue (normalize value))]))

; partial normalized word search
(defn search-partial [value]
  (flatten
   (let [coll []]
     (for [splitted-value (split #" " (normalize value))]
       (merge coll
              (ds/query :kind AnswerTextfield
                        :filter [(in :avalue [splitted-value])]))))))
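A hypothetical usage sketch of the functions above (the phrase and queries are made up, and assume the defs above have been evaluated in an app with the datastore service available):
; save a phrase, then find it regardless of case/accents, or by a single word
(textfield-save! "Crème Brûlée")
(search-normalized "CREME BRULEE")   ; full normalized match
(search-partial "brulee")            ; matches on one word of the phrase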
