List all Entities of single Datastore Kind using GetMulti - google-app-engine

Is there a way for me to use datastore's GetMulti, or another function built into the "appengine/datastore" package, to get all entities of a single kind?
For instance, I have a kind "Queue" with many entities that have two to three properties. Each entity has a unique stringID and what I'm trying to get is a slice or other comparable data type of each unique stringID.
The purpose of Queue is to store some metadata and the unique key names that I'll be looping over and performing a cron task on (e.g. keys "user1", "user2", and "user3" are stored as kind Queue, then - during cron - are looped over and interacted with).
Thanks.

I'm new to Google App Engine and I didn't read the documentation before diving in. Now that I have actually read the docs, it looks like I'll be answering my own question. This can be accomplished with a simple keys-only query, looping over the keys, and appending the StringID of each key to a slice of strings:
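var queuedUsers []string
q := datastore.NewQuery("Queue").KeysOnly()
// A keys-only query returns just the keys, so no destination slice is needed.
keys, err := q.GetAll(c, nil)
if err != nil {
	// Handle the error instead of discarding it (assumes the enclosing function returns an error).
	return err
}
for _, v := range keys {
	queuedUsers = append(queuedUsers, v.StringID())
}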

Related

Upsert a document with value from filter

I have a collection with following structure in MongoDB:
{
"userId": String,
"refs": Set<String>
}
I need to update documents in this collection: I want to add a new string to refs for every user matched by an $in filter.
But if a user does not exist, I need to "upsert" that user.
In code (golang) it looks like this:
filter := bson.M{
	"userId": bson.M{
		"$in": tokens, // tokens is []string
	},
}
update := bson.M{
	"$addToSet": bson.M{
		"refs": newReference,
	},
}
ctx, _ := newDbOperationContext()
_, err := driver.UpdateMany(ctx, filter, update)
For existing users this works fine and the reference is added. But for users that do not exist, nothing happens.
I passed options.UpdateOptions.SetUpsert(true) as opts to driver.UpdateMany(bson, bson, opts...), but as a result I got a document without userId:
{
"_id": ObjectId("..."),
"refs": ["new_reference"]
}
So my question is: how do I upsert the new values together with the userId field?
There are about 2*10^6 users to update, so I would like to do this with a batch request. Creating users one by one and then updating them is not an option here, I think.
Thanks for your support!
According to previous questions in SO like this one and this other one it does not seem possible to perform multiple upserts using only the $in operator because it will insert only a single document (the one matching the filter):
If no document matches the query criteria, db.collection.update() inserts a single document.
So, as mentioned by @Kartavya, the best approach is to perform multiple write operations using BulkWrite.
For that, append one upsert operation (a WriteModel) per user in tokens, using that user's ID as the filter; all of them can share the same $addToSet update operation:
tokens := [...]string{"userId1", "userId3"}
newRef := "refXXXXXXX"
// all docs can use the same $addToSet update operation
updateOp := bson.D{{"$addToSet", bson.D{{"refs", newRef}}}}
// we'll append one update for each userId in tokens as a filter
upserts := []mongo.WriteModel{}
for _, t := range tokens {
upserts = append(
upserts,
mongo.NewUpdateOneModel().SetFilter(bson.D{{"userId", t}}).SetUpdate(updateOp).SetUpsert(true))
}
opts := options.BulkWrite().SetOrdered(false)
res, err := col.BulkWrite(context.TODO(), upserts, opts)
if err != nil {
log.Fatal(err)
}
fmt.Println(res)
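The answer above builds all write models in one slice. With roughly 2*10^6 users it may be worth splitting tokens into batches first and issuing one BulkWrite per batch. Below is a minimal sketch of that splitting step, using only the standard library; the chunk helper and the batch size of 2 are hypothetical and chosen for illustration, and the MongoDB calls themselves are left out.

```go
package main

import "fmt"

// chunk splits ids into consecutive batches of at most size elements.
// chunk is a hypothetical helper; each batch could then be turned into
// one slice of upsert models and one BulkWrite call.
func chunk(ids []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}

func main() {
	tokens := []string{"userId1", "userId2", "userId3", "userId4", "userId5"}
	for i, batch := range chunk(tokens, 2) {
		// In real code, build one upsert model per ID in batch here
		// and pass the resulting slice to BulkWrite.
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```

A suitable batch size depends on the deployment, so it is best treated as a tunable value rather than a fixed constant.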
Looking at your use case, I think the best solution is the following:
Since you have a high scale and wish to make batch requests, it is best to use BulkWrite: the db.collection.bulkWrite() method provides the ability to perform bulk insert, update, and remove operations.
Example: https://godoc.org/go.mongodb.org/mongo-driver/mongo#example-Collection-BulkWrite
The example uses the UpdateOne model, but the UpdateMany model is supported as well. It also has a SetUpsert(true) function.
Now for the _id field: if the document you upsert includes an _id field, the new document keeps that _id; otherwise MongoDB auto-generates an _id when it inserts the document.
I think, it will not be much of a pain to have _id field in your documents, so that way your problem is solved.
Regarding the scale, I suggest using BulkWrite with UpdateOne or UpdateMany models.
Hope this helps.
With an upsert, if no matching document exists, only the update part of the query is inserted into the database. That is why your output looks like that. You can see here.

Dynamically query mongodb with golang

I'm trying to query my mongodb database using golang (and the mgo library) with only one function, and the method I am currently using is:
err = c.Find(sel(items)).Sort("-createdAt").All(&result)
where items is a map whose keys are the names of the fields I am searching in the db and whose values are what I want to search by,
and sel() is:
func sel(query map[string]string) bson.M {
result := make(bson.M, len(query))
result[ ] = "$in"
for k, v := range query {
result[k] = v
}
return result
}
Currently it returns all of the results where at least one of the fields matches the input map (a logical OR); however, I would like it to return the logical AND of these fields.
Does anyone have suggestions on how to modify the existing code or a new way of efficiently querying the database?
Thank you
I don't know what this line is supposed to mean:
result[ ] = "$in"
As it is a compile-time error.
But the elements of the query document (the conditions) are combined with a logical AND by default, so this is all it takes:
func sel(query map[string]string) bson.M {
result := make(bson.M, len(query))
for k, v := range query {
result[k] = v
}
return result
}
If this gives you all the documents in the collection, then that means all the key-value pairs match all the documents. Experiment with simple filters to see that it works.
Also note that the mgo package also accepts a wide range of maps and structs, not just bson.M. Documentation of Collection.Find() has this to say about the allowed types:
The document may be a map or a struct value capable of being marshalled with bson. The map may be a generic one using interface{} for its key and/or values, such as bson.M, or it may be a properly typed map. Providing nil as the document is equivalent to providing an empty document such as bson.M{}.
So you can use your map which is of type map[string]string without converting it:
err = c.Find(items).Sort("-createdAt").All(&result)
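To illustrate the implicit AND described in this answer without a running database, here is a small in-memory sketch. It assumes map[string]interface{} as a stand-in for bson.M, and matches is a hypothetical helper that only mimics simple equality conditions; it is not how mgo or MongoDB evaluate queries internally.

```go
package main

import "fmt"

// sel copies the search terms into a generic filter document.
// map[string]interface{} stands in for bson.M in this sketch.
func sel(query map[string]string) map[string]interface{} {
	result := make(map[string]interface{}, len(query))
	for k, v := range query {
		result[k] = v
	}
	return result
}

// matches mimics the implicit AND: every field in filter must be
// present in doc with an equal value. (Hypothetical helper.)
func matches(doc map[string]string, filter map[string]interface{}) bool {
	for k, want := range filter {
		got, ok := doc[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	filter := sel(map[string]string{"author": "ann", "status": "open"})
	docs := []map[string]string{
		{"author": "ann", "status": "open"},
		{"author": "ann", "status": "closed"},
		{"author": "bob", "status": "open"},
	}
	for _, d := range docs {
		// Only documents that satisfy both conditions report true.
		fmt.Println(d["author"], d["status"], matches(d, filter))
	}
}
```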

Best practice to add the ID to a datastore entity?

When creating an entity using an IncompleteKey so that each record is inherently unique, what is the best way to add the key back into the record so it can be passed around in the structure- at the time of creation?
For example, is something like this (untested code) a good idea, using Transactions?
err = datastore.RunInTransaction(c, func(c appengine.Context) error {
	incompleteKey := datastore.NewIncompleteKey(c, ENTITY_TYPE, nil)
	entityKey, err := datastore.Put(c, incompleteKey, &MyStruct)
	if err != nil {
		return err
	}
	MyStruct.SelfID = entityKey.IntID()
	_, err = datastore.Put(c, entityKey, &MyStruct)
	return err
}, nil)
As a followup- I'm guessing this should almost never fail since it will almost never operate over the same incompleteKey?
You don't need to put MyStruct into the DB twice; it's unnecessary overhead. The key is stored as part of the entity and can be retrieved when needed.
There is a good example in the docs on how to store an entity and use its ID as a reference: https://cloud.google.com/appengine/docs/go/datastore/entities#Go_Ancestor_paths
When you want to get keys for entities you can do this using this approach:
https://cloud.google.com/appengine/docs/go/datastore/queries#Go_Retrieving_results - (edited) notice in the example that keys and structs are populated in 1 operation.
If you query an entity by key, you already know its ID.
So there is no need to have the ID as a separate property. If you want to pass it around with the entity for your business logic, you can create a wrapper: either a generalized one using interface{} for the entity struct, or a strongly typed one (one per entity struct).
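The two wrapper styles mentioned above could look roughly like this. It is only a sketch: User, Keyed, and KeyedUser are hypothetical names, and the literal 42 stands in for the value that key.IntID() would return after a Put.

```go
package main

import "fmt"

// User is a hypothetical entity struct; it has no ID property of its own.
type User struct {
	Name string
}

// Keyed is the generalized wrapper: it pairs any entity with the
// numeric ID taken from its datastore key.
type Keyed struct {
	ID     int64
	Entity interface{}
}

// KeyedUser is the strongly typed variant; one such type would be
// declared per entity struct.
type KeyedUser struct {
	ID int64
	User
}

func main() {
	u := User{Name: "Bob"}

	// 42 stands in for the ID obtained from the key returned by Put.
	generic := Keyed{ID: 42, Entity: u}
	typed := KeyedUser{ID: 42, User: u}

	// The generic wrapper needs a type assertion to reach the fields.
	fmt.Println(generic.ID, generic.Entity.(User).Name)
	// The typed wrapper exposes the embedded fields directly.
	fmt.Println(typed.ID, typed.Name)
}
```

The typed wrapper costs one declaration per entity but avoids type assertions at every use; the generic one is shorter to set up but pushes type checks to runtime.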

Google appengine queries fail with namespacing

I am introducing namespacing into my application, but I have run into an issue with one of my existing queries that performs the following operation in order to determine whether or not an entity exists for the given key.
// c is of type context.Context
c, _ = appengine.Namespace(c, "name")
k := datastore.NewKey(c, "Kind", "", id, nil)
q := datastore.NewQuery("Kind").Filter("__key__ =", k).KeysOnly()
keys, err := q.GetAll(c, nil)
When this command is executed with a namespace applied to the context, it gives back the following error:
datastore_v3 API error 1: __key__ filter namespace is but query namespace is db
I could just use a Get query instead, but I don't need to actually retrieve the entity at all. Plus, keys-only queries are free!
Update
It seems that all queries are failing after I have introduced namespacing. The documentation doesn't mention any sort of special treatment for the indices:
https://cloud.google.com/appengine/docs/go/multitenancy/multitenancy
"By default, the datastore uses the current namespace for datastore requests. The API applies this current namespace to datastore.Key objects when they are created. Therefore, you need to be careful if an application stores Key objects in serialized forms, since the namespace is preserved in those serializations."
Using namespaces with the Datastore
https://cloud.google.com/appengine/docs/go/multitenancy/multitenancy#Go_Using_namespaces_with_the_Datastore

What ways does Go have to easily convert data into bytes or strings

I've been developing a couple apps using Google App Engine Go SDK that use Memcache as a buffer for loading data from the Datastore. As the Memcache is only able to store data as []byte, I often find myself creating functions to encode the various structures as strings, and also functions to reverse the process. Needless to say, it is quite tedious when I need to do this sort of thing 5 times over.
Is there an easy way to convert any arbitrary structure that can be stored in Datastore into []byte in order to store it in Memcache and then load it back without having to create custom code for various structures in GAE Golang?
http://golang.org/pkg/encoding/gob or http://golang.org/pkg/encoding/json can turn arbitrary data types into []byte slices, provided the data structures being encoded follow certain rules. You probably want one of them: gob encodes to smaller sizes, but json is more easily shared with other languages, if that is a requirement.
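As a rough illustration of the gob route, the sketch below round-trips a struct through []byte using only the standard library. The User type and the encode/decode helper names are assumptions made for this example; the Memcache calls are omitted.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// User is a hypothetical entity; gob only encodes exported fields.
type User struct {
	Name  string
	Email string
}

// encode serializes any gob-encodable value into a byte slice.
func encode(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode fills v, which must be a pointer, from gob-encoded data.
func decode(data []byte, v interface{}) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(v)
}

func main() {
	data, err := encode(User{Name: "Bob", Email: "bob@example.com"})
	if err != nil {
		log.Fatal(err)
	}
	// data is the []byte that would be stored as the Memcache item value.
	var u User
	if err := decode(data, &u); err != nil {
		log.Fatal(err)
	}
	fmt.Println(u.Name, u.Email)
}
```

Because both helpers take interface{}, the same pair works for any struct with exported fields, which avoids writing one encoder per type.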
I found myself needing the same thing. So I created a package called:
AEGo/ds
Documentation | Source
go get github.com/scotch/aego/ds
It uses the same API as "appengine/datastore", so it will work as a drop-in replacement.
import "github.com/scotch/aego/v1/ds"
u = &User{Name: "Bob"}
key := datastore.NewKey(c, "User", "bob", 0, nil)
key, err := ds.Put(c, key, u)
u = new(User)
err = ds.Get(c, key, u)
By default it will cache all Puts and Gets to memcache, but you can modify this behavior by calling the ds.Register method:
ds.Register("User", true, false, false)
The Register method takes a string representing the Kind and three bools: userDatastore, useMemcache, and useMemory. Passing a true value will cause AEGo/ds to persist the record to that store. The Memory store is useful for records that you do not expect to change, but it could contain stale data if you have more than one instance running.
Supported methods are:
Put
PutMulti
Get
GetMulti
Delete
DeleteMulti
AllocateIDs
Note: Currently caching only occurs with Get. GetMulti pulls from the datastore.
AEGo/ds is a work in progress, but the code is well tested. Any feedback would be appreciated.
And to answer your question, here's how I serialized the entities to gob for the memcache persistence.
