I'm trying to query my MongoDB database from Go (using the mgo library) with a single function. The method I am currently using is:
err = c.Find(sel(items)).Sort("-createdAt").All(&result)
Here items is a map: each key is the name of a field I am searching in the database, and each value is what I want to search by.
sel() is:
func sel(query map[string]string) bson.M {
result := make(bson.M, len(query))
result[ ] = "$in"
for k, v := range query {
result[k] = v
}
return result
}
Currently it returns all of the results where at least one of the fields matches the input map (a logical OR). However, I would like it to return the logical AND of these fields.
Does anyone have suggestions on how to modify the existing code, or a new way of efficiently querying the database?
Thank you
I don't know what this line is supposed to mean:
result[ ] = "$in"
It is a compile-time error.
But the elements of the query document (the conditions) are combined with a logical AND by default, so this is all it takes:
func sel(query map[string]string) bson.M {
result := make(bson.M, len(query))
for k, v := range query {
result[k] = v
}
return result
}
If this gives you all the documents in the collection, then that means all the key-value pairs match all the documents. Experiment with simple filters to see that it works.
Also note that mgo accepts a wide range of maps and structs, not just bson.M. The documentation of Collection.Find() says this about the allowed types:
The document may be a map or a struct value capable of being marshalled with bson. The map may be a generic one using interface{} for its key and/or values, such as bson.M, or it may be a properly typed map. Providing nil as the document is equivalent to providing an empty document such as bson.M{}.
So you can use your map which is of type map[string]string without converting it:
err = c.Find(items).Sort("-createdAt").All(&result)
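To see the implicit AND without touching a database, here is a minimal plain-Go sketch. It does not call mgo; matchesAll is a hypothetical helper that only imitates how a query document with several fields is evaluated against one document.

```go
package main

import "fmt"

// matchesAll reports whether doc satisfies every condition in filter.
// Hypothetical helper: it imitates the implicit AND of a query document
// in plain Go and is not part of mgo.
func matchesAll(doc, filter map[string]string) bool {
	for k, want := range filter {
		if got, ok := doc[k]; !ok || got != want {
			return false // one failing condition rejects the document
		}
	}
	return true // an empty filter matches every document
}

func main() {
	doc := map[string]string{"name": "alice", "city": "paris"}
	fmt.Println(matchesAll(doc, map[string]string{"name": "alice", "city": "paris"})) // true
	fmt.Println(matchesAll(doc, map[string]string{"name": "alice", "city": "rome"}))  // false
	fmt.Println(matchesAll(doc, map[string]string{}))                                 // true
}
```

If a filter built this way still returns every document, the filter is probably empty or every pair really does match.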
Related
I used to use Milvus 1.0, where I could get all IDs by using the get_collection_stats and list_id_in_segment APIs.
Now I am trying Milvus 2.0 and I also want to get all IDs from it, but I can't find any way to do it.
Milvus v2.0.x supports queries using boolean expressions.
This can be used to return IDs by checking whether the primary key field is greater than or equal to zero.
Let's assume you are using this schema for your collection,
taken from https://github.com/milvus-io/pymilvus/blob/master/examples/hello_milvus.py
as of 3/8/2022:
from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

dim = 8  # embedding dimension; an assumed value, use the size of your own vectors

fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="random", dtype=DataType.DOUBLE),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dim)
]
schema = CollectionSchema(fields, "hello_milvus is the simplest demo to introduce the APIs")
hello_milvus = Collection("hello_milvus", schema, consistency_level="Strong")
Remember to insert something into your collection first... see the pymilvus example.
Here you want to query out all IDs (pk).
You cannot currently list IDs specific to a segment, but this returns all IDs in a collection:
res = hello_milvus.query(
    expr="pk >= 0",
    output_fields=["pk", "embeddings"]
)
for x in res:
    print(x["pk"], x["embeddings"])
I think this is the only way to do it now, since list_id_in_segment was removed.
I have a collection with the following structure in MongoDB:
{
"userId": String,
"refs": Set<String>
}
I need to update documents in this collection: I want to add a new string to refs for the users matched by an $in filter.
But if a user does not exist, I need to upsert it.
In code (Go) it looks like this:
filter := bson.M{
	"userId": bson.M{
		"$in": tokens, // tokens is []string
	},
}
update := bson.M{
	"$addToSet": bson.M{
		"refs": newReference,
	},
}
ctx, _ := newDbOperationContext()
_, err := driver.UpdateMany(ctx, filter, update)
For existing users this works: the reference is added. But for users that do not exist, nothing happens.
I then set opts in driver.UpdateMany(bson, bson, opts...) to options.UpdateOptions.SetUpsert(true), but as a result I got a document without userId:
{
"_id": ObjectId("..."),
"refs": ["new_reference"]
}
So my question is: how do I upsert the new documents with the userId field set?
The scale is about 2*10^6 users to update, so I would like to do this with a batch request. Creating the users one by one and then updating them is not an option here, I think.
Thanks for your support!
According to previous questions on SO, like this one and this other one, it does not seem possible to perform multiple upserts using only the $in operator, because it will insert only a single document (the one matching the filter):
If no document matches the query criteria, db.collection.update() inserts a single document.
So, as mentioned by @Kartavya, the best option is to perform multiple write operations using BulkWrite.
For that, append one upsert operation (a WriteModel) per user in tokens, each using that user's ID as its filter; all of them can share the same $addToSet update operation:
tokens := [...]string{"userId1", "userId3"}
newRef := "refXXXXXXX"
// all docs can use the same $addToSet update operation
updateOp := bson.D{{"$addToSet", bson.D{{"refs", newRef}}}}
// we'll append one update for each userId in tokens as a filter
upserts := []mongo.WriteModel{}
for _, t := range tokens {
	upserts = append(
		upserts,
		mongo.NewUpdateOneModel().SetFilter(bson.D{{"userId", t}}).SetUpdate(updateOp).SetUpsert(true))
}
opts := options.BulkWrite().SetOrdered(false)
res, err := col.BulkWrite(context.TODO(), upserts, opts)
if err != nil {
	log.Fatal(err)
}
fmt.Println(res)
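With around 2*10^6 users you may not want to build every write model in memory before sending anything. One option, sketched below under that assumption, is to split tokens into fixed-size batches and issue one BulkWrite per batch. The chunk helper and the batch size of 2 are hypothetical and only for illustration; the sketch itself does not call the MongoDB driver.

```go
package main

import "fmt"

// chunk splits ids into consecutive slices of at most size elements.
// Hypothetical helper, not part of the MongoDB driver.
func chunk(ids []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var out [][]string
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		out = append(out, ids[start:end])
	}
	return out
}

func main() {
	tokens := []string{"userId1", "userId2", "userId3", "userId4", "userId5"}
	for i, batch := range chunk(tokens, 2) {
		// In real code, each batch would be turned into upsert models
		// and passed to its own BulkWrite call.
		fmt.Println(i, batch)
	}
}
```

The right batch size depends on document size and server limits, so treat it as a value to tune rather than a recommendation.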
Looking at your use case, I think the best solution will be the following :
Since you have a high scale and wish to make batch requests, it is best to use BulkWrite : The db.collection.bulkWrite() method provides the ability to perform bulk insert, update, and remove operations.
Example : https://godoc.org/go.mongodb.org/mongo-driver/mongo#example-Collection-BulkWrite
This uses the UpdateOne model, but UpdateMany models are supported as well. Each model also has a SetUpsert(true) function.
Now for the _id field: if the document you upsert includes an _id field, the new document keeps that _id; otherwise MongoDB auto-generates an _id while inserting the document.
I think it will not be much of a pain to have an _id field in your documents, so that way your problem is solved.
Regarding the scale, I suggest using BulkWrite with UpdateOne or UpdateMany models.
Hope this helps.
In the case of an upsert, if the document is not present, only the update part of the query is inserted into the database. That is why your output looks like that. You can see here.
I am trying to create a single search box on my website.
First I split the search input into multiple strings using split().
Then I loop over those strings and create a query for each one. These queries are stored in a list.
In the next step I execute all those queries and store the results (rows) in another list.
The next thing I want to do is union all these results (rows). The final result should be the output of a query containing all the different keywords used in the search box.
This is my code:
def ajaxlivesearch():
    str = request.vars.values()[0]
    a=str.split()
    items = []
    q = []
    r =[]
    for partialstr in a:
        q.append((db.profiel.sport.like('%'+partialstr+'%'))|(db.profiel.speelsterkte.like('%'+partialstr+'%'))|(db.profiel.plaats.like('%'+partialstr+'%')))
    for query in q:
        r.append(db(query).select(groupby=db.profiel.id))
    for results in r:
        for (i,row) in enumerate(results):
            items.append(DIV(A(B(row.id_user.first_name) ,NBSP(1), B(row.id_user.last_name),BR(), I(row.sport),I(','), NBSP(1), I(row.speelsterkte),I(','), NBSP(1),I(row.plaats),HR(), _id="res%s"%i, _href=row.id_user, _onclick="copyToBox($('#res%s').html())"%i), _id="resultLiveSearch"))
    return TAG[''](*items)
My question is: How do I union the multiple results(rows)?
You can get the union of two Rows objects (removing duplicates) as follows:
rows_union = rows1 | rows2
However, it would be more efficient to get all the records in a single query. To simplify, you can also use the .contains method rather than using .like and wrapping each term with %s.
fields = ['sport', 'speelsterkte', 'plaats']
query_terms = [db.profiel[f].contains(term) for f in fields for term in a]
query = reduce(lambda a, b: a | b, query_terms)
results = db(query).select()
Also, you are not using any aggregation functions, so it is not clear why you have specified the groupby argument (and in any case, each record has a unique id, so grouping would have no effect). Perhaps you instead meant orderby=db.profiel.id.
Finally, it is probably not a good idea to do request.vars.values()[0], as request.vars is a dictionary-like object, and the particular value of interest is not guaranteed to be the first item in .values(). Instead, just refer to the name of the particular variable (e.g., request.vars.keyword), which is also more efficient because you are extracting a single item rather than converting all values to a list.
Is there a way for me to use datastore's GetMulti, or another function built into the "appengine/datastore" package, to get all entities of a single kind?
For instance, I have a kind "Queue" with many entities that have two to three properties. Each entity has a unique stringID and what I'm trying to get is a slice or other comparable data type of each unique stringID.
The purpose of Queue is to store some metadata and the unique key names that I'll be looping over and performing a cron task on (e.g. keys "user1", "user2", and "user3" are stored as kind Queue, then - during cron - are looped over and interacted with).
Thanks.
I'm new to Google App Engine and I didn't read the documentation before diving in. Now that I actually read the docs, it looks like I'll be answering my own question. This can be accomplished via a simple query, looping over the Keys, and appending the StringID of each key to a slice of strings:
var queuedUsers []string
q := datastore.NewQuery("Queue").KeysOnly()
keys, err := q.GetAll(c, nil)
if err != nil {
	// handle error
}
for _, v := range keys {
	queuedUsers = append(queuedUsers, v.StringID())
}
Let's say I have entities a, b and c all of the same type, and the situation is like this:
entity a is parent for entity b
entity b is parent for entity c
Now if I do the following query
query = ndb.Query(ancestor=a.key)
result = query.fetch()
The result will contain both b and c entities. Is there a way I can filter out c so that only entities that are direct descendants remain? Any way apart from me going through the results and removing them I mean.
The only way to do this is to modify your schema, adding a 'parent' KeyProperty that references an entity's direct parent, then filtering on that.
Actually, this is not supported at all. Nick's answer does work but only if you can specify the entity kind in your query which the OP did not specify:
"Kindless queries cannot include filters on properties. They can, however, filter by Entity Key by passing Entity.KEY_RESERVED_PROPERTY as the property name for the filter. Ascending sorts on Entity.KEY_RESERVED_PROPERTY are also supported."
This is a little late, but it may help anyone with the same problem.
The solution is to first do a keys-only query and take the subset of keys which are direct descendants.
With that subset of keys, you can batch get the desired entities.
I'm unfamiliar with Python, so here's an example in Go:
directDescKeys := make([]*datastore.Key, 0)
q := datastore.NewQuery("A").Ancestor(parentKey).KeysOnly()
for it := q.Run(ctx); ; {
	key, err := it.Next(nil)
	if err == datastore.Done {
		break
	} else if err != nil {
		// handle error
	}
	// keep only keys whose immediate parent is parentKey
	if key.Parent().Equal(parentKey) {
		directDescKeys = append(directDescKeys, key)
	}
}
entities := make([]*A, len(directDescKeys))
if err := datastore.GetMulti(ctx, directDescKeys, entities); err != nil {
	// handle error
}
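The filtering step can be tried in isolation with a small stand-in for datastore keys. This is only a sketch: the key type and directChildren function below are hypothetical, and the sketch compares parents by pointer identity, whereas real datastore keys should be compared by value.

```go
package main

import "fmt"

// key is a hypothetical stand-in for a datastore key:
// a name plus an optional parent.
type key struct {
	name   string
	parent *key
}

// directChildren keeps only the keys whose immediate parent is root.
func directChildren(root *key, descendants []*key) []*key {
	var out []*key
	for _, k := range descendants {
		if k.parent == root {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	a := &key{name: "a"}
	b := &key{name: "b", parent: a}
	c := &key{name: "c", parent: b}
	// b and c are both descendants of a, but only b is a direct child.
	for _, k := range directChildren(a, []*key{b, c}) {
		fmt.Println(k.name) // prints "b"
	}
}
```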