I'd like my users to be able to update the slug on the URL, like so:
url.co/username/projectname
I could use the primary key but unfortunately Firestore does not allow any modifcation on assigned uid once set so I created a unique slug field.
Example of structure:
projects: {
P10syfRWpT32fsceMKEm6X332Yt2: {
slug: "majestic-slug",
...
},
K41syfeMKEmpT72fcseMlEm6X337: {
slug: "beautiful-slug",
...
},
}
A way to modify the slug would be to delete and copy the data on a new document, doing this becomes complicated as I have subcollections attached to the document.
I'm aware I can query by document key like so:
var doc = db.collection("projects");
var query = doc.where("slug", "==", "beautiful-slug").limit(1).get();
Here comes the questions.
Wouldn't this be highly impractical as if I have more than +1000 docs in my database, each time I will have to call a project (url.co/username/projectname) wouldn't it cost +1000 reads as it has to query through all the documents? If yes, what would be the correct way?
As stated in this answer on StackOverflow: https://stackoverflow.com/a/49725001/7846567, only the document returned by a query is counted as a read operation.
Now for your special case:
doc.where("slug", "==", "beautiful-slug").limit(1).get();
This will indeed result in a lot of read operations on the Firestore server until it finds the correct document. But by using limit(1) you will only receive a single document, this way only a single read operation is counted against your limits.
Using the where() function is the correct and recommended approach to your problem.
Related
I'm hoping this is something really simple that I've miss understood as I'm new to both DynamoDb and nodeJs.
I have a table that has id, siteId.
These are being used to create a Global Secondary Index named IdxSiteId.
I need to be able to grab some of the other Item's values (not shown in screenshot) using the siteId. Reading things the best option is to use Query rather than GetItemBatch or Scan. Through trail and error got to this point.
const getWarnings = async (siteId) => {
const params = {
TableName: process.env.WARNINGS_TABLE_NAME,
IndexName: 'IdxSiteId',
KeyConditionExpression: 'SiteId = :var_siteId',
ExpressionAttributeValues: {
':var_siteId': siteId
},
ProjectionExpression: 'id, endTime, startTime, warningSubType',
ScanIndexForward: false
};
return new DynamoDB.DocumentClient().query(params).promise();
};
While this is the closest it has appeared to working I'm getting the following error Error retrieving current warnings ValidationException: Query condition missed key schema element: siteId and looking at the examples online I don't know what I'm doing wrong at this point.
As I said I'm sure this is super simple, but I could really do with a pointer.
Field names are case sensitive. Your GSI partition key is siteId. Your query is SiteId = :var_siteId. It should be siteId = :var_siteId.
I have a List<Key> which I would like to retrieve the full data records for but with applying additional filtering to it.
I can retrieve them via dbService.lookup(Project, keys) but lookup doesn't allow me to apply additional filtering.
This is essentially what I want to do:
dbService.query(Project)
..filter('__key__ IN', keys)
..filter('acl_read IN', roles)
..run();
but since __key__ is not supported in Google Cloud's Dart implementation, I cannot run this query.
I could do:
projects = dbService.lookup(keys);
projects.removeWhere((project) => (project.acl_read.fold(false, (result, key) => result || members.contains(key))));
but this seems not like the right way of achieving this.
So what's the right way of doing this?
There isn't a server-based method to do what you're looking to do, so your method of post filtering on the client-side is how you'd do it..
Alternatively, if you know that all querying all the keys with your filter results in a small set of keys then what you have in List, then do a full query first and then find the Union of results and List
I am new to cloudant , no-sql data base (i had worked on mongodb )
1) is there any cloudant ui to write the queires to find the resultset for developing.
2) how to create map-reduce in cloudant ?..
can u please reply me or send your thoughts.
The search indexes are written in JavaScript (at the moment, Cloduant has launched their own "Cloudant Query" which promises to be easier to work with but I haven't had the time to try it properly yet.)
Say you have documents in your DB which contain a field called "UserName" and you want to create a view on all these. You could write a function like this;
function(doc) {
if ( typeof doc.UserName !== "undefined" ) {
emit([doc.UserName], doc._id);
}
}
For example (it will output the user names and document ids)
If a given user name could be associated with multiple documents you could do this, for example;
function(doc) {
if ( typeof doc.UserName !== "undefined" ) {
emit([doc.UserName,doc._id], 1);
}
}
and also use the built-in "count" or "sum" reduce functions that Cloudant provides to tally the number of documents a given user name is associated with etc.
You can use the UI in the Cloudant DB dashboard to execute queries or (as I personally favour) use a tool like Postman (https://www.getpostman.com/)
One word of warning though; error- and sanity -checking of your JavaScript code is pretty much non-existent and you'll only know that something isn't working when you hit "save & build index" which can be a major pain if you're working on large databases (it can grind the whole thing to a halt). A pro tip, therefore, is to work out your indexes on smaller data sets in some safe little sandbox database before you let it lose on anything important...
All of this is supposedly going to be Much Better with Cloudant Query.
I want to get an entity key knowing entity ID and an ancestor.
ID is unique within entity group defined by the ancestor.
It seems to me that it's not possible using ndb interface. As I understand datastore it may be caused by the fact that this operation requires full index scan to perform.
The workaround I used is to create a computed property in the model, which will contain the id part of the key. I'm able now to do an ancestor query and get the key
class SomeModel(ndb.Model):
ID = ndb.ComputedProperty( lambda self: self.key.id() )
#classmethod
def id_to_key(cls, identifier, ancestor):
return cls.query(cls.ID == identifier,
ancestor = ancestor.key ).get( keys_only = True)
It seems to work, but are there any better solutions to this problem?
Update
It seems that for datastore the natural solution is to use full paths instead of identifiers. Initially I thought it'd be too burdensome. After reading dragonx answer I redesigned my application. To my suprise everything looks much simpler now. Additional benefits are that my entities will use less space and I won't need additional indexes.
I ran into this problem too. I think you do have the solution.
The better solution would be to stop using IDs to reference entities, and store either the actual key or a full path.
Internally, I use keys instead of IDs.
On my rest API, I used to do http://url/kind/id (where id looked like "123") to fetch an entity. I modified that to provide the complete ancestor path to the entity: http://url/kind/ancestor-ancestor-id (789-456-123), I'd then parse that string, generate a key, and then get by key.
Since you have full information about your ancestor and you know your id, you could directly create your key and get the entity, as follows:
my_key = ndb.Key(Ancestor, ancestor.key.id(), SomeModel, id)
entity = my_key.get()
This way you avoid making a query that costs more than a get operation both in terms of money and speed.
Hope this helps.
I want to make a little addition to dargonx's answer.
In my application on front-end I use string representation of keys:
str(instance.key())
When I need to make some changes with instence even if it is a descendant I use only string representation of its key. For example I have key_str -- argument from request to delete instance':
instance = Kind.get(key_str)
instance.delete()
My solution is using urlsafe to get item without worry about parent id:
pk = ndb.Key(Product, 1234)
usafe = LocationItem.get_by_id(5678, parent=pk).key.urlsafe()
# now can get by urlsafe
item = ndb.Key(urlsafe=usafe)
print item
I would like to create a logger using CouchDB. Basically, everytime someone accesses the file, I would like like to write to the database the username and time the file has been accessed. If this was MySQL, I would just add a row for every access correspond to the user. I am not sure what to do in CouchDB. Would I need to store each access in array? Then what do I do during update, is there a way to append to the document? Would each user have his own document?
I couldn't find any documentation on how to append to an existing document or array without retrieving and updating the entire document. So for every event you log, you'll have to retrieve the entire document, update it and save it to the database. So you'll want to keep the documents small for two reasons:
Log files/documents tend to grow big. You don't want to send large documents across the wire for each new log entry you add.
Log files/documents tend to get updated a lot. If all log entries are stored in a single document and you're trying to write a lot of concurrent log entries, you're likely to run into mismatching document revisions on updates.
Your suggestion of user-based documents sounds like a good solution, as it will keep the documents small. Also, a single user is unlikely to generate concurrent log entries, minimizing any race conditions.
Another option would be to store a new document for each log entry. Then you'll never have to update an existing document, eliminating any race conditions and the need to send large documents between your application and the database.
Niels' answer is going down the right path with transactions. As he said, you will want to create a different document for each access - think of them as actions. Here's what one of those documents might look like
{
"_id": "32 char hash",
"_rev": "32 char hash",
"when": Unix time stamp,
"by": "some unique identifier
}
If you were tracking multiple files, then you'd want to add a "file" field and include a unique identifier.
Now the power of Map/Reduce begins to really shine, as it's extremely good at aggregating multiple pieces of data. Here's how to get the total number of views:
Map:
function(doc)
{
emit(doc.at, 1);
}
Reduce:
function(keys, values, rereduce)
{
return sum(values);
}
The reason I threw the time stamp (doc.at) into the key is that it allows us to get total views for a range of time. Ex., /dbName/_design/designDocName/_view/viewName?startkey=1000&endkey=2000&group=true gives us the total number of views between those two time stamps.
Cheers.
Although Sam's answer is an ok pattern to follow I wanted to point out that there is, indeed, a nice way to append to a Couch document. It just isn't very well documented yet.
By defining an update function in your design document and using that to append to an array inside a couch document you may be able to save considerable disk space. Plus, you end up with a 1:1 correlation between the file you're logging accesses on and the couch doc that represents that file. This is how I imagine a doc might look:
{
"_id": "some/file/path/name.txt",
"_rev": "32 char hash",
"accesses": [
{"at": 1282839291, "by": "ben"},
{"at": 1282839305, "by": "kate"},
{"at": 1282839367, "by": "ozone"}
]
}
One caveat: You will need to encode the "/" as %2F when you request it from CouchDB or you'll get an error. Using slashes in document ids is totally ok.
And here is a pair of map/reduce functions:
function(doc)
{
if (doc.accesses) {
for (i=0; i < doc.accesses.length; i++) {
event = doc.accesses[i];
emit([doc._id, event.by, event.at], 1);
}
}
}
function(keys, values, rereduce)
{
return sum(values);
}
And now we can see another benefit of storing all accesses for a given file in one JSON document: to get a list of all accesses on a document just make a get request for the corresponding document. In this case:
GET http://127.0.0.1:5984/dbname/some%2Ffile%2Fpath%2Fname.txt
If you wanted to count the number of times each file was accessed by each user you'll query the view like so:
GET http://127.0.0.1:5984/test/_design/touch/_view/log?group_level=2
Use group_level=1 if you just want to count total accesses per file.
Finally, here is the update function you can use to append onto that doc.accesses array:
function(doc, req) {
var whom = req.query.by;
var when = Math.round(new Date().getTime() / 1000);
if (!doc.accesses) doc.accesses = [];
var event = {"at": when, "by": whom}
doc.accesses.push(event);
var message = 'Logged ' + event.by + ' accessing ' + doc._id + ' at ' + event.at;
return [doc, message];
}
Now whenever you need to log an access to a file issue a request like the following (depending on how you name your design document and update function):
http://127.0.0.1:5984/my_database/_design/my_designdoc/_update/update_function_name/some%2Ffile%2Fpath%2Fname.txt?by=username
A comment to the last two anwers is that they refer to CouchBase not Apache CouchDb.
It is however possible to define updatehandlers in CouchDb but I have not used it.
http://wiki.apache.org/couchdb/Document_Update_Handlers