Find CouchDB docs missing an arbitrary field

I need a CouchDB view where I can get back all the documents that don't have an arbitrary field. This is easy to do if you know in advance what fields a document might not have. For example, this lets you send view/my_view/?key="foo" to easily retrieve docs without the "foo" field:
function (doc) {
  var fields = [ "foo", "bar", "etc" ];
  for (var idx = 0; idx < fields.length; idx++) {
    if (!doc.hasOwnProperty(fields[idx])) {
      emit(fields[idx], 1);
    }
  }
}
However, you're limited to asking about the three fields set in the view; something like view/my_view/?key="baz" won't get you anything, even if you have many docs missing that field. I need a view where it will--where I don't need to specify possible missing fields in advance. Any thoughts?

This technique is called the Thai massage. Use it to efficiently find documents not in a view if (and only if) the view rows you compare come back ordered by document id, which is the case here when you query a single key.
function(doc) {
  // _view/fields map, showing all fields of all docs
  // In principle you could emit e.g. "foo.bar.baz"
  // for nested objects. Obviously I do not.
  for (var field in doc)
    emit(field, doc._id);
}
function(keys, vals, is_rerun) {
  // _view/fields reduce; could also be the string "_count"
  return is_rerun ? sum(vals) : vals.length;
}
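The map comment above mentions emitting dotted names such as "foo.bar.baz" for nested objects. Here is a rough sketch of how such paths could be collected. The helper name fieldPaths is made up for illustration, it is not part of the view above, and treating arrays as leaves is my own assumption.

```javascript
// Hypothetical helper: list every field of a document, using dotted
// paths ("foo.bar.baz") for fields inside nested objects.
function fieldPaths(obj, prefix) {
  var paths = [];
  for (var key in obj) {
    if (!Object.prototype.hasOwnProperty.call(obj, key)) continue;
    var path = prefix ? prefix + "." + key : key;
    paths.push(path);
    var val = obj[key];
    // Recurse into nested plain objects only; arrays are treated as leaves.
    if (val !== null && typeof val === "object" && !Array.isArray(val)) {
      paths = paths.concat(fieldPaths(val, path));
    }
  }
  return paths;
}
```

Inside a map function you would emit each returned path instead of each top-level field name.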
To find documents not having that field:
1. GET /db/_all_docs and remember all the ids
2. GET /db/_design/ex/_view/fields?reduce=false&key="some_field"
3. Compare the ids from _all_docs vs the ids from the query.
The ids in _all_docs but not in the view are those missing that field.
It sounds bad to keep the ids in memory, but you don't have to! You can use a merge sort strategy, iterating through both queries simultaneously. You start with the first id of the has list (from the view) and the first id of the full list (from _all_docs).
If full < has, it is missing the field, redo with the next full element
If full = has, it has the field, redo with the next full element
If full > has, redo with the next has element
Depending on your language, that might be difficult. But it is pretty easy in JavaScript, for example, or other event-driven programming frameworks.
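The three comparison rules above can be sketched like this. It is only a sketch: findMissingIds is a made-up name, it works on in-memory arrays for brevity (the same comparisons apply when you read the two responses as streams), and it assumes both id lists are already sorted by the same comparison.

```javascript
// Hypothetical helper. Both arrays must be sorted in the same order.
// fullIds: every document id (the _all_docs list)
// hasIds:  ids of the documents that have the field (the view query)
function findMissingIds(fullIds, hasIds) {
  var missing = [];
  var f = 0;
  var h = 0;
  while (f < fullIds.length) {
    if (h >= hasIds.length || fullIds[f] < hasIds[h]) {
      // full < has (or the has list is used up): the doc lacks the field
      missing.push(fullIds[f]);
      f++;
    } else if (fullIds[f] === hasIds[h]) {
      // full = has: the doc has the field, move on
      f++;
      h++;
    } else {
      // full > has: advance the has list
      h++;
    }
  }
  return missing;
}
```

For example, findMissingIds(["a", "b", "c", "d"], ["b", "d"]) gives ["a", "c"].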

Without knowing the possible fields in advance, the answer is easy. You must create a new view to find the missing fields. The view will scan every document, one-by-one.
To avoid disturbing your existing views and design documents, you can use a brand new design document. That way, searching for the missing fields will not impact existing views you may be already using.

Related

Best way to store a single independent document in mongodb

I have a MERN application with a bunch of collections in my db, and now I want to store an object that represents an order:
Say I have "item1, item2, item3" in an "items" collection; each one can be anything really.
I just need the ids (to reference them). I want the user to choose their order so that I know the correct way of displaying the items (not the order in the db, an order for a separate purpose).
I think the best way of doing it is having a single document with the order data in it, but each document should be in a collection. So the question is: is it right to create a collection only to store a single document in it, or is there a better way?
This is an example of the document I want to store (the array index is the item's order):
{
items: [{itemId:xxx, otherprops...}, {itemId: yyy, otherprops...}]
}
The items collection can have 100s of items, so changing the order in that collection is not the correct option for my needs.

"is it right to create a collection only to store a single document in it"
If you look at the admin database in mongodb, you'll see that it does something similar to what I think you're trying to do. There's a system.version collection. In that collection, I've seen documents that contain settings-like information. For example, the featureCompatibility property is actually stored as a document with _id: "featureCompatibility". Shard identity information is also stored as a single document in this collection:
{
  "_id" : "shardIdentity",
  "clusterId" : ObjectId("2bba123c6eeedcd192b19024"),
  "shardName" : "shard2",
  "configsvrConnectionString" : "configDbRepl/alpha.example.net:28100,beta.example.net:28100,charlie.example.net:28100"
}
There is only one such document in the system.version collection. You can very well create your own settings collection where you store these bespoke documents. It's certainly not unheard of.
Take a look at the "Shard Aware" section from the official mongodb documentation to see this type of practice in action:
https://docs.mongodb.com/manual/release-notes/3.6-upgrade-sharded-cluster/#prerequisites
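To make the settings-document idea concrete for the question's case, here is a rough sketch of applying one stored order document to the items once both are loaded. The function name sortByOrderDoc, the "itemOrder" id and the assumption that items carry their id in _id are all mine. Only the items/itemId shape comes from the question.

```javascript
// Hypothetical: orderDoc is the single stored document, e.g.
// { _id: "itemOrder", items: [{ itemId: "b" }, { itemId: "a" }] }
function sortByOrderDoc(items, orderDoc) {
  var position = {};
  orderDoc.items.forEach(function (entry, index) {
    position[entry.itemId] = index;
  });
  function rank(item) {
    return Object.prototype.hasOwnProperty.call(position, item._id)
      ? position[item._id]
      : Infinity; // items not listed in the order document go last
  }
  // Sort a copy so the caller's array is left untouched.
  return items.slice().sort(function (a, b) {
    var pa = rank(a);
    var pb = rank(b);
    if (pa === pb) return 0;
    return pa < pb ? -1 : 1;
  });
}
```

The items collection itself is never reordered. Only the one small document changes when the user rearranges things.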

(mongo) Arrays of ids : how should I do it?

Today I finished the functional part of my website, so I moved on to the secure-my-app part of development. I want to give users only the content they are related to, so for my teachers (= a user with user.role == "teacher"), I only want to give them access to a given assignment if their _id is in the assignment.teachersList array of _ids. I want to do this check in the publish, so I MUST end up with a cursor at the end of the query.
After looking at the official MongoDB documentation, it seems that doing what I want should be as simple as:
// in a publish
Assignements.find({ teachersList: this.userId });
However, this always comes back empty for me. First, fearing a problem with the this context, I tried something like:
// in a publish
let self = this;
Assignements.find({ teachersList: self.userId });
and it's no better, I still get nothing. I tried to use Cursor.map() and put my condition there, but as map doesn't return a cursor, I get the data but it doesn't work either, since we are in a publish.
The docs say that my first try should work, so no doubt I'm doing something wrong, but what?
Right now I'm starting to wonder if the problem comes from the fact that it's an array of _ids. Right now, it's only an array of Strings, and it seems that this.userId in a method only returns a String. But maybe I'm wrong and I should use Meteor.Mongo.ObjectId(the_string_id) objects instead?
Alright, that's it! Please, if you have any idea why such an easy query doesn't work, tell me! Thanks :)

teachersList is an array, and not an id. You are trying to find a document with an id that has the value of an array... not going to work...
I usually use this pattern on all private owner collections: add an onInsert event to my collection and then add the user as ownerId to the document being inserted. Then, when it comes to publish, I publish all documents with the ownerId of the user. Clean and simple, no roles even needed...
If, however you decide not to use the pattern above...
From MongoDB:
db.inventory.find( { tags: { $in: [ /^be/, /^st/ ] } } )
This query selects all documents in the inventory collection where the tags field holds an array that contains at least one element that starts with either be or st.
In Meteor your query might look something like:
Assignments.find({ownerId: { $in: teachersList}});
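If the $in behaviour is unclear, this rough plain-JavaScript sketch mimics it for simple values: a field matches when its value, or any element of it if it is an array, is one of the listed candidates. matchesIn is a made-up helper, not a Meteor or MongoDB function, and it ignores the regular-expression form used in the MongoDB example above.

```javascript
// Hypothetical illustration of $in for plain (non-regex) values.
function matchesIn(fieldValue, candidates) {
  // A scalar field is treated like a one-element array.
  var values = Array.isArray(fieldValue) ? fieldValue : [fieldValue];
  return values.some(function (v) {
    return candidates.indexOf(v) !== -1;
  });
}
```

So matchesIn("u1", ["u1", "u2"]) is true, and so is matchesIn(["x", "u2"], ["u1", "u2"]).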

Get names of nodes with firebase [duplicate]

I have the following hierarchy on firebase, some data are hidden for confidentiality:
I'm trying to get a list of video IDs (underlined in red).
The only way I've found is to get all the nodes, then read their names and store them in an array!
But this causes low performance; because the dataSnapshot from firebase is very big in my case, so I want to avoid retrieving all the nodes' content then loop over them to get IDs, I need to just retrieve the IDs only, i.e. without their nested elements.
Here's my code:
new Firebase("https://PRIVATE_NAME.firebaseio.com/videos/").once(
  'value',
  function(dataSnapshot){
    // dataSnapshot now contains all the videos ids, lines & links
    // this causes many performance issues
    // Then I need to loop over all elements to extract ids !
    var videoIdIndex = 0;
    var videoIds = new Array();
    dataSnapshot.forEach(
      function(childSnapshot) {
        videoIds[videoIdIndex++] = childSnapshot.name();
      }
    );
  }
);
How can I retrieve only the IDs, to avoid a lot of data transfer and to avoid looping over the retrieved data to get them? Is there a way to retrieve just these IDs directly?

UPDATE: There is now a shallow command in the REST API that will fetch just the keys for a path. This has not been added to the SDKs yet.
In Firebase, you can't obtain a list of node names without retrieving the data underneath. Not yet anyways. The performance problems can be addressed with normalization.
Essentially, your goal is to split data into consumable chunks. Store your list of video keys, possibly with a couple of meta fields like title, etc., in one path, and store the bulk content somewhere else. For example:
/video_meta/id/link, title, ...
/video_lines/id/...
To learn more about denormalizing, check out this article: https://www.firebase.com/blog/2013-04-12-denormalizing-is-normal.html
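As a rough sketch of that split, the function below turns one big videos object into the two paths shown above. The name splitVideos and the input shape (each video holding title, link and lines) are assumptions for illustration. Your real field names may differ.

```javascript
// Hypothetical input shape: { videoId: { title, link, lines: {...} } }
function splitVideos(videos) {
  var out = { video_meta: {}, video_lines: {} };
  Object.keys(videos).forEach(function (id) {
    var video = videos[id];
    // Small, cheap-to-list metadata goes under /video_meta/<id>
    out.video_meta[id] = { title: video.title, link: video.link };
    // Bulk content goes under /video_lines/<id>
    out.video_lines[id] = video.lines;
  });
  return out;
}
```

Listing ids then only needs /video_meta, which stays small however large the lines get.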

This question is a bit old, and you probably already know this, but in case someone else comes along: you can do this with a REST API call. You only need to set the parameter shallow=true.
here is the documentation
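A small sketch of what that request could look like, using the placeholder database URL from the question. shallowUrl and idsFromShallow are made-up helper names. With shallow=true the response for a path holding objects is an object whose keys are the child names, so the ids are just its keys.

```javascript
// Build the REST URL for a path, asking for keys only.
function shallowUrl(baseUrl, path) {
  var trimmedBase = baseUrl.replace(/\/+$/, "");
  var trimmedPath = path.replace(/^\/+|\/+$/g, "");
  return trimmedBase + "/" + trimmedPath + ".json?shallow=true";
}

// Given the parsed JSON body of a shallow response, e.g.
// { "videoA": true, "videoB": true }, return the ids.
function idsFromShallow(body) {
  return Object.keys(body || {});
}
```

You would fetch shallowUrl("https://PRIVATE_NAME.firebaseio.com", "videos") with any HTTP client and pass the parsed body to idsFromShallow.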

Get all documents in a CouchDB database except the design documents

In CouchDB I have a database with some documents. When I create a view, the view is created inside this database, together with the documents. Then when I fetch all the elements of the database, CouchDB returns all of them, including the views. Is there any way to get everything apart from the views?

You may use
/<mydb>/_all_docs?descending=true&endkey="_design0"
The '0' in _design0 makes sure that the output is stopped before the first design document. The optional parameter inclusive_end=false may work as well but did not for me in a short test.
See http://docs.couchdb.org/en/latest/api/database/bulk-api.html for further details.
But I'd also prefer a simple view for that task.

Using this should work:
/<mydb>/_all_docs?endkey="_design"
If you use only the auto-generated IDs, then you probably can also use:
/<mydb>/_all_docs?endkey="_"
but that may cause issues if you use custom IDs since the "_" character falls between uppercase and lowercase letters.
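Both key-range requests depend on where your ids sort relative to "_design/". If that is a concern, another option is to fetch _all_docs and drop the design documents on the client. This is only a sketch: withoutDesignDocs is a made-up name, and it expects the rows array of an _all_docs response, where each row has an id.

```javascript
// rows: the "rows" array from an _all_docs response
function withoutDesignDocs(rows) {
  return rows.filter(function (row) {
    // Design documents have ids that start with "_design/"
    return row.id.indexOf("_design/") !== 0;
  });
}
```

It costs a little extra transfer for the design documents, but does not care how the remaining ids are named.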

One way to do this is to 'categorize' the documents when you insert them into CouchDB. A common technique is to add a 'type' property to every document you create.
e.g.
{
  firstName: 'John',
  lastName: 'Doe',
  type: 'user'
}
Then you can create a view that returns only documents with that property.
function(doc) {
  if (doc.type) {
    emit(doc._id, doc);
  }
}

Filtering a collection vs. several collections in Backbone?

When is it appropriate to filter a collection vs. having several collections in Backbone?
For example, consider a music library app. It would have a view for displaying genres and another view for displaying the selected genre's music.
Would you rather make one huge collection with all the music and then filter it or several smaller ones?
Having just one collection would allow you to add features for filtering by other attributes as well, but suppose you have tons of music: how do you prevent loading it all in when the application starts if the user is only going to need 1 genre?

I think the simplest approach is a single Collection that fetches data already filtered by genre from the server:
// code simplified and not tested
var SongsCollection = Backbone.Collection.extend({
  model: Song,
  url: function() {
    return '/songs/' + this.genre;
  },
  // Backbone passes (models, options) to initialize
  initialize: function( models, opts ){
    this.genre = opts.genre;
  }
});
var mySongsCollection = new SongsCollection([], { genre: "rock" });
mySongsCollection.fetch();
You have to make this Collection re-fetch data from the server any time the user changes the selected genre:
mySongsCollection.genre = "punk";
mySongsCollection.fetch();

It's mostly a design choice, but my vote would be to choose a scheme that loosely reflects the database storing the collections.
If you're likely to be storing data in an SQL database, you will more likely than not have separate tables for songs and genres. You would probably connect them either via a genre_id column in the song table, or (if songs can have more than one genre) via a separate song_genres join table. Consequently, you would probably want separate collections representing genres and the songs within them. In this case, backbone-relational might be a very useful tool for helping keep them straight.
If you're storing information in any kind of non-relational/key-value/document store, it might make sense to simply store the genre with the song directly and filter accordingly. In this case, you might end up storing your document keys/queries in such a way that you could access songs either directly (e.g., via songs) or through the genre (e.g., genre:genre_id/songs). If this is the route you go, it may be more convenient to simply create a single huge collection of songs and plan to set up corresponding filters in both the application and database environment.
