Meteor one time or "static" publish without collection tracking

Suppose that one needs to send the same collection of 10,000 documents down to every client for a Meteor app.
At a high level, I'm aware that the server does some bookkeeping for every client subscription - namely, it tracks the state of the subscription so that it can send the appropriate changes for the client. However, this is horribly inefficient if each client has the same large data set where each document has many fields.
It seems that there used to be a way to send a "static" publish down the wire, where the initial query was published and never changed again. This seems like a much more efficient way to do this.
Is there a correct way to do this in the current version of Meteor (0.6.5.1)?
EDIT: As a clarification, this question isn't about client-side reactivity. It's about reducing the overhead of server-side tracking of client collections.
A related question: Is there a way to tell meteor a collection is static (will never change)?
Update: It turns out that doing this in Meteor 0.7 or earlier will incur some serious performance issues. See https://stackoverflow.com/a/21835534/586086 for how we got around this.

http://docs.meteor.com/#find:
Statics.find({}, {reactive: false})
Edited to reflect comment:
Do you have some information that the reactive: false param is only client-side? You may be right; it's a reasonable, maybe even likely, interpretation. I don't have time to check, but I thought this might also be a server-side directive, saying not to poll the mongo result set. Willing to learn...
You say
However, this is horribly inefficient if each client has the same large data set where each document has many fields.
Now we are possibly discussing the efficiency of the server code and its polling of the mongo source for updates that happen outside of the server. Please make that another question, which is far above my ability to answer! I doubt that is happening once per connected client; more likely it is a sync between the app server's info and the mongo server.
The client requests you issue, including sorting, should all be labelled non-reactive. That is separate from whether you can issue them with sorting instructions, or whether they can be retriggered through other reactivity, but neither of those need include a trip to the server. Once each document reaches the client side, it is cached. You can still do whatever minimongo does, with no loss in ability. There is no client asking the server if there are updates, so you don't need to shut that off. The server pushes only when needed.
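For illustration, a minimal sketch of such a non-reactive client query (the Statics collection name and sort are illustrative; API as of Meteor 0.6.x):

var Statics = new Meteor.Collection("statics");

Deps.autorun(function () {
  // reactive: false -- this computation will NOT rerun when the underlying
  // documents change; the find/fetch executes exactly once here.
  var docs = Statics.find({}, {reactive: false, sort: {name: 1}}).fetch();
  console.log("loaded", docs.length, "static documents");
});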

I think using a manual publish (this.added) still works to get rid of the overhead created by the server observing data for changes. Observers either need to be added manually or are created by returning a Collection.cursor.
If the data set is big you might also be concerned about the overhead of a merge box holding a copy of the data for each client. To get rid of that you could copy the collection locally and stop the subscription.
var staticData = new Meteor.Collection("staticData");

if (Meteor.isServer) {
  var dataToPublish = staticData.find().fetch(); // query mongo once when the server starts

  Meteor.publish("publishOnce", function () {
    var self = this;
    dataToPublish.forEach(function (doc) {
      self.added("staticData", doc._id, doc); // sends data to the client and will not continue to observe the collection
    });
    self.ready(); // mark the subscription ready so subHandle.ready() becomes true on the client
  });
}

if (Meteor.isClient) {
  var subHandle = Meteor.subscribe("publishOnce"); // fills the client 'staticData' collection but also leaves a merge box copy of the data on the server
  var staticDataLocal = new Meteor.Collection(null); // to store the data after the subscription stops

  Deps.autorun(function () {
    if (subHandle.ready()) {
      staticData.find({}).forEach(function (doc) {
        staticDataLocal.insert(doc); // move all data to the local copy
      });
      subHandle.stop(); // removes the 'publishOnce' data from the merge box on the server but leaves the 'staticData' collection empty on the client
    }
  });
}
update: I added comments to the code to make my approach clearer. The meteor docs for stop() on the subscribe handle say "This will typically result in the server directing the client to remove the subscription's data from the client's cache", so maybe there is a way to stop the subscription (remove it from the merge box) that leaves the data on the client. That would be ideal and would avoid the copying overhead on the client.
Anyway, the original approach with set and flush would also have left the data in the merge box, so maybe this is alright.

As you've already pointed out yourself on googlegroups, you should use a Meteor Method for sending static data to the client.
And there is this neat package for working with Methods without async headaches.

Also, you could script the data out to a js file, as either an array or an object, minify it, and then link to it as a distinct resource. See
http://developer.yahoo.com/performance/rules.html for "Add an Expires or a Cache-Control Header". You probably don't want meteor to bundle it for you.
This would be the least traffic, and could make subsequent loads of your site much swifter.
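For illustration, the generated file might look something like this (file name, variable, and documents are all made up); dropping it in public/ keeps Meteor from bundling it, and serving it with long-lived cache headers means the browser fetches it only once:

// public/static-data.js -- generated offline from the collection,
// loaded via <script src="/static-data.js"></script> in the app's HTML.
window.STATIC_DATA = [
  {_id: "a1", name: "example one", price: 2},
  {_id: "a2", name: "example two", price: 3}
  // ... remaining documents, minified
];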

As a response to a Meteor call, return an array of documents (use fetch()): no reactivity or logging. On the client, create a dep when you do a query, or retrieve the key from the session, and it is reactive on the client.
Minimongo just does js array/object manipulation, with a syntax-interpreting dsl between you and your data.
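A rough sketch of that pattern with a hand-rolled dependency (the allStaticDocs method name is made up; Deps.Dependency is the era-appropriate API):

var staticDocs = [];
var staticDep = new Deps.Dependency();

Meteor.call("allStaticDocs", function (error, docs) {
  if (!error) {
    staticDocs = docs;
    staticDep.changed(); // invalidate computations that called depend()
  }
});

// In a template helper or autorun:
var getStaticDocs = function () {
  staticDep.depend(); // client-side reactivity, no extra server round trips
  return staticDocs;
};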

The new fast-render package makes a one-time publish to a client collection possible.
var staticData = new Meteor.Collection('staticData');

if (Meteor.isServer) {
  FastRender.onAllRoutes(function () {
    this.find(staticData, {});
  });
}

Related

Should I clean my objects before sending them to the server

I was working on my app today, and when my friend looked at my code he told me that before making an HTTP request to update objects, I should remove the properties that are not used on my server. I didn't understand why.
I didn't find any best practice or explanation on the web for why it is better to clean my objects before sending them to my server...
Let's say I have a dictionary with 100 keys & values with the same properties (but different values) like this one:
{
  '11': {'id': 11, 'name': 'test1', 'station': 2, 'price': 2, 'people': 6, 'show': true, 'light': true},
  '12': {'id': 12, 'name': 'test2', 'station': 4, 'price': 2, 'people': 1, 'show': true, 'light': false},
  ....
}
The only thing I need to change is the station of each pair. The new station number is set on my client and sent to my server to make an update in my DB for each pair...
Should I iterate over the dictionary and clean every object before making an HTTP request to my server as my friend said?
I cannot add a comment because of my reputation, so I'll post this as an answer.
Not necessarily; it depends a lot on how your server's API works. If it expects an entire object, there's no use cleaning; if you have the option to send only the modified element, then you don't have to send the entire object.
The HTTP request will work the same way with a single field or with an entire object, but you can cut the data traffic in kbps by sending less: only what's required, such as the changed values.
In summary, it depends a lot on your approach; by working with single values rather than whole objects, you can write more generic functions and improve their entire scope.
Check THIS; it's similar to your question.
EDIT:
Maybe the cleanup he's referring to is the matter of cleaning up the payload and sending only what's necessary; that's how I understood the scope of the question.
Remember that the less you send, the more intact the original object will remain (on the server).
It is good practice to create generic (modular) functions that only work with the necessary changes.
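To make "send only the changed values" concrete, a minimal sketch (the /api/stations endpoint and the sample data are made up for illustration):

// Original station values as loaded from the server (illustrative).
var originalStations = {11: 2, 12: 4};

// Current client-side state after the user's edits.
var items = {
  '11': {id: 11, name: 'test1', station: 3, price: 2, people: 6, show: true, light: true},
  '12': {id: 12, name: 'test2', station: 4, price: 2, people: 1, show: true, light: false}
};

// Send one {id, station} pair per changed entry instead of the whole objects.
var changes = Object.keys(items)
  .map(function (key) { return items[key]; })
  .filter(function (item) { return item.station !== originalStations[item.id]; })
  .map(function (item) { return {id: item.id, station: item.station}; });

fetch('/api/stations', { // hypothetical endpoint
  method: 'PUT',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify(changes) // e.g. [{"id":11,"station":3}]
});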
A couple of reasons come to mind:
plan for the unknown: today, your server doesn't care about the people attribute. But imagine you add something server-side and a people attribute appears and is a string. Now all your clients fail, because they try to push numbers into a string field
save the world: data is energy, and you're wasting it by sending more data than your server needs, even if it's just a little
save your own energy: sending more attributes is likely to mean more work (to write the code and/or test it)

Load all documents at once. Not progressively

When I run a .fetch() command, it first returns null; then, say I have 100 documents, it keeps loading them from 1 to 100 and the counter updates from 1 to 100 progressively. I don't want that to happen. I want all the results to be displayed at once after the fetch process has been completed.
Also, how can I display a relevant message to the user if no documents exist? The fetch method doesn't work for me here, since it returns 0 at first and hence "No document found" flashes for a second.
dbName.find({userID:"234234"}).fetch()
Even though the above has 100 docs, it first shows null and then keeps loading the documents one by one. I want it to load them all at once, or just return something if no docs exist.
I don't want that to happen. I want all the results to be displayed at once after the fetch process has been completed
To really obtain all documents at once on the client you will have to write a Meteor Method that returns all the documents:
Meteor.methods({
  'allDocs' () {
    return dbName.find({userID: "234234"}).fetch()
  }
})
Note that you have to call fetch on the cursor to return the documents; otherwise you will face an "unhandled promise rejection error".
Then call it from the client as usual. You can even add the documents to your client side local collection without affecting allow/deny (which should be off / deny all by default):
Meteor.call('allDocs', (err, documents) => {
  // ... handle err

  // all client collections have a local collection accessible via ._collection
  const localCollection = dbName._collection
  documents.forEach(doc => localCollection.insert(doc))
})
Advantages:
Returns all documents immediately
Less resources consumed (no publication observers required)
Works with caching tools, such as ground:db, to create offline-first applications
Disadvantages:
You should limit the query and access to your collections using Methods as much as possible (using mdg:validated-method), which can require much more effort than shown in this example
Not reactive! If you need reactivity on the client you need to include Tracker and reactive data-sources (ReactiveVar etc.) to provide a decent reactive user experience; see the sketch after this list
Manual syncing can become frustrating and is error prone
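A rough sketch of that Tracker/ReactiveVar combination (the autorun is just a stand-in for whatever UI consumes the data):

import { ReactiveVar } from 'meteor/reactive-var'
import { Tracker } from 'meteor/tracker'

// Holds the method result; reading it inside a computation registers a dependency.
const docs = new ReactiveVar([])

Meteor.call('allDocs', (err, documents) => {
  if (!err) docs.set(documents) // set() invalidates dependent computations
})

Tracker.autorun(() => {
  // Reruns whenever docs.set() delivers new data.
  console.log('currently loaded:', docs.get().length)
})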
Your question is actually about the subscription and its state of readiness. While it is not yet ready, you can show a loading page, and once it is, you can run the .fetch() to get the whole array. This logic could be put in your withTracker call, e.g.:
export default withTracker((props) => {
  const sub = Meteor.subscribe('users');
  return {
    ready: sub.ready(),
    users: sub.ready() && Users.find({userID: props.userID}).fetch()
  };
})(UserComponent);
Then, in your component, you can decide whether to render a spinner (while ready == false), or the users.
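For example, a minimal sketch of such a component (the Spinner component and the name field are assumptions):

const UserComponent = ({ ready, users }) => {
  if (!ready) return <Spinner />; // placeholder until the subscription is ready
  return (
    <ul>
      {users.map(user => <li key={user._id}>{user.name}</li>)}
    </ul>
  );
};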
Your question is not entirely clear to me in terms of tools etc. (please state which database connector lib you are using), but firstly, given that you're doing a database access, your .fetch() call is most likely not a sync function but async, and most likely handled by a promise.
Secondly, given that you're using react, you want to set a new state only after you get all the results back.
If fetch is a promise then just do:
dbName.find({userID: "234234"}).fetch().then(results => {
  setState({elements: results.data}); // do your processing accordingly
});
By only calling setState inside the promise, you'll always have all the results fetched at that instant, and only then do you update your component state with the setState function: either using your react component class's this.setState or with hooks like useState (much cleaner).

New Google Realtime Document timing issue

I am creating a new in-memory realtime document from a large JSON:
var newDoc = gapi.drive.realtime.loadFromJson(jsonData);
Then saving that new document to a newly created drive file:
newDoc.saveAs(file.id);
I also monitor "isSaving" on the document:
newDoc.addEventListener(gapi.drive.realtime.EventType.DOCUMENT_SAVE_STATE_CHANGED, onSaveStateChange);
function onSaveStateChange(e) {
  blah...
}
The problem is that when, according to the API, the document is apparently no longer saving, it IS in fact still uploading (as checked by Resource Monitor), and if I try to open that document during that time, I get unpredictable results.
This appears to be a realtime bug with the setting of isSaving, or the triggering of DOCUMENT_SAVE_STATE_CHANGED.
I badly need a way to determine when the new file is ACTUALLY available for use. A hack of some sort, or having to make an extra call would be fine... but as is, putting in any arbitrarily long delay won't account for a slow or intermittent network.
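For illustration, a handler that waits for both flags to clear might look like this; isSaving and isPending are my reading of the DocumentSaveStateChangedEvent fields, and as described above, even this can report done while bytes are still uploading:

function onSaveStateChange(e) {
  // The event reports both in-flight saves and queued local mutations.
  if (!e.isSaving && !e.isPending) {
    console.log('Realtime API reports the document fully saved');
    // ...but verify actual availability before opening it elsewhere.
  }
}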

Keeping repository synced with multiple clients

I have a WPF application that uses entity framework. I am going to be implementing a repository pattern to make interactions with EF simple and more testable. Multiple clients can use this application and connect to the same database and do CRUD operations. I am trying to think of a way to synchronize clients repositories when one makes a change to the database. Could anyone give me some direction on how one would solve this type of issue, and some possible patterns that would be beneficial for this type of problem?
I would be very open to any information/books on how to keep clients synchronized, and even be alerted of things other clients are doing (the only thing I could think of was having a server process running that passes messages around). Thank you
The easiest way by far to keep every client UI up to date is just to simply refresh the data every so often. If it's really that important, you can set a DispatcherTimer to tick every minute when you can get the latest data that is being displayed.
Clearly, I'm not suggesting that you refresh an item that is being edited, but if you get the fresh data, you can certainly compare collections with what's being displayed currently. Rather than just replacing the old collection items with the new, you can be more user friendly and just add the new ones, remove the deleted ones and update the newer ones.
You could even detect whether an item being currently edited has been saved by another user since the current user opened it and alert them to the fact. So rather than concentrating on some system to track all data changes, you should put your effort into being able to detect changes between two sets of data and then seamlessly integrating it into the current UI state.
UPDATE >>>
There is absolutely no benefit in holding a complete set of data in your application (or repository). In fact, you may well find that it has detrimental effects, due to the extra RAM requirements. If you are polling data every few minutes, then it will always be up to date anyway.
So rather than asking for all of the data all of the time, just ask for what the user wants to see (dependent on which view they are currently in) and update it every now and then. I do this by simply fetching the same data that the view requires when it is first opened. I wrote some methods that compare every property of every item with their older counterparts in the UI and switch old for new.
Think of the Equals method... You could do something like this:
public override bool Equals(Release otherRelease)
{
    return base.Equals(otherRelease) && Title == otherRelease.Title &&
        Artist.Equals(otherRelease.Artist) && Artists.Equals(otherRelease.Artists);
}
(Don't actually use the Equals method though, or you'll run into problems later). And then something like this:
if (!oldRelease.Equals(newRelease)) oldRelease.UpdatePropertyValues(newRelease);
And/Or this:
if (!oldReleases.Contains(newRelease)) oldReleases.Add(newRelease);
I'm guessing that you get the picture now.

NDB Query not returning full object

I am doing an NDB query which seems to be fetching only a partial object. For the Model, I've turned off caching, in case that was it. However, a number of properties are coming back as None, when I can see them filled in the Datastore Viewer.
This happens with the local development server (and deployed), and the query is being done by a Backend process.
Note: Clearing the memcache did not help.
NOTE: If I cause the backend to restart, it will start pulling down the correct data.
Basically:
Backend starts querying for instances of a Model every X seconds
Frontend causes a change to an instance of the Model
Backend continues to see the original version of the instance until restarted
Backend code is pretty simple:
while 1:
    time.sleep(2)
    q = None
    res = None
    q = core.Agent.query()
    res = q.fetch(10)
    for a in res:
        logging.error("%s" % a.to_dict())
Frontend changes some properties (and it shows in the viewer) but the backend will only show old values. It also seems like a Filter will filter based on correct values, but fetch() returns old stuff.
You need to clear the context cache at the top of the loop, e.g.
while 1:
    ndb.get_context().clear_cache()
    <rest of your code>
