NDB Query not returning full object - google-app-engine

I am doing an NDB query that seems to fetch only a partial object. I've turned off caching for the Model, in case that was the cause. However, a number of properties come back as None, even though I can see them filled in in the Datastore Viewer.
This happens on both the local development server and when deployed, and the query is run by a Backend process.
Note: Clearing the memcache did not help.
NOTE: If I cause the backend to restart, it will start pulling down the correct data.
Basically:
1. Backend starts querying for instances of a Model every X seconds
2. Frontend causes a change to an instance of the Model
3. Backend continues to see the original version of the instance until restarted
Backend code is pretty simple:
while 1:
    time.sleep(2)
    q = None
    res = None
    q = core.Agent.query()
    res = q.fetch(10)
    for a in res:
        logging.error("%s" % a.to_dict())
Frontend changes some properties (and it shows in the viewer) but the backend will only show old values. It also seems like a Filter will filter based on correct values, but fetch() returns old stuff.

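You need to clear the context cache at the top of the loop. In a backend loop the NDB context, and with it the in-context cache, lives as long as the loop does, so query results can be replaced by entities that were cached earlier. E.g.
while 1:
    ndb.get_context().clear_cache()
    <rest of your code>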
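To see why a long-running loop can keep showing old values, here is a toy model of a per-context cache in plain Python. It is not NDB's implementation: `ToyContext`, `DATASTORE`, and this `fetch` are hypothetical names that only mimic the behaviour described in the question, where the query finds the right keys but entities already held in the cache are returned as they were first read.

```python
# Toy model only: illustrates an in-context cache; this is NOT how NDB is implemented.
DATASTORE = {"agent-1": {"status": "idle"}}  # stands in for the real datastore


class ToyContext:
    """Hypothetical stand-in for a long-lived NDB context."""

    def __init__(self):
        self._cache = {}

    def fetch(self):
        results = []
        for key, stored in DATASTORE.items():
            # An entity already in the cache is returned as-is, even if stale.
            if key not in self._cache:
                self._cache[key] = dict(stored)
            results.append(self._cache[key])
        return results

    def clear_cache(self):
        self._cache.clear()


ctx = ToyContext()
first = ctx.fetch()[0]["status"]          # backend reads the entity
DATASTORE["agent-1"]["status"] = "busy"   # frontend updates it
stale = ctx.fetch()[0]["status"]          # same context: the cached copy wins
ctx.clear_cache()
fresh = ctx.fetch()[0]["status"]          # re-read after clearing the cache
print(first, stale, fresh)
```

Restarting the backend has the same effect as `clear_cache()` in this sketch, because a new process starts with an empty cache.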

Related

Single get request on frontend returning an entire table in chunks - how to do? Django

Could use some help with this please.
Django and PostgreSQL are the backend for a React frontend project of mine. The frontend needs to fetch and perform calculations on a large number of rows in a table; a million or more rows is possible.
Right now, fetching and doing those calculations is implemented, but it takes far too long, and that's locally. When deployed, it doesn't work at all because it isn't batched: it's one big GET request, which times out after 30 seconds.
I've set up splicing on the frontend for POST/PUT requests to get around that timeout, but how would I go about doing a GET request in batches? I've been looking into iterators and cursors to try to deal with things in chunks, but I am also confused: if a single GET request is performed on the frontend, is it even possible to send multiple responses? I have seen conflicting information on that.
Some guidance on the best way to address this would be appreciated. If it helps, I am primarily working in a class-based way: Model, Serializer, ModelViewSet.
Here's what I'm currently working on to attempt batched GET requests, but I'm not sure if I'm on the right track.
def batch_get(self, request):
    count = UploadedShares.objects.all().count()
    chunk_size = 500
    for i in range(0, count, chunk_size):
        shares = UploadedShares.objects.all()[i:i + chunk_size]
        serialized_shares = serializers.serialize('json', shares)
        return JsonResponse(serialized_shares, safe=False)
Thanks.
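An HTTP request receives a single response, so a view cannot return once per chunk; the attempt above ends the request after one chunk. One common alternative is for the client to ask for one page at a time until the server says there are no more. The following is a minimal sketch of that idea in plain Python, not a Django view: `get_page`, `rows`, and the `results`/`next_page` keys are hypothetical names, and in Django the slicing would be applied to the queryset instead of a list.

```python
def get_page(rows, page, page_size=500):
    """Return one chunk of `rows` plus the number of the next page, if any."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    chunk = rows[start:start + page_size]
    has_more = start + page_size < len(rows)
    return {
        "results": chunk,
        "next_page": page + 1 if has_more else None,
    }


# The client keeps requesting pages until `next_page` is None.
rows = list(range(1, 1201))  # stand-in for 1,200 table rows
page, received = 1, []
while page is not None:
    response = get_page(rows, page)
    received.extend(response["results"])
    page = response["next_page"]
```

Each page is a separate, short GET request, so no single request has to finish a million rows within the timeout. Django REST Framework also ships pagination classes (for example `PageNumberPagination`) that may fit a ModelViewSet without hand-written slicing.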

Load all documents at once. Not progressively

When I run a .fetch() command, it first returns null. Then, supposing I have 100 documents, it loads them from 1 to 100 and the counter keeps updating from 1 to 100 progressively. I don't want that to happen. I want all the results to be displayed at once, after the fetch has completed.
Also, how can I display a relevant message to the user if no documents exist? Checking the result of fetch doesn't work for me, since it returns 0 documents at first and hence "No document found" flashes for a second.
dbName.find({userID:"234234"}).fetch()
Even though the above has 100 docs, it first shows null and then keeps loading the documents one by one. I want it to load all at once, or just return something if no docs exist.
"I don't want that to happen. I want all the results to be displayed at once after the fetch process has been completed"
To really obtain all documents at once on the client you will have to write a Meteor Method that returns all the documents:
Meteor.methods({
  'allDocs' () {
    return dbName.find({userID:"234234"}).fetch()
  }
})
Note that you have to call fetch on the cursor to return the documents; otherwise you will face an "unhandled promise rejection error".
Then call it from the client as usual. You can even add the documents to your client-side local collection without affecting allow/deny (which should be off / deny all by default):
Meteor.call('allDocs', (err, documents) => {
  // ... handle err
  // all client collections have a local collection accessible via ._collection
  const localCollection = dbName._collection
  documents.forEach(doc => localCollection.insert(doc))
})
Advantages:
Returns all documents immediately
Less resources consumed (no publication observers required)
Works with caching tools, such as ground:db, to create offline-first applications
Disadvantages:
You should limit the query and access to your collections using Methods as much as possible (using mdg:validated-method), which can require much more effort than shown in these examples
Not reactive! If you need reactivity on the client, you need to include Tracker and reactive data sources (ReactiveVar etc.) to provide a decent reactive user experience
Manual syncing can become frustrating and is error prone
Your question is actually about the subscription and its state of readiness. While it is not yet ready, you can show a loading page, and once it is ready you can run .fetch() to get the whole array. This logic could be put in your withTracker call, e.g.:
export default withTracker((props) => {
  const sub = Meteor.subscribe('users');
  return {
    ready: sub.ready(),
    users: sub.ready() && Users.find({userID: props.userID}).fetch()
  };
})(UserComponent);
Then, in your component, you can decide whether to render a spinner (while ready == false), or the users.
Your question is not entirely clear to me in terms of tools (please state which database connector library you are using). Firstly, given that you're accessing a database, your ".fetch()" call is most likely not a synchronous function but an asynchronous one, most likely returning a promise.
Secondly, given that you're using React, you want to set a new state only after you get all the results back.
If fetch is a promise then just do:
dbName.find({userID:"234234"}).fetch().then(results => {
  setState({elements: results.data}) // do your processing accordingly
})
By only calling setState inside the promise, you'll always have all the results fetched at that instant and only with that do you update your component state with the setState function - either using your react component class this.setState or with hooks like useState (much cleaner).

ndb query on Google Cloud Platform intermittently returning nothing

I have a Python application deployed on Google Cloud Platform. There is a Google Cloud Datastore in the background, with two Kinds. I use NDB to pull the data into the application.
class AttEvent(ndb.Model):
    event = ndb.StringProperty()
    matchdate = ndb.DateTimeProperty()

class MainPage(webapp2.RequestHandler):
    def get(self):
        query = AttEvent.query().order(AttEvent.matchdate)
        for q in query.fetch():
            try:
                # application code
One of the Kinds (AttEvent in the code above) is causing me trouble. The app will deploy and work as expected for hours or days, but then intermittently stop returning data. Debugging shows the q object is a legitimate object of type AttEvent, but for each of the items in the values collection, it says "(Object has no fields)". When the application code attempts to reference a property of the model (e.g. q.event), it fails.
The query will suddenly start working again, minutes / hours later, even if I take no action. I can't see any pattern or apparent cause. Obviously this isn't ideal from a user perspective.
The Kind that is causing trouble is static data and only actually contains 3 entities. The other Kind is transactional, contains thousands of records, but has never exhibited the same behaviour.
The intermittent nature of the fault leads me to believe this is something to do with caching, but I am fairly new to Python and GCP, so I am not exactly sure. I've tried doing a context.clear_cache() before the query, but it has no effect.
Am I missing something obvious?
I don't know why this is happening, but I have a possible workaround. Since the data is static and the entities seem to be small, you could store them in instance memory instead of querying for them every time you need them.
Store the entities in a module level variable like this:
class AttEvent(ndb.Model):
    event = ndb.StringProperty()
    matchdate = ndb.DateTimeProperty()

# Runs once per instance, at import time; the model must be defined first.
att_entities = AttEvent.query().order(AttEvent.matchdate).fetch()

class MainPage(webapp2.RequestHandler):
    def get(self):
        for q in att_entities:
            try:
                # application code
You would get the entities only when a new instance is launched so as long as it works the first time you are all set. As a bonus, it will make the get call faster since you don't need to retrieve the data from the data store.
You might need to add extra logic to cause att_entities to be updated as needed.
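One way to add that extra logic is to reload the cached list once it is older than some limit. The sketch below is only an illustration under stated assumptions: `load_entities` is a placeholder for the NDB query, and `MAX_AGE_SECONDS`, `_cache`, and `get_att_entities` are hypothetical names, not part of NDB.

```python
import time

MAX_AGE_SECONDS = 600  # hypothetical refresh interval

_cache = {"entities": None, "loaded_at": 0.0}


def load_entities():
    """Placeholder for AttEvent.query().order(AttEvent.matchdate).fetch()."""
    return ["event-a", "event-b", "event-c"]


def get_att_entities(now=None, loader=load_entities):
    """Return the cached entities, reloading them once they are too old."""
    now = time.time() if now is None else now
    expired = now - _cache["loaded_at"] > MAX_AGE_SECONDS
    if _cache["entities"] is None or expired:
        _cache["entities"] = loader()
        _cache["loaded_at"] = now
    return _cache["entities"]
```

The handler would then loop over `get_att_entities()` instead of the module-level list, so a stale or bad first load is replaced after the interval instead of lasting for the life of the instance.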

How to cut down API requests in AngularJS app

My problem is I'm making too many API requests, which I want to cut down if possible. Below I'll describe the situation:
I have three pages, all linked using ngRoute. Like this:
Page A: Teams (list of teams)
URL: "/teams"
Page B: Team Details (list of players)
URL: "/teams/team-details"
Page C: Player Details (list of player stats)
URL: "/teams/team-details/player-details"
Page A is populated by pulling an array of the teams from an API very easily using a simple $resource.query() request, and using ng-repeat to iterate through them.
Page B is populated by calling an html template and populating specific fields with values from a separate API request to the /team-details endpoint, taking the team_id value from the clicked element on Page A.
Page C (as with page B) takes a player_id from the clicked player on Page B and calls the /player-details endpoint using that value. This is yet another separate request.
This all works fine, but as you can imagine, a single user could quite easily rack up in excess of 100 API requests within an hour.
I have a request limit of 1000/hour, so if a mere 10 users are online at the same time, it could easily exceed my limit and shut down my API.
If I could access the API as one single master endpoint that outputted all data and subdata in one set, then that would solve my problem, but since I need to request separate endpoints I can't see how to do this.
Is there a better way to approach this? Or are these excessive API requests the only way?
Any help would be appreciated.
As far as I can see, your model looks suitable for the application and matches how an API-driven application should work...
However, one potential cut-down you could make is to cache some of the results locally, i.e. store a local version of some of the data that is unlikely to change within a session. For example, if the number of teams is unlikely to change, then store the results of one API request locally and use that instead of requesting the data from your API again.
Following on from this, you could choose to update certain data only after a certain time period. So, if a user has looked at some team-details, refuse to update this data for the next 10-20 minutes. However, this again depends on how time-sensitive your data is.

Meteor one time or "static" publish without collection tracking

Suppose that one needs to send the same collection of 10,000 documents down to every client for a Meteor app.
At a high level, I'm aware that the server does some bookkeeping for every client subscription - namely, it tracks the state of the subscription so that it can send the appropriate changes for the client. However, this is horribly inefficient if each client has the same large data set where each document has many fields.
It seems that there used to be a way to send a "static" publish down the wire, where the initial query was published and never changed again. This seems like a much more efficient way to do this.
Is there a correct way to do this in the current version of Meteor (0.6.5.1)?
EDIT: As a clarification, this question isn't about client-side reactivity. It's about reducing the overhead of server-side tracking of client collections.
A related question: Is there a way to tell meteor a collection is static (will never change)?
Update: It turns out that doing this in Meteor 0.7 or earlier will incur some serious performance issues. See https://stackoverflow.com/a/21835534/586086 for how we got around this.
http://docs.meteor.com/#find:
Statics.find({}, {reactive: false} )
Edited to reflect comment:
Do you have some information that the reactive: false param is only client side? You may be right, it's a reasonable, maybe likely interpretation. I don't have time to check, but I thought this may also be a server side directive, saying not to poll the mongo result set. Willing to learn...
You say
However, this is horribly inefficient if each client has the same large data set where each document has many fields.
Now we are possibly discussing the efficiency of the server code, and its polling of the mongo source for updates that happen outside of the server. Please make that another question, which is far above my ability to answer! I doubt that is happening once per connected client; more likely it is a sync between app server info and the mongo server.
The client requests you issue, including sorting, should all be labelled non-reactive. That is separate from whether you can issue them with sorting instructions, or whether they can be retriggered through other reactivity, but which need not include a trip to the server. Once each document reaches the client side, it is cached. You can still do whatever minimongo does, no loss in ability. There is no client asking server if there are updates, you don't need to shut that off. The server pushes only when needed.
I think using the manual publish (this.added) still works to get rid of the overhead created by the server observing data for changes. The observers either need to be added manually or are created by returning a Collection cursor.
If the data set is big you might also be concerned about the overhead of a merge box holding a copy of the data for each client. To get rid of that you could copy the collection locally and stop the subscription.
var staticData = new Meteor.Collection( "staticData" );
if (Meteor.isServer ){
  var dataToPublish = staticData.find().fetch(); // query mongo when server starts
  Meteor.publish( "publishOnce" , function () {
    var self = this;
    dataToPublish.forEach(function (doc) {
      self.added("staticData", doc._id, doc); //sends data to client and will not continue to observe collection
    });
  });
}
if ( Meteor.isClient ){
  var subHandle = Meteor.subscribe( "publishOnce" ); // fills client 'staticData' collection but also leave merge box copy of data on server
  var staticDataLocal = new Meteor.Collection( null ); // to store data after subscription stops
  Deps.autorun( function(){
    if ( subHandle.ready() ){
      staticData.find( {} ).forEach( function ( doc ){
        staticDataLocal.insert( doc ); // move all data to local copy
      });
      subHandle.stop(); // removes 'publishOnce' data from merge box on server but leaves 'staticData' collection empty on client
    }
  });
}
update: I added comments to the code to make my approach more clear. The meteor docs for stop() on the subscribe handle say "This will typically result in the server directing the client to remove the subscription's data from the client's cache" so maybe there is a way to stop the subscription ( remove from merge box ) that leaves the data on the client. That would be ideal and avoid the copying overhead on the client.
Anyway the original approach with set and flush would also have left the data in merge box so maybe that is alright.
As you've already pointed out yourself in googlegroups, you should use a Meteor Method for sending static data to the client.
And there is this neat package for working with Methods without async headaches.
Also, you could script out the data to a js file, as either an array or an object, minify it, then link to it as a distinct resource. See http://developer.yahoo.com/performance/rules.html for "Add an Expires or a Cache-Control Header". You probably don't want meteor to bundle it for you.
This would be the least traffic, and could make subsequent loads of your site much swifter.
As a response to a Meteor call, return an array of documents (use fetch()). No reactivity or logging. On the client, create a dep when you do a query, or retrieve the key from the session, and it is reactive on the client.
Minimongo just does js array/object manipulation, with a syntax-interpreting DSL between you and your data.
The new fast-render package makes one time publish to a client collection possible.
var staticData = new Meteor.Collection ('staticData');
if ( Meteor.isServer ){
  FastRender.onAllRoutes( function(){
    this.find( staticData, {} );
  });
}
