Is my GAE Search corrupt?

I've got a single index in a GAE Search application.
When I call index.put I get the OverQuotaError: The API call search.IndexDocument() required more quota than is available.
When I go to the GAE Console and look under Search, my index articleIndex contains no documents, but its Amount Used is 78.2KB. I've also tried retrieving documents, but none are returned.
I've tried using a new index, but I get the same error message in my application's logs.
I have a copy of my application that continues to work fine; it uses the same code and data but runs in a separate application space.
I created a new app with the code from my "corrupted" installation and the new installation indexes fine.
Has anyone had a GAE Search index that, although empty, is listed as taking up space?
I've tried running my GAE routines right after my daily quota is renewed.

The quota representing storage usage is reconciled nightly, so if you have recently removed documents from the index, that fact will not be reflected immediately. However, you say that using a new index still produces the same problem?
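If you want to empty the index entirely while you investigate, the usual purge loop with the Python Search API looks roughly like this (a minimal sketch; articleIndex is the index name from the question):

from google.appengine.api import search

index = search.Index(name='articleIndex')
while True:
    # get_range with ids_only=True returns a batch of document IDs per call
    document_ids = [doc.doc_id for doc in index.get_range(ids_only=True)]
    if not document_ids:
        break
    index.delete(document_ids)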
Note that daily quota renewal (not to be confused with the reconciliation mentioned above) does not affect the storage limit.
If you are still having trouble, you can file a report on the external issue tracker, mentioning the app ID and index name, and we can help investigate the current status of the index.

Related

How can I programmatically determine which Datastore indexes are in error?

When I run update_indexes on Google Datastore I get the message below. It is telling me to determine which indexes are in error by looking at the GUI, then delete these indexes.
I have 51 erroneous indexes out of 200, and copying them out of the GUI is not feasible.
(Edit: By laboriously removing and adding indexes from the datastore-indexes.xml, we identified the one problematic index.)
Good devops procedure demands that we do this sort of thing automatically.
How does one determine which indexes are in error programmatically? (Python, bash, or even Java are OK.)
Cannot build indexes that are in state ERROR. To vacuum and rebuild your indexes:
1. Create a backup of your index.yaml specification.
2. Determine the indexes in state ERROR from your admin console: https://appengine.google.com/datastore/indexes?&app_id=s~myproject
3. Remove the definitions of the indexes in ERROR from your index.yaml file.
4. Run "appcfg.py vacuum_indexes your_app_dir/"
5. Wait until the ERROR indexes no longer appear in your admin console.
6. Replace the modified version of your index.yaml file with the original.
7. Run "appcfg.py update_indexes your_app_dir/"
Unfortunately, Cloud Datastore doesn't have a public API for managing indexes, and the current command-line tools use an internal API that doesn't have access to that information.
We're aiming to have an index management API sometime next year (already working on designs) and I'll make sure this key use case is something we cover.

NDB query().iter() of 1000<n<1500 entities is wigging out

I have a script that, using the Remote API, iterates through all entities for a few models. Let's say two models, FooModel with about 200 entities and BarModel with about 1200 entities. Each has 15 StringProperty fields.
for model in [FooModel, BarModel]:
    print 'Downloading {}'.format(model.__name__)
    new_items_iter = model.query().iter()
    new_items = [i.to_dict() for i in new_items_iter]
    print new_items
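For context, a script like this presumably connects through remote_api_stub first; a typical setup looks roughly like the following sketch (the hostname and the credentials prompt are assumptions, not part of the question):

import getpass

from google.appengine.ext.remote_api import remote_api_stub

def auth_func():
    # prompt for credentials; replace with your own mechanism
    return raw_input('Email: '), getpass.getpass('Password: ')

# app_id=None lets the stub fetch the app ID from the server
remote_api_stub.ConfigureRemoteApi(
    None, '/_ah/remote_api', auth_func, 'your-app-id.appspot.com')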
When I run this in my console, it hangs for a while after printing 'Downloading BarModel'. It hangs until I hit ctrl+C, at which point it prints the downloaded list of items.
When this is run in a Jenkins job, there's no one to press ctrl+C, so it just runs continuously (last night it ran for 6 hours before something, presumably Jenkins, killed it). Datastore activity logs reveal that the datastore was taking 5.5 API calls per second for the entire 6 hours, racking up a few dollars in GAE usage charges in the meantime.
Why is this happening? What's with the weird behavior of ctrl+C? Why is the iterator not finishing?
This is a known issue currently being tracked on the Google App Engine public issue tracker under Issue 12908. The issue was forwarded to the engineering team and progress on this issue will be discussed on said thread. Should this be affecting you, please star the issue to receive updates.
In short, the issue appears to be with the remote_api script. When querying entities of a given kind, it hangs when fetching 1001 + batch_size entities if a batch_size is specified. This does not happen in production outside of the remote_api.
Possible workarounds
Using the remote_api
One could limit the number of entities fetched per script execution using the limit argument for queries. This is somewhat tedious, but the script can simply be executed repeatedly, for example from another script, to essentially the same effect; a cursor-based sketch follows below.
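For example, a minimal sketch using NDB's fetch_page, a cursor-based variation on the limit approach (the chunk size of 500 is an arbitrary assumption, chosen to stay well under the problematic ~1000-entity mark):

def fetch_all(model, chunk_size=500):
    items, cursor, more = [], None, True
    while more:
        # fetch_page caps each underlying query and returns a cursor to resume from
        page, cursor, more = model.query().fetch_page(chunk_size, start_cursor=cursor)
        items.extend(entity.to_dict() for entity in page)
    return items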
Using admin URLs
For repeated operations, it may be worthwhile to build a web UI accessible only to admins. This can be done with the help of the users module, as shown here. It is not really practical for a one-time task but is far more robust for regular maintenance tasks. Since this does not use the remote_api at all, you will not encounter this bug.
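As an illustration, a minimal webapp2 handler restricted to admins (a sketch only; the /admin/download path and the reuse of BarModel from the question are assumptions):

import json

import webapp2
from google.appengine.api import users

class AdminDownloadHandler(webapp2.RequestHandler):
    def get(self):
        # is_current_user_admin() is False for anonymous visitors as well
        if not users.is_current_user_admin():
            self.abort(403)
        items = [entity.to_dict() for entity in BarModel.query()]
        self.response.headers['Content-Type'] = 'application/json'
        # default=str keeps non-JSON types such as datetimes from raising
        self.response.write(json.dumps(items, default=str))

app = webapp2.WSGIApplication([('/admin/download', AdminDownloadHandler)])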

Not able to add new project to Google App Engine

I have fewer than 25 projects (well within the free limit), and I have also deleted a few unused projects. However, when trying to create a new project in the Google Developers Console, I get the message "Increase Project Limit - You can create more projects after you request a project limit increase."
Did the policy about 25 free projects on App Engine change? The document at "https://cloud.google.com/appengine/kb/#create" still says 25 is the limit.
This is on my personal Google account and not on corporate account.
When did you delete your projects? There is a 7-day waiting period before a project and its associated data are permanently deleted. Also keep in mind the following details:
Note that after the 7-day waiting period ends, the time it takes to completely delete a project may vary. For example, if a project has billing set up, it might not be completely deleted until the current billing cycle ends, you receive the next bill, and your account is successfully charged. Additionally, the number and types of services in use may also affect when the system permanently deletes a project.
You can check Developer Console Help for more information. Furthermore, you can list all your projects pending deletion:
Go to the Google Developers Console.
From the projects list, select Manage all projects.
Below the list of projects, click Projects pending deletion.

App Engine backup never finishes; only clue is failure in map reduce worker_callback

Over the last few weeks we have repeatedly failed to complete a backup of the datastore using the Datastore Admin tool. We thought the issues had to do with quota errors we were running into, so we switched our application from a free to a paid app, but we still have problems.
Each time we attempt to back up to the Blobstore, the process never finishes. We see the backup in our Pending Backups list, but it never actually completes. We only have 43MB of data in total right now, so we don't see it as a data-transfer problem. Looking at our default task queue shows that we have two pending tasks: one is a call to /_ah/mapreduce/controller_callback and the other is a call to /_ah/mapreduce/worker_callback.
The worker_callback racks up its retry count, and the only clue we have is that the Previous Run tab shows the last HTTP response code to be 500. There is no error message and nothing shows up in our error logs; it just keeps retrying over and over.
We've been able to narrow the backup problems down to a specific entity kind in a particular namespace, but we can't figure out why that entity kind fails while the others do not. The major difference is that this entity kind has a large number of embedded entities, but if App Engine is able to read and put those entities, we can't understand why it has trouble backing them up. The namespace in which the error occurs holds the most data for that entity kind compared to our other namespaces.
We think that if we could see what error is occurring in the worker_callback, we might be able to figure out why the backup is failing, or what is wrong with our data that prevents the backup. Is there something we need to set up or enable through settings or configuration files to get more detailed information on the backup? Or is there some other avenue we should explore to investigate and fix this problem?
I should mention that we are using the Java SDK as well as Objectify V3 to work with the datastore. We are also backing up data to the Blobstore.
Thank you.
Well, with the App Engine team's help we figured out what the problem was and worked around the issue. I want to give details in case anyone else runs into this problem.
In issue 8363 the App Engine team indicated that from their logs they could see the MapReduce failing because of the large number of properties our entity kind had. The specific entity kind causing the failure had a large number of variable properties that generated errors when MapReduce tried to write out a schema. They indicated that the fix on their end was to have the backup ignore entities like this so that the backup would succeed.
What we did to work around the issue and make the backup work was to change how we told Objectify to store our data. The large number of properties came from our use of the @Embedded annotation on a HashMap member field. Since @Embedded breaks classes down into their individual components, it was generating a large number of properties. We switched the member field to @Serialized and then ran a conversion process to populate the new serialized property. This made backup and restore work again.
You can read more about the differences between @Embedded and @Serialized on Objectify's website.
snielson, would you mind opening an issue on our public issue tracker here? Remember to add your application ID so we can further debug this specific scenario.
Thanks!

SOLR 3.6.0: after a full re-index of a bunch of entities, some of my items are not making it into the SOLR index, but no logs are being generated

Using a StreamingUpdateSolrServer, I re-indexed my huge dataset into SOLR with the following algorithm:
StreamingUpdateSolrServer server =
    new StreamingUpdateSolrServer(solrServerUrl, numDocsToAddInBatch, numOfThreads);
for (Item item : items) {
    SolrInputDocument document = new SolrInputDocument();
    // ... populate the document's fields from the item ...
    server.add(document);
}
// when all items have been added:
server.commit();
server.optimize();
The problem:
Some of my items are not making it into the SOLR index, but no logs are being generated to tell me what happened.
I was able to find most of the documents, but some were missing. No errors appear in any logs, and I have substantial try/catch blocks with logging around all SOLRJ calls on the client side.
Verify logging is not being hidden for the SOLR WAR
You will definitely want to verify that the SOLR server log settings are not hiding the fact that documents are failing to be added to the index.
Because SOLR uses the SLF4J API, your SOLR server could be overriding the log settings that would otherwise show you an error message when a document fails to be indexed.
If you have a custom {solr-war}/WEB-INF/classes/logging.properties, you will need to make sure its settings are not hiding the error messages.
By default, errors in adding an item are shown automatically, so if you never changed your SOLR log settings, you should see any indexing errors in your server log file.
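For reference, a hypothetical logging.properties along these lines would silently swallow indexing errors; a level of OFF (or anything above SEVERE) on the SOLR loggers is the value to look for:

# root logger level
.level = INFO
# this hides even SEVERE messages from SOLR, including failed adds
org.apache.solr.level = OFF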
Troubleshoot why Documents are failing to be indexed
In order to investigate this, it is helpful to run the following verification step any time after the indexing is complete (a Python sketch of this pass appears after the list):
Initialize new log log_fromsolr
Initialize new log log_notfound
For each Item…
-->Search SOLR for the item. If SOLR has the object, log each of the item's fields on a single line in log_fromsolr. This should include the uniqueKey for your document if you have one.
-->If the document cannot be found in SOLR for this item, write a line to log_notfound with all the fields from the object in the database, again supplying the uniqueKey as the first entry.
Once the verification step has completed, log_notfound contains a list of all documents that failed to be added to the index.
You can use log_fromsolr to compare the fields of a document that made it into the index against one that did not.
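A minimal Python sketch of this verification pass, querying SOLR's HTTP select handler directly (the SOLR URL, the id field name, and get_all_items() are illustrative assumptions):

import json, urllib, urllib2

SOLR_SELECT = 'http://localhost:8983/solr/select'

def found_in_solr(unique_key):
    # ask SOLR only for the hit count, not the stored documents
    params = urllib.urlencode({'q': 'id:"%s"' % unique_key, 'wt': 'json', 'rows': 0})
    result = json.load(urllib2.urlopen('%s?%s' % (SOLR_SELECT, params)))
    return result['response']['numFound'] > 0

log_fromsolr = open('log_fromsolr', 'w')
log_notfound = open('log_notfound', 'w')
for item in get_all_items():  # iterate your database items here
    line = '%s\t%s\n' % (item['id'], json.dumps(item))
    if found_in_solr(item['id']):
        log_fromsolr.write(line)
    else:
        log_notfound.write(line)
log_fromsolr.close()
log_notfound.close()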
Verify it is not an intermittent issue
Sometimes it might be the case that it is not the same Items failing to be added to the index each time you try to index.
If you find objects in the log_notfound log, back up the current notfound log and run the indexing process again from scratch. Then use a diff tool to see the differences between the first and second notfound logs.
An intermittent problem is evident when you see large numbers of differences between these files. (Note: some differences are to be expected if new objects are being created in the database between the first and second re-indexing.)
If your problem is intermittent, it almost certainly points at application code that is not committing your SOLR transactions correctly.
The same documents consistently come up missing each time it indexes
At this point we have to compare the documents that are found in the SOLR index against the documents that never made it into the Lucene index. Usually a field-by-field comparison of the objects will start turning up suspicious values that may be causing issues when the document is added to the index.
Try eliminating all the suspicious fields and then re-indexing the entire thing again. See if the documents are still failing to be indexed. If this worked, start re-introducing the fields you removed until you can pinpoint the one causing the issue.
