index.yaml is not updating - google-app-engine

I expect the index.yaml file to update with the necessary indices when I run queries in my development environment. It claims that it is updating this file in the dev server log, but the file doesn't actually change. Any idea what might be going on?
Here is the entire index.yaml file:
indexes:
# AUTOGENERATED
# This index.yaml is automatically updated whenever the dev_appserver
# detects that a new type of query is run. If you want to manage the
# index.yaml file manually, remove the above marker line (the line
# saying "# AUTOGENERATED"). If you want to manage some indexes
# manually, move them above the marker line. The index.yaml file is
# automatically uploaded to the admin console when you next deploy
# your application using appcfg.py.
The log has several of these lines at the points where I would expect it to add a new index:
INFO 2010-06-20 18:56:23,957 dev_appserver_index.py:205] Updating C:\photohuntservice\main\index.yaml
Not sure if it's important, but I'm using version 1.3.4 of the AppEngine SDK.

Are you certain you're running queries that need composite indexes to be built? Queries on single properties are served by the built-in default indexes and don't need index.yaml entries, and queries that use only equality filters on multiple properties are executed with a merge-join strategy that doesn't require building custom indexes.
Unless you're getting NeedIndexErrors thrown in production (without a message about the existing indexes not allowing the query to run efficiently enough), your empty index.yaml may be perfectly fine.
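For reference, when the dev server does detect a query that needs a composite index, it appends an entry below the AUTOGENERATED marker. A typical entry looks something like this (the kind and property names here are purely illustrative):

```yaml
indexes:

# AUTOGENERATED

# A query like Photo.all().filter('owner =', user).order('-created')
# combines an equality filter with a sort order on another property,
# so it needs a composite index such as:
- kind: Photo
  properties:
  - name: owner
  - name: created
    direction: desc
```

If your queries never combine filters and sort orders this way, the file staying empty is expected behavior.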

There is a known issue where the Python SDK on Linux doesn't regenerate an index.yaml that was created on Windows. It may be related to your case, but more likely you simply don't have any queries that trigger automatic index creation in the SDK.


Can appengine files in asia.artifacts.../containers/images be safely deleted?

Can App Engine files in Google Cloud Storage under the bucket asia.artifacts.../containers/images be safely deleted without causing any problems? There is already 160 GB of them after just a few years. The documentation doesn't make clear what they are for, or why they are retained there:
# gsutil du -sh gs://asia.artifacts.<project>.appspot.com
158.04 GiB gs://asia.artifacts.<project>.appspot.com
I just want to know if I can delete them, or if I need to keep paying for the storage space.
Originally I thought these files might correspond to what can be seen under "Google Cloud Platform" > "Container Registry" > "Images" > "app-engine-tmp". But even after deleting almost everything through the Container Registry web interface, there are still thousands of really old files sitting in this containers/images folder.
If I had to guess the reason for this ever-growing pile of probably-junk files: I suspect that when versions are deleted through the web interface, the underlying files are not removed. Is that correct?
UPDATE: I found this clue in the Cloud Build logs that appear when you deploy. I tested deleting the artifacts bucket on a test project. The project still works, and builds still work; an apparently harmless error message appears in the logs. Perhaps it's genuinely safe to delete this artifacts folder. However, it would be good to have clarity on what these ancient (apparently unused) artifact-bucket files are for before deleting them.
2021/01/15 11:27:40 Copying from asia.gcr.io/<project>/app-engine-tmp/build-cache/ttl-7d/default/buildpack-cache:latest to asia.gcr.io/sis-au/app-engine-tmp/build-cache/ttl-7d/default/buildpack-cache:f650fd29-3e4e-4448-a388-c19b1d1b8e04
2021/01/15 11:27:42 failed to copy image: GET https://storage.googleapis.com/asia.artifacts.<project>.appspot.com/containers/images/sha256:ca16b83ba5519122d24ee7d343f1f717f8b90c3152d539800dafa05b7fcc20e9?access_token=REDACTED: unsupported status code 404; body: <?xml version='1.0' encoding='UTF-8'?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Details>No such object: asia.artifacts.<project>.appspot.com/containers/images/sha256:ca16b83ba5519122d24ee7d343f1f717f8b90c3152d539800dafa05b7fcc20e9</Details></Error>
Unable to tag previous cache image. This is expected for new or infrequent deployments.
It should be safe to delete those. According to Google docs:
Each time you deploy a new version, a container image is created using the Cloud Build service. That container image then runs in the App Engine standard environment.
Built container images are stored in the app-engine folder in Container Registry. You can download these images to keep or run elsewhere. Once deployment is complete, App Engine no longer needs the container images. Note that they are not automatically deleted, so to avoid reaching your storage quota, you can safely delete any images you don't need.
Also as a suggestion, if you don't want to manually delete the images just in case they start piling up again, you can set up Lifecycle Management on your "artifacts" bucket and add a rule to delete old files (for example, 30 days).
This thread is similar to your concern and it has great answers. Feel free to check it out!
IMPORTANT UPDATE: This answer only applies to the Standard environment. The artifacts bucket is used as the backing storage for Flex app images; it's used when bringing up and autoscaling VMs, so be careful before deleting them.
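Expanding on the lifecycle suggestion above: a rule along these lines should keep the bucket from piling up again. This is a sketch; the bucket name is the placeholder from the question, and the 30-day age is just an example value:

```
# lifecycle.json: delete any object older than 30 days
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "Delete"}, "condition": {"age": 30}}
  ]
}
EOF

# Apply the policy to the artifacts bucket
gsutil lifecycle set lifecycle.json gs://asia.artifacts.<project>.appspot.com
```

Remember this is only advisable on Standard; per the update above, Flex relies on these images.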

Appengine runs a stale version of the code -- and stack traces don't match source code

I have a python27 appengine application. My application generates a 500 error early in the code initialization, and I can inspect the stack trace in the StackDriver debugger in the GCP console.
I've since patched the code and re-deployed under the same service name and version name (i.e. gcloud app deploy --version=SAME). Unfortunately, the old error still comes up, and the line numbers in the stack traces reflect the files from the buggy deployment. If I use the code viewer to debug the error, however, I'm brought to the updated, patched code in the online viewer, so there is a mismatch. It behaves as if the app instance is holding on to a previous snapshot of the code.
I'm fuzzy on the freshness and eventual consistency guarantees of GAE. Do I have to wait to get everything to serve the latest deployed version? Can I force it to use the newer code right away?
Things I've tried:
I initially assumed the problem had to do with versioning, i.e. maybe requests were being load-balanced between instances with the same version, but each with slightly different code. I'm a bit fuzzy on the actual rules that govern which GAE instance is chosen for a new request (especially whether GAE tries to reuse previous instances based on source IP). I'm also fuzzy on whether active instances are destroyed right away when different code is redeployed under the same version name.
To take that possibility out of the equation, I tried pushing to a new version name, and then deleting all previous versions (using gcloud app versions list to get the list). But it doesn't help -- I still get stack traces from the old code, despite the source being up to date in the GCP console debugger. Waiting a couple hours doesn't do anything either.
I've tried two things:
disabling and re-enabling the application in GAE->Settings
I'd also noticed that there were some .pyc files uploaded in the snapshot, so I removed those and re-deployed.
I discovered that (1) is a very effective way to stop all running appengine instances. When you deploy a new version of a project, a traffic split is created (i.e. 0% for the old version and 100% for the new), but in my experience old instances might still be running if they've been used recently (despite them being configured to receive 0% of traffic). Toggling kills them all immediately. I unfortunately found that my stale code was still being used after re-enabling.
(2) did the trick. It wasn't obvious that .pyc files were being uploaded; I discovered it by looking at GCP->StackDriver->Debug, where I saw .pyc files in the tree snapshot.
I had recently updated my .gitignore to ignore locally installed pip runtime dependencies for the project (the output of pip install -t lib -r requirements.txt). I don't want those in git, but they do need to ship as part of my App Engine project. I had removed the #!include:.gitignore special include line from .gcloudignore, but I forgot to re-add *.pyc to my .gcloudignore.
Another way to see the complete set of files included in an app deployment is to increase the verbosity of the gcloud app deploy command to info: you'll see a giant JSON manifest with checksums. I don't typically leave that on because it's hard to inspect visually, but I would have spotted the .pyc files in there.
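For the record, the .gcloudignore I ended up with looks roughly like this (a sketch; the lib/ path reflects my project layout and is an assumption):

```
# .gcloudignore

# Start from the same rules as git:
#!include:.gitignore

# Re-include the vendored dependencies that .gitignore excludes,
# since they must ship with the app:
!lib/

# Never upload compiled bytecode:
*.pyc
__pycache__/
```

The `#!include:.gitignore` directive pulls in the git rules, and the later lines override them where deployment needs differ from version control.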

GAE: What's faster loading an include config file from GCS or from cloud SQL

Based on the subdomain that is accessing my application I need to include a different configuration file that sets some variables used throughout the application (the file is included on every page). I'm in two minds about how to do this
1) Include the file from GCS
2) Store the information in a table on Google Cloud SQL and query the database on every page through an included file.
Or am I better off using one of these options combined with Memcache?
I've been looking everywhere for what is the fastest option (loading from GCS or selecting from cloud SQL), but haven't been able to find anything.
NB: I don't want to have the files as normal php includes as I don't want to have to redeploy the app every time I setup a new subdomain (different users get different subdomains) and would rather either just update the database or upload a new config file to cloud storage, leaving the app alone.
I would say the sanest solution would be to store the configuration in Cloud SQL, since you can easily make changes to it even from within the app, and to use memcache, since it was built exactly for this kind of thing.
The problem with GCS is that you cannot simply edit a file in place; you have to delete it and upload a new version every time, which is not optimal in the long run.
GCS is cheaper, although for small text files it does not matter much. Otherwise, I don't see much of a difference.

using unit tests to keep index.yaml updated

So far the only way I have been able to keep index.yaml updated when I make code changes is to either hit the URLs via the browser or use TransparentProxy, while the application is being served through dev_appserver.
This sucks.
Is there a way to bootstrap the App Engine environment in the unit test runner so that whatever process updates index.yaml can run without incurring the overhead of the single-threaded dev_appserver?
The difference is significant. My test suite (80% coverage) runs in 2 minutes but does not update index.yaml; if I run the same suite using TransparentProxy to forward requests to port 8080, index.yaml does get updated, but it takes about 4 hours. Again, this sucks.
You can use my Nose plugin for this, called nose-gae-index. It uses the internal IndexYamlUpdater class from the SDK, so it is definitely better than proxying requests.
Despite this improvement, there is definitely no need to have it enabled all the time. I use it before deployment and to inspect changes to index configuration caused by new commits.
Remember not to use queries that require indexes in the tests themselves, or they will be added to the configuration file as well!
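A related trick, independent of the plugin: snapshot index.yaml before the run and diff it afterwards, so you can review exactly which index definitions a run added before committing them. A small stdlib-only sketch (the before/after snapshots are assumed to be read by the caller):

```python
import difflib

def index_yaml_diff(before_text, after_text):
    """Return a unified diff between two index.yaml snapshots."""
    return "".join(difflib.unified_diff(
        before_text.splitlines(True),
        after_text.splitlines(True),
        fromfile="index.yaml (before)",
        tofile="index.yaml (after)",
    ))

# Typical use: read index.yaml before and after the test run, then
# print index_yaml_diff(before, after) to review the new entries.
```

An empty result means the run introduced no new index requirements, which is a cheap sanity check before deployment.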

App Engine index building stalled/stuck

I am having a problem with indexes building in my App Engine application. There are only about 200 entities in the indexes that are being built, and the process has now been running for over 24 hours.
My application name is romanceapp.
Is there any way that I can re-start or clear the indexes that are being built?
Try redeploying your application to appspot; I had the same issue and that solved it.
Let me know if this helps.
To handle "Error" indexes, first remove them from your index.yaml file and run appcfg.py vacuum_indexes. Then, either reformulate the index definition and corresponding queries, or remove the entities that are causing the index to "explode." Finally, add the index back to index.yaml and run appcfg.py update_indexes.
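Spelled out as commands against a local app directory (the path is a placeholder; these are the pre-gcloud appcfg.py tools from the SDK era of this question):

```
# 1. After removing the broken entry from index.yaml, delete the
#    serving-side copies of indexes no longer listed in the file:
appcfg.py vacuum_indexes /path/to/app

# 2. Fix the index definition (or the offending entities), re-add
#    the entry to index.yaml, then rebuild:
appcfg.py update_indexes /path/to/app
```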
I found it here and it helped me.
