Google Cloud App Engine storage - google-app-engine

I'm new to App Engine and I can't seem to work out how static file serving works. I have read through all the related official docs.
I understand that tagging directories or files as static in app.yaml will "host them separated from your source". However, I can't find any more info.
After the App Engine deployment, I see two GCS buckets: the default bucket "projectname.appspot.com" and a staging bucket "staging.projectname.appspot.com". My project has many static files, which are all served correctly; however, the default bucket is totally empty. Where are these files actually stored?
I can find very little information on what exactly the staging bucket is and how it's used, other than that it's "for temporary files" which are deleted after a week. This staging bucket seems to hold both the source and static files; however, all the file names are hashed, so it's not immediately obvious.
My question is: where exactly do App Engine static files go? Where and how are they stored? My app is Django-based, if that's relevant.

The static files aren't saved to your app's buckets. They are stored on Google infrastructure dedicated to serving static content directly, which isn't accessible to you the way your app's buckets are.
See also the accepted answer to "Does Google App Engine use google CDN to distribute static resources?"
Normally you shouldn't need to worry about where the static files are stored. As long as the deployment is successful, you can trust Google to serve them for your app.

Related

How do I dynamically generate a sitemap with Google App Engine

My website changes every day: I run a news website with new stories every day. I want Google to index my site as often as possible, and I need to autogenerate the sitemap.
I use Google App Engine (with Node.js) to run my site. With GAE, I do not have write access to the root directory. To post the sitemap, I need to redeploy my whole site after generating it. That is an unnecessarily complex step.
I have searched far and wide and cannot see how to save my sitemap. So I considered using a static one with a dynamically generated child that I store in another location where I have write access. Google says it wants all linked sitemaps in the same directory, so that appears to be a dead end.
Can I use "App Deploy" in such a way that only the sitemap is uploaded? Are there any other possibilities? I appreciate any and all suggestions. It seems unlikely that Google didn't provide some way to solve this.
For a site where new URLs are being created regularly (a news site, a blog, etc.), don't 'store' your sitemap. It should be generated on demand, i.e. your app should include code that generates the content whenever the link <your_website>/sitemap.xml is loaded.
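As a rough illustration of generating the sitemap on demand, here is a minimal Python sketch (the site in the question runs on Node.js, but the idea carries over). The function name build_sitemap and the example URLs are made up for this sketch; in a real app the entries would come from wherever the stories are stored, and the returned string would be sent as the response for /sitemap.xml.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML as a string.

    entries is an iterable of (url, last_modified) pairs, where
    last_modified is a date string such as "2024-01-31".
    build_sitemap is a hypothetical helper, not part of any framework.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder data; a real handler would query the story database.
    print(build_sitemap([
        ("https://example.com/story-1", "2024-01-30"),
        ("https://example.com/story-2", "2024-01-31"),
    ]))
```

Because the XML is built per request, nothing has to be written to the deployed file system and no redeploy is needed when a story is added.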
Separately, you should note that gcloud app deploy doesn't always deploy all your files. It usually deploys only files that have changed. You can easily confirm this by running the deploy command, changing a single file, and then running the deploy command again. The logs will say something like "Uploading 1 files to Google Cloud Storage" and the deploy will be faster. If you change X files and deploy again, the message will indicate that it is only uploading X files.
However, I'm not sure what it uses to compute the diff. Maybe it compares your local files to the files currently in your staging bucket, and if the files in the staging bucket have been deleted (they have a default life span of 15 days) it will deploy all the files again (but as I said, I'm not sure of this).
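To make the guess above concrete, here is a small Python sketch of one way such a diff could work: hash each local file's contents and upload only the files whose hash is not already known. This is purely an illustration of the general technique. It is an assumption, not a description of what gcloud actually does, and the names file_digest and files_to_upload are invented for the sketch.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-1 hex digest of a file's contents."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in 64 KiB blocks so large files are not loaded at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def files_to_upload(root, already_uploaded):
    """Return relative paths under root whose content hash is unknown.

    already_uploaded is a set of hex digests standing in for whatever
    the remote side already holds (an assumption for this sketch).
    """
    root = Path(root)
    pending = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and file_digest(path) not in already_uploaded:
            pending.append(path.relative_to(root).as_posix())
    return pending
```

Under this scheme, if the set of known hashes is emptied (for example because the stored copies expired), every file counts as new and is uploaded again, which matches the behaviour speculated about above.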

Are Google App Engine deployed files private by default?

Excuse the naivety here, I'm new to GAE, and haven't been able to find much in the available literature / answers about the deployed filesystem's public status...
My question is quite simple:
Assuming a standard app.yaml config, are the files that get pushed to GAE with gcloud app deploy publicly inaccessible, unless exposed by (in the example of Node.js) an express endpoint?
I want to make sure sensitive data like key files (for reference in code) in our deployed bundle are not exposed, and that the local filesystem of a deployment is only accessible privately by the code itself.
Unless you use the static_files or static_dir handlers in app.yaml, none of your deployed files should be made visible to outside users by Google.
Authenticated admin users (yourself) can see all files deployed to the server in the admin console unless you disable "code downloads" (an option that was available on legacy App Engine but seems to have been removed now).

Bucket of Staging files after deploying an app engine

After deploying a Google App Engine app, at least four buckets are created in Google Cloud Storage:
[project-id].appspot.com
staging.[project-id].appspot.com
artifacts.[project-id].appspot.com
vm-containers.[project-id].appspot.com
What are they, and will they incur storage costs? Can they be safely deleted?
I believe the "artifacts" bucket is what they're referring to here. A key point is the following:
Once deployment is complete, App Engine no longer needs the container images. Note that they are not automatically deleted, so to avoid reaching your storage quota, you can safely delete any images you don't need.
I discovered this after (to my great surprise) Google started charging me money every month. I saw that the "artifacts" bucket had a directory named "images". (I naively thought that it had something to do with graphics or photographs, which was quite mysterious as my app doesn't do anything with graphics.)
Staging buckets are described in the App Engine documentation, under Setting Up Google Cloud Storage.
I am quoting relevant information here for future viewers:
Note: When you create a default bucket, you also get a staging bucket
with the same name except that staging. is prepended to it. You can
use this staging bucket for temporary files used for staging and test
purposes; it also has a 5 GB limit, but it is automatically emptied on
a weekly basis.
So in essence, when you create either an App Engine Standard or Flexible app, you get these two buckets. You can delete the buckets: I deleted the staging one and was able to recover it by running gcloud beta app repair.
They are not mandatory for a GAE app. One has to explicitly enable GCS for a GAE app for some of these to be created.
At least a while back, only the first two were created by default (for a standard environment Python app) when GCS was enabled, and they are empty by default.
It is possible that the others are created by default as well these days, I'm not sure. But they could also be created by and used for something specific you're doing in/for your app - only you can tell that.
You can check what's in them via the Storage menu in the developer console. That might give a hint as for their usage. For my apps which have such buckets created - they're empty.
From Default Google Cloud Storage bucket:
Applications can use a Default Google Cloud Storage bucket, which has
free quota and doesn't require billing to be enabled for the app. You
create this free default bucket in the Google Cloud Platform Console
App Engine settings page for your project.
The free quota is 5 GB, so as long as you don't reach that you're OK.
That leaves the discrepancy between the one bucket mentioned in the docs and the multiple buckets actually seen. That is debatable, and I'm not sure what to suggest.
In short, I'd check the contents of these buckets. If they're not empty, I'd check the estimated costs for any indication that the free 5 GB quota might not apply to them. If that's the case, I'd investigate the actual usage and decide whether to delete something or not.
Otherwise I'd just leave them be.
An update on what staging is for (at least in Python GAE Standard):
https://cloud.google.com/appengine/docs/standard/python3/using-cloud-storage
App Engine also creates a bucket that it uses for temporary storage when it deploys new versions of your app. This bucket, named staging.project-id.appspot.com, is for use by App Engine only. Apps can't interact with this bucket.
Still can't figure out what artifacts is for.
artifacts.[project-id].appspot.com: the files in this bucket are created by Google Container Registry.
WARNING: Deleting them will cause you to lose access to your container registry.

Allowing an authenticated user to download a big object stored on Google Storage

I have some big files stored on Google Storage. I would like users to be able to download them only when they are authenticated to my GAE application. The user would use a link of my GAE such as http://myapp.appspot.com/files/hugefile.bin
My first attempt works for files smaller than 32 MB. Using the Google Storage experimental API, I could read the file first and then serve it to the user. It required my GAE application to be a team member of the project on which Google Storage was enabled. Unfortunately this doesn't work for large files, and it hogs bandwidth by first downloading the file to GAE and then serving it to the player.
Does anyone have an idea of how to do that?
You can store files up to 5GB in size using the Blobstore API: http://code.google.com/appengine/docs/python/blobstore/overview.html
Here's the Stackoverflow thread on this: Upload file bigger than 40MB to Google App Engine?
One thing to note is that reading from the blobstore can only be done in 32 MB increments, but the API provides ways to access portions of the file for reads: http://code.google.com/appengine/docs/python/blobstore/overview.html#Serving_a_Blob
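The arithmetic behind reading a large blob in portions can be sketched in plain Python. The helper below only computes the byte ranges; it does not call the Blobstore API. The name byte_ranges and the default chunk size (taken from the 32 MB figure above) are assumptions for illustration, and each (start, end) pair would be passed to whatever ranged-read call the storage API offers.

```python
# 32 MB, the per-read limit mentioned above.
MAX_FETCH = 32 * 1024 * 1024


def byte_ranges(total_size, chunk_size=MAX_FETCH):
    """Yield inclusive (start, end) byte offsets covering total_size bytes.

    byte_ranges is a hypothetical helper; it performs no I/O.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 0
    while start < total_size:
        # The last range may be shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


if __name__ == "__main__":
    # A 10-byte object read 4 bytes at a time needs three reads.
    for start, end in byte_ranges(10, chunk_size=4):
        print(start, end)
```

Reading this way keeps each request under the limit, but every byte still passes through the app, so it does not address the bandwidth concern raised in the question.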
FYI, in the upcoming 1.6.4 release of App Engine we've added the ability to pass a Google Storage object name to blobstore.send_blob() to send Google Storage files of any size from your App Engine application.
Here is the pre-release announcement for 1.6.4.

Deployment of static directory contents to google app engine

I've deployed my first GAE application and I am getting a "TemplateDoesNotExist" exception on my main page. It feels like my static directory content is not being uploaded to GAE.
Isn't it possible to upload all my files, including the static ones, with appcfg.py update myapp/ and run the app standalone on myappid.appspot.com?
By the way, here you can see the problem:
http://pollbook.appspot.com
PS: my app works perfectly locally.
Your templates should not be stored in a directory that you refer to as "static" in app.yaml. Static directories are for literally static files that will be served to end users by the CDN without changing. These files cannot be read by the templating engine. It works locally because the dev_appserver does not precisely emulate the production server.
Put your templates in a different directory like /templates or something. You do not need to refer to this directory in your app.yaml.
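The reason this matters is that templates are read by your application code at request time, so they must be deployed alongside the code rather than handed off to the static file servers. A minimal Python sketch of that idea follows. It uses the standard library's string.Template as a stand-in for the real templating engine, and the render_template helper and the file names in the usage note are hypothetical.

```python
from pathlib import Path
from string import Template


def render_template(template_dir, name, **context):
    """Read a template file from template_dir and fill in placeholders.

    This only works if the file was deployed with the application code,
    which is not the case for files under a directory marked as static.
    string.Template stands in for a real engine such as Django's.
    """
    text = (Path(template_dir) / name).read_text(encoding="utf-8")
    return Template(text).substitute(context)
```

For example, with a file templates/index.html containing "<h1>$title</h1>", calling render_template("templates", "index.html", title="Polls") would return "<h1>Polls</h1>".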
