Bucket of Staging files after deploying an app engine - google-app-engine

After deploying a Google App Engine app, at least 4 buckets are created in Google Cloud Storage:
[project-id].appspot.com
staging.[project-id].appspot.com
artifacts.[project-id].appspot.com
vm-containers.[project-id].appspot.com
What are they, and will they incur storage cost? Can they be safely deleted?

I believe the "artifacts" bucket is what they're referring to here. A key point is the following:
Once deployment is complete, App Engine no longer needs the container images. Note that they are not automatically deleted, so to avoid reaching your storage quota, you can safely delete any images you don't need.
I discovered this after (to my great surprise) Google started charging me money every month. I saw that the "artifacts" bucket had a directory named "images". (I naively thought that it had something to do with graphics or photographs, which was quite mysterious as my app doesn't do anything with graphics.)
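A hedged way to keep that "images" directory from accumulating charges (an approach I'm suggesting, not something App Engine does for you) is to attach a Cloud Storage lifecycle rule to the artifacts bucket so old container images are deleted automatically. This sketch just writes the JSON that gsutil expects; the 30-day age is an arbitrary assumption, tune it to your deploy cadence:

```python
import json

# Delete any object in the bucket once it is older than 30 days (assumed
# retention; recent images are kept so deploys can still reuse them).
lifecycle = {"rule": [{"action": {"type": "Delete"}, "condition": {"age": 30}}]}

with open("rule.json", "w") as f:
    json.dump(lifecycle, f, indent=2)
```

Apply it with gsutil lifecycle set rule.json gs://artifacts.[project-id].appspot.com.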

Staging buckets are described in App Engine's documentation under Setting Up Google Cloud Storage.
I am quoting relevant information here for future viewers:
Note: When you create a default bucket, you also get a staging bucket with the same name except that staging. is prepended to it. You can use this staging bucket for temporary files used for staging and test purposes; it also has a 5 GB limit, but it is automatically emptied on a weekly basis.
So in essence, when you create either an App Engine Standard or Flexible app, you get these two buckets. You can delete the buckets (I deleted the staging one) and I was able to recover it by running gcloud beta app repair.

They are not mandatory for a GAE app - one has to explicitly enable GCS for a GAE app for some of these to be created.
At least a while back only the first two were created by default (for a standard environment Python app) when GCS was enabled, and they are empty by default.
It is possible that the others are created by default as well these days, I'm not sure. But they could also be created by and used for something specific you're doing in/for your app - only you can tell that.
You can check what's in them via the Storage menu in the developer console. That might give a hint as to their usage. For my apps which have such buckets created, they're empty.
From Default Google Cloud Storage bucket:
Applications can use a Default Google Cloud Storage bucket, which has free quota and doesn't require billing to be enabled for the app. You create this free default bucket in the Google Cloud Platform Console App Engine settings page for your project.
The free quota is 5 GB, so as long as you don't reach that you're OK.
Now there is the matter of one bucket being mentioned in the docs versus the multiple ones actually seen; that's debatable, and I'm not sure what to suggest there.
In short: I'd check the content of these buckets. If they're not empty, I'd check the estimated costs for any indication that the free 5 GB quota might not apply to them. If that's the case, I'd investigate the actual usage and decide whether or not to delete something.
Otherwise I'd just leave them be.

An update on what staging is for (at least in Python GAE Standard):
https://cloud.google.com/appengine/docs/standard/python3/using-cloud-storage
App Engine also creates a bucket that it uses for temporary storage when it deploys new versions of your app. This bucket, named staging.project-id.appspot.com, is for use by App Engine only. Apps can't interact with this bucket.
Still can't figure out what artifacts is for.

artifacts.[project-id].appspot.com - the files in this bucket are created by Google Container Registry.
WARNING: Deleting them will cause you to lose access to your container registry.

Related

Can I leverage having my app engine and storage in the same bucket?

I have an App Engine instance which needs to read frequently from my storage (over HTTP). The files in storage are updated very frequently, so it's not practical to deploy them with the app.
I can't see any way to leverage the "proximity" of having the app and resources in the same bucket. Is there a way to gain an advantage?
By leverage/advantage I mean saving costs, saving time, etc. Thanks!
When you create an app, App Engine creates a default bucket that provides the first 5GB of storage for free. The default bucket also includes a free quota for Cloud Storage I/O operations. See Pricing, quotas, and limits for details. You will be charged for storage over the 5GB limit. You can make use of this bucket when performing the I/O operations required to read these resources.
The name of the default bucket is in the following format:
project-id.appspot.com
Also as mentioned in the comments above by DazWilkin you may try to make use of another Cloud Storage bucket (in the same region as the app) for reading operations.

How do I dynamically generate a sitemap with Google App Engine

My website changes every day - I run a news website with new stories every day. I want Google to index my site as often as possible and want/need to autogenerate the sitemap.
I use Google App Engine (with Node.js) to run my site. With GAE I do not have write access to the root directory. To post the sitemap, I need to re-deploy my whole site after generating the map. That is an unnecessarily complex step.
I have searched far and wide and cannot see how to save my sitemap. So - I considered using a static one with a dynamically generated child that I store in another location where I have write access. Google says it wants all linked sitemaps in the same directory. So that appears to be a dead-end.
Can I use "App Deploy" in such a way that only the sitemap is uploaded? Any other possibilities? Appreciate any and all suggestions. It seems unlikely that Google didn't provide some way to solve this.
For a site where new URLs are being created regularly (like a news, blog site, etc), don't 'store' your sitemap. It should be generated on demand i.e. your App should include code to generate the content when the link <your_website>/sitemap.xml is loaded.
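The question uses Node.js, but the generate-on-demand idea is language-independent; here is a minimal sketch in Python (the thread's other examples are Python) using only the standard library. fetch_article_urls is a hypothetical stand-in for however your app lists its current story URLs, e.g. a Datastore query:

```python
import xml.etree.ElementTree as ET

def fetch_article_urls():
    # Hypothetical data source; replace with your real story listing.
    return ["https://example.com/news/story-1",
            "https://example.com/news/story-2"]

def build_sitemap() -> bytes:
    # Build <urlset><url><loc>...</loc></url>...</urlset> in memory.
    urlset = ET.Element("urlset",
                        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for loc in fetch_article_urls():
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = loc
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

Wire build_sitemap() to a /sitemap.xml route and return it with Content-Type application/xml; nothing is ever written to disk, so the read-only filesystem is not an issue.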
Separately, you should note that gcloud app deploy doesn't always deploy all your files. It usually deploys only files that have changed. You can easily confirm this by running the deploy command, changing a single file and then running the deploy command again. You will see that the logs say something like Uploading 1 files to Google Cloud Storage and the deploy will be faster. You can change X files, deploy again, and the message will be updated to indicate it is only deploying X files.
However, I'm not sure what it uses to compute the diff. Maybe it compares it to the files currently in your staging bucket and if the files in the staging bucket have been deleted (they have a default life span of 15 days) it will deploy all the files again (but as I said, I'm not sure of this)

Migrating google cloud App Engine Standard zones

I am trying to deploy a Flask application entirely in the free tier of GCP.
I have deployed it on App Engine Standard in the us-west2 zone, and am now getting charged for cloud storage. It turns out cloud storage only has a free tier in the us-east1, us-west1, and us-central1 zones.
I cannot seem to figure out how to migrate or redeploy my app in the us-west1 region. There is plenty of documentation around migrating zones, but none of it seems to apply to App Engine Standard. Does GCP allow migration of App Engine Standard apps, and if so, how can I do so?
Indeed, it is not possible to change the region of an app once it is set; the documentation states that:
You cannot change an app's region after you set it. App Engine Locations.
But it also states that:
Cloud Storage Location
When you create an app, App Engine creates a default bucket in Cloud Storage. Generally, the location of this bucket is the region matching the location of your App Engine app.
Regarding buckets, it seems it is possible to rename them and move them to a different region, so you can try moving your bucket back to a free-tier region and see if that helps with your billing; otherwise, as stated in the previous response, you will basically have to recreate your app from scratch.
Moving and renaming buckets
When you create a bucket, you permanently define its name, its geographic location, and the project it is part of. However, you can effectively move or rename your bucket:
- If there is no data in your old bucket, delete the bucket and create another bucket with a new name, in a new location, or in a new project.
- If you have data in your old bucket, create a new bucket with the desired name, location, and/or project, copy data from the old bucket to the new bucket, and delete the old bucket and its contents. The steps below describe this process.
If you want your new bucket to have the same name as your old bucket, you must temporarily move your data to a bucket with a different name. This lets you delete the original bucket so that you can reuse the bucket name.
Moving data between locations incurs network usage costs. In addition, moving data between buckets may incur retrieval and early deletion fees, if the data being moved are Nearline Storage, Coldline Storage, or Archive Storage objects.
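The copy-then-delete procedure above can be scripted. This is my own sketch, assuming gsutil is installed and authenticated and that the new bucket already exists in the target region; the dry_run flag lets you inspect the commands before running anything destructive:

```python
import subprocess

def move_bucket(old: str, new: str, dry_run: bool = True):
    """Copy everything from gs://old to gs://new, then delete gs://old."""
    cmds = [
        ["gsutil", "-m", "cp", "-r", f"gs://{old}/*", f"gs://{new}/"],
        ["gsutil", "-m", "rm", "-r", f"gs://{old}"],
    ]
    if not dry_run:
        for cmd in cmds:  # copy first; delete only after the copy succeeds
            subprocess.run(cmd, check=True)
    return cmds
```

With dry_run=True (the default) it only returns the command lines, so you can review them before re-running with dry_run=False.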
Regards.
First of all, it's not a zone but a region in Google Cloud semantics. But anyway, I understood.
And you can't change the region of an App Engine app. You need to delete your project and create it again, or create a new project and set the correct region from the start. Don't forget to save your data if you delete your project.
App Engine is a 13+ year old product and Google Cloud didn't think about this migration from the beginning. It's the weight of the legacy!

Multi-region Cloud Storage charges

I've read a few similar posts but haven't really understood why I am getting charged.
I recently (middle of May) switched from Python 2.7 environment with webapp2 to Python 3 with Flask. I switched from using old ndb to the new ndb. I am using Standard (not Flex) This application has very limited use and I thought was limited to a single region. The project size is 176.3 KB
At the end of the May the following was added to my bill:
Standard Storage US Multi-region 0D5D-6E23-4250 0.33 gibibyte month
I'm wondering why I have 'multi-region' anything. I thought I was set up for a single region. I never got billed for anything using Python 2.7 and webapp2 with old ndb. This application is small with limited use. When I delete the region bucket it just comes back when a new version is deployed. I thought there was a 5GB free use limit?
Could you please check the buckets in the Console Storage browser? When you create an app, a bucket named multiregionName.artifacts.projectID.appspot.com is sometimes created as well. This bucket is Multi-region. You can check whether this bucket was created when you upgraded to the new versions. In the overview of the bucket you can see the Creation time, Location-type and Location.
Also, according to GAE Default Cloud Storage bucket:
The Default Cloud Storage bucket has a free quota for daily usage as shown below. You create this free default bucket in the Google Cloud Console App Engine settings page for your project.
The following quotas apply specifically to use of the default bucket. See pricing for Cloud Storage Multi-Regional buckets for a description of these quotas.
In a nutshell, I would check the content of the multi-regional bucket. If it has something in it, investigate the actual usage and decide whether or not to delete anything.
Otherwise, just calculate the estimated cost according to the pricing table and verify that it is correct.
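As a sanity check on the bill line above: 0.33 GiB-months of US multi-region Standard Storage at roughly $0.026 per GiB-month (an assumed list price; verify against the current Cloud Storage pricing page) comes to well under a cent:

```python
# Assumed list price for US multi-region Standard Storage; check the
# current Cloud Storage pricing table before relying on this number.
RATE_USD_PER_GIB_MONTH = 0.026

def storage_cost(gib_months: float,
                 rate: float = RATE_USD_PER_GIB_MONTH) -> float:
    """Estimated charge for the given amount of stored GiB-months."""
    return gib_months * rate

print(storage_cost(0.33))  # the 0.33 GiB-months from the bill above
```

If the charge on your bill is much larger than this kind of estimate, that's the signal to dig into what the bucket actually contains.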
UPDATE
I created a GAE app and it created the following buckets:
joss-xxx.appspot.com
staging.joss-xxx.appspot.com Multi-region
us.artifacts.joss-xxx.appspot.com Multi-region
I tested deleting the buckets as you did and each time I deploy I get those three buckets again. So here the explanation:
joss-xxx.appspot.com is the default bucket for GAE, and has a 5GB free storage quota and is provided for your usage.
About the staging Bucket I found this (1) and this (2) which mentions:
Note: When you create a default bucket, you also get a staging bucket with the same name except that staging. is prepended to it. You can use this staging bucket for temporary files used for staging and test purposes; it also has a 5 GB limit, but it is automatically emptied on a weekly basis.
App Engine also creates a bucket that it uses for temporary storage when it deploys new versions of your app. This bucket, named staging.project-id.appspot.com, is for use by App Engine only. Apps can't interact with this bucket.
The artifacts bucket, {region}.artifacts.{app-id}.appspot.com, is used to build the Docker container that GAE Flexible deploys. Even when I deployed a Standard app, this bucket was created. Deleting this bucket will likely increase your deploy times, as the new deployment cannot build on your previous deployment.
Given the aforementioned, I think the behavior you are facing is totally expected.
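Summarizing the naming pattern described above, the three buckets can be derived mechanically from the project ID (the "us" multi-region prefix on the artifacts bucket is an assumption; yours may differ):

```python
def gae_buckets(project_id: str, artifacts_prefix: str = "us") -> dict:
    """Names of the buckets App Engine creates for a project."""
    return {
        "default": f"{project_id}.appspot.com",
        "staging": f"staging.{project_id}.appspot.com",
        "artifacts": f"{artifacts_prefix}.artifacts.{project_id}.appspot.com",
    }

print(gae_buckets("joss-xxx"))
```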
I've seen the same behavior for my app, I'm not sure if you have reached to any conclusion since you posted this question.
I deployed an app in a specific region but Google creates multi-region buckets automatically. In my case two of them are "multi-region" at least within the region I chose, but the artifacts bucket is "multi-region" in a completely different region, which is the one I'm getting charged for. Either way I'm not going over the free quotas, so I don't understand why they are charging me.
I was thinking of trying a "bucket move": manually recreating a single-region bucket with the same name/properties and transferring the actual content to the new one, but I'm not sure if I'll break something. I'm still learning on these GCP topics.

GAE shutdown or restart all the active instances of a service/app

In my app (Google App Engine Standard Python 2.7) I have some flags in global variables that are initialized (reading values from memcache/Datastore) when the instance starts (at the first request). Those values don't change often, only once a month or in emergencies (i.e. when the App Engine Task Queue or Memcache services are not working well, which has happened no more than twice a year as reported in GC Status but seriously affected my app and my customers: https://status.cloud.google.com/incident/appengine/15024 https://status.cloud.google.com/incident/appengine/17003).
I don't want to store these flags in memcache nor Datastore for efficiency and costs.
I'm looking for a way to send a message to all instances (see my previous post GAE send requests to all active instances ):
As stated in https://cloud.google.com/appengine/docs/standard/python/how-requests-are-routed
Note: Targeting an instance is not supported in services that are configured for auto scaling or basic scaling. The instance ID must be an integer in the range from 0, up to the total number of instances running. Regardless of your scaling type or instance class, it is not possible to send a request to a specific instance without targeting a service or version within that instance.
but another solution could be:
1) Send a shutdown message/command to all instances of my app or a service
2) Send a restart message/command to all instances of my app or service
I use only automatic scaling, so I can't send a request targeted at a specific instance (I can get the list of active instances using the GAE Admin API).
Is there any way to do this programmatically in Python GAE? Manually in the GCP console it's easy when you have a few instances, but for 50+ instances it's a pain...
One possible solution (actually more of a workaround), inspired by your comment on the related post, is to obtain a restart of all instances by re-deployment of the same version of the app code.
Automated deployments are also possible using the Google App Engine Admin API, see Deploying Your Apps with the Admin API:
To deploy a version of your app with the Admin API:
Upload your app's resources to Google Cloud Storage.
Create a configuration file that defines your deployment.
Create and send the HTTP request for deploying your app.
It should be noted that (re)deploying an app version which handles 100% of the traffic can cause errors and traffic loss due to:
overwriting the app files actually being in use (see note in Deploying an app)
not giving GAE enough time to spin up sufficient instances fast enough to handle high income traffic rates (more details here)
Using different app versions for the deployments and gradually migrating traffic to the newly deployed apps can completely eliminate such loss. This might not be relevant in your particular case, since the old app version is already impaired.
Automating traffic migration is also possible, see Migrating and Splitting Traffic with the Admin API.
It's possible to use the Google Cloud API to stop all the instances. They would then be automatically scaled back up to the required level. My first attempt at this would be a process where:
The config item was changed
The current list of instances was enumerated from the API
The instances were shut down over a time period that allows new instances to be spun up to replace them, depending on how time-sensitive the config change is. Perhaps close one instance per 60s.
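The steps above could be sketched like this (my own sketch; delete_instance is a hypothetical wrapper that would call the Admin API DELETE endpoint or shell out to gcloud app instances delete):

```python
import time

def delete_instance(instance_id: str) -> None:
    # Placeholder for the real Admin API call or gcloud invocation.
    print(f"deleting {instance_id}")

def rolling_restart(instance_ids, pause_seconds: float = 60) -> int:
    """Delete instances one at a time, pausing between deletions so
    autoscaling can spin up a replacement before the next one goes down."""
    for n, instance_id in enumerate(instance_ids, start=1):
        delete_instance(instance_id)
        if n < len(instance_ids):
            time.sleep(pause_seconds)
    return len(instance_ids)
```

The pause length is the tuning knob: shorter if the config change is urgent, longer if you want to minimize the chance of serving capacity dipping.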
In terms of using the API you can use the gcloud tool (https://cloud.google.com/sdk/gcloud/reference/app/instances/):
gcloud app instances list
Then delete the instances with:
gcloud app instances delete instanceid --service=s1 --version=v1
There is also a REST API (https://cloud.google.com/appengine/docs/admin-api/reference/rest/v1/apps.services.versions.instances/list):
GET https://appengine.googleapis.com/v1/{parent=apps/*/services/*/versions/*}/instances
DELETE https://appengine.googleapis.com/v1/{name=apps/*/services/*/versions/*/instances/*}
