Can I leverage having my app engine and storage in the same bucket? - google-app-engine

I have an app engine instance which needs to read frequently from my storage (using http). The files on storage are updated very frequently so it's not relevant to deploy them with the app.
I can't see any way to leverage the "proximity" of having the app and its resources in the same project. Is there a way to gain an advantage?
By leverage/advantage I mean saving costs, saving time, etc. Thanks!

When you create an app, App Engine creates a default bucket that provides the first 5 GB of storage for free. The default bucket also includes a free quota for Cloud Storage I/O operations; see Pricing, quotas, and limits for details. You will be charged for storage over the 5 GB limit. You can use this bucket for the read operations your app performs against these resources.
The name of the default bucket is in the following format:
project-id.appspot.com
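As a sketch, reading an object from the default bucket with the google-cloud-storage client might look like the following. The bucket and object names are placeholders, and the actual read requires the google-cloud-storage package plus application credentials:

```python
def default_bucket_name(project_id):
    """App Engine's default bucket follows the project-id.appspot.com pattern."""
    return "%s.appspot.com" % project_id

def read_object(project_id, blob_name):
    # Requires the google-cloud-storage package and credentials;
    # imported lazily so the name helper above works on its own.
    from google.cloud import storage
    client = storage.Client(project=project_id)
    bucket = client.bucket(default_bucket_name(project_id))
    return bucket.blob(blob_name).download_as_bytes()
```

Reads within the default bucket's free operation quota are not billed, which is the main saving available here.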
Also, as DazWilkin mentioned in the comments above, you can try using another Cloud Storage bucket (in the same region as the app) for read operations.

Related

Migrating google cloud App Engine Standard zones

I am trying to deploy a Flask application entirely within the free tier of GCP.
I have deployed it on App Engine Standard in the us-west2 zone, and am now getting charged for cloud storage. It turns out cloud storage only has a free tier in the us-east1, us-west1, and us-central1 zones.
I cannot seem to figure out how to migrate or redeploy my app in the us-west1 region. There is plenty of documentation around migrating zones, but none of it seems to apply to App Engine Standard. Does GCP allow migration of App Engine Standard apps, and if so, how can I do so?
Indeed, it is not possible to change the region of an app once it is set; the documentation states that:
You cannot change an app's region after you set it. App Engine Locations.
But it also states that:
Cloud Storage Location
When you create an app, App Engine creates a default bucket in Cloud Storage. Generally, the location of this bucket is the region matching the location of your App Engine app.
Regarding buckets, it seems possible to rename one and move it to a different region, so you can try moving your bucket back to a free-tier region and see if that helps with your billing. Otherwise, as stated in the previous response, you will have to recreate your app essentially from scratch.
Moving and renaming buckets
When you create a bucket, you permanently define its name, its geographic location, and the project it is part of. However, you can effectively move or rename your bucket:
- If there is no data in your old bucket, delete the bucket and create another bucket with a new name, in a new location, or in a new project.
- If you have data in your old bucket, create a new bucket with the desired name, location, and/or project, copy data from the old bucket to the new bucket, and delete the old bucket and its contents.
If you want your new bucket to have the same name as your old bucket, you must temporarily move your data to a bucket with a different name. This lets you delete the original bucket so that you can reuse the bucket name.
Moving data between locations incurs network usage costs. In addition, moving data between buckets may incur retrieval and early deletion fees, if the data being moved are Nearline Storage, Coldline Storage, or Archive Storage objects.
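The copy-then-delete procedure above can be sketched with the google-cloud-storage client. This is only an illustration of the documented steps, not an official tool; bucket names and location are placeholders, and running it requires credentials (and may incur the network and retrieval charges the docs warn about):

```python
def new_bucket_for_move(old_name, suffix="-tmp"):
    """A temporary bucket name used while freeing up the original name."""
    return old_name + suffix

def move_bucket(client, src_name, dst_name, location):
    # client is a google.cloud.storage.Client with suitable permissions.
    dst = client.create_bucket(dst_name, location=location)
    src = client.bucket(src_name)
    for blob in client.list_blobs(src_name):
        src.copy_blob(blob, dst, blob.name)  # copy each object to the new bucket
    for blob in client.list_blobs(src_name):
        blob.delete()                        # empty the old bucket
    src.delete()                             # buckets can only be deleted when empty
    return dst
```

To reuse the original name, you would run this twice: once into the temporary name, then back into a fresh bucket with the old name in the new location.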
Regards.
First of all, in Google Cloud terminology it's not a zone but a region. But anyway, I understood.
And you can't change the region of an App Engine app. You need to delete your project and create it again, or create a new project and set the correct region from the start. Don't forget to save your data if you delete your project.
App Engine is a 13+ year old product and Google Cloud didn't think about this migration from the beginning. It's the weight of the legacy!

Multi-region Cloud Storage charges

I've read a few similar posts but haven't really understood why I am getting charged.
I recently (middle of May) switched from the Python 2.7 environment with webapp2 to Python 3 with Flask, and from the old ndb to the new ndb. I am using Standard (not Flex). This application has very limited use, and I thought it was limited to a single region. The project size is 176.3 KB.
At the end of the May the following was added to my bill:
Standard Storage US Multi-region 0D5D-6E23-4250: 0.33 gibibyte month
I'm wondering why I have 'multi-region' anything. I thought I was set up for a single region. I never got billed for anything using Python 2.7 and webapp2 with the old ndb. This application is small, with limited use. When I delete the multi-region bucket, it just comes back when a new version is deployed. I thought there was a 5 GB free use limit?
Could you please check the buckets in the Console Storage browser? When you create an app, a bucket named multiregionName.artifacts.projectID.appspot.com is sometimes created as well. This bucket is multi-region. You can check whether it was created along with the new versions you deployed. In the bucket's overview you can see its creation time, location type, and location.
Also, according to GAE Default Cloud Storage bucket:
The Default Cloud Storage bucket has a free quota for daily usage as shown below. You create this free default bucket in the Google Cloud Console App Engine settings page for your project.
The following quotas apply specifically to use of the default bucket. See pricing for Cloud Storage Multi-Regional buckets for a description of these quotas.
In a nutshell, I would check the content of the multi-regional bucket. If it has something, investigate the actual usage and decide whether to delete anything.
Otherwise, just calculate the estimated cost according to the pricing table and verify that it is correct.
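As a rough check, the charge can be reproduced from the usage line on the bill. The rate below is an assumed multi-regional Standard Storage price of $0.026 per GiB-month; verify it against the current pricing table before relying on it:

```python
MULTI_REGION_RATE = 0.026  # USD per GiB-month; assumed rate, check current pricing

def estimated_storage_cost(gib_months, rate=MULTI_REGION_RATE):
    """Storage is billed as usage (GiB-month) times the per-GiB-month rate."""
    return gib_months * rate

# The bill above shows 0.33 gibibyte month of Standard Storage US Multi-region,
# which at the assumed rate comes to well under one cent.
cost = estimated_storage_cost(0.33)
```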
UPDATE
I created a GAE app and it created the following buckets:
joss-xxx.appspot.com
staging.joss-xxx.appspot.com Multi-region
us.artifacts.joss-xxx.appspot.com Multi-region
I tested deleting the buckets as you did, and each time I deploy I get those three buckets again. So here is the explanation:
joss-xxx.appspot.com is the default bucket for GAE; it has a 5 GB free storage quota and is provided for your use.
About the staging bucket, I found this (1) and this (2), which mention:
Note: When you create a default bucket, you also get a staging bucket with the same name except that staging. is prepended to it. You can use this staging bucket for temporary files used for staging and test purposes; it also has a 5 GB limit, but it is automatically emptied on a weekly basis.
App Engine also creates a bucket that it uses for temporary storage when it deploys new versions of your app. This bucket, named staging.project-id.appspot.com, is for use by App Engine only. Apps can't interact with this bucket.
The artifacts bucket, {region}.artifacts.{app-id}.appspot.com, is used to build the Docker container that GAE Flexible deploys; even when I deployed a Standard app, this bucket was created. Deleting this bucket will likely increase your deploy times, as the new deployment cannot build on your previous deployment.
Given the aforementioned, I think the behavior you are facing is totally expected.
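The three auto-created names can be summarized in a small helper. The pattern is inferred from the listing above; the artifacts prefix varies by app (e.g. us):

```python
def default_gae_buckets(project_id, artifacts_prefix="us"):
    """Buckets App Engine typically creates on deploy, per the listing above."""
    return [
        "%s.appspot.com" % project_id,           # default bucket, 5 GB free quota
        "staging.%s.appspot.com" % project_id,   # deploy-time staging, GAE-managed
        "%s.artifacts.%s.appspot.com" % (artifacts_prefix, project_id),  # build images
    ]
```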
I've seen the same behavior for my app; I'm not sure if you have reached any conclusion since you posted this question.
I deployed an app in a specific region, but Google creates multi-region buckets automatically. In my case, two of them are "multi-region" at least in the same region I specified, but for the artifacts I get a "multi-region" bucket in a completely different region, which is the one I'm getting charged by. Either way, I'm not going over the free quotas, so I don't understand what I'm being charged for.
I was thinking of trying a "bucket move", manually recreating a single-region bucket with the same name/properties and transferring the actual content to the new one, but I'm not sure if I'll break something; I'm still learning these GCP topics.

How can I enforce rate limit for users downloading from Google Cloud Storage bucket?

I am implementing a dictionary website using App Engine and Cloud Storage. App Engine controls the backend, like user authentication etc., and Cloud Storage is used to store a JSON file for each dictionary entry.
I would like to rate limit how much a user can download in a given time period so they can't bulk download the JSON files and result in a big charge for me. Ideally, the dictionary would display a captcha if a user downloads too much at once, and allow them to keep downloading if they pass the captcha. What is the best way to achieve this?
Is there a specific service for rate limiting based on IP address or authenticated user? Should I do this through App Engine and only access Cloud Storage through App Engine (perhaps slower since it's using some of my dynamic resources to serve static content)? Or is it possible to have the frontend access Cloud Storage and implement the rate limiting on Cloud Storage directly? Is a Cloud bucket the right service for storage, here? And how can I allow search engine indexing bots to bypass the rate limiting?
As explained by Doug Stevenson in this post
"There is no configuration for limiting the volume of downloads for
files stored in Cloud Storage."
and explaining further:
"If you want to limit what end users can do, you will need to route
them through some middleware component that you build that tracks how
they're using your provided API to download files, and restrict what
they can do based on their prior behavior. This is obviously
nontrivial to implement, but it's possible."
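A minimal sketch of such a middleware component: a fixed-window counter per user (or IP) that the App Engine handler consults before serving a file. All names are illustrative, and a production version would need shared storage (Memorystore, Datastore) rather than in-process memory, since App Engine runs multiple instances:

```python
import time
from collections import defaultdict

class RateLimiter:
    """Fixed-window limiter: allow at most `limit` downloads per `window` seconds."""

    def __init__(self, limit=100, window=3600):
        self.limit = limit
        self.window = window
        self._counts = defaultdict(int)
        self._window_start = defaultdict(float)

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        if now - self._window_start[user_id] >= self.window:
            self._window_start[user_id] = now  # start a new window
            self._counts[user_id] = 0
        if self._counts[user_id] >= self.limit:
            return False  # over quota: serve a captcha instead of the file
        self._counts[user_id] += 1
        return True
```

On a pass, the handler streams the JSON entry from Cloud Storage; on a fail, it returns the captcha challenge and resets the user's counter once it is solved. Known search-engine bots could be exempted by verifying their IPs before applying the limiter.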

Bucket of Staging files after deploying an app engine

After deploying a google app engine, at least 4 buckets are created in the google cloud storage:
[project-id].appspot.com
staging.[project-id].appspot.com
artifacts.[project-id].appspot.com
vm-containers.[project-id].appspot.com
What are they, and will they incur storage cost? Can they be safely deleted?
I believe the "artifacts" bucket is what they're referring to here. A key point is the following:
Once deployment is complete, App Engine no longer needs the container images. Note that they are not automatically deleted, so to avoid reaching your storage quota, you can safely delete any images you don't need.
I discovered this after (to my great surprise) Google started charging me money every month. I saw that the "artifacts" bucket had a directory named "images". (I naively thought that it had something to do with graphics or photographs, which was quite mysterious as my app doesn't do anything with graphics.)
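A hedged sketch of that cleanup with the google-cloud-storage client. The bucket name pattern is the one shown in the console (the prefix varies), and, per the quote above, only delete images you no longer need; the next deployment will have to rebuild from scratch:

```python
def artifacts_bucket_name(project_id, prefix="us"):
    """Container-image bucket name as it appears in the console (prefix varies)."""
    return "%s.artifacts.%s.appspot.com" % (prefix, project_id)

def delete_build_images(project_id, prefix="us"):
    # Requires google-cloud-storage and credentials; imported lazily
    # so the name helper above works on its own.
    from google.cloud import storage
    client = storage.Client(project=project_id)
    for blob in client.list_blobs(artifacts_bucket_name(project_id, prefix)):
        blob.delete()  # removes cached build layers; next deploy rebuilds them
```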
Staging buckets are described in the App Engine's documentation when Setting Up Google Cloud Storage.
I am quoting relevant information here for future viewers:
Note: When you create a default bucket, you also get a staging bucket
with the same name except that staging. is prepended to it. You can
use this staging bucket for temporary files used for staging and test
purposes; it also has a 5 GB limit, but it is automatically emptied on
a weekly basis.
So in essence, when you create either an App Engine Standard or Flexible app, you get these two buckets. You can delete the buckets (I deleted the staging one), and I was able to recover it by running gcloud beta app repair.
They are not mandatory for a GAE app - one has to explicitly enable GCS for a GAE app for some of these to be created.
At least a while back, only the first two were created by default (for a standard environment Python app) when GCS was enabled, and they are empty by default.
It is possible that the others are created by default as well these days, I'm not sure. But they could also be created by and used for something specific you're doing in/for your app - only you can tell that.
You can check what's in them via the Storage menu in the developer console. That might give a hint as to their usage. For my apps which have such buckets created, they're empty.
From Default Google Cloud Storage bucket:
Applications can use a Default Google Cloud Storage bucket, which has
free quota and doesn't require billing to be enabled for the app. You
create this free default bucket in the Google Cloud Platform Console
App Engine settings page for your project.
The free quota is 5 GB, so as long as you don't reach that you're OK.
Now there is the matter of one bucket mentioned in the docs vs. the multiple ones actually seen - debatable; I'm not sure what to suggest.
In short, I'd check the content of these buckets. If they're not empty, I'd check the estimated costs for any indication that the free 5 GB quota might not apply to them. If that's the case, I'd investigate the actual usage and decide whether to delete something.
Otherwise I'd just leave them be.
An update on what staging is for (at least in Python GAE Standard):
https://cloud.google.com/appengine/docs/standard/python3/using-cloud-storage
App Engine also creates a bucket that it uses for temporary storage when it deploys new versions of your app. This bucket, named staging.project-id.appspot.com, is for use by App Engine only. Apps can't interact with this bucket.
Still can't figure out what artifacts is for.
The files in the artifacts.[project-id].appspot.com bucket are created by Google Container Registry.
WARNING: Deleting them will cause you to lose access to your container registry.

Allowing an authenticated user to download a big object stored on Google Storage

I have some big files stored on Google Storage. I would like users to be able to download them only when they are authenticated to my GAE application. The user would use a link of my GAE such as http://myapp.appspot.com/files/hugefile.bin
My first try works for files whose sizes are < 32 MB. Using the experimental Google Storage API, I could read the file first and then serve it to the user. It required my GAE application to be a team member of the project in which Google Storage was enabled. Unfortunately this doesn't work for large files, and it hogs bandwidth by first downloading the file to GAE and then serving it to the user.
Does anyone have an idea of how to accomplish this?
You can store files up to 5GB in size using the Blobstore API: http://code.google.com/appengine/docs/python/blobstore/overview.html
Here's the Stackoverflow thread on this: Upload file bigger than 40MB to Google App Engine?
One thing to note is that the blobstore can only be read in 32 MB increments, but the API provides ways to access portions of the file for reads: http://code.google.com/appengine/docs/python/blobstore/overview.html#Serving_a_Blob
FYI, in the upcoming 1.6.4 release of App Engine we've added the ability to pass a Google Storage object name to blobstore.send_blob() to send Google Storage files of any size from your App Engine application.
Here is the pre-release announcement for 1.6.4.
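That 1.6.4 feature can be sketched as follows (legacy Python 2 first-generation runtime; the handler only runs inside App Engine, and the bucket/object names are placeholders):

```python
def gs_blobstore_path(bucket, obj):
    """Blobstore addresses Cloud Storage objects as /gs/<bucket>/<object>."""
    return "/gs/%s/%s" % (bucket, obj)

try:
    # Only available inside the App Engine first-generation runtime.
    from google.appengine.ext import blobstore
    from google.appengine.ext.webapp import blobstore_handlers

    class ServeHandler(blobstore_handlers.BlobstoreDownloadHandler):
        def get(self):
            # create_gs_key turns a /gs/... path into a key that send_blob
            # accepts; send_blob streams the object to the client without
            # loading it into the app, avoiding the 32 MB read limit.
            key = blobstore.create_gs_key(
                gs_blobstore_path("my-bucket", "hugefile.bin"))
            self.send_blob(key)
except ImportError:
    pass  # not running on App Engine
```

Authentication stays in the GAE handler (check the user before calling send_blob), while the bytes are served directly from Google Storage.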
