Google Cloud Storage - Public object URL is super slow to update - google-app-engine

I have a bucket with READ permission granted to allUsers, and it's working fine, but the public URL https://storage.googleapis.com/example_bucket/example.png takes ages to update. If I replace the image in storage with a different one of the same name, the bucket view shows the correct image, and so does the non-public (authenticated) URL https://storage.cloud.google.com/example_bucket/example.png. The public URL, however, keeps showing the old image and takes a long time to update. Could someone explain whether this is normal or whether I'm doing something wrong?

You can set Cache-Control when you upload the object.
Using gsutil:
Uploading:
gsutil -D -h "Cache-Control:private, max-age=0, no-transform" cp file gs://BUCKET/file
Editing (gsutil setmeta):
gsutil setmeta -h "Cache-Control:private, max-age=0, no-transform" gs://BUCKET/file
Or through the console, by editing the object's metadata.
Currently, there is no way to set a default Cache-Control for the bucket.
You might also want to take a look at the Viewing / Editing Metadata documentation.
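The same metadata change can probably also be made from Python. The following is a minimal sketch, assuming the google-cloud-storage client library is installed and application default credentials are configured; the helper name set_cache_control and the main wrapper are illustrative, not part of the original answer.

```python
def set_cache_control(blob, value="private, max-age=0, no-transform"):
    """Set Cache-Control on an existing object and persist the change.

    `blob` is expected to behave like google.cloud.storage.Blob: it has a
    writable `cache_control` attribute and a `patch()` method.
    """
    blob.cache_control = value
    blob.patch()  # sends the changed metadata to the service
    return blob.cache_control


def main(bucket_name, object_name):
    # Imported here so the helper above can be tried without the library.
    from google.cloud import storage

    client = storage.Client()  # assumption: default credentials are available
    blob = client.bucket(bucket_name).blob(object_name)
    print(set_cache_control(blob))
```

Calling main("BUCKET", "file") would be the rough equivalent of the setmeta command above.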

Objects created with READ permission for allUsers are served by default with Cache-Control: public, max-age=3600. With that header in place, updates to the object may not be reflected in caches for up to an hour.

I ran into this same issue while serving profile images for my users. I fixed it by appending ?ignoreCache=1 to the public URL.
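A variation on this idea is to append a value that changes whenever the object changes, so each new upload gets a URL that caches have not seen yet. Here is a minimal sketch; the parameter name "v" and the helper with_cache_buster are hypothetical, and it assumes the caches in the path treat URLs with different query strings as different entries.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_cache_buster(url, version, param="v"):
    """Return `url` with `param=version` set, replacing any earlier value."""
    parts = urlsplit(url)
    # Keep every existing query parameter except a previous cache buster.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, str(version)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_cache_buster(
    "https://storage.googleapis.com/example_bucket/example.png", 2))
# prints https://storage.googleapis.com/example_bucket/example.png?v=2
```

The version could be an upload timestamp or a counter stored alongside the user's profile.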

Related

React App url parameter with S3 and CloudFront

My apologies if the information I have provided is vague, as I am not very experienced with AWS and React.
I have a React application deployed on S3 and CloudFront, following what is suggested in this link:
Use S3 and CloudFront to host Static Single Page Apps (SPAs) with HTTPs and www-redirects
Most things are working fine. I have 403 and 404 errors redirected to index.html. The issue arises when I have query parameters in my URL, e.g. https://example.com/example?sample=123. When I enter that URL in my browser, the query string gets removed, and I end up at https://example.com/example. I have read some articles about forwarding query parameters, but it's not working for me.
AWS Documentation - Query String Parameters
I hope I will be able to get some advice here. Thanks in advance.
example?sample=123 is redirected to example because S3 treats example?sample=123 as a path (a folder named example?sample=123) and returns a 404, since no such folder exists.
As you mentioned, you have configured 404 -> index.html, so the browser then goes back to example, which is very likely the default page of your React app.
It looks as though your query string is being cleared, but it is actually lost during the redirection.
The solution includes three parts:
React
You can follow these two great tutorials, one for Next.js and another for CRA (Create React App).
The approach is to detect #! in the path, then keep and restore the query string after the redirection.
S3
As described in the two links above, you have to set a redirection rule on the S3 bucket that adds a #!/ prefix before the path on 403 or 404. This helps React determine which part of the URL contains the query string. You can configure it in Properties -> Static website hosting -> Redirection rules – optional. You also need to set index.html as the Index document and enable static website hosting with the correct permissions configured.
CloudFront
In General, set Default Root Object to index.html; make sure you don't enter it as /index.html.
In Origin, set Origin domain to the S3 static website hosting URL (http://[bucket-name].s3-website.[region].amazonaws.com); do not choose the bucket itself.
In Behavior, change Viewer protocol policy to Redirect HTTP to HTTPS, and set Origin request policy - optional to AllViewer to let all query strings go through.
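For the S3 part, the redirection rules are entered as JSON in the console. The sketch below generates one plausible rule set in Python; the host name example.com is taken from the question and stands in for your CloudFront domain, and the helper name spa_routing_rules is hypothetical. Treat the output as a starting point to compare against the tutorials, not as the exact rules they use.

```python
import json


def spa_routing_rules(host_name, error_codes=("403", "404")):
    """Build S3 website redirection rules that prefix the path with #!/."""
    return [
        {
            # Apply the rule only when S3 would have returned this error.
            "Condition": {"HttpErrorCodeReturnedEquals": code},
            "Redirect": {
                "HostName": host_name,           # assumption: the site's public domain
                "Protocol": "https",
                "ReplaceKeyPrefixWith": "#!/",   # /example becomes /#!/example
            },
        }
        for code in error_codes
    ]


print(json.dumps(spa_routing_rules("example.com"), indent=2))
```

The printed JSON can be pasted into the Redirection rules field of the bucket's static website hosting settings.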
Hope it helps.

Using AWS S3 and CloudFront to host a React application. Can anyone suggest a configuration to access dynamic URLs?

For example, the website URL is https://www.myreactapp.com. It has some other pages with dynamic GET parameters:
https://www.myreactapp.com/category/1
https://www.myreactapp.com/category/2
It gives me an Access Denied error.
I had the same issue when trying to access content at run time using AJAX.
Set the S3 bucket access to "Objects can be public". There is no need to set "Public" access for static website hosting.
Use an S3 origin if you want CloudFront to deliver objects that are placed in the S3 bucket. If you generate content at run time, it's better to use a custom origin.
For a custom origin, note the website endpoints: https://docs.aws.amazon.com/general/latest/gr/s3.html#s3_website_region_endpoints
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/DownloadDistS3AndCustomOrigins.html
The solution for the OP was to change the origin from the S3 origin to a custom origin domain. This produced the expected behaviour.
We validated that the bucket was public; the 403 error was caused by the key not existing.

S3 signed URLs change every time on load even though keys are the same; no caching

I need to cache image URLs in the browser after the first load, so that on every refresh the same files are not fetched again but are served from the browser cache instead. The blocker is that an S3 signed URL contains 'date' and 'signature' parameters that change on every request, so the browser cannot cache it. Is there a workaround for this?
Sample URL:
https://bucket.region-name.amazonaws.com/67/13/14/design1.png_1547473003445/V0/thumbnail/design1.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIATY536SGM6CWFJ%2F20190116%2Fap-south-1%2Fs3%2Faws4_request&X-Amz-Date=20190116T093449Z&X-Amz-Expires=900&X-Amz-Security-Token=FQoGZXIvYXdz%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDOQ2d%2FbrCADhrzi5LSKsAVrViSpTmyeTwlv8mgStqYJquyL2u4i3zqSOAFRE8fbHy7EbxH5yAmmWM94clRMm9to9LJDaxP96tAM4Za%2BFSzfr3fBTpHy%2Fq8N8fMT4%2FLv3Q5oX1k%2Fj9meYHpcH539LOLu8LmRuXGrPlbuHb7l4z7ZAWFB5MvootGvp0pfcEh6BqXr9R0iygJq3LWwoBhr5A9dRqSsLfWx9KTRTFi9KBkI%2FYtZCjEejdaVsExooufX74QU%3D&X-Amz-Signature=314c2ab2be7db4a90a3b20a79e2b53e8b915ca612ad3c6794136a4dac0fe6119&X-Amz-SignedHeaders=host
Refreshed one:
https://bucket.region-name.amazonaws.com/67/13/14/design1.png_1547473003445/V0/thumbnail/design1.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA536UOMZINFU%2F20190116%2Fap-south-1%2Fs3%2Faws4_request&X-Amz-Date=20190116T093659Z&X-Amz-Expires=900&X-Amz-Security-Token=FQoGZXIvYXdz%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaDF4rMeLkeIEnbQWu6CKsAWTaVEcUia6oXeuaDObUF5Cirhzko1le9KGQfPKs5ZIwWk6o0qzHIzCe9uYdfBSmanXrfDxRHK33zbccphSwQkPI8mp%2Fl%2FGljXZALFDKZdRny4DnF2MXfy5WiFKBVSYHZ5onNZxSA4VrgNHeYbe6drI6QwMR9cHij13D8RK2XYDlmM6oVaCjGdMgL4QdpdHangaV0ZEq2GfOAYFfIps9nCM0WCH3Z3%2BpJFVfF9kouvb74QU%3D&X-Amz-Signature=5c5dead27c779299f5aef84e60e7c88a13f4cada9baa77e74e62b51caa1c0099&X-Amz-SignedHeaders=host
Presigned URLs are used to grant temporary permission to access an object in a private bucket. You could opt for a public bucket instead, or put an authentication endpoint in front of the API.

Google Edge Cache : is it compatible with HTTPS?

After some configuration:
setting the Cache-Control response header,
deploying the app with a custom domain name,
I managed to leverage the server-side Edge Cache of the Google Front End for some HTTP traffic on a sample app.
The cache hits appear in the Logs console as 204, while non-cached responses are 200.
My question is: can I expect the same behavior for a company website that enforces HTTPS?
I guess it depends on how the distributed architecture of Google's data centers works and where the SSL certificates are stored, but my networking/security skills are limited.
I can confirm that edge caching also works for server-side requests served over HTTPS, although I have no insight into how it works inside the GFE.
I just ran a quick query on the logs of one of our apps, filtered to status:204 and restricted to a specific servlet (so that static content is excluded).
I do not think there is a way to see in the logs whether a request was served over HTTPS, or to filter on it, but I manually verified that some of these hits were served over HTTPS.
As you mentioned, cache control headers are required to get this working. Here are the cache control headers we set:
Cache-Control: public, max-age=3600
Pragma: Public
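To illustrate how such headers might be attached to a response, here is a minimal WSGI sketch in Python. The answer above mentions a servlet, so this is not the answerer's code; the app function and response body are hypothetical, and only the two header values come from the answer.

```python
def app(environ, start_response):
    """Tiny WSGI app that marks its response as publicly cacheable."""
    body = b"cacheable response\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Header values taken from the answer above.
        ("Cache-Control", "public, max-age=3600"),
        ("Pragma", "Public"),
    ]
    start_response("200 OK", headers)
    return [body]
```

Any WSGI-compatible server can host this function; whether the edge cache then serves the response still depends on the conditions described in the thread.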

Getting mp3 data from google app engine's blobstore via ajax

I'm trying to store audio files in google app engine's blobstore and play them in a browser. The problem I'm running into is that the data I'm getting in the browser is the actual mp3 data. I was expecting to get a url to play the mp3 in the blobstore. So, my question is, what do I need to change to get a url to play the blob instead of the audio data?
Here is my server side handler.
class ServeBlobHandler(blobstore_handlers.BlobstoreDownloadHandler):
    def get(self):
        user = users.get_current_user()
        query = db.GqlQuery("SELECT * FROM AudioData WHERE userId = :1", user.user_id())
        results = query.fetch(limit=300)
        for dStoreEntry in results:
            entityBlobInfo = dStoreEntry.audioBlob
            self.send_blob(entityBlobInfo)
This is the client side.
$.ajax({
    url: '/serve_blob/audio/',
    type: 'GET',
    dataType: 'text',
    success: function(data) {
        alert('GET, audio data : \n ' + data);
    }
});
The URL of the page that you're currently fetching the data from is the URL of the MP3. You'll need to use a web-based player of some sort to play it.
What Content-Type header does your browser get for the mp3 request? I'm guessing it's application/octet-stream.
See what the Blobstore docs say about uploads:
If you don't specify a content type, the Blobstore will try to infer it from the file extension. If no content type can be determined, the newly created blob is assigned content type application/octet-stream.
Go to the GAE admin pages and check the Blob Viewer to see what content type was assigned to your mp3 files.
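The inference rule quoted above can be sketched with Python's standard mimetypes module. This only illustrates extension-based guessing with the same fallback; it is not Blobstore's actual implementation, and the helper name infer_content_type is hypothetical.

```python
import mimetypes


def infer_content_type(filename, default="application/octet-stream"):
    """Guess a content type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


for name in ("song.mp3", "upload"):
    print(name, "->", infer_content_type(name))
```

A file uploaded without an extension falls through to the generic type, which is one way an mp3 could end up stored as application/octet-stream.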
Get JPlayer - http://www.jplayer.org/
And then your example should work fine. We use it with appengine blobstore in java and it's great. The url from the blobstore will work in jplayer.
You can also set cache headers on your blob urls if you rewrite them to remove any query parameters and save yourself the costs of serving each stream.
