We are storing images in Google Cloud Storage. We generated a link using the Images service getServingUrl(). The link worked for a few hours and then stopped working. We've had reports that it is still accessible in the US, but not in the UK.
Here's the link: http://lh3.googleusercontent.com/HkwzeUinidFxi-z6OO4ANasEuVYZYduhMiPG2SraKstC5Val0xGdTqvePNMr_bs7FLvj1oNzZjSWZe4dKcugaZ5hzaqfWlw=s36
Is anybody else experiencing this problem? If so, has anyone filed a ticket with Google to investigate?
This has been known behavior for years. getServingUrl() generates a link to a CDN that is not guaranteed to last forever.
You either have to regenerate the link on every request (or periodically), or use another solution.
We ended up moving our images to Amazon S3 + CloudFront. You could also consider https://cloud.google.com/storage/ and https://cloud.google.com/cdn/
The serving URLs do not expire. I have a website with roughly 500k images that has been using the same image URLs for about 4 years now. All image links are still functioning, and they were generated only once (not on every request).
Related
I'm having issues with a Next.js project hosted on a Vercel Pro plan. Routes that are set up with ISR just keep timing out. I currently have revalidate set to 120s, but I've also tried 60s, 30s, and 6000s in case I was calling it too often.
I'm using Strapi as the headless CMS that serves the API Next.js builds pages from; it's deployed in Render's German region. Strapi's database is a MongoDB database hosted on MongoDB Atlas in MongoDB's Ireland AWS region. (I've also set the Serverless Functions on Vercel to be served from London, UK, but I'm not sure whether that affects ISR.)
There are 3 different dynamic routes with about 20 pages each, and at build time they average 6999ms, 6508ms, and 6174ms respectively. Yet at run time, if I update some content in the Strapi CMS and wait the 120s I've set for revalidate, the page hardly ever gets rebuilt. If I look at the Vercel dashboard's "Functions" tab, which shows realtime logs, I see that many of the pages have attempted to rebuild all at the same time and are all hitting the 60s time-out limit.
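For reference, each of these routes is set up roughly like this (a simplified sketch: the Strapi endpoint, response shape, and field names are placeholders rather than my exact code):

```tsx
// pages/[slug].tsx -- simplified sketch of one dynamic ISR route
// (the Strapi endpoint and response shape below are placeholders)
import type { GetStaticPaths, GetStaticProps } from "next";

const STRAPI_URL = process.env.STRAPI_API_URL; // the Render-hosted Strapi instance

export const getStaticPaths: GetStaticPaths = async () => {
  const res = await fetch(`${STRAPI_URL}/api/pages`);
  const { data } = await res.json();
  return {
    // Pre-render the ~20 known pages; unknown slugs render on demand
    paths: data.map((page: { attributes: { slug: string } }) => ({
      params: { slug: page.attributes.slug },
    })),
    fallback: "blocking",
  };
};

export const getStaticProps: GetStaticProps = async ({ params }) => {
  const res = await fetch(
    `${STRAPI_URL}/api/pages?filters[slug][$eq]=${params?.slug}&populate=*`
  );
  const { data } = await res.json();
  return {
    props: { page: data[0] },
    revalidate: 120, // regenerate in the background at most once every 120s
  };
};

export default function Page({ page }: { page: unknown }) {
  return <pre>{JSON.stringify(page, null, 2)}</pre>;
}
```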
I also have the Vercel logs being sent to LogTail, and if I filter the logs for the name of the page I've edited, I can see that it returns a 304 status code before 120s has passed, as expected, but after 120s it tries to fetch and build the new page and nearly always returns the time-out error.
So my first question is: why are so many pages trying to rebuild at the same time when nothing has changed in the CMS for any of them except the one I deliberately changed? And secondly, why does a page take only ~6000ms to build at build time, yet hit the 60s time-out limit during ISR?
Could it be that so many rebuilds are being triggered that they all end up causing each other to time out? If so, how do I tackle that first issue?
Here is a screenshot of my Vercel realtime logs. As you can see, many of the pages are trying to rebuild at once, but I'm not sure why; I've only changed the data for one page in this instance.
To try and debug the issue, I created a Postman Flow for building one of the dynamic routes and added up the time for each API call needed to build the page; I get 7621ms on average over a few tests. Here is a screenshot of the Postman console:
I'm not that experienced with Next.js ISR, so I'm hoping I'm just doing something wrong or have a setting misconfigured on Vercel, but after looking on Stack Overflow and other websites I believe I'm using ISR as expected. If anybody has any ideas or advice about what might be going on, I'd very much appreciate it.
I've recently been using FilePond for an enterprise web application that allows end users to upload up to 1,500 images (medium size, avg 200 KB, max 500 KB).
There is very little backend processing once an image is uploaded, other than its temporary storage in a database; we later pick the files up from that temporary storage and process them asynchronously. The current challenge is that the browser serializing the requests stretches the upload to as long as 2 hours. We've been able to bring this down to about 1 hour by increasing FilePond's maximum parallel uploads, but that is still far from acceptable (the target is 20 minutes), and we still see the serialization in Chrome DevTools with this volume of images.
With this in mind, I'm currently looking for a FilePond plugin that zips the dropped files and uploads a single archive to the backend, without the user having to do that themselves. I couldn't find anything related on FilePond's plugins page, and most of the plugins listed there seem to be about image transformation. Hopefully the jszip library could do the trick. Am I on the right track? Any further suggestions?
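To make the idea concrete, here is the kind of thing I have in mind (only a sketch, assuming JSZip plus FilePond with instantUpload disabled; the /upload-archive endpoint is hypothetical):

```typescript
// Sketch only: zip all dropped files client-side and upload one archive.
// Assumes FilePond with instantUpload disabled and a hypothetical
// /upload-archive endpoint that unpacks the zip server-side.
import * as FilePond from "filepond";
import JSZip from "jszip";

const pond = FilePond.create(document.querySelector("#uploader")!, {
  allowMultiple: true,
  instantUpload: false, // we upload manually, not per file
});

async function uploadAsArchive(): Promise<void> {
  const zip = new JSZip();

  // Add every file currently in the pond to the archive
  for (const item of pond.getFiles()) {
    zip.file(item.filename, item.file);
  }

  // Generate the archive in the browser. Already-compressed images won't
  // shrink much, but one request replaces ~1,500 of them.
  const archive = await zip.generateAsync({
    type: "blob",
    compression: "DEFLATE",
  });

  const body = new FormData();
  body.append("archive", archive, "images.zip");

  await fetch("/upload-archive", { method: "POST", body });
}

document.querySelector("#upload-button")!.addEventListener("click", () => {
  void uploadAsArchive();
});
```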
Other things on our team's radar:
creating multiple DNS endpoints to increase the number of parallel requests the browser will make;
researching alternative CDN services.
Thanks a bunch!
I'm running PageSpeed Insights on my website, and one big error I sometimes get is:
Reduce initial server response time
Keep the server response time for the main document short because all other requests depend on it.
React: If you are server-side rendering any React components, consider using renderToNodeStream() or renderToStaticNodeStream() to allow the client to receive and hydrate different parts of the markup instead of all at once.
I looked up renderToNodeStream() and renderToStaticNodeStream(), but I didn't really understand how they could be used with Gatsby.
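From what I can tell, those streaming APIs are meant for a custom Node server rather than something Gatsby exposes directly (Gatsby renders pages to static HTML at build time). A rough sketch of how renderToNodeStream() is normally used, with a hypothetical App root component, looks like this:

```typescript
// Rough sketch of renderToNodeStream() on a plain Express server.
// "App" is a hypothetical root component; Gatsby does not expose this hook.
import express from "express";
import React from "react";
import { renderToNodeStream } from "react-dom/server";
import App from "./App";

const app = express();

app.get("*", (req, res) => {
  res.write('<!DOCTYPE html><html><body><div id="root">');
  const stream = renderToNodeStream(React.createElement(App));
  stream.pipe(res, { end: false }); // flush markup chunks as they render
  stream.on("end", () => {
    res.write("</div></body></html>");
    res.end();
  });
});

app.listen(3000);
```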
It looks like a problem others are having as well.
The domain is https://suddenlysask.com if you want to look at it
My DNS records
Use a CNAME record on a non-apex domain. By using the bare/apex domain you bypass the CDN and force all requests through the load balancer. That means you end up with a single IP address serving all requests (fewer simultaneous connections), the server proxying the content without caching, and a likely greater distance to the user.
EDIT: Also, your HTML file is over 300KB. That's obscene. It looks like you're including Bootstrap in it twice, you're repeating the same inline <style> tags over and over with slightly different selector hashes, and you have a ton of (unused) utility classes. You only want to inline critical CSS if possible; serve the rest from an external file if you can't treeshake it.
Well, the behavior is unexpected. I ran PageSpeed Insights against your site and it gave me a warning on the first test, with an initial response time of 0.74 seconds. Then I used my developer tools to look at the initial response time for the document root, which was roughly between 300 and 400ms. So I ran the PageSpeed test again and the response was 140ms, and the test passed. After that it was 120ms.
See the attached image.
I really don't think there is a problem with the site. Still, if you want to try something, I would recommend changing the server or hosting and going with something different. I don't know what kind of server the site is deployed on right now. You could try AWS S3 and CloudFront; that works well for me.
I would like to know how CDNs serve private data such as images and videos. I came across this Stack Overflow answer, but it seems to be specific to Amazon CloudFront.
As a popular example, let's say the problem in question is serving content inside Facebook: there is access-controlled content at the individual-user level and at the group level, as well as some publicly accessible data.
All the logic of what can be served to whom resides on the application server!
The first request to the CDN goes to the application server and gets validated for access rights. But there is a catch to keep in mind:
Assume that first request is successful; after that, anyone will be able to access the image with that CDN URL. I tested this with a restricted, user-uploaded Facebook image, and it was accessible via the CDN URL by others too, even after I logged out. So the image remains accessible until the CDN cache expires.
I believe this approach should work: all requests first come to the main application server. After determining whether access is allowed, it either redirects to the CDN URL or returns an access-denied error.
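A minimal sketch of that flow, assuming an Express app, a made-up CDN hostname, and a hypothetical canUserAccess() check:

```typescript
// Sketch: the application server decides, the CDN serves.
// canUserAccess() and cdn.example.com are placeholders.
import express from "express";

const app = express();
const CDN_BASE = "https://cdn.example.com";

// Hypothetical authorization check -- replace with real logic
async function canUserAccess(
  userId: string | undefined,
  assetId: string
): Promise<boolean> {
  return userId !== undefined; // placeholder rule
}

app.get("/media/:assetId", async (req, res) => {
  const userId = req.get("x-user-id"); // however your auth layer identifies the user
  const allowed = await canUserAccess(userId, req.params.assetId);

  if (!allowed) {
    res.status(403).send("Access denied");
    return;
  }

  // From here the asset is served (and cached) by the CDN
  res.redirect(302, `${CDN_BASE}/media/${req.params.assetId}`);
});

app.listen(8080);
```

Keep in mind the catch described above: once the object is cached at the edge, that CDN URL remains reachable by anyone until the cache expires.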
Each CDN works differently, so unless you specify which CDN you are looking at, it's hard to say more.
The current procedure to serve an image is as follows:
Store the image in Google Cloud Storage
Get blob_key: google.appengine.ext.blobstore.create_gs_key(filename)
Get url: google.appengine.api.images.get_serving_url(blob_key,size=250,secure_url=True)
To remove the image, after retrieving the blob_key:
Delete serving url:
google.appengine.api.images.delete_serving_url(blob_key)
Delete the Google Cloud Storage file: cloudstorage.delete(filename)
Issue
The issue is that the URL keeps serving for an indeterminate amount of time, even though the underlying image no longer exists in Google Cloud Storage. Most of the time the URL returns a 404 within ~24 hours, but I have also seen one image still serving now (~2 weeks later).
What are the expectations about the promptness of the delete_serving_url call? Are there any alternatives that delete the URL faster?
I can address one of your two questions. Unfortunately, it's the less helpful one. :/
What are the expectations about the promptness of the delete_serving_url call?
Looking at the Java documentation for getServingUrl, it clearly spells out that you should expect this to take up to 24 hours, as you observed. I'm not sure why the Python documentation leaves this point out.
If you wish to stop serving the URL, delete the underlying blob key. This takes up to 24 hours to take effect.
The documentation doesn't explain why one of your images would still be serving after 2 weeks.
It is also interesting to note that the documentation doesn't reference deleteServingUrl as part of the process for stopping serving a blob. That suggests to me that step (1) of your process to "delete the image" is unnecessary.