I have a website developed using Node.js and React server-side rendering.
When I try to scrape my website with the Facebook debugger, Facebook takes more than 10 seconds and times out. However, I noticed that my web server responds to the Facebook scraper's request within a few milliseconds.
Page performance is also fine; pages are served in under 2 seconds at most.
Note: the FB debugger can scrape my website's homepage (which is a static file) without any issues. I'm not sure what is causing the Facebook debugger to time out.
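One way to narrow this down is to replay the crawler's request yourself and compare the static homepage against an SSR route. A rough sketch using Python's requests package; the URL is a placeholder, and the user-agent string is the one the Facebook crawler is known to send:

    import requests

    # Pretend to be the Facebook crawler and time the response ourselves.
    resp = requests.get(
        "https://example.com/some-ssr-page",  # placeholder: one of your SSR routes
        headers={
            "User-Agent": "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
        },
        timeout=10,
    )
    print(resp.status_code, resp.elapsed.total_seconds())

If this hangs or errors only for certain paths or user agents, the problem is on the server side rather than with Facebook.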
Any ideas?
I had a path that was blocking the request; fixing that resolved the issue.
@souhailhimself, thank you for pointing me in the right direction.
Related
Basically, I am facing a problem with my Django and Selenium website.
My Django view function contains Selenium code that scrapes details sent by my front end.
It works perfectly fine on my PC. The scraper must run until the end of the page, and it usually takes hours to get there.
The front end calls my API endpoint, which starts scraping and then returns "done" as the response.
Now I want to host it on an Ubuntu server on an AWS EC2 instance, which I did, and the deployment itself works fine.
But as I said, the scrape runs for hours, and until it reaches the end of the page it doesn't return anything in the response, so the request stays alive for hours. Nginx won't keep a request open that long, and the request dies with a 503 error. I think the default Nginx timeout is around 30 seconds; I tried changing it, but no luck.
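For what it's worth, the 30-second default is more likely Gunicorn's worker timeout than Nginx's (Nginx's proxy_read_timeout defaults to 60 seconds), so both layers would need raising. A sketch of the relevant settings; the values and upstream address are illustrative:

    # nginx site config (illustrative)
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_read_timeout 3600s;   # how long Nginx waits for the upstream response
        proxy_send_timeout 3600s;
    }

    # gunicorn (illustrative); the default worker timeout is 30s
    gunicorn myproject.wsgi --timeout 3600

That said, holding an HTTP request open for hours stays fragile no matter how high the timeouts go, which leads to the real question below.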
Is there a different way of hosting scrapers?
This scraper will be used by one person for now, but I am thinking of making it public so more people can use it in the future. That's why I would like to know how to make it work not only with one user but with multiple users, although first I would like to get it working for a single user.
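A common answer to both questions is not to run the scrape inside the request at all: start it in the background, return a job id immediately, and let the client poll for status. In production you would use a task queue such as Celery, but here is a minimal single-user sketch; run_scrape, the in-memory JOBS dict, and the url parameter are all illustrative placeholders:

    # views.py - a minimal sketch, not production-ready
    import threading
    import uuid

    from django.http import JsonResponse

    JOBS = {}  # in-memory job store; use a database or Redis for real use

    def run_scrape(job_id, url):
        # ... the hours-long Selenium/BeautifulSoup work goes here ...
        JOBS[job_id] = "done"

    def start_scrape(request):
        job_id = str(uuid.uuid4())
        JOBS[job_id] = "running"
        threading.Thread(
            target=run_scrape, args=(job_id, request.GET.get("url")), daemon=True
        ).start()
        # Returns within milliseconds, so no Nginx/Gunicorn timeout is ever hit.
        return JsonResponse({"job_id": job_id, "status": "started"})

    def scrape_status(request, job_id):
        return JsonResponse({"job_id": job_id, "status": JOBS.get(job_id, "unknown")})

The front end (Axios) then polls scrape_status with the returned job_id instead of waiting on one long request; the same pattern, with a real queue and multiple workers, is also how it would scale to multiple users.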
The tech stack:
Python/Django (Selenium, BeautifulSoup4, Requests), React.js (Axios), Nginx, Gunicorn
I really appreciate you reading to the end, and I would appreciate your feedback and help even more. Thank you!
I'm having issues with a Next.js project hosted on a Vercel Pro plan. Routes that are set up with ISR keep timing out. I currently have revalidate set to 120s, but I've also tried 60s, 30s, and 6000s in case I was revalidating too often.
I'm using Strapi as the headless CMS that serves the API Next.js builds pages from; it is deployed in Render's German region. The database Strapi uses is a MongoDB database hosted on MongoDB Atlas in MongoDB's Ireland AWS region. (I've also set the Serverless Functions on Vercel to be served from London, UK, but I'm not sure whether that affects ISR?)
There are 3 different dynamic routes with about 20 pages each, and at build time they average 6999ms, 6508ms, and 6174ms respectively. Yet at run time, if I update some content in the Strapi CMS and wait the 120s I've set for revalidate, the page hardly ever gets rebuilt. If I look at the Vercel dashboard's "Functions" tab, which shows realtime logs, I can see that many of the pages have attempted to rebuild at the same time, and they are all hitting the 60s timeout limit.
I also have the Vercel logs sent to LogTail, and if I filter the logs for the name of the page I've edited, I can see that it returns a 304 status code before the 120s has passed, as expected, but after 120s it tries to fetch and build the new page and nearly always hits the timeout error.
So my first question is: why are so many pages trying to rebuild at the same time when nothing has changed in the CMS for any of them except the one I deliberately changed? And secondly, why does a page take an average of about 6000ms at build time but hit the 60s timeout limit during ISR?
Could it be that so many rebuilds are being triggered that they all end up causing each other to time out? If so, how do I tackle that first issue?
Here is a screenshot of my Vercel realtime logs. As you can see, many of the pages are trying to rebuild at once, but I'm not sure why; I've only changed the data for one page in this instance.
To try to debug the issue, I created a Postman Flow for building one of the dynamic routes and added up the time for each API call needed to build the page; I get 7621ms on average after a few tests. Here is a screenshot of the Postman console:
I'm not that experienced with Next.js ISR, so I'm hoping I'm just doing something wrong or have a setting misconfigured on Vercel, but after looking on Stack Overflow and other websites, I believe I'm using ISR as expected. If anybody has any ideas or advice about what might be going on, I'd very much appreciate it.
We are storing images in Google Cloud Storage. We generated a link using the Images service's getServingUrl(). This link worked for some amount of time (a few hours) and then stopped working. We've received reports that the link is still accessible in the US, but not in the UK.
Here's the link: http://lh3.googleusercontent.com/HkwzeUinidFxi-z6OO4ANasEuVYZYduhMiPG2SraKstC5Val0xGdTqvePNMr_bs7FLvj1oNzZjSWZe4dKcugaZ5hzaqfWlw=s36
Is anybody else experiencing this problem? If so, has anyone filed a ticket with Google to investigate?
This has been known behavior for years: getServingUrl() generates a temporary link to a CDN that is not guaranteed to last forever.
You have to generate the link on every request (or at least periodically), or use another solution.
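If you're on the first-generation App Engine Python runtime, regenerating the URL at request time looks roughly like this; the blob key and size are placeholders:

    from google.appengine.api import images

    def image_url(blob_key):
        # Store the blob key, not the URL: the lh3.googleusercontent.com
        # link it maps to can change or stop resolving over time.
        return images.get_serving_url(blob_key, size=36, secure_url=True)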
We ended up moving our images to Amazon S3 + CloudFront. You could also consider https://cloud.google.com/storage/ and https://cloud.google.com/cdn/.
The serving URLs do not expire. I have a website with roughly 500k images that has been using the same image URLs for about four years now. All image links are still functioning and were only generated once (not on every request).
I have recently been stuck on a strange issue thrown by a multi-platform hybrid app in Visual Studio. My development environment details are as follows:
Visual Studio 2013 Release 3
Cordova 4.0
AngularJS 1.4
Ionic 1.4
Nokia Lumia 1320 [Windows 8.1 OS]
I have a web app that interacts with the mobile app, deployed on a server machine that can be accessed both from an internal enterprise network and from the internet.
Now the problem is, when I am [that is, when the mobile device is] connected to the internal network, the $http call fails with a status code of 0. Digging deeper reveals that the actual returned status code is -1.
However, when I switch over to mobile data on the phone, the AJAX call goes through smoothly and finishes successfully. And if I then switch back to the internal network, it works perfectly again!
The HTTP call is quite simple and uses the promise API. I also have some request interceptors.
Any explanations for this strange behavior, or better yet, a solution?
After scratching my head for over two days, I was finally able to conclude that the browser was the culprit.
As I said, I was using Windows Phone 8.1, which uses Internet Explorer 11 as the default rendering engine. Also, my web server was actually behind a proxy server [Apache HTTP Server].
The real problem was that the AJAX call was returning a response status code of 0, and the reason was that the call was being suspended by the Apache HTTP proxy server because of a tunneling issue. Note that this happens specifically with IE11 and the Apache HTTP Server.
It was happening because I was sending a POST request through an HTTPS-based proxy server.
The solution is decidedly low-tech, but it is what saved my life. To get around this issue, you must:
1. either convert your POST request to a GET request, or
2. before making a POST request to the server, make a GET request to the same server.
In my case, I went with the second approach, and it worked. I'm posting this as an answer so that it saves someone else too.
You can refer to the following links for more details.
IE10/IE11 Abort Post Ajax Request After Clearing Cache with error “Network Error 0x2ef3”
Making XHR Request to HTTPS domains with WinJS
I am working on a GWT application with Google App Engine as the server. I have added logging and the Appstats filter so that I can measure the latency.
For example, if I open a link, Appstats shows 118ms (real=118ms), plus 90ms for the trip back through my login filter, so 208ms in total. But in the Firebug Net panel it shows 705ms for onload on the first request.
If anybody has any idea, please let me know.
Thanks
MSNaidu
It would be really difficult to figure out whether Firebug is recording incorrectly or whether there is genuine lag. Bear in mind that onload includes DNS/TCP setup, network transfer, and client-side parsing and rendering, not just server time, so some gap between the 208ms server-side figure and the 705ms onload is expected. You should try cross-checking with Chrome Speed Tracer and the Chrome Dev Tools Network, Timeline, and Audits features.