Getting huge random latency in app engine - google-app-engine

I run an API in app engine. Sometimes it takes a request only ~50ms to complete and sometimes it takes 10-15 seconds!
Here's what it looks like in the Google Chrome console:
As you can see, some requests are very fast, and some very slow.
Using Stackdriver Trace I can confirm that the API sometimes takes 10 seconds or longer. I tried making a request automatically every second to see if it speeds up after the first request, but it still seems random.
So the next thing I tried was measuring whether the API itself is slow because of my own code. I tested it, and it seems to be very fast and not the cause of the problem. Nor do I make any requests inside my API that could slow it down (other than a database request).
I am still trying to figure out what is causing this massive latency, but it seems to happen between the request being made on the frontend and the request being received on the backend.
I would highly appreciate any help and suggestions!
EDIT 1
Seems like the 204 No Content responses are also slow sometimes.
Here's more strange behavior. On the frontend I make several requests at once to load a page. For every request there is almost exactly a one second delay:
I still have not figured out the cause of this problem; help is still appreciated.
EDIT 2
My timeline doesn't seem to break down the way it does for Alex:
I tried adding these headers to every HTTP request:
'Cache-Control': 'no-cache',
'Pragma': 'no-cache'
Sadly, that is not solving my problem either.
EDIT 3
The 10 second latency is probably caused by 10 requests all being fired at once, each taking 1 second.
So my first question is:
Can a single app engine f1 instance not handle multiple (concurrent) requests at once?
And my second question:
Why does it take over 1 second (sometimes over 2 seconds) to process a single request?
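For reference, on the first question: in the standard environment (where F1 instances exist), how many requests a single instance serves in parallel is configurable in app.yaml. A rough sketch with illustrative values only (note that a .NET Core app actually runs in the flexible environment, which handles concurrency differently):

instance_class: F1
automatic_scaling:
  max_concurrent_requests: 10   # requests one instance will serve in parallel
  min_idle_instances: 1         # keep a warm instance around to absorb bursts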
I did another test to find out whether it is my code that is slowing down the requests. I deployed a .NET Core MVC controller with a single action; all it does is return "Hello world". Here are the results (using this method):
> curl.exe -s -o NUL --url "http://api.---.com/test" -w "@curl-format.txt"
time_namelookup: 0,000001
time_connect: 0,109000
time_appconnect: 0,000000
time_pretransfer: 0,109000
time_redirect: 0,000000
time_starttransfer: 1,203000
--------
time_total: 1,203000
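For reference, a curl -w format file of the kind used above typically looks like this (a generic example, not the questioner's actual file):

     time_namelookup:  %{time_namelookup}\n
        time_connect:  %{time_connect}\n
     time_appconnect:  %{time_appconnect}\n
    time_pretransfer:  %{time_pretransfer}\n
       time_redirect:  %{time_redirect}\n
  time_starttransfer:  %{time_starttransfer}\n
                     ----------\n
          time_total:  %{time_total}\n

time_starttransfer is the time to first byte, so if time_pretransfer is around 0.1 s and time_starttransfer is around 1.2 s, roughly a second is being spent waiting on the server side (or on something in front of it) rather than on the network handshake.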

In your fast requests, the responses are 204 (No Content) and they are 45 bytes. The slow requests are responding with 200 and are actually returning something.
Is there some kind of caching that's affecting this?
EDIT 1: Since your server was returning 204s, I was referring more to any caching you implemented on the client side. I see that you found the trace screen (https://console.cloud.google.com/traces/traces); have you tried clicking on one of the traces? It gives you a breakdown like this:
That should tell you where the request is spending its time.

Related

FastAPI slows file download over time

I've written a Flask app with a FastAPI backend driven by Uvicorn; it serves both HTTP requests and socket data (for server-driven messages to all clients).
I'm using the FileResponse class to return the HTML.
If I start the server and head to the IP, then it loads reasonably quickly at around 2 seconds. A day or so later, this time has increased to around 15 seconds. As time goes on it becomes slower and slower until I decide to restart the server. Note that this is all running on the same network and it's all downloaded via ethernet - no Wi-Fi.
Inspecting what's happening, it appears as though it's taking ages to download a 1.6MB resource. But what I don't understand is why it becomes progressively slow over time. If there's a cached version of the page, then it remains quick.
I imagine it has something to do with the fact that the FileResponse class asynchronously streams the file as the response, so after some time it may stream the JavaScript file in bits (which is what I see when I inspect). Does anyone know how to make FileResponse just send the whole file in one go?
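One way to avoid FileResponse's chunked streaming is to read the file yourself and return its bytes as a single response body. A minimal sketch, assuming the page lives at index.html (this sidesteps streaming, but may not address whatever is degrading over time):

import uvicorn
from fastapi import FastAPI
from fastapi.responses import HTMLResponse

app = FastAPI()

@app.get("/")
def index() -> HTMLResponse:
    # Read the whole file up front and return it in one go,
    # instead of letting FileResponse stream it in small chunks.
    with open("index.html", "rb") as f:
        return HTMLResponse(content=f.read())

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)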

Azure Functions - App Service Plan (Intermittent Slow Calls)

We are using Azure Functions, running on an App Service Plan (not the consumption model). My issue is that we are seeing a strange delay on some web calls into our services.
For example, I have one HTTP GET trigger; it returns a list of objects from another web service (so there is outbound web traffic from my function). If I call the service 10 times, maybe 6 responses come back in between 400 and 600 ms, but the other 4 calls take between 7,000 and 8,000 ms. It's actually quite consistent in the numbers; it seems bizarre, it's either half a second or 7.5 seconds. I have tested the backend system and it's not that, so it's something around the function app itself. Any thoughts or suggestions welcome.
(Copying Noel's comment as the answer)
Thanks for the suggestion on the Kudu call, that was a great help. It was actually due to a linked VNet (which I didn't mention in my post). It seems to be swallowing up outbound traffic and sending it on a very slow trip, due to what looks like a bad config around the DNS settings.
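For anyone hitting something similar, one way to confirm the time is going into outbound traffic rather than the function itself is to time the downstream call inside the handler. A hedged sketch using the Python worker (the original app is presumably C#; the downstream URL is a placeholder):

import logging
import time

import azure.functions as func
import requests

def main(req: func.HttpRequest) -> func.HttpResponse:
    # Time only the outbound dependency call so slow egress (e.g. a
    # misrouted VNet/DNS path) shows up separately from the total time.
    start = time.monotonic()
    resp = requests.get("https://downstream.example.com/objects", timeout=30)
    elapsed_ms = (time.monotonic() - start) * 1000
    logging.info("outbound call took %.0f ms", elapsed_ms)
    return func.HttpResponse(resp.text, status_code=resp.status_code)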

Chrome network timing: how to improve Content Download

I was checking XHR call timings in Chrome DevTools to improve slow requests, but I found that 99% of the response time is spent on content download, even though the content size is less than 5 KB and the application is running on localhost (I'm working on my local machine, so there are no network issues).
But when replaying the call using the Replay XHR menu, the content download period drops dramatically from 2.13 s to 2.11 ms (as shown in the screenshots below). Data is not cached at the browser level.
Example of Call Timing
Same Example Replayed
Can someone explain why the content download timing is slow and how to improve it?
The application is an ASP.NET MVC 5 solution combined with AngularJS.
The Web Server Details:
- Windows Server 2012 R2
- IIS 8
Thank you in advance for your support!
I can't conclusively tell you the cause of this, but I can offer some variables that you can investigate, which might help you figure out what's going on.
Caching
I know you said that the data is not getting cached at the browser level, but I'd suggest checking that again: an initial request that takes 2 s followed by a repeat request that takes only 2 ms really does sound like caching.
How to check:
Go to Network panel.
Look at the Size column for the request. If you see "from memory cache" or "from disk cache", it was served from the cache.
Slow development server or machine
My initial thought was that you're doing more work on your development machine than it can handle. Maybe the server requires more resources than your machine can handle. Maybe you have a lot of other programs running and your memory / CPU is getting maxed.
How to check:
Run your app on a more powerful server and see if the pattern persists.
Frontend app is doing too much work
I'm not sure this last one actually makes sense, but it's worth a check. Perhaps your Angular app is doing a crazy amount of JS work during the initial request, and it's maxing out your CPU. So the entire browser is stalling when you make the initial request.
How to check:
Go to Performance panel.
Start recording.
Do the action that causes your app to make the initial request.
Stop recording.
Check the CPU chart. If it's completely maxed out, then your app is indeed doing a bunch of work.
Please leave a comment and let me know if any of these helped.
I have also been investigating this issue in Chrome (currently 91.0.4472.164), as the content download times appear to vary widely depending on the context of the download. When going directly to a resource, or when updating rendered content as the result of a web call, the content download can take up to 10x as long as the same request made from other client applications, or as simply saving the data off as a variable in Chrome.
I created a quick, hacky Spring Boot web application that demonstrates the problem that I have made public on github: https://github.com/zielinskin/h2-training-simple
The steps in the readme should hopefully be sufficient to demonstrate the vast performance differences.
I believe that Chrome will need to resolve this performance issue as it has nothing to do with the webserver or ui framework being used.
The "Content Download" includes both the time taken to download the content and also the time for the server to upload the content. You can test out the following cases to see what is the cause. Usually it is a combination of all them.
Case 1: server delay
Assume running server and client on localhost with 0 network delay, and small data.
time0 client receives a response with header content-length = 20
time5 server > client: 10 bytes of data
time5 client receives data
Case 2: network delay
Use hard-coded dummy data to speed up server
time0 client receives a response with header content-length = 20
time0 server > client: 10 bytes of data
time5 client receives data
Case 3: client is too busy
Isolate the query by trying something like curl google.com -v in a terminal to access the URL directly. You can use the Chrome or Firefox dev tools to copy the request, as shown below.
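To see what Case 1 looks like in isolation, here is a minimal sketch (a hypothetical Flask endpoint, unrelated to the ASP.NET app in the question) where the headers go out immediately but the body trickles out of the server, so DevTools attributes the whole wait to Content Download:

import time
from flask import Flask, Response

app = Flask(__name__)

@app.route("/slow-body")
def slow_body():
    def generate():
        yield b"first half of the body\n"    # sent right away with the headers
        time.sleep(2)                        # server-side delay...
        yield b"second half of the body\n"   # ...shows up as "Content Download"
    return Response(generate(), mimetype="text/plain")

if __name__ == "__main__":
    app.run(port=5000)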

Backends logs depth

I have a long-running process in a backend and I have seen that the log only stores the last 1000 logging calls per request.
While this might not be an issue for a frontend handler, I find it very inconvenient for a backend, where a process might be running indefinitely.
I have tried flushing logs to see if it creates a new logging entry, but it didn't. This seems so wrong, that I'm sure there must be a simple solution for this. Please, help!
Thanks stackoverflowians!
Update: Someone already asked about this in the appengine google group, but there was no answer....
Edit: The 'depth' I am concerned with is not the total number of RequestLogs, which is fine, but the number of AppLogs in a RequestLog (which is limited to 1000).
Edit 2: I did the following test to try David Pope's suggestions:
def test_backends(self):
    launched = self.request.get('launched')
    if launched:
        # Do the job, we are running in the backend
        logging.info('There we go!')
        from google.appengine.api.logservice import logservice
        for i in range(1500):
            if i == 500:
                logservice.flush()
                logging.info('flushhhhh')
            logging.info('Call number %s' % i)
    else:
        # Launch the task in the backend
        from google.appengine.api import taskqueue
        tq_params = {'url': self.uri_for('backend.test_backends'),
                     'params': {'launched': True}}
        if not DEBUG:
            tq_params['target'] = 'crawler'
        taskqueue.add(**tq_params)
Basically, this creates a backend task that logs 1500 lines, flushing at number 500. I would expect to see two RequestLogs, the first one with 500 lines in it and the second one with 1000 lines.
The results are the following:
I didn't get the result that the documentation suggests: manually flushing the logs doesn't create a new log entry, and I still have one single RequestLog with 1000 lines in it. I had already seen this part of the docs some time ago and got the same result, so I thought I wasn't understanding what the docs were saying. In any case, at the time I left a logservice.flush() call in my backend code, and the problem wasn't solved.
I downloaded the logs with appcfg.py, and guess what?... all the AppLogs are there! I usually browse the logs in the web UI, and I'm not sure I could get a comfortable workflow viewing the logs this way... The ideal solution for me would be the one described in the docs.
My app's autoflush settings are set to the defaults. I played with them at some point, but the problem persisted, so I left them unset.
I'm using python ;)
The Google docs suggest that flushing should do exactly what you want. If your flushing is working correctly, you will see "partial" request logs tagged with "flush" and the start time of the originating request.
A couple of things to check:
Can you post your code that flushes the logs? It might not be working.
Are you using the GAE web console to view the logs? It's possible that the limit is just a web UI limit, and that if you actually fetch the logs via the API then all the data will be there. (This should only be an issue if flushing isn't working correctly.)
Check your application's autoflush settings (see the sketch below).
I assume there are corresponding links for Java, if that's what you're using; you didn't say.
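On the autoflush point, a hedged sketch of the knobs the old Python SDK exposed (module-level settings in google.appengine.api.logservice; exact names and defaults may differ between SDK versions):

from google.appengine.api.logservice import logservice

# Flush app logs to the log service more aggressively than the defaults,
# so a long-running backend request does not buffer huge batches of lines.
logservice.AUTOFLUSH_ENABLED = True
logservice.AUTOFLUSH_EVERY_SECONDS = 10
logservice.AUTOFLUSH_EVERY_LINES = 100
logservice.AUTOFLUSH_EVERY_BYTES = 4096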
All I can think of that might help is to use a timed/cron script like the following, run every hour or so from your workstation/server:
appcfg.py --oauth2 request_logs appname/ output.log --append
This should give you a complete log - I haven't tested it myself
I did some more reading and it seems CRON is already part of appcfg
https://developers.google.com/appengine/docs/python/tools/uploadinganapp#oauth
appcfg.py [options] cron_info <app-directory>
Displays information about the scheduled task (cron) configuration, including the
expected times of the next few executions. By default, displays the times of the
next 5 runs. You can modify the number of future run times displayed
with the --num_runs=... option.
Based on your comment, I would try:
1) Write your own logger class (a sketch follows below)
2) Use more than one version
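A hedged sketch of option 1: a logger that persists its own entries (here in the datastore via ndb), so a long-running backend task is not bound by the 1000-AppLog-per-request limit. The model and property names are made up for illustration:

from google.appengine.ext import ndb

class BackendLogLine(ndb.Model):
    # Hypothetical model: one entity per log line, grouped by task id.
    task_id = ndb.StringProperty()
    message = ndb.TextProperty()
    created = ndb.DateTimeProperty(auto_now_add=True)

def backend_log(task_id, message):
    # Write the line to the datastore instead of (or in addition to)
    # the request log, so nothing is capped at 1000 entries.
    BackendLogLine(task_id=task_id, message=message).put()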

How to ensure that a bot/scraper does not get blocked

I coded a simple scraper whose job is to go to several different pages of a site, do some parsing, call some URLs that are otherwise called via AJAX, and store the data in a database.
The trouble is that sometimes my IP is blocked after my scraper executes. What steps can I take so that my IP does not get blocked? Are there any recommended practices? I have added a 5-second gap between requests, to almost no effect. The site is medium-big (I need to scrape several URLs) and my internet connection is slow, so the script runs for over an hour. Would being on a faster connection (like on a hosting service) help?
Basically I want to code a well-behaved bot.
Lastly, I am not POSTing or spamming.
Edit: I think I'll break my script into 4-5 parts and run them at different times of the day.
You could use rotating proxies, but that wouldn't be a very well behaved bot. Have you looked at the site's robots.txt?
Write your bot so that it is more polite, i.e. don't sequentially fetch everything, but add delays in strategic places.
Following the guidelines set in robots.txt is a good first step. There are tools such as import.io and morph.io, and there are also packages/plugins for servers, for example x-ray, a Node.js package with options that help you write responsible scrapers quickly (throttle, delays, max connections, etc.).
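A minimal sketch of that kind of "well behaved" fetch loop, honouring robots.txt and spacing requests out (the base URL, user agent, and delay are placeholders):

import time
import urllib.robotparser

import requests

BASE = "https://example.com"
USER_AGENT = "my-bot/1.0 (contact: me@example.com)"

# Load the site's crawling rules once, up front.
robots = urllib.robotparser.RobotFileParser(BASE + "/robots.txt")
robots.read()

def polite_get(path, delay=5.0):
    url = BASE + path
    if not robots.can_fetch(USER_AGENT, url):
        return None                        # respect Disallow rules
    resp = requests.get(url, headers={"User-Agent": USER_AGENT})
    time.sleep(delay)                      # fixed gap between requests
    return resp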
