How to get gzip or brotli encoding on Google App Engine - google-app-engine

Does Google App Engine allow compression of the results? For example, I have the following curl request:
$ curl --location --request GET 'https://premiere-stage2.uk.r.appspot.com/' \
> --header 'Accept-Encoding: gzip, deflate, br'
And the response is not compressed. Compare this with something like:
$ curl --location -X GET 'https://google.com' --header 'Accept-Encoding: gzip, deflate, br'
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
Or, is there something manual I need to set up? I would think the last resort would be to do the compression in the application endpoints themselves, or is that how it needs to be done?

Expanding on John Hanley's suggestion in a comment, there are several parts to this.
First, you have to set the Accept-Encoding header in the request.
Second, the response itself should have the proper content type (MIME type), such as text/html or whatever it needs to be. Often the web server will skip compression if the MIME type isn't in a certain list.
Third, to check that the headers in both the request and the response are correct, you can use the -v flag in curl.
Finally, it seems the content needs to be over a certain size for the web server to bother compressing it. So, for example, if the content-length is 3, it's not going to be compressed, though I'm not sure exactly what the threshold is.
Putting it all together:
$ curl --location --request GET 'https://premiere-stage2.uk.r.appspot.com/html' \
  --header 'Accept-Encoding: gzip, deflate, br' \
  -v
References:
curl
GAE (a bit buried, under the Go documentation)
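If compression ever has to be done in the application endpoints as a last resort, the three conditions above (Accept-Encoding, content type, minimum size) could be checked by hand. The following is only a minimal Python sketch: the function name maybe_gzip, the list of compressible types and the 1024-byte threshold are assumptions for illustration, not values taken from App Engine.

```python
import gzip
from typing import Dict, Optional, Tuple

# Assumed values for illustration; the real server-side list and threshold are unknown.
COMPRESSIBLE_TYPES = ("text/", "application/json", "application/javascript")
MIN_SIZE = 1024


def maybe_gzip(body: bytes, accept_encoding: Optional[str],
               content_type: str) -> Tuple[bytes, Dict[str, str]]:
    """Gzip the body only if the client accepts gzip, the type is
    compressible, and the body is large enough to be worth it."""
    headers = {"Content-Type": content_type, "Vary": "Accept-Encoding"}

    # Split "gzip, deflate;q=0.5, br" into bare tokens (q-values are ignored here).
    tokens = [t.split(";")[0].strip().lower()
              for t in (accept_encoding or "").split(",")]
    wants_gzip = "gzip" in tokens
    compressible = content_type.lower().startswith(COMPRESSIBLE_TYPES)

    if wants_gzip and compressible and len(body) >= MIN_SIZE:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"

    headers["Content-Length"] = str(len(body))
    return body, headers
```

A tiny body, or a request without Accept-Encoding, comes back unchanged, which matches the behaviour described above for very short responses.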

According to the documentation:
For example, the server may automatically send a gzipped response depending on the value of the Accept-Encoding request header. The application itself does not need to know which content encodings the client can accept.
That gives the impression it should. But the same documentation also says
In addition, the following headers are removed from incoming requests because they relate to the transfer of the HTTP data between the client and server:
Accept-Encoding
I tested against our production site in Firefox, and the Web Developer Tools show Accept-Encoding: gzip, deflate, br as a request header and content-encoding: gzip as a response header.
However, when I tested against the local/dev version of our site, the Web Developer Tools show Accept-Encoding: gzip, deflate, br as a request header, but the response headers don't include content-encoding: gzip. In addition, printing the headers in Flask/Python gave a value of None for Accept-Encoding.

Related

C sockets, proxy GET requests returning as 404

I'm creating a simple proxy server and I've run into an issue with getting responses back from a website.
I've set up my server to accept connections on a specified port that gets connected to through the browser proxy config. The server is able to receive the request, connect to the specified website, send the request, and receive a reply.
I'm forwarding the request from the browser to the website without modification, but the replies I receive are always 404 errors.
This is the request I'm forwarding to the website; there is a \r\n after every line and \r\n\r\n after the final line.
GET http://www.mywebpage.com/ HTTP/1.1
Host: www.mywebpage.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Cookie: __utma=1.35811746.1525489860.1537250282.1539467023.3; __utmz=1.1537250282.2.2.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); _fbp=fb.1.1553849756364.1600689742
Upgrade-Insecure-Requests: 1
My send receive code looks like this.
sendError = send(serverSock, requestString , strlen(requestString) , 0 );
returnedSize = recv(serverSock, buffer, sizeof(buffer), 0);
I'm forwarding the request from the browser to the website without modification
...
GET http://www.mywebpage.com/ HTTP/1.1
The absolute URL you use in your request target should only be used when talking to proxies. Normal servers expect the origin form, i.e. only the path and optional query but not the full URL. Scheme, host and port should thus be stripped:
GET / HTTP/1.1
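The rewrite from absolute form to origin form can be sketched in a few lines. This is a Python illustration of the idea rather than a drop-in fix for the C proxy; the function name to_origin_form is made up for this example, and it only touches the request line (the Host header is assumed to be present already, as in the request shown above).

```python
from urllib.parse import urlsplit


def to_origin_form(request_line: str) -> str:
    """Turn 'GET http://host/path?q HTTP/1.1' into 'GET /path?q HTTP/1.1'.

    Request lines that are already in origin form are returned unchanged.
    """
    method, target, version = request_line.split(" ", 2)
    parts = urlsplit(target)

    # Only rewrite absolute http(s) URLs; leave '/path' and '*' alone.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return request_line

    path = parts.path or "/"          # 'http://host' means the root path
    if parts.query:
        path += "?" + parts.query
    return f"{method} {path} {version}"
```

The proxy would apply this to the first line of the browser's request before sending it on to the origin server.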

405 Err on OPTIONS preflight for upload_url on Google Appengine SDK on different port #

I have a Google AppEngine project that works fine in production but not locally.
There is a React browser application running locally on port 3001 and a python api service running on 9090.
When I attempt to upload files via the React client, I first call a REST endpoint that returns the blobstore get_upload_url() to the client. This url is something like: http://localhost:9090/_ah/upload/aghkZXZ-... <-- note the port is that of the python service
When I fashion a POST request to that url from the browser client to actually upload the file, I get a 405 on the OPTIONS preflight check. So far as I understand, this is due to the ports being different. This only occurs in the local App Engine SDK since I am using dispatch.yaml settings in production to have everything on the same domain/port.
I had dug into the SDK code a while ago and put a hack in place. (https://gist.github.com/blainegarrett/4d3b3081d09b4ff7be00765eb32b0d94)
However, since upgrading Google Cloud to 218.0.0, the hack was overwritten and I'm back to square one.
Here are the headers to the blobstore upload url:
OPTIONS /_ah/upload/aghkZXZ-Tm9uZXIiCxIVX19CbG9iVXBsb2FkU2Vzc2lvbl9fGICAgICA77ALDA HTTP/1.1
Host: localhost:9090
Connection: keep-alive
Origin: http://localhost:3001
Access-Control-Request-Method: POST
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36
Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
I am currently using vanilla XMLHttpRequest() for the upload call specifically.
Does anyone have any suggestion on how to either get around the preflight check when the ports are different and/or to allow OPTIONS checks on the upload url in a less hacky way?
Update: I'd still like to hear an answer regarding the 405 on the SDK, but I was able to dodge the preflight check by getting rid of the xhr progress listener. My original assertion that the port difference was triggering the preflight check was incorrect. It was the progress callback.
xhr.upload.addEventListener('progress', function(e) { .. });
See research on: CORS request is preflighted, but it seems like it should not be
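To make the finding in the update concrete, here is a simplified Python model of the browser's decision whether to send a preflight. It is a sketch of the CORS rules as I understand them, not the full algorithm: the function name needs_preflight and the headers argument (only the headers set by the page's own script, not the ones the browser adds) are assumptions for this example.

```python
from typing import Dict

SAFE_METHODS = {"GET", "HEAD", "POST"}
SAFE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SAFE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method: str, headers: Dict[str, str],
                    has_upload_listeners: bool = False) -> bool:
    """Simplified check: does a cross-origin XHR trigger an OPTIONS preflight?"""
    # Listeners on xhr.upload (e.g. 'progress') force a preflight on their own.
    if has_upload_listeners:
        return True
    if method.upper() not in SAFE_METHODS:
        return True
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SAFE_HEADERS:
            return True
        if lowered == "content-type":
            media_type = value.split(";")[0].strip().lower()
            if media_type not in SAFE_CONTENT_TYPES:
                return True
    return False
```

Under this model a plain multipart POST to the upload url is not preflighted, but the same request with a progress listener on xhr.upload is, which is consistent with what the update describes.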

GCP HTTP Load Balancer returns 502 error if POST data is large

We have an API server and are using HTTP Load Balancer. We found that the L7 Load balancer returns 502 error if HTTP request's data is large.
We have confirmed that it works when accessing the API without the Load Balancer (accessing the API Server directly.)
This question might be a similar issue. HTTP Load Balancer cuts out part of a large request body
Someone said that using L4 Network Load Balancer is a possible solution but we don't want to use it for some reasons e.g. URL based load balancing and cross-region load balancing.
// Response OK (data size is 1024)
curl -H "Content-Type: application/json" -X POST -d '{"xx}' https://xxxxxxxxxxxxxxx.com/xx/xxxxxxxxxxxx/xxxxxxxxx
// Response NG (data size is 1025)
curl -H "Content-Type: application/json" -X POST -d '{"xx}' https://xxxxxxxxxxxxxxx.com/xx/xxxxxxxxxxxx/xxxxxxxxx
It seems that LB has some limitation about the size of post data. Tests show the limit is around 1024 bytes.
Update1
@chaintng saved me. Someone on the linked post says that curl adds an "Expect: 100-continue" header if the POST data is over 1024 bytes.
// Response NG (data size is 1025. without "Expect: ")
curl -H "Content-Type: application/json" -X POST -d '{"xx}' https://xxxxxxxxxxxxxxx.com/xx/xxxxxxxxxxxx/xxxxxxxxx
// Response OK (data size is 1025. with "Expect: ")
curl -H "Expect: " -H "Content-Type: application/json" -X POST -d '{"xx}' https://xxxxxxxxxxxxxxx.com/xx/xxxxxxxxxxxx/xxxxxxxxx
Reference from this question: Curl to Google Compute load balancer gets error 502
This happens because curl, by default, adds the header Expect: 100-continue when the POST body is large.
That header is not supported by Google L7 Load Balancing (as stated in this document: https://cloud.google.com/compute/docs/load-balancing/http/).
All you have to do is suppress this behaviour by setting an empty Expect header before executing the request.
For example, in PHP:
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Expect:']);
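The same workaround for the command-line curl can be wrapped in a small helper. This is a hedged Python sketch that only builds the argument list; the function name build_curl_command and the example URL are hypothetical, and the empty Expect: header is the one taken from the curl commands earlier in this question.

```python
from typing import List


def build_curl_command(url: str, json_body: str) -> List[str]:
    """Build a curl argument list that never sends 'Expect: 100-continue'."""
    return [
        "curl",
        "-X", "POST",
        "-H", "Content-Type: application/json",
        "-H", "Expect:",          # empty value tells curl to drop the header
        "-d", json_body,
        url,
    ]


# Hypothetical endpoint behind the load balancer, for illustration only.
command = build_curl_command("https://api.example.com/items", '{"key": "value"}')
```

The resulting list can be passed to subprocess.run(command) on a machine where curl is installed.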

Internet Explorer 11 replaces Authorization header

What would cause Internet Explorer to replace the HTTP header
Authorization : Bearer <server-provided-token>
with
Authorization : Negotiate <some token>
when making an AJAX request?
Details
In Internet Explorer, some AJAX requests that are configured to contain the header Authorization: Bearer ... are being sent by Internet Explorer with the header Authorization: Negotiate ... instead.
For example, Fiddler shows that the first two of three requests contain the Authorization : Bearer... header, while the third suddenly contains the Authorization : Negotiate... header. The first two requests are successful, and the third fails because the request can't be properly authenticated.
All of the requests are constructed using the same client-side code, and are made one after another (within the span of a second). I have verified that the Authorization header correctly contains the Bearer token in all three cases up until the point the request is provided to the browser.
Also, I'm not seeing the same behavior in Chrome; it's only occurring in IE.
Request 1
GET http://localhost/myapp/api/User HTTP/1.1
Accept: application/json, text/plain, */*
Authorization: Bearer oEXS5IBu9huepzW6jfh-POMA18AUA8yWZsPfBPZuFf_JJxq-DKIt0JDyPXSiGpmV_cpT8FlL3D1DN-Tv5ZbT73MTuBOd5y75-bsx9fZvOeJgg04JcO0cUajdCH2h5QlMP8TNwgTpHg-TR9FxyPk3Kw6bQ6tQCOkOwIG_FmEJpP89yrOsoYJoCfrAoZ7M4PVcik9F9qtPgXmWwXB2eHDtkls44wITF_yM_rPm5C47OPCvMVTPz30KwoEPi6fHUcL3qHauP-v9uypv2e48TyPHUwLYmNFxyafMhBx4TkovnRcsdLHZiHmSjMq0V9a2Vw70
Referer: http://localhost/client/login.html
Accept-Language: en-US
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: localhost
DNT: 1
Connection: Keep-Alive
Request 2
POST http://localhost/myapp/api/Permissions HTTP/1.1
Referer: http://localhost/client/#/Dashboard
Content-Type: application/json
Authorization: Bearer oEXS5IBu9huepzW6jfh-POMA18AUA8yWZsPfBPZuFf_JJxq-DKIt0JDyPXSiGpmV_cpT8FlL3D1DN-Tv5ZbT73MTuBOd5y75-bsx9fZvOeJgg04JcO0cUajdCH2h5QlMP8TNwgTpHg-TR9FxyPk3Kw6bQ6tQCOkOwIG_FmEJpP89yrOsoYJoCfrAoZ7M4PVcik9F9qtPgXmWwXB2eHDtkls44wITF_yM_rPm5C47OPCvMVTPz30KwoEPi6fHUcL3qHauP-v9uypv2e48TyPHUwLYmNFxyafMhBx4TkovnRcsdLHZiHmSjMq0V9a2Vw70
Accept: application/json, text/plain, */*
Accept-Language: en-US
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: localhost
Content-Length: 1419
DNT: 1
Connection: Keep-Alive
Pragma: no-cache
<Post Data Removed>
Request 3
GET http://localhost/myapp/api/UserPreferences/Dashboard HTTP/1.1
Referer: http://localhost/client/#/Dashboard
Content-Type: application/json
Authorization: Negotiate YHsGBisGAQUFAqBxMG+gMDAuBgorBgEEAYI3AgIKBgkqhkiC9xIBAgIGCSqGSIb3EgECAgYKKwYBBAGCNwICHqI7BDlOVExNU1NQAAEAAACXsgjiBgAGADMAAAALAAsAKAAAAAYBsR0AAAAPVk1ERVZFTlYtU1JTQ0VSSVM=
Accept: application/json, text/plain, */*
Accept-Language: en-US
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Connection: Keep-Alive
DNT: 1
Host: localhost
The requests are being made via the AngularJS $http service, and the back-end is ASP.NET Web API hosted in IIS.
We had a problem where Internet Explorer was caching credentials. We could fix the problem by using the following script:
document.execCommand('ClearAuthenticationCache', false);
see: Wikipedia
I've just come across this issue too.
What was odd is that it worked fine on my development machine, it was when I deployed it the issue arose.
Again it worked fine in Chrome, Firefox etc.
It turns out that IE was detecting that the site was in the Local intranet zone and was therefore trying to log on automatically (it was set by group policy - this is an internal app).
My workaround was that (luckily) it was only autodetecting local intranet zone when using a server name that wasn't an FQDN (e.g. myserver) - but using the full A
I had the same problem in a knockoutjs application, it worked fine in Chrome and Firefox but not in IE.
I also used Fiddler and noticed that the first ajax call used Bearer as intended and returned successfully. But then IE started to loop and send the subsequent ajax calls over and over again with the Negotiate authorization instead!
In my case it was some sort of timing issue in IE, I solved it by making the ajax calls that loaded data during rendering synchronous.
me.loadLimits = function () {
$.ajax({
type: 'GET',
dataType: 'json',
contentType: 'application/json',
url: '/api/workrate/limits',
headers: me.headers,
async: false,
success: function (result) {
...
I also encountered this issue when I was kicking off multiple data loads in my angular app.
I worked around this by detecting the browser and if IE, delayed each request by 50ms based on the index of the call:
return $q(function(resolve, reject) {
var delay = self.widget.useDelayLoading ? self.widget.index * 50 : 0;
setTimeout(function() {
restService.genericApi(self.widget.url, false).queryPost(json).$promise
.then(
function(r) { resolve(r); },
function(e) { reject(e); }
);
}, delay);
});
Interestingly, when I used $timeout, I had to increase the delay to 100ms.
We faced a similar issue with Angular and Web API. The issue happens when the system tries to access some resource at root level that has Windows Authentication enabled. In our case, the application was trying to get the favicon from the IIS root. Once this request is rejected as unauthorized, IE tries to get the resource again with a Negotiate header, though that fails too. But from this point onwards, IE keeps sending the Negotiate header instead of our bearer token. This is due to a setting in IE, which I think is Internet Options -> Advanced tab -> Enable Integrated Windows Authentication in the Security section (not sure, I forgot the exact details).
The fix was either to give anonymous access to the root level or to the resource location the app is trying to access (a bad option), or to put document.execCommand('ClearAuthenticationCache', false); in the app.js file.
In my case, IE alternated between sending a bad request, followed by a good request on a second attempt, then followed by a bad request again and so on.
After trying several approaches to causing IE to retry - it appears that returning a 307 (Temporary redirect) with the same request url in the Location header solves the issue.
e.g. for a request to "http://myUrl/api/service/"
HTTP 307 Temporary Redirect
Location: http://myUrl/api/service/
IE retries the call with the proper data.
Edit: This method might be dangerous as it might create an infinite loop. A possible solution to work around it, is to return some counter as part of the url in the Location header and analyze it when receiving the call again.
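The counter idea from the edit above could look roughly like this on the server side. It is a Python sketch under stated assumptions: the query parameter name retry, the limit of 3 and the function name next_redirect_location are all invented for illustration, and the real back-end in this question is ASP.NET Web API, not Python.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def next_redirect_location(url: str, max_retries: int = 3,
                           param: str = "retry") -> Optional[str]:
    """Return the Location value for the next 307, or None to stop redirecting."""
    parts = urlsplit(url)
    count = 0
    other_params = []
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if key == param:
            # A malformed counter is treated as zero.
            count = int(value) if value.isdigit() else 0
        else:
            other_params.append((key, value))

    if count >= max_retries:
        return None  # give up: answer with a normal error instead of a 307

    other_params.append((param, str(count + 1)))
    return urlunsplit(parts._replace(query=urlencode(other_params)))
```

When the function returns None, the server should stop redirecting and return the original failure, which breaks the potential infinite loop.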

Get file path from tcp packet

I have an app which downloads a file from a server, receiving it in TCP packets, and I want to find the path of the file on the server. With Wireshark I can read some information in the first packet, like the date, domain and file name, and as the path I read path=/, but the file isn't at domain.com/filename (404). Is there any way to get the real path where the file is on the server?
edit:
All I found comprehensible in the first packet:
HTTP/1.1 200 OK
Date: Sat, 30 Aug 2014 14:35:55 GMT
Server: Apache/2.2.3 (CentOS)
X-Powered-By: PHP/5.3.24
Set-Cookie: frontend=m90hqgtsu70hk9pprd39sllqk4; expires=Sat, 30-Aug-2014 25:35:55 GMT; path=/; domain=www.exaple.com; HttpOnly
Content-Disposition: attachment; filename="xxx.y"
Content-Length: 46458848
Connection: close
Content-Type: application/octet-stream
The request:
GET /index.php/rest/server?method=download&sessionId=xxx&userId=a#a.com&deviceToken=xxx&sku=filename&version=2 HTTP/1.1
Connection: Keep-Alive
Accept Encoding: gzip
Accept-Language: it-IT,en,*
User-Agent: Mozilla/5.0
Host: www.domain.com
The file is being downloaded using HTTP (read RFC 2616). The packet you are looking at is a response. The domain and path information you are looking for is not in the response, it is in the request instead:
GET /index.php/rest/server?method=download&sessionId=xxx&userId=a#a.com&deviceToken=xxx&sku=filename&version=2 HTTP/1.1
Connection: Keep-Alive
Accept Encoding: gzip
Accept-Language: it-IT,en,*
User-Agent: Mozilla/5.0
Host: www.domain.com
So the URL to request the file would be http://www.domain.com/index.php/rest/server?method=download&sessionId=xxx&userId=a#a.com&deviceToken=xxx&sku=filename&version=2.
The filename you see in the response is the actual filename for the file. But not all responses will include such a filename, so be prepared for that. If there is no Content-Disposition header (or it does not have a filename attribute), look for a name attribute in the Content-Type header. If none, you will have to parse the request URL (see RFC 3986) looking for a filename in its Path component (in the above URL, that is /index.php/rest/server).
The domain and path pieces you see in the response are not related to the file at all. They belong to a cookie (see RFC 6265) that is used to persist server-side data between HTTP requests.
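The lookup order described above (Content-Disposition filename, then Content-Type name, then the last path segment of the request URL) can be sketched with Python's standard library. The function name guess_filename is made up for this example, and the headers are assumed to be given as a plain dict with exactly these header names.

```python
from email.message import Message
from posixpath import basename
from typing import Dict, Optional
from urllib.parse import unquote, urlsplit


def guess_filename(headers: Dict[str, str], request_url: str) -> Optional[str]:
    """Pick a filename for a downloaded response, or None if nothing is usable."""
    msg = Message()
    for name in ("Content-Disposition", "Content-Type"):
        if name in headers:
            msg[name] = headers[name]

    # get_filename() reads the Content-Disposition 'filename' parameter and
    # falls back to the Content-Type 'name' parameter.
    from_headers = msg.get_filename()
    if from_headers:
        return from_headers

    # Last resort: the final segment of the URL path (query string ignored).
    from_url = unquote(basename(urlsplit(request_url).path))
    return from_url or None
```

For the response shown in the question this yields xxx.y; without the Content-Disposition header it would fall back to server, the last segment of /index.php/rest/server.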
If the server does not voluntarily provide the path you are looking for there is no way to find out. The file it sends might not even be on disk. It might be generated data or data cached in application memory.
If the response does not contain the path (and that is unlikely because no server I know of would send it) you can't do anything to find it.
