AppEngine dev_appserver - urllib2.urlopen issue with localhost url - google-app-engine

UPDATE
App Engine SDK 1.9.24 was released on July 20, 2015, so if you're still experiencing this, you should be able to fix this simply by updating. See +jpatokal's answer below for an explanation of the exact problem and solution.
Original Question
I have an application I'm working with and running into troubles when developing locally.
We have some shared code that checks an auth server for our apps using urllib2.urlopen. When I develop locally, I'm getting rejected with a 404 on my app that makes the request from AppEngine, but the request succeeds just fine from a terminal.
I have appengine running on port localhost:8000, and the auth server on localhost:8001
import urllib2
url = "http://localhost:8001/api/CheckAuthentication/?__client_id=dev&token=c7jl2y3smhzzqabhxnzrlyq5r5sdyjr8&username=amadison&__signature=6IXnj08bAnKoIBvJQUuBG8O1kBuBCWS8655s3DpBQIE="
try:
    r = urllib2.urlopen(url)
    print(r.geturl())
    print(r.read())
except urllib2.HTTPError as e:
    print("got error: {} - {}".format(e.code, e.reason))
which results in "got error: 404 - Not Found" when run from within AppEngine.
It appears that AppEngine is adding the scheme, host and port to the PATH portion of the url I'm trying to hit, as this is what I see on the auth server:
[02/Jul/2015 16:54:16] "GET http://localhost:8001/api/CheckAuthentication/?__client_id=dev&token=c7jl2y3smhzzqabhxnzrlyq5r5sdyjr8&username=amadison&__signature=6IXnj08bAnKoIBvJQUuBG8O1kBuBCWS8655s3DpBQIE= HTTP/1.1" 404 10146
and from the request header we can see the whole scheme and host and port are being passed along as part of the path (header pieces below):
'HTTP_HOST': 'localhost:8001',
'PATH_INFO': u'http://localhost:8001/api/CheckAuthentication/',
'SERVER_PORT': '8001',
'SERVER_PROTOCOL': 'HTTP/1.1',
Is there any way to stop the AppEngine dev server from hijacking this request to localhost on a different port? Or am I misunderstanding what is happening? Everything works fine in production, where our domains are different.
Thanks in advance for any assistance helping to point me in the right direction.

This is an annoying problem introduced by the urlfetch_stub implementation. I'm not sure which gcloud SDK version introduced it.
I've fixed this by patching the gcloud SDK until Google does, which means this answer will hopefully be irrelevant shortly.
Find and open urlfetch_stub.py, which can often be found at ~/google-cloud-sdk/platform/google_appengine/google/appengine/api/urlfetch_stub.py
Around line 380 (depends on version), find:
full_path = urlparse.urlunsplit((protocol, host, path, query, ''))
and replace it with:
full_path = urlparse.urlunsplit(('', '', path, query, ''))
more info
You were correct in assuming the issue was a broken PATH_INFO header. The full_path here is being passed after the connection is made.
disclaimer
I may very easily have broken proxy requests with this patch. Because I expect Google to fix it, I'm not going to go too crazy about it.
To be very clear this bug is ONLY related to LOCAL app development - you won't see this on production.
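To make the effect of the one-line patch concrete, here is a minimal sketch of the two urlunsplit calls. It uses Python 3's urllib.parse (the SDK file itself uses the Python 2 urlparse module, which offers the same urlunsplit function), and the path and query values are shortened placeholders rather than the real request.

```python
from urllib.parse import urlunsplit

# Placeholder values standing in for the variables used in urlfetch_stub.py.
protocol, host = "http", "localhost:8001"
path, query = "/api/CheckAuthentication/", "username=amadison"

# Unpatched line: the request target becomes an absolute URL.
absolute_target = urlunsplit((protocol, host, path, query, ""))

# Patched line: empty scheme and host leave only the path and query.
relative_target = urlunsplit(("", "", path, query, ""))

print(absolute_target)  # http://localhost:8001/api/CheckAuthentication/?username=amadison
print(relative_target)  # /api/CheckAuthentication/?username=amadison
```

The second form is what the auth server in the question expected to find in PATH_INFO.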

App Engine SDK 1.9.24 was released on July 20, 2015, so if you're still experiencing this, you should be able to fix this simply by updating.
Here's a brief explanation of what happened. Until 1.9.21, the SDK was formatting URL fetch requests with relative paths, like this:
GET /test/ HTTP/1.1
Host: 127.0.0.1:5000
In 1.9.22, to better support proxies, this changed to absolute paths:
GET http://127.0.0.1:5000/test/ HTTP/1.1
Host: 127.0.0.1:5000
Both formats are perfectly legal per the HTTP/1.1 spec, see RFC 2616, section 5.1.2. However, while that spec dates to 1999, there are apparently quite a few HTTP request handlers that do not parse the absolute form correctly, instead just naively concatenating the path and the host together.
So in the interest of compatibility, the previous behavior has been restored. (Unless you're using a proxy, in which case the RFC requires absolute paths.)
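If you maintain a request handler that has to cope with both forms, one possible approach is to reduce the request target to its path and query before routing. The following is only a sketch using Python 3's urllib.parse; the function name to_origin_form is made up for this example and is not part of any SDK.

```python
from urllib.parse import urlsplit


def to_origin_form(request_target):
    """Return the path-and-query part of an HTTP request target.

    Accepts both the relative form ("/test/") and the absolute form
    ("http://127.0.0.1:5000/test/") shown above.
    """
    parts = urlsplit(request_target)
    path = parts.path or "/"  # an absolute URL without a path means "/"
    return path + ("?" + parts.query if parts.query else "")


print(to_origin_form("/test/"))                       # /test/
print(to_origin_form("http://127.0.0.1:5000/test/"))  # /test/
```

With this normalization both request lines above route to the same handler, instead of the absolute URL ending up in the path.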

Related

Dealing with the environment url in the "build" version of react

I'm trying to deploy a react-django app to production using digitalocean droplet. I have a file where I check for the current environment (development or production), and based on the current environment assign the appropriate url to use to connect to the django backend like so:
export const server = enviroment ? "http://localhost:8000" : "domain-name.com";
My app is working perfectly both on development and production modes in local system (I temporarily still used http://localhost:8000 in place of domain-name.com). But I observed something rather strange. It's the fact that when I tried to access the site (still in my local computer) with "127.0.0.1:8000" ON THE BROWSER, the page is blank with a console error "No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' ....".
When I changed it back to "http://localhost:8000", everything was back working. My worry is isn't 127.0.0.1:8000 the same as http://localhost:8000? From this I conclude that whatever you have in the domain-name.com place when you build your react frontend is exactly what will be used.
Like I said, I'm trying to deploy to a digital ocean droplet, and I plan to install ssl certificate so the site could be served on https. Now my question is given the scenario painted above, what should be the right way to write the url in production? Should it be "serverIP-address", "domain-name.com", "http://domain-name.com", "https://domain-name.com" ?.
I must mention that I had previously attempted to deploy to the said platform using the IP address in place of domain-name.com. After following all the steps, I got a 502 (Bad Gateway) error. However, I'm not saying that using the IP address was responsible for the error in that case.
Please I would appreciate any help especially from someone who had previously deployed a react-django app to the said platform. Thanks
"From this I conclude that whatever you have in the domain-name.com place when you build your react frontend is exactly what will be used."
Not exactly true: the domain from which the react app is served is what will be used. If you build it locally, upload it to the server and configure domain.com to serve it, then domain.com will be the origin used for CORS. The best idea is to allow all CORS origins until your project is ready for deployment. Once it is, whitelist domain.com only.
The solution actually lies in listing the host(s) allowed to connect to the back-end in the settings.py file, like so: CORS_ALLOWED_ORIGINS = ["http://domain-name.com", "https://domain-name.com", ...] (this is a django-cors-headers setting, and each entry needs its scheme). That way, you are not tied to the url provided in the react environment variable. Though I have not deployed to the server yet, my first worry on the local machine is taken care of.
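On the question of whether 127.0.0.1:8000 is "the same" as http://localhost:8000: browsers compare origins as the combination of scheme, host and port, so the two count as different origins even though they reach the same machine. A rough sketch of that comparison, with helper names invented for this example:

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that makes up an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    return origin(a) == origin(b)


print(same_origin("http://localhost:8000", "http://127.0.0.1:8000"))          # False
print(same_origin("http://domain-name.com", "https://domain-name.com"))       # False
print(same_origin("https://domain-name.com", "https://domain-name.com:443"))  # True
```

This is why the production value should be written with its scheme, for example "https://domain-name.com" once the certificate is installed, and why the http and https variants need separate entries in an allow list.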

Why is Google IAP putting double-digits request cookies in my headers?

I have an app running on Google app engine (Flask, python 3, flexible environment) using the Identity-Aware proxy to allow everyone in our organization (which uses GSuite) to control access. Recently we've been getting 413 errors.
When I looked at the cookies of the failing requests I expected to see one request cookie prefixed with GCP_IAAP_AUTH_TOKEN. Instead I see 11, each one slightly different. Their combined sizes put us over the 15kb header size limit indicated in the link below, causing a 413 error.
https://cloud.google.com/appengine/docs/flexible/go/how-requests-are-handled
I don't understand why there are so many cookies, or how to make them go away. Our users all use Chrome, and many but not all of them are intermittently running into this error. Those that aren't, when their cookies are inspected, show only a couple cookies with this prefix. See below for an example of what this collection of cookies looks like:
[Image: eleven IAP cookies in a single header]
Posting what ended up solving this particular instance of the problem in case something like it occurs to other people in the future.
The original IAP code for our project was written in 2018. At the time, IAP had a known issue requiring re-logging in every hour. The suggested workaround from this thread was to use a hidden iframe.
https://issuetracker.google.com/issues/69386592?pli=1
We followed that guidance, but Google fixed the underlying issue in June of 2019. Now, following that guidance causes a gradual accumulation of session cookies in the headers. Removing the no-longer-needed offending iframe code solved the problem.
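To check whether a given request is affected, it can help to count the IAP cookies in the Cookie header and add up their size. The helper below is only a diagnostic sketch: the function name and the sample header are made up, and the only detail taken from the question is the GCP_IAAP_AUTH_TOKEN prefix.

```python
def iap_cookie_summary(cookie_header, prefix="GCP_IAAP_AUTH_TOKEN"):
    """Count cookies whose name starts with `prefix` and total their size in bytes."""
    count = 0
    total_bytes = 0
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        name = pair.split("=", 1)[0]
        if name.startswith(prefix):
            count += 1
            total_bytes += len(pair.encode("utf-8"))
    return count, total_bytes


# Made-up header for illustration; real IAP cookie values are much longer.
example = "GCP_IAAP_AUTH_TOKEN_AAA=x1; session=abc; GCP_IAAP_AUTH_TOKEN_BBB=y22"
count, size = iap_cookie_summary(example)
print(count, size)
```

In a Flask app the header string would typically come from request.headers.get("Cookie", ""); logging the count per request shows whether the cookies are still accumulating after the iframe code is removed.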

Custom domain http redirects to https when I don't want it to, why is it doing this?

I am trying to get a custom domain to work with Google App Engine 1.9.7 without SSL
I have done all the prerequisites:
Domain is verified with the proper TXT records.
Domain is configured in the GAE Cloud Console with the proper subdomain www.
Application is deployed to the appspot.com domain and works.
But when I try to go to http://www.customdomain.com it immediately redirects to https://www.customdomain.com and I get the following error:
net::ERR_SSL_PROTOCOL_ERROR
I know that for SSL I need to set up a certificate.
I don't have any of my webapp modules configured to be secure.
I don't want SSL right now, I don't need it right now.
I found this little nugget after reading the instructions again and again:
It's okay for multiple domains and subdomains to point to the same
application. You can design your app to treat them all the same or
handle each one in a different way.
This is exactly what I want to do but I can't find any information on how to actually do this?
How do I get it to stop redirecting from http to https?
I ran into the same problem. You say "I don't have any of my webapp modules configured to be secure." If that's the case, sorry, can't help you.
Otherwise the most likely cause for your problem would be: A "secure: always" flag for the respective handler in your app.yaml in the handlers section. Like so:
handlers:
- url: /*
  secure: always
Remove the line with the "secure: always". Details in the official Google docs here (table item "secure").
How to run into this problem? I ran into it, because I copied the app.yaml from one of my other apps that didn't need to run on a custom domain, yet needed the SSL always.
For a Django/Python GAE app, by the way, the same problem is caused like this:
handlers:
- url: /.*
  script: google.appengine.ext.django.main.app
  secure: always
Same answer here: remove or change the "secure" line. I have just tested the Python version as described. The app always works on the appspot.com domain, but on a custom domain it only works without the secure flag.
I'm just pointing out the above, as other people might run into this problem and come to this thread for help.
What I had to do was to shut down all instances and remove all versions, then do a fresh deployment from scratch, then I stopped having this problem.
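If you have several copied app.yaml files, a quick scan for the flag described above can save some hunting. The following is a rough, line-based sketch and not a real YAML parser: it assumes every handler starts with a "- url:" line as in the snippets above, and the function name and the sample handlers (including main.app) are invented for illustration.

```python
import re


def handlers_with_secure_always(app_yaml_text):
    """Return the url patterns of handlers that set `secure: always`."""
    flagged = []
    current_url = None
    for line in app_yaml_text.splitlines():
        stripped = line.strip()
        url_match = re.match(r"-\s*url:\s*(\S+)", stripped)
        if url_match:
            # A new handler entry begins here.
            current_url = url_match.group(1)
        elif current_url is not None and re.match(r"secure:\s*always\b", stripped):
            flagged.append(current_url)
    return flagged


sample = """\
handlers:
- url: /static
  static_dir: static
- url: /.*
  script: main.app
  secure: always
"""
print(handlers_with_secure_always(sample))  # ['/.*']
```

Any pattern it reports is a handler that will redirect http requests to https.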

http request from Google App Engine

I'm trying to make http requests from my Google App Engine webapp, and discovered I have to use URLConnection since it's the only whitelisted class. The corresponding Clojure library is clojure.contrib.http.agent, and my code is as follows:
(defroutes example
  (GET "/" [] (http/string (http/http-agent "http://www.example.com")))
  (route/not-found "Page not found"))
This works fine in my development environment- the browser displays the text for example.com. But when I test it out with Google's development app server:
phrygian:example wei$ dev_appserver.sh war
2010-09-28 14:53:36.120 java[43845:903] [Java CocoaComponent compatibility mode]: Enabled
...
INFO: The server is running at http://localhost:8080/
It just hangs when I load the page. No error, or anything. Any idea what might be going on?
http-agent creates threads so that might be why it does not work.
From the API documentation:
Creates (and immediately returns) an Agent representing an HTTP
request running in a new thread.
You could try http-connection, which is a wrapper around HttpURLConnection, so this should work.
Another alternative is to try clj-http. The API seems to be a bit more high-level, but it uses Apache HttpComponents which might be blacklisted.
I am guessing http.async.client is a definite no-go due to its strong asynchronous approach.
You might want to try appengine.urlfetch/fetch from appengine-clj (http://github.com/r0man/appengine-clj, also in clojars)

Google app engine: fails on deployment but works perfectly locally -- nonresponsive html form submit button

I have a small test App running on GAE under the default free quota. It runs fine locally. When deployed on GAE (appspot), some parts of it do not work. Appspot dashboard does not show any error in the logs. Added code to trap quota limits is not triggered. Why is there a variation between the App running locally, versus failing when uploaded? There are no server error screens when deployed, only non-responsive buttons. (One non-responsive html form submit button, which works fine locally.) I am using Eclipse to run the App locally and also to deploy the same identical code.
The Appspot log is below. The *.jsp script getmoreinputs.jsp is supposed to collect data from a form, upon submit, it is supposed to trigger a servlet named /Calculate. The servlet works perfectly when tested locally, but is not triggered in the live deployment.
Any help would be appreciated.
08-22 07:57PM 12.475 /getmoreinputs.jsp?cp=true&iv=true 200 23ms 16cpu_ms 0kb Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 (.NET CLR 3.5.30729),gzip(gfe)
98.236.17.99 - - [22/Aug/2010:19:57:12 -0700] "GET /getmoreinputs.jsp?cp=true&iv=true HTTP/1.1" 200 923 "http://black-scholes.appspot.com/" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 (.NET CLR 3.5.30729),gzip(gfe)" "black-scholes.appspot.com" ms=23 cpu_ms=17 api_cpu_ms=0 cpm_usd=0.000615
My test of your insight suggests: GAE is indifferent to / or // in pathname for submit=
When all else fails, try occult methods. Without any change to the code, I compiled the GAE App with a new version number. Then I went to the Appspot dashboard and deleted the old version (which was not working). Amazing! The deployment of the new version worked perfectly, exactly like the local run. There was no change to the code.
GAE has problems. When deployment != Local, my new rule is: clean out all old versions at deployment site. In my case, I had only 1 old version.
Joe
As I mentioned in my answer to your previous question:
Check your urls are not having a double slash ('//') e.g. /user//listall. This works on dev server but not when you deploy it on app engine.
What I meant by this was, even if your url may not directly show a // , after appending the url suffix to the hostname this is pretty much possible. So I suggest you to try getmoreinputs.jsp instead of /getmoreinputs.jsp (note the '/' removed).
I suspect that when this is appended internally to http://black-scholes.appspot.com/ it creates a url that looks like http://black-scholes.appspot.com//getmoreinputs.jsp, which will not work on App Engine when deployed. However, this works on the dev server locally. Please give it a try.
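Whether or not the double slash is the real cause here, it is easy to rule out by building urls with exactly one slash between the host and the path. A small sketch of the difference; join_url is a helper name made up for this example:

```python
base = "http://black-scholes.appspot.com/"

# Naive concatenation keeps both slashes.
naive = base + "/getmoreinputs.jsp"


def join_url(base_url, path):
    """Join a base URL and a path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


print(naive)                                   # http://black-scholes.appspot.com//getmoreinputs.jsp
print(join_url(base, "/getmoreinputs.jsp"))    # http://black-scholes.appspot.com/getmoreinputs.jsp
print(join_url(base, "getmoreinputs.jsp"))     # http://black-scholes.appspot.com/getmoreinputs.jsp
```

The helper gives the same result with or without the leading slash, so the form action no longer depends on how the path happens to be written.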

Resources