URL redirection with webapp2 - google-app-engine

I'm developing an application with webapp2 to be deployed on Google App Engine. URLs will always be preceded by a language identifier, such as:
http://www.mydomain.com/en/foo
http://www.mydomain.com/en/bar
I would like to automatically redirect any request that doesn't start with a language identifier to the corresponding English version. For example, the following URLs should redirect to the URLs above:
http://www.mydomain.com/foo
http://www.mydomain.com/bar
Currently, I'm using webapp2_extras to set up one redirect for every possible URL, which is creating a lot of code duplication. The problem is that, to my understanding, URL redirection in webapp2 needs to be defined on a per-handler basis.
How can I go about redirecting all requests that don't match a regular expression (language identifier in my case) to the corresponding modified URL (adding en/ in my case)?

What you are looking for is middleware. Here is an example.
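The example linked from this answer is not reproduced here. As a minimal sketch of the idea, the plain WSGI middleware below redirects any path that lacks a language prefix to its /en/ equivalent. The class name LanguageRedirectMiddleware, the two-letter prefix pattern, and the 302 status are assumptions made for illustration, not part of webapp2.

```python
import re

# Assumption: a language identifier is exactly two lowercase letters ("en", "de", ...).
# Note that a path such as /js/app.js would also be treated as already prefixed.
LANG_PREFIX = re.compile(r"^/[a-z]{2}(/|$)")


class LanguageRedirectMiddleware(object):
    """Wrap a WSGI app and redirect prefix-less paths to /<default_lang>/..."""

    def __init__(self, app, default_lang="en"):
        self.app = app
        self.default_lang = default_lang

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "/")
        if LANG_PREFIX.match(path):
            # Already has a language identifier: pass through to the wrapped app.
            return self.app(environ, start_response)

        # Build the redirect target, keeping any query string.
        location = "/" + self.default_lang + path
        query = environ.get("QUERY_STRING")
        if query:
            location += "?" + query
        start_response("302 Found",
                       [("Location", location), ("Content-Length", "0")])
        return [b""]
```

Because a webapp2.WSGIApplication is itself a WSGI callable, it could presumably be wrapped as LanguageRedirectMiddleware(webapp2.WSGIApplication(routes)), so the redirect rule lives in one place rather than in every handler.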

Old question but it seems like setting routes and catching exceptions would be a good way to go for this: http://webapp-improved.appspot.com/guide/exceptions.html#exceptions-in-the-wsgi-app
Define routes for the http://www.mydomain.com/en/foo cases; any http://www.mydomain.com/foo request will then raise a 404 exception, which you can catch with an error handler that redirects to the appropriate "en" page.

Related

How to escape # character in HAProxy config?

I'm trying to modularize my front end, which is written in AngularJS. We are using HAProxy as a load balancer along with K8s.
Each ACL in the HAProxy configuration is attached to a different service in K8s, and since we are using Angular with hashbang URLs enabled, we use the hash path in the HAProxy configuration file to identify the different modules.
Below is my HAProxy configuration, which is failing because I can't escape the # in the file, even after following the HAProxy documentation.
acl login-frontend path_beg /\#/login
use_backend login-frontend if login-frontend
acl elc-frontend path_beg /\#/elc
use_backend elc-frontend if elc-frontend
I have tried escaping it as /%23/login and /'#'/admin but without success.
Any idea would be greatly appreciated.
The fragment (everything after the # character) is defined in RFC 3986:
As with any URI, use of a fragment identifier component does not
imply that a retrieval action will take place. A URI with a fragment
identifier may be used to refer to the secondary resource without any
implication that the primary resource is accessible or will ever be
accessed.
It is used only on the client side, so a client (a browser, curl, ...) does not send it with the request. For reference: Is the URL fragment identifier sent to the server?
So there is no point in routing or writing ACLs on it. HAProxy provides an escape sequence for # because you may want to match it in a body or a custom header, but you will never get it from the request line (the first line, which contains the URI).
What is really happening here is that the user requests / from HAProxy, and Angular, running in the user's browser, then parses the #/login or #/elc part to decide what to do next.
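To make this concrete, here is a small Python sketch that splits a hashbang-style URL the way a client does before sending a request. The helper name request_target and the example.com URLs are assumptions for illustration; the point is that the fragment never becomes part of the path a proxy sees.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Return the path (plus query) that a client puts on the request line."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately left out: it stays in the browser.
    return target


login = urlsplit("http://www.example.com/#/login")
print(login.path)      # the only part a path_beg ACL can inspect
print(login.fragment)  # kept client-side and parsed by Angular
```

Both http://www.example.com/#/login and http://www.example.com/#/elc yield the same request target, which is why an ACL on the path cannot tell them apart.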
I ran into a similar problem with my Ember app. For SEO purposes I split out my "marketing" pages and my "app" pages.
I then mounted my Ember application at /app and had HAProxy route requests to the backend that serviced my Ember app. A request for "anything else" (i.e. /contact-us) was routed to the backend that handled marketing pages.
/app/* -> server1 (Ember pages)
/ -> server2 (static marketing pages)
Some URLs floating around on the web still pointed to things like /#/login when they should now be /app/#/login. To handle those, I edited the index.html page served by my marketing backend and added JavaScript that parses the URL: if it detects /#/login, it forces a redirect to /app/#/login instead.
I hope that helps you figure out how to accomplish the same for your Angular app.

Semantics of dispatch.yaml

I'm looking at various pages about dispatch.yaml, most of which contain similar information and examples:
https://cloud.google.com/appengine/docs/flexible/nodejs/how-requests-are-routed#routing_with_a_dispatch_file
https://cloud.google.com/appengine/docs/python/config/dispatchref
https://cloud.google.com/appengine/docs/go/config/dispatchref
etc.
I happen to be using node.js on GAE Flexible Environment, but I think it would be the same for every language and environment.
The problem is that these pages don't really specify how dispatch.yaml works. In particular:
1. Are rules applied in the order given? I'm assuming that the first matching rule is the one used, but nothing seems to say so.
2. Do leading glob (wildcard) characters match only the domain name, or could they match the first part of the URL's path? If the rule is */hello, would that match myapp.appspot.com/path/hello? I'm guessing not, based on some vague hints in the docs, but it isn't very clear.
3. If no rule in dispatch.yaml matches the URL, will it be routed to the default service? I would think it would have to, but again, these pages don't say.
4. Do URLs get rewritten based on the rules before they're sent to the service? If the rule is */path/* and the URL is https://myapp.appspot.com/path/hello, will the service see it as /path/hello or as /hello? I'm guessing the former.
I'm doing some trial and error now, so I may be able to answer my own question soon. I'm also submitting this to Google through their documentation feedback system.
Things I know so far:
1. Yes, rules are tried in order. So for example, if you want one URL to go to a specific service, and all other URLs to go to another service, you should specify the specific one first:
dispatch:
  - url: "*/specific"
    module: specific
  - url: "*/*"
    module: general
If you put those rules in the opposite order, module specific will never be used, because the URL /specific will be caught by the wildcard rule.
2. Unknown.
3. Yes. You can test this by making a request that doesn't match any dispatch.yaml rule and watching the default service's logs.
4. No rewriting. If the rule is */path/* and the actual URL is https://myapp.appspot.com/path/hello, your service should still handle /path/hello, not /hello.
Just to fill in the blank (feel free to paste this into the accepted answer):
2. No. The path part of a rule only matches from the start of the URL's path.
I created two apps with the following resources:
default -> /abc/def/test.html -> <h1>default</h1>
other -> /abc/def/test.html -> <h1>other</h1>
And 1 route:
<dispatch>
  <url>*/def/*</url>
  <module>other</module>
</dispatch>
When I hit {app engine}/abc/def/test.html I got "default"
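The findings in this thread can be summarized as a toy model in Python. This is only a sketch of the observed behavior (first match wins, the leading glob covers the host only, unmatched URLs go to the default service, no rewriting), not App Engine's actual implementation. The function names and the rule representation are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def rule_matches(rule, host, path):
    """Check one dispatch-style rule such as "*/def/*" against host and path."""
    # Everything before the first "/" is treated as the host pattern.
    host_glob, _, path_glob = rule.partition("/")
    path_glob = "/" + path_glob
    if not fnmatchcase(host, host_glob):
        return False
    if path_glob.endswith("*"):
        # A trailing glob means "any path starting with this prefix".
        return path.startswith(path_glob[:-1])
    return path == path_glob


def dispatch(rules, host, path, default="default"):
    """Return the service of the first matching rule, else the default."""
    for rule, service in rules:
        if rule_matches(rule, host, path):
            return service
    return default


ordered = [("*/specific", "specific"), ("*/*", "general")]
reversed_rules = [("*/*", "general"), ("*/specific", "specific")]
```

In this model, dispatch(ordered, "myapp.appspot.com", "/specific") picks specific, while the same call with reversed_rules picks general, and a lone */def/* rule leaves /abc/def/test.html with the default service, matching the experiment above.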

Using sw-precache with client-side URL routes for a single page app

How would one configure sw-precache to serve index.html for multiple dynamic routes?
This is for an Angular app that has index.html as the entry point. The current setup allows the app to be accessible offline only through /. So if a user goes to /articles/list/popular as an entry point while offline, they won't be able to browse it and will be shown a "you're offline" message (although when online they'd be served the same index.html file on all requests as an entry point).
Can dynamicUrlToDependencies be used to do this? Or does this need to be handled by writing a separate SW script? Would something like the following do?
function serveIndexCacheFirst() {
  var request = new Request(INDEX_URL);
  return toolbox.cacheFirst(request);
}
toolbox.router.get(
  '(/articles/list/.+)|(/profiles/.+)(other-patterns)',
  serveIndexCacheFirst);
You can use sw-precache for this, without having to configure runtime caching via sw-toolbox or rolling your own solution.
The navigateFallback option allows you to specify a URL to be used as a "fallback" whenever you navigate to a URL that doesn't exist in the cache. It's the service worker equivalent to configuring a single entry point URL in your HTTP server, with a wildcard that routes all requests to that URL. This is obviously common with single-page apps.
There's also a navigateFallbackWhitelist option, which allows you to restrict the navigateFallback behavior to navigation requests that match one or more URL patterns. It's useful if you have a small set of known routes, and only want those to trigger the navigation fallback.
There's an example of those options in use as part of the app-shell-demo that's included with sw-precache.
In your specific setup, you might want:
{
  navigateFallback: '/index.html',
  // If you know that all valid client-side routes will begin with /articles
  navigateFallbackWhitelist: [/^\/articles/],
  // Additional options
}
Yes, I think you can use dynamicUrlToDependencies, as mentioned in the documentation of the directoryIndex option: https://github.com/GoogleChrome/sw-precache#directoryindex-string.

Does LinkedIn Share API support escaped fragment URLs (hashbang URLs)?

Using the customized URL method is it possible to share a URL that implements the Escaped Fragment protocol?
For example a url in the following format:
https://www.example.com/#!/my-angularjs-page
So far from my experience, LinkedIn always removes the hashtag thus only retaining the domain name part of the URL:
https://www.example.com/
And therefore only the homepage is being shared.
References:
https://developer.linkedin.com/docs/share-on-linkedin
As of March 2016, the answer is no: the LinkedIn share API's customized URL method doesn't support hashbang URLs.
Whenever you want to use #, &, ?, or other reserved characters in the value of a GET parameter in a URL, you need to escape them. This is called URL-encoding. The standard LinkedIn format is:
https://www.linkedin.com/sharing/share-offsite/?url={url}
So, for your site, notice how these special characters are encoded:
https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fwww.example.com%2F%23!%2Fmy-angularjs-page
# becomes %23. After all, how is the browser supposed to know whether the # belongs to the URL you are sharing or to the LinkedIn page itself? It needs to be able to distinguish between the two, and encoding is how it does that.
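As a small sketch of that encoding step, Python's standard urllib.parse.quote can build the share link. The helper name linkedin_share_url is an assumption for illustration; the endpoint is the one quoted in this answer.

```python
from urllib.parse import quote

SHARE_ENDPOINT = "https://www.linkedin.com/sharing/share-offsite/?url="


def linkedin_share_url(page_url):
    """Percent-encode page_url so it survives as a single query parameter."""
    # safe="" makes quote() escape every reserved character, including / : and #.
    return SHARE_ENDPOINT + quote(page_url, safe="")


share_url = linkedin_share_url("https://www.example.com/#!/my-angularjs-page")
print(share_url)
```

With safe="" the ! is also escaped (as %21), which differs from the hand-encoded URL above but decodes to the same character.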
More Info: Official LinkedIn Share Documentation

Google App Engine optional slash redirect

I have a java app running on Google App Engine... I'd like to make the trailing slash optional for directories... so navigating to www.domain.com/test and www.domain.com/test/ would yield the same thing.
How do I achieve that?
I know about the app.yaml configuration file, but I am running a Java app, not Python.
See this post. It works for me, though it looks like a hack. I think it is worth filing an issue with Google, as the servlet specification requires adding trailing slashes when attempting to find a proper welcome file.
The easiest way to do this would be to create a filter that intercepts requests and appends the forward slash if necessary. Generally it's better to send a redirect rather than serving the same content, so you don't end up with two canonical URLs for everything, and all your contents indexed twice.
What constitutes a 'directory' depends on your application, of course, and there's no hard and fast rule for figuring that out.
