How does Google Cloud achieve scalability through virtualization? - google-app-engine

I have a question about how Google App Engine achieves scalability through virtualization. For example, when we deploy a cloud app to Google App Engine and the number of users of the app increases, I think Google will automatically start a new virtual server to handle the requests. At first the cloud app runs on one virtual server, and now it runs on two virtual servers. Google achieved
scalability through virtualization so that any one system in the Google
infrastructure can run an application's code—even two consecutive
requests posted to the same application may not go to the same server.
Does anyone know how an application can run on two virtual servers at Google? How are requests sent to the two virtual servers, and how are data, CPU resources, and so on synchronized between them?
Is there any document from Google that addresses this problem and how the virtualization is implemented?

This is in no way a specific answer, since we have no idea how Google does this. But I can explain how a load balancer works in Apache, which operates on a similar concept. Heck, maybe Google is using a variant of Apache load balancing. Read more here.
Basically, a simple Apache load balancing structure consists of at least 3 servers: 1 head load balancer & 2 mirrored servers. The load balancer is basically the traffic cop for outside-world traffic. Any public request made to a website that uses load balancing will actually be requesting the “head” machine.
On that load balancing machine, configuration options basically determine which slave servers behind the scenes send content back to the load balancer for delivery. These “slave” machines are basically regular Apache web servers that are—perhaps—IP restricted to only deliver content to the main head load balancer machine.
So assume both slave servers in a load balancing structure are 100% the same. The load balancer will randomly choose one to grab content from, & if it can grab the content in a reasonable amount of time, that “slave” now becomes the source. If for some reason the slave machine is slow, the load balancer decides, “Too slow, moving on!” and goes to the next machine. It basically makes a decision like that for each request.
The net result is that the faster & more accessible server is the one that gets served from first. But because all the content is proxied behind the load balancer the public accesses, nobody in the outside world knows the difference.
Now let’s say the site behind a load balancer is so heavily trafficked that more servers need to be added to the cluster. No problem! Just clone the existing slave setup to as many new machines as possible, adjust the load balancer to know that these slaves exist & let it manage the proxy.
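To make that concrete, here is a minimal sketch of what such a setup might look like with Apache's mod_proxy_balancer. The member addresses are made up, and it assumes the relevant proxy modules are already enabled:

    # Pool of mirrored "slave" servers (addresses are hypothetical)
    <Proxy "balancer://mycluster">
        BalancerMember "http://10.0.0.2:80"
        BalancerMember "http://10.0.0.3:80"
        # Add more BalancerMember lines here as the cluster grows
        ProxySet lbmethod=byrequests
    </Proxy>

    # All public traffic hitting the "head" machine is proxied to the pool
    ProxyPass        "/" "balancer://mycluster/"
    ProxyPassReverse "/" "balancer://mycluster/"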
Now the hard part is really keeping all machines in sync. And that is all dependent on site needs & usage. So a DB heavy website might use MySQL mirroring for each DB on each machine. Or maybe have a completely separate DB server that itself might be mirroring & clustering to other DBs.
All that said, Google’s key to success is how their load balancing infrastructure works. It’s not easy & I have no clue what they do. But I am sure the basic concepts outlined above are applied in some way.

Related

Service Fabric (On-premise) Routing to Multi-tenancy Containerized Application

I'm trying to get a proof of concept going for a multi-tenancy containerized ASP.NET MVC application in Service Fabric. The idea is that each customer would get 1+ instances of the application spread across the cluster. One thing I'm having trouble getting mapped out is routing.
Each app would be partitioned similar to this SO answer. The plan so far is to have an external load balancer route each request to the SF Reverse Proxy service.
So for instance:
tenant1.myapp.com would get routed to the reverse proxy at <SF cluster node>:19081/myapp/tenant1 (19081 is the default port for SF Reverse Proxy), tenant2.myapp.com -> <SF Cluster Node>:19081/myapp/tenant2, etc and then the proxy would route it to the correct node:port where an instance of the application is listening.
Since each application has to be mapped to a different port, the plan is for SF to dynamically assign a port on creation of each app. This doesn't seem entirely scalable since we could theoretically hit a port limit (~65k).
My questions then are, is this a valid/suggested approach? Are there better approaches? Are there things I'm missing/overlooking? I'm new to SF so any help/insight would be appreciated!
I don't think the ephemeral port limit will be an issue for you; it is likely that you will consume all server resources (CPU + memory) even before you consume half of those ports.
Doing what you need is possible, but it will require you to create a script or an application responsible for creating and managing the configuration of the deployed service instances.
I would not use the built-in reverse proxy; it is very limited, and for what you want it would just add extra configuration with no benefit.
At the moment I see Traefik as the most suitable solution. Traefik enables you to route specific domains to specific services, which is exactly what you want.
Because you will use multiple domains, it will require dynamic configuration that is not provided out of the box; this is why I suggested creating a separate application to deploy these instances. The very high level steps would be:
You define your service with the traefik default rules as shown here
From your application manager, you deploy a new named instance of this service for the new tenant
After the instance is deployed you configure it to listen on a specific domain, setting the rule traefik.frontend.rule=Host:tenant1.myapp.com to the correct tenant name (a rough sketch follows these steps)
You might have to add some extra configurations, but this will lead you to the right path.
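As a very rough, hedged sketch, that label could live in the service's ServiceManifest.xml along these lines; the exact extension schema and label keys may differ between Traefik versions, so treat this as illustrative and check the Traefik Service Fabric docs linked below:

    <ServiceTypes>
      <StatelessServiceType ServiceTypeName="MyAppType" UseImplicitHost="true">
        <Extensions>
          <Extension Name="Traefik">
            <!-- Hypothetical label: route the tenant's host name to this service -->
            <Labels xmlns="http://schemas.microsoft.com/2015/03/fabact-no-schema">
              <Label Key="traefik.frontend.rule">Host:tenant1.myapp.com</Label>
            </Labels>
          </Extension>
        </Extensions>
      </StatelessServiceType>
    </ServiceTypes>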
Regarding the cluster architecture, you could do it in many ways. To start, I would recommend keeping it simple: one FrontEnd node type containing the Traefik services and another BackEnd node type for your services. From there you can decide how to plan the cluster properly; there are already many SO answers on how to define the cluster.
Please see more info on the following links:
https://blog.techfabric.io/using-traefik-reverse-proxy-for-securing-microservices-on-azure-service-fabric/
https://docs.traefik.io/configuration/backends/servicefabric/
Assuming you don't need an instance on every node, you can have up to (nodecount * 65K) services, which would make it scalable again.
Have a look at Azure API Management and Traefik, which have some SF integration options. These work a lot more nicely than the limited built-in reverse proxy. For example, they offer routing rules.

With AngularJS based single page apps hosted on premise, how to connect to AWS cloud servers

Maybe this is a really basic question, but how do you architect your system such that your single page application is hosted on premise with some hostname, say mydogs.com, but your application services code (as well as the database) is hosted in the cloud? For example, let's say you spin up an Amazon EC2 Container Service cluster using Docker and it is running a NodeJS server. The hostnames will all be like ec2_some_id.amazon.com. What system sits in front of the Amazon EC2 instance that my AngularJS app connects to? What architecture facilitates this type of app, especially with AWS-based services?
One of the important aspects of setting up the web application and the backend is to serve them from a single domain, avoiding cross-origin requests (CORS). To do this, you can use AWS CloudFront as a proxy, where routing happens based on URL paths.
For example, you can point the root domain to index.html while routing /api/* requests to the backend endpoint running on EC2.
Also, it's important for your Angular application to use full URL paths. One of the challenges with these is that for routes such as /home, /about, etc., the browser will request a page from the backend for that particular path. Since it's a single page application, you won't have server pages for /home, /about, and so on. This is where you can set up error pages in CloudFront so that all the not-found routes are also forwarded to index.html (which serves the AngularJS app).
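Here is a rough sketch of that CloudFront setup in CloudFormation-style YAML; all names and domains are placeholders, not actual values:

    Resources:
      SpaDistribution:
        Type: AWS::CloudFront::Distribution
        Properties:
          DistributionConfig:
            Enabled: true
            DefaultRootObject: index.html
            Origins:
              - Id: spa-origin                    # serves the AngularJS bundle
                DomainName: origin.mydogs.com     # placeholder on-premise host
                CustomOriginConfig:
                  OriginProtocolPolicy: https-only
              - Id: api-origin                    # NodeJS backend on EC2/ECS
                DomainName: api.example.com       # placeholder EC2/ELB hostname
                CustomOriginConfig:
                  OriginProtocolPolicy: https-only
            DefaultCacheBehavior:                 # everything else -> the SPA
              TargetOriginId: spa-origin
              ViewerProtocolPolicy: redirect-to-https
              ForwardedValues:
                QueryString: false
            CacheBehaviors:
              - PathPattern: /api/*               # API calls -> the backend
                TargetOriginId: api-origin
                ViewerProtocolPolicy: redirect-to-https
                ForwardedValues:
                  QueryString: true
            CustomErrorResponses:
              - ErrorCode: 404                    # unknown routes such as /home, /about
                ResponseCode: 200
                ResponsePagePath: /index.html     # hand back the SPA shell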
The only thing you need to care about is the CORS on whatever server you use to host your backend in AWS.
More Doc on CORS:
https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
Hope it helps.
A good approach is to have two separate instances: one instance to serve your API (Application Program Interface) and another one to serve your SPA (Single Page Application).
For the API server you may want a more robust service, because it's the one that will suffer the most, receiving tons of requests from all client instances, so this one needs more performance, bandwidth, etc. In addition, you probably want your API server to be scalable when needed (depending on the load on it); maybe not, but it is something to keep in mind if your application is supposed to grow fast. So you may invest a little bit more in this one.
The SPA server, on the other hand, is the one that will only serve static resources (if you're not using server-side rendering), so this one is supposed to be cheaper (if not free). Furthermore, all it does is serve the application resources once; the application actually runs on the client, and most files will end up being cached by the browser. So you don't need to invest much in this one.
Anyhow, your question about which service fits this type of application best can't really be answered, because it depends on requirements you haven't defined in terms of how your application will be consumed by clients: how many requests, how much download traffic, and how much storage your app needs.
Amazon EC2 instance types

Does Google App Engine provide any way to test application performance against a high amount of traffic?

I have deployed an application to Google App Engine which will be consumed by millions of users.
I want to test the application against a high amount of traffic before going live, just to make sure I have provided the correct configuration that supports automatic load balancing and scalability.
Going through the Google documentation, App Engine should handle all of this headache, but I have to be 100% sure.
Is there anything I should keep in mind before going live (database connections, other resources in cloud storage, etc.)?
Thanks.
You should look into making sure you are using the Cloud SQL instance effectively. For example, how many total connections do you expect to have from App Engine to MySQL?
There's ultimately a limit on the number of concurrent connections that a MySQL server can handle. You would want to make sure your application is designed such that you are reusing connections when possible.
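A minimal sketch of that connection reuse, assuming a Python App Engine service, SQLAlchemy, and the standard /cloudsql unix-socket path (the environment variable names are made up):

    # Module-level engine: created once per instance and reused across requests,
    # so each instance holds a small, bounded pool of MySQL connections.
    import os
    import sqlalchemy

    db = sqlalchemy.create_engine(
        "mysql+pymysql://{user}:{password}@/{database}?unix_socket=/cloudsql/{instance}".format(
            user=os.environ["DB_USER"],                       # assumed env vars
            password=os.environ["DB_PASS"],
            database=os.environ["DB_NAME"],
            instance=os.environ["CLOUDSQL_CONNECTION_NAME"],  # project:region:instance
        ),
        pool_size=5,        # keep total app connections well below MySQL's limit
        max_overflow=2,
        pool_recycle=1800,  # drop idle connections before the server does
    )

    def handle_request():
        # Check a connection out of the shared pool and return it when done
        with db.connect() as conn:
            return conn.execute(sqlalchemy.text("SELECT 1")).scalar()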
I would recommend performing a load test to determine the limits of your application and its dependencies.

How does Google Container Engine's load balancer interact with replication controllers?

I've read through the Google Container Engine and load balancer docs.
What I've understood:
I can encapsulate common pods into one replication controller
I can add network load balancing very easily to my container engine app
Given that I have various services sitting behind an nginx reverse proxy.
What I want to say is: this is my set of services behind nginx. Please push them to each node and connect nginx to the load balancer. Hence, when one node fails, the others can keep serving, and so on.
Questions:
Do I understand the idea of load balancers and replication controllers in that context correctly?
If yes, do I assume right that only the frontend parts of the application go into the replication controller while non-replica services (such as the postgres database or the redis cache) are pushed into a service?
How would I set that up? I cannot find the point in the docs where a load balancer is actually connected to my container entrypoints.
In general I'm a bit new to the concepts and may struggle with basics.
I don't really get the idea behind your first question. This is how I see it:
The load balancer is just used to open your service to the outside, i.e. to assign a public IP to your nginx pods in your case. Here is how this works (also the answer to your question 3): https://cloud.google.com/container-engine/docs/load-balancer
The replication controller is used to make sure you always have the right number of pods running for the associated pod template; it is the best way to make any pod run indefinitely. To answer your question 2, I would make sure that all your pods run with an associated replication controller, postgres db and redis included, just to be certain that you always have an instance of them running.
The service makes it easy to communicate between “internal” pods, and also provides some internal load balancing if the associated pod is replicated. In your case, your db and redis pods (each controlled by a replication controller) will indeed also have to be exposed through a service to make them available to your nginx pods.
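A minimal sketch of that wiring for the nginx tier (names and image are placeholders): a replication controller keeping three nginx pods alive, plus a service of type LoadBalancer that gives them a public IP:

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: nginx-rc
    spec:
      replicas: 3                  # always keep three nginx pods running
      selector:
        app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx           # placeholder image
            ports:
            - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-lb
    spec:
      type: LoadBalancer           # asks GKE for an external network load balancer
      selector:
        app: nginx                 # routes to whichever nginx pods are healthy
      ports:
      - port: 80
        targetPort: 80

The postgres and redis pods would get the same replication-controller-plus-service pairing, but with a plain internal service so they are only reachable from inside the cluster.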
If you want an example of a complete “stack” running, I would suggest this link: https://cloud.google.com/container-engine/docs/tutorials/guestbook. Even though it is quite simple, you can get the main ideas behind all this from it.

Google App Engine internal network

Is it possible to route HTTP traffic between google app engine applications without going through the public internet?
For example, if I'm running a Web Service API on one application and want to build a second application on top of it without traffic going through the internet - for performance reasons.
Between separate apps running on different domains? I suspect not.
But you can use backends to do different work behind the scenes:
Backends are special App Engine instances that have no request deadlines, higher memory and CPU limits, and persistent state across requests. They are started automatically by App Engine and can run continuously for long periods. Each backend instance has a unique URL to use for requests, and you can load-balance requests across multiple instances.
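A rough sketch of the legacy backends.yaml that declares such a backend; the name and class are arbitrary, and the exact option syntax is from memory, so see the "Backends in Python" link below:

    # backends.yaml - declared alongside app.yaml
    backends:
    - name: worker        # addressable at its own URL, e.g. http://worker.<app-id>.appspot.com
      class: B4           # more memory/CPU than a normal frontend instance
      instances: 2        # number of backend instances to run
      options: dynamic    # start on demand instead of running resident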
When I look at the logs between the backend and the front end instances I see IPs like
0.1.0.3
So yes, those communication paths are internal. I'd hazard a guess that, since so much of the internet is Google, you could say requests between different apps might not travel on the public internet.
Logs indicate low-latency communication between the front and back ends, though not under any particular load. Your mileage may vary.
Backends in Python
