Replicate/synchronize backend instances across regions in Google Cloud - google-app-engine

I set up Cross-Region Load Balancing in Google Cloud and it is working fine.
How can I replicate/synchronize the backend instances so they all have the exact same content (right now each instance has its own content)? What should I search for?
P.S.: I want the behavior of a "Managed instance group" but cross-region: load-balance multiple instances in different regions, all with the same content.

There are 3 types of global load balancing (across multiple regions) on GCE:
HTTP(S) LB, working at L7 - the HTTP(S) layer.
SSL Proxy LB, working above L4 - the SSL/TLS layer.
TCP Proxy LB, working at L4 - the TCP layer.
Just use the same instance template for all the managed instance groups.
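A minimal gcloud sketch of that idea (the template name, MIG names, regions, and the `gs://` artifact bucket are all placeholders): one instance template shared by a regional managed instance group in each region, with both groups attached to the same global backend service.

```shell
# One template shared by every group, so all instances boot identical content.
gcloud compute instance-templates create web-template \
    --machine-type=e2-medium \
    --image-family=debian-11 --image-project=debian-cloud \
    --metadata=startup-script='#! /bin/bash
      # pull the same app artifact on every boot, e.g. from a GCS bucket
      gsutil -m rsync -r gs://my-app-artifacts /var/www'

# One regional MIG per region, all built from the same template.
gcloud compute instance-groups managed create web-mig-us \
    --region=us-central1 --template=web-template --size=2
gcloud compute instance-groups managed create web-mig-eu \
    --region=europe-west1 --template=web-template --size=2

# Attach both groups to the global backend service behind the HTTP(S) LB.
gcloud compute backend-services add-backend web-backend \
    --global --instance-group=web-mig-us --instance-group-region=us-central1
gcloud compute backend-services add-backend web-backend \
    --global --instance-group=web-mig-eu --instance-group-region=europe-west1
```

If the content changes after boot (uploads, databases), the template alone won't keep instances in sync; you'd also need a shared or replicated data store behind them.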

What is the best approach to set up a custom domain with a static IP for AppEngine

I have investigated the options below:
Adding a custom domain through the AppEngine settings - doesn't give a static IP; it uses Google NS.
Setting up a VM and running it as a proxy - seems convoluted, with security/maintenance overhead.
HTTPS load balancer with internet NEG - I am still investigating this one, and the docs say:
You should do this when you want to serve content from an origin that is hosted outside of Google Cloud, and you want your external HTTP(S) load balancer to be the frontend.
Any suggestions/thoughts would be greatly appreciated in choosing the right solution.
A static IP for AppEngine/Cloud Functions can be achieved with an HTTPS Load Balancer using a "Serverless Network Endpoint Group" backend.
The LB also enables multi-region serving for AppEngine and other serverless components.
Similar to an internet NEG with an HTTPS LB, a serverless NEG can be mapped to Google-managed services like Cloud Run, Cloud Functions, and AppEngine. It is also possible to map multiple AppEngine services from the same GCP project.
I was able to gain early access to serverless NEG on my project and test it on my side. I will update this post when serverless NEG is available for public access.
Edit (7/7/2020): Serverless NEG is now in Beta and available to everyone. See:
Serverless network endpoint groups overview
Setting up serverless NEGs
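Condensed from the setup guide linked above, the flow looks roughly like this (the resource names, region, and certificate name are placeholders):

```shell
# Reserve the static IP that clients will see.
gcloud compute addresses create appengine-ip --global

# Serverless NEG pointing at the App Engine app in this project.
gcloud compute network-endpoint-groups create appengine-neg \
    --region=us-central1 \
    --network-endpoint-type=serverless \
    --app-engine-app

# Backend service + URL map + HTTPS proxy + forwarding rule on the reserved IP.
gcloud compute backend-services create appengine-backend --global
gcloud compute backend-services add-backend appengine-backend \
    --global \
    --network-endpoint-group=appengine-neg \
    --network-endpoint-group-region=us-central1
gcloud compute url-maps create appengine-lb --default-service=appengine-backend
gcloud compute target-https-proxies create appengine-proxy \
    --url-map=appengine-lb --ssl-certificates=my-cert
gcloud compute forwarding-rules create appengine-fr \
    --global --address=appengine-ip \
    --target-https-proxy=appengine-proxy --ports=443
```

The reserved address is then the stable IP you point your DNS A record at, regardless of where App Engine serves from.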
As you can see in this documentation, "App Engine does not currently provide a way to map static IP addresses to an application. In order to optimize the network path between an end user and an App Engine application, end users on different ISPs or geographic locations might use different IP addresses to access the same App Engine application". So, there is no way to set a static IP address for App Engine, but you can use a pool of IP addresses. The shared link shows how to use ranges of IP addresses with App Engine, and this other link explains in a bit more detail how to do it.

Google App Engine firewall and internal access and error 403

We have two App Engine apps (flex and standard) running in separate projects, and we want project A to request project B over HTTPS at its xxx.appspot.com URL.
The firewall on both projects denies all IPs (*) and whitelists the App Engine internal addresses (10.1.0.41, 0.1.0.40, 10.0.0.1 and 0.1.0.30) as explained in the docs.
Yet we receive a "403 Forbidden" error (which disappears when the firewall is disabled).
This post is similar to mine, but the responses didn't help me.
Is there anything else I can do?
Did anyone get this to work?
Thank you in advance.
As you may already know, GCP projects represent a trust boundary within an organization. Inter-project communication between App Engine services therefore requires public IP communication, or a Shared VPC, which allows connecting networks from different projects. There should be no internal communication between App Engine services in different projects, so whitelisting the App Engine internal IP addresses will not help in this situation.
As for using public App Engine IP addresses: as illustrated in this document, App Engine hosts services on a dynamic public IP address of a Google load balancer. Because of that, the IP address can change at any time, and no static IP can be provided. For outbound requests, a large pool of IP addresses is used, which you can obtain as outlined in this document.
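If you go the public-IP route, project B's App Engine firewall can allow the caller's ranges explicitly; a sketch with gcloud (the priority, project ID, and the `203.0.113.0/24` range are placeholders, with the real ranges coming from the lookup described in the linked document):

```shell
# List the current rules on project B.
gcloud app firewall-rules list --project=project-b

# Allow the outbound range used by project A's app (placeholder range).
gcloud app firewall-rules create 100 \
    --project=project-b \
    --action=ALLOW \
    --source-range=203.0.113.0/24 \
    --description="Allow calls from project A"
```

Keep in mind the outbound pool is large and shared, so allowing it also admits other Google-hosted traffic; application-level authentication is still advisable.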

Service Fabric (On-premise) Routing to Multi-tenancy Containerized Application

I'm trying to get a proof of concept going for a multi-tenancy containerized ASP.NET MVC application in Service Fabric. The idea is that each customer would get 1+ instances of the application spread across the cluster. One thing I'm having trouble getting mapped out is routing.
Each app would be partitioned similar to this SO answer. The plan so far is to have an external load balancer route each request to the SF Reverse Proxy service.
So for instance:
tenant1.myapp.com would get routed to the reverse proxy at <SF cluster node>:19081/myapp/tenant1 (19081 is the default port for SF Reverse Proxy), tenant2.myapp.com -> <SF Cluster Node>:19081/myapp/tenant2, etc and then the proxy would route it to the correct node:port where an instance of the application is listening.
Since each application has to be mapped to a different port, the plan is for SF to dynamically assign a port when each app is created. This doesn't seem entirely scalable, since we could theoretically hit the port limit (~65k).
My questions then are, is this a valid/suggested approach? Are there better approaches? Are there things I'm missing/overlooking? I'm new to SF so any help/insight would be appreciated!
I don't think the ephemeral port limit will be an issue for you; it is likely that you will exhaust server resources (CPU + memory) well before you consume even half of those ports.
What you need is possible, but it will require a script or an application that is responsible for creating and managing the configuration of the deployed service instances.
I would not use the built-in reverse proxy; it is very limited, and for what you want it would just add extra configuration with no benefit.
At the moment I see traefik as the most suitable solution. Traefik lets you route specific domains to specific services, which is exactly what you want.
Because you will use multiple domains, it requires dynamic configuration that is not provided out of the box; this is why I suggested creating a separate application to deploy these instances. At a very high level, the steps would be:
You define your service with the traefik default rules as shown here
From your application manager, you deploy a new named service of this service for the new tenant
After the instance is deployed, you configure it to listen on a specific domain, setting the rule traefik.frontend.rule=Host:tenant1.myapp.com with the correct tenant name
You might have to add some extra configurations, but this will lead you to the right path.
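With the Service Fabric provider for traefik 1.x, the per-tenant rule from the last step is typically set as a label in the service's ServiceManifest.xml; a rough sketch (the type name, tenant host, and exact label keys should be checked against the traefik Service Fabric backend docs linked below):

```xml
<StatelessServiceType ServiceTypeName="MyAppType">
  <Extensions>
    <Extension Name="Traefik">
      <Labels xmlns="http://schemas.microsoft.com/2015/03/fabact-no-schema">
        <Label Key="traefik.enable">true</Label>
        <!-- route tenant1's host name to this named service instance -->
        <Label Key="traefik.frontend.rule.default">Host:tenant1.myapp.com</Label>
      </Labels>
    </Extension>
  </Extensions>
</StatelessServiceType>
```

Your manager application would rewrite or parameterize this label per named service instance when it deploys a new tenant.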
Regarding the cluster architecture, you could do it in many ways. To start, I would recommend keeping it simple: one FrontEnd node type containing the traefik services and another BackEnd node type for your services. From there you can decide how to plan the cluster properly; there are already many SO answers on how to define the cluster.
Please see more info on the following links:
https://blog.techfabric.io/using-traefik-reverse-proxy-for-securing-microservices-on-azure-service-fabric/
https://docs.traefik.io/configuration/backends/servicefabric/
Assuming you don't need an instance on every node, you can have up to (nodecount * 65K) services, which would make it scalable again.
Have a look at Azure API management and Traefik, which have some SF integration options. This works a lot nicer than the limited built-in reverse proxy. For example, they offer routing rules.

How to restrict public access to google app engine flexible environment?

I have many microservices in App Engine that are only for internal use. But, by default, App Engine exposes the service-project.appspot.com domain to the public, and anyone can access the services via HTTP or HTTPS.
Is there a way to restrict access only for certain IP address?
The trivial way I can think of is checking the source IP address in application code.
Or I could create a custom Docker image with an nginx configuration that checks the source IP address. But these are not very clean solutions, because access control is really independent of the application, and I don't want to hard-code static IP addresses inside the container.
I assumed there is a way to setup firewall rule for app engine, but I could not find it. Identity-Aware Proxy seems like another option, but it is not available for app engine flex.
I know this is cold comfort, but we're working on re-enabling App Engine flex support for IAP. It's going to be more than just a few days, though.
https://cloud.google.com/appengine/docs/flexible/java/migrating#users has some options that might be more palatable than hardcoding IPs. You won't be able to use GCE firewall rules because the appspot.com traffic is coming through Cloud HTTP Load Balancer, so the GCE instance firewall only sees the IP of the load balancer. If you do want to verify IPs within your app, use X-Forwarded-For as described at https://cloud.google.com/compute/docs/load-balancing/http/#components .
Hope this helps! --Matthew, Cloud IAP engineer

How Google cloud achieved scalability through virtualization?

I have a question about how Google App Engine achieves scalability through virtualization. For example, when we deploy a cloud app to Google App Engine, as the number of users of our app increases, Google will automatically spin up a new virtual server to handle the requests. At first the cloud app runs on one virtual server, and later it runs on two. Google achieved
scalability through virtualization so that any one system in the Google
infrastructure can run an application’s code—even two consecutive
requests posted to the same application may not go to the same server
Does anyone know how an application can run on two virtual servers at Google? How does it send requests to two virtual servers and synchronize data, share CPU resources, and so on?
Is there any document from Google that addresses this problem and the virtualization implementation?
This is in no way a specific answer, since we have no idea how Google does this. But I can explain how a load balancer works in Apache, which operates on a similar concept. Heck, maybe Google is using a variant of Apache load balancing. Read more here.
Basically, a simple Apache load balancing structure consists of at least 3 servers: 1 head load balancer & 2 mirrored servers. The load balancer is basically the traffic cop for outside-world traffic. Any public request made to a website that uses load balancing is actually requesting the "head" machine.
On that load balancing machine, configuration options determine which slave servers behind the scenes send content back to the load balancer for delivery. These "slave" machines are basically regular Apache web servers that are—perhaps—IP restricted to only deliver content to the main head load balancer machine.
Assuming both slave servers in a load balancing structure are 100% the same, the load balancer randomly chooses one to grab content from, & if it can grab the content in a reasonable amount of time that "slave" becomes the source. If for some reason the slave machine is slow, the load balancer decides, "Too slow, moving on!" and goes to the next machine. It basically makes a decision like that for each request.
The net result is the faster & more accessible server is what is served first. But because the content is all proxied behind the load balancer the public accesses, nobody in the outside world knows the difference.
Now let’s say the site behind a load balancer is so heavily trafficked that more servers need to be added to the cluster. No problem! Just clone the existing slave setup to as many new machines as possible, adjust the load balancer to know that these slaves exist & let it manage the proxy.
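In Apache terms, the head machine's setup is mod_proxy_balancer configuration along these lines (the host names are placeholders); adding a cloned slave is just one more BalancerMember line:

```apache
# httpd.conf on the head load balancer (mod_proxy, mod_proxy_http,
# mod_proxy_balancer and mod_lbmethod_byrequests must be loaded)
<Proxy "balancer://mycluster">
    BalancerMember "http://slave1.internal.example.com"
    BalancerMember "http://slave2.internal.example.com"
    # new clones get appended here as the cluster grows
    ProxySet lbmethod=byrequests
</Proxy>
ProxyPass        "/" "balancer://mycluster/"
ProxyPassReverse "/" "balancer://mycluster/"
```

Here lbmethod=byrequests spreads requests evenly by count; mod_proxy_balancer also offers bytraffic and bybusyness policies, which is the "decision per request" behavior described above.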
Now the hard part is really keeping all machines in sync. And that is all dependent on site needs & usage. So a DB heavy website might use MySQL mirroring for each DB on each machine. Or maybe have a completely separate DB server that itself might be mirroring & clustering to other DBs.
All that said, Google’s key to success is balancing how their load balancing infrastructure works. It’s not easy & I have no clue what they do. But I am sure the basic concepts outlined above are applied in some way.
