App Engine Standard, Serverless VPC, Cloud Memorystore giving a significant number of timeouts - google-app-engine

We configured our App Engine Standard Python 3 service to connect to Cloud Memorystore via a Serverless VPC Access connector (per the documentation and other Stack Overflow threads); the app.yaml config is included below. This all worked well until an instance went idle for a little while. Over time we saw a high volume of:
Long unexplained hangs when making calls to Memorystore, even though they eventually worked
redis.exceptions.ConnectionError: Error 110 connecting to 10.0.0.12:6379. Connection timed out.
redis.exceptions.TimeoutError: Timeout reading from socket
These happened to the point where I had to move back to App Engine Flexible, where the service runs great without any of the above problems.
My conclusion is that Serverless VPC does not handle the fact that the Redis client tries hard to keep the connection to Redis open at all times. I tried a few variations of timeout settings, but nothing helped. Has anyone successfully deployed App Engine Standard, Memorystore, and Serverless VPC together?
env_variables:
  REDISHOST: <IP>
  REDISPORT: 6379
network:
  name: "projects/<PROJECT-ID>/global/networks/default"
vpc_access_connector:
  name: "projects/<PROJECT-ID>/locations/us-central1/connectors/<VPC-NAME>"
Code used to connect to Memorystore (using redis-py):
import os
import redis

REDIS_HOST = os.environ.get("REDISHOST", "localhost")
REDIS_PORT = int(os.environ.get("REDISPORT", 6379))
REDIS_CLIENT = redis.StrictRedis(
    host=REDIS_HOST,
    port=REDIS_PORT,
    retry_on_timeout=True,       # retry the command once after a timeout
    health_check_interval=30,    # ping before use if idle for more than 30s
)
(I tried various timeout settings but couldn't find anything that helped)
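For reference, these are the kinds of socket-level knobs redis-py exposes (a sketch; the values are illustrative, and none of them fixed the issue for me):

client = redis.StrictRedis(
    host=REDIS_HOST,
    port=REDIS_PORT,
    socket_connect_timeout=5,   # seconds to wait for the TCP connect
    socket_timeout=10,          # seconds to wait for a socket read/write
    socket_keepalive=True,      # enable TCP keepalive on the connection
    retry_on_timeout=True,
    health_check_interval=30,
)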

I created a Memorystore instance and a Serverless VPC Access connector as described in the docs (https://cloud.google.com/vpc/docs/configure-serverless-vpc-access), then deployed this sample (https://github.com/GoogleCloudPlatform/python-docs-samples/tree/master/appengine/standard_python37/redis) from the Google Cloud Platform python-docs-samples repo to App Engine Standard after making some modifications:
This is my app.yaml:
runtime: python37

# Update with Redis instance details
env_variables:
  REDIS_HOST: <memorystore-ip-here>
  REDIS_PORT: 6379

# Update with Serverless VPC Access connector details
vpc_access_connector:
  name: 'projects/<project-id>/locations/<region>/connectors/<connector-name>'
I edited the code in main.py and used the same snippet you use to connect to the Memorystore instance. It ended up like this:
import os
import redis

redis_host = os.environ.get("REDIS_HOST", "localhost")
redis_port = int(os.environ.get("REDIS_PORT", 6379))
redis_password = os.environ.get("REDIS_PASSWORD")  # None if AUTH is disabled
redis_client = redis.StrictRedis(
    host=redis_host, port=redis_port,
    password=redis_password,
    retry_on_timeout=True,
    health_check_interval=30,
)
I edited requirements.txt, changing “redis==3.3.8” to “redis>=3.3.0”.
Things to note:
Make sure to use “gcloud beta app deploy” instead of “gcloud app deploy”, since the beta command is needed for the Serverless VPC Access connector to work.
Make sure that the authorized network you set on the Memorystore instance is the same one you select for the Serverless VPC Access connector.
This works as expected for me; could you please check whether it works for you?

You may try the min idle instances option, so you will always have at least one idle instance waiting to serve your traffic. Bear in mind that this may change your billing cost; Google also provides a billing calculator.
If min idle instances is set to 0, there are no warm instances available to serve traffic when requests start arriving, and this may be the reason you are seeing those exceptions.
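For illustration, a minimal app.yaml sketch (assuming automatic scaling, where min_idle_instances applies; the value is illustrative):

runtime: python37

automatic_scaling:
  min_idle_instances: 1  # keep one warm instance to absorb the first requests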

Related

DogStatsD not sending JVM runtime metrics from Google App Engine flex environment

According to Datadog, JVM metrics collection is enabled by default for Java tracer v0.29.0+:
https://docs.datadoghq.com/tracing/metrics/runtime_metrics/java/
My agent is running and trace metrics are coming in fine, but I am not seeing any data on the JVM metrics tab in the APM section.
I confirmed with the Datadog helpdesk that everything is configured correctly for a containerized environment. I was expecting the JVM metrics to show up automatically, as this doc describes:
https://docs.datadoghq.com/tracing/metrics/runtime_metrics/java/
app.yaml
env_variables:
  DD_AGENT_HOST: "our_gcp_host"
  DD_TRACE_AGENT_PORT: "80"
  DD_ENV: "dev"
  DD_SERVICE: "our_service_tag"
dd-app.yaml
service: dd-agent
runtime: custom
env: flex
env_variables:
  DD_APM_ENABLED: "true"
  DD_APM_NON_LOCAL_TRAFFIC: "true"
  DD_APM_RECEIVER_PORT: 8080  # custom port configuration
  DD_DOGSTATSD_NON_LOCAL_TRAFFIC: "true"
  DD_DOGSTATSD_PORT: 8125
network:
  forwarded_ports:
    - 8125/udp
I posted this so that I could answer the question myself. It took a few days of investigation, but we figured it out.
The solution is to deploy the agent to a Compute Engine instance. According to the colleague who figured it out, the reason is:
Despite the fact that App Engine and the docs say you can port forward, it doesn't actually make the port accessible via the DNS name, just via the instance IPs, which change as instances go up and down. We made a Compute Engine instance running the dd-agent and pointed our API at its IP.
GCP isn't honest about port forwarding in App Engine. You can port forward, but the App Engine DNS name can't be used, so you would have to use the instance IPs. It also looks like UDP load balancers may not work with App Engine, which makes the entire idea behind the port forwarding kind of pointless.
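For illustration, the application-side change then amounts to pointing the agent host at that VM (a sketch; 10.128.0.5 is a hypothetical internal IP of the dd-agent instance):

env_variables:
  DD_AGENT_HOST: "10.128.0.5"  # hypothetical internal IP of the dd-agent VM
  DD_TRACE_AGENT_PORT: "80"    # adjust to your custom receiver port (8080 in the question)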
Try it out! We saw our metrics show up immediately.

App Engine unable to route egress traffic via Cloud NAT static IP address

I am trying to send email via a client's on-prem SMTP server from App Engine Standard. For this we have created a Serverless VPC Access connector in the default network and a Cloud NAT with a static IP address for egress traffic. The client has whitelisted the static IP address and port. The following is the code snippet in App Engine:
import smtplib
import ssl
from email.message import EmailMessage

msg = EmailMessage()
# msg['Subject'], msg['From'], and msg['To'] are set elsewhere
msg.set_content('This is a HTML email')
msg.add_alternative(cleared_html_content, subtype='html')
try:
    context = ssl._create_unverified_context()
    print("starting connection")
    with smtplib.SMTP('xx.xxxx.edu', 2525) as server:
        server.starttls(context=context)
        server.send_message(msg)
    print("sent almost")
except Exception as e:
    print('Error: ', e)
The following is my app.yaml:
runtime: python37
entrypoint: gunicorn -t 120 -b :$PORT main:app
vpc_access_connector:
  name: projects/xxxxxxxxx/locations/us-central1/connectors/yyyyyyyyy
When I run my app using the App Engine URL, I get the following error in the Logs Viewer:
Error: (554, b"xxx.xxxxx.edu\nYour access to this mail system has been rejected due to the sending MTA's poor reputation. If you believe that this failure is in error, please contact the intended recipient via alternate means.")
I have also created a Cloud Function with the same code as in App Engine to test it, and surprisingly the email was sent to the intended recipient without any issue. When I checked the Cloud NAT logs, they have all the details when triggered via the Cloud Function (in short, it is using the static IP address), but there are no logs related to the App Engine trigger. So I think my App Engine traffic is not going via the static IP address, and I am not sure how to specify that in app.yaml.
There might be an issue in the email code as well, but since it works in the Cloud Function, I really suspect my app.yaml rather than the Python email code. Any help is really appreciated.
I understand that your SMTP server's IP is public. There is a caveat to know with the Serverless VPC connector.
With Cloud Functions and Cloud Run, you can choose whether only private IPs, or both public and private IPs, are routed through the Serverless VPC connector.
With App Engine, I didn't find a clear description of the egress control, but I guess that only private IPs (RFC 1918) are routed through the VPC, not public ones. So your Cloud NAT isn't used, and thus you aren't authorized on your school's SMTP server.
Edit 1:
You have three solutions:
You can create a Cloud Function (or a Cloud Run service) that your App Engine app calls when it needs to send an email (see the sketch after this list).
You can switch from App Engine to Cloud Run (use the new beta command gcloud beta run deploy --source=. --region=<REGION> --platform=managed <Service Name>). This way, you can deploy just as with App Engine; the same container builder as App Engine (Buildpacks) is used. You will have to adapt the content of the app.yaml file (share it if you need help). However, as of now, IAP isn't compatible with Cloud Run. If you want to use it, wait!
Create a VPN between your VPC and your school network. This way, you will call your SMTP server with a private IP. On the SMTP server, grant only the Serverless VPC connector range access. You then no longer need a Cloud NAT configuration.
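A minimal sketch of the first option, assuming a hypothetical Cloud Function named send-email whose egress is routed through the VPC connector (the URL and payload shape are illustrative):

import requests

def send_email_via_function(payload):
    # Hypothetical Cloud Function endpoint; its egress goes through the
    # VPC connector, so outbound SMTP traffic uses the Cloud NAT static IP.
    resp = requests.post(
        'https://us-central1-<project-id>.cloudfunctions.net/send-email',
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()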

Google App Engine Manual Scaling Prevents Restart

I have a Python App Engine service that handles API results, and it is stateful. However, it seems that after a few hours of inactivity (no requests), the server shuts off, resetting all state; when a new request is made, it starts listening again, but the state has been reset.
I want the server to remain up and unchanged 24/7 and not reset or restart, because I want to maintain state.
I have configured it as per the documentation, but it is still restarting, and I am not sure what's wrong.
Here is my app.yaml:
runtime: python37
entrypoint: python main.py
manual_scaling:
  instances: 1
In App Engine the general recommendation is to create stateless applications, as mentioned in the documentation:
Your app should be "stateless" so that nothing is stored on the instance.
As an alternative, if the application must not be restarted, you can deploy it on Compute Engine. As that service is a virtual machine, you have total control over state.
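For example, a common way to keep the app stateless is to externalize state to a store such as Memorystore, so an instance restart loses nothing (a sketch; the key naming is illustrative):

import os
import redis

# State lives in Memorystore rather than in-process memory, so an
# App Engine instance restart does not lose it.
r = redis.StrictRedis(host=os.environ['REDISHOST'],
                      port=int(os.environ.get('REDISPORT', 6379)))

def save_state(key, value):
    r.set('state:' + key, value)

def load_state(key):
    raw = r.get('state:' + key)
    return raw.decode() if raw is not None else None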

Is it possible to use a fully managed service (Cloud Run or App Engine) with a firewall in GCP?

Problem: I'm looking for an agile way to ship a Docker container (stored on GCR.IO) to a managed service on GCP:
one Docker container gcr.io/project/helloworld with private data (say, a Cloud SQL backend), so it can't face the real world.
a bunch of IPs I want to expose it to: say [ "1.2.3.4" , "2.3.4.0/24" ].
My ideal platform would be Cloud Run, but GAE also works.
I want to develop in an agile way (say, deploy with 2-3 lines of code). Is it possible to run my service secretly and yet super easily? We're not talking about a huge production project; we're talking about playing around and writing a POC you want to share securely over the internet with a few friends, making sure the rest of the world gets a 403.
What I've tried so far.
The only thing that works easily is a GCE VM with a Docker-friendly OS (like COS) where I can set up firewall rules. This works, but it's a lame Docker app on a disposable VM. The machine runs forever and dies at reboot unless I stabilize it with cron/startup scripts. It looks like I'm doing somebody else's job.
Everything else I've tried so far failed:
Cloud Run. Amazing, but I can't set up firewall rules on it, or Cloud Director, .. it seems to work only with IAP, which is painful to set up.
GAE. Works with multiple IPs, and I can't detach public IPs or firewall it. I managed to do the IP filtering within the app, but that seems a bit risky. I don't [want to] trust my coding skills :)
Cloud Armor. Only supports an HTTPS Load Balancer, which I don't have. Nor do I have MIGs to point to. I want simplicity.
Traffic Director also needs an HTTP L7 balancer. But I have a Docker container, on a single pod. Why do I need an LB?
GKE. Actually this seems to work [1], but it's not fully managed (I need to create the cluster, pods, ..)
Is this a product deficiency or am I looking at the wrong products? What's the simplest way to achieve what I want?
[1] how do I add a firewall rule to a gke service?
Please limit your question to one service. Not everyone is an expert on all Google Cloud services. You will have a better chance of a good answer for each service if they are separate questions.
In summary, if you want to use Google Cloud firewall rules to control IP-based access, you need to use a service that runs on Compute Engine, as firewall rules are part of the VPC feature set. App Engine Standard and Cloud Run do not run within your project's VPC. This leaves you with App Engine Flex, Compute Engine, and Kubernetes.
I would change strategies and use Cloud Run with authentication required. Access is controlled by Google Cloud IAM via OAuth tokens.
Cloud Run Authentication Overview
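For illustration, a minimal sketch of calling an authenticated Cloud Run service with an ID token using the google-auth library (the service URL is a placeholder):

import google.auth.transport.requests
import google.oauth2.id_token
import requests

url = 'https://my-service-xyz-uc.a.run.app'  # hypothetical Cloud Run URL

# Fetch an ID token for the service URL (works where ambient
# credentials are available, e.g. on GCP or with a service account).
auth_req = google.auth.transport.requests.Request()
token = google.oauth2.id_token.fetch_id_token(auth_req, url)

resp = requests.get(url, headers={'Authorization': 'Bearer ' + token})
print(resp.status_code)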
I agree with John Hanley's reply and have upvoted his answer.
Also, I understand that you are looking for a way to restrict access to your service through GCP.
By setting firewall rules, you can limit access to your service by specifying the allowed source IP range, so that only those addresses are allowed as source IPs.
Please review another thread on Server Fault [1] that explains how to “Restrict access to single IP only”.
https://serverfault.com/questions/901364/restrict-access-to-single-ip-only
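For example, a sketch of such a rule with gcloud (the rule name and network are illustrative; the source ranges are the ones from the question):

gcloud compute firewall-rules create allow-friends-only \
    --network=default \
    --allow=tcp:443 \
    --source-ranges=1.2.3.4/32,2.3.4.0/24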
You can do this quite easily with a Serverless NEG for Cloud Run or GAE.
If you're doing this in Terraform, you can follow this article.

How to do API calls with Google App Engine or Cloud Composer when the API only allows restricted IPs

I have jobs and APIs hosted on Cloud Composer and App Engine that work fine. However, for one of my jobs I need to call an API that is IP-restricted.
As far as I understand, there is no way to have a fixed IP for App Engine and Cloud Composer workers, and I don't know what the best solution is.
I thought about creating a GCE instance with a fixed IP that would be switched on/off by Cloud Composer or App Engine, with the API call executed by the startup script. However, that restricts this approach to asynchronous tasks and seems to add an undesired step.
I have been told that it is possible to set up a proxy, but I don't know how to do it, and I did not find comprehensive docs about it.
Would you have advice for this use case?
Thanks a lot for your help.
It's probably out of scope for you, but you could whitelist the whole App Engine IP range by performing a DNS lookup on _cloud-netblocks.googleusercontent.com (see below).
In this case you are whitelisting any App Engine application, so be sure this API has another kind of authorization and good security. More info in the App Engine KB.
What I would do is install or implement some kind of API proxy on GCE. It's a bummer to have a VM on 24/7 for this kind of task, so you could also use an autoscaler to scale to 0 (not sure about this one).
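For reference, the lookup is a standard SPF-style TXT query (a sketch):

dig -t txt _cloud-netblocks.googleusercontent.com +short
# The record lists further includes (e.g. _cloud-netblocks1.googleusercontent.com);
# query each of those to collect the ip4:/ip6: ranges to whitelist.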
As you have mentioned, you can set up a TCP or UDP proxy on GCE as a relay, and then send requests to the relay (which then forwards them to the IP-restricted host).
However, that can be somewhat brittle in some cases (and introduces a single point of failure). Therefore, another option you could consider is creating a private IP Cloud Composer environment and then using Cloud NAT for public IP connectivity. That way, all requests from Airflow within Composer will appear to originate from the IP address of the NAT gateway; see the sketch below.
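A minimal sketch of the Cloud NAT side with gcloud (the router name, NAT config name, and region are illustrative):

# Create a Cloud Router and a NAT gateway covering all subnets in the region.
gcloud compute routers create nat-router \
    --network=default --region=us-central1
gcloud compute routers nats create nat-config \
    --router=nat-router --region=us-central1 \
    --nat-all-subnet-ip-ranges \
    --auto-allocate-nat-external-ips

For an IP-whitelisting use case you would instead reserve a static external IP and pass it via --nat-external-ip-pool, so the egress address stays fixed.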
