Stopping spot instance temporarily - amazon-ec2-spot-market

This is about this recent announcement:
https://aws.amazon.com/about-aws/whats-new/2020/01/amazon-ec2-spot-instances-stopped-started-similar-to-on-demand-instances/
But I can see that a similar feature was introduced in 2017:
https://aws.amazon.com/about-aws/whats-new/2017/09/amazon-ec2-spot-can-now-stop-and-start-your-spot-instances/
1) I would like to know the difference between the two announcements.
2) When I tried to stop my spot instance, I got this error:
Error stopping instances
You can't stop the Spot Instance 'i-0f298e1710169xxxx' because it is in a fleet, which does not support stop
I would like to stop the instance to save cost.

When capacity is no longer available at your preferred price, the instance is terminated by default. The 2017 announcement lets you set the interruption behavior so that an interrupted instance is stopped instead of terminated.
This preserves the attached volumes, and when capacity becomes available again at your preferred price, the same instance is started back up rather than a new instance being spun up.
The 2020 announcement allows you to manually stop and start Spot Instances at any time, not just when they are about to be interrupted for lack of capacity. This feature is available for persistent Spot requests.
AWS Spot Fleet takes control of the instances' lifetime and manages them automatically, so these are not persistent Spot requests, and Spot Instances in a fleet cannot be stopped manually.
To minimize costs you can configure the Spot Fleet's max cost, On-Demand instance types, allocation strategy, etc.
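As a rough illustration of the persistent-request case described above, the sketch below builds the arguments for a persistent Spot request whose interruption behavior is stop. The parameter names follow boto3's EC2 `request_spot_instances` call; the helper name, AMI ID, and instance type are placeholders, and the API call itself is left as a comment because it needs AWS credentials.

```python
def build_persistent_spot_request(image_id, instance_type):
    """Return keyword arguments for a persistent Spot request (illustrative helper)."""
    return {
        "InstanceCount": 1,
        # "persistent" keeps the request open, unlike "one-time".
        "Type": "persistent",
        # Stop (rather than terminate) the instance when it is interrupted.
        "InstanceInterruptionBehavior": "stop",
        "LaunchSpecification": {
            "ImageId": image_id,
            "InstanceType": instance_type,
        },
    }


# Placeholder values; substitute your own AMI and instance type.
params = build_persistent_spot_request("ami-xxxxxxxx", "t3.micro")

# With credentials configured, the request could be submitted like this:
#   import boto3
#   boto3.client("ec2").request_spot_instances(**params)
```

An instance that belongs to a Spot Fleet is not covered by this sketch; as noted above, the fleet manages its instances itself.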

The error you got (point 2 of your question) may be solved with the following.
The article states:
This feature (Spot service stop instead of terminate your Amazon EBS-backed Spot Instances) is available for persistent Spot requests and Spot Fleets with the maintain fleet option enabled.

Related

How to prevent downtime in App Engine Flex when instances are automatically restarted

Situation
custom runtime (Docker/Node) on App Engine Flex
manually scaled to 1 single instance as we manage the resources ourselves (2 cpu / 6 gb ram)
liveness and readiness checks are configured
as expected, vm instances are automatically restarted on a weekly basis to apply OS / system updates
this is visible in the Activity pane of the Google Cloud Console
Stackdriver logs confirm this activity (e.g. shutdown-script: INFO Starting shutdown scripts. and startup-script: INFO Starting startup scripts.)
no instance is available during these restarts, resulting in 503 errors when visiting the application running on the instance
Goal
to have some control over the number of instances to prevent downtime
e.g. temporarily scale to 2 instances while 1 instance is restarting
keeping control of the available resources (cpu / ram)
Question
We've considered simply having 2 instances available at all times, but are worried both would be restarted at the same time since they are part of the same instance group.
What would allow us to keep everything up and running while still controlling the number of instances / resources used?
I have a flex app with two instances running for similar reasons. For me, an instance will occasionally exceed memory limits and need to be restarted. Since I have a second instance, there should always be an instance available.
I hadn't considered the Google updates to my instances. I just checked my recent history, and Google restarted my two instances yesterday. The restarts were 7 minutes apart so, at least in this example, my users always had an instance available to them.
I suspect that Google does not simultaneously restart all of your instances. This would create a brief period of downtime for all flex customers, and nobody wants downtime for a cloud service.
UPDATE:
This is a guess, but I expect that when Google updates a flex instance, it will create a new instance and only shut down the old instance after the new instance is available. At least, if I were running Google, that is how I would do it. That way you have 100% uptime and you will very briefly have an extra instance running. This would even work with a single flex instance.
Maybe you should try automatic scaling, as shown here: Scaling instances.
This allows your application to automatically create instances based on request rate, response latencies, and other application metrics. When one of your instances gets shut down, another instance could be created to "cover" for the missing one. Thus, your service won't get interrupted.

Frequent restarts on GAE application flex environment

I have a GAE application that's set up as a flexible instance, which is expected to be restarted on a weekly basis (and a continually unhealthy instance can be restarted): https://cloud.google.com/appengine/docs/flexible/java/how-instances-are-managed
However, we're seeing this restart ("npm run build" command) several times per week! For example in the past three weeks we've had 9 restarts, and I've confirmed that the log entries leading up are successful 200 responses (no sign of trouble)- all for the active version serving traffic (and not for the other versions that are stopped).
Has anyone seen this symptom before or know of something else that can cause frequent restarts?
Let me know if any other info would be helpful.
An instance restart in the Google App Engine flexible environment can occur for several reasons:
According to the GAE documentation, there is no guarantee that an instance runs indefinitely; it can be restarted due to hardware maintenance, software updates or unforeseen issues. Besides that, as you stated, all instances are restarted on a weekly basis.
An instance can also be restarted if it fails to respond to a specified number of consecutive health check requests.
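To make the "consecutive" part of that rule concrete, here is a small sketch of how such a counter behaves. It illustrates the rule only and is not App Engine's actual implementation; the function name and the threshold value are hypothetical.

```python
def needs_restart(check_results, failure_threshold):
    """Return True once `failure_threshold` health checks in a row have failed."""
    consecutive_failures = 0
    for passed in check_results:
        if passed:
            # Any successful check resets the streak.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return True
    return False


# Three failures in total, but never three in a row.
scattered = needs_restart([True, False, False, True, False], failure_threshold=3)
# Three failures back to back.
streak = needs_restart([True, False, False, False], failure_threshold=3)
```

Under this rule, occasional failed checks separated by a success would not trigger a restart; only an unbroken run of failures would.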
If you observe an unusual number of restarts, I recommend opening a ticket with Google Cloud Platform Support. They have internal tools that can check what is going on in the instance and figure out why the restarts are happening.
#DianeKaplan's comment:
Contacting GCP support has given me a few helpful nuggets so far:
The automatic weekly restart of an instance due to maintenance can occur around different times (so it may only be 5 days since the last one, for example)
our deployments (which result in new GAE versions) make Google Builds
In some cases, a VM was being created overnight and then immediately deleted, where it didn't look like autoscaling was needed. Still looking into this, but was pointed towards the Google Cloud Console section Home > Activity as a good place to find clues

Do App Engine Flexible Environment VM instance restarts take advantage of automatic scaling?

I'm developing my first App Engine Flexible Environment application.
The docs explain that virtual machines are restarted weekly:
VM instances are restarted on a weekly basis. During restarts
Google's management services will apply any necessary operating system
and security updates.
Will restarts result in downtime for apps with automatic scaling enabled? If so, are there any steps I can take to avoid downtime?
For example, I could frequently migrate traffic to new instances so that no instance runs for more than one week.
Well, I later checked with the Google support team, and here is their recommendation for avoiding the downtime.
My questions are:
The weekly update is not fixed in time. Is there a time range in which I should expect the instances to reboot (e.g. every Friday during the night)?
Does the weekly update involve all instances, regardless of when they were created (e.g. will an instance created 1 hour or 1 day before the weekly update be restarted)?
How are we supposed to handle such a problem? It returns 502 for all requests in the meantime.
1.- At this moment there is no way to know when the weekly restart is going to happen. GCP determines when it is necessary and restarts certain instances (once per week).
2.- No, as long as you have more than one instance running you won't see all of them being restarted at the same time.
3.- To avoid downtime due to weekly restarts, we recommend setting the minimum number of instances above 1. Try to set at least 2 instances as a minimum.
I hope this information is useful to others.
The answer to your question is in the docs:
App Engine attempts to keep manual scaling instances running indefinitely, but there is no uptime guarantee. Hardware or software failures that cause early termination or frequent restarts can occur without warning and can take considerable time to resolve. Your application should be able to handle such failures.
Here are some good strategies for avoiding downtime due to instance restarts:
Use load balancing across multiple instances.
Configure more instances than required to handle normal traffic.
Write fall-back logic that uses cached results when a manual scaling instance is unavailable.
Reduce the amount of time it takes for your instances to start up and shutdown.
Duplicate the state information across more than one instance.
For long-running computations, checkpoint the state from time to time so you can resume it if it doesn't complete.
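The fall-back strategy in that list can be sketched roughly as follows. This is a minimal illustration, assuming the caller holds a plain in-memory dict as its cache and that an unavailable instance surfaces as a `ConnectionError`; the function and backend names are hypothetical.

```python
def fetch_with_fallback(fetch, cache, key):
    """Call fetch(key); if the backend is unavailable, serve the last cached value.

    Returns a (value, is_stale) pair.
    """
    try:
        value = fetch(key)
    except ConnectionError:
        if key in cache:
            return cache[key], True  # stale result served from the cache
        raise  # nothing cached yet, so surface the failure
    cache[key] = value  # remember the latest good result
    return value, False


cache = {}


def healthy_backend(key):
    return f"fresh:{key}"


def restarting_backend(key):
    raise ConnectionError("instance unavailable")


first = fetch_with_fallback(healthy_backend, cache, "report")
second = fetch_with_fallback(restarting_backend, cache, "report")
```

Here the second call is answered from the cache while the stand-in backend is "restarting". A real application would also need to decide how old a cached result may be before it is no longer acceptable.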

Why does Google shut down my resident instance even with minimum idle instances set to 1

I've just started playing with GAE. Today I noticed that GAE shuts down my resident instance even though minimum idle instances is set to 1, which causes a cold start for the next request.
Here are the settings:
1. one simple frontend app, no other stuff
2. minimum idle instances set to 1
3. billing enabled and no charges, which means the shutdown is not because of a budget issue
4. an outside java process making a simple request every hour
From the instance chart in the admin console, it's obvious that at the -1.5 hour and -0.5 hour time points, GAE spawned another dynamic instance to serve the outside request or something, and shut down both the resident and dynamic instances after 15 minutes. The zero-instance situation remained for another 15 minutes until a resident instance was created again.
Has anyone had similar issues, or any ideas? Thanks.
Yes, that has happened to us all. Resident instances do not shut down unless manually forced to. From the GAE Console:
Idle Instances (another way of referring to resident instances) are pre-loaded with your application code, so when a new instance is needed, it can serve traffic immediately, thus avoiding high latency during load spikes.
Resident instances start serving pages; if one becomes too busy, it becomes a dynamic instance and another resident instance is started in its place. For that reason you will sometimes see a resident instance that is younger than its dynamic counterparts.

What is a Google App Engine instance?

I am trying to estimate the monthly costs of using GAE for an in-app store, and I do not really understand what an instance is or what I can do within one instance.
Can I just have one instance with multiple threads to deal with multiple clients? And as I have 28 hours of free instance per app per day (http://cloud.google.com/pricing/), does it mean that I would not pay for my server app running all the time?
An instance is an instance of a virtual server, running your code, that is able to serve requests to clients. This is usually done in parallel (Goroutines, Java threads, Python threads with 2.7) for most efficient usage of available resources.
Response times depend on what you're doing in your code, and they are usually IO dependent. A waterfall of serial database lookups takes longer than a single multiget and perhaps an async write.
Part of the deal with GAE is that Google handles the elasticity for you. If there are a lot of connections waiting, new instances will start as needed (until your quota is exhausted). That means it can be difficult to estimate cost upfront, because you don't know exactly how efficient your code is and how many resources you'll need. I recommend a scheme where more usage means more income, and income per request is higher than cost per request. :)
You can tweak settings, saying you want requests to wait in queue, or always have a couple of spare instances ready to serve new requests, which will affect cost for you and response times for users.
In an IaaS scenario you could say that you will use five instances and that's the cost, but in reality you might need only 1 at night local time, and 25 the rest of the day, which means your users would most likely see dropped connections or otherwise have a negative user experience.
A free instance is normally able to handle test traffic during development without exhausting the quota.
Well, App Engine may decide you need more than one instance running to handle the requests, and so it will start another one. You won't be able to limit it to one running instance. In fact, it's sometimes unclear why App Engine starts another instance when request volume seems low, but it will if it decides it needs another warm instance ready to handle requests because the serving instance(s) are too near their limit.
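The arithmetic behind the free-quota question can be sketched as below. The 28 free instance hours per day is the figure quoted in the question and may no longer be current; the sketch also assumes every instance runs for the whole day and ignores any other billed resources. The function name is hypothetical.

```python
def billable_instance_hours(instance_count, hours_per_day=24.0, free_hours_per_day=28.0):
    """Instance hours per day that exceed the free daily quota."""
    used = instance_count * hours_per_day
    return max(0.0, used - free_hours_per_day)


one_instance = billable_instance_hours(1)   # 24 hours used, within the quota
two_instances = billable_instance_hours(2)  # 48 hours used, 20 above the quota
```

So a single instance running around the clock would stay inside the quoted free hours, but as soon as a second instance runs for most of the day, part of its time becomes billable.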
