Resource Limit Exceeded- xgb.deploy - amazon-sagemaker

When I tried to deploy example from Amazon SageMaker
xgb_predictor = xgb.deploy(initial_instance_count=1,
instance_type='ml.m4.xlarge')
ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateEndpoint operation: The account-level service limit 'ml.m4.xlarge for endpoint usage' is 0 Instances, with current utilization of 0 Instances and a request delta of 1 Instances. Please contact AWS support to request an increase for this limit.
Any idea how to fix this?
Thank you

AWS is using soft limits to prevent customers from making a mistake that might cause them more money than they expected. When you are starting to use a new service, such as Amazon SageMaker, you will hit these soft limits and you need to ask specifically to raise them using the "Support" link on the top right side of your AWS management console.
Here is a link to guide on how to do that: https://aws.amazon.com/premiumsupport/knowledge-center/manage-service-limits/
You will usually get the limit increased within a couple of days. In the meanwhile, you can choose a smaller instance (such as t2) that are often available.

What #Guy answered is correct, if you want to deploy then you need to request a limit incrase for the specific instance type,
Steps to Request Limit Increase for ml.m4.xlarge
visit aws console https://console.aws.amazon.com/
click on support on the top right corner
click create a case (orange button)
select Service Limit Increase radio button
For Limit Type, Search and Select SageMaker Notebook Instances
select the same region as the region that is displayed on the top right corner of your amazon console.
Write a short Use case description
For Limit, Select ml.[x].[x] (in your case, ml.m4.xlarge)
New Limit Values 1
It may take 48 hours for this manual support ticket to turn around.

Related

How to programmatically scale up app engine?

I have an application which uses app engine auto scaling. It usually runs 0 instances, except if some authorised users use it.
This application need to run automated voice calls as fast as possible on thousands of people with keypad interactions (no, it's not spam, it's redcall!).
Programmatically speaking, we ask Twilio to initialise calls through its Voice API 5 times/sec and it basically works through webhooks, at least 2, but most of the time 4 hits per call. So GAE need to scale up very quickly and some requests get lost (which is just a hang up on the user side) at the beginning of the trigger, when only one instance is ready.
I would like to know if it is possible to programmatically scale up App Engine (through an API?) before running such triggers in order to be ready when the storm will blast?
I think you may want to give warmup requests a try. As they load your app's code into a new instance before any live requests reach that instance. Thus reducing the time it takes to answer while you GAE instance has scaled down to zero.
The link I have shared with you, includes the PHP7 runtime as I see you are familiar with it.
I would also like to agree with John Hanley, since finding a sweet spot on how many idle instances you have available, would also help the performance of your app.
Finally, the solution was to delegate sending the communication through Cloud Tasks:
https://cloud.google.com/tasks/docs/creating-appengine-tasks
https://github.com/redcall-io/app/blob/master/symfony/src/Communication/Processor/QueueProcessor.php
Tasks can try again hitting the app engine in case of errors, and make the app engine pop new instances when the surge comes.

Load testing a Google App Engine Application using JMeter

I've created an application and I'd like to test how well it scales to large numbers of users.
To run my application a user has to go to the homepage, sign in to a Google account, click a button and then upload a video file.
First of all, is this possible to emulate using JMeter? I'm signed into my Google account locally but am not sure whether simulated users will have access to it?
Secondly, I've recorded a session in JMeter doing the actions above and have run the test with 10 simulated users, however, the App Engine dashboard doesn't detect any activity. I've followed the steps mentioned here but obviously with details of my application etc.
Here's a screenshot of the summary report.
Is there anything obvious I might be doing wrong? Am I using JMeter in the correct way to test the application as desired?
Apologies for my JMeter inexperience.
This is not something you will be able to record and replay, my expectation is that your application is protected by OAuth so you will need some token in order to execute your calls.
Not knowing the details of your application implementation it's quite hard to guess what's went wrong, I would recommend
Running your test with 1 user and 1 loop first to ensure that it's doing what it is supposed to be doing by adding View Results Tree listener and inspecting request and response details for each sampler (especially for failed ones).
Once you figure out what's wrong with this particular request - amend JMeter configuration so it would be successful. Repeat until you're happy with the test end-to-end.
Add load only after that and be careful as test might be sensitive to extra users/loops, especially if you're using a single login account (which is not recommended)
References:
How to Handle Correlation in JMeter
How to Run Performance Tests on OAuth Secured Apps with JMeter

AppStats for managed VMs

We were running on AppEngine but recently moved over to Managed VMs. For some reason AppStats is no longer available? We just get a 404 not found error when browsing to our appstats URL. Is appstats not supported on Managad VMs? If not, is there a way of isolating poorly performing endpoints within our application?
One way to isolate poorly performing endpoints is to use the advanced filter search in the GCP Logs Viewer. It is a little hard to find at first.
To get there, in your Google Cloud console, navigate to Logging for your project. At the right of the text box for "Filter by label or text search" you will see a small dropdown arrow. Click that and select "Convert to advanced filter". This will allow you to write your own sql-ish query where you can find requests that took longer than n to complete.
For example, add the following to the filter:
protoPayload.latency>"0.300s"
This will return a list of all requests that took longer than 300 milliseconds to process. If you have Cloud Trace enabled, you can click on the request response time to see the timeline for the individual service calls.

How do I set a cost limit in Google Developers Console

Some functions in the Google Developers Console, like the Analytics API, are free until you reach a quota. Other functions, like Google Cloud Storage, create costs from the first click.
When I upload a file under https://console.developers.google.com/ > Storage > Cloud Storage > Storage Browser and I make this file publicly available, I pay about $0.12 per GB traffic.
But theoretically the traffic to this link could explode, e.g. because of sudden popularity. Therefore I would like to set something like a daily or monthly cost limit.
Q: How do I protect myself from overly high costs in the Google Developers Console?
You cannot. I asked Google about this, here's their response, from May 7 2016:
(GCE = Google cloud engine. No spending limits.
GAE = Google app engine — yes it has spending limits.)
... you are eligible for support on ... only ...
... [various helpful links] ...
That been said, at the moment there is no a feature that allows you to
configure a limited budget on GCE. This feature is certainly available
for GAE [1]. As you mentioned in your comments, you either can totally
shut down your VMs (will depend on your use case) or set the VMs to
send you alerts if they reach a certain traffic limit [2].
Sincerely,
Someone's first name
Technical Solutions Representative
Google Cloud Platform
[1] https://cloud.google.com/appengine/docs/quotas
[2] https://cloud.google.com/monitoring/support/notification-options
#wmdry, you wrote: "traffic to this link could explode" — I'm afraid of this too. That's why I asked Google about this. And I'm planning to avoid Google's CDN because of this, and use another CDN provider instead, which has spending limits. Because, unlike Nginx, I don't see any way for me to rate limit / throttle Google's CDN.
I do plan to use GCE (Google Cloud Engine) though. Therefore, right now I'm reading about how to rate limit my Nginx server. Because if I just configure Nginx correctly, then those $0.12 / GB you mentioned, cannot possible explode to ... like $10k in a month? What if Google sends a $10k bill when I'm back from an a few week's vacation, just because of my hobby project and a few people downloading a 1 MB movie over and over again forever (because: evil). Hmm, & the bigger & faster my servers, the higher the risk.
I hope Google will add spending limits, because I did want to use Google's CDN.
Update 2020: Apparently this does bite people from time to time — look here:
"Burnt $72k testing Firebase and Cloud Run and almost went bankrupt", Dec 08, 2020, https://news.ycombinator.com/item?id=25372336,
In that case, they could contact Google and in the end didn't need to pay.
As of July 2017 you can set budgets that send notifications via email but do not cap spending:
To set an alert-only budget, which will not cap spending:
Go to the Cloud Platform Console.
Open the console left side menu and click Billing
If you have more than one billing account, click the billing account name.
On the left, click Budgets & alerts.
Official help page: https://support.google.com/cloud/answer/6293540?hl=en
I found that Google's documentation now provides two methods to actually limit the cost of a GCP project. It involves the following setup:
Create a Cloud Function that checks the cost against the budget, and carries out a certain action if the cost exceeds the budget. Google's Documentation provides a sample code snip that can either shutdown all VM instances in a Project or disable the billing for a project. Shutting down all VMs would stop all VM-related cost but you get to keep your data (and still have to pay for the storage). Disabling the billing for a project would effectively zap all cost-related activities and you could lose data. You can name the Cloud Function "budget-enforcer".
The Google code snip as provided above has a hard coded ZONE variable. Remember to change it to match your zone!
Create a Service Account to run the Cloud Function "budget-enforcer". For shutting down VMs, the Service Account would need role "Compute Instance Admin (v1)". For disabling billing on a project, the Service Account would need role "Project Billing Manager".
Set a Topic for the Cloud Function (I call mine "proj-name-stop-vm" and "proj-name-disable-bill").
Set up a budget alert as usual, and connect it to one of the Pub/Sub topic above.
Please be noted that Google's documentation did mention that there could be a delay between the cost exceeds a budget and the function is triggered, so you should build in a buffer if you have an absolute hard cost limit. I use 90% of the budget as the trigger line for shutting down my instances.
The API usage can be limited with a hard limit:
Depending on the API, you can explicitly cap requests in a variety of
ways, including: requests per day, requests per 100 seconds, and
requests per 100 seconds per user. You might want to limit the
billable usage by setting caps. For example, to prevent getting billed
for usage beyond the free courtesy usage limits, you can set requests
per day caps
Source
You can combine budget pub/sub alerts with a cloud function that can disable billing on your entire account if a threshold is met.
Full Tutorial Here:
https://www.youtube.com/watch?v=KiTg8RPpGG4
GitHub Repo Here: https://github.com/aioverlords/Google-Cloud-Platform-Killswitch
To Disable Billing
const _disableBillingForProject = async projectName => {
const res = await billing.updateBillingInfo({
name: projectName,
resource: {
billingAccountName: ''
}, // Disable billing
});
console.log(res);
console.log("Billing Disabled");
return `Billing disabled: ${JSON.stringify(res.data)}`;
};
Simply go to the developer console:
https://console.developers.google.com/project
Select your project.
Select "billings & settings"
Enable billing.
Then go to Compute/AppEngine/Settings and set a daily budget.
Go to Google Cloud console, and then to Billing / Budgets and Alerts and create a new budget for one or all your projects. You can select which services should be included in the limit and set a monthly amount that should not be exceeded.

"Over quota" when using GCS json-api from App Engine

I am using Go on App Engine. In most cases, I use the file api to access GCS, which works great, except that deletes don't work so to delete files I use the JSON-API (specifically, the google-go-api-client). To authenticate, I use app engine service accounts. We are sometimes seeing an error come back of "Over quota:" with nothing after the colon. Since we are a paid app, what quota could this be? Is there a burst limit (e.g. no more than X requests in a single minute)? Is there any places where any such applicable quotas are documented?
The caching mechanism is broken for goauth2 and serviceaccount tokens. You can see the issue I created here for more detail: https://code.google.com/p/goauth2/issues/detail?id=28
I came across a "over quota" issue myself when requesting more than 60 service accounts a minute. I opened a ticket with AppEngine support (I pay for the silver package) and got this undocumented information out of them.
You can apply the patch yourself in your $GOPATH/src/code.google.com/p/goauth2/appengine/serviceaccount/cache.go file. This fixed the issue you described for my team.
Even i had found same problem and found two reasons:-
1.Daily budget
2.Logs retention
Solution:
for problem 1 increase the daily budget
for problem 2 increase the retention from 1 to higher GB
![enter image description here][1]

Resources