I am trying to configure autoscaling for my endpoint on SageMaker. However, I cannot do it in the UI because the button is greyed out after I click on a model variant. For another model I set up earlier, the button is not greyed out.
Why is this available for some models on SageMaker but not for others?
Any type of model hosted on a SageMaker endpoint supports autoscaling; it is not restricted by model type. However, I can think of the following reasons why it might be disabled for one of your endpoints.
Are you viewing these two models in the same AWS account and using the same IAM role to update the autoscaling policies? If not, that is likely the issue: update your IAM policies and the button should work.
SageMaker doesn't support autoscaling for burstable instances such as T2, because they already provide extra capacity under increased workloads. Is the endpoint in question running on a burstable instance type? If so, that is the reason: switch the endpoint to any non-burstable instance type and autoscaling should work.
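As a rough illustration of that fix (the endpoint, model, and config names below are hypothetical; the variant settings should otherwise mirror your existing endpoint config), a boto3 sketch that moves the endpoint onto a non-burstable instance type:

import boto3

sm = boto3.client("sagemaker")

# Hypothetical names -- replace with your own endpoint, model, and config names.
sm.create_endpoint_config(
    EndpointConfigName="my-endpoint-config-m5",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-model",
        "InstanceType": "ml.m5.large",      # non-burstable, so autoscaling is allowed
        "InitialInstanceCount": 1,
    }],
)

# SageMaker replaces the instances behind the endpoint without taking it offline.
sm.update_endpoint(
    EndpointName="my-endpoint",
    EndpointConfigName="my-endpoint-config-m5",
)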
Related
I want to deploy a model using an Asynchronous Inference endpoint that will auto-scale. However, I cannot find information about which quotas are required for this to work without running out of resources.
Does scaling require specific quotas so that multiple jobs can be executed in parallel on different instances of the inference container?
It really isn't clear from the documentation whether quotas apply to Asynchronous Inference endpoints or not. They clearly apply to real-time inference endpoints, but the Asynchronous Inference documentation does not seem to mention it at all...
Autoscaling with Async endpoints is no different from autoscaling with the other inference options, i.e. your AWS quotas need to cover the number of instances you wish to scale to. For instance, if you configure the minimum and maximum instance counts in your Async autoscaling config as shown below, you would need 5 instances of that type available in your account. [ Reference ]
import boto3

client = boto3.client("application-autoscaling")
resource_id = "endpoint/my-async-endpoint/variant/AllTraffic"  # placeholder names

response = client.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    # The number of EC2 instances for your Amazon SageMaker model endpoint variant.
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=0,
    MaxCapacity=5,
)
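To actually scale on the request backlog, you would then attach a target-tracking policy to the same scalable target. A rough sketch (the policy name, target value, and endpoint name are illustrative) using the ApproximateBacklogSizePerInstance metric that SageMaker publishes for async endpoints:

client.put_scaling_policy(
    PolicyName="async-backlog-target-tracking",   # illustrative name
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 5.0,   # desired backlog per instance; tune for your workload
        "CustomizedMetricSpecification": {
            "MetricName": "ApproximateBacklogSizePerInstance",
            "Namespace": "AWS/SageMaker",
            "Dimensions": [{"Name": "EndpointName", "Value": "my-async-endpoint"}],
            "Statistic": "Average",
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 300,
    },
)

If you keep MinCapacity at 0, the async inference documentation also describes adding a step-scaling policy on the HasBacklogWithoutCapacity metric so the endpoint can scale out from zero instances.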
NOTE - I work for AWS SageMaker, but my opinions are my own.
Problem. I'm looking for an agile way to ship a Docker container (stored on gcr.io) to a managed service on GCP:
one Docker container, gcr.io/project/helloworld, with private data (say, a Cloud SQL backend) that can't be exposed to the whole internet.
a bunch of IPs I want to expose it to: say [ "1.2.3.4" , "2.3.4.0/24" ].
My ideal platform would be Cloud Run, but GAE would also work.
I want to develop in an agile way (say, deploy with 2-3 lines of code). Is it possible to run my service privately and yet super easily? We're not talking about a huge production project; we're talking about playing around and writing a POC you want to share securely over the internet with a few friends, making sure the rest of the world gets a 403.
What I've tried so far.
The only thing that works easily is a GCE VM with a Docker-friendly OS (like COS) where I can set up firewall rules. This works, but it's a lame Docker app on a disposable VM: the machine runs forever and dies at reboot unless I stabilize it with cron/startup scripts. It feels like I'm doing somebody else's job.
Everything else I've tried so far failed:
Cloud Run. Amazing, but I can't set up firewall rules on it, or Cloud Director, ...; it seems to work only with IAP, which is painful to set up.
GAE. It serves from multiple public IPs that I can't detach or firewall. I managed to do the IP filtering within the app, but that seems a bit risky; I don't [want to] trust my coding skills :)
Cloud Armor. Only supports an HTTPS Load Balancer, which I don't have, nor do I have MIGs to point it to. I want simplicity.
Traffic Director. Also needs an HTTP L7 load balancer, but I have one Docker container on a single pod; why would I need a load balancer?
GKE. This actually seems to work [1], but it's not fully managed (I need to create the cluster, pods, ...).
Is this a product deficiency or am I looking at the wrong products? What's the simplest way to achieve what I want?
[1] How do I add a firewall rule to a GKE service?
Please limit your question to one service. Not everyone is an expert on all Google Cloud services. You will have a better chance of a good answer for each service if they are separate questions.
In summary, if you want to use VPC firewall rules to control IP-based access, you need to use a service that runs on Compute Engine, because firewall rules are part of the VPC feature set. App Engine Standard and Cloud Run do not run within your project's VPC. This leaves you with App Engine Flexible, Compute Engine, and Kubernetes (GKE).
I would change strategies and use Cloud Run (fully managed) protected by authentication. Access is controlled by Google Cloud IAM via OAuth tokens.
Cloud Run Authentication Overview
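As a rough sketch of what an authenticated caller looks like (the service URL below is a placeholder, and the caller's service account must have been granted the Cloud Run Invoker role on the service), using the google-auth library to mint an identity token:

import requests
import google.auth.transport.requests
from google.oauth2 import id_token

# Placeholder URL -- replace with your Cloud Run service URL.
service_url = "https://helloworld-abc123-uc.a.run.app"

# fetch_id_token needs service-account credentials (GOOGLE_APPLICATION_CREDENTIALS)
# or the metadata server (GCE, Cloud Run, Cloud Functions).
auth_request = google.auth.transport.requests.Request()
token = id_token.fetch_id_token(auth_request, audience=service_url)

response = requests.get(service_url, headers={"Authorization": f"Bearer {token}"})
print(response.status_code, response.text)

Callers without a valid token (or without the invoker role) get a 403 from Cloud Run before the request ever reaches your container, which matches the "rest of the world gets a 403" requirement.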
I agree with John Hanley's reply and I have up-voted his answer.
Also, I understand that you are looking for a way to restrict access to your service through GCP.
By setting firewall rules, you can limit access to your service by restricting the allowed source IP range, so that only those addresses are accepted as the source IP.
Please review another thread on Server Fault [1], which explains how to "Restrict access to single IP only".
https://serverfault.com/questions/901364/restrict-access-to-single-ip-only
You can do this quite easily with a Serverless NEG for Cloud Run or GAE.
If you're doing this in Terraform, you can follow this article.
As a developer, I am stuck trying to explore Salesforce for integrating it with my CRM, which our clients use as SaaS.
Background of what I want to achieve in the integration
The idea is that my CRM software offers many features that Salesforce does not, and vice-versa. Because of this, a typical client who uses my CRM (SaaS) ends up using both products. This results in duplicated effort where, for example, a customer created in my CRM has to be copied over to Salesforce manually through the Salesforce UI.
The integration that I wish to provide will work as a two-way sync: a customer created in my CRM gets created as an Account in Salesforce, and vice-versa. In the same way, edits and deletes should sync across the two systems.
Problem that I am having
When I started exploring Salesforce integration, I found that Salesforce allows integration to be done in the following ways:
Apex trigger based approach - I was able to achieve two-way syncing between Salesforce and my CRM using the Apex approach. But the problem is that it requires me to call their web APIs to send data from my CRM to Salesforce (a minimal sketch of such a call is shown after these two options). This is only supported on the higher Salesforce pricing plans (Enterprise and Unlimited; on Professional you have to pay extra for API access).
App based approach (e.g. Slack): I am leaning more towards this approach, as the Slack integration works on almost all pricing plans and is well supported. What I could not figure out is: how can I create an app for my CRM and get it listed on Salesforce? How does Salesforce allow app-based access from Slack to submit data into the Salesforce system on the lower pricing plans? Their documentation says that API access is only available on the higher plans, so how is this achieved? For example, you can install the Salesforce app into Slack and then send messages to the Chatter service under individual Salesforce accounts from Slack.
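For context, the kind of web API call the Apex/API-based sync relies on looks roughly like the sketch below (the instance URL, access token, API version, and the MyCRM_Id__c external-ID field are all placeholders); this is the part that requires API access on the Salesforce side:

import requests

# Placeholders -- obtained via your OAuth connected-app flow.
instance_url = "https://yourInstance.my.salesforce.com"
access_token = "<oauth-access-token>"

# Upsert an Account keyed on a hypothetical external-ID field that stores the
# customer's ID from my CRM, so repeated syncs update instead of duplicating.
crm_customer_id = "CRM-12345"
url = f"{instance_url}/services/data/v57.0/sobjects/Account/MyCRM_Id__c/{crm_customer_id}"

response = requests.patch(
    url,
    headers={"Authorization": f"Bearer {access_token}",
             "Content-Type": "application/json"},
    json={"Name": "Acme Corp", "Phone": "+1-555-0100"},
)
print(response.status_code, response.text)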
I am really not sure whether I have given enough insight into the problem I am having, but I have tried to explain as much as possible. In short, I want a two-way integration with Salesforce, and I am looking for a solution that is also supported on the lower pricing plans. What type of integration should I go forward with?
Look into using an ETL provider that is already a Salesforce technology partner (Boomi, Jitterbit, etc.).
These are already listed on appexchange.com and, as certified AppExchange apps, they can access data in Salesforce Professional Edition (which does not otherwise allow open API access).
I am a newbie on the Microsoft Azure platform. I want to create multiple databases dynamically (we are developing a multi-tenant model, so each organization should have its own database; whenever an organization registers with our system, we need to create a new database for it dynamically). Please provide some insights on this.
By using Azure Resource Manager templates you can reliably deploy the whole infrastructure required by each organisation. So if they need a web server, database, and middleware servers, you can define the whole thing in a template and reliably deploy it for every client.
(from the above link)
You can deploy, manage, and monitor all of the resources for your solution as a group, rather than handling these resources individually.
You can repeatedly deploy your solution throughout the development lifecycle and have confidence your resources are deployed in a consistent state.
You can use declarative templates to define your deployment.
You can define the dependencies between resources so they are deployed in the correct order.
You can apply access control to all services in your resource group because Role-Based Access Control (RBAC) is natively integrated into the management platform.
You can apply tags to resources to logically organize all of the resources in your subscription.
You can clarify billing for your organization by viewing the rolled-up costs for the entire group or for a group of resources sharing the same tag.
The link above has a lot of resources for learning how to use templates as well as the syntax and usage.
There are a large number of templates on the Azure ARM Template GitHub page, including pre-existing templates to get you started deploying SQL Server to Azure (there are also MySQL and PostgreSQL templates if you prefer).
Plus many others that you can work through to get accustomed to how they work.
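If you script tenant onboarding, deploying such a template per organization with the Azure SDK for Python might look roughly like this (the resource group, template file, deployment name, and parameter names are all placeholders):

import json
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

subscription_id = "<subscription-id>"              # placeholder
client = ResourceManagementClient(DefaultAzureCredential(), subscription_id)

with open("tenant-database.json") as f:            # hypothetical ARM template
    template = json.load(f)

# One deployment per newly registered organization.
client.deployments.begin_create_or_update(
    resource_group_name="tenants-rg",              # placeholder resource group
    deployment_name="org-contoso-db",
    parameters={
        "properties": {
            "template": template,
            "parameters": {"databaseName": {"value": "org-contoso"}},
            "mode": "Incremental",
        }
    },
).result()   # block until the deployment completes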
You can use the Azure SQL Database REST API to do so; it's as simple as sending a PUT request to the URL https://management.azure.com/subscriptions/{subscription-id}/resourceGroups/{resource-group-name}/providers/microsoft.sql/servers/{server-name}/databases/{database-name}?api-version={api-version}
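A minimal sketch of that call with the Python requests library (the bearer token, resource names, region, and api-version below are placeholders you would fill in from your own subscription):

import requests

subscription_id = "<subscription-id>"      # placeholders
resource_group = "<resource-group-name>"
server_name = "<server-name>"
database_name = "org-contoso"
access_token = "<azure-ad-bearer-token>"   # e.g. acquired via Azure AD

url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}/providers/microsoft.sql"
    f"/servers/{server_name}/databases/{database_name}"
    "?api-version=2014-04-01"
)

# Minimal body: the location must match the server's region.
response = requests.put(
    url,
    headers={"Authorization": f"Bearer {access_token}",
             "Content-Type": "application/json"},
    json={"location": "West US", "properties": {"edition": "Basic"}},
)
print(response.status_code, response.text)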
Check out these links for more details
https://msdn.microsoft.com/en-us/library/azure/mt163571.aspx
https://msdn.microsoft.com/en-us/library/azure/mt163685.aspx
I've been thinking lately about the pros and cons of using AppEngine.
My concern would be that, when we create an application for GAE, the front-end code (the UI stuff) is served from the same application instance in the GAE cloud as the Datastore code.
The question would be, when my application grows:
For GAE:
Do I need to create multiple instances of my application?
If so, do I need to manually update all instances?
For Appscale:
Do I also need to create multiple instances of my application?
If so, do I need to manually update all instances?
GAE starts new frontend instances automatically; you can't even create or update frontend instances yourself. You just need to configure the min/max pending latency and min/max idle instances in Application Settings. See the docs for performance settings.
Btw, there are also Backend instances, which are Resident and can be started manually from the Admin Console, but that is useful only when you need something very specific.
You seem to have missed the whole point of AppEngine, which is that Google takes care of scaling your app for you automatically. You seem to be confusing 'instance' with 'version': you have control over which version of your app is serving, but Google dynamically creates and kills instances of that app depending on load. That's the main benefit of using AppEngine in the first place.