How do I register a multi-container model in Model Registry in AWS SageMaker? - amazon-sagemaker

I have created a multi-container model in SageMaker notebook and deployed it through an endpoint. But while attempting to do the same through a SageMaker Studio Project (build, train and deploy model template), I need to register the multi-container model through a 'sagemaker.workflow.step_collections.RegisterModel' step, which I am unable to do.
The boto3 client method,
create_model()
has a parameter 'InferenceExecutionConfig' where 'mode' is set to direct for a multi-container model.
InferenceExecutionConfig={
'Mode': 'Serial'|'Direct'
}
To my understanding, multi-container model is created through boto3 api call. I haven't found a way to create a 'sagemaker.model.Model' instance through SageMaker Python SDK which has the above parameter hence not being able to register it.
Using 'sagemaker.pipeline.PipelineModel' will result in a serial pipeline whereas I want to invoke each model directly though an endpoint.
Even the boto3 api method
create_model_package()
which creates and registers the model to a ModelPackageGroup does not have the 'InferenceExecutionConfig' parameter which I can use to register the multi-container model.
My aim is to create a SageMaker Studio Project with a multi-container endpoint which can be used to fetch outputs from two models in the project. If I'm missing something or there is some other approach to achieve this, please let me know that too.

Related

How to deploy the hugging face model via sagemaker pipeline

Below is the code to get the model from Hugging Face Hub and deploy the same model via sagemaker.
from sagemaker.huggingface import HuggingFaceModel
import sagemaker
role = sagemaker.get_execution_role()
# Hub Model configuration. https://huggingface.co/models
hub = {
'HF_MODEL_ID':'siebert/sentiment-roberta-large-english',
'HF_TASK':'text-classification'
}
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
transformers_version='4.17.0',
pytorch_version='1.10.2',
py_version='py38',
env=hub,
role=role,
)
# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
initial_instance_count=1, # number of instances
instance_type='ml.g4dn.xlarge' # ec2 instance type
)
How can I deploy this model via sagemaker pipeline .
How can I include this code in sagemaker pipeline.
Prerequisites
SageMaker pipelines offer many different components with lots of functionality. Your question is quite general and needs to be contextualized to a specific problem.
You have to start with putting up a pipeline first.
See the complete guide "Defining a pipeline".
Quick response
My answer is to follow this official AWS guide that answers your question exactly:
SageMaker Pipelines: train a Hugging Face model, deploy it with a Lambda step
General explanation
Basically, you need to build your pipeline architecture with the components you need and register the trained model within the Model Registry.
Next, you have two paths you can follow:
Trigger a lambda that automatically deploys the registered model (as the guide does).
Out of the context of the pipeline, do the automatic deployment by retrieving the ARN of the registered model on the Model Registry. You can get it from register_step.properties.ModelPackageArn or in an external script using boto3 (e.g. using list_model_packages)

How do you resolve an "Access Denied" error when invoking `image_uris.retrieve()` in AWS Sagemaker JumpStart?

I am working in a SageMaker environment that is locked down. For example, my user account is prevented from creating S3 buckets. But, I can successfully run vanilla ML training jobs by passing in role=get_execution_role to an instance of the Estimator class when using an out-of-the-box algorithm such as XGBoost.
Now, I'm trying to use an algorithm (LightBGM) that is only available via the JumpStart feature in SageMaker, but I can't get it to work. When I try to retrieve an image URI via image_uris.retrieve(), it returns the following error:
ClientError: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied.
This makes some sense to me if my user permissions are being used when creating an object. But what I want to do is specify another role - like the one returned from get_execution_role - to perform these tasks.
Is that possible? Is there another work-around available? How can I see which role is being used?
Thanks,
When I encountered this issue, it was a permissions issue with a bucket that had changed.
In the SageMaker Python SDK source code , there is a cache that is located at in an AWS-owned bucket: jumpstart-cache-prod-{region}. and a manifest.json that translates the ECR path for the image for you.
If you look at the stack trace, it could be erroring out at the code that is looking for the manifest.
One place to look is if there are new restrictions placed in IAM, Included here is the minimum policy you need to access JumpStart (pretrained) models

How to test Spring database down?

I have this SpringBoot server app using PostgreSQL database if it's up and sending error response if it's down. So my app is running regardless the database connection.
I would very much like to test it (jUnit / mockmvc).
My question is very simple, yet I did not find the answer online:
how does one simulate a database connection loss in SpringBoot?
If anyone wants, I can supply code (project is up at https://github.com/k-wasilewski/workshop/)
Have you thought of Testcontainers? You can spin up your docker image through a Junit test and make your spring boot use that as your database.
Since you use junit, you can start/stop this container at will.
This will generate a test which creates the condition you are looking for and write code as to what to expect when the database is down.
Here are some links to get started,
Testcontainers and Junit4 with Testcontainers quickstart - https://www.testcontainers.org/quickstart/junit_4_quickstart/
Spring boot documentation - Use Testcontainers for integration testing
https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#howto-testcontainers
Testcontainer github link example for springboot app
https://github.com/testcontainers/testcontainers-java/tree/master/examples/spring-boot
Testcontainer - Generic container javadoc. You can find methods for start/stop
container here. call from your Junit.
https://javadoc.io/static/org.testcontainers/testcontainers/1.12.4/org/testcontainers/containers/GenericContainer.html
You can implement your own Datasource based on DelegatingDataSource and then let it throw exceptions instead of delegating when ever you want to.
I've done this before by creating a Spring Boot test configuration class that created the DataSource and wrapped it in a Java proxy. The proxy simply passed method calls down to the underlying DataSource, until a certain flag was set. Once the flag was set, then any method called on the proxy would throw an exception without calling the underlying DataSource. Essentially, this allowed me to "bring the database down" or "up" simply by flipping the flag.

Office 365 Multi Geo - Issue with fetching PreferredDataLocation property for a user from Azure Active Directory

I am trying to fetch PreferredDataLocation (PDL) for a user from Azure Active Directory.
I used Graph v1.0 but do not receive PDL value in the response:
https://graph.microsoft.com/v1.0/users/{upn}?$select=preferredDataLocation
But when I use Graph Beta, I receive PDL value in the response:
https://graph.microsoft.com/beta/users/{upn}?$select=preferredDataLocation
Does that mean that fetching PDL is not supported in Microsoft Graph v1.0?
I also tried using Microsoft Graph SDK, but there is no property exposed for getting PDL.
Is there a way we can fetch PDL using MS Graph SDK?
The PreferredDataLocation property of a User is only returned/supported by the /beta endpoints. Since the SDKs currently only support the production API, PreferredDataLocation isn't exposed in the object model.
Once this feature makes it into v1.0, subsequent builds of the SDK should include it. If there is an unreasonable delay in a new SDK build, you can also request that it be added. From the SDK docs:
When new features are added to the library
Generation happens as part of a manual process that occurs once a significant change or set of changes has been added to the Graph. This may include:
A new workload comes to v1.0 of Graph (Microsoft Teams, Batching, etc.)
There is a significant addition of functionality (Delta Queries, etc.)
However, this is evaluated on a case-by-case basis. If the library is missing v1.0 Graph functionality that you wish to utilize, please file an issue.

Where to find the OSB Business service configuration details in the underlying database?

In OSB Layer when the endpoint uri is changed, I need to alert the core group that the endpoint has changed and to review it. I tried SLA Alert rules but it does not have options for it. My question is, the endpoint uri should be saved somewhere in the underlying database. If so what is the schema and the table name to query it.
URI or in fact any other part of OSB artifact is not stored in relational database but rather kept in memory in it's original XML structure. It can be only accessed thru dedicated session management API. Interfaces you will need to use are part o com.bea.wli.sb.management.configuration and com.bea.wli.sb.management.query packages. Unfortunately it is not as straightforward as it sounds, in short, to extract URI information you will need to:
Create session instance(SessionManagementMBean)
Obtain ALSBConfigurationMBean instance that operates on SessionManagementMBean
Create Query object instance(BusinessServiceQuery) an run it on ALSBConfigurationMBean to get ref object to osb artifact of your interest
Invoke getServiceDefinition on your ref object to get XML service
definition
Extract URI from XML service definition with XPath
Downside of this approach is that you are basically pooling configuration each time you want to check if anything has changed.
More information including JAVA/WLST examples can be found in Oracle Fusion Middleware Java API Reference for Oracle Service Bus
There is also a good blog post describing OSB customization with WLST ALSB/OSB customization using WLST
The information about services and all its properties can be obtained via Java API. The API documentation contains sample code, so you can get it up and running quite quickly, see the Querying resources paragraph when following the given link.
We use the API to read the service (both proxy and business) configuration and for simple management.
As long as you only read the properties you do not need to handle management sessions. Once you change the values, you need to start a session and activate it once you are done -- a very similar approach to Service bus console.

Resources