How to Enable SageMaker Debugger in the SageMaker AutoPilot - amazon-sagemaker

I'd like to (a) plot SHAP values from (b) the SageMaker AutoML pipeline. To achieve (a), SageMaker Debugger should be used, according to: https://aws.amazon.com/blogs/machine-learning/ml-explainability-with-amazon-sagemaker-debugger/.
But how do I enable Debugger in Autopilot without hacking into what it runs in the background?

SageMaker Autopilot doesn't support SageMaker Debugger out of the box currently (as of Dec 2020). You can hack the Hyperparameter Tuning job to pass in a debug parameter.
However, there is a way to use SHAP with Autopilot models. Take a look at this blog post explaining how to use SHAP with SageMaker Autopilot: https://aws.amazon.com/blogs/machine-learning/explaining-amazon-sagemaker-autopilot-models-with-shap/.

Related

Amazon SageMaker multi GPU: No objective found

I have a question on SageMaker multi-GPU training. I have a customer whose code runs on single-GPU instances (ml.p3.2xlarge), but when they select ml.p3.8xlarge (multi-GPU), it runs into the following error:
“Failure reason: No objective metrics found after running 5 training jobs. Please ensure that the custom algorithm is emitting the objective metric as defined by the regular expression provided.”
Their code handles multi-GPU usage and currently works well on their machine outside of AWS. Is there any documentation you can point me to that would help them address the problem? They are currently using PyTorch for all of their model development.
It looks like they are running Hyperparameter Optimization (HPO) on SageMaker and their code is not emitting any metric that HPO can tune against. The problem is in how they specify the regular expression for the objective metric; for more details, see SageMaker Estimator Metrics Definitions.
Essentially, use a tool like https://regex101.com to validate that the regex they use extracts the objective number from their training logs.
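The same check can be done locally in Python. This is a minimal sketch: the metric name, the regex, and the log lines are made-up placeholders, so substitute the real metric definition and a few lines copied from the training job's logs.

```python
import re

# Hypothetical metric definition in the shape SageMaker expects:
# a metric name plus a regex with one capture group for the value.
metric_definition = {"Name": "validation:loss", "Regex": r"val_loss=([0-9\.]+)"}


def extract_metric(regex, log_text):
    """Return every value the regex captures from the log text, as floats."""
    return [float(match) for match in re.findall(regex, log_text)]


# Made-up log lines; replace them with real lines from the training logs.
sample_logs = """\
epoch=1 train_loss=0.912 val_loss=0.874
epoch=2 train_loss=0.655 val_loss=0.701
"""

values = extract_metric(metric_definition["Regex"], sample_logs)
print(values)
```

If the list comes back empty for the multi-GPU logs, the regex is not matching what the code prints on that instance type, which is consistent with the "No objective metrics found" failure.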

fastai distributed training in SageMaker?

For anyone with experience with fastai’s distributed training (either within SageMaker or outside it):
Are there any material benefits to using it over PyTorch DDP (which it’s built on top of)?
What would be the easiest way to incorporate this inside a SM training job?
It requires the training script to be run like python -m fastai.launch scriptname.py ...args... so using Script Mode is not immediately straightforward. Pointing to a .sh file for the entry_point in the PyTorch estimator means that SageMaker will not pip install from a provided requirements.txt so the user must control all of this inside their bash script.
Can SM Distributed Data Parallel be used with fastai distributed training? Or do we need to use PyTorch DDP instead of fastai in order to use SM DDP?

Amazon SageMaker Model Monitor for Batch Transform jobs

Couldn't find the right place to ask this, so doing it here.
Does Model Monitor support monitoring Batch Transform jobs, or only endpoints? The documentation seems to only reference endpoints...
We just launched support for this.
Here is the sample notebook:
https://github.com/aws/amazon-sagemaker-examples/tree/main/sagemaker_model_monitor/model_monitor_batch_transform
Here is the what's new post:
https://aws.amazon.com/about-aws/whats-new/2022/10/amazon-sagemaker-model-monitor-batch-transform-jobs/

Continuous Training in Sagemaker

I am trying out Amazon SageMaker, and I haven't figured out how to set up continuous training.
For example, if I have a CSV file in S3, I want to train each time the CSV file is updated.
I know we can go back to the notebook and re-run the whole notebook to make this happen.
But I am looking for an automated way, with some Python scripts or a Lambda function triggered by S3 events, etc.
You can use the boto3 SDK for Python to start training from a Lambda function; then you need to trigger the Lambda function when the CSV is updated.
http://boto3.readthedocs.io/en/latest/reference/services/sagemaker.html
Example Python code:
https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-train-model-create-training-job.html
Addition: You don't need to use Lambda; you can run the Python script manually or as a cron job on any instance that has Python and the AWS SDK installed.
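A minimal sketch of such a Lambda handler, assuming the function is subscribed to S3 object-created events for the CSV. The role ARN, training image, output path, instance type, and job-name prefix are all placeholders, not values from the question; the request fields follow the boto3 `create_training_job` call.

```python
import time
import urllib.parse

# Placeholder values (assumptions); replace with your own role, image, and bucket.
ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleSageMakerRole"
TRAINING_IMAGE = "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest"
OUTPUT_PATH = "s3://example-bucket/output/"


def build_training_request(bucket, key, job_name):
    """Build the keyword arguments for create_training_job for one CSV object."""
    return {
        "TrainingJobName": job_name,
        "RoleArn": ROLE_ARN,
        "AlgorithmSpecification": {
            "TrainingImage": TRAINING_IMAGE,
            "TrainingInputMode": "File",
        },
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "ContentType": "text/csv",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": f"s3://{bucket}/{key}",
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": OUTPUT_PATH},
        "ResourceConfig": {
            "InstanceType": "ml.m5.large",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


def lambda_handler(event, context):
    """Start a training job for the object named in the S3 event."""
    import boto3  # imported here so the request builder can be tested without AWS

    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys in S3 event notifications are URL-encoded.
    key = urllib.parse.unquote_plus(record["object"]["key"])
    # Training job names must be unique, so append a UTC timestamp.
    job_name = "csv-retrain-" + time.strftime("%Y%m%d-%H%M%S", time.gmtime())
    boto3.client("sagemaker").create_training_job(
        **build_training_request(bucket, key, job_name)
    )
    return {"TrainingJobName": job_name}
```

The same `build_training_request` function can be called from a plain script run by cron if Lambda is not used.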
There are a couple examples for how to accomplish this in the aws-samples GitHub.
The serverless-sagemaker-orchestration example sounds most similar to the use case you are describing. It walks you through how to continuously train a SageMaker linear regression model for housing price predictions on new CSV data that is added daily to an S3 bucket, using the built-in LinearLearner algorithm and orchestrated with Amazon CloudWatch Events, AWS Step Functions, and AWS Lambda.
There is also the similar aws-sagemaker-build example but it might be more difficult to follow currently if you are looking for detailed instructions.
Hope this helps!

How to integrate Bugzilla and HP Quality Center?

I'm working on integrating Bugzilla with HP QC. Currently I do this with a Perl script that manipulates the database directly using SQL commands. I want to use Bugzilla's web services instead. I have gone through the Bugzilla WebService API documentation, but that wasn't enough to get started. I'm a beginner and this is the first project of my career. How do I go about this?
Check out the Perl script bz_webservice_demo.pl in Bugzilla's contrib directory; it shows how to talk to Bugzilla via XML-RPC.
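For comparison, the same kind of XML-RPC call can be sketched in Python with only the standard library. This is not a port of bz_webservice_demo.pl; the helper names are made up for illustration, and it assumes the installation exposes the usual xmlrpc.cgi endpoint and the `Bug.get` WebService method.

```python
import xmlrpc.client


def connect(base_url):
    """Create an XML-RPC proxy for a Bugzilla installation's xmlrpc.cgi endpoint."""
    # No network traffic happens until a method is called on the proxy.
    return xmlrpc.client.ServerProxy(base_url.rstrip("/") + "/xmlrpc.cgi")


def fetch_bug_summaries(proxy, bug_ids):
    """Call Bug.get and return a {bug id: summary} mapping."""
    result = proxy.Bug.get({"ids": list(bug_ids)})
    return {bug["id"]: bug["summary"] for bug in result["bugs"]}
```

Usage would look like `fetch_bug_summaries(connect("https://bugzilla.example.org"), [1, 2])`, where the URL is a placeholder. Bugs that are not public would additionally require logging in, which this sketch leaves out.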
There are a few things you could do:
Export defects from Bugzilla into a spreadsheet and upload it into Quality Center
Use the Open Test Architecture API (OTAClient.dll) to update defects in Quality Center
Use the HP Synchronization Server and build an adapter
Using the HP Synchronizer is probably the only "real" way to do it, though you could build your own sync mechanism, potentially using just OTA and a message queue.
There may be an existing adapter available from proficom-ag, based on a presentation I found via a web search.
