How to run inference on a local PC with a model trained on AWS SageMaker using the built-in Semantic Segmentation algorithm?

I have trained a model on AWS SageMaker using the built-in Semantic Segmentation algorithm. The trained model, named model.tar.gz, is stored on S3. I want to download this file from S3 and use it to run inference on my local PC without using AWS SageMaker anymore. Since the built-in Semantic Segmentation algorithm is built on the MXNet Gluon framework and the Gluon CV toolkit, I am trying to follow the mxnet and gluon-cv documentation to run inference locally.
It's easy to download this file from S3. Unpacking it gives three files:
hyperparams.json: includes the parameters for network architecture, data inputs, and training. Refer to Semantic Segmentation Hyperparameters.
model_algo-1 and model_best.params: both are trained models, and I think they are the output of net.save_parameters (refer to Train the neural network). I can also load them with the function mxnet.ndarray.load.
Referring to Predict with a pre-trained model, I found that two things are necessary:
Reconstruct the network for making inference.
Load the trained parameters.
As for reconstructing the network, since I used PSPNet for training, I can use the class gluoncv.model_zoo.PSPNet. I also know how to use some AWS SageMaker services, for example batch transform jobs, to run inference, and I want to reproduce that on my local PC. But if I use gluoncv.model_zoo.PSPNet to reconstruct the network, I can't be sure that its parameters are the same as those used on AWS SageMaker during inference, because I can't inspect the image in detail.
As for loading the trained parameters, I can use load_parameters. But between model_algo-1 and model_best.params, I don't know which one I should use.
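One way to narrow down how the network was configured during training is to read hyperparams.json before reconstructing it. The following is a minimal sketch, not a confirmed recipe: the key names (algorithm, backbone, num_classes, crop_size, base_size) are assumptions based on the documented Semantic Segmentation hyperparameters, and the function name is hypothetical.

```python
import json
from pathlib import Path


def read_segmentation_settings(path):
    """Return the hyperparameters most relevant to rebuilding the network.

    The key names are assumptions taken from the documented Semantic
    Segmentation hyperparameters; adjust them to what the file contains.
    """
    raw = json.loads(Path(path).read_text())
    wanted = ("algorithm", "backbone", "num_classes", "crop_size", "base_size")
    # Keep only keys that are actually present, so nothing is invented.
    return {key: raw[key] for key in wanted if key in raw}
```

Calling read_segmentation_settings("hyperparams.json") on the extracted file returns a small dict; the values can then be compared with the arguments passed to gluoncv.model_zoo.PSPNet.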

The following code works well for me.
import mxnet as mx
from mxnet import image
from gluoncv.data.transforms.presets.segmentation import test_transform
import gluoncv
# use cpu
ctx = mx.cpu(0)
# load test image
img = image.imread('./img/IMG_4015.jpg')
img = test_transform(img, ctx)
img = img.astype('float32')
# reconstruct the PSP network model
model = gluoncv.model_zoo.PSPNet(2)
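# load the trained model (the path is an assumption; point it at the extracted parameter file)
model.load_parameters('./model_algo-1', ctx=ctx)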
# make inference
output = model.predict(img)
predict = mx.nd.squeeze(mx.nd.argmax(output, 1)).asnumpy()


What is the recommended way to create a Custom Sink for AWS Sagemaker Feature Store in Apache Flink?

I want to create a Custom Apache Flink Sink to AWS Sagemaker Feature store, but there is no documentation for how to create custom sinks on Flink's website. There are also multiple base classes that I can potentially extend (e.g. AsyncSinkBase, RichSinkFunction), so I'm not sure which to use.
I am looking for guidelines on how to implement a custom sink (both in general and for my specific use-case). For my specific use-case: Sagemaker Feature Store has a synchronous client with a putRecord call to send records to AWS Sagemaker FS, so I am ideally looking for a way to create a custom sink that would work well with this client. Note: I require at-least-once processing guarantees, as Sagemaker FS is DynamoDB (a key-value store) under the hood.
Java Client:
Example of the putRecord call using the Python client:
What I've Found so Far
Some older articles which say to use org.apache.flink.streaming.api.functions.sink.RichSinkFunction and SinkFunction
Some connectors using classes in org.apache.flink.connector.base.sink.writer (e.g. AsyncSinkWriter, AsyncSinkBase)
This section of the Flink docs says to use the SourceReaderBase from org.apache.flink.connector.base.source.reader when creating custom sources; SourceReaderBase seems to be the source-side equivalent of the sink classes in the item above
Any help/guidance/insights are much appreciated, thanks.
How about extending RichAsyncFunction?
You can find a similar example here -
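To make the synchronous putRecord call from the question concrete, here is a minimal Python sketch. It assumes a client object exposing put_record(FeatureGroupName=..., Record=...), as the boto3 sagemaker-featurestore-runtime client does; the helper names, retry count, and backoff are hypothetical. Retrying and finally re-raising is only one ingredient of at-least-once delivery; in a Flink job the replay after a failure would come from checkpointing.

```python
import time


def to_feature_record(values):
    """Convert a flat dict into the list-of-features shape put_record expects."""
    return [
        {"FeatureName": name, "ValueAsString": str(value)}
        for name, value in values.items()
    ]


def put_with_retry(client, feature_group_name, values, attempts=3, backoff_s=0.5):
    """Call the synchronous put_record until it succeeds or attempts run out.

    The retry count and backoff are hypothetical defaults, not recommendations.
    """
    record = to_feature_record(values)
    for attempt in range(1, attempts + 1):
        try:
            return client.put_record(
                FeatureGroupName=feature_group_name, Record=record
            )
        except Exception:
            if attempt == attempts:
                raise  # surface the failure so the caller can replay the record
            time.sleep(backoff_s * attempt)
```

With boto3 installed and credentials configured, the client would be created with boto3.client("sagemaker-featurestore-runtime"). Because the store is keyed, writing the same record twice should overwrite rather than duplicate, which is why a plain retry is tolerable here.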

PyTorch Lightning with Amazon SageMaker

We're currently using PyTorch Lightning for training outside of SageMaker. We are looking to use SageMaker to leverage distributed training, checkpointing, and model training optimization (Training Compiler) to accelerate training and save costs. What's the recommended way to migrate our PyTorch Lightning scripts to run on SageMaker?
The easiest way to run PyTorch Lightning on SageMaker is to use the SageMaker PyTorch estimator (example) to get started. Ideally you will add a requirements.txt that installs PyTorch Lightning alongside your source code.
Regarding distributed training, Amazon SageMaker recently launched native support for running PyTorch Lightning based distributed training. Please follow the link below to set up your training code.
There's no big difference in running PyTorch Lightning and plain PyTorch scripts with SageMaker.
One caveat, however, when running distributed training jobs with DDPPlugin, is to properly set the NODE_RANK environment variable at the beginning of the script, because PyTorch Lightning knows nothing about SageMaker environment variables and relies on generic cluster variables:
import os
os.environ["NODE_RANK"] = str(int(os.environ.get("CURRENT_HOST", "algo-1")[5:]) - 1)
or (more robust):
import json
import os
rc = json.loads(os.environ.get("SM_RESOURCE_CONFIG", "{}"))
os.environ["NODE_RANK"] = str(rc["hosts"].index(rc["current_host"]))
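Note that the second variant raises a KeyError when SM_RESOURCE_CONFIG is not set, for example when the same script is run outside SageMaker. A minimal sketch of a variant that tolerates this is shown below; the function name is hypothetical, and falling back to rank 0 assumes a single-node run outside SageMaker.

```python
import json
import os


def sagemaker_node_rank(environ=os.environ):
    """Derive a zero-based node rank from SM_RESOURCE_CONFIG, or 0 if absent."""
    raw = environ.get("SM_RESOURCE_CONFIG")
    if not raw:
        # Assumption: without the SageMaker variable there is a single node.
        return 0
    config = json.loads(raw)
    # The rank is the position of this host in the shared host list.
    return config["hosts"].index(config["current_host"])
```

At the top of the training script this would be used as os.environ["NODE_RANK"] = str(sagemaker_node_rank()).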
Since your question is specifically about migrating already working code to SageMaker, and using the link here as a reference, I can try to break the process into three parts:
Create a PyTorch Estimator:
import sagemaker
from sagemaker.pytorch import PyTorch
sagemaker_session = sagemaker.Session()
pytorch_estimator = PyTorch(
    output_path="<< s3 bucket >>",
    source_dir="<< path for >>",
    entry_point="",  # your existing PyTorch Lightning script
    # remaining estimator arguments omitted
)
The entry_point should be your existing PyTorch Lightning script. In its main method you can have something like this:
import pytorch_lightning as pl
if __name__ == '__main__':
    trainer = pl.Trainer(
        devices=-1,  # in order to utilize all GPUs
    )
Also, the link here explains the coding process very well.

Sagemaker export and load model to memory

I have created a model using SageMaker (on an AWS ML notebook).
I then exported that model to S3, and a .tar.gz file was created there.
I'm trying to find a way to load the model object into memory in my code (without using the AWS Docker images and deployment) and run a prediction on it.
I looked for functions to do that in the model section of the SageMaker docs, but everything there is tightly coupled to the AWS Docker images.
I then tried opening the file with the tarfile and shutil packages, but that didn't help.
Any ideas?
With the exception of XGBoost, built-in algorithms are implemented with Apache MXNet, so simply extract the model from the .tar.gz file and load it with MXNet: load_checkpoint() is the API to use.
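The extraction step can be sketched with the standard library alone. This is a minimal illustration; the function name and directory layout are hypothetical, and the file names inside the archive depend on the algorithm that produced it.

```python
import tarfile
from pathlib import Path


def extract_model_archive(archive_path, target_dir):
    """Unpack a model.tar.gz and return the extracted file paths, sorted."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target)
    # List what came out so the right file can be handed to the loader.
    return sorted(p for p in target.rglob("*") if p.is_file())
```

If the extracted files follow MXNet's checkpoint naming (a prefix-symbol.json file and a prefix-NNNN.params file), the prefix and epoch number can then be passed to mx.model.load_checkpoint(prefix, epoch); whether a given archive follows that naming is something to check in the returned list.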
XGBoost models are just pickled objects. Unpickle and load in sklearn:
$ python3
>>> import sklearn, pickle
>>> model = pickle.load(open("xgboost-model", "rb"))
>>> type(model)
<class 'xgboost.core.Booster'>
Models trained with the built-in frameworks (TensorFlow, MXNet, PyTorch, etc.) are vanilla models that can be loaded as-is with the corresponding library.
Hope this helps.

Tensorflow Keras API on Google cloud

I have a question about using TensorFlow on Google Cloud Platform.
I heard that TensorFlow on Google Cloud doesn't support Keras. However, I can now see that TensorFlow has its own API to access Keras.
Given this, can I use the above-mentioned API inside Google Cloud, since it ships with the TensorFlow package? Any ideas?
I am able to access this API from the TensorFlow installed on an Anaconda machine.
Option 1: Please try the package-path option.
As per the docs...
"Path to a Python package to build. This should point to a directory containing the Python source for the job"
Try and give a relative path to keras from your main script.
More details here:
Option 2: If you have a file with a setup call, pass the argument install_requires=['keras'] to that call.
Google Cloud Machine Learning Engine does support Keras, but you have to list it as a dependency when starting a training job. For some specific instructions, see this SO post, or a longer exposition on this blog page. If you'd like to serve your model on Google Cloud Machine Learning or using TensorFlow Serving, then see this SO post about exporting your model.
That said, you can also use tf.contrib.keras, as long as you use the --runtime-version=1.2 flag. Just keep in mind that packages in contrib are experimental and may introduce breaking API changes between versions.
Have a look at this example on git, which I saw was recently added:
Keras Cloud ML Example

Google Cloud Service Java client configuration

We are considering using Google Cloud Storage as an alternative to AWS, and so are planning to do some performance testing on GCS. One of the features we would like to test is searching for files at a certain path. Unfortunately, the SDK does not have the ability to search for a prefix. Instead, we are forced to use the Java client API. Here is the relevant code which is failing:
GcsService gcsService = GcsServiceFactory.createGcsService(RetryParams.getDefaultInstance());
AppIdentityService appIdentity = AppIdentityServiceFactory.getAppIdentityService();
ListOptions.Builder b = new ListOptions.Builder();
// assumption: the options built above are passed as the second argument
ListResult result = gcsService.list("rms-test-bucket", b.build());
Specifically, the code falls over on the call to gcsService.list() with a NullPointerException. I attached all sources in IntelliJ, stepped through the code, and found that the cause was a call to ApiProxy.getDelegate() returning null, when it should have returned a non-null value.
We suspect that there is a configuration problem somewhere although it is not clear what it might be.
Where are you running that code from? This code should be run in App Engine standard or App Engine flexible compat (as that API is App Engine specific). For all other cases you should use the google-cloud-java client. In fact, I would suggest using that client even on App Engine, as it is supported on all platforms and is much richer in functionality. For more information see here.
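For comparison, a prefix search with a non-App-Engine client can be very short. The following is a minimal Python sketch rather than the Java client recommended above; it assumes a client object with a list_blobs(bucket_name, prefix=...) method, as provided by the google-cloud-storage package, and the function name and prefix are hypothetical.

```python
def list_object_names(client, bucket_name, prefix):
    """Return the names of all objects in the bucket that start with prefix."""
    return [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
```

With google-cloud-storage installed and credentials configured, the client would be created with storage.Client() after from google.cloud import storage, and called as list_object_names(client, "rms-test-bucket", "some/path/").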
I'm not entirely sure what's wrong with your example, but if your goal is strictly to test GCS performance with searching for files at a certain path, the gsutil command-line utility contains a solid implementation of that logic. You could use it to evaluate performance. If you're testing from a GCE instance, it's already preinstalled.
