Deploy Hugging Face model to SageMaker endpoint - amazon-sagemaker

I need to deploy a large language model (T0pp) on a SageMaker endpoint. I modified the official example to look like this:
from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

hub = {
    'HF_MODEL_ID': 'bigscience/T0',    # model_id from hf.co/models
    'HF_TASK': 'text2text-generation'  # NLP task you want to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,                   # IAM role with permissions to create an Endpoint
    transformers_version="4.6",  # transformers version used
    pytorch_version="1.7",       # pytorch version used
    py_version="py36",           # python version of the DLC
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)
but I'm getting this error
UnexpectedStatusException: Error hosting endpoint huggingface-pytorch-inference-2022-09-21-15-44-30-116: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check.
Any idea what is going wrong here?

Related

How to Add Environment Variables to Google App Engine

I have deployed my Django Project to Google App Engine and I need to add environment variables.
The docs say to add them to app.yaml but that seems like bad practice because app.yaml should be in your git repository.
Is there any way to add environment variables to App Engine the same way you can add them in Cloud Run > Services > Variables & Secrets?
Google Secret Manager has been available since this spring:
Enable the Secret Manager API
Add the Secret Manager Secret Accessor role to the App Engine service account
Create secrets from the GCP web UI or programmatically (the code examples below are from the official documentation):
def create_secret(project_id, secret_id):
    """
    Create a new secret with the given name. A secret is a logical wrapper
    around a collection of secret versions. Secret versions hold the actual
    secret material.
    """
    # Import the Secret Manager client library.
    from google.cloud import secretmanager

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the parent project.
    parent = client.project_path(project_id)

    # Create the secret.
    response = client.create_secret(parent, secret_id, {
        'replication': {
            'automatic': {},
        },
    })

    # Print the new secret name.
    print('Created secret: {}'.format(response.name))
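Note that create_secret only creates the secret container; the actual value is stored by adding a secret version. As a sketch following the same (older) client API used above, a hypothetical add_secret_version helper could look like this:
def add_secret_version(project_id, secret_id, payload):
    """
    Add a new version holding the given payload (a string) to an existing secret.
    """
    # Import the Secret Manager client library.
    from google.cloud import secretmanager

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the parent secret.
    parent = client.secret_path(project_id, secret_id)

    # Add the secret version; the payload must be bytes.
    response = client.add_secret_version(parent, {'data': payload.encode('UTF-8')})

    # Print the new version name.
    print('Added secret version: {}'.format(response.name))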
Consume the secrets from the app instead of the environment variables:
def access_secret_version(project_id, secret_id, version_id):
    """
    Access the payload for the given secret version if one exists. The version
    can be a version number as a string (e.g. "5") or an alias (e.g. "latest").
    """
    # Import the Secret Manager client library.
    from google.cloud import secretmanager

    # Create the Secret Manager client.
    client = secretmanager.SecretManagerServiceClient()

    # Build the resource name of the secret version.
    name = client.secret_version_path(project_id, secret_id, version_id)

    # Access the secret version.
    response = client.access_secret_version(name)

    # Print the secret payload.
    #
    # WARNING: Do not print the secret in a production environment - this
    # snippet is showing how to access the secret material.
    payload = response.payload.data.decode('UTF-8')
    print('Plaintext: {}'.format(payload))
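As a minimal sketch of that idea (the project and secret names here are hypothetical, and the calls mirror the older client API above), a Django settings.py could then fetch the value directly instead of reading an environment variable:
# settings.py -- hypothetical project and secret names
from google.cloud import secretmanager

_client = secretmanager.SecretManagerServiceClient()
_name = _client.secret_version_path('my-project', 'django-secret-key', 'latest')
SECRET_KEY = _client.access_secret_version(_name).payload.data.decode('UTF-8')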
If you are using a continuous deployment process, you could rewrite (or create) the app.yaml to include the variables relevant to each deployment target within the CD build system.
We rewrite several files as part of our deployment process to App Engine using Bitbucket Pipelines. Variables can be defined at the workspace level (across multiple repositories), within a repository, and also per deployment target. These variables can be secured so they are not readable.
build: &build
  - step:
      name: Update configuration for deployment
      script:
        - find . -type f -name "*.yaml" -exec sed -i "s/\[secret-key-placeholder\]/$SECRET_KEY/g" {} +
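For example, the app.yaml committed to the repository might contain only a placeholder (the variable name here is hypothetical) that the sed step above replaces at deploy time:
env_variables:
  SECRET_KEY: "[secret-key-placeholder]"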
Refer to https://support.atlassian.com/bitbucket-cloud/docs/variables-in-pipelines/#Deployment-variables

How do I recreate session cookies in WebDriver for IAP secured resource?

I am trying to automate a site that is protected behind an Identity-Aware Proxy (IAP) on Google Cloud Platform (GCP). I currently have access to a service account with which I am able to make API requests using an OpenID token.
When logging into the application normally (with username and password), I see the following cookies:
GCP_IAAP_AUTH_TOKEN_<Some GUID here>
GCP_IAP_UID
How can I use the service account credentials (available in a json file) to recreate these cookie values so that I can inject them into my selenium webdriver?
I ended up solving this using BrowserMob-Proxy. From their README:
BrowserMob Proxy allows you to manipulate HTTP requests and responses, capture HTTP content, and export performance data as a HAR file.
For Python 3, I did the following:
Prerequisites:
BrowserMob-Proxy installed and running
For Mac I used Homebrew:
?> brew install browsermob-proxy
?> brew services start browsermob-proxy
Set up a local Python 3 environment with pipenv (or your choice of virtual environment manager):
?> brew install pipenv
?> pipenv --python 3.8
?> pipenv install browsermobproxy
?> pipenv install selenium
?> pipenv install ....
The ability to authenticate with the data source for your web page. Since I was using a GCP service, I followed the flow published in the IAP documentation for getting the authentication token, found here: Authenticating from a service account
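As one possible sketch of how such a get_auth_token() helper (used in the snippet below) might look with the google-auth library (the key file path and IAP OAuth client ID are placeholders):
from google.oauth2 import service_account
from google.auth.transport.requests import Request

def get_auth_token():
    # hypothetical key file path and IAP OAuth client ID (the "audience")
    credentials = service_account.IDTokenCredentials.from_service_account_file(
        '/path/to/service-account.json',
        target_audience='YOUR_IAP_CLIENT_ID.apps.googleusercontent.com',
    )
    # fetch a fresh OpenID Connect token for the service account
    credentials.refresh(Request())
    return credentials.token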
Simplified code for adding the proxy:
from selenium.webdriver import ChromeOptions
import browsermobproxy
# 1. Do whatever you need to do to get your token
token = get_auth_token()
# 2. Create browsermob client and add auth to headers
client = browsermobproxy.Client("localhost:9090") # port depends on your own setup
client.headers({"Authorization": "Bearer {}".format(token)})
# 3. Create browser (can vary wildly based on your own needs)
chrome_options = ChromeOptions()
chrome_options.add_argument("--ignore-certificate-errors") # I needed this, you may not
caps = chrome_options.to_capabilities()
client.add_to_capabilities(caps) # This is important!
# create driver instance with your capabilities
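A minimal continuation, as a sketch assuming a local chromedriver and the Selenium 3-style capabilities built above (the target URL is hypothetical):
from selenium import webdriver

driver = webdriver.Chrome(desired_capabilities=caps)  # assumes chromedriver is on your PATH
driver.get("https://your-iap-protected-app.example.com")  # hypothetical IAP-protected URL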

Consume SQS tasks from App Engine

I'm attempting to integrate with a third party that is posting messages on an Amazon SQS queue. I need my GAE backend to receive these messages.
Essentially, I want the following script to launch and always be running
import boto3
from google.appengine.ext import deferred  # used below to fan out message processing

sqs_client = boto3.client('sqs',
                          aws_access_key_id=KEY,
                          aws_secret_access_key=SECRET,
                          region_name=REGION)

while True:
    msgs_response = sqs_client.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=60)
    for message in msgs_response.get('Messages', []):
        deferred.defer(process_and_delete_message, message)
My main App Engine web app is on Automatic Scaling (with the 60-second & 10-minute task timeouts), but I'm thinking of setting up a micro-service set to either Manual Scaling or Basic Scaling because:
Requests can run indefinitely. A manually-scaled instance can choose to handle /_ah/start and execute a program or script for many hours without returning an HTTP response code. Task queue tasks can run up to 24 hours.
https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine
Apparently both Manual & Basic Scaling also allow "Background Threads", but I am having a hard time finding documentation for them, and I'm thinking this may be a relic from the days before they deprecated Backends in favor of Modules (although I did find this https://cloud.google.com/appengine/docs/standard/python/refdocs/modules/google/appengine/api/background_thread/background_thread#BackgroundThread).
Is Manual or Basic Scaling suited for this? If so, what should I use to listen on sqs_client.receive_message()? One thing I'm concerned about is this task/background thread dying and not relaunching itself.
This may be a possible solution:
Try using a Google Compute Engine micro instance to run that script continuously and send a REST call to your App Engine app. Easy Python Example For Compute Engine
OR:
I have used modules running instance type B2/B1 for long-running jobs and have never had any trouble, but those jobs do start and stop. I use basic scaling with max_instances set to 1. The jobs I run take around 6 hours to complete.
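For reference, a basic-scaling module .yaml along those lines looks roughly like this (the instance class and idle timeout are example values):
instance_class: B2
basic_scaling:
  max_instances: 1
  idle_timeout: 10m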
I ended up creating a manual-scaling App Engine standard micro-service for this. This micro-service has a handler for /_ah/start that never returns and runs indefinitely (many days at a time), and when it does get stopped, App Engine restarts it immediately.
Requests can run indefinitely. A manually-scaled instance can choose to handle /_ah/start and execute a program or script for many hours without returning an HTTP response code. Task queue tasks can run up to 24 hours.
https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine
My /_ah/start handler listens to the SQS queue and creates Push Queue tasks that my default service is set up to listen for (a sketch of that handler is included after the files below).
I was looking into the Compute Engine route as well as the App Engine Flex route (which is essentially Compute Engine managed by App Engine), but there were other complexities, like not getting access to ndb and the taskqueue SDK, and I didn't have time to dive into that.
Below are all of the files for this micro-service; not included is my lib folder, which contains the source code for boto3 and some other libraries I needed.
I hope this is helpful for someone.
gaesqs.yaml:
application: my-project-id
module: gaesqs
version: dev
runtime: python27
api_version: 1
threadsafe: true

manual_scaling:
  instances: 1

env_variables:
  theme: 'default'
  GAE_USE_SOCKETS_HTTPLIB: 'true'

builtins:
- appstats: on #/_ah/stats/
- remote_api: on #/_ah/remote_api/
- deferred: on

handlers:
- url: /.*
  script: gaesqs_main.app

libraries:
- name: jinja2
  version: "2.6"
- name: webapp2
  version: "2.5.2"
- name: markupsafe
  version: "0.15"
- name: ssl
  version: "2.7.11"
- name: pycrypto
  version: "2.6"
- name: lxml
  version: latest
gaesqs_main.py:
#!/usr/bin/env python
import json
import logging

import appengine_config

try:
    # This is needed to make local development work with SSL.
    # See http://stackoverflow.com/a/24066819/500584
    # and https://code.google.com/p/googleappengine/issues/detail?id=9246 for more information.
    from google.appengine.tools.devappserver2.python import sandbox
    sandbox._WHITE_LIST_C_MODULES += ['_ssl', '_socket']
    import sys
    # this is socket.py copied from a standard python install
    from lib import stdlib_socket
    socket = sys.modules['socket'] = stdlib_socket
except ImportError:
    pass

import boto3
import os
import webapp2
from webapp2_extras.routes import RedirectRoute
from google.appengine.api import taskqueue

app = webapp2.WSGIApplication(debug=os.environ['SERVER_SOFTWARE'].startswith('Dev'))  #, config=webapp2_config)

KEY = "<MY-KEY>"
SECRET = "<MY-SECRET>"
REGION = "<MY-REGION>"
QUEUE_URL = "<MY-QUEUE_URL>"


def process_message(message_body):
    queue = taskqueue.Queue('default')
    task = taskqueue.Task(
        url='/task/sqs-process/',
        countdown=0,
        target='default',
        params={'message': message_body})
    queue.add(task)


class Start(webapp2.RequestHandler):
    def get(self):
        logging.info("Start")
        for loggers_to_suppress in ['boto3', 'botocore', 'nose', 's3transfer']:
            logger = logging.getLogger(loggers_to_suppress)
            if logger:
                logger.setLevel(logging.WARNING)
        logging.info("boto3 loggers suppressed")

        sqs_client = boto3.client('sqs',
                                  aws_access_key_id=KEY,
                                  aws_secret_access_key=SECRET,
                                  region_name=REGION)
        while True:
            msgs_response = sqs_client.receive_message(QueueUrl=QUEUE_URL, WaitTimeSeconds=20)
            logging.info("msgs_response: %s" % msgs_response)
            for message in msgs_response.get('Messages', []):
                logging.info("message: %s" % message)
                process_message(message['Body'])
                sqs_client.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message['ReceiptHandle'])


_routes = [
    RedirectRoute('/_ah/start', Start, name='start'),
]

for r in _routes:
    app.router.add(r)
appengine_config.py:
import os

from google.appengine.ext import vendor
from google.appengine.ext.appstats import recording

appstats_CALC_RPC_COSTS = True

# Add any libraries installed in the "lib" folder.
# Use pip with the -t lib flag to install libraries in this directory:
# $ pip install -t lib gcloud
# https://cloud.google.com/appengine/docs/python/tools/libraries27
try:
    vendor.add('lib')
except:
    print "Unable to add 'lib'"


def webapp_add_wsgi_middleware(app):
    app = recording.appstats_wsgi_middleware(app)
    return app


if os.environ.get('SERVER_SOFTWARE', '').startswith('Development'):
    print "gaesqs development"
    import imp
    import os.path
    import inspect
    from google.appengine.tools.devappserver2.python import sandbox

    sandbox._WHITE_LIST_C_MODULES += ['_ssl', '_socket']
    # Use the system socket.
    real_os_src_path = os.path.realpath(inspect.getsourcefile(os))
    psocket = os.path.join(os.path.dirname(real_os_src_path), 'socket.py')
    imp.load_source('socket', psocket)
    os.environ['HTTP_HOST'] = "my-project-id.appspot.com"
else:
    print "gaesqs prod"
    # Doing this on dev_appserver/localhost seems to cause outbound https requests to fail
    from lib import requests
    from lib.requests_toolbelt.adapters import appengine as requests_toolbelt_appengine

    # Use the App Engine Requests adapter. This makes sure that Requests uses
    # URLFetch.
    requests_toolbelt_appengine.monkeypatch()
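For completeness, here is a minimal sketch of the /task/sqs-process/ handler that the default service would expose (hypothetical webapp2-style code matching the files above; the Task created in gaesqs_main.py delivers the body as a POST form parameter named message):
import logging
import webapp2

class SqsProcessTask(webapp2.RequestHandler):
    def post(self):
        # taskqueue.Task(params=...) delivers the payload as POST form data
        message_body = self.request.get('message')
        logging.info("processing SQS message: %s" % message_body)
        # ... do the actual processing here ...

# hypothetical route registration in the default service's WSGI application
app = webapp2.WSGIApplication([('/task/sqs-process/', SqsProcessTask)], debug=False)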

Testing Google Cloud PubSub push endpoints locally

Trying to figure out the best way to test PubSub push endpoints locally. We tried with ngrok.io, but you must own the domain in order to whitelist (the tool for doing so is also broken… resulting in an infinite redirect loop). We also tried emulating PubSub locally. I am able to publish and pull, but I cannot get the push subscriptions working. We are using a local Flask webserver like so:
@app.route('/_ah/push-handlers/events', methods=['POST'])
def handle_message():
    print request.json
    return jsonify({'ok': 1}), 200
The following produces no result:
client = pubsub.Client()
topic = client.topic('events')
topic.create()
subscription = topic.subscription('test_push', push_endpoint='http://localhost:5000/_ah/push-handlers/events')
subscription.create()
topic.publish('{"test": 123}')
It doesn't even complain when we attempt to create a subscription to an HTTP endpoint (whereas live PubSub will if you do not use HTTPS). Perhaps this is by design? Pull works just fine… Any ideas on how best to develop PubSub push endpoints locally?
Based on the latest PubSub library documentation at the time of writing, the following example creates a subscription with a push configuration.
Requirements
I have tested with the following requirements:
Google Cloud SDK 285.0.1 (for PubSub local emulator)
Python 3.8.1
Python packages (requirements.txt):
flask==1.1.1
google-cloud-pubsub==1.3.1
Run PubSub emulator locally
export PUBSUB_PROJECT_ID=fake-project
gcloud beta emulators pubsub start --project=$PUBSUB_PROJECT_ID
By default, PubSub emulator starts on port 8085.
Project argument can be anything and does not matter.
Flask server
Considering the following server.py:
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/_ah/push-handlers/events', methods=['POST'])
def handle_message():
    print(request.json)
    return jsonify({'ok': 1}), 200

if __name__ == "__main__":
    app.run(port=5000)
Run the server (starts on port 5000):
python server.py
PubSub example
Considering the following pubsub.py:
import sys

from google.cloud import pubsub_v1

if __name__ == "__main__":
    project_id = sys.argv[1]

    # 1. create topic (events)
    publisher_client = pubsub_v1.PublisherClient()
    topic_path = publisher_client.topic_path(project_id, "events")
    publisher_client.create_topic(topic_path)

    # 2. create subscription (test_push with push_config)
    subscriber_client = pubsub_v1.SubscriberClient()
    subscription_path = subscriber_client.subscription_path(
        project_id, "test_push"
    )
    subscriber_client.create_subscription(
        subscription_path,
        topic_path,
        push_config={
            'push_endpoint': 'http://localhost:5000/_ah/push-handlers/events'
        }
    )

    # 3. publish a test message
    publisher_client.publish(
        topic_path,
        data='{"test": 123}'.encode("utf-8")
    )
Finally, run this script:
PUBSUB_EMULATOR_HOST=localhost:8085 \
PUBSUB_PROJECT_ID=fake-project \
python pubsub.py $PUBSUB_PROJECT_ID
Results
Then, you can see the results in the Flask server's log:
{'subscription': 'projects/fake-project/subscriptions/test_push', 'message': {'data': 'eyJ0ZXN0IjogMTIzfQ==', 'messageId': '1', 'attributes': {}}}
127.0.0.1 - - [22/Mar/2020 12:11:00] "POST /_ah/push-handlers/events HTTP/1.1" 200 -
Note that you can retrieve the message sent, base64-encoded here (message.data):
$ echo "eyJ0ZXN0IjogMTIzfQ==" | base64 -d
{"test": 123}
Of course, you can also do the decoding in Python.
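For instance, a quick sketch of the same decoding in Python:
import base64
import json

payload = json.loads(base64.b64decode('eyJ0ZXN0IjogMTIzfQ=='))
print(payload)  # {'test': 123}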
This could be a known bug (fix forthcoming) in the emulator where push endpoints created along with the subscription don't work. The bug only affects the initial push config; modifying the push config for an existing subscription should work. Can you try that?
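A sketch of that workaround using the pubsub_v1 client from the example above (names reused from that example; this is untested against the emulator):
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
topic_path = 'projects/fake-project/topics/events'
subscription_path = subscriber.subscription_path('fake-project', 'test_push')

# create the subscription without a push config first...
subscriber.create_subscription(subscription_path, topic_path)

# ...then attach the push endpoint to the existing subscription
subscriber.modify_push_config(
    subscription_path,
    pubsub_v1.types.PushConfig(push_endpoint='http://localhost:5000/_ah/push-handlers/events'),
)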
I failed to get the PubSub emulator to work on my local env (it fails with various Java exceptions), so I didn't even get to try features like push with auth, etc. I ended up using ngrok to expose my local dev server and used the public HTTPS URL from ngrok in the PubSub subscription.
I had no issue with the whitelisting and redirects described in the question.
So this might be helpful for anyone else.

Heroku Django: Running a Worker

I'm following the Heroku Django tutorial. I believe I followed it exactly. I ran no additional commands besides what they asked for.
However, when I get to the part where I sync the Celery and Kombu tables (under the "Running a Worker" section), I hit an error.
Typing in their command, python hellodjango/manage.py syncdb, gives me the following:
...
File "/Users/Alex/Coding/getcelery/venv/lib/python2.7/site-packages/django/db/backends/dummy/base.py", line 15, in complain
raise ImproperlyConfigured("You haven't set the database ENGINE setting yet.")
django.core.exceptions.ImproperlyConfigured: You haven't set the database ENGINE setting yet.
Anybody run into this problem before? Should I be doing something that's not explicit in the tutorial?
Any hints would be greatly appreciated!
Your output is from running syncdb locally. Enabling the database add-on will set DATABASE_URL in your Heroku config, and hence in the environment of the dynos (see heroku config). What it won't do is set DATABASE_URL locally - you'll need to do that yourself (or set up some other local database).
It's likely because your DATABASES dictionary is undefined. Try adding this code, which reads the database configuration from the DATABASE_URL environment variable so that the Celery DB can be set up from it:
import os
import sys
import urlparse

# Register database schemes in URLs.
urlparse.uses_netloc.append('postgres')
urlparse.uses_netloc.append('mysql')

try:
    # Check to make sure DATABASES is set in settings.py file.
    # If not default to {}
    if 'DATABASES' not in locals():
        DATABASES = {}

    if 'DATABASE_URL' in os.environ:
        url = urlparse.urlparse(os.environ['DATABASE_URL'])

        # Ensure default database exists.
        DATABASES['default'] = DATABASES.get('default', {})

        # Update with environment configuration.
        DATABASES['default'].update({
            'NAME': url.path[1:],
            'USER': url.username,
            'PASSWORD': url.password,
            'HOST': url.hostname,
            'PORT': url.port,
        })

        if url.scheme == 'postgres':
            DATABASES['default']['ENGINE'] = 'django.db.backends.postgresql_psycopg2'

        if url.scheme == 'mysql':
            DATABASES['default']['ENGINE'] = 'django.db.backends.mysql'
except Exception:
    print 'Unexpected error:', sys.exc_info()
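To illustrate, with a hypothetical DATABASE_URL of postgres://user:pass@host:5432/mydb, the snippet above would produce roughly:
DATABASES = {
    'default': {
        'NAME': 'mydb',
        'USER': 'user',
        'PASSWORD': 'pass',
        'HOST': 'host',
        'PORT': 5432,
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
    }
}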
