SageMaker - UnknownServiceError for Session
I am trying to run a simple model in SageMaker.
When I run the following code, I keep getting this error.
It is very simple code that I saw in some tutorials and in the SageMaker examples that are included in the Jupyter notebook.
Does anyone know what I should do to make it work?
input:
import sagemaker
sess = sagemaker.Session()
output:
---------------------------------------------------------------------------
UnknownServiceError Traceback (most recent call last)
/tmp/ipykernel_11350/4126940475.py in <cell line: 3>()
1 import sagemaker
2
----> 3 sess = sagemaker.Session()
~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker/session.py in __init__(self, boto_session, sagemaker_client, sagemaker_runtime_client, sagemaker_featurestore_runtime_client, default_bucket, settings, sagemaker_metrics_client)
131 self.settings = settings
132
--> 133 self._initialize(
134 boto_session=boto_session,
135 sagemaker_client=sagemaker_client,
~/anaconda3/envs/python3/lib/python3.8/site-packages/sagemaker/session.py in _initialize(self, boto_session, sagemaker_client, sagemaker_runtime_client, sagemaker_featurestore_runtime_client, sagemaker_metrics_client)
183 self.sagemaker_metrics_client = sagemaker_metrics_client
184 else:
--> 185 self.sagemaker_metrics_client = self.boto_session.client("sagemaker-metrics")
186 prepend_user_agent(self.sagemaker_metrics_client)
187
~/anaconda3/envs/python3/lib/python3.8/site-packages/boto3/session.py in client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
297
298 """
--> 299 return self._session.create_client(
300 service_name,
301 region_name=region_name,
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/session.py in create_client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
868 retryhandler, translate, response_parser_factory,
869 exceptions_factory, config_store)
--> 870 client = client_creator.create_client(
871 service_name=service_name, region_name=region_name,
872 is_secure=use_ssl, endpoint_url=endpoint_url, verify=verify,
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/client.py in create_client(self, service_name, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, api_version, client_config)
85 'choose-service-name', service_name=service_name)
86 service_name = first_non_none_response(responses, default=service_name)
---> 87 service_model = self._load_service_model(service_name, api_version)
88 cls = self._create_client_class(service_name, service_model)
89 region_name, client_config = self._normalize_fips_region(
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/client.py in _load_service_model(self, service_name, api_version)
152
153 def _load_service_model(self, service_name, api_version=None):
--> 154 json_model = self._loader.load_service_model(service_name, 'service-2',
155 api_version=api_version)
156 service_model = ServiceModel(json_model, service_name=service_name)
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/loaders.py in _wrapper(self, *args, **kwargs)
130 if key in self._cache:
131 return self._cache[key]
--> 132 data = func(self, *args, **kwargs)
133 self._cache[key] = data
134 return data
~/anaconda3/envs/python3/lib/python3.8/site-packages/botocore/loaders.py in load_service_model(self, service_name, type_name, api_version)
375 known_services = self.list_available_services(type_name)
376 if service_name not in known_services:
--> 377 raise UnknownServiceError(
378 service_name=service_name,
379 known_service_names=', '.join(sorted(known_services)))
UnknownServiceError: Unknown service: 'sagemaker-metrics'. Valid service names are: accessanalyzer, account, acm, acm-pca, alexaforbusiness, amp, amplify, amplifybackend, amplifyuibuilder, apigateway, apigatewaymanagementapi, apigatewayv2, appconfig, appconfigdata, appflow, appintegrations, application-autoscaling, application-insights, applicationcostprofiler, appmesh, apprunner, appstream, appsync, athena, auditmanager, autoscaling, autoscaling-plans, backup, backup-gateway, batch, braket, budgets, ce, chime, chime-sdk-identity, chime-sdk-meetings, chime-sdk-messaging, cloud9, cloudcontrol, clouddirectory, cloudformation, cloudfront, cloudhsm, cloudhsmv2, cloudsearch, cloudsearchdomain, cloudtrail, cloudwatch, codeartifact, codebuild, codecommit, codedeploy, codeguru-reviewer, codeguruprofiler, codepipeline, codestar, codestar-connections, codestar-notifications, cognito-identity, cognito-idp, cognito-sync, comprehend, comprehendmedical, compute-optimizer, config, connect, connect-contact-lens, connectparticipant, cur, customer-profiles, databrew, dataexchange, datapipeline, datasync, dax, detective, devicefarm, devops-guru, directconnect, discovery, dlm, dms, docdb, drs, ds, dynamodb, dynamodbstreams, ebs, ec2, ec2-instance-connect, ecr, ecr-public, ecs, efs, eks, elastic-inference, elasticache, elasticbeanstalk, elastictranscoder, elb, elbv2, emr, emr-containers, es, events, evidently, finspace, finspace-data, firehose, fis, fms, forecast, forecastquery, frauddetector, fsx, gamelift, glacier, globalaccelerator, glue, grafana, greengrass, greengrassv2, groundstation, guardduty, health, healthlake, honeycode, iam, identitystore, imagebuilder, importexport, inspector, inspector2, iot, iot-data, iot-jobs-data, iot1click-devices, iot1click-projects, iotanalytics, iotdeviceadvisor, iotevents, iotevents-data, iotfleethub, iotsecuretunneling, iotsitewise, iotthingsgraph, iottwinmaker, iotwireless, ivs, kafka, kafkaconnect, kendra, keyspaces, kinesis, 
kinesis-video-archived-media, kinesis-video-media, kinesis-video-signaling, kinesisanalytics, kinesisanalyticsv2, kinesisvideo, kms, lakeformation, lambda, lex-models, lex-runtime, lexv2-models, lexv2-runtime, license-manager, lightsail, location, logs, lookoutequipment, lookoutmetrics, lookoutvision, machinelearning, macie, macie2, managedblockchain, marketplace-catalog, marketplace-entitlement, marketplacecommerceanalytics, mediaconnect, mediaconvert, medialive, mediapackage, mediapackage-vod, mediastore, mediastore-data, mediatailor, memorydb, meteringmarketplace, mgh, mgn, migration-hub-refactor-spaces, migrationhub-config, migrationhubstrategy, mobile, mq, mturk, mwaa, neptune, network-firewall, networkmanager, nimble, opensearch, opsworks, opsworkscm, organizations, outposts, panorama, personalize, personalize-events, personalize-runtime, pi, pinpoint, pinpoint-email, pinpoint-sms-voice, polly, pricing, proton, qldb, qldb-session, quicksight, ram, rbin, rds, rds-data, redshift, redshift-data, rekognition, resiliencehub, resource-groups, resourcegroupstaggingapi, robomaker, route53, route53-recovery-cluster, route53-recovery-control-config, route53-recovery-readiness, route53domains, route53resolver, rum, s3, s3control, s3outposts, sagemaker, sagemaker-a2i-runtime, sagemaker-edge, sagemaker-featurestore-runtime, sagemaker-runtime, savingsplans, schemas, sdb, secretsmanager, securityhub, serverlessrepo, service-quotas, servicecatalog, servicecatalog-appregistry, servicediscovery, ses, sesv2, shield, signer, sms, sms-voice, snow-device-management, snowball, sns, sqs, ssm, ssm-contacts, ssm-incidents, sso, sso-admin, sso-oidc, stepfunctions, storagegateway, sts, support, swf, synthetics, textract, timestream-query, timestream-write, transcribe, transfer, translate, voice-id, waf, waf-regional, wafv2, wellarchitected, wisdom, workdocs, worklink, workmail, workmailmessageflow, workspaces, workspaces-web, xray
I tried updating some libraries, such as sagemaker, boto, and boto3, but nothing seems to help.
I came across the same issue trying to run a machine learning tutorial by AWS in a notebook instance. What I had to do was update sagemaker within the notebook instance like so:
import sys
!{sys.executable} -m pip install sagemaker -U
Hopefully this fixes your problem :)
I had the same issue, running
pip install sagemaker -U
solved it for me. Similar to the following issue: Sagemaker Studio UnkownServiceError for Session
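The traceback above shows sagemaker asking botocore for a 'sagemaker-metrics' client that the installed botocore does not list, which is consistent with the upgrade fixing it. A minimal sketch for checking this before and after upgrading follows. The helper names missing_services and installed_services are made up for this illustration, and the second one only works where botocore is importable:

```python
def missing_services(required, available):
    """Return the required service names that are absent from `available`."""
    known = set(available)
    return [name for name in required if name not in known]


def installed_services():
    """List the service names the installed botocore knows, or None without botocore."""
    try:
        import botocore.session
    except ImportError:
        return None
    return botocore.session.get_session().get_available_services()


# Service names taken from the error message above (trimmed copy of the valid list).
from_error_message = [
    "sagemaker",
    "sagemaker-a2i-runtime",
    "sagemaker-edge",
    "sagemaker-featurestore-runtime",
    "sagemaker-runtime",
]
required = ["sagemaker", "sagemaker-metrics"]

# Check against the list printed in the error message.
print(missing_services(required, from_error_message))

# Check against the live environment, if botocore is installed here.
available = installed_services()
if available is not None:
    print(missing_services(required, available))
```

If the second print still lists 'sagemaker-metrics' after the upgrade, the kernel is probably still using the old packages, and restarting it may be needed.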
Related
Declare a queue with x-max-length programmatically using Rabbitmq-c
I am implementing an RPC function for my C application and trying to programmatically declare a queue that limits the maximum number of pending messages. After reading the declarations of amqp_table_entry_t and amqp_field_value_t in amqp.h, here is my minimal code sample:
int default_channel_id = 1;
int passive = 0;
int durable = 1;
int exclusive = 0;
int auto_delete = 0;
amqp_table_entry_t *q_arg_n_elms = malloc(sizeof(amqp_table_entry_t));
*q_arg_n_elms = (amqp_table_entry_t) {.key = amqp_cstring_bytes("x-max-length"),
    .value = {.kind = AMQP_FIELD_KIND_U32, .value = {.u32 = 234 }}};
amqp_table_t q_arg_table = {.num_entries=1, .entries=q_arg_n_elms};
amqp_queue_declare( conn, default_channel_id, amqp_cstring_bytes("my_queue_123"),
    passive, durable, exclusive, auto_delete, q_arg_table );
amqp_rpc_reply_t _reply = amqp_get_rpc_reply(conn);
The code above always returns AMQP_RESPONSE_LIBRARY_EXCEPTION in the amqp_rpc_reply_t object, with the error message "a socket error occurred". I don't see any active connection triggered by this code in the RabbitMQ web management UI, so I think the rabbitmq-c library doesn't establish a connection and just replies with an error. However, everything works perfectly when I replace the argument q_arg_table with the default amqp_empty_table (which means no arguments).
Here are my questions:
Where can I find the code that filters out invalid keys in the queue arguments? According to this article, x-max-length should be the correct argument key for limiting the number of messages in a queue, but I cannot figure out why the library still reports an error.
Is there any example that demonstrates how to properly set up the amqp_table_t passed to amqp_queue_declare(...)?
Development environment:
RabbitMQ v3.2.4
rabbitmq-c v0.11.0
Appreciate any feedback, thanks for reading.
[Edit] According to the server log rabbit#myhostname-sasl.log, the RabbitMQ broker accepted a new connection, found a decode error on receiving a frame, then closed the connection immediately. I haven't figured out the Erlang implementation, but the root cause is likely the decoding error on the table argument when declaring the queue.
=CRASH REPORT==== 18-May-2022::16:05:46 ===
crasher:
  initial call: rabbit_reader:init/2
  pid: <0.23706.1>
  registered_name: []
  exception error: no function clause matching
    rabbit_binary_parser:parse_field_value(<<105,0,0,1,44>>) (src/rabbit_binary_parser.erl, line 53)
    in function rabbit_binary_parser:parse_table/1 (src/rabbit_binary_parser.erl, line 44)
    in call from rabbit_framing_amqp_0_9_1:decode_method_fields/2 (src/rabbit_framing_amqp_0_9_1.erl, line 791)
    in call from rabbit_command_assembler:process/2 (src/rabbit_command_assembler.erl, line 85)
    in call from rabbit_reader:process_frame/3 (src/rabbit_reader.erl, line 688)
    in call from rabbit_reader:handle_input/3 (src/rabbit_reader.erl, line 738)
    in call from rabbit_reader:recvloop/2 (src/rabbit_reader.erl, line 292)
    in call from rabbit_reader:run/1 (src/rabbit_reader.erl, line 273)
  ancestors: [<0.23704.1>,rabbit_tcp_client_sup,rabbit_sup,<0.145.0>]
  messages: [{'EXIT',#Port<0.31561>,normal}]
  links: [<0.23704.1>]
  dictionary: [{{channel,1},
    {<0.23720.1>,{method,rabbit_framing_amqp_0_9_1}}},
    {{ch_pid,<0.23720.1>},{1,#Ref<0.0.20.156836>}}]
  trap_exit: true
  status: running
  heap_size: 2586
  stack_size: 27
  reductions: 2849
neighbours:
RabbitMQ may not support unsigned integers as table values. Instead, try using a signed 32-bit or 64-bit number (e.g., .value = {.kind = AMQP_FIELD_KIND_I32, .value = {.i32 = 234 }}). The RabbitMQ server logs may also contain additional debugging information that helps explain errors like this, and the amqp_error_string2 function can be used to translate an error code into an error string.
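To see what the broker choked on, the five bytes from the crash report, <<105,0,0,1,44>>, can be taken apart by hand. The sketch below is in Python rather than C and assumes the usual AMQP 0-9-1 field layout of a one-byte type tag followed by a big-endian 32-bit payload. The mapping of 'i' to AMQP_FIELD_KIND_U32 and 'I' to AMQP_FIELD_KIND_I32 is my reading of rabbitmq-c's amqp.h, so verify it against your copy of the header. The function name split_field is made up for this illustration.

```python
import struct


def split_field(raw):
    """Split a table field value into its type tag and big-endian 32-bit payload."""
    tag = chr(raw[0])
    (value,) = struct.unpack(">I", raw[1:5])
    return tag, value


# The bytes that rabbit_binary_parser:parse_field_value/1 failed to match.
crash_bytes = bytes([105, 0, 0, 1, 44])

tag, value = split_field(crash_bytes)
print(tag, value)  # i 300

# Assumed tags from amqp.h: 'i' is the unsigned 32-bit kind, 'I' the signed one.
print(ord("i"), ord("I"))  # 105 73
```

The tag byte 105 is a lowercase 'i', which fits the suggestion above: the broker version in question apparently has no clause for that tag, while the signed kind would be sent with tag 73 ('I').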
NoSuchEntityException: An error occurred (NoSuchEntity) when calling the GetRole operation: The user with name <name> cannot be found
Call to get_execution_role() from notebook instance fails with the error message NoSuchEntityException: An error occurred (NoSuchEntity) when calling the GetRole operation: The user with name <name> cannot be found.
Stack trace:
NoSuchEntityException Traceback (most recent call last)
<ipython-input-1-1e2d3f162cfe> in <module>()
5 sagemaker_session = sagemaker.Session()
6
----> 7 role = get_execution_role()
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/session.pyc in get_execution_role(sagemaker_session)
871 if not sagemaker_session:
872 sagemaker_session = Session()
--> 873 arn = sagemaker_session.get_caller_identity_arn()
874
875 if 'role' in arn:
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/sagemaker/session.pyc in get_caller_identity_arn(self)
701 # Call IAM to get the role's path
702 role_name = role[role.rfind('/') + 1:]
--> 703 role = self.boto_session.client('iam').get_role(RoleName=role_name)['Role']['Arn']
704
705 return role
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/botocore/client.pyc in _api_call(self, *args, **kwargs)
312 "%s() only accepts keyword arguments." % py_operation_name)
313 # The "self" in this scope is referring to the BaseClient.
--> 314 return self._make_api_call(operation_name, kwargs)
315
316 _api_call.__name__ = str(py_operation_name)
/home/ec2-user/anaconda3/envs/tensorflow_p27/lib/python2.7/site-packages/botocore/client.pyc in _make_api_call(self, operation_name, api_params)
610 error_code = parsed_response.get("Error", {}).get("Code")
611 error_class = self.exceptions.from_code(error_code)
--> 612 raise error_class(parsed_response, operation_name)
613 else:
614 return parsed_response
NoSuchEntityException: An error occurred (NoSuchEntity) when calling the GetRole operation: The user with name <name> cannot be found.
However, using the boto client directly to get info about the role succeeds. This works fine:
response = client.get_role(
    RoleName='role-name',
)['Role']['Arn']
Turns out this is some weird bug that goes away if you stop and start the notebook instance.
I shut down the notebook and started it again, and it works. P.S.: I had to run the code again for it to take effect.
Sagemaker Hyperparameter Optimization XGBoost
I am trying to build a hyperparameter optimization job in Amazon SageMaker, in Python, but something is not working. Here is what I have:
sess = sagemaker.Session()
xgb = sagemaker.estimator.Estimator(containers[boto3.Session().region_name],
                                    role,
                                    train_instance_count=1,
                                    train_instance_type='ml.m4.4xlarge',
                                    output_path=output_path_1,
                                    base_job_name='HPO-xgb',
                                    sagemaker_session=sess)
from sagemaker.tuner import HyperparameterTuner, IntegerParameter, CategoricalParameter, ContinuousParameter
hyperparameter_ranges = {'eta': ContinuousParameter(0.01, 0.2),
                         'num_rounds': ContinuousParameter(100, 500),
                         'num_class': 4,
                         'max_depth': IntegerParameter(3, 9),
                         'gamma': IntegerParameter(0, 5),
                         'min_child_weight': IntegerParameter(2, 6),
                         'subsample': ContinuousParameter(0.5, 0.9),
                         'colsample_bytree': ContinuousParameter(0.5, 0.9)}
objective_metric_name = 'validation:mlogloss'
objective_type='minimize'
metric_definitions = [{'Name': 'validation-mlogloss',
                       'Regex': 'validation-mlogloss=([0-9\\.]+)'}]
tuner = HyperparameterTuner(xgb,
                            objective_metric_name,
                            objective_type,
                            hyperparameter_ranges,
                            metric_definitions,
                            max_jobs=9,
                            max_parallel_jobs=3)
tuner.fit({'train': s3_input_train, 'validation': s3_input_validation})
And the error I get is:
AttributeError: 'str' object has no attribute 'keys'
The error seems to come from the tuner.py file:
----> 1 tuner.fit({'train': s3_input_train, 'validation': s3_input_validation})
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in fit(self, inputs, job_name, **kwargs)
144 self.estimator._prepare_for_training(job_name)
145
--> 146 self._prepare_for_training(job_name=job_name)
147 self.latest_tuning_job = _TuningJob.start_new(self, inputs)
148
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in _prepare_for_training(self, job_name)
120
121 self.static_hyperparameters = {to_str(k): to_str(v) for (k, v) in self.estimator.hyperparameters().items()}
--> 122 for hyperparameter_name in self._hyperparameter_ranges.keys():
123 self.static_hyperparameters.pop(hyperparameter_name, None)
124
AttributeError: 'list' object has no attribute 'keys'
Your arguments when initializing the HyperparameterTuner object are in the wrong order. The constructor has the following signature:
HyperparameterTuner(estimator, objective_metric_name, hyperparameter_ranges, metric_definitions=None, strategy='Bayesian', objective_type='Maximize', max_jobs=1, max_parallel_jobs=1, tags=None, base_tuning_job_name=None)
so in this case, your objective_type is in the wrong position. Passed as the third positional argument, it is taken as hyperparameter_ranges, which is why the tuner ends up calling .keys() on something that is not a dict. See the docs for more details.
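The mix-up can be reproduced without SageMaker at all. The sketch below uses a do-nothing stand-in function (hyperparameter_tuner, a name made up here) that only copies the signature quoted in the answer, and asks Python which parameter each argument lands on. The ranges value is a placeholder, and 'Minimize' with a capital M is an assumption based on the 'Maximize' default in that signature.

```python
import inspect


def hyperparameter_tuner(estimator, objective_metric_name, hyperparameter_ranges,
                         metric_definitions=None, strategy='Bayesian',
                         objective_type='Maximize', max_jobs=1, max_parallel_jobs=1,
                         tags=None, base_tuning_job_name=None):
    """Stand-in that only mirrors the quoted signature; it is not the real class."""


def bound_arguments(*args, **kwargs):
    """Return a dict mapping parameter names to the arguments bound to them."""
    signature = inspect.signature(hyperparameter_tuner)
    return dict(signature.bind(*args, **kwargs).arguments)


ranges = {"max_depth": (3, 9)}  # placeholder for the real parameter ranges
metrics = [{"Name": "validation-mlogloss",
            "Regex": "validation-mlogloss=([0-9\\.]+)"}]

# Argument order used in the question: objective_type is passed third.
wrong = bound_arguments("xgb", "validation:mlogloss", "minimize", ranges, metrics,
                        max_jobs=9, max_parallel_jobs=3)
print(wrong["hyperparameter_ranges"])  # minimize

# Keyword arguments make the binding independent of position.
right = bound_arguments("xgb", "validation:mlogloss", ranges,
                        metric_definitions=metrics, objective_type="Minimize",
                        max_jobs=9, max_parallel_jobs=3)
print(right["hyperparameter_ranges"])  # {'max_depth': (3, 9)}
```

With the question's ordering, the string ends up as hyperparameter_ranges, the ranges dict as metric_definitions, and the metrics list as strategy. Passing everything after the first two arguments by keyword avoids that.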
Django/pyodbc error: not enough arguments for format string
I have a Dictionary model defined in Django (1.6.5). One method (called get_topentities) returns the top names in my dictionary (entity names are defined by Entity model):
def get_topentities(self,n):
    entities = self.entity_set.select_related().filter(in_dico=True,table_type=0).order_by("rank")[0:n]
    return entities
When I call the function (say with n=2), it returns the top 2 elements but I cannot access the second one because of this "not enough arguments to format string" error:
In [5]: d = Dictionary.objects.get(code='USA')
In [6]: top2 = d.get_topentities(2)
In [7]: top2
Out[7]: [<Entity: BARACK OBAMA>, <Entity: GOVERNMENT>]
In [8]: top2[0]
Out[8]: <Entity: BARACK OBAMA>
In [9]: top2[1]
.
.
/usr/local/lib/python2.7/dist-packages/django_pyodbc/compiler.pyc in as_sql(self, with_limits, with_col_aliases)
172 # Lop off ORDER... and the initial "SELECT"
173 inner_select = _remove_order_limit_offset(raw_sql)
--> 174 outer_fields, inner_select = self._alias_columns(inner_select)
175
176 order = _get_order_limit_offset(raw_sql)[0]
/usr/local/lib/python2.7/dist-packages/django_pyodbc/compiler.pyc in _alias_columns(self, sql)
339
340 # store the expanded paren string
--> 341 parens[key] = buf% parens
342 #cannot use {} because IBM's DB2 uses {} as quotes
343 paren_buf[paren_depth] += '(%(' + key + ')s)'
TypeError: not enough arguments for format string
In [10]:
My server backend is MSSQL and I'm using pyodbc as the database driver. If I try the same on a PC with engine sqlserver_ado, it works. Can someone help?
Regards, Patrick
google-app-engine eclipse PolicyUtil.getKeyStore IllegalArgumentException can't start debug environment
I can't seem to start my development environment for GAE on Eclipse anymore. Once I start it, it goes to the debug view with "source not found" for PolicyUtil.getKeyStore and an IllegalArgumentException. Has anyone else had this problem? I've tried deleting my run configs but no luck. Any help is much appreciated. Here's the stack trace:
{Daemon Thread [Thread-1] (Suspended (exception IllegalArgumentException))
PolicyUtil.getKeyStore(URL, String, String, String, String, Debug) line: 65
PolicyFile.init(URL, PolicyFile$PolicyInfo) line: 635
PolicyFile.access$400(PolicyFile, URL, PolicyFile$PolicyInfo) line: 266
PolicyFile$3.run() line: 546
AccessController.doPrivileged(PrivilegedAction<T>) line: not available [native method]
PolicyFile.initPolicyFile(String, String, PolicyFile$PolicyInfo) line: 519
PolicyFile.initPolicyFile(PolicyFile$PolicyInfo, URL) line: 505
PolicyFile.init(URL) line: 464
PolicyFile.<init>() line: 309
NativeConstructorAccessorImpl.newInstance0(Constructor, Object[]) line: not available [native method]
NativeConstructorAccessorImpl.newInstance(Object[]) line: 39
DelegatingConstructorAccessorImpl.newInstance(Object[]) line: 27
Constructor<T>.newInstance(Object...) line: 513
Class<T>.newInstance0() line: 355
Class<T>.newInstance() line: 308
Policy.getPolicyNoCheck() line: 167
ProtectionDomain.implies(Permission) line: 224
AccessControlContext.checkPermission(Permission) line: 352
AccessController.checkPermission(Permission) line: 546
SecurityManager.checkPermission(Permission) line: 532
Policy.getPolicy() line: 133
SecurityManagerInstaller.install(URL...) line: 81
DevAppServerFactory.createDevAppServer(File, File, File, String, int, boolean, boolean, Map<String,Object>) line: 136
DevAppServerFactory.createDevAppServer(File, File, File, String, int, boolean) line: 78
DevAppServerFactory.createDevAppServer(File, String, int) line: 52
DevAppServerMain$StartAction.apply() line: 175
Parser$ParseResult.applyArgs() line: 48
DevAppServerMain.<init>(String[]) line: 128
DevAppServerMain.main(String[]) line: 104}
I went on and downloaded the latest versions of GAE and the Eclipse plugin, hoping that would fix it, but no luck.
Regards, John
Fixed - My workspace was corrupt. I created a new workspace folder and had to import my project and settings.