SageMaker: DeepAR Hyperparameter Tuning Error - amazon-sagemaker

Running into a new issue with tuning DeepAR on SageMaker when trying to initialize a hyperparameter tuning job - this error also occurs when using the test:mean_wQuantileLoss metric. I've upgraded the sagemaker package, restarted my instance, and restarted the kernel (I'm using a Jupyter notebook), yet the problem persists.
ClientError: An error occurred (ValidationException) when calling the
CreateHyperParameterTuningJob operation: The objective metric type, [Maximize], that you specified for objective metric, [test:RMSE], isn’t valid for the [156387875391.dkr.ecr.us-west-2.amazonaws.com/forecasting-deepar:1] algorithm. Choose a valid objective metric type.
Code:
my_tuner = HyperparameterTuner(estimator=estimator,
                               objective_metric_name="test:RMSE",
                               hyperparameter_ranges=hyperparams,
                               max_jobs=20,
                               max_parallel_jobs=2)
# Start hyperparameter tuning job
my_tuner.fit(inputs=data_channels)
Stack Trace:
ClientError Traceback (most recent call last)
<ipython-input-66-9d6d8de89536> in <module>()
7
8 # Start hyperparameter tuning job
----> 9 my_tuner.fit(inputs=data_channels)
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in fit(self, inputs, job_name, include_cls_metadata, **kwargs)
255
256 self._prepare_for_training(job_name=job_name, include_cls_metadata=include_cls_metadata)
--> 257 self.latest_tuning_job = _TuningJob.start_new(self, inputs)
258
    259         @classmethod
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/tuner.py in start_new(cls, tuner, inputs)
525 output_config=(config['output_config']),
526 resource_config=(config['resource_config']),
--> 527 stop_condition=(config['stop_condition']), tags=tuner.tags)
528
529 return cls(tuner.sagemaker_session, tuner._current_job_name)
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in tune(self, job_name, strategy, objective_type, objective_metric_name, max_jobs, max_parallel_jobs, parameter_ranges, static_hyperparameters, image, input_mode, metric_definitions, role, input_config, output_config, resource_config, stop_condition, tags)
348 LOGGER.info('Creating hyperparameter tuning job with name: {}'.format(job_name))
349 LOGGER.debug('tune request: {}'.format(json.dumps(tune_request, indent=4)))
--> 350 self.sagemaker_client.create_hyper_parameter_tuning_job(**tune_request)
351
352 def stop_tuning_job(self, name):
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
312 "%s() only accepts keyword arguments." % py_operation_name)
313 # The "self" in this scope is referring to the BaseClient.
--> 314 return self._make_api_call(operation_name, kwargs)
315
316 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
610 error_code = parsed_response.get("Error", {}).get("Code")
611 error_class = self.exceptions.from_code(error_code)
--> 612 raise error_class(parsed_response, operation_name)
613 else:
614 return parsed_response
ClientError: An error occurred (ValidationException) when calling the CreateHyperParameterTuningJob operation:
The objective metric type, [Maximize], that you specified for objective metric, [test:RMSE], isn’t valid for the [156387875391.dkr.ecr.us-west-2.amazonaws.com/forecasting-deepar:1] algorithm.
Choose a valid objective metric type.

It looks like you are trying to maximize this metric; test:RMSE can only be minimized by SageMaker Hyperparameter Tuning.
To achieve this in the SageMaker Python SDK, create your HyperparameterTuner with objective_type='Minimize'. You can see the signature of the __init__ method here.
Here is the change you should make to your call to HyperparameterTuner:
my_tuner = HyperparameterTuner(estimator=estimator,
                               objective_metric_name="test:RMSE",
                               objective_type='Minimize',
                               hyperparameter_ranges=hyperparams,
                               max_jobs=20,
                               max_parallel_jobs=2)
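
For completeness, here is a minimal, hedged sketch of what comes after the corrected constructor (it assumes the same estimator, hyperparams, and data_channels as above; wait() and best_training_job() are the SDK calls for blocking on completion and retrieving the winning training job):

# Minimal sketch, continuing from the corrected HyperparameterTuner above.
my_tuner.fit(inputs=data_channels)    # launches the tuning job asynchronously
my_tuner.wait()                       # optionally block until all 20 training jobs finish
print(my_tuner.best_training_job())   # name of the training job with the lowest test:RMSE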

Related

Imagenet ILSVRC2013 submission failed

Recently, I submitted to the classification challenge on ImageNet ILSVRC2013.
Unfortunately, I received an email like this:
Dear participant,
We received your submission at 2022-12-14 01:39:08 for the classification challenge.
However, it was found that your submission did not conform to the specifications we mentioned. We were therefore unable to evaluate your submission. Please read the ILSVRC 2013 page for more details.
ILSVRC 2013 team
The contents of my submission file are as follows, with 100,000 rows of data in total:
771 778 794 387 650
363 691 764 923 427
737 369 430 531 124
I want to know where my submission went wrong.
Any help is greatly appreciated!

How to improve SQL performance for search of top level domains

I'm trying to find a decent .com domain name, so I downloaded the complete list of .com domains from Verisign with the aim of running some SQL queries against it. One key goal is to run a query that checks a dictionary-sized list of English words to see if any don't have a .com domain. I'm not using an online service, partly because I haven't found one that gives me this sort of fine-tuned query control, but also because I'm curious how to do this myself.
My first step was to import Verisign's com.zone file (a text file) into a local Developer edition of SQL Server using the built-in Import Flat File wizard. It created a column I named RawData (datatype nvarchar(450), no nulls) in a table I named Com. It has ~352 million records. The records need some cleanup (e.g. I don't need the nameserver details, and some records don't seem to be parsed the same way as others), but the domain names themselves seem to have been imported successfully.
I also created another table (~372K records, nvarchar(450), no nulls) named Words with a column named Word that lists most English words (e.g. the, internet, was, made, for, cat, videos, etc. - no definitions, just one word per record).
An immediate hurdle I've run into, though, is performance. Even a basic query to check the availability of a single domain name is slow. When I run
SELECT *
FROM Com
WHERE RawData LIKE '%insert-some-domain-name-here%'
the execution time is approximately 4 minutes (on a laptop with an i9-9880H, 32 GB RAM, and a 2 TB NVMe SSD).
Seeing as I'd prefer not to die of old age before any theoretical dictionary-sized query finished, are there any suggestions on how to write the query and/or alter the database to reach the end goal: a reasonably fast search that generates a list of English words that don't have domain names?
I'm trying to find a decent .com domain name, so I downloaded the complete list of .com domains from Verisign with the aim of running some SQL queries against it. One key goal is to run a query that checks a dictionary-sized list of English words to see if any don't have a .com domain.
Forget about the idea. The pool of available domain names was mined to death a long time ago. You are not the first to try this. The names that are still unregistered are obscure dictionary words that are not really usable and have no commercial value.
There are interesting things you can do, but finding good names 'overlooked' by others is not among them. You will see for yourself.
Regarding the parsing of the zone file: here is what today's .com zone file looks like:
; The use of the Data contained in Verisign Inc.'s aggregated
; .com, and .net top-level domain zone files (including the checksum
; files) is subject to the restrictions described in the access Agreement
; with Verisign Inc.
$ORIGIN COM.
$TTL 900
@ IN SOA a.gtld-servers.net. nstld.verisign-grs.com. (
1587225735 ;serial
1800 ;refresh every 30 min
900 ;retry every 15 min
604800 ;expire after a week
86400 ;minimum of a day
)
$TTL 172800
NS A.GTLD-SERVERS.NET.
NS G.GTLD-SERVERS.NET.
NS H.GTLD-SERVERS.NET.
NS C.GTLD-SERVERS.NET.
NS I.GTLD-SERVERS.NET.
NS B.GTLD-SERVERS.NET.
NS D.GTLD-SERVERS.NET.
NS L.GTLD-SERVERS.NET.
NS F.GTLD-SERVERS.NET.
NS J.GTLD-SERVERS.NET.
NS K.GTLD-SERVERS.NET.
NS E.GTLD-SERVERS.NET.
NS M.GTLD-SERVERS.NET.
COM. 86400 DNSKEY 257 3 8 AQPDzldNmMvZFX4NcNJ0uEnKDg7tmv/F3MyQR0lpBmVcNcsIszxNFxsBfKNW9JYCYqpik8366LE7VbIcNRzfp2h9OO8HRl+H+E08zauK8k7evWEmu/6od+2boggPoiEfGNyvNPaSI7FOIroDsnw/taggzHRX1Z7SOiOiPWPNIwSUyWOZ79VmcQ1GLkC6NlYvG3HwYmynQv6oFwGv/KELSw7ZSdrbTQ0HXvZbqMUI7BaMskmvgm1G7oKZ1YiF7O9ioVNc0+7ASbqmZN7Z98EGU/Qh2K/BgUe8Hs0XVcdPKrtyYnoQHd2ynKPcMMlTEih2/2HDHjRPJ2aywIpKNnv4oPo/
COM. 86400 DNSKEY 256 3 8 AwEAAbbFc1fjkBCSycht7ah9eeRaltnLDK2sVyoxkjC6zBzm/5SGgfDG/H6XEupT7ctgCvnqexainTIfa8nnBYCOtAec7Gd1vb6E/3SXkgiDaMUJXmdt8E7obtVZqjFlN2QNnTljfMiECn16rZXlvXIi255T1wFkWtp5+LUCiufsLTeKc9xbQw7y0ucsR+GKz4yEStbYi98fnB5nOzzWhRUclf0=
COM. 86400 DNSKEY 256 3 8 AwEAAcpiOic4s641IPlBcMlBWA0FFomUWuKDWN5CzId/la4aA69RFpakRxPSZM8fegOQ+nYDrUY6UZkQRsowPr18b+MqyvHBUaT6CJUBkdRwlVcD/ikpcjvfGEiH5ttpDdZdS/YKZLBedh/uMCDLNS0baJ+nfkmMZGkYGgnK9K8peU9unWbwAOrJlrK60flM84EUolIIYD6s9g/FfyVB0tE86fE=
COM. 86400 NSEC3PARAM 1 0 0 -
COM. 900 RRSIG SOA 8 1 900 20200425160215 20200418145215 39844 COM. ItE0mu9Hb2meliHlot2/6f0cMvCJThPps/BxbyRkDDYesfLBVXqtIRHiDN+wlf7HS+lxFtLHIUzT0GAPf2y5cA0s3pUdBxyRft0fC76GEJq7g0Tcpifdxft4T/6XTv77rP8pFE7aSp+SMDtUMRFIGnGTGBo7WRhjIx1G0peGXMr13xRg4Pa9kigGtjSRi3SWyNT9x1IjVVVJtsFzP9sELQ==
COM. RRSIG NS 8 1 172800 20200423045013 20200416034013 39844 COM. D9BeOQ8drx8LiXXBOk6KlxKacpno/tPujwOAPd482Kj+yAQkFxVVL1bqU03WA7c12W/mkLxk665OQDfOOoirqMePDuamvQCguaSFVKVm5no42JsxoitzwOo+g0kwm9u2F/xGO9ybPfcEQ/nrH9de/RluVSVc0MPsCMja2sCuohEMSYApMjFs4XcXsED0lFTzllIESW7JvK8xb8RFId0TOw==
COM. 86400 RRSIG NSEC3PARAM 8 1 86400 20200423045013 20200416034013 39844 COM. l37dFS1JFXDg9gr1oGACX2rI/iegsIX3RlAEpshuIsT4isZ0FAw0pkAVJQvyqd5a3IOO1TjczWN1U/eYB2ynvX1MKLg3NEp01zUo67eJgowjV+g3zF3XtifhoW///Tqqz0GuAk443jol/Ue00SX3k6XgzxbycX9GKR9FmsLaIIowvz991eJyL1mgOpzQvLnIL1/EAZi9felFilkrj5JaIA==
COM. 86400 RRSIG DNSKEY 8 1 86400 20200430182421 20200415181921 30909 COM. Zkw5YJzP75sdfLN8kN/y8/ywFX+DvotF6fVdxKQGdmgJyyUnDliP6q0VXqVpHHDwtW2WfOlwskiW6007+MIGqV5VtTL6tGeZLv4hJDYZkAwIrl/xBN+aQmIvan4UdBROkOAnfi5Atf1adX5iCqn1jfIGMXb8kKVMrDuDJc/V6XbXEL8NysnqRQtdC6bVuunDQOg/Edw1Uy+B9ly9njU/EmlISkNZo2jo+cXJBFS+Is/6Xcn04+jkHiSRuAFwaGPxKPLeG92v+5ea7pWXBpSIwiqD7Gp/yJCvCUrRZP8eJcoYGav0TT7Bsp2ml15dV20FmNnBPdTqKZHtT8HAIp60Qw==
KITCHENEROKTOBERFEST NS NS1.UNIREGISTRYMARKET.LINK.
KITCHENEROKTOBERFEST NS NS2.UNIREGISTRYMARKET.LINK.
KITCHENFLOORTILE NS NS1.UNIREGISTRYMARKET.LINK.
KITCHENFLOORTILE NS NS2.UNIREGISTRYMARKET.LINK.
KITCHENTABLESET NS NS1.UNIREGISTRYMARKET.LINK.
KITCHENTABLESET NS NS2.UNIREGISTRYMARKET.LINK.
KITEPICTURES NS NS1.UNIREGISTRYMARKET.LINK.
KITEPICTURES NS NS2.UNIREGISTRYMARKET.LINK.
BOYSBOXERS NS NS1.UNIREGISTRYMARKET.LINK.
BOYSBOXERS NS NS2.UNIREGISTRYMARKET.LINK.
What you are interested in are the lines that have NS in the second position. In this example, KITCHENEROKTOBERFEST.COM has two name servers declared. Since this is the .com zone file, .com has to be assumed when the name does not end with a dot. So you have to filter on those lines and remove duplicates. You should end up with about 140 million .com domain names. Definitely not 352 million - either you have duplicates or you have imported the wrong data.
That gives you an idea of how crowded the .com zone is. Don't be surprised that any name that remotely makes sense is already registered.
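If it is useful, here is a minimal Python sketch of that filtering step, to run before the SQL import (hedged: it assumes the simple "NAME NS TARGET" record layout shown above and a local file named com.zone; a real zone file may need a more careful parser):

# Extract the delegated names from the .com zone file, keeping only NS records.
# Everything else (SOA, DNSKEY, RRSIG, NSEC3PARAM, comments, ...) is skipped.
with open("com.zone", encoding="ascii", errors="ignore") as src, \
        open("com_names.txt", "w") as dst:
    for line in src:
        parts = line.split()
        if len(parts) >= 3 and parts[1].upper() == "NS":
            name = parts[0].rstrip(".").upper()
            if name and name != "COM":
                dst.write(name + "\n")
# Duplicates (one line per name server) can then be removed during the import,
# e.g. with a DISTINCT insert into the destination table.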
When you have loaded the data to a table, the remaining problem is indexing the data to achieve good performance while making queries. I understand this issue has now been addressed so I won't be making further suggestions on that.
One thing you can do to highlight pure dictionary domains is to JOIN the two tables: your domain table and the keyword table. This will need indexes on both sides to run optimally, and I don't have details about your table structure. Alternatively, try each keyword one by one against the domain table in a loop (you can use a cursor).
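As a rough illustration of that join (hedged: it assumes you have already extracted the bare names into a cleaned column, here called DomainName, and that the collation is case-insensitive; the column and index names are made up for the example):

-- Index both sides so the anti-join does not fall back to full scans.
CREATE INDEX IX_Com_DomainName ON Com (DomainName);
CREATE INDEX IX_Words_Word ON Words (Word);

-- English words with no matching .com registration.
SELECT w.Word
FROM Words AS w
WHERE NOT EXISTS (SELECT 1
                  FROM Com AS c
                  WHERE c.DomainName = w.Word);

An equality predicate against an indexed column is also what fixes the single-domain lookup: the leading wildcard in LIKE '%...%' prevents any index from being used.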
I too feel like this is a whole new question.

Apache Flink: Write a DataStream to a Postgres table

I'm trying to code a streaming job which sinks a data stream into a Postgres table. For full context, I based my work on this article: https://tech.signavio.com/2017/postgres-flink-sink, which proposes using JDBCOutputFormat.
My code looks like the following:
...
String strQuery = "INSERT INTO public.alarm (entity, duration, first, type, windowsize) VALUES (?, ?, ?, 'dur', 6)";

JDBCOutputFormat jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
        .setDrivername("org.postgresql.Driver")
        .setDBUrl("jdbc:postgresql://localhost:5432/postgres?user=michel&password=polnareff")
        .setQuery(strQuery)
        .setSqlTypes(new int[] { Types.VARCHAR, Types.INTEGER, Types.VARCHAR }) // set the parameter types
        .finish();

DataStream<Row> rows = FilterStream
        .map((tuple) -> {
            Row row = new Row(3);                 // our prepared statement has 3 parameters
            row.setField(0, tuple.f0);            // first parameter: entity
            row.setField(1, tuple.f1);            // second parameter: duration
            row.setField(2, f.format(tuple.f2));  // third parameter: first
            return row;
        });

rows.writeUsingOutputFormat(jdbcOutput);

env.execute();
    }
}
My problem now is that values are inserted only when my job is stopped (to be precise, when I cancel my job from the Apache Flink dashboard).
So my question is the following: did I miss something? Should I commit the inserted rows somewhere?
Best regards,
Ignatius
As Chesnay said in his comment, you have to adapt the batch interval.
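For example, a minimal sketch of that adjustment (hedged: it assumes the flink-jdbc JDBCOutputFormat builder used above, which buffers rows and only writes them when the batch interval is reached or the format is closed):

JDBCOutputFormat jdbcOutput = JDBCOutputFormat.buildJDBCOutputFormat()
        .setDrivername("org.postgresql.Driver")
        .setDBUrl("jdbc:postgresql://localhost:5432/postgres?user=michel&password=polnareff")
        .setQuery(strQuery)
        .setSqlTypes(new int[] { Types.VARCHAR, Types.INTEGER, Types.VARCHAR })
        .setBatchInterval(1)   // flush every record instead of waiting for close()
        .finish();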
However, this is not the full story. If you want to achieve at-least-once results, you have to sync the batch writes with Flink's checkpoints. Basically, you have to wrap the JDBCOutputFormat in a SinkFunction that also implements the CheckpointedFunction interface. When snapshotState() is called, you have to write the batch to the database. You can have a look at this pull request that will provide this functionality in the next release.
Fabian's answer is one way to achieve at-least-once semantics: by syncing the writes with Flink's checkpoints. However, this has the disadvantage that your sink's data freshness is now tied to your checkpointing interval.
As an alternative, you could store your tuples or rows that have (entity, duration, first) fields in Flink's own managed state, so that Flink takes care of checkpointing it (in other words, make your sink's state fault-tolerant). To do that, you implement the CheckpointedFunction and CheckpointedRestoring interfaces (without having to sync your writes with checkpoints; you can even execute your SQL inserts individually if you do not have to use JDBCOutputFormat). See: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html#using-managed-operator-state. Another solution is to implement only the ListCheckpointed interface (it can be used in a similar way to the deprecated CheckpointedRestoring interface, and it supports list-style state redistribution).

Solr returning wrong documents while searching field containing dots in solr.StrField

Field Type:
<fieldType name="StrCollectionField" class="solr.StrField" omitNorms="true" multiValued="true" docValues="true"/>
<field name="po_line_status_code" type="StrCollectionField" indexed="true" stored="true" required="false" docValues="false"/>
po_no is PK
Index value: po_line_status_code:[3700.100]
Search Query: po_line_status_code:(1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450)
Result:
Getting Results with po_line_status_code: [3700.100] as well.
Does Solr internally tokenize solr.StrField containing dots or is some regular expression matching going on here? Sounds like a bug to me.
We don't get this document when we change the query to one of the following:
1> po_line_status_code:(1200.200 1200.500 1200.600 1200.400 1200.300 1200.750 1200.450)
2> po_line_status_code:(1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450) AND po_no:938792842
We are using DSE version 4.7.4, which includes Apache Solr 4.10.3.0.203.
Debug query output from one of the servers that is returning wrong documents:
response={numFound=2,start=0,docs=[SolrDocument{po_no=4575419580094, po_line_status_code=[3700.4031]}, SolrDocument{po_no=1575479951283, po_line_status_code=[3700.100]}]},debug={rawquerystring=po_line_status_code:(3 1100.200 29 5 6 1100.300 63 199 1100.500 200 1100.600 198 1100.400 343 344 345 346 347 409 410 428 1100.750 1100.450) ,querystring=po_line_status_code:(3 1100.200 29 5 6 1100.300 63 199 1100.500 200 1100.600 198 1100.400 343 344 345 346 347 409 410 428 1100.750 1100.450)]
I also see the following in the response, which I believe has something to do with ranking:
No match on required clause (po_line_status_code:3 po_line_status_code:1100.200 po_line_status_code:29 po_line_status_code:5 po_line_status_code:6 po_line_status_code:1100.300 po_line_status_code:63 po_line_status_code:199 po_line_status_code:1100.500 po_line_status_code:200 po_line_status_code:1100.600 po_line_status_code:198 po_line_status_code:1100.400 po_line_status_code:343 po_line_status_code:344 po_line_status_code:345 po_line_status_code:346 po_line_status_code:347 po_line_status_code:409 po_line_status_code:410 po_line_status_code:428 po_line_status_code:1100.750 po_line_status_code:1100.450)\n 0.0 = (NON-MATCH) product of:\n 0.0 = (NON-MATCH) sum of:\n 0.0 = coord(0/23)\n 0.015334824
Also, could it have something to do with re-indexing? If I re-index my documents, will that fix the issue?
The links to the doc file containing the Solr schema and Solr config can be found here.
I've had to put this in an answer as the comments won't allow formatting.
No, it's not a version problem, a tokenizer problem, or a bug in Solr.
solr.StrField is not tokenized at either analysis or query time. It is matching on something else. Can you post solrconfig.xml and schema.xml?
If you are searching on po_line_status_code this is the debug you should see:
"querystring": " po_line_status_code:(1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450)",
"parsedquery": "(+(po_line_status_code:1100.200 po_line_status_code:1100.500 po_line_status_code:1100.600 po_line_status_code:1100.400 po_line_status_code:1100.300 po_line_status_code:1100.750 po_line_status_code:1100.450))/",
Whereas what you are seeing is
querystring=ship_node:610055 AND po_line_status_code:(3 1100.200 29 5 6 1100.300 63 199 1100.500 200 1100.600 198 1100.400 343 344 345 346 347 409 410 428 1100.750 1100.450) AND expected_ship_date:[2016-02-03T16:00:00.000Z TO 2016-06-09T13:59:59.059Z]
So your query string has been altered. I assume all your queries go through the Solr admin tool? That should leave DSE out of the loop.
I still wouldn't expect your query to match, but things are more complicated than you have presented them, as you have ship_node and expected_ship_date in your query too.
Also, the 'No match on required clause' output says that you didn't match anything with the po_line_status_code part of the query.
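For reference, that debug output comes from adding debugQuery=true to the request, e.g. against a hypothetical host and core (URL-encode the spaces):
http://localhost:8983/solr/your_core/select?q=po_line_status_code:(1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450)&debugQuery=true
Comparing rawquerystring, querystring, and parsedquery in that output will show exactly where the extra clauses are being added.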

Realtime API - "Maximum Stack Size Exceeded"

About 7-8 hours ago we started seeing instances of the "Maximum Stack Size Exceeded" error on our live site. Unfortunately, I don't have much information on the error, as the relevant bits of the stack have been blown away in the report and I am not able to reproduce it myself. The information I have is the repeated loop and the document ID:
Document ID: 0B0m--69Vv_F6c1gwdUU3OG5MNDQ
Stack
...
File https://drive.google.com/otservice/api line 78 col 51 in dl
File https://drive.google.com/otservice/api line 250 col 415 in c.M
File https://drive.google.com/otservice/api line 78 col 51 in dl
File https://drive.google.com/otservice/api line 250 col 415 in c.M
...
I will update with more information if it becomes available. Is this happening for others?
