Issues in creating jobs of size XL in Watson Machine Learning (WML) - ibm-watson

I have an issue when trying to create jobs for Decision Optimization using size XL in Watson Machine Learning (WML). I have no issues whatsoever creating the first job of the day, but the second job fails.
If I change to a smaller instance size (S or M), there is no problem starting a new job, but with size XL I cannot start a second job. I cannot figure out why. Any ideas?
I get the following fault codes:
Code:
error_in_instance_creation
Message:
Instance creation of t-shirt-size XL and type do12.10 failed.

We encountered the same issue; for us it had to do with the fact that we did not properly remove our old models / deployments / jobs. After cleaning up all left-over resources accordingly, we were able to start up an instance again.
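In case it helps anyone hitting the same error: the cleanup amounted to listing and deleting old deployments (and similarly models and jobs). Below is a minimal sketch in plain Java against what we believe is the WML v4 REST API; the endpoint paths, query parameters, and environment variable names are assumptions, so verify them against your instance's API documentation.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // Hypothetical cleanup sketch: list and delete left-over WML deployments
    // over REST. The /ml/v4 paths, the space_id and version query parameters,
    // and the environment variables are assumptions -- check them against
    // your instance's API documentation.
    public class WmlCleanup {
        public static void main(String[] args) throws Exception {
            String base = "https://us-south.ml.cloud.ibm.com"; // your WML endpoint
            String spaceId = System.getenv("WML_SPACE_ID");
            String token = System.getenv("IAM_TOKEN");         // IAM bearer token
            HttpClient http = HttpClient.newHttpClient();

            // 1. List deployments in the space and inspect the JSON for ids.
            HttpRequest list = HttpRequest.newBuilder()
                    .uri(URI.create(base + "/ml/v4/deployments?space_id=" + spaceId
                            + "&version=2020-09-01"))
                    .header("Authorization", "Bearer " + token)
                    .GET().build();
            System.out.println(http.send(list, HttpResponse.BodyHandlers.ofString()).body());

            // 2. Delete a finished deployment by id to free its capacity.
            String deploymentId = "<id-from-the-listing>"; // placeholder
            HttpRequest del = HttpRequest.newBuilder()
                    .uri(URI.create(base + "/ml/v4/deployments/" + deploymentId
                            + "?space_id=" + spaceId + "&version=2020-09-01"))
                    .header("Authorization", "Bearer " + token)
                    .DELETE().build();
            System.out.println(http.send(del, HttpResponse.BodyHandlers.ofString()).statusCode());
        }
    }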

Thanks everyone for your support. We have now found out that it seems IBM has updated the service.
Without making any changes, it is now possible to deploy multiple models in a day and execute the one you are interested in.
So I would say it resolved itself.

Related

Project runtime does not start on Sagemaker Studio Lab

This has been the case since last night. It does not work for either the CPU or the GPU "compute type".
Basically, after pressing the "Start runtime" button, it says "Preparing project runtime..." for about ten minutes and then stops, showing the following error: "There was a problem when starting the project runtime. This should be resolved shortly. Please try again later."
I have now tried about five times over last night and this morning.
There is no way to even access the work that is saved there. The "project" will not boot up.
Basically it is a dud at this point.
Is anyone else experiencing similar issues? What does one do?
The issues/questions, and the answers I have learned since (added because I was asked to clarify):
(1) The environment on Sagemaker Studio Lab is supposed to be persistent, i.e., any time one starts it, the environment, uploaded files, etc. are where they were left. However, I was no longer able to start the environment; before it locked up, it would start fine. Consequently, I was not able to get to my saved work in any way, shape, or form. I was wondering if anyone else has had this issue.
Answer: Thus far, not too many people are approved for Sagemaker Studio Lab, so I may have been the first or one of the first to encounter this issue. As of this writing, there is no way to access one's data if one cannot spin up a virtual machine with access to it using the "Start runtime" button.
(2) It is not clear where one is supposed to report issues with Sagemaker Studio Lab. One's home page in Studio Lab has a link to StackOverflow under "Get answers and help others"; that is how I ended up here. Though I should have included the following tag (#amazon-sagemaker, pointing to https://stackoverflow.com/questions/tagged/amazon-sagemaker).
Answer: I eventually found where people are submitting bug reports. And I reported the issue there (see issue 56). https://github.com/aws/studio-lab-examples/issues
(3) It was not clear to me whether, if I deleted my account and requested a new one, I would be put on the waitlist (which is supposedly long at present). I.e., this would be the manual factory-reset option, where one still loses all of one's work but at least has an opportunity to start again with the environment.
Answer: Once one is approved, one does not go back to the waitlist. Deleting my account, requesting a new one, and setting it up to the initial state took a couple of minutes for me. And yes, I lost all my work that was on there. So back stuff up as if it were your computer in the '80s, i.e., back up externally to that environment.
I signed up for ASMSL about 2 weeks ago. As of this evening (Feb 15) I'm able to log into my runtime without any trouble at all.

Cannot run the "Real Time Reporting with the Table API" example in TryFlink

I have just started learning Flink and tried the example "Real Time Reporting with the Table API".
When I ran docker-compose, all containers started except the jobmanager, which exited with code 2.
I tried rebuilding and restarting, but it does not work and I do not know what is wrong.
Could anyone help me figure it out, please? Many thanks!
This particular tutorial does fail if you skip ahead and try to run it without first providing an implementation for org.apache.flink.playgrounds.spendreport.SpendReport.report. Several versions of that method are provided in the tutorial: pick one (perhaps the last one), drop it in, rebuild the docker image, and try again.
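For reference, the last version of the method given in the tutorial looks roughly like the following (verify the exact code and imports against the tutorial text):

    import static org.apache.flink.table.api.Expressions.$;
    import static org.apache.flink.table.api.Expressions.lit;

    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.Tumble;

    public class SpendReport {
        // Buckets each account's transactions into one-hour tumbling windows
        // and sums the amounts per account per window.
        public static Table report(Table transactions) {
            return transactions
                    .window(Tumble.over(lit(1).hour()).on($("transaction_time")).as("log_ts"))
                    .groupBy($("account_id"), $("log_ts"))
                    .select(
                            $("account_id"),
                            $("log_ts").start().as("log_ts"),
                            $("amount").sum().as("amount"));
        }
    }

Once it compiles, rebuild the image and restart the playground, e.g. docker-compose build followed by docker-compose up -d.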

App Engine backup never finishes; the only clue is a failure in the MapReduce worker_callback

Over the last few weeks we have repeatedly failed to complete a backup of the datastore using the Datastore Admin tool. We thought the issue had to do with quota errors we were running into, so we switched our application from a free to a paid app, but we still have problems.
Each time, we attempt to back up to the Blobstore, and the process never finishes. We see the backup in our Pending Backups list, but it never actually completes. We only have a total of 43MB of data right now, so we don't see it as a data-transfer problem. Looking at our default task queues, it shows that we have two pending tasks: one is a call to /_ah/mapreduce/controller_callback and the other is a call to /_ah/mapreduce/worker_callback.
The worker_callback racks up its retry count, and the only clue we have is that the Previous Run tab shows the last HTTP response code to be 500. There is no error message and nothing shows up in our error logs; it just keeps retrying over and over again.
We've been able to narrow the backup problems down to a specific entity kind in a particular namespace, but we can't figure out why that entity kind is failing while the others are not. The major difference is that the entity kind has a large number of embedded entities, but if App Engine is able to read/put those entities, we can't understand why it has problems backing them up. The namespace in which the error occurs holds the largest amount of data for that entity kind compared to the other namespaces we have set up.
We think that if we could see what error is occurring in the worker_callback, we might be able to figure out why the backup is failing, or what is wrong with our data that's preventing the backup. Is there something we need to set up or enable through settings or configuration files to get more detailed information on the backup? Or is there some other avenue we should explore to investigate and fix this problem?
I should mention we are using the Java SDK as well as Objectify v3 to work with the datastore, and we are backing up to the Blobstore.
Thank you.
Well, with the App Engine team's help we figured out what the problem was and worked around the issue. I want to give details in case anyone else runs into this problem.
In issue 8363, the App Engine team indicated that their logs showed the MapReduce failed because of the large number of properties our entity kind had. The specific entity kind causing the failure had a large number of variable properties, which generated errors when MapReduce tried to write out a schema. The solution on their end was to have the backup ignore entities like this so that the backup could complete successfully.
What we did to work around the issue and make the backup work was to change how we told Objectify to store our data. The large number of properties was being created by our use of the @Embedded annotation on a HashMap member field. Since @Embedded breaks classes down into individual components, it was generating a large number of properties. We switched the member field to @Serialized and then ran a conversion process to make it use the new serialized property. This made backup/restore work again.
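To illustrate the change (the entity and field names below are invented, not our actual code):

    import java.util.HashMap;
    import com.googlecode.objectify.annotation.Serialized;

    public class Order {
        // Before: the JPA-style @Embedded annotation (javax.persistence.Embedded,
        // which Objectify v3 reuses) flattened every map entry into its own
        // datastore property, blowing up the property count and breaking the
        // backup's schema generation:
        //
        //     @Embedded
        //     private HashMap<String, String> attributes = new HashMap<String, String>();

        // After: @Serialized stores the whole map as a single opaque blob
        // property, so the backup only ever sees one property.
        @Serialized
        private HashMap<String, String> attributes = new HashMap<String, String>();
    }

The trade-off is that serialized fields are not indexed, so you can no longer filter queries on the map's contents; for our use case that was acceptable.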
You can read more about the differences between @Embedded and @Serialized on Objectify's website.
snielson, would you mind opening an issue on our public issue tracker here? Remember to add your Application ID so we can further debug this specific scenario.
Thanks!

Fogbugz database schema management

This is a very simple question, and maybe the man himself can provide insight on this :)
Does anyone know the pseudocode behind how Fog Creek does database schema management?
I'm running into an issue and trying to figure out if I'm handling it right... I have a module that runs each time someone spins up their site and examines their database to make sure they have the right changes in place. If changes are missing, the script makes the required changes.
My issue is that I was trying to tie it to the Session_Start portion of Global.asax, but that seems to be rather flaky at times, and I'm trying to come up with a better approach.
For reference, I'm running a single web application that can respond to any number of hosts; each host is mapped via a metabase to find out which database it belongs to, and the application then makes the necessary connections.
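To make the pattern concrete, here is a minimal sketch of the kind of check the module performs (written in Java purely for illustration; the schema_version table and the upgrade statements are invented):

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Reads the current version from a one-row schema_version table (assumed
    // to exist) and applies any missing upgrade steps in order.
    public class SchemaUpgrader {
        private static final String[] UPGRADES = {
            // index 0 upgrades version 0 -> 1, index 1 upgrades 1 -> 2, ...
            "ALTER TABLE customer ADD region VARCHAR(50)",
            "CREATE INDEX ix_invoice_date ON invoice (invoice_date)"
        };

        public static void upgrade(Connection conn) throws SQLException {
            try (Statement st = conn.createStatement()) {
                int version = 0;
                try (ResultSet rs = st.executeQuery("SELECT version FROM schema_version")) {
                    if (rs.next()) version = rs.getInt(1);
                }
                for (int v = version; v < UPGRADES.length; v++) {
                    st.executeUpdate(UPGRADES[v]);
                    st.executeUpdate("UPDATE schema_version SET version = " + (v + 1));
                }
            }
        }
    }

Running something like this once per database at application start, rather than on every Session_Start, is the behavior I'm after.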
You might have more luck asking this on http://fogbugz.stackexchange.com/

How to keep Stored Procedures and other scripts in SVN/Other repository?

Can anyone provide some real examples of how best to keep script files for views, stored procedures, and functions in an SVN (or other) repository?
Obviously, one solution is to keep the script files for all the different components in one or more directories somewhere and simply use TortoiseSVN or the like to keep them in SVN; then, whenever a change is to be made, I load the script in Management Studio, etc. I don't really want this.
What I'd really prefer is some kind of batch script that I could run periodically (nightly?) to export all the stored procedures, views, etc. that had changed in a given timeframe and then commit them to SVN.
Ideas?
It sounds to me like you don't want to use revision control properly.
"Obviously one solution is to have the script files for all the different components in a directory or more somewhere and simply using TortoiseSVN or the like to keep them in SVN"
This is what should be done. You would have your local copy to work on (developing new code, tweaking old, etc.), and as individual components/procedures get finished, you would commit them individually until you have to start the process over.
Committing half-done code just because it's been 'X' time since it was last committed is sloppy and guaranteed to cause anyone else using the repository grief.
I find it best to treat stored procedures just like any other compilable code: the code lives in the repository, and you check it out to make changes and load it into your development tool to compile or deploy it.
You can create a batch file and schedule it:
delete the contents of your scripts directory
use something like ExportSQLScript to export all objects to script(s)
svn commit
Please note that although you'll have the objects under source control, you'll not have the data or its progression (is that a renamed field, or one new field and one deleted?).
This approach is fine for maintaining change history, but of course you should never automatically commit to the "production build" (unless you like broken builds).
Although you didn't ask for it: this approach also won't produce a set of scripts that will upgrade a current DB. You'll only have initial creation scripts. Recording data progression and creating upgrade scripts is beyond basic source control systems.
I'd recommend Redgate SQL Compare for this - it allows you to compare database versions and generate change scripts - it's also fairly easily scriptable.
Based on your expanded question, you really want to use DDL triggers. Check out this article that details how to create a changelog system for your database.
I'm not sure of your price range; however, DB Ghost could be an option for you.
I don't work for this company (or own the product), but in my research into the same issue, this product looked quite promising.
I should've been a little more descriptive. The database in question is for an internal ERP system, and thus we don't have many versions of our database, just Production/Testing/Development. When we've done a change request, some fancy new feature or something, we simply execute a script or series of scripts to update the procedures in question on the Testing database; if that all checks out, we do the same to Production.
So I'm not really after a full schema script per se, just something that can keep track of the various edits to the stored procedures over time. For example, PROCESS_INVOICE does stuff. It gets updated in some minor way in March. Some time later, in say May, it is discovered that in a rare case customers get double-invoiced (or some other crazy corner case). I'd like to be able to see what has happened to this procedure over time. Currently, the way the development environment is set up here, I don't have that, which I'm trying to change.
I can recommend DBPro, which is part of Visual Studio Team Edition. I have been using it for a few months for storing all parts of the database in Team Foundation Server, as well as for deployment, database compares, etc.
Of course, as someone else mentioned, it does depend on your environment and price range.
I wrote a utility for dumping all of the relevant parts of my db into a directory structure that I use SVN on. I never got around to trying to incorporate it into the Manager but, if you're interested, it's here: http://www.reluctantdba.com/dbas-and-programmers/sqltools/svnforsql2005.aspx
It's free and, since I regularly run it, you know any bugs get fixed quickly.
You can always try integrating SourceSafe with SQL Server. Here's a quick start: link. To work with it you've got to have Management Studio Developer Edition.
