We are trying to convert from Watson NLC to NLU using the same NLC training CSV data.
There are 3 classification models to convert.
The current status of all three models is "error".
Two of the models give the message:
"internal training error occurred, please try again"
One model gives the message:
"Training data validation failed: Too few examples for label XXXX. Minimum of 5 required"
We have two questions:
Does the NLC CSV data have to be processed to be usable?
What is the best way to deal with these error messages?
Thank you.
Are you using the API to specify the training data? If so, the API documentation specifies that it should be provided as JSON - https://cloud.ibm.com/apidocs/natural-language-understanding#createclassificationsmodel
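One way to handle both errors is to reshape the NLC CSV into the JSON array that endpoint expects. A minimal sketch, assuming the usual NLC layout (text in the first column, one or more labels in the remaining columns; the file names are placeholders), which also drops labels with fewer than 5 examples since NLU rejects those:

```python
import csv
import json
from collections import Counter

# Read NLC-style rows: text first, then one or more class labels.
examples = []
with open("nlc_training.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f):
        if len(row) < 2 or not row[0].strip():
            continue  # skip malformed or empty rows
        labels = [c.strip() for c in row[1:] if c.strip()]
        examples.append({"text": row[0], "labels": labels})

# NLU rejects labels with fewer than 5 examples, so drop the rare ones
# (or go back and add more data for them) before uploading.
counts = Counter(label for ex in examples for label in ex["labels"])
filtered = [
    {"text": ex["text"], "labels": [l for l in ex["labels"] if counts[l] >= 5]}
    for ex in examples
]
filtered = [ex for ex in filtered if ex["labels"]]  # drop now-unlabeled rows

with open("nlu_training.json", "w", encoding="utf-8") as f:
    json.dump(filtered, f, ensure_ascii=False, indent=2)
```

The resulting nlu_training.json can then be supplied as the training data for createClassificationsModel.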
I have modified this sample to read PDFs in tabular format. I would like to keep the tabular structure of the original PDF during the human review process. I notice the custom worker task template uses the crowd-entity-annotation element, which seems to read only text. I am aware that the human review process reads from an S3 key that contains raw text written by the Textract process.
I have been considering writing to S3 using tabulate, but I don't think that is the best solution. I would like to keep the structure and still have the ability to annotate custom entities.
Comprehend now natively supports detecting custom-defined entities in PDF documents. To do so, you can try the following steps:
Follow this GitHub readme to start the annotation process for PDF documents.
Once the annotations are produced, you can use the Comprehend CreateEntityRecognizer API to train a custom entity model for semi-structured documents.
Once the entity recognizer is trained, you can use the StartEntitiesDetectionJob API to run inference on your PDF documents; a boto3 sketch of that call follows.
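A minimal sketch of that last step; all ARNs, bucket names, and the region are placeholders you would replace with your own values:

```python
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

# Kick off an async custom-entity detection job over the PDFs in S3.
response = comprehend.start_entities_detection_job(
    JobName="pdf-custom-entities",
    EntityRecognizerArn="arn:aws:comprehend:us-east-1:123456789012:entity-recognizer/my-recognizer",
    LanguageCode="en",
    DataAccessRoleArn="arn:aws:iam::123456789012:role/ComprehendDataAccessRole",
    InputDataConfig={
        "S3Uri": "s3://my-bucket/input-pdfs/",
        "InputFormat": "ONE_DOC_PER_FILE",  # one entity list per PDF
    },
    OutputDataConfig={"S3Uri": "s3://my-bucket/comprehend-output/"},
)
print(response["JobId"])  # poll with describe_entities_detection_job
```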
I am trying to train two models on Watson VR. One is for object (detail) recognition within a picture. The other is to estimate the class of the object.
I have been able to prepare the classes of objects for both models.
However, it seems I have multiple issues with training, and I am now stuck. I found a similar post on Stack Overflow, but it relates to data size and type; my data are all in .jpg format and the whole dataset is below 250 MB.
Classifier:
The classifier is the one that gives me the most issues.
First, I tried to train the model, but then the server went down. The day after, I found the model "trained" but with errors, so I basically restarted by preparing the classes again.
All classes have at least 10-12 pictures (10 is the minimum required). When I click on "Train Model" I receive the following error:
In the dashboard I am given an explanation of the failed training:
The data size was originally about 241/250 MB; now it is 18.4/250 MB. I am not sure what caused the change.
Thank you for the help!
Thanks for providing the screenshots, that is very helpful!
It says your "DrinksClassifier" is in a failed state. It's best to delete that collection from Studio, and start over. Make sure you have at least 10 examples of each class... the lower screenshot seems to show it didn't find any examples for "AgedCoffee".
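If you prefer to inspect and clean this up through the API rather than the Studio UI, here is a sketch using the Visual Recognition v3 Python SDK; it assumes your classifier is visible to the v3 API, and the API key and service URL are placeholders:

```python
from ibm_watson import VisualRecognitionV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

service = VisualRecognitionV3(
    version="2018-03-19",
    authenticator=IAMAuthenticator("YOUR_APIKEY"),
)
service.set_service_url("https://api.us-south.visual-recognition.watson.cloud.ibm.com")

# List classifiers with their status and failure explanation, then
# delete any stuck in a failed state before retraining.
for c in service.list_classifiers(verbose=True).get_result()["classifiers"]:
    print(c["classifier_id"], c["status"], c.get("explanation", ""))
    if c["status"] == "failed":
        service.delete_classifier(classifier_id=c["classifier_id"])
```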
I am trying to build a training set for SageMaker using the Linear Learner algorithm. This algorithm supports recordIO-wrapped protobuf and CSV as formats for the training data. As the training data is generated using Spark, I am having issues generating a CSV file from a DataFrame (this seems broken for now), so I am trying to use protobuf.
I managed to create a binary file for the training dataset using Protostuff, which is a library that generates protobuf messages from POJOs. The problem is that when triggering the training job I receive this message from SageMaker:
ClientError: No training data processed. Either the training channel is empty or the mini-batch size is too high. Verify that training data contains non-empty files and the mini-batch size is less than the number of records per training host.
The training file is certainly not empty. I suspect the way I generate the training data is incorrect, as I am able to train models using the libsvm format. Is there a way to generate recordIO using the SageMaker Java client?
Answering my own question: it was an issue in the algorithm configuration. I reduced the mini-batch size and it worked fine.
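For reference, the SageMaker Python SDK ships a helper for writing recordIO-protobuf (I am not aware of an equivalent in the Java client), and mini_batch_size is just a Linear Learner hyperparameter. A rough sketch, with the bucket, role, and region as placeholders and random stand-in data:

```python
import io
import boto3
import numpy as np
import sagemaker
from sagemaker.amazon.common import write_numpy_to_dense_tensor
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

# Write features and labels as recordIO-protobuf and upload to S3.
features = np.random.rand(1000, 10).astype("float32")  # stand-in data
labels = np.random.randint(0, 2, 1000).astype("float32")
buf = io.BytesIO()
write_numpy_to_dense_tensor(buf, features, labels)
buf.seek(0)
boto3.resource("s3").Object("my-bucket", "train/data.pbr").upload_fileobj(buf)

# Keep mini_batch_size below the number of records per training host;
# otherwise SageMaker reports "No training data processed".
estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("linear-learner", "us-east-1"),
    role="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    hyperparameters={
        "predictor_type": "binary_classifier",
        "feature_dim": 10,
        "mini_batch_size": 100,
    },
)
estimator.fit({"train": TrainingInput(
    "s3://my-bucket/train/data.pbr",
    content_type="application/x-recordio-protobuf",
)})
```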
I'm using the IBM Watson Analytics trial; it says it only takes data as CSV, Excel, and a few other formats. How can I convert books or bodies of text into an acceptable format? Thank you.
It seems like the architecture of WCA (Watson Content Analytics) does not support PDF itself. Please refer to the following images from IBM Link.
I think it would be better to convert the PDF to text with a converter such as CONVERTER and push it into a database or other store.
Then, you can crawl the text data from it.
FYI, the document has to have a KEY column (i.e., the name of the book).
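To illustrate that flow, here is a minimal sketch that extracts the text and writes a CSV with a header row and a KEY per page; pypdf is my choice of converter here (any PDF-to-text tool works), and the file names are placeholders:

```python
import csv
from pypdf import PdfReader

# One row per page, keyed by "<book name>-<page number>".
reader = PdfReader("my_book.pdf")
with open("my_book.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["KEY", "TEXT"])  # header row for every column
    for i, page in enumerate(reader.pages, start=1):
        text = (page.extract_text() or "").replace("\n", " ").strip()
        if text:
            writer.writerow([f"my_book-{i}", text])
```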
Even if you do convert your book into an acceptable text format (.csv, .xls, .xlsx, .sav), Watson Analytics isn't optimized for text analytics. It sounds like Watson Explorer is the offering that'd best suit your needs.
Hope this helps.
Even though CSV or XLS is an acceptable file format, datasets need to be in a specific structure: you need headers for all the tables, with the data following them. I am not sure how the data of a book can fit into that format.
I have recently published this blog post on how to structure and refine data before importing it into Watson Analytics to get the best results.
For your specific requirement, you can look into Watson Explorer as suggested by Brennan above, or, even better, you can learn to use IBM Content Analytics here.
Fellow developers,
I'm creating a report to be seen directly by clients, so I need to ensure maximum user-friendliness. When the user types an invalid date string, the report throws an rsReportParameterTypeMismatch and displays an error message like this:
An error occurred during local report processing
Query Execution Failed for data set 'myDataSet'
I want to replace this message, which is cryptic to the end user, with a friendlier custom one. How or where can I write it? How do I intercept the exception? I don't need anything too complex; just changing the words, and perhaps colors and sizes, would suffice.
I thank thee in advance for thy willingness to help.
Take a look at Custom error pages in Reporting Services 2008.