Can we train a single model for multiple types of forms? - azure-form-recognizer

I am using the Microsoft Form Recognizer cognitive service to train a model for forms. My question is: can we train a single model for multiple types of forms, and is that recommended? I have many different types of forms, and I chose to train a single model on four of them. It trained successfully, but is this a recommended approach?

This answer was given to the question by a Microsoft Cognitive Services admin:
"Form Recognizer supports training a single model for different types of forms. The model quality should be the same for a single-type model and a multi-type model per type."

Related

How to use ML models in Vespa.ai?

We are trying to use ML models in Vespa; we have textual data stored in Vespa. Can somebody help us with the questions below?
- An example of an ONNX model trained using scikit-learn and used in Vespa.
- Where to add preprocessing steps before model training and prediction when using an ONNX model in Vespa, with an example.
This is a very broad question and the answer very much depends on what your goals are. In general, the documentation for using an ONNX model in Vespa can be found here:
https://docs.vespa.ai/documentation/onnx.html
An example that uses an ONNX BERT model for ranking can be found in the Transformers sample application:
https://github.com/vespa-engine/sample-apps/tree/master/transformers
Note that both these links assume that you already have a model. In general, Vespa is a serving platform and is not usually used in the model training process. As such, Vespa doesn't really care where your model comes from, be that scikit-learn, PyTorch, or any other system; ONNX is a general format for exchanging ML models between systems.
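For the scikit-learn part of the question, the usual export route is the skl2onnx package. A minimal sketch, with an illustrative model and feature count (the file name is arbitrary):

```python
# Export a scikit-learn model to ONNX with skl2onnx; Vespa then evaluates
# the resulting .onnx file at serving time. Model and data are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from skl2onnx import to_onnx

X = np.random.rand(100, 4).astype(np.float32)
y = (X.sum(axis=1) > 2.0).astype(np.int64)

model = LogisticRegression().fit(X, y)

# to_onnx infers the input signature from a sample batch.
onx = to_onnx(model, X[:1])
with open("my_model.onnx", "wb") as f:
    f.write(onx.SerializeToString())
```

The resulting .onnx file is what you reference from your application package; the ONNX documentation linked above covers how to wire it into a rank profile.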
However, there are some foundational ideas I should get across that may clarify things a bit. Vespa currently considers all ML models to have numeric inputs and outputs (in the form of tensors). This means you can't feed text directly into your model and get text out the other side. Most textual data these days is encoded into some numeric representation such as embedding vectors; or, as the BERT example above shows, text is tokenized so that each token gets its own vector representation. After model computation, embedding vectors or token-set representations can be decoded back to text.
Vespa currently handles the computational part; the (pre-)processing of encoding/decoding text to embeddings or other representations is currently up to the user. Vespa does offer a rich set of features to help here, in the form of document and query processors. You can create a document processor that encodes the text of each incoming document into some representation before storing it. Likewise, a searcher (query processor) can encode incoming textual queries into a compatible representation before documents are scored against it.
So, in general, you would train your models outside of Vespa using whatever embedding or tokenization strategies are necessary for your model. When deploying the Vespa application you add the models with any required custom processing code, which is used when feeding or querying Vespa.
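To make the feeding side concrete, here is a minimal sketch of the "encode outside Vespa, feed vectors in" pattern. The embedding model, schema name, and field names are all assumptions, not part of the original answer:

```python
# Encode document text to an embedding before feeding it to Vespa.
# Model choice, schema, and field names are hypothetical.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings

def to_feed_document(doc_id: str, text: str) -> dict:
    embedding = encoder.encode(text).tolist()
    return {
        "put": f"id:mynamespace:mydoc::{doc_id}",  # hypothetical doc type
        "fields": {
            "text": text,
            "embedding": {"values": embedding},    # e.g. tensor<float>(x[384])
        },
    }

doc = to_feed_document("1", "Vespa is a serving platform.")
# Feed `doc` via the Vespa HTTP feed API (or a client library); a searcher
# on the query side would apply the same encoder to incoming query text.
```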
If you have a more concrete example of what you are trying to achieve I could be more specific.

How can I train the Watson AlchemyAPI?

I am trying to extract entities from text input. Is there any option to train Alchemy so that I can adjust the entities according to my needs?
You can't train the entity extraction in AlchemyLanguage, but there are other Watson APIs you can use to extract entities or concepts from text.
Relationship Extraction: Performs linguistic analysis of the input text. It then finds spans of text and clusters them together to form entities, before finally extracting the relationships between them.
Concept Insights: Helps you annotate concepts and identify conceptual associations from text.
You can quickly test this API with its Swagger explorer: https://watson-api-explorer.mybluemix.net/apis/concept-insights-v2

When will we be able to use a custom corpus?

When will IBM make its Watson Q&A API capable of accepting a custom corpus?
Is there a roadmap I can see?
The Question and Answer service currently doesn't provide a way to use your own data, but you can get similar or better results by combining Document Conversion and Retrieve and Rank.
You use Document Conversion to convert your corpus documents (PDF, DOCX, HTML) into answer units that will be indexed by the Retrieve and Rank service.
The Retrieve and Rank service is built on top of Apache Solr; once you load your data into the Solr index, you can create and train a ranker (a machine learning model that knows how to sort results).
To expand on German's answer, also take a look at the Watson Natural Language Classifier (NLC) and Dialog services, which are additional building blocks for creating a custom Question and Answer application. NLC classifies text and allows you to trigger an action, and Dialog allows you to create and manage virtual conversations with your users.
Here is a great blog with an introduction to both NLC and Dialog. And another good blog that introduces the Watson Document Conversion and Retrieve and Rank services.

How to tell if specifications are modelled using a database-oriented approach or a class-design (object-oriented) approach

Given a problem specification, how can you tell whether it is a database design problem or a class design (object-oriented design) problem?
What comes to mind is that in OOP, classes (objects) contain methods, whereas a database is just a collection of relationships and values.
Therefore:
If you can say a problem is about how "things" in the specification relate to each other you have a database design problem.
If it is about what the "things" in the specification can do, you're going to be modeling more along object oriented programming.
If you're using a database and creating domain objects, it's both. Database design and class design are two different things, and both are necessary if you're using a database and classes. It's not like you choose one or the other.
This is where an ORM comes into play. When your data layer retrieves information from the database, a typical approach is to transform the relational data into your domain object(s) and pass that to the business logic layer so the rest of your application can deal with domain objects instead of a relational model.
Then your ORM does the opposite when persisting data: it takes a domain entity and turns it back into a relational structure that can be saved to the database.
Note: I'm assuming a relational database here. If not, substitute relational for whatever type of persistence layer you're using.
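As a minimal sketch of that round trip, using SQLAlchemy as one example ORM (class and table names are illustrative):

```python
# Domain object <-> relational row round trip via an ORM (SQLAlchemy here).
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class Customer(Base):
    """Domain object: the business layer works with this, not with rows."""
    __tablename__ = "customers"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)

    def greeting(self) -> str:
        # Behavior lives on the object, not in the database.
        return f"Hello, {self.name}!"

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Persisting: the ORM turns the object into a relational row.
    session.add(Customer(name="Ada"))
    session.commit()

    # Retrieving: the relational row comes back as a domain object.
    customer = session.query(Customer).first()
    print(customer.greeting())
```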
I believe that the only specifications which should be addressed as database-oriented problems are those focused on the manipulation of structured data types. If your specification is all about "store a customer record", "delete an order record", "change the value of price from 12 to 33 for the record matching a specification", you've got a database project.
I haven't seen that kind of problem specification since the Cobol team I worked in employed a systems ~~anarchist~~ analyst. Almost every project I've worked on since has had requirements that were not about how data was stored, but what the data meant.
If you get a requirement that says "Users may create Customers. Customers can place orders. Orders contain products. Orders can have delivery methods, payment methods, and status. Status follows a business process", you have an OO problem. You probably need a storage mechanism - and a database would be an excellent choice - but you have business logic that cannot be exclusively implemented by creating structured data types and relationships.
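As a minimal sketch of why that last point pushes you toward objects (the status values and transitions below are invented for illustration):

```python
# "Status follows a business process": the legal transitions are business
# logic a table schema cannot enforce by itself. Statuses are illustrative.
from enum import Enum

class Status(Enum):
    PLACED = "placed"
    PAID = "paid"
    SHIPPED = "shipped"

ALLOWED_TRANSITIONS = {
    Status.PLACED: {Status.PAID},
    Status.PAID: {Status.SHIPPED},
    Status.SHIPPED: set(),
}

class Order:
    def __init__(self) -> None:
        self.status = Status.PLACED

    def advance(self, new_status: Status) -> None:
        # Only legal transitions in the business process are allowed.
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"Cannot go from {self.status} to {new_status}")
        self.status = new_status

order = Order()
order.advance(Status.PAID)       # fine: placed -> paid
# order.advance(Status.PLACED)   # would raise: not a legal transition
```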

How to handle differing validation rules in RIA Services?

We have an Entity Framework model that is used by two different Silverlight applications. The validation rules are very similar in the two contexts but differ slightly.
For example, a regular user in one of the applications cannot enter a time in the future, but an administrator in the other application can.
How would you handle designing this application? Two ideas we came up with:
Creating two entirely separate models, so that each can be independent
Share the same model, but put a "Context" property on our base Entity class, so that the validation rules can validate differently where necessary.
I have never tried it, but what about extending or creating new validation attributes that apply different validation depending on the authorization role of the user?
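As a minimal, language-neutral sketch of that idea in Python (the real implementation would be a custom ValidationAttribute in .NET; the role names and the future-time rule are taken from the question):

```python
# Same entity, different validation depending on the caller's role.
# Illustrative only; RIA Services would express this as a ValidationAttribute.
from datetime import datetime

def validate_entry_time(entry_time: datetime, role: str) -> None:
    # Regular users may not enter future times; administrators may.
    if role != "admin" and entry_time > datetime.now():
        raise ValueError("Only administrators may enter a future time.")

validate_entry_time(datetime(2020, 1, 1), role="user")    # ok: in the past
validate_entry_time(datetime(2999, 1, 1), role="admin")   # ok: admin override
# validate_entry_time(datetime(2999, 1, 1), role="user")  # would raise
```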
Those sound like business rules, which should be separate from data access. You should be able to use the same EDM but implement the business rules in the business layer, not the data layer.
