I'm noticing that Watson is having trouble identifying "HI" for Hawaii and "WI" for Wisconsin. It also doesn't do well with certain names like "Logan" and "Abbey."
They are system entities so I don't see an option to edit them in the Improve tab.
Has anyone dealt with this? I've tried a few workarounds but haven't gotten anything to work well. Thanks!
You could define your own entity for the states and their short names, e.g., @states with a value for each state. If you know the context in which those entities are mentioned, you could look into contextual entities. Keep the system entities enabled and compare how your own entities are recognized.
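For reference, here's a minimal sketch of creating such an entity programmatically, assuming the ibm-watson Python SDK and a classic Assistant V1 skill; the API key, service URL, and workspace ID are placeholders, so verify the call against your SDK version:

```python
# A minimal sketch, assuming the ibm-watson Python SDK and a Watson
# Assistant V1 skill; key, URL, and workspace ID are placeholders.
from ibm_watson import AssistantV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

assistant = AssistantV1(
    version="2021-06-14",
    authenticator=IAMAuthenticator("YOUR_API_KEY"),
)
assistant.set_service_url("YOUR_SERVICE_URL")

# Define a custom @states entity whose values carry the two-letter
# abbreviations as synonyms, so "HI" and "WI" resolve to the full names.
assistant.create_entity(
    workspace_id="YOUR_WORKSPACE_ID",
    entity="states",
    values=[
        {"value": "Hawaii", "synonyms": ["HI"]},
        {"value": "Wisconsin", "synonyms": ["WI"]},
        # ...one value per state
    ],
)
```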
Azure Form Recognizer's prebuilt-invoice model doesn't recognize the currency and some of my other custom fields from my invoice PDF. The General Document model gets me all the key-value pairs, but with General Document key-values I'd need to write an algorithm to categorize the invoice-related fields, which prebuilt-invoice already does.
I need all the key-value pairs from the prebuilt-invoice API so I can find the missing elements myself.
Has anybody faced this? How did you overcome it? One way I can think of is to call both APIs for the same document, but that affects performance and increases cost.
Any suggestions?
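For what it's worth, the two-call workaround described above might look roughly like this (a sketch using the azure-ai-formrecognizer Python SDK; the endpoint, key, and merge logic are placeholders, and as noted it doubles the analysis cost):

```python
# A rough sketch of the "call both models" workaround with the
# azure-ai-formrecognizer SDK; endpoint/key are placeholders and the
# merge logic is only illustrative.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    "https://<resource>.cognitiveservices.azure.com/",
    AzureKeyCredential("<key>"),
)

with open("invoice.pdf", "rb") as f:
    doc = f.read()

# One pass with the invoice model for the categorized fields...
invoice = client.begin_analyze_document("prebuilt-invoice", doc).result()
invoice_fields = invoice.documents[0].fields if invoice.documents else {}

# ...and one pass with the general document model for all key-value pairs,
# keeping only the keys the invoice model didn't already categorize.
general = client.begin_analyze_document("prebuilt-document", doc).result()
extra = {
    kv.key.content: kv.value.content
    for kv in general.key_value_pairs
    if kv.key and kv.value and kv.key.content not in invoice_fields
}
```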
I'm making an application for collectors where users will upload lists of items they want to collect.
For example, someone wants to collect flowers and uploads a list of some flowers he'd like to collect which may look something like this:
Rose, Chrysanthemums, Narcissus
Then they may choose which ones they have and work towards their goal.
Of course, users would be able to upload all kinds of different lists, which brings up the question of how this data should be saved and accessed.
An approach I thought of would be to dynamically create a table every time a user uploads a list, but upon looking it up, it's a practice that's generally frowned upon, and people usually suggest other alternatives. However, I can't quite think of an alternative for my situation.
In this question on the DBA Stack Exchange, the reply was that there can be a few rare cases where this is a good practice.
Is my case one of those?
How should I go about designing it?
Also, I understand that I'm not providing many details about this problem, and I'm not asking you to design this for me. I'm just asking for some general guidelines or a direction to move in.
Thank you in advance!
Hello @aMimikyu, for an example as simple as the one you mention, dynamic tables won't help; on the contrary, they might degrade the performance of your software, since you can use a single table to store the users' lists and then use a column of that table to identify the type of list the user is saving. In my opinion, though, there is a case where dynamic tables might be useful: when the entities (the abstract representation of the data) cannot be managed by the same class (model) for every different type of input. In that case, the models and tables can be created on the fly.
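A minimal sketch of the single-table approach with sqlite3 (table and column names are just illustrative):

```python
# A minimal sketch of the single-table design; one table holds every
# user's lists, with a column identifying the list type.
import sqlite3

conn = sqlite3.connect("collections.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS list_item (
    id        INTEGER PRIMARY KEY,
    user_id   INTEGER NOT NULL,
    list_type TEXT    NOT NULL,  -- e.g. 'flowers', 'stamps'
    item_name TEXT    NOT NULL,
    collected INTEGER NOT NULL DEFAULT 0
);
""")

# One user's flower list lives in the same table as everyone else's lists.
items = [(1, "flowers", name) for name in ("Rose", "Chrysanthemum", "Narcissus")]
conn.executemany(
    "INSERT INTO list_item (user_id, list_type, item_name) VALUES (?, ?, ?)",
    items,
)
conn.commit()
```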
For quite a while I have been struggling with how to save custom, user-specific arrays of data in Mailchimp.
A simple example: I want to save the project ids for a user in Mailchimp and, in the best case, be able to use them there properly as well. Let's say user fritz@frey.com has the 5 project ids 12345, 25345, 21342, 23424 and 48935. Why is there no array merge field that lets me save this array of project ids to a user?! (Or is there one and I'm just blind...)
I know I can use drop-down fields to put users in multiple groups, like types of projects for example, but the solution can hardly be a drop-down with all (several thousand) project ids where I check the ones the user is a part of (and I doubt that Mailchimp would support that solution for a large number of group items anyway).
Oh, and of course I could make the field myself by abusing a string field and connecting the project ids with commas or a JSON string, but that neither seems like a clean solution nor could I use the data properly in Mailchimp (as far as I know).
I googled quite a bit and couldn't find anything helpful sadly... :(
So? Can anybody enlighten me? :)
Thanks for all your help!
It sounds like you have already arrived at the correct answer: there is no "array" type, other than the interests type, which is global and not quite the same as an array.
The best solution here sort of depends on your data. If each project ID will have many different subscribers attached to it, and there won't be too many of them active at any given time, I'd just use interests. If you think there may be dozens of project ids active simultaneously, I'd not store this data on the subscribers at all, instead I'd build static segments for each project, and add users to them.
If projects won't have a bunch of subscribers associated, I'd store the data on your end and/or continue using the comma-separated string field.
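For the static-segment route, a rough sketch against the Mailchimp Marketing API v3 using plain requests might look like this (the list ID, datacenter, and key are placeholders; check the endpoint against the current API docs):

```python
# A hedged sketch of the static-segment approach: one static segment
# per project, with the project's subscribers preloaded as members.
import requests

API_KEY = "<api-key>"          # Mailchimp keys end in the datacenter, e.g. "-us21"
DC = API_KEY.split("-")[-1]
LIST_ID = "<list-id>"
BASE = f"https://{DC}.api.mailchimp.com/3.0"

def create_project_segment(project_id, member_emails):
    """Create one static segment per project and preload its members."""
    resp = requests.post(
        f"{BASE}/lists/{LIST_ID}/segments",
        auth=("anystring", API_KEY),  # basic auth: any username + API key
        json={"name": f"project-{project_id}", "static_segment": member_emails},
    )
    resp.raise_for_status()
    return resp.json()["id"]

segment_id = create_project_segment(12345, ["fritz@frey.com"])
```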
I'm trying to set up Solr so that it understands English. For example, I've indexed our company website (www.biginfolabs.com), but it could be any other website or our own data.
If I submit English-language queries, I should get a one-word answer, just like Google does. Example queries:
Where is India located?
Who is the father of Obama?
What I've tried so far:
Integrated UIMA and Mahout with Solr (person-name and city-name extraction is done).
I read the book "Taming Text" and implemented https://github.com/tamingtext/book, but it didn't get me what I want.
Can anyone please suggest how to move forward? It can be anything; our team is ready to do it.
This task is called Named Entity Recognition (NER). You can look up this tutorial to see how they use Solr to extract atomic elements of text into predefined categories such as the names of persons, organizations, locations, expressions of time, quantities, monetary values, percentages, etc., and then learn a model to answer queries.
Also have a look at Stanford NLP for more ideas on algorithms that you can use.
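As a starting point, here's a small sketch of NER with stanza, the Python library from the Stanford NLP group (model names and entity labels may differ by version); you'd run something like this at index time and store the extracted entities in dedicated Solr fields:

```python
# A small NER sketch with stanza; documents tagged like this can feed
# per-entity-type fields in your Solr index.
import stanza

stanza.download("en")  # one-time model download
nlp = stanza.Pipeline("en", processors="tokenize,ner")

doc = nlp("Barack Obama was born in Hawaii. India is in South Asia.")
for ent in doc.ents:
    print(ent.text, ent.type)  # e.g. "Barack Obama PERSON", "Hawaii GPE"
```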
Are there any proven ways of refactoring a database into supporting multiple versions of entries?
I've got a pretty straightforward database with some tables like:
article(id, title, contents, ...)
...
This obviously works like a charm if you're only going to store one version of each article. I remember asking my client very clearly whether the system should be able to store articles in different languages, really stressing that it would be expensive to add this support later on. You can probably guess what the client said back then...
My current approach will be to create a couple of new tables like:
language(id, code, name)
article_index(id, original_title) <- just to be able to group articles
And then add a foreign key into the original article table:
article(id, title, contents, article_index_id, ...)
I would love to hear your comments on this approach and your experiences with the topic.
This is an approach I've used successfully in the past. Another is to replace all text fields with an identifier (int, guid, whatever you want), and then store translations for all the text fields in a single table, keyed on this identifier plus a language id.
Personally, I have had more success with the first approach (i.e. yours) and have found it easier to deal with via an ORM. With NHibernate on my current project, for instance, I've created what amounts to a language-aware session that returns the correct set of translations for each object automatically. Consistency in the approach obviously helps here.
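For illustration, a minimal sketch of that second approach with sqlite3 (all table and column names are just placeholders):

```python
# Every translatable text field holds an ID into a shared translation
# table, keyed on (text_id, language_id); names are illustrative only.
import sqlite3

conn = sqlite3.connect("articles.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS language (
    id   INTEGER PRIMARY KEY,
    code TEXT NOT NULL UNIQUE          -- e.g. 'en', 'de'
);
CREATE TABLE IF NOT EXISTS translation (
    text_id     INTEGER NOT NULL,     -- shared by all translations of one field
    language_id INTEGER NOT NULL REFERENCES language(id),
    value       TEXT    NOT NULL,
    PRIMARY KEY (text_id, language_id)
);
CREATE TABLE IF NOT EXISTS article (
    id          INTEGER PRIMARY KEY,
    title_id    INTEGER NOT NULL,     -- points into translation.text_id
    contents_id INTEGER NOT NULL
);
""")

def title_for(article_id, lang_code):
    """Fetch an article's title in the requested language."""
    row = conn.execute("""
        SELECT t.value FROM article a
        JOIN translation t ON t.text_id = a.title_id
        JOIN language l    ON l.id = t.language_id
        WHERE a.id = ? AND l.code = ?
    """, (article_id, lang_code)).fetchone()
    return row[0] if row else None
```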