Azure form-recognizer - prebuilt-invoice doesnt recognize currency, other custom fields - azure-form-recognizer

Azure form-recognizer, prebuilt-invoice doesnt recognize currency and some of my other custom fields from my invoice pdf. General Document gets me all key values. But for General document keyvalues I need to write algorithm to categorize the invoice related fields, which are already done in prebuilt-invoice.
I need all keyvalues from prebiult-invoice api, so I can find the missing elements by myself.
Anybody faced this? how do you overcome? one way I think of is, we can call both apis for same document. But it affects performance and increases cost.
any suggestion?

Related

Azure Search - Hierarchical facets guidance

I'm developing a project where I want to have hierarchical facets.
I have an index with a complex structure, like:
Index
-field1
-List
And othercomplexfield contains another list with anothercomplexfield inside.
I'd like to be able to give to users the possibility to:
Have the facets of field1.
When one is selected, I'd like to give the user the possibility to select one of the values of a certain field of "othercomplexfield" while filtering by the selected field1.
I can do that.
I'd then like to give the user the possibility to select one of the possible values of "anothercomplexfield" while filtering by field1 AND by the selected othercomplexfield.
The difficulty here is that I don't want every possible facet value, but only the ones CONTAINED by the othercomplexfield that I'm filtering for.
So far I had to do this inside of c# and i did not find a way to write a query that gives me back from azure search the distinct values that I want.
Someone has a similar problem?
Did I explain the problem well enough?
I saw no clear guidance online, everything is easy if you only have level 1 facets but when you get into nested objects it's not that clear anymore.
I'm not sure I fully understand the context of your question. What I can tell you is that filters only apply at the document level and not at the complex collection level. What I mean by that is that if a filter matches an item in a complex collection, the entire document will be returned, not just the item in the complex collection that matched. The same is true for facets--facets will count all documents in the result set that match the filter and can't be scoped down just to parts of documents. With that, it seems like having this logic in your application like you mentioned might be the best approach for your current index schema.
We do have this old blog post that talks about one way to implement hierarchical facets with Azure Cognitive Search which may give you some other ideas on how you could implement the functionality you're looking for: https://learn.microsoft.com/en-us/archive/blogs/onsearch/multi-level-taxonomy-facets-in-azure-search

Creating Index in Cloudant

Scenario.
I have a document in the database which has thousands of item in
'productList' as below.
here
All the object in array 'productList' has the same shape and same fields with different values.
Now I want to search in the following way.
when a user writes 'c' against 'Ingrediants' field, the list will show all 'Ingrediants' start with alphabet 'c'.
when a user write 'A' against 'brandName' field, the list will show
all 'brandName' start with alphabet 'A'.
please give an example using this to search for it, either it is by
creating an index(json,text).
creating a Search index (design document) or
using views etc
Note: I don't want to create an index at run-time(I mean index could be defined by Cloudant dashboard) I just want to query it, by this library in the application.
I have read the documentation's, I got the concepts.
Now, I want to implement it with the best approach.
I will use this approach to handle all such scenarios in future.
Sorry if the question is stupid :)
thanks.
CouchDB isn't designed to do exactly what you're asking. You'd need one index for Ingredient, and another for Brand Name - and it isn't particularly performant to do both at once. The best approach I think would be to check out the Mango query feature http://docs.couchdb.org/en/2.0.0/api/database/find.html, try the queries you're interested in and then add indexes as required (it has the explain plan to help make this more efficient).

solr auto-suggestion with multible where clauses

Not sure if it is a relevant query to post, but want to understand auto-suggestion is suitable option for location based search as I am looking for specific requirement. The requirement is, from a specified geo location, want to search for providers(be it doctor with specialty or hospitals) using auto suggestion.
As part of suggestion, I need to pass geo location with search key, the search key would be a doctor’s name or doctor’s specialty or hospital name or hospital address, the suggester would provide the results on the basis of geo distance in ascending order.
The weightage option would be calculated on the basis of distance by inverse value.
I posted earlier a query here (solr autosuggestion with tokenization), this post is relevant to my earlier query.
Regards
Venkata Madhu
If you want to add more logic to the suggestions that you're going to show is probably a good idea to use normal queries instead of the suggest component.
For instance take a look at this repo is a (bit outdated) example of using a normal solr core to store suggestions and do suggest-like queries. Meaning you can do partial match queries on that index and add the custom scoring logic that you want. Keep in mind that it doesn't need to be a separated core you could just copy data from the fields that you have in a separate field used only for generating the suggestions.
In this case, you'll only need to add/edit the score function used to add your own logic (geodist) or even do a hard sort on the distance.

How to save user specific arrays in Mailchimp

For quite a while I am struggling with how to save custom user specific arrays of data in Mailchimp.
Simple example: I want to save the project ids for a user in Mailchimp and in best case be able to use them there properly as well. Let's say user fritz#frey.com has the 5 project ids 12345, 25345, 21342, 23424 and 48935. Why is there no array merge field that let's me save this array of project ids to a user?! (Or is there one and I am just blind...)
I know I can do drop down fields to put users in multiple groups, like types of projects for example, but the solution can hardly be a drop down with all (several thousand) project ids and I check the ones the user is a part of (and I doubt that Mailchimp would support that solution for a large number of group items anyways).
Oh and of course I could make the field myself by abusing a string field and connect the project ids with commas or a json string but that seems neither like a clean solution nor could I use the data properly in Mailchimp (as far as I know).
I googled quite a bit and couldn't find anything helpful sadly... :(
So? Can anybody enlighten me? :)
Thanks for all your help!
It sounds like you have already arrived at the correct answer: there is no "array" type, other than the interests type, which is global and not quite the same as an array.
The best solution here sort of depends on your data. If each project ID will have many different subscribers attached to it, and there won't be too many of them active at any given time, I'd just use interests. If you think there may be dozens of project ids active simultaneously, I'd not store this data on the subscribers at all, instead I'd build static segments for each project, and add users to them.
If projects won't have a bunch of subscribers associated, I'd store the data on your end and/or continue using the comma-separated string field.

Using Solr to store user specified information in documents

I have an application that contains a set of text documents that users can search for. Every user must be able to search based on the text of the documents. What is more, users must be able to define custom tags and associate them to a document. Those tags are used in two ways:
1)Users must be able to search for documents based on specific tag ids.
2)There must be facets available for the tags.
My solution was adding a Mutivalued field in each document to pose as an array that contains the tagids that this document has been tagged with. So far so good. I was able to perform queries based on text and tagids ( for example text:hi AND tagIds:56 ).
My question is, would that solution work in production mode in an environment that users add but also remove tags from the documents ? Remember , I have to have the data available in real time, so whenever a user removes/adds a tag I have to reindex that document and commit immediately. If that's not a good solution, what would be an alternative ?
Stackoverflow uses Solr - this is in case if you doubt Solr abilities in production mode.
And although I couldn't find much information on how they have implemented tags, I don't think your approach sounds wrong. Yes, tagged documents will have to be reindexed (that means a slight delay) but other than that I don't see anything wrong with it.

Resources