v2 to v3 Transition for Form Recognizer - azure-form-recognizer

Because FR v3.0 is still in preview, I followed the v2.1 Quickstarts, "Analyze using a Prebuilt model", and navigated to the Form Recognizer Sample Tool.
Using Form Type = "Invoice", I tested many sizes and kinds of text, including handwriting, and I am very happy with the results, especially the structure of the returned JSON file:
...
"analyzeResult":
{
...
readResults:[...],
pageResults:[...]
...
}
For a large/complex image or document, I use pageResults.tables[0].cells with rowIndex and columnIndex to piece together the text of each row and reconstruct the whole document. For a small/simple image or document, or when pageResults.tables.length == 0, readResults.lines achieves the same OCR outcome. One size fits all, perfect!
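As an illustration, here is a rough sketch (plain JavaScript; analyzeResult stands for the parsed v2.1 Analyze response, using the field names from the JSON above) of how the two paths can be combined:

function extractText(analyzeResult) {
  const pageResult = analyzeResult.pageResults[0] || {};
  const tables = pageResult.tables || [];
  if (tables.length > 0) {
    // Group each cell's text by rowIndex, ordering cells by columnIndex within the row.
    const rows = [];
    for (const cell of tables[0].cells) {
      rows[cell.rowIndex] = rows[cell.rowIndex] || [];
      rows[cell.rowIndex][cell.columnIndex] = cell.text;
    }
    return rows.map((row) => row.join(" "));
  }
  // No tables detected: fall back to the raw OCR lines.
  return analyzeResult.readResults.flatMap((page) => page.lines.map((line) => line.text));
}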
Next came my own hands-on test with the same images, using the JavaScript samples. Because I have been using Invoice only, I picked recognizeInvoice.js, a great sample that is easy and simple to follow. Even though it is v3 and missing readResults and pageResults, I was still able to use invoice.pages[0].tables[0].cells to achieve the same result for a large/complex image/doc. For a small/simple image I found 2 issues:
1. invoice.pages[0].tables.length == 0, so there are no text values.
2. The only text value is NRT LLC. from invoice.fields.VendorName.value; all other printed text and handwriting returned by v2.1 are gone!
I believe there must be reasons on the MS side for the above changes, but for us it means v3 is not backward compatible. More importantly, we would not be able to know whether an image fits a model and/or will return anything before submitting it; even if we provide users a list of model choices, they may be frustrated by the extra manual work. At the moment all we can do is switch back to Google. So,
Where is the v2.x sample code, and when will MS discontinue v2.x?
How does v2.x transition to v3?
Below is my navigation route. Thank you, and I really appreciate the great work!

It is a bit confusing, but the versions of the @azure/ai-form-recognizer package on npm are one major version ahead of the Form Recognizer API versions. The preview API version "2021-09-30-preview" (REST API "v3") can be used with Form Recognizer SDK version 4.0.0-beta.2. REST API version v2.1 (GA) is used with SDK version 3.2.0. The README for @azure/ai-form-recognizer 3.2.0 explains this:
Note: This package targets Azure Form Recognizer service API version 2.x.
I'm guessing, based on what you've said, that you are using the latest stable version 3.2.0 of the SDK. When extracting data using a prebuilt or custom model in this version, tables are attached to pages, and pages are attached to forms, so you can access a table by looking through the recognized forms:
// client is a FormRecognizerClient created from @azure/ai-form-recognizer 3.2.0
const poller = await client.beginRecognizeInvoices(inputs);
const invoices = await poller.pollUntilDone();
// Tables hang off the pages of each recognized form
const table = invoices[0].pages[0].tables[0];
If a table appears on a page that isn't associated with any form (no form appears on that page), it can't be accessed this way. That capability is present in the new beta SDK for the new preview API, but with the current SDK, to get all pages (whether or not they contain a form), you could consider using the beginRecognizeContent method.
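For example, a minimal sketch of that fallback (the endpoint, key, and file path are placeholders):

const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
const fs = require("fs");

const client = new FormRecognizerClient(
  "https://<your-resource>.cognitiveservices.azure.com/", // placeholder endpoint
  new AzureKeyCredential("<api-key>") // placeholder key
);

async function extractAllLines(filePath) {
  // beginRecognizeContent returns layout results (lines and tables) for every page,
  // whether or not a form was detected on that page.
  const poller = await client.beginRecognizeContent(fs.createReadStream(filePath));
  const pages = await poller.pollUntilDone();
  return pages.flatMap((page) => page.lines.map((line) => line.text));
}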

Related

Using Maps in Codename One

Can anyone assist me in adding a Map to a Form in Codename One? I've been reading and searching, and can't find an "updated" demo/tutorial on using the library effectively.
IDE - IntelliJ
Build Tool - Maven
The old code works mostly the same as it did. The GitHub project instructions are up to date and should work as expected. Just add the dependency and get the API keys from Google.
Then all you need to do is create a BorderLayout Form and place the map within it. E.g.:
Form mapForm = new Form(new BorderLayout());
mapForm.add(BorderLayout.CENTER, mapContainerInstance);
mapForm.show();

Google Structured Data Tool doesn't read my React site content

I use this tool to test my structured data:
https://search.google.com/structured-data/testing-tool
This is my page:
https://www.offersprive.eu/it/prod/Black%20Latte/56
If I try to check it, the response is empty.
...But if I copy and paste my HTML content, the tool reads it correctly.
What can I do so the tool reads the content at the link? Is that a problem with React content loading?
Thanks.
I'm having this same issue.
Basically I have a static website (job board) built with React and want the job to show in the Google Job Network.
To do this the web page needs to contain structured data for Google to crawl.
I've tried some npm packages like react-structured-data, which does get the data to appear in the header, but the data gets injected AFTER Google runs the scan, so the data does not yet exist for Google and it therefore returns zero results.
I have the same issue when I try using react-helmet.
I have the same issue when I try to append a script with the data to either the header or the body in componentDidMount or componentWillMount.
It's weird that it shows in the header when I do inspect elements but doesn't show in the header when I view page source.
Maybe one solution is server-side rendering, but there must be another way.
Possible answer
According to this answer, Google might actually see the data; it's just that the testing tool doesn't see it, which is quite a pain in the butt.
https://webmasters.stackexchange.com/questions/91064/structured-data-tool-doesnt-see-javascript-rendered-content
Also, this page:
https://developers.google.com/search/docs/guides/intro-structured-data#structured-data-format
says: Google can read JSON-LD data when it is dynamically injected into the page's contents, such as by JavaScript code or embedded widgets in your content management system.
Another potential solution, but less plausible because it still loads after the fact:
Instead of using JSON-LD, use microdata attached to your elements. For example, if you go here:
https://schema.org/JobPosting
and click Example 4, Microdata tab,
then perhaps it will know to wait for your elements to load before scanning.
Testing these solutions now. I will update probably tomorrow as I am logging off soon.
UPDATE: I FOUND THE ANSWER
I have tried the above and it appears that the data is valid and Google does see it; it's just that Google's Structured Data tool (and also some structured data Chrome extensions) don't see the data. This is because such tools scan the page before the data is loaded in. Other tools wait until the data is loaded before scanning, and on those tools it works.
For example: if you inspect your web page, click on the HTML element, click "Edit as HTML", copy the entire HTML of your page, and paste that HTML as code into the Google Structured Data tool, you should see that it now finds your data. Hopefully they fix that in the future, but for now you can at least try that to make sure your data is valid.
Another thing: if you go to the Google Search Console and request the URL in question to be indexed by Google, then wait a day or so for it to process and check back, you will see that the Google Search Console DID find your data. So Google IS seeing your data; it's just the broken Structured Data Tool from Google that is not seeing it. Hopefully it is fixed soon.
For the record, how I was able to get this to work on my React app is by putting my data inside componentDidMount. E.g.:
componentDidMount() {
  const googleJobNetworkScript = document.createElement("script");
  googleJobNetworkScript.type = "application/ld+json";
  googleJobNetworkScript.innerHTML = JSON.stringify({
    "@context": "http://schema.org",
    "@type": "JobPosting",
    "baseSalary": "100000",
    "jobBenefits": "Medical, Life, Dental",
    "datePosted": "2011-10-31",
    "description": "Description: ABC Company Inc. seeks a full-time mid-level software engineer to develop in-house tools.",
    "educationRequirements": "Bachelor's Degree in Computer Science, Information Systems or related fields of study.",
    "employmentType": "Full-time",
    "experienceRequirements": "Minimum 3 years experience as a software engineer",
    "incentiveCompensation": "Performance-based annual bonus plan, project-completion bonuses",
    "industry": "Computer Software",
    "jobLocation": {
      "@type": "Place",
      "address": {
        "@type": "PostalAddress",
        "addressLocality": "Kirkland",
        "addressRegion": "WA"
      }
    },
    "occupationalCategory": "15-1132.00 Software Developers, Application",
    "qualifications": "Ability to work in a team environment with members of varying skill levels. Highly motivated. Learns quickly.",
    "responsibilities": "Design and write specifications for tools for in-house customers Build tools according to specifications",
    "salaryCurrency": "USD",
    "skills": "Web application development using Java/J2EE Web application development using Python or familiarity with dynamic programming languages",
    "specialCommitments": "VeteranCommit",
    "title": "Software Engineer",
    "workHours": "40 hours per week"
  });
  document.head.appendChild(googleJobNetworkScript);
}
You can also append the child to document.body instead of document.head. Either should work. Your choice.
You could also use react-helmet, or react-structured-data from NPM, which some other people do, but I didn't see the need, since the above seems to work fine.
You can find the other structured data types at schema.org
Remember to either submit a new sitemap to Google or submit your site to the Google Indexing API each time you have a new webpage, or a webpage with updated content, that you would like Google to scan.
This post is long but I hope it covered all the bases and I hope it helps.
Having had a brief look at how your website loads, I believe you are using React Helmet. The issue with this tool (and vanilla React in general) is that the page must be loaded and the JavaScript run in order for your headers to be set and your content loaded.
Most tools that crawl webpages don't run JavaScript. Google now does on its main crawler, I believe, but they don't seem to have updated all their various tools; for Facebook, Twitter, Bing, etc., I believe support is patchy at best.
The answer is probably either Gatsby or Next.js; both provide ways of rendering your React code on the server or during the build so that all the headers and content are sent when your page is first requested. You can write your own server-side rendering methods, but these solutions do all of that legwork for you.
This removes the need for a crawler to run JavaScript, so you get indexed properly! For the sake of interest, when I ran into this issue I went for Gatsby.
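For illustration (not part of the original answer), here is a minimal Next.js sketch of that idea; the component name and JSON-LD values are placeholders:

import Head from "next/head";

export default function JobPage({ jobPosting }) {
  // The <script> is in the server-rendered HTML, so crawlers see it without running JS.
  return (
    <>
      <Head>
        <script
          type="application/ld+json"
          dangerouslySetInnerHTML={{ __html: JSON.stringify(jobPosting) }}
        />
      </Head>
      <main>{/* page content */}</main>
    </>
  );
}

// Runs at build time, so the structured data is already in the served HTML.
export async function getStaticProps() {
  return {
    props: {
      jobPosting: {
        "@context": "http://schema.org",
        "@type": "JobPosting",
        "title": "Software Engineer",
        "datePosted": "2011-10-31",
      },
    },
  };
}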
A quick workaround is to do what you do with your other links/meta tags and write them into the base index.html file. However, this obviously can't be updated per page, etc.
Hope that shines some light on it :)
Google Structured Data Tool doesn't read my React site content
I think it is reading correctly, but not executing the JS. The Google data tool uses a crawler to fetch the page, which is the source code of your page. To see what content the Google tool is working on, just open your page and go to "View page source"; the tool works on this source, not on what is generated by React.
Is that a problem with React content loading?
This is because React components are rendered after the page loads, and your content is not visible to the crawler, since the web crawler does not execute JavaScript.
I hope this clears up your doubt.
I would suggest having a look at the React Helmet package, which can help you manage your <head> and structured data.
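For completeness, a minimal sketch of that suggestion (assuming react-helmet is installed; the JSON-LD values are placeholders), with the earlier caveat that client-side injection may still be missed by tools that don't run JS:

import React from "react";
import { Helmet } from "react-helmet";

export function JobPage() {
  const jobPosting = {
    "@context": "http://schema.org",
    "@type": "JobPosting",
    "title": "Software Engineer",
    "datePosted": "2011-10-31",
  };
  return (
    <div>
      {/* react-helmet lifts this script tag into <head> when the component renders */}
      <Helmet>
        <script type="application/ld+json">{JSON.stringify(jobPosting)}</script>
      </Helmet>
      {/* rest of the page */}
    </div>
  );
}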

What is the best way to integrate a small AngularJS application into WordPress?

I am working on adding a small one-page AngularJS application to my friend's WordPress site. The application will take some user input and generate on-screen output using that input. It will also log the user input for analysis purposes.
I will need to be able to host the files for that page on my friend's site as well as create a back-end script that can capture the user input and store it to a MySQL database.
I have worked with WordPress sites before but have never customized them or written a plugin. How would you go about making this happen?
I will select the answer that leads me down the most efficient / effective path. Thanks!
After poking around for a while, I ran across the Advanced Custom Fields plugin. Using this helped me add custom JavaScript to the specific page that needed Angular support.
Here is information on how to use ACF to add JavaScript support:
https://www.godaddy.com/garage/3-ways-to-insert-javascript-into-wordpress-pages-or-posts/
Here is the plugin itself:
https://wordpress.org/plugins/advanced-custom-fields/
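For illustration only, here is the kind of minimal one-page AngularJS snippet that could be pasted into the page via ACF; the module/controller names and the logging URL are placeholders, and the logging endpoint would be whatever back-end script you host for the MySQL capture:

<div ng-app="miniApp" ng-controller="MainCtrl as vm">
  <input ng-model="vm.input" placeholder="Enter something">
  <button ng-click="vm.submit()">Go</button>
  <p>{{ vm.output }}</p>
</div>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.8.2/angular.min.js"></script>
<script>
  angular.module("miniApp", []).controller("MainCtrl", function ($http) {
    var vm = this;
    vm.submit = function () {
      // Generate the on-screen output from the user's input.
      vm.output = "You entered: " + vm.input;
      // Log the input for later analysis; this URL is a placeholder for your own back-end script.
      $http.post("/wp-content/uploads/mini-app/log.php", { value: vm.input });
    };
  });
</script>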

ATK Upload Addon

I am attempting to add an uploading interface to my site that supports multiple-file selection and drag-and-drop uploading. The Filestore add-on has worked very well for me in the past, but I need more features than it supports.
I found an upload add-on developed by Romans here: https://github.com/romaninsh/upload
The README states that the add-on uses the BlueImp Upload jQuery widget and it lists several features such as a FileList view and a DropZone controller. However, when I looked in the source code for the add-on, I didn't find classes for most of the views described in the README or for the controller. I tried following the instructions under "Stand-alone use" by adding a View_Uploader element to a page, but this only added an empty div to the page.
Is the add-on incomplete? Or is it meant to be extended before it can be functional? If this add-on isn't the best tool for the job, is there a better way to implement the kind of enhanced uploader that I need on my site?
I am the author of that new upload add-on. It is in fact incomplete; I've planned it out and drafted the features / README but haven't had time to finish it.
The goal is to create a View that would interact with Filestore / importFile but uses a more modern way to upload the file.
If you would like to take over that add-on and build it out, I'll offer you some help.

Using intermediate appPage xpages mobile

I have a simple issue, but I don't know how to solve the problem.
Using XPages Mobile controls, I have a document with some actions, one of which is "Send to Signature".
The workflow is: select the signer of the document (from names) and send it for signature.
I need to show the current user a field to select the signer (I already have a field with a typeahead function to select them).
I would like to use an intermediate appPage for this, but when I'm in the new appPage, the currentDocument is gone.
How can I use the same document (opened from a dataView) to solve this problem?
Can I navigate between appPages (inside a SinglePageApp) using the currentDocument?
Thanks in advance
It might be worth looking at Mobile Value Picker on OpenNTF. The mobile prefix may need setting in XSP Properties to "Mobile_" (or it could be that I messed around with the development version after the last upload). But it gives code for a button/link that goes to a subsequent appPage to select values from a list. It starts from a DataView, goes to a document, and then to the selection page.
The project has not changed for a couple of years, but I've just tested it in Chrome setting user agent to Android and it still seems to work fine on 9.0.1 FP2.
