Google Structured Data Tool doesn't read my React site content - reactjs

I use this tool to test my structured data:
https://search.google.com/structured-data/testing-tool
This is my page:
https://www.offersprive.eu/it/prod/Black%20Latte/56
If I try to check it, the response is empty
...but if I copy and paste my HTML content, the tool reads it correctly.
What can I do to get the tool to read the content from the link? Is this a problem with how React loads content?
Thanks.

I'm having this same issue.
Basically I have a static website (job board) built with React and want the job to show in the Google Job Network.
To do this the web page needs to contain structured data for Google to crawl.
I've tried some npm packages like react-structured-data, which does get the data to appear in the head, but the data gets injected AFTER Google runs its scan, so the data does not yet exist as far as Google is concerned, and the tool returns zero results.
I have the same issue when I try using react-helmet.
I have the same issue when I try to append a script with the data to either the head or the body in componentDidMount or componentWillMount.
It's odd that the data shows in the head when I inspect elements but not when I view the page source (inspecting shows the live DOM after JavaScript has run; "view source" shows only the initial HTML).
Maybe one solution is server-side rendering, but there must be another way.
Possible answer
According to this answer, Google might actually see the data; it's just the testing tool that doesn't see it, which is quite a pain in the butt.
https://webmasters.stackexchange.com/questions/91064/structured-data-tool-doesnt-see-javascript-rendered-content
Also, this page:
https://developers.google.com/search/docs/guides/intro-structured-data#structured-data-format
says: "Google can read JSON-LD data when it is dynamically injected into the page's contents, such as by JavaScript code or embedded widgets in your content management system."
Another potential solution, but less plausible because it still loads after the fact
Instead of using JSON-LD, use microdata attached to your elements. For example, if you go here:
https://schema.org/JobPosting
and click example 4, under the microdata tab,
then perhaps the tool will know to wait for your elements to load before scanning.
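For illustration, the microdata variant might look like this as a React component (a minimal sketch loosely based on schema.org's JobPosting vocabulary; note that React spells the attributes itemScope, itemType, and itemProp):

```jsx
// Minimal sketch: schema.org JobPosting microdata expressed as JSX attributes.
function JobPostingCard() {
  return (
    <div itemScope itemType="https://schema.org/JobPosting">
      <h2 itemProp="title">Software Engineer</h2>
      <p itemProp="description">
        Full-time mid-level software engineer to develop in-house tools.
      </p>
      <meta itemProp="datePosted" content="2011-10-31" />
      <span itemProp="jobLocation" itemScope itemType="https://schema.org/Place">
        <span itemProp="address" itemScope itemType="https://schema.org/PostalAddress">
          <span itemProp="addressLocality">Kirkland</span>,{" "}
          <span itemProp="addressRegion">WA</span>
        </span>
      </span>
    </div>
  );
}
```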
I'm testing these solutions now. I will probably update tomorrow, as I am logging off soon.
UPDATE: I FOUND THE ANSWER
I have tried the above, and it appears that the data is valid and Google does see it; it's just that Google's Structured Data tool (and also some structured-data Chrome extensions) doesn't see the data. This is because such tools scan the page before the data is loaded in. Other tools wait until the data is loaded before scanning, and on those tools it works.
For example: if you inspect your web page, click on the HTML element, click "Edit as HTML", copy the entire HTML of your page, and paste that HTML into the Google Structured Data Tool as a code snippet, you should see that it now finds your data. Hopefully they fix that in the future, but for now you can at least use that to make sure your data is valid.
Another thing: if you go to Google Search Console, request that the URL in question be indexed by Google, wait a day or so for it to process, and then check back, you will see that Search Console DID find your data. So Google IS seeing your data; it's just the broken Structured Data Tool from Google that is not. Hopefully it is fixed soon.
For the record, how I was able to get this to work in my React app is by putting my data inside componentDidMount, e.g.:
```jsx
componentDidMount() {
  // Build a JSON-LD script element and append it to the <head> once the
  // component has mounted. Note that JSON-LD keywords use "@" (e.g.
  // "@context", "@type"), not "#".
  const googleJobNetworkScript = document.createElement("script");
  googleJobNetworkScript.type = "application/ld+json";
  googleJobNetworkScript.innerHTML = JSON.stringify({
    "@context": "http://schema.org",
    "@type": "JobPosting",
    "baseSalary": "100000",
    "jobBenefits": "Medical, Life, Dental",
    "datePosted": "2011-10-31",
    "description": "Description: ABC Company Inc. seeks a full-time mid-level software engineer to develop in-house tools.",
    "educationRequirements": "Bachelor's Degree in Computer Science, Information Systems or related fields of study.",
    "employmentType": "Full-time",
    "experienceRequirements": "Minimum 3 years experience as a software engineer",
    "incentiveCompensation": "Performance-based annual bonus plan, project-completion bonuses",
    "industry": "Computer Software",
    "jobLocation": {
      "@type": "Place",
      "address": {
        "@type": "PostalAddress",
        "addressLocality": "Kirkland",
        "addressRegion": "WA"
      }
    },
    "occupationalCategory": "15-1132.00 Software Developers, Application",
    "qualifications": "Ability to work in a team environment with members of varying skill levels. Highly motivated. Learns quickly.",
    "responsibilities": "Design and write specifications for tools for in-house customers Build tools according to specifications",
    "salaryCurrency": "USD",
    "skills": "Web application development using Java/J2EE Web application development using Python or familiarity with dynamic programming languages",
    "specialCommitments": "VeteranCommit",
    "title": "Software Engineer",
    "workHours": "40 hours per week"
  });
  document.head.appendChild(googleJobNetworkScript);
}
```
You can also append the child to document.body instead of document.head; either should work.
You could also use react-helmet or react-structured-data from npm, which some other people do, but I didn't see the need, since the above seems to work fine.
You can find the other structured data types at schema.org.
Remember to either submit a new sitemap to Google or submit your site to the Google Indexing API each time you have a new page, or a page with updated content, that you would like Google to scan.
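If you want to script that notification step, the Indexing API boils down to a single authenticated JSON POST (a rough sketch; the access token and URL below are placeholders, and a real token requires a service account set up per Google's Indexing API documentation):

```js
// Rough sketch: tell Google's Indexing API that a URL was added or updated.
// ACCESS_TOKEN is a placeholder; a real one comes from OAuth 2.0 with a
// service account that is a verified owner of the site in Search Console.
const ACCESS_TOKEN = "YOUR_OAUTH_ACCESS_TOKEN";

fetch("https://indexing.googleapis.com/v3/urlNotifications:publish", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${ACCESS_TOKEN}`,
  },
  body: JSON.stringify({
    url: "https://www.example.com/jobs/software-engineer", // placeholder URL
    type: "URL_UPDATED",
  }),
})
  .then((res) => res.json())
  .then(console.log);
```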
This post is long but I hope it covered all the bases and I hope it helps.

Having had a brief look at how your website loads, I believe you are using React Helmet. The issue with this tool (and with vanilla React in general) is that the page must be loaded and the JavaScript run before your headers are set and your content loaded.
Most tools that crawl webpages don't run JavaScript. Google now does on its main crawler, I believe, but they don't seem to have updated all their various tools; for Facebook, Twitter, Bing, etc., support is patchy at best.
The answer is probably either Gatsby or Next.js; both provide ways of rendering your React code on the server or at build time, so that all the headers and content are sent when your page is first requested. You could write your own server-side rendering, but these solutions do all that legwork for you.
This removes the need for the crawler to run JavaScript, so you get indexed properly! For the sake of interest, when I ran into this issue I went with Gatsby.
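For illustration, here is roughly what that looks like in Next.js (a minimal sketch; the hardcoded job object stands in for your real data source):

```jsx
// Minimal Next.js sketch: getServerSideProps runs on the server, so the
// JSON-LD script below is already in the HTML that crawlers receive.
import Head from "next/head";

export async function getServerSideProps() {
  // Placeholder data; in practice you would fetch this from your API or DB.
  return {
    props: { job: { title: "Software Engineer", datePosted: "2011-10-31" } },
  };
}

export default function JobPage({ job }) {
  const jsonLd = {
    "@context": "http://schema.org",
    "@type": "JobPosting",
    title: job.title,
    datePosted: job.datePosted,
  };
  return (
    <>
      <Head>
        {/* dangerouslySetInnerHTML stops React from escaping the JSON */}
        <script
          type="application/ld+json"
          dangerouslySetInnerHTML={{ __html: JSON.stringify(jsonLd) }}
        />
      </Head>
      <h1>{job.title}</h1>
    </>
  );
}
```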
A quick workaround is to do what you already do with your other links and meta tags: write them into the base index.html file. However, this obviously can't be updated per page.
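For instance, a site-wide JSON-LD block in the static index.html might look like this (the name and URL are placeholders):

```html
<!-- Static fallback in index.html: identical on every page, so only
     suitable for site-wide structured data. -->
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Organization",
  "name": "Example Job Board",
  "url": "https://www.example.com"
}
</script>
```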
Hope that shines some light on it :)

Google Structured Data Tool don't read my React Site content
I think it is fetching your page correctly but not executing the JS. The Google tool uses a crawler to fetch the page, which is the source code of your page. To see the content the Google tool is working on, open your page and go to "View page source": the tool works on that source, not on what is generated by React.
Is that a problem with React content loading?
This is because React components are rendered after the page loads, and your content is not visible to the crawler, since the web crawler does not execute JavaScript.
I hope this clears up your doubt.

I would suggest you have a look at the React Helmet package, which can help you manage your <head> and structured data.
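A minimal sketch of what that might look like (the title, description, and JSON-LD values are placeholders; the caveat from above still applies: without server-side rendering these tags only exist after JavaScript runs):

```jsx
// Minimal react-helmet sketch: declares head tags, including a JSON-LD
// script, from inside any component.
import React from "react";
import { Helmet } from "react-helmet";

export default function JobPage() {
  const jsonLd = {
    "@context": "http://schema.org",
    "@type": "JobPosting",
    title: "Software Engineer",
  };
  return (
    <Helmet>
      <title>Software Engineer - Example Job Board</title>
      <meta name="description" content="Full-time software engineer role." />
      <script type="application/ld+json">{JSON.stringify(jsonLd)}</script>
    </Helmet>
  );
}
```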

Related

How to dynamically set the title and description of an Angular single-page application for Googlebot?

I want Googlebot to recognize the titles and descriptions of my pages; the titles and descriptions come from the database.
I used:
document.title = $scope.dataFetchedFromDB.title;
and
document.querySelector("meta[name='description']").content = $scope.dataFetchedFromDB.description;
and it does change the title and description in the browser, but not in the snippets fetched by Google, Facebook, Slack, etc. The old title and description remain.
I know about the ng-meta npm package, but my pages aren't on static routes; the route is determined by the page ID (every page has its own ID, description, and title).
I also read:
Remember that while Google says that they use JavaScript to crawl pages, Facebook, Twitter, etc., do not. You can test Google's render of your page from the links here.
But Google takes a while to index these changes in their snippets. I would recommend creating a Google Search Console account and having it fetch-and-render the pages you want re-indexed. Even then, public results may take days or weeks to reflect your changes.
Also, it seems that the Googlebot's JavaScript renderer doesn't have a lot of patience. Try to make sure you change your title and description within moments of the page loading, not at the end. In small tests, the Googlebot renderer appears to time out after a few seconds and capture only the original title and description.
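In practice that means writing the tags the moment your data arrives, e.g. in the controller as soon as a route resolve delivers the record (a sketch; the module, route, and API names are placeholders):

```js
// Sketch: fetch the record in the route resolve so the controller can set
// the title/description immediately on load, before any other work.
var app = angular.module("app", ["ngRoute"]);

app.config(["$routeProvider", function ($routeProvider) {
  $routeProvider.when("/page/:id", {
    templateUrl: "page.html",
    controller: "PageCtrl",
    resolve: {
      page: ["$http", "$route", function ($http, $route) {
        return $http.get("/api/pages/" + $route.current.params.id)
          .then(function (res) { return res.data; });
      }],
    },
  });
}]);

app.controller("PageCtrl", ["$scope", "page", function ($scope, page) {
  // Set these first, before rendering anything else.
  document.title = page.title;
  document.querySelector("meta[name='description']")
    .setAttribute("content", page.description);
  $scope.page = page;
}]);
```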
In order to get other sites like Facebook or Twitter to render the proper metadata, you'll need to server-side render these pieces of data. Whatever appears when you select "View Source..." is what these simpler crawlers will see. Consider updating to Angular (from AngularJS) and trying server-side rendering for your metadata.

How to modify an HTMLElement in index.html before the page gets returned to the requestor

Based on the custom URL parameters I process, I am trying to dynamically modify a meta tag I have ID'ed in index.html, like so:
<meta name="og:image" content="http://example.com/someurl.jpg" id="ogImage"/>
The code below, in my home.ts, seems to work:
document.getElementById('ogImage').setAttribute("content", Media.ImageURL) ;
I can verify it via the browser dev console (Elements tab).
However, when I view it from Facebook via their object graph debugger at
https://developers.facebook.com/tools/debug/og/object/
It appears to see the default
http://example.com/someurl.jpg
as if index.html is shipped before my home.ts gets a chance to make the update.
Perhaps my understanding is flawed and there is a better way to do this.
Thank you.
Note 1: initially, I thought I had to create some Angular binding between index.html and one of my services, but I could not locate any sample code; the closest I came was this post:
How can I update meta tags in AngularJS?
But I don't know how to apply it to my Ionic 2/3 code, so I opted for the document.getElementById approach.
Note 2: the ultimate goal here is to share a link on social media (web or app) like Facebook, or a messenger like Viber/Skype, and have it resolve to a meaningful image, title, and description that drive visits back to the site via the browser, or via the app if the user clicking the link is on a mobile device with my app version of the site installed.
Note 3: if you decide to point me to Ionic deep linking, please provide code matching the above, because I could not understand how to apply it to my case.
If you are trying to implement dynamic Open Graph meta tag values in your pages, you will need a server-side scripting language like PHP. Such a script runs on the server, updates the page as needed, and then the page is served to the requesting site or application.
Client-side scripting (i.e. JavaScript) is usually ignored when a site or app merely visits your site/link for the purpose of extracting (aka scraping, parsing HTML) information such as that provided by the Open Graph meta tags (og:title, og:description, og:image...).
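To stay in JavaScript rather than PHP, the same idea in Node/Express might look like this (a rough sketch; the route, file path, and lookupImageFor helper are hypothetical):

```js
// Rough sketch: rewrite the og:image meta tag on the server, before the
// HTML reaches the requestor, so scrapers that never run JS see the value.
const express = require("express");
const fs = require("fs");

const app = express();
const template = fs.readFileSync("www/index.html", "utf8"); // placeholder path

app.get("/item/:id", (req, res) => {
  // lookupImageFor is a hypothetical helper mapping URL params to an image.
  const imageUrl = lookupImageFor(req.params.id);
  res.send(
    template.replace(
      /(<meta name="og:image" content=")[^"]*(" id="ogImage"\/>)/,
      `$1${imageUrl}$2`
    )
  );
});

app.listen(3000);
```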

Problems with an AngularJS application's appearance in Google search

I have a personal project which has consumed my free time and effort for about a year without significant profit. I have problems with its appearance in Google and would really appreciate some help here.
This project (http://yuppi.com.ua - similar to Craigslist in the US) is a web-based AngularJS 1.2 application that uses a PHP REST API hosted on GoDaddy. In order to make this application popular, it has to be very visible on the internet and very searchable in Google, and users have to be able to share pages via social networks or Skype.
According to Google's documentation, Google crawlers don't run JavaScript to get the content of a web page before indexing, so I've added an _escaped_fragment_ page that displays the content of the web page without JavaScript. For example:
Page: http://yuppi.com.ua/#!/items/sub/18/_
Dirty : yuppi.com.ua/?_escaped_fragment_=/items/sub/18/_
This dirty page will be redirected here, where Google will see the content.
http://yuppi.com.ua/server/crawler_proxy/routee.php?path=/items/sub/18/
So basically I have two versions of the HTML for that page. One version is the one available to users, which has styles, many more HTML tags, etc. The second is the version for the Google crawler: very lightweight, without any styles. And I expect to see a clean link to my site in Google, not a dirty one.
But if you search Google for all links to the site, you will see that one of the links displays its "dirty" state.
Another problem is sharing links on Skype.
When I send someone a link, I expect the link to be transformed into a thumbnail image, but that doesn't happen. Instead I see an ugly link to my web site.
Please help me understand how to make everyone happy: users, the Google crawler, GoDaddy, and me.
I encountered the same problems last year on a big project, and we ended up using https://prerender.io/.
It's a prerendering system that uses a PhantomJS browser to detect bot requests and render the full HTML template. It also runs a cache service so that a template that hasn't changed isn't rendered again.
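Your stack is PHP on GoDaddy, but to show the idea: if any part of your setup can run Node/Express, wiring prerender.io up is a few lines with the prerender-node middleware (a sketch; the token is a placeholder for your prerender.io account):

```js
// Sketch: prerender-node spots bot user agents (and _escaped_fragment_
// requests) and serves prerendered HTML from prerender.io instead of the
// empty JavaScript shell.
const express = require("express");
const prerender = require("prerender-node");

const app = express();
app.use(prerender.set("prerenderToken", "YOUR_PRERENDER_TOKEN"));
app.use(express.static("public"));

app.listen(8080);
```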
Hope it helps.

Angular.js: Is there any disadvantage of hash in url with respect to SEO?

I am making a website using AngularJS, and I am curious to know whether there is any disadvantage to a hash in the URL with respect to SEO.
e.g. http://www.website.com/#about-us
I'll appreciate any contribution.
Thanks
If we go back to basics, the hash (#) is a fragment identifier pointing at an element ID in your HTML, and, in more detail, Google ignores anything after the hash.
For example, the page www.mydomain.com is treated the same as www.mydomain.com/#about-us
This is an advanced technique some marketers use to track their campaigns without parameters like UTMs, in order to avoid content duplication.
To make sure your page loads without any errors, try disabling JS in your browser using the "Web Developer" tool and then load your page; I think you will get a white page without content, and this is the way Google and most search engines see your pages.
There is also another way to test it: go to Search Console ("Webmaster tools") and use Fetch as Google; there you will see exactly how Google views your page.
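If the hash itself is the worry, AngularJS can also serve hashless URLs via HTML5 mode (a minimal sketch; it assumes a <base href="/"> tag in index.html and server-side rewrites so deep links resolve to your app):

```js
// Sketch: switch AngularJS routing from #-URLs to real paths. Requires
// <base href="/"> in index.html and server rewrites back to index.html.
angular.module("app").config(["$locationProvider", function ($locationProvider) {
  $locationProvider.html5Mode(true);
}]);
```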

offline mobile website

Is it possible to have a mobile website that can still function when there is no internet connection?
The user should still be able to use the website (if he has visited the page before), see the data (that was loaded before), and add new stuff (cached locally).
When the internet connection comes back online, all changed local data should be pushed online.
This should be a completely web-based solution, not a native app.
You should have a look at HTML5 offline storage; see http://diveintohtml5.ep.io/offline.html and the Offline Web Applications spec as a start. There are also quite a few posts about it here on SO.
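For the "add new stuff offline, push it when back online" half of the question, a minimal sketch with localStorage and the online event (the /api/items endpoint is a hypothetical placeholder):

```js
// Minimal sketch: queue writes locally while offline, flush when back online.
// "/api/items" is a hypothetical endpoint; replace it with your own API.
function saveItem(item) {
  var queue = JSON.parse(localStorage.getItem("pendingItems") || "[]");
  queue.push(item);
  localStorage.setItem("pendingItems", JSON.stringify(queue));
  if (navigator.onLine) flushQueue();
}

function flushQueue() {
  var queue = JSON.parse(localStorage.getItem("pendingItems") || "[]");
  if (queue.length === 0) return;
  Promise.all(
    queue.map(function (item) {
      return fetch("/api/items", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(item),
      });
    })
  ).then(function () {
    // Only clear the queue once every pending item was accepted.
    localStorage.setItem("pendingItems", "[]");
  });
}

// Push pending changes as soon as connectivity returns.
window.addEventListener("online", flushQueue);
```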
Bookmarklets work when a user is offline. The trick with a bookmarklet is that it is entirely self-contained JavaScript wrapped up in such a way that it can live within the bookmark itself, e.g. a javascript: URL. You can also have a data: URL as a bookmark, which could be a complete HTML page. Usually these are base64-encoded with a MIME type.
Probably what I'd do is have a small base page as data:text/html;base64 which contains whatever offline content you care about, but periodically tries to bootstrap the rest of the "real" content from wherever you host it.
