Google crawler verification with Prerender not working - angularjs

I have set up Prerender.io for SEO on one of my Angular-based projects. It's hosted on an EC2 instance running an Apache2 server.
The problem is that when I do Fetch or Fetch and Render in Google Webmaster Tools, it doesn't show the expected result.
Case 1: If I check
http://mywebsite/something
it fetches just the index.html file.
Case 2: If I check
http://mywebsite/?_escaped_fragment_=/something
it shows this:
Is there a way to verify whether the crawler is actually crawling my website content, and specifically which content the crawler is getting?
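For reference, the kind of check I have in mind is requesting the page myself with a bot-like User-Agent, and also via the _escaped_fragment_ URL, and comparing what comes back. A rough sketch (Node 18+ with its global fetch; the URLs are the placeholders from above):

    // check-crawl.mjs - compare what the site returns for the normal URL
    // vs. the ?_escaped_fragment_= URL that Prerender is supposed to intercept.
    const googlebotUA =
      'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

    const urls = [
      'http://mywebsite/something',
      'http://mywebsite/?_escaped_fragment_=/something',
    ];

    for (const url of urls) {
      const res = await fetch(url, { headers: { 'User-Agent': googlebotUA } });
      const body = await res.text();
      // A prerendered response should contain the rendered page content,
      // not just the empty Angular shell (e.g. an empty ng-view / ui-view).
      console.log(url, res.status, body.length, 'bytes');
    }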

Related

How to get 404 status code response for React.js & Firebase (GCP) Hosting app

I recently took on a new client who had an existing web presence on their domain, and because of the old link structure Google is still indexing the old links as the sub-title links in the company's search results. I have re-submitted the sitemap in Google Search Console, tried re-indexing the pages individually, and also tried submitting removal requests for URLs with certain suffixes. None of this has changed the links Google brings up in search; they are still the old links. The proper links are scattered in there, but they are not the top results.
I asked on the Google Search forum and we found that the pages are still returning a 200 status code rather than a true 404 when the page does not exist on the site (see below). My app is built with React.js and hosted with Firebase Hosting on Google Cloud Platform. I am not finding much information online on how to make my React Error Boundary send a true 404, or, if sending a status code client-side is not possible, on how to do it on the Firebase side. Any ideas on how to fix the primary issue of Google promoting dead links under the main result for the website?
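One direction I am considering (not something I have confirmed works for my setup) is to stop using a catch-all rewrite to index.html and only rewrite the route prefixes the app actually serves, so that unknown paths fall through to Firebase Hosting's 404 page, which is served with a real 404 status. A sketch of what that could look like in firebase.json, with "/app/**" standing in for whatever prefixes the site really uses:

    {
      "hosting": {
        "public": "build",
        "rewrites": [
          { "source": "/app/**", "destination": "/index.html" }
        ]
      }
    }

If the routes can't be enumerated like that, the other option I've seen mentioned is rendering through a Cloud Function, since a function can decide the status code itself before sending the page.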

Hosting a React App from Google Cloud Storage 404 status code

I am hosting a React app from Google Cloud Storage and have set up the bucket to use index.html as both the main page and the error page, so index.html always loads, but I get a 404 status code when accessing the page. I don't know if this is the best way to handle this. I suspect the problem is that I need URL rewriting so the HTML always loads, and since the bucket can't rewrite or redirect the way I need it to, this is what happens.
Are there alternatives, such as serving at least the index page from an actual server behind a load balancer?
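What I mean by the last part is something as small as the sketch below, using Express (the "build" directory name is just an assumption) on a VM or App Engine behind the load balancer, so that every client-side route gets index.html back with a 200:

    // server.js - serve static assets, fall back to index.html with a 200
    // so client-side routes are never answered with the bucket's 404.
    const express = require('express');
    const path = require('path');

    const app = express();
    const buildDir = path.join(__dirname, 'build');

    app.use(express.static(buildDir));   // JS/CSS/images served directly
    app.get('*', (req, res) => {         // everything else gets the SPA shell
      res.sendFile(path.join(buildDir, 'index.html'));
    });

    app.listen(process.env.PORT || 8080);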

Create-react-app with react-router v3 hosted on S3 not working with "fetch as google"

I currently have a React app built with create-react-app using react-router v3, hosted on S3 through CloudFront. The app is the view for a PHP API hosted elsewhere. The router is set up to use browserHistory.
We are currently trying to set up the app so that it can be crawled by Google, and are testing this with Google Webmaster Tools and "fetch as google".
The homepage fetches with no problem, but any internal page fails to even render and returns a "not found".
The site also still shows a 404 error in the console when navigating directly to a route in a new tab (though it loads the page as expected).
What I've tried so far:
1) imported babel-polyfill at the entry point for the Googlebot
2) set up CloudFront error pages to send 404 responses to /index.html with a 200
3) set the error page for S3 to index.html
From my reading, Google shouldn't require server-side rendering just to crawl the site (SEO is not a concern for us), but none of the other solutions I've found online seem to solve the problem.
Will I need to make the whole app able to handle SSR, or will something simple like https://github.com/facebook/create-react-app/blob/master/packages/react-scripts/template/README.md#serving-apps-with-client-side-routing do, or are there other things I can try that will make a page crawlable without setting anything up server-side?
Any help or direction to further resources would be appreciated!
I found out the solution was pretty easy: in the CloudFront distribution, set the custom error pages to send 404 errors to the target "/" with an HTTP response of 200.
A lot of other posts have the target as "/index.html"; if that doesn't work, just try the above.
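For reference, if the distribution is managed through the AWS CLI or an API/infrastructure tool instead of the console, the same setting lives in the CustomErrorResponses block of the distribution config; roughly:

    "CustomErrorResponses": {
      "Quantity": 1,
      "Items": [
        {
          "ErrorCode": 404,
          "ResponsePagePath": "/",
          "ResponseCode": "200",
          "ErrorCachingMinTTL": 0
        }
      ]
    }

Depending on how the S3 origin is set up, missing objects can come back as 403 rather than 404, in which case the same mapping may be needed for the 403 error code as well.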

How to Test Facebook Connect Locally with Monaca

I am currently working on a project with AngularJS and Onsen UI in Monaca Localkit, and I need your help with integrating Facebook login into the application.
I followed the instructions
here, but I couldn't get it to work.
In the Site URL field I use http://localhost/.
However, I did achieve the desired result with plain JavaScript, serving my index.html from http://localhost:8080 (of course, I changed the Site URL to http://localhost:8080).
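For comparison, the plain-JavaScript version that did work for me is just the standard Facebook JS SDK flow served from http://localhost:8080, along the lines of the sketch below (the app id and SDK version are placeholders, and the SDK script tag from the quick-start is assumed to already be on the page):

    // In index.html served from http://localhost:8080 (same origin as the Site URL).
    window.fbAsyncInit = function () {
      FB.init({ appId: 'YOUR_APP_ID', cookie: true, xfbml: true, version: 'v2.8' });
    };

    // Call this from a button click so the login popup isn't blocked.
    function loginWithFacebook() {
      FB.login(function (response) {
        if (response.authResponse) {
          console.log('Logged in, user id:', response.authResponse.userID);
        } else {
          console.log('Login cancelled or not authorized');
        }
      }, { scope: 'public_profile,email' });
    }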

Fetch as Google Webmaster tools

I have an AngularJS SPA site which I wanted to test using Google's "Fetch as Google" feature in Webmaster Tools. I am a little confused about the results. The screenshot from Googlebot looks correct, however the response doesn't include any of the content inside the "ui-view" (ui-router)... Can someone explain what is happening here? Is Google indexing the site properly, since the screenshot is correct? Or is Google not able to execute the JS properly for indexing?
This is a mixed bag. From some tests I've seen, Googlebot is able to index some AJAX-fetched content in some cases. A safe bet, though, to keep all the search engines happy is to use prerender.io or download their open-source stuff (it uses PhantomJS) so that your site is easily indexable. Basically, it saves a version of your site after async operations have completed for a given URL, and you then set up a redirect on your server that points any of the potential search-engine bots over to the pre-processed page. It sounds pretty complicated, but following the instructions on the site it's not too hard to set up, and if you don't want to pay for prerender.io to serve cached copies of your pages to search engines, you can run the server component yourself too.
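For example, if the site happens to be served by a Node/Express app, the server-side piece is roughly the prerender-node middleware; a sketch (the token, service URL, and "dist" path are placeholders, not details from the question):

    // server.js - proxy crawler requests to a prerender service, serve the
    // normal SPA to everyone else.
    const express = require('express');
    const prerender = require('prerender-node');

    const app = express();

    app.use(
      prerender
        .set('prerenderToken', 'YOUR_TOKEN')                     // prerender.io account
        // .set('prerenderServiceUrl', 'http://localhost:3000/') // or self-hosted
    );

    app.use(express.static('dist'));
    app.get('*', (req, res) => res.sendFile(__dirname + '/dist/index.html'));

    app.listen(8080);

The same redirect can be expressed as Apache or nginx rewrite rules if there is no Node server in front; prerender.io's docs cover those variants.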

Resources