I don't see any updated answers on similar topics (hopefully something has changed with the latest crawler releases), which is why I'm asking a specific question.
I have an AngularJS website, which lists products that can be added or removed (the links are clearly updated). URLs have the following format:
http://example.com/#/product/564b9fd3010000bf091e0bf7/published
http://example.com/#/product/6937219vfeg9920gd903bg03/published
The product's ID (6937219vfeg9920gd903bg03) is retrieved by our back-end.
My problem is that Google doesn't list them, probably because I don't have a sitemap.xml file on my server.
On any given day a page can be added (and therefore a new URL) or removed.
How can I manage this?
Do I have to manually (or by batch) edit the file each time?
Is there a smart way to tell Google: "Hey my friend, look at this page"?!
Generally you can create a JavaScript/AngularJS sitemap, and according to this guidance from Google:
https://webmasters.googleblog.com/2015/10/deprecating-our-ajax-crawling-scheme.html
they will crawl it.
You can also use Fetch as Google to validate that the pages render correctly.
There is also a study on how Googlebot executes JavaScript:
http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157
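If you want the sitemap itself to stay in sync as products are added and removed, you don't have to edit a file by hand: you can generate it on request from the same back end that already knows the product IDs. A minimal sketch, assuming a Node/Express back end and a hypothetical getPublishedProducts() helper (neither of which is given in the question):

// sitemap route -- a sketch only; Express and getPublishedProducts() are assumptions
const express = require('express');
const app = express();

// stand-in for a real database query; returns the currently published products
async function getPublishedProducts() {
  return [{ id: '564b9fd3010000bf091e0bf7' }, { id: '6937219vfeg9920gd903bg03' }];
}

app.get('/sitemap.xml', async (req, res) => {
  const products = await getPublishedProducts();

  // one <url> entry per published product
  const urls = products
    .map(p => `  <url><loc>http://example.com/#/product/${p.id}/published</loc></url>`)
    .join('\n');

  res.type('application/xml').send(
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    urls + '\n</urlset>'
  );
});

app.listen(3000);

Submit that URL once in Search Console and Google will re-read it on its own schedule, so daily additions and removals are picked up without manual edits. One caveat: crawlers generally ignore the #-fragment part of a URL, so this is most useful once the app serves clean URLs (HTML5 mode) rather than # routes.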
I have this big issue.
I have developed a Gatsby website for the car parts market.
The users can select car model and part type and receive a page with the parts for their specific car.
I have used React Router to dynamically create the new paths, and an API that returns the data:
https://mysiteAPI.com/{car.id}/{part.id}
The paths that will be created look like this: https://mysite.com/comparison/{car.name}-{part.name}
The big issue is that these paths are only created when a user clicks through the selection. With this approach I cannot generate the pages when I build the website; they are generated only when users click.
I actually need these pages to be readable by search engine crawlers for SEO reasons.
I tried to create each page by calling createPage in gatsby-node, but the build crashed due to the huge number of pages (over 5 million).
I don't know how to generate pages that are readable by crawlers and persist. I hope someone can help me improve my site.
Thanks in advance
Take a look at the createPages API, available in the gatsby-node file of your project. If you would like to create static HTML files in the build process for each item you need to first fetch them somehow. With the data in your possession you could iterate over it and call createPage for each page you want to create.
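For example (a sketch only: the API endpoint, the field names and the template path below are placeholders, not something taken from the question):

// gatsby-node.js -- sketch; endpoint, field names and template path are placeholders
exports.createPages = async ({ actions }) => {
  const { createPage } = actions;

  // Fetch every car/part combination up front (Node 18+ has a global fetch;
  // on older Node versions use a client such as node-fetch or axios)
  const response = await fetch('https://mysiteAPI.com/combinations');
  const combinations = await response.json();

  combinations.forEach(({ carId, carName, partId, partName }) => {
    createPage({
      path: `/comparison/${carName}-${partName}`,
      component: require.resolve('./src/templates/comparison.js'),
      // context is available to the template, e.g. for its page query
      context: { carId, partId },
    });
  });
};

With over 5 million combinations a full build will still be very heavy, so in practice you would probably restrict this to the subset of pages that actually matters for SEO.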
I remade my website and used AngularJS for part of it. It has been online for three weeks now, and it seems that Google still has not indexed any of the AngularJS content.
I would like to know: what is the status of Google crawling AngularJS in 2018?
Searching the web returns old articles claiming that Google cannot crawl AngularJS, although Google claims it does crawl AngularJS.
Should I wait patiently for Google to crawl my site or set up server-side rendering instead?
Also, I would like a link to how to properly do server-side rendering in 2018.
Is hashbang still the standard way to do it? There are some similar questions on Stack Overflow that are several years old, but I wonder if the situation has changed.
Here is a good article: http://kevinmichaelcoy.com/blog/2018/02/19/2018-search-engine-optimization-with-angularjs-1-x-single-page-application/
Also, for your sanity, you can check what your website looks like when Google crawls it by going to Google Webmaster/Search Console and under “Crawl” choose “Fetch as Google”, then “Fetch and Render” :
https://www.google.com/webmasters/tools/googlebot-fetch
In the case of my site, Google doesn't index the AngularJS content so well.
For some pages it displays the content as I would expect, but for others it just displays the raw HTML (i.e. the {{title}} binding instead of the value of $scope.title).
I'm fetching a category page that uses AJAX to display the category content; some categories display well, so it might be a bug in the googlebot-fetch tool:
https://buyamerica.co.il/ba_supplier#/showCategory/ba-suplier/840
https://buyamerica.co.il/ba_supplier#/showCategory/ba-suplier/468
But I still don't know how long it should take for Google to show it in the index.
NOTE: according to https://webmasters.googleblog.com/2015/10/deprecating-our-ajax-crawling-scheme.html the old AJAX crawling scheme (serving prerendered snapshots via _escaped_fragment_) is deprecated
I have an AngularJS app. In order to make the app AJAX crawlable, I changed all the '#' to '#!'. When I tried the change with Google Webmaster Tools, the results were still redirected to the index page (home page). My site URL is like https://www.sample.com/web/ and the rest of the URL I'm entering for Fetch and Render is like #!/wellness. The issue is, I'm always getting the rendered Googlebot snapshot as the homepage (an image of https://www.sample.com/web/), and the "path" column of that fetch attempt is / (the part I entered, #!/wellness, is not there).
Finally I've found the solution. Though Google's web crawlers translate #! into the escaped fragment automatically, Fetch as Google requires it to be entered manually.
Refer to these links if someone needs help with an issue like this.
Below is a link to a question exactly like mine, and it has the answer:
https://productforums.google.com/forum/#!msg/webmasters/fZjdyjq0n98/PZ-nlq_2RjcJ
The link below gives a complete explanation of these issues:
Google bot crawling on AngularJS site with HTML5 Mode routes
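For reference, switching an AngularJS 1.x app from # to #! routes is normally done through $locationProvider; a sketch (the module name 'myApp' is a placeholder):

// app config -- sketch; 'myApp' stands for your real module name
angular.module('myApp').config(['$locationProvider', function ($locationProvider) {
  // serve client-side routes as #!/route instead of #/route
  $locationProvider.hashPrefix('!');
}]);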
We have built a single-page app in Angular, which is a loan application wizard. Every user must start at page 1 and enter some info to get to page 2, etc. There is no way to reach the other pages without submitting data from the first one.
I am quite new to SEO, and my question is what an optimal sitemap file should look like. I think it should contain just the first page's URL and ignore the rest, because when a crawler tries to access the other pages the app returns the first page anyway (this is handled in $stateChangeStart: the app checks the user's data on the back end and redirects to the appropriate page).
Paste your sitemap.xml file into the src folder, then add it to the assets array in angular.json.
It works for me:
"assets": [
"src/favicon.ico",
"src/assets",
"src/sitemap.xml",
],
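After rebuilding, the file should end up in the output root, so it is served from https://your-domain/sitemap.xml (your-domain being a placeholder) and can then be submitted in Google Search Console.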
First of all, this is not really about programming, so you may get a better response on the sister site: http://webmasters.stackexchange.com. But I will give my view anyway.
A sitemap is just an optional extra hint for your site, to help search engines and other web crawlers find all your pages. There is no need to have one at all, and Google will find your pages if they are linked from anywhere else. A sitemap just speeds up the process and also gives search engines like Google one place to check whether any of your pages have been updated. For those reasons it's recommended, but it is still not mandatory, nor is it mandatory to include all your pages in it.
So, given that, which pages do you want search engines to find? Those pages should go in the sitemap. As noted above, leaving pages out of your sitemap does not mean they will not be found, so don't depend on that.
Given that your second and subsequent pages will just display your first page, there seems little point in including them in a sitemap. In the best case they would be ignored, and in the worst case Google would think they are duplicate content, which can cause problems.
Typically for a site like yours you want the static pages indexed by Google along with the first page of the dynamic application form and that's it. So that's what should go in the sitemap.
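As an illustration, the whole sitemap could be as small as this (example.com, /about and /apply are placeholders for your real domain, a static page and the wizard's first page):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
  <url><loc>https://www.example.com/apply</loc></url>
</urlset>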
The other thing to note is that Angular SPAs involve quite a bit of JavaScript, which may or may not be processed properly by search engines. Google's crawlers have got pretty good at it, but I can't vouch for other search engines. It's best to have as much content as possible not dependent on JavaScript, and particularly not on heavy frameworks like Angular, if you want the best chance of being understood by search engines. So if you turn off JavaScript and don't at least get a basic page of content, then you could be in trouble.
Leading on from that, and why it pertains to the sitemap: search engines process hash URLs (used by Angular) and URL parameters differently. Some ignore them and treat anything with the same core URL as the same page. Adding the same page with different parameters to a sitemap can be an indicator that these are different pages, but even then I've seen lots of problems with this. I'm not sure how you're handling page changes in your app (same URL, hash URL or parameters?), but the fact that you only want or need your first page indexed will probably avoid a lot of those problems.
Google has a number of useful tools in Google Search Console (aka Google Webmaster Tools). If you have not registered your site there, then do that ASAP. This allows you to upload your sitemap, see any errors, fetch a page as Google sees it (to answer the JavaScript questions raised above) and tell Google how to handle parameters on your pages, amongst other things. Google will also use this to report back any errors it finds on your site, which is another very important reason to use it.
There is no need to add a separate sitemap for each page. A single sitemap can take in all of your application's links (URLs). So what is a sitemap? A sitemap is a file that simplifies the job of search engine crawlers: it lets the crawler read your sitemap XML file and index your app's or website's pages instead of discovering them one by one, which is also less time consuming. You can add as many links (URLs) to your sitemap XML file as you like.
E.g.:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yoururl.com/</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.00</priority>
  </url>
  <url>
    <loc>http://www.yoururl.com/page-1</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.80</priority>
  </url>
  <url>
    <loc>http://www.yoururl.com/page-2</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.80</priority>
  </url>
  <url>
    <loc>http://www.yoururl.com/page-3</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.80</priority>
  </url>
</urlset>
And so on; you can keep adding as many pages as you need.
Some time ago Google officially deprecated the AJAX crawling scheme they came up with back in 2009, so I decided to stop generating snapshots in PhantomJS for _escaped_fragment_ and instead rely on Google rendering my single-page app like a modern browser and discovering its content. They describe it here. But now I seem to have a problem.
Google indexed my page (at least I can see in Webmaster Tools that it has), but when I look at Google Index --> Content Keywords in Webmaster Tools, it shows only the unprocessed content of my AngularJS templates and the names of my bound variables, e.g. {{totalnewprivatemessagescount}} etc. The keywords do not contain words that should be generated by AJAX calls when the JavaScript executes, so e.g. "fighter" is not even in there, and it should be all over the place.
Now, when I use Crawl --> Fetch as Google --> Fetch and Render, the snapshot of what Googlebot sees is very much the same as what a user sees and is clearly generated using JavaScript. The Fetch HTML tab, though, shows only the source without it being processed by JS, which I'm guessing is fine.
Now my question is: why didn't Google index my website properly? Is there anything I implemented incorrectly somewhere? The website is live at https://www.fightersconnect.com, and thanks for any advice.