sitemap.xml in Angular SPA [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 11 months ago.
We have built a single-page app in Angular, which is a loan application wizard. Every user must start at page 1 and enter some info to get to page 2, and so on. There is no way to reach the later pages without submitting the data from the first one.
I am quite new to SEO, and my question is what an optimal sitemap file should look like. I think it should contain just the first page's URL and ignore the rest, because when a crawler tries to access any other page the app returns the first page anyway (this is handled in $stateChangeStart: the app checks the user's data on the back-end and redirects them to the appropriate page).
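For context, the redirect guard in question looks roughly like this (a sketch, not our exact code; wizardState is a hypothetical service that caches the step the back-end says the user may see):

var app = angular.module('loanApp', ['ui.router']);
app.factory('wizardState', function () {
  // Hypothetical stub: in the real app this mirrors the back-end's answer.
  var step = 'step1';
  return { currentStep: function () { return step; } };
});
app.run(['$rootScope', '$state', 'wizardState', function ($rootScope, $state, wizardState) {
  $rootScope.$on('$stateChangeStart', function (event, toState) {
    // The back-end decides which wizard step the user has reached.
    var allowed = wizardState.currentStep(); // e.g. 'step1' until data is submitted
    if (toState.name !== allowed) {
      event.preventDefault(); // cancel the requested transition
      $state.go(allowed);     // send the user back to their current step
    }
  });
}]);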

Paste your sitemap.xml file into the src folder, then add it to the assets array in angular.json. This works for me:
"assets": [
  "src/favicon.ico",
  "src/assets",
  "src/sitemap.xml"
],
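Once the file is deployed, it also helps to point crawlers at it from robots.txt (a minimal sketch; the domain is a placeholder for wherever your app is served):

User-agent: *
Allow: /
Sitemap: https://www.example.com/sitemap.xml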

First of all, this is not really about programming, so you may get a better response on the sister site http://webmasters.stackexchange.com. But I will give my view anyway.
A sitemap is just an optional extra hint about your site that helps search engines and other web crawlers find all your pages. There is no need to have one at all: Google will find your pages anyway if they are linked from anywhere else. A sitemap just speeds up the process, and it also gives search engines like Google one place to check whether any of your pages have been updated. For those reasons it is recommended, but it is not mandatory, nor is it mandatory to include all your pages in it.
So, given that, which pages do you want search engines to find? Those pages should go in the sitemap. As per the above, that is not to say other pages will not be found just because you leave them out of your sitemap, so don't depend on that.
Given that your second and subsequent pages will just display your first page, there seems little point in including them in a sitemap. In the best case they would be ignored, and in the worst case Google would think they are duplicate content, which can cause problems.
Typically, for a site like yours, you want the static pages indexed by Google along with the first page of the dynamic application form, and that's it. So that's what should go in the sitemap.
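In concrete terms, such a sitemap could be as small as this (a sketch; the URLs are placeholders for your static pages and the wizard's entry page):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
  <url><loc>https://www.example.com/apply</loc></url>
</urlset>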
The other thing to note is that Angular SPAs depend on quite a bit of JavaScript, which may or may not be processed properly by search engines. Google's crawlers have become pretty good at it, but I can't vouch for other search engines. It's best to have as much content as possible not dependent on JavaScript, and particularly not on heavy frameworks like Angular, if you want the best chance of being understood by search engines. So if you turn off JavaScript and don't at least get a basic page of content, you could be in trouble.
Leading on from that, and the reason it pertains to sitemaps: search engines process hash URLs (used by Angular) and query parameters differently. Some ignore them and treat anything with the same core URL as the same page. Adding the same page with different parameters to a sitemap can be an indicator that these are different pages, but even then I've seen lots of problems with this. I'm not sure how you're handling page changes in your app (same URL, hash URLs, or parameters?), but the fact that you only want or need your first page indexed will probably avoid a lot of those problems.
Google has a number of useful tools in Google Search Console (aka Google Webmaster Tools). If you have not registered your site there, do so ASAP. It allows you to upload your sitemap, see any errors, fetch a page as Google sees it (to answer the JavaScript questions raised above), and tell Google how to handle parameters on your pages, amongst other things. Google will also use it to report back any errors it finds on your site, which is another very important reason to use it.

There is no need to add a separate sitemap for each page, or any number of extra sitemaps. A single sitemap takes in all of your application's links (URLs). So what is a sitemap? A sitemap is a file that simplifies the job of a search engine's crawler: instead of discovering your pages one by one, the crawler can read your sitemap XML file and index your app's or website's pages directly, which is also less time-consuming. You can add as many links (URLs) to your sitemap XML file as you need.
For example:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.yoururl.com/</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.00</priority>
  </url>
  <url>
    <loc>http://www.yoururl.com/page-1</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.80</priority>
  </url>
  <url>
    <loc>http://www.yoururl.com/page-2</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.80</priority>
  </url>
  <url>
    <loc>http://www.yoururl.com/page-3</loc>
    <lastmod>2015-03-18T18:14:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.80</priority>
  </url>
</urlset>
And so on; you can keep adding pages. (Note that each <url> entry needs its own distinct <loc>, and the sitemap protocol caps a single file at 50,000 URLs; beyond that you split the entries across multiple files referenced from a sitemap index.)

Related

What is the status of Angularjs SEO in 2018?

I remade my website and used AngularJS for part of it. It has been online for three weeks now, and it seems that Google still has not indexed any of the AngularJS content.
I would like to know what the status of Google crawling AngularJS is in 2018.
Searching the web returns old articles claiming that Google cannot crawl AngularJS, although Google claims it can.
Should I wait patiently for Google to crawl my site, or generate server-side rendering instead?
Also, I would like a link to how to properly do server-side rendering in 2018.
Is the hashbang still the standard way to do it? There are some similar questions on Stack Overflow that are several years old, but I wonder if the situation has changed.
Here is a good article: http://kevinmichaelcoy.com/blog/2018/02/19/2018-search-engine-optimization-with-angularjs-1-x-single-page-application/
Also, for your sanity, you can check what your website looks like when Google crawls it by going to Google Webmaster/Search Console and, under "Crawl", choosing "Fetch as Google", then "Fetch and Render":
https://www.google.com/webmasters/tools/googlebot-fetch
In the case of my site, Google doesn't index the AngularJS content all that well.
For some pages it displays the content as I would expect, but on others it just displays the raw HTML (i.e. the {{title}} binding instead of the value of $scope.title).
I'm fetching a category page that uses AJAX to display the category content. Some categories display fine, so it might be a bug in the googlebot-fetch tool:
https://buyamerica.co.il/ba_supplier#/showCategory/ba-suplier/840
https://buyamerica.co.il/ba_supplier#/showCategory/ba-suplier/468
But I still don't know how long it should take for Google to show these pages in the index.
NOTE: according to https://webmasters.googleblog.com/2015/10/deprecating-our-ajax-crawling-scheme.html, the old AJAX crawling scheme (the _escaped_fragment_ approach) is deprecated, not server-side rendering itself.

angularjs sitemap SEO

I don't see any updated answers on similar topics (hopefully something has changed with the latest crawler releases), which is why I'm asking a specific question.
I have an AngularJS website which lists products that can be added or removed (the links are updated accordingly). The URLs have the following format:
http://example.com/#/product/564b9fd3010000bf091e0bf7/published
http://example.com/#/product/6937219vfeg9920gd903bg03/published
The product ID (6937219vfeg9920gd903bg03) is retrieved from our back-end.
My problem is that Google doesn't list these pages, probably because I don't have a sitemap.xml file on my server.
On any given day a page can be added (and therefore a new URL) or removed.
How can I manage this?
Do I have to edit the file manually (or with a batch job) each time?
Is there a smarter way to tell Google: "Hey my friend, look at this page!"?
Generally, you can create a JavaScript/AngularJS sitemap, and according to this guidance from Google, they will crawl it:
https://webmasters.googleblog.com/2015/10/deprecating-our-ajax-crawling-scheme.html
You can also use Fetch as Google to validate that the pages render correctly.
There is another study of Google's execution of JavaScript:
http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157
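As for keeping the sitemap up to date as products come and go: rather than editing the file by hand, you can generate it on the server from the same data the app uses. Here is a minimal Node.js/Express sketch, assuming a hypothetical getProductIds() that returns the current product IDs from your back-end:

// Serve /sitemap.xml built on the fly from the current product list.
var express = require('express');
var app = express();

function getProductIds() {
  // Placeholder: fetch the live product IDs from your back-end store.
  return Promise.resolve(['564b9fd3010000bf091e0bf7', '6937219vfeg9920gd903bg03']);
}

app.get('/sitemap.xml', function (req, res) {
  getProductIds().then(function (ids) {
    var urls = ids.map(function (id) {
      return '  <url><loc>http://example.com/#/product/' + id + '/published</loc></url>';
    }).join('\n');
    res.type('application/xml').send(
      '<?xml version="1.0" encoding="UTF-8"?>\n' +
      '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
      urls +
      '\n</urlset>'
    );
  });
});

app.listen(3000);

One caveat: crawlers generally ignore the fragment (#) part of a URL, so hash-style links in a sitemap are of limited value; this approach pays off most once the app moves to HTML5 pushState URLs.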

Web Crawlers are reversing query parameters and path when scraping pages

We have an AngularJS based web application that currently uses hashbang urls, such as:
www.example.com/#!/item?id=1.
For crawling purposes, we use the prerender.io service to render/cache pages. For our meta tags (og, twitter specifically) we use an angular library called angular-view-head. Until around a month ago, this was all working beautifully, and our pages were both searchable and sharable as expected.
Currently, when scraping pages on our site, crawlers appear to be swapping the path and the query string. For example,
www.somesite.com/#!/item?id=1
becomes
www.somesite.com/?id=1#!/item
Which, as you might suspect, always returns a 404.
After some checking, this seems to have started sometime around the 7th of February. We haven't changed anything in our prerender setup or our URL schema. I've checked Google Webmaster Tools and see many 404s for URLs such as these.
I haven't had any luck in my research over the last few days finding any similar issues.
Has anyone faced something similar with this style of setup? Any ideas on how to fix this issue?
For anyone who finds this question: we solved this by moving to HTML5 pushState navigation.
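For reference, switching AngularJS to pushState URLs looks roughly like this (a sketch; it also requires a <base href="/"> tag in index.html, and the server must rewrite all app routes to the index page):

var app = angular.module('myApp', []);
app.config(['$locationProvider', function ($locationProvider) {
  // Use clean HTML5 pushState URLs instead of #! hashbang URLs.
  $locationProvider.html5Mode(true);
}]);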

HTML snippets for AngularJS app that uses pushState?

I'm deciding whether it's safe to develop my client-facing app in AngularJS using pushState.
I've read that when using pushState in an AngularJS app, we don't need to worry about Googlebot because it can now execute enough JS to produce an HTML snippet for itself. But then I wonder about Bing, Facebook and other bots and scrapers. The tutorials I've seen for making AngularJS SEO-friendly all deal with apps that use hashbangs (#!). These don't apply to me since I'm not using hashbangs.
Does anyone have insight into this problem? What are some methods for ensuring an AngularJS app that uses pushState is SEO-friendly and Social-scraper-friendly? If you use a service like Seo4Ajax or prerender.io I'd appreciate your thoughts on it.
Note: As I understand it, when developing single-page apps over the last couple of years it has been necessary to send HTML snippets to SEO crawlers. This was accomplished by using hashbangs and a meta tag that let Google, Bing, and Facebook know they needed to replace the hashbang (#!) with ?_escaped_fragment_= when making a request. On the server you'd listen for requests containing _escaped_fragment_ and deliver the appropriate HTML snippet, generated with a tool like PhantomJS.
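Concretely, the AJAX crawling scheme also documented a signal for pages without hashbangs: a meta tag in the page head that tells crawlers to fetch the ?_escaped_fragment_= version of the URL instead:

<meta name="fragment" content="!">

A crawler that sees this tag on http://example.com/item would request http://example.com/item?_escaped_fragment_= and expect the server to answer with the prerendered snapshot.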
Now that we have pushState, I don't see whether this is still necessary for JavaScript-less bots, or how well bots other than Google handle it. I'm having trouble finding any information beyond "your site will be okay with Google ;)".
Here are some other SO questions that are similar but have gone unanswered.
Angularjs vs SEO vs pushState
.htaccess for SEO bots crawling single page applications without hashbangs
Here's a solution I posted in that question and am considering for myself in case I want to send HTML snippets to bots. This would be a solution for a Symfony2 backend:
Use prerender.io or another service to generate static snippets of all your pages. Store them somewhere accessible to your router.
In your Symfony2 routing file, create a route that matches your SPA. I have a test SPA running at localhost.com/ng-test/, so my route would look like this:
# Adding a trailing / to this route breaks it. Not sure why.
NgTestReroute:
    path: /ng-test/{one}/{two}/{three}/{four}
    defaults:
        _controller: DriverSideSiteBundle:NgTest:ngTestReroute
        'one': null
        'two': null
        'three': null
        'four': null
    methods: [GET]
In your Symfony2 controller, check the user agent to see whether it's Googlebot or Bingbot. You should be able to do this with the code below, and then use this list to target the bots you're interested in (http://www.searchenginedictionary.com/spider-names.shtml):
// Case-insensitive check for Googlebot in the User-Agent header.
if (strstr(strtolower($_SERVER['HTTP_USER_AGENT']), "googlebot"))
{
    // serve the prerendered HTML snippet here
}
If your controller finds a match for a bot, send it the HTML snippet. Otherwise, as in the case of my AngularJS app, just send the user to the index page and Angular will correctly handle the rest.
Supposedly, Bing also supports pushState. For Facebook, make sure your website takes advantage of Open Graph META tags.
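A minimal Open Graph sketch for reference (the property values are placeholders):

<meta property="og:title" content="My item title" />
<meta property="og:description" content="A short description of the item." />
<meta property="og:url" content="http://www.example.com/item/1" />
<meta property="og:image" content="http://www.example.com/images/item1.jpg" />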

Deeplinking backbonejs with pushstate

Not sure what I am missing, but I have pushState working on my Backbone-based app: I can click around and have my URL look like www.example.com/route_specified. However, if I try to go directly to that page it shows up as not found. If I go to www.example.com/#route_specified it works, and the address bar quickly changes back to www.example.com/route_specified.
I am guessing I need to do something in Apache to handle this and make sure that all calls resolve to the index, or something like that, but I can't find an explanation.
Correct. Think about it this way: the server knows nothing about pushState. It is still trying to serve a document at that route, and since it cannot find one at that location, it throws a 404.
Technically speaking, your server should still produce some sort of result at that URL and then have Backbone take over. In its simplest form, this is called progressive enhancement. The server should serve some sort of static page with the critical info, which will eliminate the issues you would otherwise have with SEO. Make your site/app work with JavaScript disabled, serving only the relevant data, then have Backbone take over. Mashable's redesign integrates progressive enhancement with Backbone extremely well.
If SEO is not a concern, you could always rewrite every route to the index page, as sketched below. Just remember that search engines will then only index your app's index page; if your content is served dynamically, there won't be any data to index.
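A minimal .htaccess sketch for that fallback, assuming Apache with mod_rewrite enabled and the app served from index.html (adjust the filename to your setup):

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteBase /
  # Serve real files and directories as-is.
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  # Route everything else to the single-page app entry point.
  RewriteRule ^ index.html [L]
</IfModule>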
Hope this helps.
Thanks
Tyrone
