PageSpeed Insights: Server-side rendering or Client-side rendering? - reactjs

My website has a conditional access rule: if the access is being made by Googlebot, I deliver a page fully rendered by the server (server-side rendering).
Otherwise, if the access is being made by a human, I deliver a page with SPA features (client-side rendering).
My question is: in my last analysis using PageSpeed Insights (Lighthouse), I noticed that the results were based on the client-side rendering version. In this case, should I assume that Google will judge the website's performance by the client-side version?
I'm a little bit confused about this behavior.
Thanks :)

Lighthouse is different from Googlebot, so you are probably not checking it correctly.
I did a quick check and found this user-agent string in my case:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4143.7 Safari/537.36 Chrome-Lighthouse
But in any case: don't do this. This technique is called cloaking and is forbidden by Google.
You risk having your page removed from the index or heavily penalized.
You must serve bots the same content your users normally access.
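As a minimal sketch (Python, hypothetical helper name), this is how a server branching on the user-agent would tell the two crawlers apart: the Lighthouse UA quoted above carries a Chrome-Lighthouse token that a plain Googlebot check misses. Note this token test is exactly the kind of conditional serving being warned against here.

```python
def classify_client(ua: str) -> str:
    """Classify a request by user-agent token (illustrative only --
    UA strings can be freely spoofed, which is part of why cloaking
    is both unreliable and against Google's guidelines)."""
    if "Chrome-Lighthouse" in ua:
        return "lighthouse"
    if "Googlebot" in ua:
        return "googlebot"
    return "browser"
```

A server that only special-cases the "Googlebot" token would classify a Lighthouse request as an ordinary browser and serve it the client-side build, which matches the behavior described in the question.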

Related

"For your protection, you can't sign in from this device." using Selenium web-drivers on GCP Compute Engine

I have automated UI tests with Selenium. The first step is to sign in to a Google account (username and password; the account does not require anything else). These tests run well on my personal computer, but when I try to run them on a virtual machine in Compute Engine with Chrome/Firefox etc., after entering the email and password, Google returns the following message:
"For your protection, you can't sign in from this device. Try again later, or sign in from another device."
Additional notes:
I have already tried several accounts: Gmail (personal/standard and with G Suite) and the same thing happens (Selenium + Compute Engine).
On the Compute Engine machines I can sign in manually without any problem. The problem arises when I run the script with Selenium (Chrome and Firefox webdrivers).
OS: CentOS 7 + Xfce, Selenium with Node.js.
UserAgent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36"
I attach the image, I appreciate any help.
Image link
Possibly not related to your use case, but I started to encounter this 3 weeks ago myself on a regular Windows 7 Home machine using Firefox Developer as my browser.
The solution offered by #john-smith (which was voted down) gave me an idea, as I am using a custom user-agent in this browser. I reverted the user-agent back to normal and was able to log in to all of the Google services again.
Whoever voted down #john-smith's answer did so without actually testing or understanding it. It appears that Google has probably started to blacklist user-agents that are old, unknown to them, or otherwise suspicious.
If it is also a user-agent issue on your end, then this is more on Google's end.
I hope it helps.
And thanks to #john-smith, his solution gave me an idea and it worked. Unfortunately, I can't up-vote you because of the restrictions in place.
I'm not positive this will help, but I notice your user-agent looks unconventional, reporting Safari and WebKit on Linux.
I know that in my Firefox on macOS, I have an extension that spoofs my user-agent with strings that I know are valid but don't match my actual setup, and I reliably get that same error trying to log into my Google account unless I turn that extension off. I believe Google may be using JavaScript to do some fingerprinting, refusing logins outright if the user-agent string doesn't match what it finds, and returning only that vague message that I "can't sign in from this device."
Maybe you could try setting your user-agent string to something more common or appropriate for your OS and browser.
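If it helps to sanity-check the string before a run, a toy consistency test (Python, hypothetical helper; the token table is an assumption and far from exhaustive) is to compare the OS token in the UA against the host OS, which is the kind of mismatch the fingerprinting theory above would flag:

```python
# Map host OS names (as platform.system() would report them) to the token
# a stock browser UA carries for that OS. Illustrative, not exhaustive.
OS_TOKENS = {
    "Linux": "X11; Linux",
    "Darwin": "Macintosh",
    "Windows": "Windows NT",
}

def ua_matches_os(ua: str, host_os: str) -> bool:
    """True if the user-agent's OS token agrees with the host OS."""
    token = OS_TOKENS.get(host_os)
    return token is not None and token in ua
```

By this rough test the questioner's UA ("X11; Linux x86_64") is consistent with a Linux VM, so if Google is fingerprinting here, it is presumably looking at more than the OS token alone.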

How to get "Did you mean" data from Google without api in c?

I'm trying to download the source code of the Google search page with curl in C and extract the "did you mean" or "showing results for" data, but I'm failing.
How can I save the Google search page source code using only C?
sample url: https://www.google.com/search?q=stacoverflow
i want: view-source:https://www.google.com/search?q=stacoverflow
Thank you.
First: make sure you specify a user agent. (Search the curl docs to find out how to do this.) Google doesn't give you the page if you don't specify a user agent.
This is my user agent and it seems to work: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.119 Safari/537.36"
Second: make sure you are trying to find the RIGHT string in the output of the Google site. Google may try to figure out your country from your IP address and localize the string "did you mean" or "Showing results for" for your language. For example, I am in Estonia now; Google (I tried to access google.COM!) determined my current country and gave me "Näitab tulemusi" instead of "Showing results for". Also, I want to point out that your curl library doesn't share settings, cookies, etc. with your browser, i.e. if you set a language for Google in your browser, curl will not inherit it.
And third: when I get Google results in English, I get "Showing results for" instead of "Did you mean". So try searching for different strings.
Try passing the Google output through some HTML prettifier and looking at the resulting source.
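Once the page is saved, extracting the suggestion is plain string work; in C it would be a `strstr`-style search over the libcurl response (with `CURLOPT_USERAGENT` set, as discussed above). As a rough, language-neutral sketch of that logic in Python (the regex is a guess at the markup shape, not a stable contract -- Google's HTML changes often and, as noted, is localized):

```python
import re

def find_suggestion(html: str):
    """Return the suggested query from a saved results page, or None.
    The pattern here is illustrative only; inspect the actual source
    you receive (it will differ by locale and over time)."""
    m = re.search(r"(?:Showing results for|Did you mean)\s*<[^>]*>([^<]+)", html)
    return m.group(1) if m else None
```

For a misspelled query like the sample URL above, the page would contain something shaped like `Showing results for <i>stackoverflow</i>`, from which this extracts the corrected term.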

Why does Googlebot crawl for /mobile/* and /m/* pages that are not referenced anywhere?

Since the end of May, I have a lot of new 404 errors on the Smartphone Crawl Errors page in Webmaster Tools / Google Search Console. All of them start with /m/ or /mobile/, and none of them exist or are linked to anywhere on the site.
For example, I have 404 errors for the http://www.example.com/mobile/foo-bar/ and http://www.example.com/m/foo-bar pages. According to the Search Console, those pages are linked from the existing page http://www.example.com/foo-bar/, but they are not.
Is Googlebot deciding on its own to look for a mobile version of every page? Can I disable this behavior? Is this because my site is not mobile-friendly yet (a problem for which I received another warning message from Google)?
As #Jonny 5 mentioned in a comment, this seems to happen because Google guesses that you may have a mobile version of your site in the /m and/or /mobile directories. From what I have read, they will only try those directories if they decide that the pages they initially indexed were not mobile-friendly/responsive. More info on this behavior can be found in these Google Product Forums threads:
https://productforums.google.com/forum/#!topic/webmasters/k3TFeCkFE0Q
https://productforums.google.com/forum/#!topic/webmasters/56CNFxZBFwE
Another helpful comment came from #user29671, who pointed out that your website does in fact have some URLs with /m and /mobile indexed. I found that the same was true for my website, so this behavior may also be limited to sites that Google has (for whatever reason) indexed a /m and/or /mobile URL for. To test if this is true for your site, go to the following URLs and replace example.com with your website's domain:
https://www.google.com/search?q=site:example.com/m/&filter=0
https://www.google.com/search?q=site:example.com/mobile/&filter=0
As far as preventing this goes, your best bet is either creating a mobile-friendly version of your site or redirecting /m and /mobile pages back to the originals.
You could block those directories in your robots.txt, but that's a bit of a workaround. The better option would be to figure out where exactly Googlebot is picking up those URLs from.
If you shared an example page URL where Google says you have links to the /mobile pages, I could look at it and figure out where that's being picked up.
And no, Google doesn't just invent directories to crawl on the off-chance that you might have snuck in a mobile page randomly :)
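For reference, the robots.txt workaround mentioned above would look roughly like this (a sketch; note that blocking crawling makes the 404 reports go away but also prevents Googlebot from ever seeing redirects you might add for those paths later):

```
User-agent: *
Disallow: /m/
Disallow: /mobile/
```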
I have been experiencing the same issue since December 2016. Googlebot constantly tries to crawl my website pages with the /m/ and /mobile/ prefixes.
All those URLs cause 404 errors and get listed in Google Webmaster Tools as errors.
An automatic email was received from GWT on January 2nd, 2017, stating:
Googlebot for smartphones identified a significant increase in the number of URLs on http://example.com that return a 404 (not found) error. If these pages exist on your desktop site, showing an error for mobile users can be a bad user experience. This misconfiguration can also prevent Google from showing the correct page in mobile search results. If these URLs don't exist, no action is necessary.
This is done by a mobile crawler:
IP: 66.249.65.124
Agent: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1)
Browser: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1)
You are not alone, therefore.
Take it easy. It's a Google bug :)
As for redirecting /m and /mobile pages back to the originals, here's a snippet for nginx:
location /m/ {
    rewrite ^/[^/]+(/.*)$ $1 permanent;
}
location /mobile/ {
    rewrite ^/[^/]+(/.*)$ $1 permanent;
}
One can also redirect everything to the root:
location /m/ {
    return 301 $scheme://$host/;
}
location /mobile/ {
    return 301 $scheme://$host/;
}

How to extract GAE quota/dashboard data?

I would like to centralize my app's performance monitoring, so I need to somehow extract the GAE dashboard or quota usage details. Is there any way to do it (for example, with a Google API)?
I do not think there is an API provided by Google to get the quota statistics for an App Engine application.
This is an open issue: https://code.google.com/p/googleappengine/issues/detail?id=655 and you can get some hints from the thread about what other developers have been trying.
Still waiting on an official version.
Khan academy uses curl:
https://github.com/Khan/analytics/tree/master/src/gae_dashboard
Probably against the terms of service, but I use mechanize to access the GAE dashboard (verified to work from a deployed app).
import re
import logging
import gaemechanize2._mechanize as mechanize

# Excerpt from a class method: self.the_email, self.the_password,
# self.the_phone and given_source are set elsewhere in the class.
self.br = mechanize.Browser()
# self.br.set_all_readonly(False)  # allow everything to be written to
self.br.set_handle_robots(False)   # ignore robots.txt
self.br.set_handle_refresh(False)  # can sometimes hang without this
self.br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
self.br.open(given_source)

# Fill in and submit the Google login form.
loginForm = self.br.forms().next()
loginForm["Email"] = self.the_email
loginForm["Passwd"] = self.the_password
response = self.br.open(loginForm.click())
self.the_response = response.read()

# Google may challenge the login; fall back to phone verification.
if re.search('Verify that it.*you', self.the_response):
    logging.info("BREAK: NOT YOU...trying again")
    loginForm = self.br.forms().next()
    loginForm.set_value(['PhoneVerificationChallenge'], name='challengetype')
    loginForm['phoneNumber'] = self.the_phone
    response1 = loginForm.click()
    response = self.br.open(response1)
    self.the_response = response.read()

How to detect that a request is originating from a Good Mobile browser

We have a requirement to redirect the request to a mobile version of the app if it originates from a mobile device. I'm using the existence of X-WAP-Profile in the header, and it seems to work with BlackBerry; however, when we try to test on the Good (Secure Mobile) Browser it doesn't work. It looks like the header is not present in this case. I'm accessing from an iPhone.
So there are two questions
What is a conclusive way of recognising that the request originates from the Good Browser?
Will this change based on the kind of device the mobile browser is used from, i.e. iPad/iPhone/Android etc.?
If there is a way to avoid the user-agent (given that it changes with the device/mobile OS type), I would prefer that method of detection.
Any pointers in this regard would help.
Ultimately, an HTTP request, including its headers, is text, and this text can be anything a piece of software wants to send. So I can easily have a mobile browser that reports itself as a desktop browser. What this means is that there is no absolute and conclusive way of recognizing anything about the source of a request. All you can reasonably do is trust the user-agent string and respond to as many different values as you can. If you're getting no value at all, you'll have to make a decision about which version of the app to serve by default.
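As a sketch of that "trust the user-agent and respond to as many values as you can" approach (Python; the header names come from the question, while the token list is an assumption -- in particular, I don't know the exact token the Good Browser sends, so "Good" here is a placeholder you would replace after inspecting a real request):

```python
# Assumed set of substrings that commonly appear in mobile user-agents.
MOBILE_UA_TOKENS = ("Mobile", "Android", "iPhone", "iPad", "BlackBerry", "Good")

def is_mobile_request(headers: dict) -> bool:
    """Best-effort mobile detection: header hints first, then UA tokens.
    Nothing here is conclusive -- a client can send any text it likes."""
    if "X-WAP-Profile" in headers or "Profile" in headers:
        return True
    ua = headers.get("User-Agent", "")
    return any(token in ua for token in MOBILE_UA_TOKENS)
```

In practice you would log the full headers of a request from the Good Browser on an iPhone and add whatever distinctive token it actually sends to the list.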
