I have created a site with a responsive layout, and it works well.
Apparently Google thinks the site isn't mobile friendly, and has listed a whole pile of resources that I notice are covered by these lines in the robots.txt file:
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
It looks like I need to allow access to some of these files/folders, including media, templates and plugins.
I am concerned that Google will then show administrator-type pages in its search results.
What should I do?
Is it OK to do this, and which ones should I allow?
Thanks
After some more rooting around, I made just the image, media and templates folders viewable to robots. Now my site is friends with Google.
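A quick way to sanity-check a change like this before deploying it is to run the edited rules through a parser. The sketch below uses Python's standard urllib.robotparser as a stand-in for a crawler (Google's own parser may resolve rules differently). It assumes the rules sit under a "User-agent: *" line, that the media and templates entries have been removed, and the sample paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed edited robots.txt: the original list minus /media/ and /templates/.
# The "User-agent: *" line is an assumption; it is not shown in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
"""


def allowed(path, agent="Googlebot"):
    """Return True if the rules above let `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise the rules.
    for path in ("/templates/mysite/css/template.css",
                 "/media/system/js/core.js",
                 "/administrator/index.php"):
        print(path, "allowed" if allowed(path) else "blocked")
```

With these rules the template and media files become fetchable while /administrator/ stays disallowed, which is the split the question is worried about.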
In robots.txt I can put:
#Baiduspider
User-agent: Baiduspider
Disallow: /
#Yandex
User-agent: Yandex
Disallow: /
to tell the search engines to stop crawling my app pages (PHP app). But how do I block them by IP in GAE?
There are two ways:
Do it in your code.
Use the DoS protection facilities: https://developers.google.com/appengine/docs/python/config/dos
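For the first option, here is a minimal sketch of what "do it in your code" could look like as WSGI middleware in Python 3. It is an illustration under assumptions, not App Engine's own API: the names is_blocked and block_ips are hypothetical, the CIDR ranges are reserved documentation networks rather than real crawler addresses, and it assumes REMOTE_ADDR carries the client's IP.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), NOT real crawler IPs.
# Replace them with the ranges you actually want to refuse.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(remote_addr):
    """Return True if remote_addr falls inside any blocked network."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Missing or malformed address: let the request through.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)


def block_ips(app):
    """Wrap a WSGI app so that blocked clients receive a 403 response."""
    def wrapper(environ, start_response):
        if is_blocked(environ.get("REMOTE_ADDR", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return wrapper
```

Wrapping the application once (for example application = block_ips(application)) applies the check to every request. Blocked requests still reach your code this way, whereas the DoS configuration rejects them before your app runs.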
A large site I am working with is getting 80K+ 404s a day from Google for garbage URLs. I can't figure out where they are coming from. Here is a sample of a few. These URIs exist nowhere in the site structure, so I am assuming they are being created by an external agent/site that is driving Googlebot to crawl them. Does anyone have any ideas?
7/2/2013 22:05 /Sl/4watQCXBFtF6obwFRA0f35148b 10262 404 - Not Found No Referrer Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
7/2/2013 22:05 /PvDIs6AveH9tju3tETtWg045cb22d 10261 404 - Not Found No Referrer Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
I have uploaded robots.txt to http://watchmariyaanmovieonline.appspot.com/robots.txt, but when I use Google Webmaster Tools and do Fetch as Google for my home page http://watchmariyaanmovieonline.appspot.com/, I get the error "Unreachable robots.txt".
Your robots.txt contains an empty Disallow line, which appears to be the cause of that error:
User-agent: *
Disallow:
Disallow: /cgi-bin/
Sitemap: http://watchmariyaanmovieonline.appspot.com/sitemap.xml
Update it to:
User-agent: *
Disallow: /cgi-bin/
Sitemap: http://watchmariyaanmovieonline.appspot.com/sitemap.xml
And it should work just fine. Let me know if this helps :)
Google gives the following example of setting up an Apache server to serve a GWT app.
What are the equivalent entries in an App Engine (GAE) app.yaml file?
<Files *.nocache.*>
ExpiresActive on
ExpiresDefault "now"
Header merge Cache-Control "public, max-age=0, must-revalidate"
</Files>
<Files *.cache.*>
ExpiresActive on
ExpiresDefault "now plus 1 year"
</Files>
See https://developers.google.com/appengine/docs/go/config/appconfig#Static_Cache_Expiration
The expiration time will be sent in the Cache-Control and Expires HTTP
response headers, and therefore, the files are likely to be cached by
the user's browser, as well as intermediate caching proxy servers such
as Internet Service Providers.
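If the GWT files are served by application code rather than by static handlers, the same policy as the Apache rules can be expressed directly. The following is a rough Python sketch, not part of App Engine: the function name cache_control_for is hypothetical, and "1 year" is approximated as 365 days.

```python
import fnmatch
import posixpath

# "now plus 1 year" from the Apache example, approximated as 365 days.
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def cache_control_for(path):
    """Return a Cache-Control value for a GWT output file, or None.

    Mirrors the two <Files> blocks: *.nocache.* must always be
    revalidated, while *.cache.* may be cached for a long time.
    """
    name = posixpath.basename(path)
    if fnmatch.fnmatchcase(name, "*.nocache.*"):
        return "public, max-age=0, must-revalidate"
    if fnmatch.fnmatchcase(name, "*.cache.*"):
        return "max-age=%d" % ONE_YEAR_SECONDS
    return None  # no special policy for other files
```

A handler would call this with the request path and, when the result is not None, set it as the Cache-Control response header. The .nocache. check comes first so the bootstrap script is never treated as long-lived.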
I'm trying to run the GWT 2.4 sample app "MobileWebApp". I get a 500 "No Realm" error when I try to run the app in dev mode through Eclipse.
I understand this is an authentication problem.
I'm not familiar with Google App Engine or Jetty, but from the web.xml I can see a servlet filter that uses the App Engine UserService, presumably to redirect the user to Google for authentication.
I'm using:
Eclipse 3.7 (Indigo SR1)
Google Plugin for Eclipse 2.4
m2eclipse
I'm including an excerpt from the web.xml below. I'm not sure what other info would be helpful in diagnosing this problem.
<security-constraint>
  <display-name>
    Redirect to the login page if needed before showing
    the host html page.
  </display-name>
  <web-resource-collection>
    <web-resource-name>Login required</web-resource-name>
    <url-pattern>/MobileWebApp.html</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>*</role-name>
  </auth-constraint>
</security-constraint>
<filter>
  <filter-name>GaeAuthFilter</filter-name>
  <!--
    This filter demonstrates making GAE authentication
    services visible to a RequestFactory client.
  -->
  <filter-class>com.google.gwt.sample.gaerequest.server.GaeAuthFilter</filter-class>
</filter>
<filter-mapping>
  <filter-name>GaeAuthFilter</filter-name>
  <url-pattern>/gwtRequest/*</url-pattern>
</filter-mapping>
Below is the output in the Eclipse console:
[WARN] Request /MobileWebApp.html failed - no realm
[ERROR] 500 - GET /MobileWebApp.html?gwt.codesvr=127.0.0.1:9997 (127.0.0.1) 1401 bytes
Request headers
Host: 127.0.0.1:8888
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: keep-alive
Response headers
Content-Type: text/html; charset=iso-8859-1
Content-Length: 1401
Many thanks for any helpful advice!
Edit on 11/11/11: I added the Jetty tag since it seems relevant to this problem.
If your very first request fails, just getting the /MobileWebApp.html page, then it probably isn't an authentication problem. Do you have GAE enabled for that project (not only GWT)? That might be one issue.
I read somewhere that there are two ways of debugging an app in Eclipse: one is with run as/webapp, and I forget the other one (I don't use Eclipse). One of them works and the other doesn't.
If that doesn't work, you can try replacing the built-in Jetty:
Add a GWT param: -server com.google.appengine.tools.development.gwt.AppEngineLauncher
Add a VM param: -javaagent:/path_to/appengine-agent.jar
The last option is -noserver, but then you won't be able to debug the server-side code, just the client-side GWT code: first start Jetty with mvn jetty:run, then debug in Eclipse with the -noserver GWT param.
I had the same problem. Finally I noticed that when I switched to a newer version of App Engine, the older App Engine libraries remained in WEB-INF/lib along with the new ones.
Removing them solved the problem.