Download Log from AppEngine Including Python Log Statements - google-app-engine

I know you can download the raw access logs with appcfg.py, but I'm really interested in all the information around a specific request, like Python logging statements, exceptions and API statistics (just like the online log viewer). Does anyone know if there is a way to get that information other than having to build it yourself?
In case anyone is wondering, we want to do some continuous statistical analysis to spot problems and display them on a large screen on a wall in the office.

Sure - just pass the --severity flag to appcfg.py:
$ appcfg.py help request_logs
Usage: appcfg.py [options] request_logs <directory> <output_file>
Write request logs in Apache common log format.
The 'request_logs' command exports the request logs from your application
to a file. It will write Apache common log format records ordered
chronologically. If output file is '-' stdout will be written.
Options:
-h, --help Show the help message and exit.
-q, --quiet Print errors only.
-v, --verbose Print info level logs.
--noisy Print all logs.
-s SERVER, --server=SERVER
The server to connect to.
--insecure Use HTTP when communicating with the server.
-e EMAIL, --email=EMAIL
The username to use. Will prompt if omitted.
-H HOST, --host=HOST Overrides the Host header sent with all RPCs.
--no_cookies Do not save authentication cookies to local disk.
--passin Read the login password from stdin.
-A APP_ID, --application=APP_ID
Override application from app.yaml file.
-V VERSION, --version=VERSION
Override (major) version from app.yaml file.
-n NUM_DAYS, --num_days=NUM_DAYS
Number of days worth of log data to get. The cut-off
point is midnight UTC. Use 0 to get all available
logs. Default is 1, unless --append is also given;
then the default is 0.
-a, --append Append to existing file.
--severity=SEVERITY Severity of app-level log messages to get. The range
is 0 (DEBUG) through 4 (CRITICAL). If omitted, only
request logs are returned.
--vhost=VHOST The virtual host of log messages to get. If omitted,
all log messages are returned.
--include_vhost Include virtual host in log messages.
--end_date=END_DATE End date (as YYYY-MM-DD) of period for log data.
Defaults to today.

This is what works for us really well:
appcfg.py --append --num_days=0 --include_all request_logs /path/to/your/app/ /var/log/gae/yourapp.log
The line above will fetch all your log records and append them to the log file if you've run this before; if not, it will create a new log file. It actually looks at your existing log (if it's there) and will not fetch any duplicates. You can run this without --append if you want, but use it if you are automating log downloads, as in the cron example below.
The key here is the --include_all flag, which seems to be undocumented. This flag will get all the data that you see if you use GAE's web log viewer. So you will get fields such as: ms=71 cpu_ms=32 api_cpu_ms=12 cpm_usd=0.000921... etc.
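If you want to automate it, a crontab entry along these lines works (the paths are just examples, reusing the command from above; the --append logic skips records that are already in the file):
# crontab entry: append the latest App Engine logs every night at 01:00
0 1 * * * appcfg.py --append --num_days=0 --include_all request_logs /path/to/your/app/ /var/log/gae/yourapp.log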
OK, I hope that helps someone.
BTW, we wrote up a blog post on this, check it out here.

I seem to be running into the 100M limit with appcfg, so I ended up using the logservice API to get the logs.
Here's the code - https://github.com/manasg/gae-log-fetcher

Here is a way to access the raw logs so you can do further processing without custom parsing (also, for me request_logs was not downloading all the data for the specified time frame).
Here is an app which runs on App Engine itself:
https://gaelogapp.appspot.com/
You can easily add this functionality to your own app by updating app.yaml and copying logs.py:
https://github.com/okigan/gaelogapp

Related

Logging in admin UI does not show entries after changing settings

I am trying to debug a dataimport and changed the log level of dataimport from ERROR to ALL in the Solr 7.7.1 admin UI. This does not have any effect; furthermore, the setting goes back to the original value after restarting and reindexing.
How can I enable INFO logging for dataimport?
Any changes made in the web UI for logging are temporary changes and can't be persisted:
You can control the amount of logging output in Solr by using the Admin Web interface. Select the LOGGING link. Note that this page only lets you change settings in the running system and is not saved for the next run.
The easiest way to change the logging level might be to change it when starting Solr, either through the SOLR_LOG_LEVEL environment variable, or through the -v parameter to bin/solr:
bin/solr start -f -v
This will start Solr with the DEBUG log level by default.
More detailed, permanent logging configuration is done through the standard Log4j2 syntax, which can be configured in server/resources/log4j2.xml.
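For example, a sketch of such a permanent change (the logger name below is the DataImportHandler package in Solr 7.x; adjust it if your version differs) is to add a logger to the <Loggers> section of server/resources/log4j2.xml so it survives restarts:
<!-- server/resources/log4j2.xml, inside the <Loggers> section -->
<Logger name="org.apache.solr.handler.dataimport" level="INFO"/>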

Getting "/dev/log" in the "gcloud app logs tail" stream

In the browser cloud console for my Google App Engine app, I can choose to see the logs for /dev/log and stderr which gives me all of the log entries that I'm expecting to see.
However, when I use the command line gcloud app logs tail to stream the logs in my terminal, I can't get it to give me the /dev/log entries.
The docs say the default list of logs include: stderr,stdout,crash.log,nginx.request,request_log
So the /dev/log must be represented by some other identifier, but I can't find any docs on what it might be. I've tried a few guesses, but none work.
How can I get the terminal to stream the same logs I'm getting in my browser?
You can use a command like gcloud logging read to interact with Stackdriver Logging and get a non-streamed version of those logs. Set up the Stackdriver GUI with the logs you wish to see, then convert it to an advanced filter. You can then paste the advanced filter as-is, in quotes, after gcloud logging read; there are examples in the gcloud logging read documentation. I will get back to you in a comment on this post as to whether you can get the /dev/log logs with the gcloud app logs tail command.
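For illustration, a read might look like this (the project ID and filter are placeholders; copy the exact advanced filter for /dev/log from the console rather than guessing the log name):
gcloud logging read 'resource.type="gae_app" AND resource.labels.module_id="default"' --project=your-project-id --limit=50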

How to fetch Google App Engine logs in minute chunks via appcfg.py?

My latest project expects 100,000 unique visitors a day and I started to use Google App Engine to be able to scale the infrastructure according to the load.
I would like to fetch log lines with appcfg.py request_logs every minute to get the latest data and add them to my monitoring dashboard utilizing LogStash/ElasticSearch/Kibana.
Is there a way to tell appcfg.py request_logs to load the log lines from an explicit time range, like one minute from 2014-10-10T10:00:00 to 2014-10-10T10:01:00?
I am a bit afraid that I have too many log lines and will not be able to retrieve all of them because of a limit on the number of lines appcfg.py request_logs can retrieve, that I may incur too much cost if I retrieve all log lines every minute, etc.
You can only specify the range in whole days:
-n NUM_DAYS, --num_days=NUM_DAYS
Number of days worth of log data to get. The cut-off
point is midnight US/Pacific. Use 0 to get all
available logs. Default is 1, unless --append is also
given; then the default is 0.
I haven't tried this myself, but you could use the Logs API to generate a webpage inside your app and then read that page from a local script. Make sure to set the login to admin-only to prevent others from reading it.
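For what it's worth, the Logs API itself takes explicit start and end timestamps, so such an admin-only page (or a script run over remote_api) can serve exactly one minute of logs at a time. A rough sketch, with the helper name and output format made up for the example:
import calendar
import datetime
from google.appengine.api import logservice

def fetch_minute(start_dt):
    """Yield log lines for the one-minute window starting at start_dt (UTC)."""
    start = calendar.timegm(start_dt.timetuple())   # seconds since the epoch
    end = start + 60
    for req in logservice.fetch(start_time=start, end_time=end,
                                include_app_logs=True):
        yield req.combined                           # Apache-style request line
        for line in req.app_logs:                    # application log messages
            yield '  %s %s' % (line.level, line.message)

# e.g. the window 2014-10-10T10:00:00Z to 10:01:00Z:
# for line in fetch_minute(datetime.datetime(2014, 10, 10, 10, 0)):
#     print line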

File is not being created on Heroku using CakePHP

I tried to create a file on Heroku using PHP code:
$fh = fopen("../DownloadFiles/".$filename,'a');
fwrite($fh,$s);
but the file has not been created and it is not showing any error. Please help.
This should work just fine, but are you aware that if you're running multiple dynos, that file will exist only on the dyno that served that one request, and not on all the others?
Also, Dynos restart every 24 hours, and their state is reset every time you push a change to Heroku, so you cannot store persistent information on them; that's called an ephemeral filesystem.
You could, for instance, store uploaded files on Amazon S3, like described in the docs: https://devcenter.heroku.com/articles/s3-upload-php
Two remarks about your original issue:
you're probably running an old version of CakePHP which mangles all internal PHP error handling and writes it out to a log file (if you're lucky), so you can't see anything in heroku logs, and it's not possible to configure it otherwise; upgrade to a more recent version that lets you log to streams and then use php://stderr as the destination
in general, if you want to write to a file in PHP, you can just do file_put_contents($filename, $contents)...
Does the DownloadFiles folder exist in the deployment? Node's fs gives an error if the directory is not found. You can add a snippet to check whether the directory exists and, if not, create it; you can use fs.exists and fs.mkdir.
For more info http://nodejs.org/api/fs.html

App Engine: Development datastore cleared each time I turn off my computer. How to avoid this?

I've been using App Engine with Python for a few months. Now that my application has a fair amount of code, I'm trying to solve a problem I've ignored so far:
Each time I turn off my computer, all my development datastore entities are removed.
I would like to keep this data until the next time I launch my development server. But I would also like to be able to turn off my computer without losing all of this data.
How should I proceed?
Thanks a lot
======== UPDATE ==========
When I set the datastore_path flag as explained by @moishe, my development server crashes as soon as it must write into the datastore.
File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/datastore_file_stub.py", line 557, in __WritePickled
os.rename(tmp_filename, filename)
OSError: [Errno 13] Permission denied
Therefore, I gave this folder write permission for all users:
chmod a+w /my_app_folder
But now I have another error, which is:
OSError: [Errno 21] Is a directory
Obviously the path should not be a directory. So I changed the path to:
/my_app_folder/data.datastore
And now it works! PFF...
Maybe the default datastore path is in a /tmp directory that's being deleted on shutdown? You can manually set the path with the --datastore_path flag when running dev_appserver.py. See the docs for details.
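For example (the path is only an illustration; as noted in the update above, it has to point at a file, not a directory):
dev_appserver.py --datastore_path=/my_app_folder/data.datastore /path/to/my_app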
This clearing should not be the default behavior.
Check that this application in the Google AppEngine launcher doesn't have the --clear_datastore flag.
Select app in list and select Edit->Applications Settings...
Extra Command Line Flags should be empty.
I once set this to restart some tests and forgot to remove it.
Remove the existing application in the launcher and Create New Application. See if that helps.
Verify the OS isn't deleting the file. If you open the log for the app and then launch it, the output says where the sqlite file is located (e.g. T:\temp\dev_appserver.rdbms)
You can also pass the --storage_path flag when starting the dev server:
--storage_path=...
Path at which all local files (such as the Datastore, Blobstore files,
Google Cloud Storage Files, logs, etc) will be stored, unless
overridden by --datastore_path, --blobstore_path, --logs_path, etc.
found at https://developers.google.com/appengine/docs/python/tools/devserver?csw=1
I had the same problem, and installing the latest gae SDK solved it.
As in the case here: app engine datastore auto-clears every time project runs
