How to log Stackdriver log messages correlated by trace id using stdout in Go 1.11 - google-app-engine

I'm using Google App Engine Standard Environment with the Go 1.11 runtime. The documentation for Go 1.11 says "Write your application logs using stdout for output and stderr for errors". The migration from Go 1.9 guide also suggests not calling the Google Cloud Logging library directly but instead logging via stdout.
https://cloud.google.com/appengine/docs/standard/go111/writing-application-logs
With this in mind, I've written a small HTTP Service (code below) to experiment logging to Stackdriver using JSON output to stdout.
When I print plain text messages they appear as expected in the Logs Viewer panel under textPayload. When I pass a JSON string they appear under jsonPayload. So far, so good.
So, I added a severity field to the output string and Stackdriver Log Viewer successfully categorizes the message according to the levelled logging NOTICE, WARNING etc.
https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry
The docs say to set the trace identifier to correlate log entries with the originating request log. The trace ID is extracted from the X-Cloud-Trace-Context header set by the container.
Simulate it locally using curl -v -H 'X-Cloud-Trace-Context: 1ad1e4f50427b51eadc9b36064d40cc2/8196282844182683029;o=1' http://localhost:8080/
However, this does not cause the messages to be threaded by request, but instead the trace property appears in the jsonPayload object in the logs. (See below).
Notice that severity has been interpreted as expected and does not appear in the jsonPayload. I had expected the same to happen for trace, but instead it appears to be unprocessed.
How can I achieve nested messages within the original request log message? (This must be done using stdout on Go 1.11 as I do not wish to log directly with the Google Cloud logging package).
What exactly is GAE doing to parse the stdout stream from my running process? (In the setup guide for VMs on GCE there is something about installing an agent program to act as a conduit to Stackdriver logging; is this what GAE has installed?)
app.yaml file looks like this:
runtime: go111
handlers:
- url: /.*
  script: auto
package main

import (
    "fmt"
    "log"
    "net/http"
    "os"
    "strings"
)

var projectID = "glowing-market-234808"

// parseXCloudTraceContext splits a header of the form
// "TRACE_ID/SPAN_ID;o=TRACE_TRUE". It returns empty strings if the
// header is missing or malformed (e.g. a request sent without it),
// rather than panicking on the out-of-range slice indexes.
func parseXCloudTraceContext(t string) (traceID, spanID, traceTrue string) {
    slash := strings.Index(t, "/")
    semi := strings.Index(t, ";")
    equal := strings.Index(t, "=")
    if slash == -1 || semi == -1 || equal == -1 || slash > semi {
        return "", "", ""
    }
    return t[0:slash], t[slash+1 : semi], t[equal+1:]
}

func sayHello(w http.ResponseWriter, r *http.Request) {
    xTrace := r.Header.Get("X-Cloud-Trace-Context")
    traceID, spanID, _ := parseXCloudTraceContext(xTrace)
    trace := fmt.Sprintf("projects/%s/traces/%s", projectID, traceID)
    warning := fmt.Sprintf(`{"trace":"%s","spanId":"%s", "severity":"WARNING","name":"Andy","age":45}`, trace, spanID)
    fmt.Fprintf(os.Stdout, "%s\n", warning)
    message := "Hello"
    w.Write([]byte(message))
}

func main() {
    http.HandleFunc("/", sayHello)
    port := os.Getenv("PORT")
    if port == "" {
        port = "8080"
    }
    log.Fatal(http.ListenAndServe(fmt.Sprintf(":%s", port), nil))
}
Output shown in the Log viewer:
...,
jsonPayload: {
  age: 45
  name: "Andy"
  spanId: "13076979521824861986"
  trace: "projects/glowing-market-234808/traces/e880a38fb5f726216f94548a76a6e474"
},
severity: "WARNING",
...

I have solved this by adjusting the program to use logging.googleapis.com/trace in place of trace and logging.googleapis.com/spanId in place of spanId.
warning := fmt.Sprintf(`{"logging.googleapis.com/trace":"%s","logging.googleapis.com/spanId":"%s", "severity":"WARNING","name":"Andy","age":45}`, trace, spanID)
fmt.Fprintf(os.Stdout, "%s\n", warning)
It seems that GAE is using the logging agent google-fluentd (a modified version of the fluentd log data collector).
See this link for a full explanation.
https://cloud.google.com/logging/docs/agent/configuration#special-fields
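Building the JSON by hand with fmt.Sprintf works, but it breaks as soon as a value contains a quote. Here is a minimal sketch (my own variant, not part of the original program; it assumes the imports from the program above plus encoding/json) that marshals the same entry instead, so the special fields stay correctly escaped:

type logEntry struct {
    Message  string `json:"message"`
    Severity string `json:"severity"`
    Trace    string `json:"logging.googleapis.com/trace,omitempty"`
    SpanID   string `json:"logging.googleapis.com/spanId,omitempty"`
}

// logJSON writes one structured entry per line to stdout; the logging
// agent promotes severity, trace and spanId out of jsonPayload.
func logJSON(e logEntry) {
    b, err := json.Marshal(e)
    if err != nil {
        log.Printf("logJSON: %v", err)
        return
    }
    fmt.Fprintln(os.Stdout, string(b))
}

The handler would then call, e.g., logJSON(logEntry{Message: "Hello", Severity: "WARNING", Trace: trace, SpanID: spanID}).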
[update] June 25th, 2019: I've written a Logrus plugin that will help to thread log entries by HTTP request. It's available on GitHub at https://github.com/andyfusniak/stackdriver-gae-logrus-plugin.
[update] April 3rd, 2020: I've since switched to using Cloud Run, and the Logrus plugin appears to work fine with this platform also.

Related

Why does dev_appserver.py exit without throwing errors?

I am trying to run simple Python 2 server code with AppEngine and Datastore. When I run dev_appserver.py app.yaml, the program immediately exits (without an error) after the following output:
/home/username/google-cloud-sdk/lib/third_party/google/auth/crypt/_cryptography_rsa.py:22: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in the next release.
import cryptography.exceptions
INFO 2022-12-20 11:59:41,931 devappserver2.py:239] Using Cloud Datastore Emulator.
We are gradually rolling out the emulator as the default datastore implementation of dev_appserver.
If broken, you can temporarily disable it by --support_datastore_emulator=False
Read the documentation: https://cloud.google.com/appengine/docs/standard/python/tools/migrate-cloud-datastore-emulator
INFO 2022-12-20 11:59:41,936 devappserver2.py:316] Skipping SDK update check.
INFO 2022-12-20 11:59:42,332 datastore_emulator.py:156] Starting Cloud Datastore emulator at: http://localhost:22325
INFO 2022-12-20 11:59:42,981 datastore_emulator.py:162] Cloud Datastore emulator responded after 0.648865 seconds
INFO 2022-12-20 11:59:42,982 <string>:384] Starting API server at: http://localhost:38915
Ideally, it should have continued by running the server on port 8000. Also, it works with the option --support_datastore_emulator=False.
This is the code:
import webapp2
import datetime
from google.appengine.ext import db, deferred, ndb
import uuid
from base64 import b64decode, b64encode
import logging

class Email(ndb.Model):
    email = ndb.StringProperty()

class DB(webapp2.RequestHandler):
    def post(self):
        try:
            mail = Email()
            mail.email = 'Test'
            mail.put()
        except Exception as e:
            print(e)
            self.response.headers['Content-Type'] = 'application/json; charset=utf-8'
            return self.response.out.write(e)

    def get(self):
        try:
            e1 = Email.query()
            logging.critical('count is: %s' % e1.count)
            e1k = e1.get(keys_only=True)
            logging.critical('count 2 is: %s' % e1k.count)
            e1 = e1.get()
            key = unicode(e1.key.urlsafe())
            logging.critical('This is a critical message: %s' % key)
            logging.critical('This is a critical message: %s' % e1k)
            e2 = ndb.Key(urlsafe=key).get()
            self.response.headers['Content-Type'] = 'application/json; charset=utf-8'
            return self.response.out.write(str(e2.email))
        except Exception as e:
            print(e)
            self.response.headers['Content-Type'] = 'application/json; charset=utf-8'
            return self.response.out.write(e)

app = webapp2.WSGIApplication([
    ('/', DB)
], debug=True)
How can I find the reason this is not working?
Edit: I figured out that the dev server works and writes to a datastore even with the --support_datastore_emulator=False option. I am confused by this option. I also don't know where the database is currently stored.
It should be count() and not count, i.e.
logging.critical('count is: %s' % e1.count())
A get returns only one record, so it doesn't make sense to do a count after calling get. Besides, the count operation is a method of the query instance, not the results. This means the following code is incorrect:
e1k = e1.get(keys_only=True)
logging.critical('count 2 is: %s' % e1k.count)
You should replace it with:
e1k = e1.fetch(keys_only=True)  # fetch gives an array
logging.critical('count 2 is: %s' % len(e1k))
When you first run your App, it will execute the GET part of your code, and because this is the first time your App is being run, you have no records in Datastore. This means e1 = e1.get() will return None and key = unicode(e1.key.urlsafe()) will lead to an error.
You have to modify your code to first check that you have a value for e1 or e2 before you attempt to use their keys, as in the sketch below.
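A minimal sketch of that guard (my illustration, not from the original answer), inside the handler's get():

def get(self):
    e1 = Email.query().get()
    if e1 is None:
        # First run: no Email entities exist yet, so respond gracefully
        # instead of crashing on e1.key below.
        self.response.out.write('no records yet')
        return
    key = unicode(e1.key.urlsafe())
    self.response.out.write(key)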
I ran your code with dev_appserver.py and it displayed these errors for me in the logs. But I ran it with an older version of the gcloud SDK (Google Cloud SDK 367.0.0). I don't know why yours exited without displaying any errors; maybe it's due to the version?
Separately: I don't know why you're importing db. Google moved on to ndb long ago, and you don't use db in your code.
The default datastore (for the older generation runtimes like Python 2) is in .config (hidden folder) > gcloud > emulators > datastore.
You can also specify your own location by using the flag --datastore_path; see the documentation.
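For example (a sketch; the path here is arbitrary):

dev_appserver.py --datastore_path=/tmp/my_datastore app.yaml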

Google AppEngine application log assigned to the wrong request log

When I look at the logs in the Google Log Viewer for my GAE project, I see that often the logs that I write myself in the code are assigned to the wrong request. Most of the time the log is assigned to the request directly after the request that produced the log entry.
As the root of every application log in GAE must be a request, this means that the wrong request is sometimes marked as error, because another request before produced an error, but the log is somehow assigned to the request after that.
I don't really do anything special: I use Ktor as my servlet and have an interceptor that creates a log entry when an exception occurs before returning status 500.
I use Java logging via SLF4J with the Google Cloud Logging handler, but before that I used Logback via SLF4J and had the same problem.
The content of the logs itself is also correct, the returned status of the request, the level of the log entry, the message, everything is ok.
I thought that it may be because I use Kotlin and switch coroutine contexts during a single request, but in some cases the point where I write the log and where I send the response are exactly next to each other, so I'm not sure whether Kotlin has anything to do with it.
My logging.properties:
# To use this configuration, add to system properties : -Djava.util.logging.config.file="/path/to/file"
#
.level = INFO
# it is recommended that io.grpc and sun.net logging level is kept at INFO level,
# as both these packages are used by Stackdriver internals and can result in verbose / initialization problems.
io.grpc.netty.level=INFO
sun.net.level=INFO
handlers=com.google.cloud.logging.LoggingHandler
# default : java.log
com.google.cloud.logging.LoggingHandler.log=custom_log
# default : INFO
com.google.cloud.logging.LoggingHandler.level=INFO
# default : ERROR
com.google.cloud.logging.LoggingHandler.flushLevel=WARNING
# default : auto-detected, fallback "global"
#com.google.cloud.logging.LoggingHandler.resourceType=container
# custom formatter
com.google.cloud.logging.LoggingHandler.formatter=java.util.logging.SimpleFormatter
java.util.logging.SimpleFormatter.format=%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS %4$-6s %2$s %5$s%6$s%n
#optional enhancers (to add additional fields, labels)
#com.google.cloud.logging.LoggingHandler.enhancers=com.example.logging.jul.enhancers.ExampleEnhancer
My logging-relevant dependencies:
implementation "org.slf4j:slf4j-jdk14:1.7.30"
implementation "com.google.cloud:google-cloud-logging:1.100.0"
An example logging call:
exception<Throwable> { e ->
    logger().error("Error", e)
    call.respondText(e.message ?: "", ContentType.Text.Plain, HttpStatusCode.InternalServerError)
}
with logger() being:
import org.slf4j.Logger
import org.slf4j.LoggerFactory
inline fun <reified T : Any> T.logger(): Logger = LoggerFactory.getLogger(T::class.java)
Edit:
An example of the log in Google Cloud. The first request has the query parameter GAID=cdda802e-fb9c-47ad-0794d394c913, but as you can see the error log for that request is in the one below, marked in red.

Non-interactive auto-refresh stale OAuth Token with Googlesheets package

I'm trying to automatically run an R script to download a private Google Sheet every hour. It always works fine when I'm using R interactively. It also works fine during the first hour after I automate the script with launchd.
It stops working an hour after I start automating it with launchd. I think the problem is that after one hour the access token changes, and the non-interactive version isn’t waiting for the auto refreshing of the OAuth token. Here is the error that I get from the error report:
Auto-refreshing stale OAuth token.
Error in gzfile(file, mode) : cannot open the connection
Calls: gs_auth ... -> -> cache_token -> saveRDS -> gzfile
In addition: Warning message:
In gzfile(file, mode) :
cannot open compressed file '.httr-oauth', probable reason 'Permission denied'
Execution halted
I'm using Jenny Bryan's googlesheets package. Here is the code that I initially use to register the sheet, and then save the OAuth token:
gToken <- gs_auth() # Run this the first time to get the oAuth information
saveRDS(gToken, "/Users/…/gToken.rds") # Save the oAuth information for non-interactive use
I then use the following script in the file that I automate with launchd:
gs_auth(token = "/Users/…/gToken.rds")
How can I avoid this error when running the script automatically with launchd?
I don't know about launchd, but I had the same problem when I wanted to run an R script automatically from the Windows Task Scheduler. Changing the 'cache' attribute value to FALSE did the trick for me (screenshot: https://i.stack.imgur.com/pprlC.png).
You can find the solution here: https://github.com/jennybc/googlesheets/issues/262
To authenticate once in the browser in order to get a token file, I did this:
token_file <- gs_auth(new_user = TRUE, cache = FALSE)
saveRDS(token_file, "googlesheets_token.rds")
Automatic login afterwards via:
gs_auth(token = paste0(path_scripts, "googlesheets_token.rds"),
        verbose = TRUE, cache = FALSE)

How to catch BigQuery loading errors from an AppEngine pipeline

I have built a pipeline on AppEngine that loads data from Cloud Storage to BigQuery. This works fine... until there is an error. How can I catch loading exceptions raised by BigQuery from my AppEngine code?
The code in the pipeline looks like this:
#Run the job
credentials = AppAssertionCredentials(scope=SCOPE)
http = credentials.authorize(httplib2.Http())
bigquery_service = build("bigquery", "v2", http=http)
jobCollection = bigquery_service.jobs()
result = jobCollection.insert(projectId=PROJECT_ID,
                              body=build_job_data(table_name, cloud_storage_files))

#Get the status
while (not allDone and not runtime.is_shutting_down()):
    try:
        job = jobCollection.get(projectId=PROJECT_ID,
                                jobId=insertResponse).execute()
        #Do something with job.get('status')
    except:
        exc_type, exc_value, exc_traceback = sys.exc_info()
        logging.error(traceback.format_exception(exc_type, exc_value, exc_traceback))
        time.sleep(30)
This gives me status errors or major connectivity errors, but what I am looking for is functional errors from BigQuery, like field format conversion errors, schema structure issues, or other issues BigQuery may have while trying to insert rows into tables.
If any "functional" error on BigQuery's side happens, this code will run successfully and complete normally, but no table will be written in BigQuery. Not easy to debug when this happens...
You can use the HTTP error code from the exception. BigQuery is a REST API, so the response codes that are returned match standard HTTP error codes.
Here is some code that handles retryable errors (connection, rate limit, etc.), but re-raises when it is an error type that it doesn't expect:
except HttpError, err:
    # If the error is a rate limit or connection error, wait and
    # try again.
    # 403: Forbidden: Both access denied and rate limits.
    # 408: Timeout
    # 500: Internal Service Error
    # 503: Service Unavailable
    if err.resp.status in [403, 408, 500, 503]:
        print '%s: Retryable error %s, waiting' % (
            self.thread_id, err.resp.status,)
        time.sleep(5)
    else:
        raise
If you want even better error handling, check out the BigqueryError class in the bq command-line client. (This used to be available on code.google.com, but with the recent switch to gcloud it isn't any more; if you have gcloud installed, though, the bq.py and bigquery_client.py files should be in the installation.)
The key here is this part of the pasted code:
except:
    exc_type, exc_value, exc_traceback = sys.exc_info()
    logging.error(traceback.format_exception(exc_type, exc_value, exc_traceback))
    time.sleep(30)
This "except" is catching every exception, logging it, and letting the process continue without any consideration for re-trying.
The question is, what would you like to do instead? At least the intention is there with the "#Do something" comment.
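For load jobs specifically, the "functional" failures the question asks about surface in the job resource once the job completes, so the "#Do something with job.get('status')" step could look roughly like this (a sketch; job_id stands for the id from the insert response, and the status field names are those of the BigQuery v2 jobs.get response):

job = jobCollection.get(projectId=PROJECT_ID, jobId=job_id).execute()
status = job['status']
if status['state'] == 'DONE':
    if status.get('errorResult'):
        # Field-format and schema problems land here even though
        # every HTTP call returned 200.
        logging.error('Load failed: %s' % status['errorResult'])
        for e in status.get('errors', []):
            logging.error('Detail: %s' % e)
    else:
        logging.info('Load succeeded')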
As a suggestion, consider App Engine's task queues to check the status, instead of a loop with a 30-second wait. When tasks raise an exception, they are automatically retried, and you can tune that behavior; a sketch follows.
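A rough sketch of that task-queue approach with the deferred library (check_job and handle_result are hypothetical names; jobCollection and PROJECT_ID are as in the question, though real code would rebuild the client inside the task):

from google.appengine.ext import deferred

def check_job(job_id):
    job = jobCollection.get(projectId=PROJECT_ID, jobId=job_id).execute()
    if job['status']['state'] != 'DONE':
        # Not finished yet: re-enqueue this check instead of sleeping in-process.
        deferred.defer(check_job, job_id, _countdown=30)
        return
    handle_result(job)  # an uncaught exception here makes App Engine retry the task

# Enqueue the first check right after inserting the load job
# (job_id taken from the insert response):
deferred.defer(check_job, job_id, _countdown=30)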

How can I get AppEngine to log info level only for my app?

So I've tried configuring AppEngine logging according to this guide, ensuring I've configured the logging.properties file to be used in web.xml. I've configured logging.properties the following way:
.level = WARNING
nilsnett.chinese.backend.level = INFO
The package name of my logging wrapper is nilsnett.chinese.backend. The problem is that even with this configuration, info-level log output from my app is filtered, as the Logs Viewer screenshot shows.
I've also tried the following config, which yielded the same result (including the logger class name at the end of the package name):
.level = WARNING
nilsnett.chinese.backend.JavaUtilLogger.level = INFO
To demonstrate that the logging.properties file is actually read, and that I actually do write info-level logging data to App Engine in this service call, let me show you what happens when I set .level = INFO:
So my desired result is to have INFO and higher-level log output from my packages, while other packages, like org.datanucleus, only show output if WARNING or more severe. In the example above, I want only the two lines marked with the purple star. Am I doing anything wrong?
Change your config to:
.level = WARNING
# Set the default logging level for the datanucleus loggers
DataNucleus.JDO.level=WARNING
DataNucleus.Persistence.level=WARNING
DataNucleus.Cache.level=WARNING
DataNucleus.MetaData.level=WARNING
DataNucleus.General.level=WARNING
DataNucleus.Utility.level=WARNING
DataNucleus.Transaction.level=WARNING
DataNucleus.Datastore.level=WARNING
DataNucleus.ClassLoading.level=WARNING
DataNucleus.Plugin.level=WARNING
DataNucleus.ValueGeneration.level=WARNING
DataNucleus.Enhancer.level=WARNING
DataNucleus.SchemaTool.level=WARNING
# FinalizableReferenceQueue tries to spin up a thread and fails. This
# is inconsequential, so don't scare the user.
com.google.common.base.FinalizableReferenceQueue.level=WARNING
com.google.appengine.repackaged.com.google.common.base.FinalizableReferenceQueue.level=WARNING
These settings come from the logging config template, so to set DataNucleus to WARNING you have to do it as shown in the template.
https://developers.google.com/appengine/docs/java/#Logging
and then just add your own logging config:
nilsnett.chinese.backend.level = INFO
This should solve it.
