How to catch BigQuery loading errors from an AppEngine pipeline - google-app-engine

I have built a pipeline on AppEngine that loads data from Cloud Storage to BigQuery. This works fine, until there is an error. How can I catch loading exceptions raised by BigQuery from my AppEngine code?
The code in the pipeline looks like this:
#Run the job
credentials = AppAssertionCredentials(scope=SCOPE)
http = credentials.authorize(httplib2.Http())
bigquery_service = build("bigquery", "v2", http=http)
jobCollection = bigquery_service.jobs()
result = jobCollection.insert(projectId=PROJECT_ID,
                              body=build_job_data(table_name, cloud_storage_files)).execute()

#Get the status
while (not allDone and not runtime.is_shutting_down()):
    try:
        job = jobCollection.get(projectId=PROJECT_ID,
                                jobId=insertResponse).execute()
        #Do something with job.get('status')
    except:
        exc_type, exc_value, exc_traceback = sys.exc_info()
        logging.error(traceback.format_exception(exc_type, exc_value, exc_traceback))
    time.sleep(30)
This gives me status errors or major connectivity errors, but what I am looking for is functional errors from BigQuery, like field format conversion errors, schema structure issues, or other issues BigQuery may have while trying to insert rows into tables.
If any "functional" error happens on BigQuery's side, this code runs successfully and completes normally, but no table is written in BigQuery. Not easy to debug when that happens...

You can use the HTTP error code from the exception. BigQuery is a REST API, so the response codes that are returned follow standard HTTP error code semantics.
Here is some code that handles retryable errors (connection, rate limit, etc.), but re-raises when the error is of a type it doesn't expect.
except HttpError, err:
    # If the error is a rate limit or connection error, wait and
    # try again.
    # 403: Forbidden: Both access denied and rate limits.
    # 408: Timeout
    # 500: Internal Service Error
    # 503: Service Unavailable
    if err.resp.status in [403, 408, 500, 503]:
        print '%s: Retryable error %s, waiting' % (
            self.thread_id, err.resp.status,)
        time.sleep(5)
    else:
        raise
If you want even better error handling, check out the BigqueryError class in the bq command-line client (this used to be available on code.google.com, but with the recent switch to gCloud it isn't any more; if you have gcloud installed, though, the bq.py and bigquery_client.py files should be in the installation).
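Short of pulling in that class, the structured reason can also be read straight out of the HttpError itself: the response body is JSON in the standard Google API error envelope. A minimal sketch (the log_bigquery_error helper name is made up, not part of any library):
import json
import logging


def log_bigquery_error(err):
    """Log reason/message pairs from an apiclient.errors.HttpError raised by BigQuery.

    The error body is JSON in the standard Google API error envelope:
    {"error": {"errors": [{"reason": ..., "message": ...}, ...], ...}}
    """
    try:
        body = json.loads(err.content)
    except ValueError:
        # Body was not JSON; fall back to the raw response.
        logging.error("BigQuery HTTP %s: %s", err.resp.status, err.content)
        return
    for item in body.get('error', {}).get('errors', []):
        logging.error("BigQuery error: reason=%s message=%s",
                      item.get('reason'), item.get('message'))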

The key here is this part of the pasted code:
except:
    exc_type, exc_value, exc_traceback = sys.exc_info()
    logging.error(traceback.format_exception(exc_type, exc_value, exc_traceback))
time.sleep(30)
This "except" is catching every exception, logging it, and letting the process continue without any consideration for re-trying.
The question is, what would you like to do instead? At least the intention is there with the "#Do something" comment.
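As an illustration of what that "#Do something" could look like, here is a minimal sketch (not the asker's actual pipeline; the check_load_job helper name is made up) that reads the status object returned by jobs().get(). BigQuery reports load failures such as schema mismatches or field format problems under status.errorResult and status.errors, which is where the "functional" errors from the question show up:
import logging


def check_load_job(job):
    """Inspect a jobs().get() response and log BigQuery's own load errors.

    Returns True when the job has finished (successfully or not).
    """
    status = job.get('status', {})
    if status.get('state') != 'DONE':
        return False  # still PENDING or RUNNING, keep polling

    if 'errorResult' in status:
        # The fatal error that made the load fail.
        logging.error("Load job failed: %s", status['errorResult'])
        # 'errors' lists every individual problem BigQuery hit while loading.
        for err in status.get('errors', []):
            logging.error("Load error detail: %s", err)
    else:
        logging.info("Load job completed successfully")
    return True
With something like this in place, a load that silently produces no table at least leaves the reason in the App Engine logs.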
As a suggestion, consider App Engine's task queues to check the status, instead of a loop with a 30 second wait. When tasks get an exception, they are automatically retried - and you can tune that behavior.
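A rough sketch of that idea using the deferred library (the poll_bigquery_job name and the SCOPE/PROJECT_ID values are placeholders, and the deferred builtin has to be enabled in app.yaml):
import logging

import httplib2
from apiclient.discovery import build
from google.appengine.ext import deferred
from oauth2client.appengine import AppAssertionCredentials

SCOPE = 'https://www.googleapis.com/auth/bigquery'
PROJECT_ID = 'my-project-id'  # placeholder


def poll_bigquery_job(job_id):
    """Deferred task: check a load job and re-schedule itself until it is DONE."""
    credentials = AppAssertionCredentials(scope=SCOPE)
    http = credentials.authorize(httplib2.Http())
    jobs = build("bigquery", "v2", http=http).jobs()

    status = jobs.get(projectId=PROJECT_ID, jobId=job_id).execute().get('status', {})

    if status.get('state') != 'DONE':
        # Not finished yet: check again in 30 seconds without tying up a request.
        deferred.defer(poll_bigquery_job, job_id, _countdown=30)
        return

    if 'errorResult' in status:
        # A permanently failed load: log the reason BigQuery reports.
        logging.error("BigQuery load failed: %s", status['errorResult'])
    else:
        logging.info("BigQuery load job %s completed", job_id)


# Kick it off right after jobs().insert(...).execute(), e.g.:
# deferred.defer(poll_bigquery_job, insertResponse, _countdown=30)
Any connection or HTTP error raised inside the task simply propagates, so the task queue retries it automatically with whatever backoff you configure.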

Related

Google AppEngine application log assigned to the wrong request log

When I look at the logs in the Google Log Viewer for my GAE project, I see that often the logs that I write myself in the code are assigned to the wrong request. Most of the time the log is assigned to the request directly after the request that produced the log entry.
Since the root of every application log in GAE must be a request, this means that the wrong request is sometimes marked as an error: another request before it produced the error, but the log entry is somehow assigned to the request after that.
I don't really do anything special, I use Ktor as my servlet and have an interceptor that creates a log when an exception occurs before returning status 500.
I use Java logging via SLF4J with the Google Cloud logging handler, but before that I used Logback via SLF4J and had the same problem.
The content of the logs itself is also correct, the returned status of the request, the level of the log entry, the message, everything is ok.
I thought it might be because I use Kotlin and switch coroutine contexts during a single request, but in some cases the point where I write the log and the point where I send the response are right next to each other, so I'm not sure Kotlin has anything to do with it.
My logging.properties:
# To use this configuration, add to system properties : -Djava.util.logging.config.file="/path/to/file"
#
.level = INFO
# it is recommended that io.grpc and sun.net logging level is kept at INFO level,
# as both these packages are used by Stackdriver internals and can result in verbose / initialization problems.
io.grpc.netty.level=INFO
sun.net.level=INFO
handlers=com.google.cloud.logging.LoggingHandler
# default : java.log
com.google.cloud.logging.LoggingHandler.log=custom_log
# default : INFO
com.google.cloud.logging.LoggingHandler.level=INFO
# default : ERROR
com.google.cloud.logging.LoggingHandler.flushLevel=WARNING
# default : auto-detected, fallback "global"
#com.google.cloud.logging.LoggingHandler.resourceType=container
# custom formatter
com.google.cloud.logging.LoggingHandler.formatter=java.util.logging.SimpleFormatter
java.util.logging.SimpleFormatter.format=%1$tY-%1$tm-%1$td %1$tH:%1$tM:%1$tS %4$-6s %2$s %5$s%6$s%n
#optional enhancers (to add additional fields, labels)
#com.google.cloud.logging.LoggingHandler.enhancers=com.example.logging.jul.enhancers.ExampleEnhancer
My logging relevant dependencies:
implementation "org.slf4j:slf4j-jdk14:1.7.30"
implementation "com.google.cloud:google-cloud-logging:1.100.0"
An example logging call:
exception<Throwable> { e ->
    logger().error("Error", e)
    call.respondText(e.message ?: "", ContentType.Text.Plain, HttpStatusCode.InternalServerError)
}
with logger() being:
import org.slf4j.Logger
import org.slf4j.LoggerFactory
inline fun <reified T : Any> T.logger(): Logger = LoggerFactory.getLogger(T::class.java)
Edit:
An example of the log in Google cloud. The first request has the query parameter GAID=cdda802e-fb9c-47ad-0794d394c913, but as you can see the error log for that request is in the one below, marked in red.

How to log Stackdriver log messages correlated by trace id using stdout Go 1.11

I'm using Google App Engine Standard Environment with the Go 1.11 runtime. The documentation for Go 1.11 says "Write your application logs using stdout for output and stderr for errors". The migration from Go 1.9 guide also suggests not calling the Google Cloud Logging library directly but instead logging via stdout.
https://cloud.google.com/appengine/docs/standard/go111/writing-application-logs
With this in mind, I've written a small HTTP Service (code below) to experiment logging to Stackdriver using JSON output to stdout.
When I print plain text messages they appear as expected in the Logs Viewer panel under textPayload. When I pass a JSON string they appear under jsonPayload. So far, so good.
So, I added a severity field to the output string and Stackdriver Log Viewer successfully categorizes the message according to the levelled logging NOTICE, WARNING etc.
https://cloud.google.com/logging/docs/reference/v2/rest/v2/LogEntry
The docs say to set the trace identifier to correlate log entries with the originating request log. The trace ID is extracted from the X-Cloud-Trace-Context header set by the container.
Simulate it locally using curl -v -H 'X-Cloud-Trace-Context: 1ad1e4f50427b51eadc9b36064d40cc2/8196282844182683029;o=1' http://localhost:8080/
However, this does not cause the messages to be threaded by request, but instead the trace property appears in the jsonPayload object in the logs. (See below).
Notice that severity has been interpreted as expected and does not appear in the jsonPayload. I had expected the same to happen for trace, but instead it appears to be unprocessed.
How can I achieve nested messages within the original request log message? (This must be done using stdout on Go 1.11 as I do not wish to log directly with the Google Cloud logging package).
What exactly is GAE doing to parse the stdout stream from my running process? (In the setup guide for VMs on GCE there is something about installing an agent program to act as a conduit to Stackdriver logging- is this what GAE has installed?)
app.yaml file looks like this:
runtime: go111
handlers:
- url: /.*
  script: auto
package main

import (
    "fmt"
    "log"
    "net/http"
    "os"
    "strings"
)

var projectID = "glowing-market-234808"

func parseXCloudTraceContext(t string) (traceID, spanID, traceTrue string) {
    slash := strings.Index(t, "/")
    semi := strings.Index(t, ";")
    equal := strings.Index(t, "=")
    return t[0:slash], t[slash+1 : semi], t[equal+1:]
}

func sayHello(w http.ResponseWriter, r *http.Request) {
    xTrace := r.Header.Get("X-Cloud-Trace-Context")
    traceID, spanID, _ := parseXCloudTraceContext(xTrace)
    trace := fmt.Sprintf("projects/%s/traces/%s", projectID, traceID)
    warning := fmt.Sprintf(`{"trace":"%s","spanId":"%s", "severity":"WARNING","name":"Andy","age":45}`, trace, spanID)
    fmt.Fprintf(os.Stdout, "%s\n", warning)
    message := "Hello"
    w.Write([]byte(message))
}

func main() {
    http.HandleFunc("/", sayHello)
    port := os.Getenv("PORT")
    if port == "" {
        port = "8080"
    }
    log.Fatal(http.ListenAndServe(fmt.Sprintf(":%s", port), nil))
}
Output shown in the Log viewer:
...,
jsonPayload: {
  age: 45
  name: "Andy"
  spanId: "13076979521824861986"
  trace: "projects/glowing-market-234808/traces/e880a38fb5f726216f94548a76a6e474"
},
severity: "WARNING",
...
I have solved this by adjusting the program to use logging.googleapis.com/trace in place of trace and logging.googleapis.com/spanId in place of spanId.
warning := fmt.Sprintf(`{"logging.googleapis.com/trace":"%s","logging.googleapis.com/spanId":"%s", "severity":"WARNING","name":"Andy","age":45}`, trace, spanID)
fmt.Fprintf(os.Stdout, "%s\n", warning)
It seems that GAE is using the logging agent google-fluentd (a modified version of the fluentd log data collector).
See this link for a full explanation.
https://cloud.google.com/logging/docs/agent/configuration#special-fields
[update] June 25th, 2019: I've written a Logrus plugin that will help to thread log entries by HTTP request. It's available on GitHub at https://github.com/andyfusniak/stackdriver-gae-logrus-plugin.
[update] April 3rd, 2020: I've since switched to using Cloud Run, and the Logrus plugin appears to work fine with this platform as well.

Aggregate after exception from ftp consumer: FatalFallbackErrorHandler

My Camel route tries to pick up files from sftp, transfer them to a network location, and delete them from sftp. If the sftp server is unreachable after 3 attempts, I want the route to send an email warning the admin about the problem.
For this reason my sftp address has the following parameters:
maximumReconnectAttempts=2&throwExceptionOnConnectFailed=true&consumer.bridgeErrorHandler=true
In case the network location is not available, I want the route to notify the admin and not delete the files from sftp.
For this reason I have set .handled(false) in onException.
However, when connecting to sftp fails, aggregation also fails and no emails are sent. I have made a minimalist example below:
//configure
onException(Throwable.class)
    .retryAttemptedLogLevel(LoggingLevel.WARN)
    .redeliveryDelay(1000)
    .handled(false)
    .log(LoggingLevel.ERROR, LOG, "XXX - Error moving files")
    .to(AGGREGATEROUTE)
    .end();

from(downloadFrom)
    .to(to)
    .log(LoggingLevel.INFO, LOG, "XXX - Moving file OK")
    .to(AGGREGATEROUTE);

from(AGGREGATEROUTE)
    .log(LoggingLevel.INFO, LOG, "XXX - Starting aggregation.")
    .aggregate(constant(true), new GroupedExchangeAggregationStrategy())
    .completionFromBatchConsumer()
    .completionTimeout(10000)
    .log(LoggingLevel.INFO, LOG, "XXX - Aggregation completed, sending mail.");
In the logs I see:
16:02| ERROR | CamelLogger.java 156 | XXX - Error moving files
Then the logs for the Exception occurring during connection.
And then this:
16:02| ERROR | FatalFallbackErrorHandler.java 174 | Exception occurred while trying to handle previously thrown exception on exchangeId: ID-LP0641-1552662095664-0-2 using: [Pipeline[[Channel[Log(proefjes.camel_cursus.routebuilders.MoveWithPickupExceptions)[XXX - Error moving files]], Channel[sendTo(direct://aggregate)]]]].
16:02| ERROR | FatalFallbackErrorHandler.java 172 | \--> New exception on exchangeId: ID-LP0641-1552662095664-0-2
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot connect to sftp://user#mycompany.nl:22
at org.apache.camel.component.file.remote.SftpOperations.connect(SftpOperations.java:149)
I do not see "XXX - Starting aggregation." which I would expect to see in the log. Does some kind of error occur before aggregation? The breakpoint on aggregate(*, *) is never reached.
First, I just want to clarify something. You write "In case the network location is not available, I want the route to notify the admin and not delete the files from sftp", but shouldn't that be obvious anyhow? I mean, if the network location is not available, wouldn't deleting the files from sftp be impossible?
It's a little confusing that your exception handler is also routing .to(AGGREGATEROUTE). Given that you want to email an admin, shouldn't that be in the exception handler, not in the happy path? Why would you and how would you "aggregate" a connection failure?
Finally, and here I think is a real problem with your implementation, you may have misunderstood what handled(false) does. Setting this to false means routing stops and the exception is propagated back to the caller. I'm not sure what the .to(AGGREGATEROUTE) would do in this case, but I'm not surprised it's not being called.
I suggest trying a few things. I don't have your code so I'm not sure which will work best. These are all related and any might work:
Change handled(false) to handled(true).
Replace handled with continued(true).
Use a Dead Letter Channel.
Reference:
Handle and Continue Exceptions
Dead Letter Channel
Since error handling is different depending on which endpoint causes the error, I have solved this by having two different versions of onException:
//configure exception on sftp end
onException(Throwable.class)
    .maximumRedeliveries(2)
    .retryAttemptedLogLevel(LoggingLevel.WARN)
    .redeliveryDelay(1000)
    .onWhen(new hasSFTPErrorPredicate())
    // .continued(true) // tries to connect once, mails and continues to aggregation with empty exchange
    // .handled(false)  // tries to connect twice but does not reach mail
    .handled(true)      // tries to connect once, does reach mail
    // handled not defined: tries to connect twice but does not reach mail
    .log(LoggingLevel.INFO, LOG, "XXX - SFTP exception")
    .to(MAIL_ROUTE)
    .end();

// exception anywhere else
onException(Throwable.class)
    .maximumRedeliveries(2)
    .retryAttemptedLogLevel(LoggingLevel.WARN)
    .redeliveryDelay(1000)
    .log(LoggingLevel.ERROR, LOG, "XXX - Error moving file ${file:name}: ${exception}")
    .to(AGGREGATEROUTE)
    .handled(false)
    .end();
Exceptions occurring at the sftp end are handled by the first onException, because there the hasSFTPErrorPredicate returns 'true'. All this predicate does is check whether the exception or any of its causes has "Cannot connect to sftp:" in the message.
No rollback is required in this case because nothing has happened yet.
Any other exception is handled by the second onException.

Getting error as InvalidSessionId while invoking upsert bulk operation in Mule Salesforce connector

The app polls once daily. We are loading data into Salesforce using the bulk upsert operation. The first day it works fine. From the second day onwards we get the error below (InvalidSessionId). I found a workaround using a reconnection strategy and configured it, but even after that it is not working.
[2017-06-07 10:02:24.519] ERROR org.mule.retry.notifiers.ConnectNotifier [[test-salesforce-bulk].checkbulkupsert.stage1.05]: Failed to connect/reconnect: Work Descriptor. Root Exception was: InvalidSessionId : Invalid session id. Type: class com.sforce.async.AsyncApiException
[2017-06-07 10:02:24.547] ERROR org.mule.exception.CatchMessagingExceptionStrategy [[test-salesforce-bulk].checkbulkupsert.stage1.05]:
********************************************************************************
Message : Failed to invoke upsertBulk.
Element : /test-salesforce-bulk/processors/4/prepareAccountRequestSubFlow/subprocessors/1 # test-salesforce-bulk
--------------------------------------------------------------------------------
Exception stack is:
Failed to invoke upsertBulk. (org.mule.api.MessagingException)
com.sforce.async.BulkConnection.parseAndThrowException(BulkConnection.java:180)
com.sforce.async.BulkConnection.createOrUpdateJob(BulkConnection.java:164)
com.sforce.async.BulkConnection.createOrUpdateJob(BulkConnection.java:132)
com.sforce.async.BulkConnection.createJob(BulkConnection.java:122)
org.mule.modules.salesforce.SalesforceConnector.createJobInfo(SalesforceConnector.java:2570)
org.mule.modules.salesforce.SalesforceConnector.upsertBulk(SalesforceConnector.java:672)
org.mule.modules.salesforce.generated.processors.UpsertBulkMessageProcessor$1.process(UpsertBulkMessageProcessor.java:153)
(50 more...)
(set debug level logging or '-Dmule.verbose.exceptions=true' for everything)
Can you please help with this?
The error says Invalid Session Id.
This means that the initial session has expired. Check your login history and see if there's even an attempt to login. If there's not, you need to update your username in Mulesoft. If there is and it was unsuccessful, you need to update your password/security token. If there is and it was successful, then it could be that your Session Settings are set to limit calls to the initial IP address or something like that.

Google app engine send mail service raises exception

I was trying to use the app engine mail sending service on my site.
Currently, I've put something as simple as:
message = mail.EmailMessage(sender="Subhranath Chunder <subhranath@gmail.com>",
                            subject="Your account has been approved")
message.to = "Subhranath Chunder <subhranathc@yahoo.com>"
message.body = "Just another message"
message.send()
But every time the code handler raises this error:
2011-08-20 04:44:36.988
ApplicationError: 1 Internal error
Traceback (most recent call last):
File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 703, in __call__
handler.post(*groups)
File "/base/data/home/apps/s~subhranath/0-2.352669327317975990/index.py", line 71, in post
message.send()
File "/base/python_runtime/python_lib/versions/1/google/appengine/api/mail.py", line 895, in send
raise e
ApplicationError: ApplicationError: 1 Internal error
My code is deployed as a different version, not the default one. I'm not able to make this version the default unless this error gets resolved. Any idea why this is happening?
The sender must either have privileges or be the logged-in user. So subhranath@gmail.com should have admin rights on your application or be the logged-in user in order to run your code.
Most basic example would be just a HTTP get to send an email:
def get(self):
    message = mail.EmailMessage(sender='subhranath_app_admin@gmail.com', subject='Email')
    message.to = 'any...@gmail.com'
    message.body = os.environ.get('HTTP_HOST')
    message.send()
If you want to send an email as the logged in user you can make the sender a variable:
senderemail = users.get_current_user().email() if users.get_current_user() else 'subhranath@gmail.com'
I think the error message you posted is not very clear; if it persists, I recommend getting more information about it using logging and exception handling.
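Something along these lines, for instance (a sketch only; the send_approval_mail helper name is made up, and catching the broad Exception is just to make sure the traceback ends up in the logs; the specific classes the mail API raises live in google.appengine.api.mail_errors):
import logging

from google.appengine.api import mail


def send_approval_mail(recipient):
    message = mail.EmailMessage(
        sender="Subhranath Chunder <subhranath@gmail.com>",
        subject="Your account has been approved")
    message.to = recipient
    message.body = "Just another message"
    try:
        message.send()
    except Exception:
        # logging.exception records the full traceback in the App Engine logs,
        # which says a lot more than the bare "ApplicationError: 1" line.
        logging.exception("Sending mail to %s failed", recipient)
        raise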
Seems to have been an App Engine system glitch. The same code is now working, without any new deployment.
