Google Cloud Storage (GCS) error 200 on non-final chunk - google-app-engine

I'm running into the following error when running an export-to-CSV job on App Engine using the new Google Cloud Storage library (appengine-gcs-client). I need to export about 30 MB of data on a nightly basis, and occasionally I need to rebuild the entire table. Today I had to rebuild everything (~800 MB total), but only about 300 MB of it actually made it across. I checked the logs and found this exception:
/task/bigquery/ExportVisitListByDayTask
java.lang.RuntimeException: Unexpected response code 200 on non-final chunk: Request: PUT https://storage.googleapis.com/moose-sku-data/visit_day_1372392000000_1372898225040.csv?upload_id=AEnB2UrQ1cw0-Jbt7Kr-S4FD2fA3LkpYoUWrD3ZBkKdTjMq3ICGP4ajvDlo9V-PaKmdTym-zOKVrtVVTrFWp9np4Z7jrFbM-gQ
x-goog-api-version: 2
Content-Range: bytes 4718592-4980735/*
262144 bytes of content
Response: 200 with 0 bytes of content
ETag: "f87dbbaf3f7ac56c8b96088e4c1747f6"
x-goog-generation: 1372898591905000
x-goog-metageneration: 1
x-goog-hash: crc32c=72jksw==
x-goog-hash: md5=+H27rz96xWyLlgiOTBdH9g==
Vary: Origin
Date: Thu, 04 Jul 2013 00:43:17 GMT
Server: HTTP Upload Server Built on Jun 28 2013 13:27:54 (1372451274)
Content-Length: 0
Content-Type: text/html; charset=UTF-8
X-Google-Cache-Control: remote-fetch
Via: HTTP/1.1 GWA
at com.google.appengine.tools.cloudstorage.oauth.OauthRawGcsService.put(OauthRawGcsService.java:254)
at com.google.appengine.tools.cloudstorage.oauth.OauthRawGcsService.continueObjectCreation(OauthRawGcsService.java:206)
at com.google.appengine.tools.cloudstorage.GcsOutputChannelImpl$2.run(GcsOutputChannelImpl.java:147)
at com.google.appengine.tools.cloudstorage.GcsOutputChannelImpl$2.run(GcsOutputChannelImpl.java:144)
at com.google.appengine.tools.cloudstorage.RetryHelper.doRetry(RetryHelper.java:78)
at com.google.appengine.tools.cloudstorage.RetryHelper.runWithRetries(RetryHelper.java:123)
at com.google.appengine.tools.cloudstorage.GcsOutputChannelImpl.writeOut(GcsOutputChannelImpl.java:144)
at com.google.appengine.tools.cloudstorage.GcsOutputChannelImpl.waitForOutstandingWrites(GcsOutputChannelImpl.java:186)
at com.moose.task.bigquery.ExportVisitListByDayTask.doPost(ExportVisitListByDayTask.java:196)
The task is pretty straightforward, but I'm wondering if there is something wrong with the way I'm using waitForOutstandingWrites() or the way I'm serializing my outputChannel for the next task run. One thing to note is that each task is broken into daily groups, each outputting its own individual file. The day tasks are scheduled to run 10 minutes apart, concurrently, to push out all 60 days.
In the task, I create a PrintWriter like so:
OutputStream outputStream = Channels.newOutputStream( outputChannel );
PrintWriter printWriter = new PrintWriter( outputStream );
and then write data out to it 50 lines at a time, calling waitForOutstandingWrites() to push everything over to GCS. When I'm coming up on the open-file time limit (~22 seconds), I put the outputChannel into Memcache and then reschedule the task with the data iterator's cursor.
printWriter.print( outputString.toString() );
printWriter.flush();
outputChannel.waitForOutstandingWrites();
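For reference, here is a simplified sketch of that loop and the Memcache hand-off (the row source, Memcache key, and time check below are stand-ins, not my exact code):

import java.io.IOException;
import java.io.PrintWriter;
import java.nio.channels.Channels;
import java.util.Iterator;
import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;
import com.google.appengine.tools.cloudstorage.GcsOutputChannel;

public class ExportLoopSketch {
    private static final long TIME_BUDGET_MS = 22000; // stop before the presumed open-file limit

    // Writes rows 50 at a time, pushes each batch to GCS, and parks the channel
    // in Memcache if time runs out so a follow-up task can continue the file.
    static boolean writeBatches(GcsOutputChannel outputChannel, Iterator<String> rows,
                                String memcacheKey) throws IOException {
        long start = System.currentTimeMillis();
        PrintWriter printWriter = new PrintWriter(Channels.newOutputStream(outputChannel));

        StringBuilder batch = new StringBuilder();
        int linesInBatch = 0;
        while (rows.hasNext()) {
            batch.append(rows.next()).append('\n');
            if (++linesInBatch == 50) {
                printWriter.print(batch);
                printWriter.flush();
                outputChannel.waitForOutstandingWrites(); // push the batch to GCS
                batch.setLength(0);
                linesInBatch = 0;

                if (System.currentTimeMillis() - start > TIME_BUDGET_MS) {
                    // GcsOutputChannel is Serializable, so it can be stashed for the next task run.
                    MemcacheService memcache = MemcacheServiceFactory.getMemcacheService();
                    memcache.put(memcacheKey, outputChannel);
                    return false; // caller reschedules the task with the iterator's cursor
                }
            }
        }
        printWriter.print(batch);
        printWriter.flush();
        outputChannel.waitForOutstandingWrites();
        return true; // caller closes the channel to finalize the file
    }
}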
This seems to work most of the time, but I'm getting these errors, which leave corrupted and incomplete files on GCS. Is there anything obviously wrong with the way I'm making these calls? Can I only have one channel open to GCS at a time per application? Is there some other issue going on?
Appreciate any tips you could lend!
Thanks!
Evan

A 200 response indicates that the file has been finalized. If this occurs on an API call other than close(), the library throws an error, as this is not expected.
This is likely occurring due to the way you are rescheduling the task. It may be that when you reschedule the task, the task queue duplicates its delivery for some reason (this can happen), and if there are no checks to prevent this, two instances could end up writing to the same file at the same time. When one closes the file, the other sees an error. The net result is a corrupt file.
The simple solution is not to reschedule the task at all. There is no time limit on how long a file can be held open with the GCS client (unlike the deprecated Files API).
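In other words, keep one channel open for the whole export and close it exactly once. A minimal sketch (the bucket name is taken from your request URL; the object name and row source are placeholders):

import java.io.IOException;
import java.io.PrintWriter;
import java.nio.channels.Channels;
import java.util.Iterator;
import com.google.appengine.tools.cloudstorage.GcsFileOptions;
import com.google.appengine.tools.cloudstorage.GcsFilename;
import com.google.appengine.tools.cloudstorage.GcsOutputChannel;
import com.google.appengine.tools.cloudstorage.GcsService;
import com.google.appengine.tools.cloudstorage.GcsServiceFactory;

public class SingleTaskExport {
    // Write the whole day's CSV in one task: open the channel, stream every row,
    // then close. No Memcache hand-off and no rescheduling, so only one writer
    // ever touches the object.
    static void exportDay(Iterator<String> rows, String objectName) throws IOException {
        GcsService gcsService = GcsServiceFactory.createGcsService();
        GcsFilename file = new GcsFilename("moose-sku-data", objectName);
        GcsOutputChannel channel = gcsService.createOrReplace(
                file, new GcsFileOptions.Builder().mimeType("text/csv").build());
        PrintWriter writer = new PrintWriter(Channels.newOutputStream(channel));
        try {
            while (rows.hasNext()) {
                writer.println(rows.next());
            }
            writer.flush();
        } finally {
            channel.close(); // finalizes the object; this is the only 200 you should see
        }
    }
}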

Related

FaspManager embedded client stops prematurely while sending multiple files

I am using FaspManager as an embedded client in my Java application. My program works fine when I am sending just a single file. When I try to send multiple files (each with its own session and jobId), they start well and progress for some time. However, after several minutes, when one or two of the transfers complete, all of the remaining transfers stop without completing.
In the Aspera log I can see the messages below:
2019-02-11 20:48:22.985 INFO 11120 --- [il.SelectThread] c.c.e.t.aspera.FaspTransferListener : Client session: 149aaa9b-d632-43e4-9653-fbbf768c69b5 | PROGRESS | Rate: 353.6 Kb/s | Target rate: 1.0 Gb/s
2019-02-11 20:48:23.024 INFO 11120 --- [il.SelectThread] com.asperasoft.faspmanager.Session : 149aaa9b-d632-43e4-9653-fbbf768c69b5 - cancel sent
I have not been able to find out who sent the cancel request, or how. I have tried searching Google for a possible cause but have not been able to resolve it yet, so I would really appreciate any help on this.
Thank you,
Sourav
The "cancel sent" message in Session is logged if the user explicitly calls FaspManager#cancelTransfer(String sessionId) or FaspManager#stop(), or if an error occurs while reading an input stream in FileTransferSession#addSource(StreamReader, String).
I'd guess you're calling stop() on the FaspManager after the first session finishes, but I'd need a more complete log or a snippet of your code to be sure.
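If that turns out to be the cause, one way to avoid it is to defer the stop() call until every session you started has reported completion. A sketch only; how you detect completion depends on your own listener code (e.g. your FaspTransferListener), so the manager is passed in here as a plain Runnable:

import java.util.concurrent.CountDownLatch;

// Sketch: hold off FaspManager#stop() until every session has finished.
public class DeferredShutdown {
    private final CountDownLatch remaining;
    private final Runnable stopFaspManager; // e.g. () -> faspManager.stop()

    public DeferredShutdown(int sessionCount, Runnable stopFaspManager) {
        this.remaining = new CountDownLatch(sessionCount);
        this.stopFaspManager = stopFaspManager;
    }

    // Call this from your session-complete handling (e.g. your FaspTransferListener)
    // each time a session reaches a terminal state.
    public void sessionFinished() {
        remaining.countDown();
    }

    // Call this from wherever you currently shut things down after the first transfer.
    public void stopWhenAllDone() throws InterruptedException {
        remaining.await();     // blocks until every session has finished
        stopFaspManager.run(); // only now stop the embedded manager
    }
}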

Thunderbird Lightning caldav sync doesn't show any data/events

When I try to synchronize my CalDAV server implementation with Thunderbird 45.4.0 and Lightning 4.7.4 (one particular calendar collection), it doesn't show any data or events in the calendar, even though the last call of the sequence provided the data.
In the Thunderbird error log I can see one error:
Timestamp: 07.11.16, 14:21:12
Error: [calCachedCalendar] replay action failed: null,
uri=http://127.0.0.1:8003/sap/sports/webdav/appsvc/webdav/services/
server.xsjs/cal/_D043133/, result=2147500037, op=[xpconnect wrapped
calIOperation]
Source file:
file:///Users/d043133/Library/Thunderbird/Profiles/hfbvuk9f.default/
extensions/%7Be2fda1a4-762b-4020-b5ad-a41df1933103%7D/calendar-
js/calCachedCalendar.js
Line: 327
The call sequence is as follows (detailed content via gist links):
Propfind Request - Response
Options Request - Response
Propfind Request - Response
Report Request - Response - Response Raw
Synchronization with other clients like the macOS calendar and iOS calendar works in principle and shows the data. Does anyone have a clue what is going wrong here?
Not sure whether this is the cause, but I can see two incorrect things:
a) Your <href/> property has trailing spaces:
<d:href>/sap/sports/webdav/appsvc/webdav/services/server.xsjs/cal/_D043133/EVENT%3A070768ba5dd78ff15458f1985cdaabb1.ics
</d:href>
b) Your ORGANIZER property is not a valid URI:
ORGANIZER:_D043133
I was able to find the cause of the above issue by debugging Thunderbird as proposed by Philipp. The REPORT response has HTTP status code 200, but as it is a multistatus response, Thunderbird/Lightning expects status code 207 ;-)
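For anyone hitting the same thing, the gist of the fix looks like this (illustrative plain-Java servlet only, not my actual XSJS service): return 207 Multi-Status whenever the REPORT body is a multistatus document.

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// A CalDAV REPORT that returns a <d:multistatus> body must use status 207, not 200.
public class CalendarReportServlet extends HttpServlet {
    @Override
    protected void service(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        if ("REPORT".equals(req.getMethod())) {
            resp.setStatus(207); // Multi-Status, which is what Lightning expects here
            resp.setContentType("application/xml; charset=UTF-8");
            resp.getWriter().write(
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              + "<d:multistatus xmlns:d=\"DAV:\">\n"
              + "  <!-- one <d:response> element per calendar object -->\n"
              + "</d:multistatus>\n");
        } else {
            super.service(req, resp);
        }
    }
}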
Thanks for the hints!

Drive API Files Patch Method fails with "Precondition Failed" "conditionNotMet"

It seems that overnight the Google Drive API method files().patch( , ).execute() has stopped working and now throws an exception. This problem is also observable through Google's reference page https://developers.google.com/drive/v2/reference/files/patch if you "try it".
The exception response is:
500 Internal Server Error
cache-control: private, max-age=0
content-encoding: gzip
content-length: 162
content-type: application/json; charset=UTF-8
date: Thu, 22 Aug 2013 12:32:06 GMT
expires: Thu, 22 Aug 2013 12:32:06 GMT
server: GSE
{
  "error": {
    "errors": [
      {
        "domain": "global",
        "reason": "conditionNotMet",
        "message": "Precondition Failed",
        "locationType": "header",
        "location": "If-Match"
      }
    ],
    "code": 500,
    "message": "Precondition Failed"
  }
}
This is really impacting our application.
We're experiencing this as well. A quick-fix solution is to add this header: If-Match: * (ideally you should use the etag of the entity, but you might not have logic for conflict resolution right now).
Google Developers, please give us a heads up if you're planning to deploy breaking changes.
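For example, with the Drive v2 Java client the workaround looks roughly like this (a sketch only; other client libraries expose request headers in an equivalent way, and the field being patched here is just an illustration):

import java.io.IOException;
import com.google.api.services.drive.Drive;
import com.google.api.services.drive.model.File;

// Send "If-Match: *" so the server-side etag precondition always passes.
public class PatchWithIfMatch {
    static File patchTitle(Drive drive, String fileId, String newTitle) throws IOException {
        File patch = new File();
        patch.setTitle(newTitle);

        Drive.Files.Patch request = drive.files().patch(fileId, patch);
        // Ideally use the file's current etag here; "*" simply disables the conflict check.
        request.getRequestHeaders().setIfMatch("*");
        return request.execute();
    }
}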
It looks like sometime in the last 24 hours Files.Patch has been put back to how it worked before Aug 22.
We were also hitting this issue whenever we attempted to patch the LastModified timestamp of a file; see the log file extract below:
20130826 13:30:45 - GoogleApiRequestException: retry number 0 for file patch of File/Folder Id 0B9NKEGPbg7KfdXc1cVRBaUxqaVk
20130826 13:31:05 - ***** GoogleApiRequestException: Inner exception: 'System.Net.WebException: The remote server returned an error: (500) Internal Server Error.
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at Google.Apis.Requests.Request.InternalEndExecuteRequest(IAsyncResult asyncResult) in c:\code.google.com\google-api-dotnet-client\default_release\Tools\BuildRelease\bin\Debug\output\default\Src\GoogleApis\Apis\Requests\Request.cs:line 311', Exception: 'Google.Apis.Requests.RequestError
Precondition Failed [500]
Errors [
Message[Precondition Failed] Location[If-Match - header] Reason[conditionNotMet] Domain[global]
]
'
20130826 13:31:07 - ***** Patch file request failed after 0 tries for File/Folder 0B9NKEGPbg7KfdXc1cVRBaUxqaVk
Today's run of the same process succeeds whenever it patches a file's timestamp, just as it did prior to Aug 22.
As a result of this 4-5 day glitch, we now have hundreds (possibly thousands) of files with the wrong timestamps.
I know the API is in beta, but please, please, Google developers, let us know in advance of any "trial fixes" and at least post in this forum to acknowledge the issue, to save us time trying to find the fault in our own programs.
Duplicated here: Getting 500: Precondition Failed when Patching a folder. Why?
I recall a comment from one of the dev videos saying "use Update instead of Patch as it has one less server round trip internally". I've inferred from this that Patch checks etags but Update doesn't. I've changed my code to use Update in place of Patch, and the problem hasn't recurred since.
Gotta love developing against a moving target ;-)
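The swap itself is small. With the Drive v2 Java client it would look roughly like this (a sketch only; the names and the Java client here are illustrative, not my actual code):

import java.io.IOException;
import com.google.api.services.drive.Drive;
import com.google.api.services.drive.model.File;

// The same metadata change via update() instead of patch(), which in practice
// stopped tripping the If-Match/etag precondition.
public class UpdateInsteadOfPatch {
    static File renameViaUpdate(Drive drive, String fileId, String newTitle) throws IOException {
        File metadata = drive.files().get(fileId).execute(); // fetch current metadata
        metadata.setTitle(newTitle);
        return drive.files().update(fileId, metadata).execute();
    }
}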

Occasional, unpredictable exceptions with multi-threaded solrj queries

I have a multi-threaded application using SolrJ 4. There are a maximum of 25 threads. Each thread creates a connection using HttpSolrServer and runs one query (a sketch of the setup is at the end of this post). Most of the time this works just fine, but occasionally I get the following exception:
Jan 10, 2013 9:29:07 AM org.apache.http.impl.client.DefaultRequestDirector tryConnect
INFO: I/O exception (java.net.NoRouteToHostException) caught when connecting to the target host: Cannot assign requested address
Jan 10, 2013 9:29:07 AM org.apache.http.impl.client.DefaultRequestDirector tryConnect
INFO: Retrying connect
Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.NullPointerException
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
at java.util.concurrent.FutureTask.get(FutureTask.java:111)
I wrote some code to retry the query if it fails:
while (!querySuccess && queryAttempts < m_MaxQueryAttempts) {
    try {
        queryAttempts++;
        rsp = m_Server.query(query);
        querySuccess = true;
    } catch (SolrServerException e) {
        querySuccess = false;
    }
}
After one or more retries the query usually works, but sometimes it fails even after 100 retries. Either way, I'd like to understand the cause of the problem. Why does it work some of the time? Is it an issue with concurrent access to Solr? Apart from this process, I only have one other process that is continually writing to the index using a single connection. The default server settings are below, so I don't think it's because of too many simultaneous connections.
INFO: Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
Any suggestions on how to diagnose this would be much appreciated.
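Here is roughly what the setup described above looks like, in case it helps (SolrJ 4.x; the URL and query are placeholders, not my real ones):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ParallelQueries {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(25);
        List<Future<QueryResponse>> futures = new ArrayList<Future<QueryResponse>>();
        for (int i = 0; i < 25; i++) {
            futures.add(pool.submit(new Callable<QueryResponse>() {
                public QueryResponse call() throws Exception {
                    // one client per thread, one query per client
                    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
                    try {
                        return server.query(new SolrQuery("*:*"));
                    } finally {
                        server.shutdown();
                    }
                }
            }));
        }
        for (Future<QueryResponse> f : futures) {
            System.out.println(f.get().getResults().getNumFound()); // this get() is where the failure surfaces
        }
        pool.shutdown();
    }
}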

Timeout on bigquery v2 from GAE

I am querying BigQuery from my app in Google App Engine, and sometimes receive a weird result from BQ (discovery#restDescription). It took me some time to understand that the problem occurs only when the amount of data I am querying is large, which somehow makes my query time out after 10 seconds.
I found a good description of my problem here:
Bad response to a BigQuery query
After re-reading the GAE docs, I found that HTTP requests should be handled within a few seconds. So I guess, and this is only a guess, that BigQuery might also be limiting itself in the same way, and therefore has to respond to my queries "within seconds".
If this is the case, first of all, I will be a bit surprised, because my BigQuery requests are certainly going to take more than a few seconds... But anyway, I did a test by forcing a timeout of 1 second on my query, and then getting the query result by polling the getQueryResults API call.
The outcome is very interesting. BigQuery returns something within about 3 seconds (not 1 as I asked), and then I get my results later, within 26 seconds, by polling. This looks like a way to circumvent the 10-second timeout issue.
But I can hardly see myself using this trick in production.
Has anyone encountered the same problem with BigQuery? What am I supposed to do when the query lasts more than a "few seconds"?
Here is the code I use to query:
query_config = {
    'timeoutMs': 1000,
    "defaultDataset": {
        "datasetId": self.dataset,
        "projectId": self.project_id
    },
}
query_config.update(params)
result_json = (self.service.jobs()
               .query(projectId=project, body=query_config)
               .execute())
And to retrieve the results, I poll with this:
self.service.jobs().getQueryResults(projectId=project, jobId=jobId).execute()
And these are the logs of what happens during the BigQuery calls:
2012-12-03 12:31:19.835 /api/xxxxx/ 200 4278ms 0kb Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1
xx.xx.xx.xx - - [03/Dec/2012:02:31:19 -0800] "GET /api/xxxxx/ HTTP/1.1" 200 243 ....... ms=4278 cpu_ms=346 cpm_usd=0.000426 instance=00c61b117c1169753678c6d5dac736b223809b
I 2012-12-03 12:31:16.060
URL being requested: https://www.googleapis.com/discovery/v1/apis/bigquery/v2/rest?userIp=xx.xx.xx.xx
I 2012-12-03 12:31:16.061
Attempting refresh to obtain initial access_token
I 2012-12-03 12:31:16.252
URL being requested: https://www.googleapis.com/bigquery/v2/projects/xxxxxxxxxxxx/queries?alt=json
I 2012-12-03 12:31:19.426
URL being requested: https://www.googleapis.com/bigquery/v2/projects/xxxxxxxx/jobs/job_a1e74a6769f74cb997d998623b1b6b2e?alt=json
I 2012-12-03 12:31:19.500
This is what my query API call returns. In the metadata, the status is 'RUNNING':
{u'kind': u'bigquery#queryResponse', u'jobComplete': False, u'jobReference': {u'projectId': u'xxxxxxxxxxx', u'jobId': u'job_a1e74a6769f74cb997d998623b1b6b2e'}}
With the jobId I am able to retrieve the results 26 seconds later, when they are ready.
There must be another way! What am I doing wrong?
