Google Cloud has a powerful tracing tool for analyzing the latency of requests and RPCs. But it seems to pick only some requests that it deems worth tracing. Sometimes that's good enough and you can just browse through the existing traces. But if you are working on a performance enhancement, you want a trace of your particular query right now; you don't want to wait until it is deemed interesting.
My questions are:
What rules determine which queries are traced?
Is there a way to ask for a trace to be captured for a given URI?
Either from within the developer console, by calling some API from within our application, or through some app.yaml configuration? Or do we have to just wait and pray for the great algorithm to choose our request?
You can force tracing of an HTTP request by setting the Cloud Trace context header properly:
$ curl -H "X-Cloud-Trace-Context: 01234567890123456789012345678901;o=1" http://<your-app>.appspot.com/<path>
01234567890123456789012345678901 (32 hex characters) is the trace id. You want to use a different one each time.
o=1 enables tracing.
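If you're driving the request from code rather than curl, you can set the same header programmatically. A minimal sketch in Python, using the requests library and uuid to generate a fresh 32-hex-character trace id each time (the host and path are placeholders for your app):

import uuid
import requests

# Generate a fresh 32-hex-character trace id for each request.
trace_id = uuid.uuid4().hex

# ";o=1" turns tracing on for this request.
headers = {"X-Cloud-Trace-Context": "%s;o=1" % trace_id}

resp = requests.get("http://your-app.appspot.com/some/path", headers=headers)
print(trace_id)  # paste this into the trace viewer URL below

(requests is assumed to be available; any HTTP client that lets you set headers will do.)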
Use the following URL to view the trace (last part is the trace id):
http://console.developers.google.com/traces/details/01234567890123456789012345678901
Since you're interested in a particular request, why don't you use appstats?
https://cloud.google.com/appengine/docs/python/tools/appstats?hl=en — while working on a performance enhancement, you can turn appstats on, deploy it to a different version, and control it from appengine_config.py.
I use Cloud Trace for aggregate analysis; to get into more detail on a per-request basis, I always use appstats, as it contains more information.
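For reference, enabling appstats is just a WSGI middleware hook in appengine_config.py; a minimal sketch following the documented recipe:

# appengine_config.py

def webapp_add_wsgi_middleware(app):
    # Wrap every request so appstats records its RPC calls.
    from google.appengine.ext.appstats import recording
    return recording.appstats_wsgi_middleware(app)

Deploying this to a separate version, as suggested above, keeps the recording overhead off your production traffic.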
What rules determine which queries are traced?
Currently there is a sampling rate that determines which requests are traced: requests are sampled at a small number of requests per second per instance.
Is there a way to ask for a trace to be captured for a given URI?
The following could possibly help depending on your scenario.
You could add a Trace Context to force tracing on the request. A Trace Context is essentially an HTTP header (X-Cloud-Trace-Context).
Here is a pointer to help inject a Trace Context: https://github.com/liqianluo/gcloud-trace-java/blob/master/cloud-trace-sdk-java-core/src/main/java/com/google/cloud/trace/sdk/TraceContext.java
Related
I've hit ResponseTooLargeError when using App Engine's urlfetch.fetch because my response was several MB big. (doc)
I see there is an allow_truncated param I can pass that truncates the response if it's too big. Is there any way to then request the rest of the response? Something like making new calls to the same URL with an offset?
Otherwise I don't really understand how this param can be useful (just checking whether the call returns without error?)
Thanks
There is no standard way of requesting the "rest of the response", it all depends on the particular server-side implementation of the service you're working with: if it offers support for transferring data in smaller chunks, and, if it does, how exactly that works, i.e. the exact protocol for performing such transfers. Some may even consider such capability a part of the service itself.
Just to get an idea of what the implementation might look like you can see some possible options for a particular service discussed in Very large HTTP request vs many small requests
Yes, the usefulness of the option is to be able to get a partial response vs just getting a ResponseTooLargeError (which might be sufficient in some cases).
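As a sketch of how the flag looks in practice (the URL is a placeholder, and the Range-header follow-up only works if the remote server honors Range requests):

from google.appengine.api import urlfetch

url = 'http://example.com/big-response'  # placeholder

result = urlfetch.fetch(url, allow_truncated=True)
data = result.content

if result.content_was_truncated:
    # Only works if the server supports HTTP Range requests;
    # otherwise there's no generic way to fetch the remainder.
    rest = urlfetch.fetch(url, headers={'Range': 'bytes=%d-' % len(data)})
    if rest.status_code == 206:  # Partial Content
        data += rest.content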
Some requests silently fail in my Python app, intermittently and unpredictably. The hallmarks of the failure are:
Request returns a 200, so the client doesn't know there's a problem.
Request does NOT successfully execute on the server.
No logging statements are recorded for the request.
Below is an example from my logs of a bunch of requests which are each supposed to write an entity to the datastore. You can see for the lower, successful request, a blue 'i' is present, indicating that info level logs were recorded. When I examine the datastore, an entity was successfully written for this request.
However, for the failed request, you can see there is just a white box, and there are no logging statements present at all. While the server returned a 200, no entity was written to the datastore for this request.
Has anyone encountered something like this before on App Engine? Any ideas on how to debug it? I've seen it in multiple different apps myself, but I've never been able to figure it out.
EDIT
To clarify, the main problem here is that code doesn't execute, as measured by the failure to write an entity. The spurious 200 and lack of logging is an associated symptom.
From a comment originally, but seems to be the resolution path for this issue:
Given that there are no log statements at all for the request, and you appear to unpack the arguments and log them as soon as you enter the handler, this starts to look like an infrastructure/platform issue.
In such a case, it's best to open a public issue tracker issue, with "Type-Production" as a tag, including your app's app id and a timeframe, and as much information about your app and request handler involved as possible, and platform support will pick up the issue in the course of triage.
That said, it's worth examining the handler to make absolutely sure there's no way you could be exiting from it and sending a 200 without logging anything or seeing an exception. It all depends on what the code handling the request is capable of, what stack of libraries it's built upon, etc.
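As a concrete way to rule out the handler itself, here's a minimal sketch (assuming a webapp2 handler; _write_entity stands in for your datastore code) that logs on entry and can't return a 200 after a failure without logging anything:

import logging
import webapp2

class WriteEntityHandler(webapp2.RequestHandler):
    def post(self):
        # Log immediately on entry: if even this line is missing for a
        # request that returned 200, the problem is below your code.
        logging.info('handler entered: %s', self.request.path)
        try:
            key = self._write_entity()  # hypothetical datastore write
        except Exception:
            # Log and re-raise so a failure can't silently return 200.
            logging.exception('handler failed before writing entity')
            raise
        logging.info('handler done, wrote %s', key)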
I have a Gatling test for an application in which users answer a survey; upon answering, the application identifies answers that may pose a risk and creates what we call riskareas. These riskareas are normally created in the background as soon as the survey answering is finished. My question: I have a Gatling test with ten users who answer the survey and log out (I used the recorder to record the test), but after these ten users are finished I do not see any riskareas being created in the application. Am I missing something? Does the survey need to be really answered by a Gatling user (as it is in Selenium), or does the Gatling test just touch the URLs?
I am new to Gatling, please help.
Gatling should be indistinguishable from a user in a web browser (or Selenium) as far as the server is concerned, so the end result should be exactly the same as if you'd gone through the process yourself. However, writing a Gatling script is a little more work than writing a Selenium script.
For performance reasons, Gatling operates at a lower level than Selenium. Gatling works with the actual data that is sent and received from the server (i.e., the actual GETs and POSTs sent to the server), rather than with user-level interactions (such as clicking links and filling forms).
The recorder will generally produce a relatively "dumb" script. It records the exact data that was sent to the server, and makes no attempt to account for things that may change from run to run. For example, the web application you are testing might have hidden form fields that contain session information, or the link addresses might contain a unique identifier or a session id.
This means that your script may not be doing what you think it's doing.
To debug the script, the first thing to do is to add checks on each of the requests, to validate that you are getting the response you expect (for example, check that when you submit page 1 of the survey, you are taken to page 2 - check for something that you'd only expect to find on page 2, like a specific question).
Once you know which requests are failing, look at what data was sent with the request, and try to figure out where it came from. You will probably find that there are session ids, view state, or similar, that must be extracted from the previous page.
It will help to enable request and response logging, as per the documentation.
To simplify testing of web apps, we wrote some helper functions to allow tests to be written in a more Selenium-like way. Once you understand what your application is doing, you may find that it simplifies scripting for you too. However, understanding why your current script doesn't work the way you expect should be your first step.
I'm looking at RPC calls, but the stack is all tasklet and ndb. This is making it difficult for me to tell what queries are running. Is there an appstats config setting I need to use?
Thanks for the help.
In appengine_config.py, add the following (your code's calls won't make it into the query stack without this):
appstats_MAX_STACK = 20
In AppStats, under Requests History, click on a request.
In RPC Call Traces click "+" for a datastore_v3.RunQuery
In the stack trace you can see links to your code that was executed.
Has anybody here actually implemented a logging strategy for an application running as an XBAP? Any suggestions (as code) on how to implement a simple strategy, based on your experience?
In desktop mode my app logs to a rolling log file using the integrated log4net implementation, but as an XBAP I can't log, because the file gets stored in the browser cache (an app2.0 folder or something like it). So for now I check whether the app is browser-hosted and skip logging, since I don't even know if it ever gets written (it's the same codebase). If only there were a way to push this log to a service, like a web service, or to POST errors to some endpoint...
My XBAP runs in full-trust intranet mode.
I would log to isolated storage and provide a way for users to submit the log back to the server using either a simple PUT/POST with HttpWebRequest or, if you're feeling frisky, via a WCF service.
Keep in mind an XBAP only gets 512 KB of isolated storage, so you may actually want to push those event logs back to the server automatically. Also remember that the XBAP can only talk back to its origin server, so the service that accepts the log files must run under the same domain.
Here's some quick sample code that shows how to set up a TextWriterTraceListener on top of an IsolatedStorageFileStream, at which point you can just use the standard Trace.Write[XXX] methods to do your logging.
using System.Diagnostics;            // Trace, TraceListener, TextWriterTraceListener
using System.IO;                     // FileMode, FileAccess
using System.IO.IsolatedStorage;     // IsolatedStorageFileStream

IsolatedStorageFileStream traceFileStream = new IsolatedStorageFileStream("Trace.log", FileMode.OpenOrCreate, FileAccess.Write);
TraceListener traceListener = new TextWriterTraceListener(traceFileStream);
Trace.Listeners.Add(traceListener);
UPDATE
Here is a revised answer due to the revision you've made to your question with more details.
Since you mention you're using log4net in your desktop app we can build upon that dependency you are already comfortable working with as it is entirely possible to continue to use log4net in the XBAP version as well. Log4net does not come with an implementation that will solve this problem out of the box, but it is possible to write an implementation of a log4net IAppender which communicates with WCF.
I took a look at the implementation the other answerer linked to by Joachim Kerschbaumer (all credit due) and it looks like a solid implementation. My first concern was that, in a sample, someone might be logging back to the service on every event and perhaps synchronously, but the implementation actually has support for queuing up a certain number of events and sending them back to the server in batch form. Also, when it does send to the service, it does so using an async invocation of an Action delegate which means it will execute on a thread pool thread and not block the UI. Therefore I would say that implementation is quite solid.
Here are the steps I would take from here:
Download Joachim's WCF appender implementation
Add his projects to your solution.
Reference the WCFAppender project from your XBAP
Configure log4net to use the WCF appender. Now, there are several settings for this appender, so I suggest checking out his sample app's config. The most important ones, however, are QueueSize and FlushLevel. Set QueueSize high enough that, based on how much you actually log, you won't be chattering with the WCF service too much; if you're only logging warnings/errors you can probably set it low, while if you're logging informational messages you'll want it a little higher. As for FlushLevel, you should probably just set it to ERROR; that guarantees that, no matter how big the queue is when an error occurs, everything is flushed the moment the error is logged.
The sample appears to use LINQ2SQL to log to a custom DB inside of the WCF service. You will need to replace this implementation to log to whatever data source best suits your needs.
Now, Joachim's sample is written in a way that's intended to be very easy for someone to download, run and understand very quickly. I would definitely change a couple things about it if I were putting it into a production solution:
Separate the WCF contracts into a separate library which you can share between the client and the server. This would allow you to stop using a Visual Studio service reference in the WCFAppender library and just reference the same contract library for the data types. Likewise, since the contracts would no longer be in the service itself, you would reference the contract library from the service.
I don't know that wsHttpBinding is really necessary here. It comes with a couple more knobs and switches than one probably needs for something as simple as this. I would probably go with the simpler basicHttpBinding and if you wanted to make sure the log data was encrypted over the wire I would just make sure to use HTTPS.
My approach has been to log to a remote service, keyed by a unique user ID or GUID. The overhead isn't very high with the usual async calls.
You can cache messages locally, too, either in RAM or in isolated storage, perhaps as a backup in case the network isn't accessible.
Be sure to watch for duplicate events within a certain time window. You don't want to log 1,000 copies of the same Exception over a period of a few seconds.
Also, I like to log more than just errors. You can also log performance data, such as how long certain functions take to execute (particularly out-of-process calls), or more detailed data in response to the user explicitly entering into a "debug and report" mode. Checking for calls that take longer than a certain threshold is also useful to help catch regressions and preempt user complaints.
If you are running your XBAP under partial trust, you are only allowed to write to IsolatedStorage on the client machine. And it's just 512 KB, which you'd probably rather use for something more valuable than logging, like storing user preferences.
You are not allowed to do any Remoting under partial trust either, so you can't use log4net's RemotingAppender.
Finally, under a partial-trust XBAP you have WebPermission to talk only to the server of your app's origin. I would recommend using a WCF service, as described in this article. We use a similar configuration in my current project and it works fine.
Then, basically, on the WCF server side you can do logging to any place appropriate: file, database, etc. You may also want to keep your log4net logging code and try to use one of the WCF log appenders available on the internets (this or this).