Alternatives to App Engine's native logging API?

Does anyone have any advice on making the logging in Google App Engine better? I am currently trying to use Splunk Storm, but they are finicky regarding input and go down often. Has anyone else encountered this and solved it in some capacity?
Currently I have a process that runs in a backend, reads from the LogService, and pipes the logs into Splunk Storm via its REST API. This often fails, Storm goes down, or the backend IP changes.
My issue is with the logging provided within App Engine: the logs disappear when new versions are pushed, and the provided dashboard makes querying them almost unusable. Splunk was a potential solution, but their cloud offering leaves a lot to be desired.
Anything that would provide a better interface into my logs would be appreciated.

You can export logs from GAE to BigQuery, which has a quite capable query language. You can use Mache, an open-source project that already does this, or write your own exporter to expose (and make queryable) the fields (columns) you are interested in.
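If you go the custom-exporter route, a minimal sketch might look like the following, assuming the first-generation Python runtime; insert_rows is a hypothetical helper wrapping BigQuery's tabledata().insertAll() REST call, not part of any SDK:

```python
# Hedged sketch: read recent request logs and shape them into BigQuery rows.
# insert_rows() is a hypothetical helper, not part of any SDK.
import time
from google.appengine.api import logservice

def export_recent_logs(last_export_time):
    rows = []
    for request_log in logservice.fetch(start_time=last_export_time,
                                        end_time=time.time(),
                                        include_app_logs=True):
        for app_log in request_log.app_logs:
            rows.append({
                'timestamp': app_log.time,       # seconds since the epoch
                'level': app_log.level,          # 0 = DEBUG .. 4 = CRITICAL
                'message': app_log.message,
                'resource': request_log.resource,
                'status': request_log.status,
            })
    insert_rows('logs_dataset', 'request_logs', rows)  # hypothetical helper
```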

Since you've decided to use Splunk (or another external service) as permanent storage, it sounds like you need a location to buffer logs between the times when they're written to App Engine's log service and when Splunk is available to accept the logs. To avoid losing logs before version churn causes them to fall out of App Engine, this buffer needs to be fast and highly available.
One reasonable choice is the AE datastore. There's no unreliable hop to a 3rd party, it has an availability SLA, and it can be scaled arbitrarily by sharding writes. The downside would be the cost of R/W operations and the storage footprint of in-flight logs, but you'll incur a comparable cost for another backing store.
Whichever service you choose, have one batch process (e.g. a backend or cron job) write to the buffer from the logs reader API. As long as it runs more often than app updates, logs will always exist in durable storage. Then have another batch process wait for Splunk to be available, upload from the buffer, and delete entries as you get receipt confirmation from Splunk.
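A minimal sketch of that pattern, assuming the datastore as the buffer (the BufferedLog model and the send_to_splunk helper are illustrative names, not a prescribed schema):

```python
# Hedged sketch of the buffer pattern with ndb; send_to_splunk() is a
# hypothetical helper that POSTs a batch and returns True on confirmation.
import time
from google.appengine.api import logservice
from google.appengine.ext import ndb

class BufferedLog(ndb.Model):
    payload = ndb.TextProperty()
    logged_at = ndb.FloatProperty()

def buffer_logs(last_run_time):
    """Job 1 (cron/backend): copy new log lines into durable storage."""
    entities = [BufferedLog(payload=log.combined, logged_at=log.end_time)
                for log in logservice.fetch(start_time=last_run_time,
                                            end_time=time.time())]
    ndb.put_multi(entities)

def drain_to_splunk(batch_size=100):
    """Job 2: upload buffered logs; delete only what Splunk confirms."""
    batch = BufferedLog.query().fetch(batch_size)
    if batch and send_to_splunk([e.payload for e in batch]):
        ndb.delete_multi([e.key for e in batch])
```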

Related

Google Cloud Bigtable Python Client Performance Issue

I'm running into a performance issue with the Google Cloud Bigtable Python client. I'm working on a Flask API that writes to and reads from a GCP Bigtable instance. The API uses the Python client to communicate with Bigtable and was deployed to the GCP App Engine flexible environment.
Under low traffic, the API works fine. However, during a load test, the endpoints that read and write to Bigtable suffer a huge performance decrease compared to a similar endpoint that doesn't communicate with Bigtable. Also, a large percentage of requests sent to the endpoint receive a 502 Bad Gateway, even with health checks turned off in App Engine.
I'm aware that the client is currently in alpha. I wonder if this performance issue is known, or if anyone else has run into the same issue.
Update
I found documentation from Google stating:
There are issues with the network connection. Network issues can reduce throughput and cause reads and writes to take longer than usual. In particular, you'll see issues if your clients are not running in the same zone as your Cloud Bigtable cluster.
In my case, my client was in a different region; moving it to the same region brought a huge increase in performance. However, the performance issue still exists, and the documentation's recommendation is to put the client in the same zone as the Bigtable cluster.
I also considered using Container Engine or Compute Engine, where it is easier to specify the zone, but I want to stay with App Engine for its autoscaling and managed services.
The Bigtable client takes somewhere between 3 ms and 20 ms to complete each request, and because Python is single-threaded, it simply blocks for that period until the response comes back. The best solution we found was, for any write, to publish the request to Pub/Sub and then use Dataflow to write to Bigtable. This is significantly faster because publishing a message from Python takes well under 1 ms, and because Dataflow can be placed in exactly the same region as Bigtable and parallelizes easily, it can write much faster.
This doesn't, however, solve the scenario where you need frequent reads, or where writes need to be instantaneous.
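A minimal sketch of that write path, assuming the google-cloud-pubsub client (project, topic, and route names are placeholders, and the Dataflow pipeline that drains the topic into Bigtable is not shown):

```python
# Hedged sketch: the endpoint publishes the write and returns immediately;
# Dataflow (not shown) consumes the topic and performs the Bigtable write.
import json
from flask import Flask, request
from google.cloud import pubsub_v1

app = Flask(__name__)
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path('my-project', 'bigtable-writes')  # placeholders

@app.route('/rows', methods=['POST'])
def enqueue_write():
    payload = json.dumps(request.get_json()).encode('utf-8')
    publisher.publish(topic_path, data=payload)  # returns a future; the call
    return ('', 204)                             # itself is sub-millisecond
```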

Writing to Datastore from Backends without shutting down

I am trying to write a program in Google App Engine (Python) that continually runs a resident backend working out what a series converges to. I want it to run in the backend and write to the datastore so that, at any point in time, you can tell what term the series is on and what its value is. The backend only writes to one entity in the datastore, so it does not overload storage or anything. The problem I run into, though, is that the backend does not write the entity to the datastore where it is accessible by my frontend webpage until the backend is shut down, which defeats the purpose of being able to continually check in on it. If there is some way to have the backend write to the datastore so the frontend page can check in on it, please tell me!
Datastore writes in a backend process should behave no differently than writes in your front end app, meaning that they should be available for read in your front end (nearly) instantly (within consistency constraints). Both backend and front end interact with the same datastore.
It sounds like you just need to implement a recurring write of the current status of your series (i.e. once every x cycles), instead of writing once at the end of the backend process.
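A minimal sketch of that recurring write, assuming ndb (the model and the example series are illustrative):

```python
# Hedged sketch: checkpoint the running total every N terms so the
# frontend can read progress at any time. Model/key names are illustrative.
from google.appengine.ext import ndb

class SeriesStatus(ndb.Model):
    term_index = ndb.IntegerProperty()
    partial_sum = ndb.FloatProperty()

STATUS_KEY = ndb.Key(SeriesStatus, 'current')

def run_series(checkpoint_every=1000):
    total, n = 0.0, 1
    while True:
        total += 1.0 / (n * n)         # example series: sum of 1/n^2
        if n % checkpoint_every == 0:  # the recurring write
            SeriesStatus(key=STATUS_KEY, term_index=n, partial_sum=total).put()
        n += 1
```

The frontend then just calls STATUS_KEY.get() to show the latest checkpoint.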
Your post suggests two issues.
The first is "without shutting down". We don't guarantee that backends will run indefinitely. See the docs on Shutdown for some details.
The second issue, if I'm understanding you, is that you're not seeing values written by the backend until some time after they're written. You may be running into "eventual consistency", where "eventual" is usually pretty short but can on rare occasions be surprisingly long. The docs on Understanding Isolation and Consistency can help here.

Google App Engine - How reliable are the logs?

How reliable are the Google App Engine logs?
Because logs are optimized for write speed, and the datastore is optimized for read speed, I'm thinking about storing some data by writing it to the logs rather than writing it to the datastore.
If I call Logger.info("something");, and the call succeeds, will that log entry definitely show up in the logs? Or will it sometimes silently fail?
About every hour I'll have my home computer download the logs to persist the data locally.
Although it's very unlikely, it's possible the call could silently fail, because logs are written asynchronously (or else they wouldn't be so fast). If you need reliability, using the task queue or deferred to insert a datastore entity might be a better option.
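A minimal sketch of that suggestion, assuming the first-generation Python runtime (the Measurement model is illustrative, and deferred requires the deferred builtin to be enabled in app.yaml):

```python
# Hedged sketch: enqueue the write on the task queue via deferred so that
# a failed datastore write is retried automatically instead of being lost.
from google.appengine.ext import deferred, ndb

class Measurement(ndb.Model):
    payload = ndb.TextProperty()

def save_measurement(payload):
    Measurement(payload=payload).put()

# In the request handler, instead of Logger.info("something"):
deferred.defer(save_measurement, 'something')
```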

How is the Google App Engine infrastructure fault tolerant?

I am implementing a web application on Google App Engine. So far this has taken me a huge amount of time re-designing the database and the application around GAE requirements and best practices.
My problem is this: how can I be sure that GAE is fault tolerant, or to what degree is it fault tolerant? I didn't find any GAE documentation on this, and it is an issue that could have drawbacks for me. My app would have, for example, to read an entity from the datastore, compute on it in the application, and then put it back in the datastore. In this case, how can we be sure this is done correctly and that we get the right data if, for example, the machine doing the computation crashes?
Thank you for your help!
If a server crashes during a request, that request is going to fail, but any new requests would be routed to a different server. So one user might see an error, but the rest would not. The data in the datastore would be fine. If you have data that needs to be kept consistent, you would do your updates in a transaction, so that either the whole set of updates was applied or none.
Transactions operating on the same entity group are executed serially, but transactions operating on different entity groups run in parallel. So, unless there is a single entity which everything in your app wants to read and write, scalability will not suffer from transactions.
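A minimal sketch of such a read-compute-write cycle inside a transaction, assuming ndb (the Counter model is illustrative):

```python
# Hedged sketch: the read, the computed result, and the write commit
# atomically; on contention, ndb retries the whole function.
from google.appengine.ext import ndb

class Counter(ndb.Model):
    value = ndb.IntegerProperty(default=0)

@ndb.transactional
def increment(key):
    counter = key.get() or Counter(key=key)  # read
    counter.value += 1                       # compute
    counter.put()                            # write: all-or-nothing

increment(ndb.Key(Counter, 'page-views'))
```

If the instance dies before put() commits, the transaction simply never happened, which is exactly the guarantee the question is asking about.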

.NET CF mobile device application - best methodology to handle potential offline-ness?

I'm building a mobile application in VB.NET (compact framework), and I'm wondering what the best way to approach the potential offline interactions on the device. Basically, the devices have cellular and 802.11, but may still be offline (where there's poor reception, etc). A driver will scan boxes as they leave his truck, and I want to update the new location - immediately if there's network signal, or queued if it's offline and handled later. It made me think, though, about how to handle offline-ness in general.
Do I cache as much data to the device as I can so that I can use it while offline? Essentially, each device would have a copy of the (relevant) production data on it. Or is it better to disable certain functionality when it's offline, so as to avoid the headache of synchronization later? I know this is a pretty specific question that depends on my app, but I'm curious to see if others have taken this route.
Do I build the application itself to act as though it's always offline, submitting everything to a local queue of sorts that's owned by a local class (essentially abstracting away the online/offline thing), and then have the class submit things to the server as it can? What about data lookups - how can those be handled in a "Semi-live" fashion?
Or should I have the application attempt to submit requests to the server directly, in real time, and handle it if the request itself fails? I can see the potential problem of making the user wait for a timeout, but is this the most reliable way to do it?
I'm not looking for a specific solution, but really just stories of how developers accomplish this with the smoothest user experience possible, with a link to a how-to or heres-what-to-consider or something like that. Thanks for your pointers on this!
We can't give you a definitive answer because there is no "right" answer that fits all usage scenarios. For example if you're using SQL Server on the back end and SQL CE locally, you could always set up merge replication and have the data engine handle all of this for you. That's pretty clean. Using the offline application block might solve it. Using store and forward might be an option.
You could store locally and then roll your own synchronization with a direct connection, web service, or WCF service used when a network is detected. You could use MSMQ for delivery.
What you have to think about is not what the "right" way is, but how your implementation will affect application usability. If you disable features due to lack of connectivity, is the app still usable? If you have stale data, is that a problem? Maybe some critical data needs to be transferred when you have GSM/GPRS (which typically isn't free) and more would be done when you have 802.11. Maybe you can run all day with lookup tables pulled down in the morning and upload only transactions, with the device tracking what changes it's made.
Basically it really depends on how it's used, the nature of the data, the importance of data transactions between fielded devices, the effect of data latency, and probably other factors I can't think of offhand.
So the first step is to determine how the app needs to be used, then determine the infrastructure and architecture to provide the connectivity and data access required.
I haven't used it myself, but have you looked into the "store and forward" capabilities of the CF? It may suit your needs. I believe it uses an Exchange mailbox as a message queue to send SOAP packets to and from the device.
The best way to approach this is to always work offline, then use message queues to handle sending changes to and from the device. When the driver marks something as delivered, for example, update the item as delivered in your local store and also place a message in an outgoing queue to tell the server it's been delivered. When the connection is up, send any queued items back to the server and get any messages that have been queued up from the server.
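The thread concerns VB.NET on the Compact Framework, but the outbox pattern itself is language-agnostic; here is a minimal sketch in Python (the language used elsewhere on this page), with SQLite standing in for SQL CE and every name illustrative:

```python
# Hedged sketch of the always-offline pattern: every change lands in a
# durable local outbox first, and a flusher drains it when the network is up.
import json
import sqlite3

class OutboxQueue(object):
    def __init__(self, path='outbox.db'):
        self.db = sqlite3.connect(path)
        self.db.execute('CREATE TABLE IF NOT EXISTS outbox '
                        '(id INTEGER PRIMARY KEY, body TEXT)')

    def enqueue(self, message):
        """Called by the app for every change, online or not."""
        self.db.execute('INSERT INTO outbox (body) VALUES (?)',
                        (json.dumps(message),))
        self.db.commit()

    def flush(self, post_to_server):
        """Called when connectivity is detected; post_to_server is a
        caller-supplied callable that raises IOError on failure."""
        rows = self.db.execute(
            'SELECT id, body FROM outbox ORDER BY id').fetchall()
        for row_id, body in rows:
            try:
                post_to_server(body)
            except IOError:
                return  # still offline; leave the rest queued
            self.db.execute('DELETE FROM outbox WHERE id = ?', (row_id,))
            self.db.commit()

# Usage when the driver scans a box:
#   queue.enqueue({'box': 'ABC123', 'status': 'delivered'})
# and whenever the radio comes back up:
#   queue.flush(post_to_server)
```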
