How to clear Stackdriver logs in Google Cloud Platform?

I recently realized I was paying way too much for my logs: ingestion has been growing each month.
Just today I put a limit on the ingestion (a quota of 100 instead of 6000), which will hopefully slow things down.
But as I understand it, my logs have gotten so big that I have to pay for their retention each month. I cannot figure out how to:
a) delete logs of a certain period (or just all of them)
b) make logs auto-delete after x days

The logs expire according to the retention policy:
Admin Activity: 400 days
System Events: 400 days
Data Access: 30 days
Access Transparency: 30 days
Other Logs: 30 days
Note that you're not charged for Admin Activity or System Event logs.
Exclusions and exports can help control costs, but they only apply to future entries: even if you use timestamp in the filter expression to specify a range of dates for an exclusion filter, data that has already been ingested won't be excluded. The same applies to log sinks for exporting data, since a sink only exports matching logs from the moment it is created.
You can use gcloud logging logs delete to delete all the logs for a given project or for a given resource, but you can't specify a range of time.
So, my suggestions are the following (a gcloud sketch follows the list):
1.- Delete all the existing logs for resources you don't need logging for.
2.- Create exclusions to keep only the logs you may need during the 30-day retention period.
3.- Create export sinks for all the logs you may need for more than 30 days.
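A minimal sketch of what steps 1 and 3 might look like with the gcloud CLI; the project ID, log name, and bucket name are placeholders, and you should use the exact log names printed by the list command:
# See which logs exist in the project
gcloud logging logs list --project=my-project
# 1.- Delete ALL entries of a log you no longer need (cannot be limited to a time range)
gcloud logging logs delete projects/my-project/logs/appengine.googleapis.com%2Frequest_log
# 3.- Export future matching entries to a GCS bucket for long-term retention
gcloud logging sinks create my-archive-sink storage.googleapis.com/my-log-archive-bucket --log-filter='resource.type="gae_app"'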

Related

How to find the logs for a call for a certain time period

I am working on configuring a Stackdriver alert. We need to filter a log-based metric so that it only matches logs for an hour at a time, and I just haven't found much online.
So far this is what we have for the metric in GCP:
resource.type="gae_app"
resource.labels.module_id="some_service"
logName:"log_url_here"
protoPayload.resource: "task"
Just curious if there's a way to make the filter always cover from an hour ago to now.
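As far as I can tell there is no relative-time operator in the filter language itself. For a log-based metric feeding an alert, the alerting policy's alignment window (not the metric filter) is what bounds the time range, so the metric normally doesn't need a timestamp clause at all. For ad-hoc reads you can bound the time from the gcloud side instead; a sketch, with my-project as a placeholder:
# Read only the last hour of matching entries
gcloud logging read 'resource.type="gae_app" AND resource.labels.module_id="some_service" AND protoPayload.resource:"task"' --freshness=1h --project=my-project
# Or pin an explicit RFC 3339 timestamp in the filter
gcloud logging read 'resource.type="gae_app" AND timestamp >= "2019-05-01T09:00:00Z"'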

Find the total instance hours in my Google App Engine

Where can I find the total front-end instance hours that I have used on previous days? It seems I can only see today's total.
AFAIK the actual instance-hours numbers are not directly available, at least not in the developer console.
What you might find helpful would be the historical graph of the instance usage, for the last (up to) 30 days, which you'll find in the Dashboard after selecting the Instances display mode, the desired timescale and the billed instance estimate graph:
Hover the mouse cursor over the graph and the actual billed instance estimate value will be displayed below the graph, with the corresponding date and time in the top right corner. Note that there is some noticeable delay in updating the values, though.
The graph detail level depends on the selected timescale, the coarsest being ~2h (on the 30 days timescale), so if you need more than just a general idea about usage trending you'd need to take care of:
averaging and/or integrating on a daily basis
accounting for the free daily quotas
accounting for instance class (I'm unsure if that's already taken into account or not)
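For example (a rough sketch, assuming the standard environment's 28 free front-end instance hours per day): an average billed instance estimate of 3 over a full day works out to 3 × 24 = 72 instance hours, of which roughly 72 − 28 = 44 would be billable at the F1 rate; higher classes such as F2 or F4 are billed at a multiple of that rate.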
You could also try the billing data export feature. From Export billing data to a file:
You can export your daily usage and cost estimates automatically to a CSV or JSON file stored in a Google Cloud Storage bucket you specify. You can then access the data via the Cloud Storage API, CLI tool, or Google Cloud Platform console.
...
Alternatively, you can export detailed data to a Google BigQuery dataset. For more information, see Export billing data to BigQuery.
Note: I haven't actually tried this export feature, so I can't tell if the instance hours values are in there. There's also the still-open GAE issue 10716, which suggests GAE stats might not be included.
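If you do set up the BigQuery export, a query along these lines could surface the App Engine usage amounts. This is only a sketch: the dataset/table name is a placeholder and the column names follow the billing export schema as I understand it, so verify them against your own export:
# Daily App Engine usage and cost from the billing export table
bq query --use_legacy_sql=false '
SELECT DATE(usage_start_time) AS day, sku.description AS sku,
       SUM(usage.amount) AS amount, ANY_VALUE(usage.unit) AS unit, SUM(cost) AS cost
FROM `my-project.billing_export.gcp_billing_export_v1_XXXXXX`
WHERE service.description = "App Engine"
GROUP BY day, sku
ORDER BY day'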

What is the difference between _A0 & _S0 log files when storing Google App Engine Standard logs in GCS

I have turned on the switch to send the logs of a GAE Standard app to a GCS bucket. As expected, I see a folder there for each day. For every hour of every day I see a very big JSON file with the suffix _S0.json. For some hours I also see a much smaller file with a suffix like _A0:<number>.json. For instance:
01:00:00_01:59:59_S0.json & 01:00:00_01:59:59_A0:4679580000.json
What is the difference? I am trying to post-process the files and need to know.
Logs exported to GCS are sharded, the _A0 and _S0 are simply identifiers of the logs shards.
From Log entries in Google Cloud Storage (emphasis mine):
The leaf directories (DD/) contain multiple files, each of which holds the exported log entries for a time period specified in the file name. The files are sharded and their names end in a shard number, Sn or An (n=0, 1, 2, ...). For example, here are two files that might be stored within the directory my-gcs-bucket/syslog/2015/01/13/:
08:00:00_08:59:59_S0.json
08:00:00_08:59:59_S1.json
These two files together contain the syslog log entries for all instances during the hour beginning 0800 UTC. To get all the log entries, you must read all the shards for each time period, in this case file shards 0 and 1. The number of file shards written can change for every time period depending on the volume of log entries.
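So for post-processing, treat every file for a given hour as shards of one logical set, regardless of the A/S prefix. A minimal sketch with gsutil, where the bucket and path are placeholders:
# Concatenate all shards (both A and S) for one hour into a single local file
gsutil cat 'gs://my-gcs-bucket/syslog/2015/01/13/08:00:00_08:59:59_*.json' > hour_08.json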
I got to the above page via the last link in the following section, quoted from Quotas and limits:
Logs ingestion allotment
Logging for App Engine apps is provided by Stackdriver. By default, logs are stored for an application free of charge for up to 7 days and 5GB. Logs older than the maximum retention time are deleted, and attempts to store above the free ingestion limit of 5 gigabytes will result in an error. You can update to the Premium Tier for greater storage capacity and retention length. See Stackdriver pricing for more information on logging rates and limits. If you want to retain your logs for longer than what Stackdriver allows, you can export logs to Google Cloud Storage, Google BigQuery, or Google Cloud Pub/Sub.

how do you deploy a cron script in production?

I would like to write a script that schedules various things throughout the day. Unfortunately it will do more than 100 different tasks a day, closer to 500, and could be up to 10,000 in the future.
All the tasks are independent: think of my script as a service for end users who sign up and want me to schedule a task for them. So if 5 people sign up, person A may want me to send them an email at 9 AM, while person B may want me to query an API at 10:30 PM, etc.
Conceptually, I plan to have a database that tells me what each person's task is, what time they asked to schedule it, and the frequency. Once a day I will fetch this data from my database so I have an up-to-date record of all the tasks that need to be executed that day.
Looping over them, I can create channels that execute timers or tickers for each task.
The question I have is: how does this get deployed in production to, for example, Google App Engine? Since those platforms are for web servers, I'm not sure how this would work. Or am I supposed to use Google Compute Engine and have it run the computation 24 hours a day? Can Google Compute Engine even make HTTP calls?
Also, if I have to keep, say, 500 channels in Go open 24 hours a day, does that count as 500 containers in Google App Engine? I imagine that will get very costly quickly, despite being essentially a very low-cost product.
So again the question comes back to: how does a cron script get deployed in production?
Any help or guidance will be greatly appreciated, as I have done a lot of googling and unfortunately everything leads back to a cron scheduler that has a limit of 100 tasks in Google App Engine.
Details about cron operation on GAE can be found here.
The tricky part from your perspective is that updating the cron configuration is done from outside the application, so it's at least difficult (if not impossible) to customize the cron jobs based on your app users' actions.
It is however possible to just run a generic cron job (once a minute, for example) and have that job's handler read the users' custom job configs and further generate tasks accordingly to handle them. Running ~10K tasks per day is usually not an issue, they might even fit inside the free app quotas (depending on what the tasks are actually doing).
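A minimal sketch of such a generic cron entry in cron.yaml; the handler URL and the one-minute schedule are placeholders to adapt:
cron:
- description: dispatch user-scheduled tasks
  url: /tasks/dispatch
  schedule: every 1 minutes
The /tasks/dispatch handler would then read the users' schedules from the database and enqueue a task for each one that is due.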
The same technique can be applied on a regular Linux OS (including on a GCE VM). I haven't used GCE yet, so I can't tell exactly if/how a dynamically updated cron would be possible with it.
You only need one cron job for your requirements. This cron job can run every 30 minutes - or once per day. It will see what has to be done over the next period of time, create tasks to do it, and add these tasks to the queue.
It can all be done by a single App Engine instance. The number of instances you need to execute your tasks depends, of course, on how long each task runs. You have a lot of control over running the task queue.

Datastore Quota Reached: Project quota page shows none reached

I'm receiving this error on our App:
The API call datastore_v3.Put() required more quota than is available.
However when I check out our quotas page, nothing is flagging as being over quota (or even close). We have billing enabled, we're not at our daily budget ($2 to test that it's not that - although normally $0) and these errors have been showing for over a minute (so I don't expect that it's the per-minute limits).
How can an API call fail due to being over quota, if everything seems to show that we're not over quota?
Budgets for API calls take a while to kick in. In this case, the project had hit the free limit for datastore operations (0.05M), but the increased daily budget had only just been enabled, so the app was still unable to use more operations.
The problem was solved after a couple of hours.
For others experiencing this issue, you can find the datastore free quotas here. Compare your current usage on the view in the question to these limits. If it looks like you've gone over then re-assess where your daily budget is (or whether you need so many datastore calls!).
