Is it possible to chart historical data with Google Cloud Logging?

I have some logs in Cloud Logging. I did not create metrics before the logs came in, so I do not have any metrics containing this data. I'd like to visualize this data in a chart. Is this possible?
I see from this answer (Can't display data with log-based metric) and the docs:
The data for logs-based metrics comes from log entries received after the metrics are created. The metrics are not populated with data from log entries that are already in Logging.
that Metrics only contain log entries from after the Metric was created. Therefore, it seems impossible to chart historical data using a Metric that was created after the data.
The only ways I've found to create a chart are using Metrics Explorer and the Monitoring dashboard. Both of these ultimately require a Metric with the data, which I am not able to create. Are there any other ways to chart data that don't require a Metric? If not, does this mean it's impossible to chart historical data with Cloud Logging/Monitoring?

Related

How do I filter and view daily logs in Google App Engine?

I have an express API application running on GAE. It is my understanding that every time someone makes a GET request to it, it creates a log. In the Logs Explorer (Operations -> Logging -> Logs Explorer), I can filter to view only GET requests from a certain source by querying:
protoPayload.method="GET"
protoPayload.referrer="https://fakewebsite.com/"
In the top-right of the Logs Explorer, I can also select a time range of 1 day to view logs from the past 24 hours.
I want to be able to see how many GET requests the app receives from a given referrer every day. Is this possible? And is there functionality to display the daily logs in a bar chart (say, to easily visualize how many logs I get every day over the period of a week)?
You can achieve this, but not directly in the Cloud Logging service. You have started well: you have created a custom filter.
Then, create a sink with that filter and save the logs to BigQuery.
Now you can run a query to count the GET requests per day, and you can build a Data Studio dashboard to visualise your logs.
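A minimal sketch of creating such a sink with the google-cloud-logging Python client, assuming hypothetical project, dataset, and sink names (the filter is the one from the question):
from google.cloud import logging

client = logging.Client()

# Filter from the question; the destination dataset must already exist.
LOG_FILTER = (
    'protoPayload.method="GET" '
    'protoPayload.referrer="https://fakewebsite.com/"'
)
DESTINATION = "bigquery.googleapis.com/projects/my-project/datasets/request_logs"

sink = client.sink("get-requests-sink", filter_=LOG_FILTER, destination=DESTINATION)
sink.create(unique_writer_identity=True)

# Grant this service account write access on the dataset so exports succeed.
print(sink.writer_identity)
Note that, as with log-based metrics, a sink only exports log entries received after it is created.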
If only the count is needed on a daily basis, you can create a sink to stream the data directly into BigQuery. Since the data needs to be segregated by day, a better option when creating the sink is to use a partitioned table, which helps in two ways:
You get a new partition for each day.
Although BigQuery provides a free tier, if this data is not needed in the near future, storing it this way will reduce both your storage and querying costs.
BigQuery integrates with Data Studio: as soon as you query a table, you'll have the option to explore the result in Data Studio and generate reports as needed.
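For illustration, the daily count could then be a query like the sketch below, using the google-cloud-bigquery Python client. The table and column names are assumptions (Cloud Logging derives them from the log name and payload type), so check them against the actual schema of your exported table:
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table and column names; Cloud Logging names exported tables
# after the log and flattens protoPayload fields. Verify against your schema.
sql = """
SELECT DATE(timestamp) AS day, COUNT(*) AS get_requests
FROM `my-project.request_logs.appengine_googleapis_com_request_log`
WHERE protopayload_requestlog.method = 'GET'
  AND protopayload_requestlog.referrer = 'https://fakewebsite.com/'
GROUP BY day
ORDER BY day
"""

for row in client.query(sql):
    print(row.day, row.get_requests)
From the query result page you can then explore the data in Data Studio and render the per-day counts as a bar chart.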

Data in database and google analytics does not match

Why are the counts I see in my database different from what I see in Google Analytics? The goal conversion number showing in Google Analytics is much lower than what I see in the database. This has been the case for several months.
A few reasons here:
Sampled data vs. unsampled data: you can read about it here: https://support.google.com/analytics/answer/1042498?hl=en - For API work I normally use the Query Explorer to verify that my API calls are being sent and that the responses match the data: https://ga-dev-tools.appspot.com/explorer/
Adblockers: you might get hits/submissions from people who are using an ad blocker, hence more entries in the database than in Google Analytics.
Users vs. Sessions vs. Hits: you may be looking at unique visitors/sessions in Google Analytics instead of the total number of events. Not sure how your goal is set up, but it is best to use events and look at "Total Events" and "Unique Events" to get a sense of the real counts.
Implementation: you may be firing the JavaScript after the person has hit the button without waiting for the tracking call to complete before the page changes; this can happen on sites where you take the user to a thank-you page or similar. Best to check how this is set up and the order in which the tag fires relative to the page change.

Cloudant CDTDatastore to pull only part of the database

We're using Cloudant as the remote database for our app. The database contains documents for each user of the app. When the app launches, we need to query the database for all the documents belonging to a user. What we found is that the CDTDatastore API only allows pulling the entire database, storing it inside the app, and then performing the query on the local copy. The initial pull to the local datastore takes about 10 seconds, and I imagine it will take longer as more users are added.
Is there a way I can save only part of the remote database to the local datastore? Or, are we using the wrong service for our app?
You can use a server-side replication filter function; you'll need to add information about your filter to the pull replicator. However, replication will take a performance hit when using the function.
That being said, a common pattern is to use one database per user; however, this has other trade-offs and it is something you should read up on. There is some information on the one-database-per-user pattern here.
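For illustration, a server-side filter is just a JavaScript function stored in a design document on the Cloudant database. A minimal sketch of installing one over HTTP from Python, assuming a hypothetical account, database name, and an owner field on each document:
import requests

# Hypothetical account, credentials, and database name.
CLOUDANT = "https://ACCOUNT.cloudant.com"
AUTH = ("API_KEY", "API_PASSWORD")

# The filter body is JavaScript, evaluated server-side for each candidate
# document: only documents whose "owner" matches the replicator's "user"
# query parameter are replicated.
design_doc = {
    "_id": "_design/app",
    "filters": {
        "by_user": "function(doc, req) { return doc.owner === req.query.user; }"
    },
}

resp = requests.put(CLOUDANT + "/mydb/_design/app", json=design_doc, auth=AUTH)
resp.raise_for_status()
The pull replication would then reference the filter as app/by_user and pass the current user as the user parameter; see the CDTDatastore replication documentation for how to attach a filter name and parameters to the pull replicator.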

How to find Uniqueness in Azure Usage Records from Billing APIs

I am building an Azure Chargeback solution, and for that I am pulling the Azure usage data from the Azure Billing REST APIs for multiple subscriptions and different dates. I need to store this in a custom MS SQL database as per the customer’s requirements. I get various usage records from Azure.
Problem: From these usage records, I am not able to find any combination of the columns in the data I receive that will give me a unique key to identify a usage record for a particular subscription and for a specific date. The only column I see that differs is Quantity, but even that can be duplicated. E.g. if there are 2 VMs of type A1 with no data or applications on them, in the same cloud service, then they will have the exact same usage quantity. I do not get the exact name of the VM or any other resource via the Usage APIs.
One Custom Solution (Ineffective): I can append a counter or unique ID to the usage records, but if I fetch the data next time, the order may shuffle or new data may be introduced, thereby affecting the logic for uniqueness. Any logic I build for checking whether any data is missing in the DB will result in bugs if there is any alteration in the order in which the usage records are returned (for a specific subscription for a specific date).
I am sure that Microsoft stores this data in some database, but I can’t find a unique ID to identify a usage record among the many records returned by the Billing API. Maybe I am missing something here.
I will appreciate any help or any pointers on this.
When you call the Usage API, set the ShowDetails parameter to true: &showDetails=true
MSDN Doc
This will populate the instance data in the returned JSON with the unique URI for the resource, which includes its name. For example:
Website:
"instanceData": "{\"Microsoft.Resources\":{\"resourceUri\":\"/subscriptions/xxx-xxxx/resourceGroups/mygoup/providers/Microsoft.Web/sites/WebsiteName\",\"...
Virtual Machine:
"instanceData": "{\"Microsoft.Resources\":{\"resourceUri\":\"/subscriptions/xxx-xxxx/resourceGroups/TESTINGBillGroup/providers/Microsoft.Compute/virtualMachines/TestingBillVM\",\...
If ShowDetails is false, all your resources will be aggregated on the server side based on the resource type; for example, all your websites will show as one entry.
The resource URI, date range, and meter ID will form a unique key, as far as I know.
NOTE: If you are using the legacy API your VMs will be aggregated under the cloud service hosting them.
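To make that uniqueness claim concrete, here is a sketch in Python of deriving a composite key from one detailed usage record. The property names (usageStartTime, usageEndTime, meterId, instanceData) follow the JSON above and the Usage API docs, but treat them as assumptions to verify against your actual payload:
import json

def usage_record_key(record):
    # record: one element of the "value" array returned by the Usage API,
    # called with showDetails=true so that instanceData is populated.
    props = record["properties"]
    instance = json.loads(props["instanceData"])
    resource_uri = instance["Microsoft.Resources"]["resourceUri"]
    return (
        resource_uri.lower(),     # resource URIs are case-insensitive
        props["usageStartTime"],  # date range covered by this record
        props["usageEndTime"],
        props["meterId"],         # which meter was consumed
    )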

Using analytics raw data in BigQuery

Is it possible to load my own analytics raw data (from my website tracked with GA) using a GAP account, and then do deeper analysis with BigQuery?
If you contact your GAP re-seller, then he/she should enable the BigQuery integration for your account and you will not need to do anything. Otherwise, if you have raw analytics data, you can upload it into BigQuery yourself (but you will need to handle the update process if this is not a one-off).
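For the self-upload path, a minimal sketch with the google-cloud-bigquery Python client, assuming a hypothetical dataset/table and a CSV export of your raw hits:
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical target table and source file for your exported raw hits.
table_id = "my-project.analytics_raw.hits"
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,  # infer the schema from the file
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

with open("raw_hits.csv", "rb") as f:
    job = client.load_table_from_file(f, table_id, job_config=job_config)
job.result()  # wait for the load job to finish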
