Charting data store entity count - google-app-engine

To track my app's growth, I would like to see a chart of how many new entities of a certain type are created in the datastore per day.
I thought about creating a cron job that would store, at midnight, the count of all entities created in the past 24 hours, but maybe there is a simpler way?

I ended up using Mixpanel and Google Analytics. Whenever I create a new entity, I fire an event to these third-party analytics services. Then I get nice charts to look at later without having to schedule my own cron jobs or make any charts.
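The event-per-creation pattern described above can be sketched as follows. This is an illustrative stand-in, not a specific SDK: `track_event` represents whatever analytics client call you actually use (e.g. Mixpanel's `track` or a Google Analytics hit), injected so the pattern is testable offline.

```python
from datetime import datetime, timezone

def create_entity(store, kind, data, track_event):
    """Save a new entity and fire one analytics event for it.

    `store` is any dict-like container standing in for the datastore;
    `track_event` is the analytics client call, injected so this
    sketch runs without network access.
    """
    entity = dict(data, kind=kind, created=datetime.now(timezone.utc))
    store.setdefault(kind, []).append(entity)
    # One event per creation gives the analytics service the raw
    # stream it needs to chart "new entities per day" for you.
    track_event("entity_created", {"kind": kind})
    return entity
```

The point of the design is that counting and charting move entirely to the analytics service; the app only emits one event per write.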

Related

How do I filter and view daily logs in Google App Engine?

I have an express API application running on GAE. It is my understanding that every time someone makes a GET request to it, it creates a log. In the Logs Explorer (Operations -> Logging -> Logs Explorer), I can filter to view only GET requests from a certain source by querying:
protoPayload.method="GET"
protoPayload.referrer="https://fakewebsite.com/"
In the top-right of the Logs Explorer, I can also select a time range of 1 day to view logs from the past 24 hours.
I want to be able to see how many GET requests the app receives from a given referrer every day. Is this possible? And is there functionality to display the daily logs in a bar chart (say, to easily visualize how many logs I get every day over the period of a week)?
You can achieve this, but not directly in the Cloud Logging service. You have started well by creating a custom filter.
Next, create a sink and save the matching logs to BigQuery.
From there, you can run a query that counts the GET requests per day, and you can build a Data Studio dashboard to visualise your logs.
If only the daily count is needed, you can create a sink to stream the data directly into BigQuery. Since the data needs to be segregated by day when creating the sink, a partitioned table is the better option, which helps you in two ways:
You get a new partition (effectively a new table) every day.
Although BigQuery provides a free tier, if this data is not needed in the near future, storing it this way reduces both storage and querying cost.
BigQuery integrates with Data Studio: as soon as you query a table, you'll have the option to explore the result in Data Studio and generate reports as needed.
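Once the sink has landed the logs in a day-partitioned table, the daily count is a plain GROUP BY over the date. As a hedged local sketch of that same aggregation (field names mirror the `protoPayload` filter above, but the exact schema depends on your sink configuration):

```python
from collections import Counter

def daily_get_counts(log_rows, referrer):
    """Count GET requests per day for one referrer.

    `log_rows` are dicts with 'timestamp' (ISO 8601 string),
    'method', and 'referrer' keys -- a stand-in for the rows the
    sink streams into BigQuery.
    """
    counts = Counter(
        row["timestamp"][:10]  # keep just the YYYY-MM-DD part
        for row in log_rows
        if row["method"] == "GET" and row["referrer"] == referrer
    )
    return dict(counts)
```

In BigQuery itself this becomes a `COUNT(*) ... GROUP BY DATE(timestamp)` query, whose result Data Studio can render directly as a bar chart.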

Is it possible to chart historical data with Google Cloud Logging?

I have some logs in Cloud Logging. I did not create metrics before the logs came in, so I do not have any metrics containing this data. I'd like to visualize this data in a chart. Is this possible?
I see from this answer (Can't display data with log-based metric) and the docs:
The data for logs-based metrics comes from log entries received after the metrics are created. The metrics are not populated with data from log entries that are already in Logging.
that Metrics only contain log entries from after the Metric was created. Therefore, it seems impossible to chart historical data using a Metric that was created after the data.
The only ways I've found to create a chart are using Metrics Explorer and the Monitoring dashboard. Both of these ultimately require a Metric with the data, which I am not able to create. Are there any other ways to chart data that don't require a Metric? If not, does this mean it's impossible to chart historical data with Cloud Logging/Monitoring?

Firestore retrieving and saving big documents

I'm developing an app to track how many sales a single event has made, plus some other features that don't matter directly to this problem.
I made an attempt to use Firestore to save my data, but something tells me I was using it the wrong way. Every event has somewhere around 2k (and up to 20k~40k) sales entries. Firebase Realtime Database doesn't seem like a good idea because of the data duplication needed for the relations I have to create.
The most important parts of the tech stack I'm using:
React Native
React Native Firebase (native solution for react native on firebase)
Redux
The problem is that whenever I try to retrieve those documents for, let's say, a custom report for my clients, the app crashes or freezes entirely. Talking to other developers, they said that Firestore may not be a good solution for my case because of this 'sort of' big data retrieval.
Structure
Organizations/organization_id
  Org_name
  members (array of userIds)
Events/event_id/
  Event name
  Event sales data (array)
  Event product list (array)
  Event date
  Event attendees (array of attendees: name, pin (int))
  Organization name
I've also read that Firestore has a limit of around 20k index entries per document (or something like that). A friend of mine who has more experience told me that an SQL database and a normal API would probably solve the problem, but might require more work since I am a single developer.
Do you think that Firestore is a good solution and I was probably using it the wrong way?
Or would you say that Firestore is not suited for a problem where I have to save and retrieve data with such relationships?

How to schedule repeated jobs or tasks from user parameters in Google App Engine?

I'm using Google App Engine and I would like to be able to schedule jobs based on users' parameters.
I know this can be done with cron jobs, but that does not seem to allow any flexibility from the user's point of view; it only allows scheduling predefined jobs.
For example, suppose I have a news app where users can subscribe to different topics: I want the admin to be able to decide when to send a summary email, for instance, every day at 8am, and I want him to be able to edit this.
Is there anything that provides this?
You may want to star Issue 3638: Cron jobs to be scheduled programmatically.
Meanwhile, you can write your own implementation: have a generic cron job running periodically (every 1 minute being the finest resolution), and inside that cron job check the user-programmed scheduling data persisted somewhere (in the datastore, for example). If anything is due, trigger its execution, either inline or by enqueueing a task in some task queue.
It's possible to drive the scheduling resolution even under 1 minute, if needed, see High frequency data refresh with Google App Engine
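A minimal sketch of that generic cron job, assuming each user-defined schedule is persisted with a `next_run` timestamp and a repeat `interval` (the field names and the `enqueue` hook are illustrative, not App Engine API):

```python
from datetime import datetime, timedelta

def run_due_schedules(schedules, now, enqueue):
    """One cron tick: fire every schedule whose next_run has passed.

    `schedules` is a list of dicts persisted in the datastore, each
    with 'task', 'next_run' (datetime), and 'interval' (timedelta).
    `enqueue` pushes the actual work onto a task queue.
    """
    for sched in schedules:
        if sched["next_run"] <= now:
            enqueue(sched["task"])
            # Advance next_run so the next tick does not re-fire it.
            sched["next_run"] = now + sched["interval"]
```

The cron handler itself stays fixed (so cron.yaml never changes); all user-editable scheduling lives in the persisted records, which the admin UI can update freely.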
For my debts tracking app DebtsTracker.io I've implemented it manually.
When a user creates a debt record, they can specify a due date, which is stored in an unindexed DueDate field and an indexed ReminderDateTime field.
I have a cron job that queries records with ReminderDateTime < now and sends notifications. Once a notification is sent, ReminderDateTime is set to null (or far in the future) so the record isn't picked up in the next cron run. If the user hits Remind me again, I update ReminderDateTime to some date in the future (the user decides when).
If ReminderDateTime is closer than the cron interval, I simply put a task in the queue with the appropriate delay.
This works very well and is very cheap to run.
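The ReminderDateTime pattern above boils down to the following sketch (a pure-Python stand-in: sending and datastore persistence are stubbed out, and the field name matches the answer's description):

```python
from datetime import datetime

def send_due_reminders(records, now, send):
    """Cron body: notify every record whose reminder has come due.

    Each record is a dict with a 'ReminderDateTime' key (datetime or
    None). After sending, the field is cleared so the next cron run
    skips the record, exactly as the answer describes.
    """
    sent = 0
    for rec in records:
        due = rec.get("ReminderDateTime")
        if due is not None and due < now:
            send(rec)
            rec["ReminderDateTime"] = None  # or a far-future date
            sent += 1
    return sent
```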

Pulling facebook and twitter status updates into a SQL database via Coldfusion Page

I'd like to set up a coldfusion page that will pull the status updates from my own facebook account and twitter accounts and put them in a SQL database along with their timestamps. Whenever I run this page it should only grab information after the most recent time stamp it already has within the database.
I'm hoping this won't be too bad because all I'm interested in is just status updates and their time stamps. Eventually I'd like to pull other things like images and such, but for a first test just status updates is fine. Does anyone have sample code and/or pointers that could assist me in this endeavor?
I'd like it if any information relates to the current version of the apis (twitter with oAuth and facebook open graph) if they are necessary. Some solutions I've seen involve the creation of a twitter application and facebook application to interact with the APIs; is that necessary if all I want to do is access a subset of my own account information? Thanks in advance!
I would read the max(insertDate) from the database and if the API allows you, only request updates since that date. Then insert those updates. The next time you run you'll just need to get the max() of the last bunch of updates before calling for the next bunch.
You could run it every 5 minutes using a ColdFusion scheduled task.
How you communicate with the API is usually via <cfhttp />. One thing I always do is log every request and response, either in a text file or in a database. That can be invaluable when troubleshooting.
Hope that helps.
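Independent of ColdFusion, the max(insertDate) idea reduces to a high-water-mark filter, sketched here in Python (the `timestamp` field name is illustrative; in the real app it maps to the column you store):

```python
def new_updates(fetched, last_seen):
    """Keep only status updates newer than the stored max timestamp.

    `fetched` is whatever the API returned, each item a dict with a
    comparable 'timestamp' value; `last_seen` is max(insertDate) read
    from the database, or None on the very first run.
    """
    fresh = [u for u in fetched if last_seen is None or u["timestamp"] > last_seen]
    # The new high-water mark to store for the next scheduled run.
    new_max = max((u["timestamp"] for u in fresh), default=last_seen)
    return fresh, new_max
```

Each scheduled run inserts `fresh` and persists `new_max`, so no update is ever stored twice.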
Use the cffeed tag to pull RSS feeds from Twitter and Facebook. Retain the date of the last feed scan somewhere (application variable or database) and loop over the feed entries. Any entry older than the last scan is ignored; everything else gets committed. Make sure to wrap cffeed in a try/catch, as it will throw errors if the service is down (ahem, Twitter). As mentioned in other answers, set it up as a scheduled task.
<cffeed action="read" properties="feedMetadata" query="feedQuery"
source="http://search.twitter.com/search.atom?q=+from:mytwitteraccount" />
Different approach than what you're suggesting, but it worked for us. We had two live events where we asked people to post to a bespoke Facebook fan page, or to Twitter with a hashtag we endorsed for the event, in real time. Then we fetched and parsed the RSS feeds of the FB page and the Twitter search results on a short interval, extracting what was new; I think it was approximately every three minutes. CFFEED was a little error-prone and wonky; just doing a CFHTTP get of the RSS feeds and then processing the CFHTTP.filecontent struct item as XML worked fine.