I have data generated on a daily basis. Let me explain through an example:
On the world market, the price of gold changes at intervals of seconds, and I want to store that price in Redis.
Gold 22 JAN 11.02PM X-price
22 JAN 11.03PM Y-Price
...
24 DEC 11.04PM X1-Price
Silver 22 JAN 11.02PM M-Price
22 JAN 11.03PM N-Price
I want to store this data on a daily basis and apply ML (machine learning) to the last 52 weeks of data. Is this possible?
As far as my knowledge goes, Redis works on key-value pairs.
If this is possible, can I get data for a specific date (04 July) and for a date range (01 Feb to 31 Mar)?
In Redis, a Sorted Set is appropriate for time series data. If you score each entry with the timestamp of the price quote, you can quickly access a whole day or group of days using the ZRANGEBYSCORE command (or ZSCAN if you need paging).
The quote data can be stored right in the sorted set. If you do this, make sure each entry is unique: adding a record to a sorted set that is identical to an existing one just updates the existing record's score (timestamp). This moves the old record to the present and erases it from the past, which is not what you want.
I would recommend storing only a unique key/ID for each quote in the sorted set, and keeping the data in its own key or hash field. This will allow you to create additional indexes for the data as needed and access specific records more easily if necessary.
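As a sketch of that index-plus-hash layout: the sorted set holds only quote IDs scored by timestamp, and each quote lives in its own hash. The Python below emulates the pattern in memory (one store per metal) so it can be read end to end; the comments name the equivalent redis-py calls (zadd, zrangebyscore, hset, hgetall), and all key names are illustrative, not prescribed.

```python
import bisect

class QuoteStore:
    """In-memory stand-in for one per-metal sorted set plus per-quote hashes."""
    def __init__(self):
        self.index = []   # sorted (timestamp, quote_id) pairs ~ the sorted set
        self.hashes = {}  # quote_id -> field dict             ~ one hash per quote

    def record(self, metal, ts, price):
        quote_id = f"price:{metal}:{ts}"   # unique ID per quote
        # redis-py: r.hset(quote_id, mapping={"metal": ..., "ts": ..., "price": ...})
        self.hashes[quote_id] = {"metal": metal, "ts": ts, "price": price}
        # redis-py: r.zadd(f"prices:{metal}", {quote_id: ts})
        bisect.insort(self.index, (ts, quote_id))

    def range(self, start_ts, end_ts):
        """All quotes whose timestamp falls in [start_ts, end_ts]."""
        # redis-py: r.zrangebyscore(f"prices:{metal}", start_ts, end_ts)
        lo = bisect.bisect_left(self.index, (start_ts,))
        hi = bisect.bisect_right(self.index, (end_ts, "\uffff"))  # past any id at end_ts
        return [self.hashes[qid] for _, qid in self.index[lo:hi]]
```

A day or a date range (01 Feb to 31 Mar, say) then becomes a single `range(start_ts, end_ts)` call with the corresponding epoch timestamps, and the 52-week ML window is just a wider range.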
In Salesforce, how do I create a formula that calculates the highest figure for last month? For example, I have an object that keeps records created in September, and I would like to calculate the max value from the previous month (August), which in this case should be 20, on 3/8/2019. If it's July, then I need to calculate for June. How do I construct the right formula expression? Thanks very much!
Date Value
1/9/2019 10
1/8/2019 14
2/8/2019 15
3/8/2019 20
....
30/8/2019 15
You can't do this with normal formulas on records because they "see" only current records (and some related ones via lookup), not other rows in the same table.
You could make another object called "accounting periods" or something like that. Link all these entries to periods (months) in a master-detail relationship. You'll then be able to use a rollup summary with MAX(). Still not great, because you need a lookup to the previous month to pull it, but it should give you an idea.
You could make a report that achieves something like that. PREVGROUPVAL will let you do some amazing & scary stuff. https://trailhead.salesforce.com/en/content/learn/projects/rd-summary-formulas/rd-compare-groups Then... if all you need is a report - great. If you really need it saved somewhere - you could look into reporting snapshots & save results in helper object...
If you want to do it without any data model changes like that master-detail or helper object, you could also write some code. A nightly batch job (running daily? only on the 1st day of the month?) should be pretty simple.
Without code, in a pinch you could make a Flow that queries records from the previous month. It's a bit expensive to run such a thing for August every time you add a September record, but if you've discarded the other options...
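For what the batch-job route would actually compute, here is the previous-month lookup sketched in Python (in Apex you would express the same thing as a SOQL aggregate query); the record layout mirrors the example table in the question, and the day/month/year values are taken from it.

```python
from datetime import date

def previous_month_max(records, today):
    """Max value among (date, value) records falling in the month before `today`."""
    if today.month > 1:
        prev_year, prev_month = today.year, today.month - 1
    else:
        prev_year, prev_month = today.year - 1, 12
    values = [v for d, v in records if (d.year, d.month) == (prev_year, prev_month)]
    return max(values) if values else None

# The table from the question, dates read as day/month/year.
records = [
    (date(2019, 9, 1), 10),
    (date(2019, 8, 1), 14),
    (date(2019, 8, 2), 15),
    (date(2019, 8, 3), 20),
    (date(2019, 8, 30), 15),
]
```

Run in September, this returns 20 (the 3/8/2019 row); run in July, it would look at June instead, matching the behaviour the question asks for.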
This should be simple so I think I am missing it. I have a simple line chart that shows Users per day over 28 days (X axis is date, Y axis is number of users). I am using hard-coded 28 days here just to get it to work.
I want to add a scorecard for average daily users over the 28 day time frame. I tried to use a calculated field AVG(Users) but this shows an error for re-aggregating an aggregated value. Then I tried Users/28, but the result oddly is the value of Users for today. The division seems to be completely ignored.
What is the best way to show average number of daily users over a time frame? Average daily users over 10 days, 20 day, etc.
Try creating a new metric that counts the dates, e.g.
Count of Date = COUNT(Date) or
Count of Date = COUNT_DISTINCT(Date) in case you have duplicate dates
Then create another metric for average users
Users AVG = (Users / Count of Date)
The average depends on the timeframe you have selected. If you are selecting the last 28 days the average is for those 28 days (dates), if you filter 20 days the average is for those 20 days etc.
Hope that helps.
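The same Users / Count of Date division, sketched in plain Python over hypothetical (date, users) rows, to show why the distinct-date count turns a total into a per-day average over whatever timeframe is selected:

```python
def avg_daily_users(rows):
    """Average users per day over (date, users) rows, any timeframe."""
    total = sum(users for _, users in rows)
    days = len({d for d, _ in rows})   # equivalent of COUNT_DISTINCT(Date)
    return total / days

# Three days of illustrative data.
rows = [("2023-01-01", 100), ("2023-01-02", 140), ("2023-01-03", 120)]
```

Filter the rows to 28 days and you get a 28-day average; filter to 20 and you get a 20-day average, with no hard-coded divisor.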
I have been able to do this in an extremely crude and ugly manner, using Google Sheets as a means to do the calculation and serve as a data source for Data Studio.
This may be useful for other people trying to do the same thing. This assumes you know how to work with GA data in Sheets and are starting with a Report Configuration. There must be a better way.
Example for Average Number of Daily Users over the last 7 days:
Edit the Report Configuration fields:
Report Name: create one report per day, in this case 7 reports. Name them (for example) Users-1 through Users-7. These are your Row 2 values. You'll have 7 columns, with the first report name in column B.
Start Date and End Date: use TODAY()-X where X is the number of days previous to define the start and end dates for each report. Each report will contain the user count for one day. Report Users-1 will use TODAY()-1 for start and end, etc.
Metrics: enter the metrics, e.g. ga:users and ga:newUsers
Create the reports
Use 'Run reports' to have the result sheets created and populated.
Create a sheet for an interim data set you will use as the basis for the average calculation. The first column is date, the remaining columns are for the metrics, in this case Users and New Users.
Populate the interim data set with the dates and values. You will reference the Report Configuration to get the dates, and you will pull the metrics from each of the individual reports. At this stage you have a sheet with the date in the first column and values in subsequent columns, with a row for each day's values. Be sure to use a header.
Finally, create a sheet that averages the values in the interim data set. This sheet will have a column for each metric, with one value per column. The one value is calculated from the series in the interim data set, for example =AVERAGE(interim_sheet_reference:range), or any other calculation you'd like to do.
At last, you can use Data Studio to connect to this data source and use the values. For counts of users such as this example, you would use Sum as the aggregation field type when you are creating the data source.
It's super ugly but it works.
I am having trouble working out the logic for this little scenario. Basically I have a data set that is stored by week of the year, and each week the previous week's data is deleted from the data set. What I need to do is copy the previous week's data before it's removed from the data set, and then add it back after it's removed. So, for example, if today is week 33, I need to save this and then next week add it back in. Then next week I need to take week 34 and save that, to add in during week 35. A picture explains better than a thousand words, so here it is.
As you can see, I need the minimum week from the data set before I add the previous week's data. The real issue I'm finding is that the data set can be rerun more than once each week, so I would need to keep the temp data set until the next week while extracting the minimum week's data.
It's more the logic I'm after here... Hope it makes sense, and thanks in advance.
QVDs are the way forward! Although maybe not as another (very good) answer states.
// Load of data from the source system
Test:
Load *,
     Today() as RunDate
From SourceData;

// Load of historical data from the QVD (same fields, so it
// implicitly concatenates into Test)
// (on the very first run Test.qvd won't exist yet, so guard this
// load or create an empty QVD first)
Load *
From Test.qvd (qvd);

// Store the combined table back into the QVD
Store Test into Test.qvd;
This way you only have one QVD of data that continually expands.
Some warnings
You will need to bear in mind that the report runs multiple times a week, so you will need to cater for duplication in the data load.
QVD files aren't encrypted, so put your data somewhere safe.
When loading from a QVD and then overwriting it, if something goes wrong (the load fails) you will need to recover your QVD, so make sure your backup solution is up to the task.
I also added the RunDate field so that it is easier for you to take the data apart when reviewing, as this gives you the same split as storing in separate QVDs would.
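One common way to cater for the duplication warned about above is a Where Not Exists() clause on the historical load. A sketch, assuming some field (here Week) identifies the period a row belongs to; if no such field exists in your data you would need to derive one:

```
// Incremental pattern that skips historical rows whose period
// was already loaded fresh from the source this run.
Test:
Load *,
     Today() as RunDate
From SourceData;

// Only pull weeks from the QVD that the source load did not supply.
Concatenate (Test)
Load *
From Test.qvd (qvd)
Where Not Exists(Week);

Store Test into Test.qvd;
```

On a rerun within the same week, the current week comes from the source and the QVD contributes only the older weeks, so nothing is duplicated.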
Sounds like you should store the data out into weekly QVD files as part of an Extract process and then load the resulting files in.
The logic would be something like the below...
First run (week 34 for week 33 data):
Get data for previous week
Store into file correctly dated - e.g. 2016-33 for week 33 of 2016
Drop this table
Load all QVDs (in this case just 1)
Next week run (week 35 for week 33 & 34 data):
Get data for previous week
Store into file correctly dated - e.g. 2016-34 for week 34 of 2016
Drop this table
Load all QVDs (in this case 2)
Repeat run in the same week (week 35 again, for week 33 & 34 data):
Get data for previous week
Store into file correctly dated - e.g. 2016-34 for week 34 of 2016 (this time overwrite it)
Drop this table
Load all QVDs (in this case 2)
Sensible file naming solves the problem, but if you really need to inspect the data to check the week number, you would probably need to first load all existing QVDs, query the minimum week number, and take it from there.
What am I doing wrong in this query?
SELECT * FROM TreatmentPlanDetails
WHERE
accountId = 'ag5zfmRvbW9kZW50d2ViMnIRCxIIQWNjb3VudHMYtcjdAQw' AND
status = 'done' AND
category = 'chirurgia orale' AND
setDoneCalendarEventStartTimestamp >= [timestamp for 6 june 2012] AND
setDoneCalendarEventStartTimestamp <= [timestamp for 11 june 2012] AND
deleteStatus = 'notDeleted'
ORDER BY setDoneCalendarEventStartTimestamp ASC
I am not getting any records, and I am sure there are records meeting the WHERE clause conditions. To get the correct records I have to widen the timestamp interval by 1 millisecond. Is this normal? Furthermore, if I modify this query by removing the category filter, I get the correct results. This is definitely weird.
I also asked on Google Groups, but got no answer. Anyway, for details:
https://groups.google.com/forum/?fromgroups#!searchin/google-appengine/query/google-appengine/ixPIvmhCS3g/d4OP91yTkrEJ
Let's talk specifically about creating timestamps to go into the query. What code are you using to create the timestamp record? Apparently that's important, because fuzzing with it a little bit affects the query. It may be relevant that in the datastore, timestamps are recorded as integers representing posix timestamps with microseconds, i.e. the number of microseconds since 1/1/1970 UTC (not counting leap seconds). It's also relevant that dates (i.e. without a time) are represented as midnight, i.e. the earliest time on that day. But please show us the exact code. (It may also be important to show the actual content of the record that you're attempting to retrieve.)
An aside that is not specific to your question: Entity property names count as part of your storage quota. If this is going to be a huge dataset, you might pay more $$ than you'd like for property names like setDoneCalendarEventStartTimestamp.
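To illustrate the midnight point: if the query bounds were built from bare dates, the upper bound falls at the very start of 11 June, excluding the rest of that day, which is consistent with having to widen the interval to catch a record. A sketch of inclusive bounds in Python (the datastore keeps datetime values to microsecond precision); how the question's timestamps were actually built is the unknown here:

```python
from datetime import datetime

# A bare date means midnight, the earliest instant of that day.
start = datetime(2012, 6, 6)                       # 6 June 2012, 00:00:00

# An inclusive upper bound for "11 June 2012" must extend to the
# last representable instant of that day, not its midnight.
end = datetime(2012, 6, 11, 23, 59, 59, 999999)
```

With `end = datetime(2012, 6, 11)` instead, any record stamped later than midnight on 11 June would fail the `<=` filter, even though it belongs to that day.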
Because you write:
if I modify this query by removing the category filter, I am getting
the correct results
this probably means that the category was not indexed at the time you wrote the matching records to the datastore. You have to re-write your records to the datastore if you want them added to the newly created index.
I need some help with the following:
I am setting up a booking system (a kind of hotel booking) and I have inserted a check-in date and a check-out date into the database. How do I check whether a room is already booked?
I have no clue how to manage the already-booked days in the database. Can anyone give me a clue as to how this works? And maybe where I can find more information?
Well, I didn't understand your question very well, but my suggestion is to add a state field with which you can control the current state of the "booked" item, something like:
Available
Under Maintenance
Occupied
Or whatever bunch of states that work for you.
[EDIT]
The analysis that I use to do for that case is as follows:
Take for instance, your room is currently booked with these date range:
Init Date: Feb 8
End Date: Feb 14
Success Booking Examples
Init Date: Feb 2
End Date: Feb 6
Init Date: Feb 15
End Date: Feb 24
You should check that the booking attempt satisfies these conditions:
Neither "Booking Init Date" nor "Booking End Date" can be inside the already booked date range.
Failing examples:
Init Date: Feb 2
End Date: Feb 10 (Inside the current range (Feb 8 to 14))
Init Date: Feb 12 (Inside the current range (Feb 8 to 14))
End Date: Feb 27
If "Booking Init Date" is earlier than the current init date, "Booking End Date" must also be earlier than the current init date.
Init Date: Feb 2
End Date: Feb 27 (the init date is before, but the end date is later, so it fails)
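The conditions above collapse into the standard interval-overlap test: a new booking conflicts if and only if it starts on or before the existing end and ends on or after the existing start. A sketch using the February dates from the examples; for hotel semantics where the check-out day can double as the next guest's check-in day, the comparisons would become strict.

```python
from datetime import date

def overlaps(new_start, new_end, booked_start, booked_end):
    """True if the two date ranges share at least one day."""
    return new_start <= booked_end and new_end >= booked_start

# The existing booking from the example: Feb 8 to Feb 14.
booked = (date(2024, 2, 8), date(2024, 2, 14))
```

One comparison covers all four cases in the answer: both success examples, both partial overlaps, and the engulfing Feb 2 to Feb 27 case.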
This is an interesting question - not least because I don't believe that there is a single ideal answer as it will depend to some extent on the nature of the "Hotel" and the way in which people are placed in rooms and to a further extent on the scale of the question.
At the most basic level you have two ways that you can track occupancy of rooms:
You can have a table with one row per room per day that defines the state of that room on that date (and possibly the related booking if occupied)
For each room you maintain a list of "bookings" (which as already suggested will have to include states for when a room is unavailable for maintenance).
The reason it's an interesting question is that both approaches immediately present you with certain challenges when maintaining the data and searching for occupancy: in the case of the former, you're holding a lot of data that may not be needed; in the case of the latter, finding gaps in the occupancy for new bookings is perhaps more interesting.
You then (in both cases) have a further problem of resource allocation: if your bookings tend to be for two days and your system results in 1-day gaps between bookings, you're going to need to re-arrange the occupancy to optimise usage... but then you have to be careful with bookings that need to be explicitly tied to specific rooms.
Ideally you would defer assigning a booking to a room for as long as possible (which is why it depends on the hotel: fine for 400 modular rooms, rather less so for a dozen unique ones). So long as there are sufficient rooms of the necessary standard available during a target period (which you can invert: so long as there are fewer rooms booked than real rooms), you can take the booking. Of course you've still got to represent the state of the rooms, so this is in addition to the data you've got to manage.
All of which is what makes it interesting - there is considerable depth to the problem, you need to have a fairly decent understanding of the specific problem to produce an appropriate solution.
I have come to the booking problem from the perspective of avoiding a highly populated table, given that the inventory is thousands rather than hundreds.
Solution one: intervals.
Solution two: populate slots only when they are occupied, using the smallest unit (1 day).
Solution three: generate slots in advance for each resource and manage their status.
Solution one has the smallest size footprint, but since you cannot guess whether your searched range falls in an existing interval or not, you have to read and compare the whole table.
Solution two solves this problem and lets you search for a specific time frame only, but the table contains more lines. However, since the empty slots are not written anywhere, high vacancy will reduce the size of the table.
Another advantage is that old bookings can be transferred to a separate table.
Solution three increases the size of the table to a maximum of minSlot × resources × time, and the lines are generated in advance. The only advantage I can think of is the cheap cost of finding empty slots with a simple select.
However, generating the slots in advance looks like a terrible idea to me.
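For solution one, the whole-table comparison can at least be pushed into a single query, so the database rather than the application does the comparing. A sketch with SQLite; the schema and column names (start_day, end_day) are illustrative, and the condition is the standard interval-overlap test:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE booking (room INTEGER, start_day TEXT, end_day TEXT)")
con.executemany("INSERT INTO booking VALUES (?, ?, ?)", [
    (1, "2024-02-08", "2024-02-14"),
    (2, "2024-02-01", "2024-02-03"),
])

def is_free(room, start_day, end_day):
    """True if no existing interval for this room overlaps the requested one."""
    (count,) = con.execute(
        "SELECT COUNT(*) FROM booking "
        "WHERE room = ? AND start_day <= ? AND end_day >= ?",
        (room, end_day, start_day),
    ).fetchone()
    return count == 0
```

ISO-formatted date strings keep string comparison equivalent to date comparison, and an index on (room, start_day) would narrow the scan in a larger table.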