Visualize attendance over time data in a meaningful way - database

I have data that looks like this
+-----------+-------------+----------+------------+------------+
| Date | Time | Initials | Location 1 | Location 2 |
+-----------+-------------+----------+------------+------------+
| 8/26/2019 | 11:00:00 AM | BI | 39 | 40 |
| 8/26/2019 | 1:30:00 PM | Kk | 12 | 2 |
| 8/27/2019 | 2:30:00 PM | BH | 18 | 37 |
| 8/28/2019 | 3:30:00 AM | BH | 23 | 29 |
+-----------+-------------+----------+------------+------------+
The output should be something very similar to the Google Maps "Popular Times" graph.
I would like to be able to visualize
A graph for each location in this style (attendance over time via hour), which is the average attendance per day of the week
I would also like to be able to specify a given date ex: 8/26/2019 and pull up the exact data for that date
So I figure either there can be a different graph for every location, or maybe have the various locations data show as different colored bars on the graph.
Ultimately I have this data in a spreadsheet and I'm not sure what would even be the best tool to use to report this data. I looked into data studio and google analytics and just using charts inside the sheet.
However the issue seems to be:
Since the data can be both various dates and various times. I'm not sure how or which tools to use to group the data by a given day, or average the data for a given day of the week. I tried using pivot tables but I'm not sure how to report based on that.

which tools to use to group the data by a given day, or average the data for a given day of the week
=QUERY(QUERY(A2:E,
"select A,count(A),sum(D),sum(E),sum(D)+sum(E),avg(D),avg(E),avg(D)+avg(E),max(D)+max(E),min(D)+min(E)
where A is not null
group by A", 0),
"offset 1", 0)
=QUERY(A2:E,
"select A,count(A),sum(D),sum(E),sum(D)+sum(E),avg(D),avg(E),avg(D)+avg(E),max(D)+max(E),min(D)+min(E)
where A is not null
group by A
pivot C", 0)
need to figure out how to take this input and arrange by Day of the week
=ARRAYFORMULA(IF(A2:A, TEXT(A2:A, "ddd"), ))
Also by hour instead of just by date
=ARRAYFORMULA(IF(A2:A, TEXT(TIME(HOUR(B2:B), 0, 0), "hh:mm:ss"), ))

Related

How to write a SQL for a calculation based on incremental window of batch table

My requirement is to calculate based on an incremental size window for a batch table.
For example, the first window has 1 row, the second window has 2 rows(including 1 row from the 1st window and a new row), then 3 rows in the 3rd window(including 2 rows from the 2nd window and a new row), and so on.
For example:
Source table:
datetime | productId | price |
3-1 | p1 | 10 |
3-2 | p1 | 20 |
3-3 | p1 | 30 |
3-4 | p1 | 40 |
Result table:
datetime | productId | average|
3-1 | p1 | 10/1 |
3-2 | p1 | (10+20)/2 |
3-3 | p1 | (10+20+30)/3 |
3-4 | p1 | (10+20+30+40)/4 |
I am trying to find a way to implement this requirement with Sql, to me seems the OVER action can do that but not yet implemented in flink, so I need an alternative way.
BTW:
I tried to use a TUMBLE window of 1 day and store the previous value in the user defined aggregation object but failed as the aggregation object will be reused by all product not a single object for each product
The OVER clause on a batch table is not supported by Flink's SQL yet. You can track the status of this effort here.
However, did you consider to implement this behavior on a streaming table instead? Streaming tables can also read from static files such as CSV files and many operations are supported there as well. This depends on the other operations you want to use in your query, though.

SSAS - MDX calculated member

I've a fact table that details individual line amounts for orders placed by my organisation. In this fact, at line level, I've included the total order amount to be used, as it's possible we might need that level of detail at some point.
Here's an example of what I've got:-
+------------+------------+---------------+------------+---------------------+
| BookingKey | Booking_ID | Category_FKey | Line_Value | Total_Booking_Value |
+------------+------------+---------------+------------+---------------------+
| 1 | 12 | 8 | 150 | 700 |
| 2 | 12 | 4 | 150 | 700 |
| 3 | 12 | 5 | 300 | 700 |
| 4 | 12 | 4 | 100 | 700 |
+------------+------------+---------------+------------+---------------------+
As you can see, the Total_Booking_Value here is the sum of the Line_Value for the booking in the example (Booking_ID = 12).
The Category_FKey looks up to a Categories dimension.
Using this structure I've created a simple cube and this works fine, mainly.
The issue I have is that I'd like to be able to view the Total Line_Value amount, and somehow include the Total_Booking_Value alongside it.
So, for example I might add the Categories dimension as a filter and want to filter by say Category_FKey = 4.
If this was the case I'd want the aggregates to tell me that the total Line_Value was 250 (for BookingKeys 2 and 4), and the Total_Booking_Value should be 700. Using normal aggregation (ie SUM) I'm getting the Total_Booking_Value as 1400 (obviously - because it's adding 700 * 2 for the two rows the cube would return).
So, the way I see it I'd like to create an MDX calculation that somehow takes the Total_Booking_Value and gives just the value for the Booking in question.
Should this be done using some kind of average, or division by the Distinct number of items? I can't figure this out. I tried something like this:-
create member currentcube.measures.[Calculated Booking Value]
as
[Measures].[Total_Booking_Value] / count(Measures.Booking_ID);
But this isn't working.
Hopefully this makes sense and you can point me in the right direction.
I find it strange that booking_ID is a measure - intuitively it strikes me as something that would be an attribute and therefore a hierarchy - in which case you'd be able to do the count like this:
[Measures].[Total_Booking_Value]
/
COUNT(EXISTING [Booking].[Booking_ID].[Booking_ID].members)
A straightforward solution would be to have two fact tables: one with granularity booking key and one with granularity booking id. The first would contain all columns except total booking value, and the second would contain columns booking id and total booking value.
Then each of both measures would easily be summable.
The reference type between the second fact table and the category dimension could be configures as many-to-many via the first fact table. Thus, you would see the full values of the involved bookings for each selected category, automatically eliminating double counting.

Pulling ill-formatted data in Libre Calc: What Function will work with this?

I am working on a project where I am pulling tables from a Fandom Wikia page and feeding it into a spreadsheet named 'WikiPullSheet'. The data in the wiki tables is irregular in format; sometimes using multiple rows for the same entry.
Here is an example of some rows as described above from the sheet:
Name | Power | Stamina | Agility
Townsman Shield | 2 | 1 | 2
Starter | | |
Broken Shield | 4(+1) | 2(+1) | 2(+1)
Z1 | | |
Heater | 2(+1) | 4(+1) | 2(+1)
Z1 | | |
Wood Elf Shield | 2(+1) | 2(+1) | 4(+1)
Z1 | | |
Shiv | 4 | 4 | 3
Z1 Shop | | |
Deimos* | 26 | 16 | 26
| 34 | 22 | 34
I want the sheet to auto-update from the wikia page but this format will not allow me to reference items as the sheet expands. For instance, if on another sheet I want to have a drop down list of all the names for items in this list, I would be referencing the blank and starter cells even though they are not actually unique items in the table. I have done research on VLOOKUP, COUNTIF, REGEX options, MATCH, and more, but none of these seem to work for the issue I am having.
How would I take this input and either create a formula to reformat it or pull from the sheet as is and use the columns appropriately for a drop-down box containing only the item names from the NAME column?
Desired Output:
I need the data to end up formatted with each row representing a different unique item. Since the information is pulling with rows that contain location of the item in the name column (Z1 for instance), this is proving difficult. I could simply remove the rows that cause problems such as 'Z1' & 'Z1 Shop' in the above example, however this does not help when an item has multiple upgrade paths like in the case of the 'Deimos' row entry.
If you insert a pivot table (there is a icon to do so, select ColumnA first) based on ColumnA (assuming that is where Name is to be found) you should get something like:
It is far from a complete solution (you don't show what the desired output should be) but I thought a sorted list, with each entry unique and the blanks at least out of the way, might have been a start.

TM1 - passing data between cubes by linking a measure in the target cube to a dimension in the source cube

TM1 - linking measures to dimensions
I have two cubes in TM1, and I am trying to source data from one cube by linking a calculated 'Age' field in the target cube to an 'Age' dimension in the source cube. However, while I can do this fine by writing code in the rules editor, I cannot work out how to do it using the rules Wizard. Unfortunately, policy in my company is that all TM1 models must be based around wizard-based rules, so I am hoping someone could explain how to do this via the wizard.
Cube 1 (the source) contains data on how quickly a loan balance reduces due to customer attrition, based on the loan's age in months - it looks a bit like this:
Age | Attrition %
-------|--------------
1 | 5%
2 | 6%
3 | 7%
Cube 2 (the target) contains a loan balance, and calculates how the balance reduces over several months, based on the data in Cube 1. It looks up the data in cube 1, based on the age that is calculated in the first row of Cube 2, on the basis of:
current month - start month + 1.
So if we assume the loan started in July, for August the age would be:
8 - 7 + 1 = 2 months old.
For the loan starting in July, Cube 2 would look a bit like this:
| Jul | Aug | Sep |
----------------|-------------------
Age | 1 | 2 | 3 |
Opening Balance | $100| $95 | $89 |
Attrition % | 5% | 6% | 7% | <-- sourced from Cube 1 on basis of Age
Attrition $ | -$5 |-$5.7|-$6.3|
Closing Balance | $95| $89 | $83 |
Creating this link is trivial in Excel, but whenever I try to do it using the TM1 Rules Wizard, I run into the problem that TM1 does not seem to allow the linking of a dimension ('Age' in cube 1) to a field within a dimension ('Age' in cube 2).
Can anyone advise?

Populate weekly calendar with available slots dynamically

I want to create a weekly calendar which shows like this
< Prev Week OCTOBER 2015 Next Week >
Sun 18 |Mon 19 |Tue 20|Wed 21 |Thu 22|Fri 23 |Sat24
11A.M.-12A.M.||10A.M.-11A.M.| |4P.M.-5P.M.| |11A.M.-12A.M.| |
3P.M.-4P.M. |1P.M.-2P.M. | | | | | |
|3P.M.-4P.M. | | | | | |
|5P.M.-6P.M. | | | | | |
These are the time slots in which a person is available in a week.The number of slots and time at which the person is available varies from day to day.I need to populate the time slots from a json data. Each slot is like a button in that day which can be clickable if that slot is available else it will be disabled.I searched for calendars, but everywhere i am getting calendars with time slots on the left side which i don't wan't. Someone please suggest me how to do this? Thanks in advance.
After too much of struggle i realised that this can be done by customizing the full calendar plugin. http://fullcalendar.io/

Resources