Best way to store quarter and year in SQL Server? - sql-server

What would be the best way to store quarter and year in a database? I have a payments table and I need to assign a quarter/year to each payment so that it's easy to tell which quarter a payment was made for.
I was thinking of:
a) adding two int columns to each payment
b) adding another table, populating it with possible values up to 5 years ahead, and using its ID to join it to the payments table.
What other options are there? Maybe one is better and/or easier to maintain. This database will be used with a C# program.

If you have to use separate year and quarter values instead of a date (since you seem to have specific reporting requirements), I would go for a tinyint for the quarter and a smallint for the year, and store them in the PAYMENT table itself.
I would not store them in a different table, since:
You have to make sure you have produced enough years/quarters
You have to join and use a foreign key
Storing the data with the record also helps read performance. Your table may be small, but it is always good to keep performance in mind.
WHY
Let's imagine you need to get all payments in a specific quarter, where the payment was for more than a specific amount and the customer is a particular customer.
In this case you would need a covering index on all of those columns, and even that would not help if the query asks for a specific quarter across all years rather than one quarter of one year. Having the data on the table itself, however, gives you a lighter execution plan.
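As a minimal sketch of that layout (the table and column names here are assumptions, not the asker's actual schema):

CREATE TABLE Payment
(
    PaymentId      int IDENTITY PRIMARY KEY,
    CustomerId     int           NOT NULL,
    Amount         decimal(10,2) NOT NULL,
    PaymentQuarter tinyint       NOT NULL,  -- 1 to 4
    PaymentYear    smallint      NOT NULL
);

-- Covering index for the quarter/amount/customer query described above
CREATE INDEX IX_Payment_Customer_Qtr
    ON Payment (CustomerId, PaymentQuarter, PaymentYear) INCLUDE (Amount);

-- "All payments in Q2 over 100 for customer 42", with or without a year filter
SELECT PaymentId, PaymentYear, Amount
FROM Payment
WHERE CustomerId = 42 AND PaymentQuarter = 2 AND Amount > 100.00;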

I've always just used a datetime value, with the 1st of January/April/July/October representing each quarter. That makes computing the start/end dates of the quarter simple:
Start date: the datetime column itself.
End date: dateadd(month, 3, quarterColumn)
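As a sketch, the quarter-start value can itself be derived from any payment date with a common T-SQL idiom (the variable names are just for illustration):

DECLARE @paymentDate date = '20110517';

-- Count whole quarters since day 0 (1900-01-01), then add them back to day 0
DECLARE @quarterStart datetime = DATEADD(quarter, DATEDIFF(quarter, 0, @paymentDate), 0);

SELECT @quarterStart AS QuarterStart,                       -- 2011-04-01
       DATEADD(month, 3, @quarterStart) AS QuarterEndExcl;  -- 2011-07-01 (exclusive)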
Another alternative would be ISO 8601. Here's an ISO 8601 profile for use in Internet protocols: RFC 3339 (proposed standard).
An ISO 8601 representation of each quarter of the year 2011 looks like this:
2011-01-01/P3M
2011-04-01/P3M
2011-07-01/P3M
2011-10-01/P3M
The above specify an interval by its start date and duration (in this case, 3 months).
The advantages of ISO 8601 date/time formats are that the strings are (A) human readable, (B) collate properly, (C) are easy to parse, and (D) follow an international standard.
Some people "extend" ISO 8601's week notation, in which a week of the year looks like 2011W32 (the 32nd week of 2011), to a quarter notation. Using this unofficial extension, the quarters of the year 2011 look like:
2011Q1
2011Q2
2011Q3
2011Q4

How about using computed columns based on the payment date? I'd rather do this than have both a date and a quarter/year that might get out of sync. On the other hand, I suppose it's possible that you may need to record a different year/quarter than the date indicates, in which case you'd need to keep them separate. I'd at least think about using computed columns, though, as that seems the best way to ensure integrity.
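A minimal sketch of that approach, assuming the payments table has a PaymentDate column (the names here are assumptions):

-- PERSISTED stores the computed values so they can be indexed,
-- and they can never drift out of sync with the date.
ALTER TABLE Payment
    ADD PaymentQuarter AS DATEPART(quarter, PaymentDate) PERSISTED,
        PaymentYear    AS DATEPART(year,    PaymentDate) PERSISTED;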

For something so simple, I would just keep the 2 int columns and build up the (pivotal) dates using dateadd when date ranges are required.
Another option is a single date column storing the first day of the quarter, so the 4 dates in a year would be 1-Jan, 1-Apr, 1-Jul and 1-Oct. You can extract the quarter and year easily using DATEPART(quarter, ...) and DATEPART(year, ...).
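A sketch of round-tripping between the two representations (1900-01-01 is SQL Server's date 0, so quarters can be counted from there):

DECLARE @yr int = 2011, @qtr int = 3;

-- (year, quarter) -> first day of that quarter
SELECT DATEADD(quarter, (@yr - 1900) * 4 + @qtr - 1, 0) AS QuarterStart;  -- 2011-07-01

-- date -> (quarter, year)
SELECT DATEPART(quarter, '20110701') AS Qtr,  -- 3
       DATEPART(year,    '20110701') AS Yr;   -- 2011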

How about two ints, one for the year and one for the quarter (1-4)? Is that what you meant by option "a"?
Option "b" would work too, but you have to remember to maintain the table every year or so.

I agree two ints are fine.
I would add an index consisting of both columns in case you need to sort or filter by year and quarter.
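A sketch of such an index (table and column names assumed):

CREATE INDEX IX_Payment_YearQuarter
    ON Payment (PaymentYear, PaymentQuarter);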

You could even use a single tinyint, storing the value in the form YYQ: 111, 112, 113, 114, 121, and so on. That's enough for a few years (tinyint tops out at 255, i.e. year 25, quarter 4).
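For illustration, encoding and decoding that form with integer arithmetic (two-digit years assumed):

DECLARE @yyq tinyint = 113;  -- year 11, quarter 3
SELECT @yyq / 10 AS Yr2,     -- 11
       @yyq % 10 AS Qtr;     -- 3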

How best to store quarter and year depends on how your payment data is organized: for example, how many different payment values are being inserted, and whether the quarter/year ranges will vary.
One good technique for "defining" a quarter/year range is to make a separate table whose date fields identify each quarter. You don't even need to join to that table; you can query it from C# to figure out whether a date falls within a particular pay quarter.
For example:
Table 1: Payments
-----------------
paymentID (int)
paymentAmount (decimal(7,2))
paymentDateTime (DateTime)
Table 2: QuarterYear
--------------------
quarterYearID (int)
dateFrom (date)
dateTo (date)
quarter (tinyint)
description (varchar)
Example Data
paymentID | paymentAmount | paymentDateTime
------------------------------------------------
1 | 20.24 | 2011-04-18 08:14:20
2 | 34.15 | 2011-04-19 07:42:15
3 | 51.87 | 2011-04-20 13:04:22
quarterYearID | dateFrom | dateTo | quarter | description
-----------------------------------------------------------------
1 | 2011-01-01 | 2011-03-31 | 1 | first quarter
2 | 2011-04-01 | 2011-06-30 | 2 | second quarter
3 | 2011-07-01 | 2011-09-30 | 3 | third quarter
4 | 2011-10-01 | 2011-12-31 | 4 | fourth quarter
Example query for finding which quarter a given payment date falls in (C# then uses this to collect all "Quarter 2" payments):
@dateValue is pulled dynamically from the payments table; C# supplies its value as a query parameter.
SELECT quarter FROM QuarterYear WHERE CAST(@dateValue AS date) BETWEEN dateFrom AND dateTo;

Related

Tableau remove duplicates based on a condition

I am trying to remove duplicates from the Ticket field in my database, but I want to remove the duplicates that have the older dates. Example:
Ticket | Date
MG17000 | 1/1/2017
MG17000 | 1/1/2018
MG17010 | 1/1/2018
so I want the answer to be
MG17000 | 1/1/2018
MG17010 | 1/1/2018
I used COUNTD(Ticket), but it does not remove the right tickets (it removes the ticket that corresponds to 1/1/2018 instead of the 1/1/2017 one). Any suggestions on how to perform this task?
Thanks!
Try this:
Create a formula [Rank - Date] with the code below:
RANK_UNIQUE((MAX(SPLIT([database field],'|',2))))
// This creates a rank value for every ticket
Now add one more formula to keep only the dates with the maximum rank; drag it to the Filters shelf and select True:
[Rank - Date] = 1
You should then be able to get the required data.
Use a level-of-detail (LOD) calculation. Create the calculation with this formula and it will give you the number of records per ticket, regardless of what dimensions you have on rows and shelves.
{FIXED [ticket] : count([date])}
Note that a FIXED calculation ignores regular dimension filters, so if you have date filtering it will count tickets outside the date filter range; if you want the count to respect the date filter, switch FIXED to INCLUDE.
Drag that out as one of your measures. Then use MAX([date]) to show the most recent date.
From the sample data you showed in the question, you will see something like:
MG17000 1/1/2018 2
MG17010 1/1/2018 1

Calculate daily targets based on monthly targets Sales Power bi

I have the following problem, which I just can't wrap my head around how to do in a neat way.
I want to create a line graph with three lines. We call it a budget snake.
Created sales orders (black)
Invoiced orders (green)
Daily targets (red)
This is per salesperson. Creating this graph for the created and invoiced orders is easy, as these are all at a daily granularity.
I just struggle with how to create/generate such a line for the targets.
In this case, I manually created a table with date - salesperson - daily target, which is very cumbersome.
What I would like to be able to do is create a table at a monthly level for each salesperson, and have Power BI "generate/calculate" the daily targets in such a way that I can graph the red line without all the hassle of creating them for each salesperson manually.
The input would be something like this
+-----------+----------+--------------+--------+----------------+--------------+---------------+
| Date | Month | Salesperson | Branch | Monthly Target | Daily Target | Business days |
+-----------+----------+--------------+--------+----------------+--------------+---------------+
| 1/01/2017 | January | salesperson1 | test | 73529 | 4325 | 17 |
| 1/02/2017 | February | salesperson1 | test | 73529 | 4325 | 20 |
+-----------+----------+--------------+--------+----------------+--------------+---------------+
I have a date dimension table, so on my graph I have the date as the x-axis and the running orders/running sales as the y-axis, but I would like something like a daily running target so that the red line tracks nicely with the orders and sales.
I had a look at this pattern, but I just cannot figure out how it can generate a line graph:
https://www.daxpatterns.com/budget-patterns/
So somehow, I guess, I need something that generates the first table using the second table as input. I tried some measures in DAX, but none of them gives me the cumulative steps for each day; they mostly just show me the value.
These are the measures I use for the other lines. This works nicely when changing the date filters.
Running sales
RunningTotalSales = CALCULATE(sum(vw_invoice_trn_summary[NetInvoiceValue]),
FILTER(ALLSELECTED(DimTime),DimTime[Date] <= MAX(DimTime[Date])))
Running orders
RunningTotalOrders = CALCULATE(sum(vw_orders_raised[OrderTotal]),FILTER(ALLSELECTED(DimTime),DimTime[Date] <= MAX(DimTime[Date])))
In my current manual solution, the full year does not work well with the targets line, though, as I am not sure I am doing it right.
UPDATE
So, thinking further about this: it feels like I just need to be able to create a table with date - daily target - salesperson, based on the monthly targets, but I am not sure how you can do that in Power BI. Ideally, you could just add/remove a salesperson and have that table regenerated.
I have two solutions to this. One using DAX and one using the query editor.
DAX Solution:
1. Create a calendar table that has all the dates you need.
If Targets is the table containing your monthly targets, create a new table using a formula like this:
Calendar = CALENDAR(EOMONTH(MIN(Targets[Date]),-1)+1,EOMONTH(MAX(Targets[Date]),0))
2. Create a new table DailyTargets as a cross join of your dates and salespersons.
The CROSSJOIN function creates a row for each date and salesperson combination:
DailyTargets = CROSSJOIN(VALUES('Calendar'[Date]),VALUES(Targets[Salesperson]))
3. Create a calculated column for your daily targets.
I do this by looking up the monthly target and dividing by the number of days in the month:
DailyTarget = DIVIDE(
LOOKUPVALUE(Targets[MonthlyTarget],
Targets[Month], FORMAT(DailyTargets[Date],"mmmm"),
Targets[Salesperson], DailyTargets[Salesperson]),
DAY(EOMONTH(DailyTargets[Date],0)))
Now you have a daily target for each date and each salesperson.
PowerQuery Solution:
1. Create a calendar table that has all the dates you need.
Create a blank query and use the following code:
= List.Dates(List.Min(Targets[Date]),
Duration.Days(Date.EndOfMonth(List.Max(Targets[Date]))
- List.Min(Targets[Date])) + 1,
#duration(1,0,0,0))
2. Convert this list to a table.
Click on "To Table" under the Transform tab and rename the column from "Column1" to "Date".
3. Create a custom column for the month name.
You can use the formula Date.MonthName([Date]) for this.
4. Merge this query with the Targets table (joining on the Month columns).
5. After merging, expand the Salesperson and MonthlyTarget columns.
6. Create the daily target by dividing the monthly target by the number of days in the month.
You can use the formula [MonthlyTarget]/Date.DaysInMonth([Date]) for this.
The entire query should look like this:
let
Source = List.Dates(List.Min(Targets[Date]), Duration.Days(Date.EndOfMonth(List.Max(Targets[Date])) - List.Min(Targets[Date])) + 1, #duration(1,0,0,0)),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Date"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Month", each Date.MonthName([Date])),
#"Merged Queries" = Table.NestedJoin(#"Added Custom",{"Month"},Targets,{"Month"},"Targets",JoinKind.LeftOuter),
#"Expanded Targets" = Table.ExpandTableColumn(#"Merged Queries", "Targets", {"Salesperson", "MonthlyTarget"}, {"Salesperson", "MonthlyTarget"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Targets", "DailyTarget", each [MonthlyTarget]/Date.DaysInMonth([Date]))
in
#"Added Custom1"
Instead of going step by step, you can just paste this into the Advanced Editor if you'd like. (Just be sure you use whatever table name you have instead of Targets.)
I have a much simpler solution for this:
I just need to be able to create a table with date - daily target - salesperson, based on the monthly targets
Let's say I have a table with Month, Salesperson and Monthly Target columns, where Month contains the date of the first day of each month. We then add a custom column "Date" using the query editor menu (Add Column > Custom Column) and paste this formula for the new column:
= List.Dates([Month], Date.Day(Date.EndOfMonth([Month])), #duration(1, 0, 0, 0))
Each row of the new column will contain a list of all dates within that row's month. Expand that column by clicking on the button on the top right corner and choosing "Expand to new rows".
Now you have a row for each day, and you can simply add another custom column, "Daily Target", that divides the monthly target by the number of days in each month:
= [Monthly Target]/Date.DaysInMonth([Date])
And your table is ready.

Calculate Facebook likes, comments, and shares for different time zones from saved UTC

I've been struggling with this for a while and hope someone can give me an idea of how to tackle it.
We have a service that goes out and collects Facebook likes, comments, and shares for each status update multiple times a day. The table that stores this data is something like this:
PostId  EngagementTypeId  Value  CollectedDate
100     1 (likes)         10     1/1/2013 1:00
100     2 (comments)      2      1/1/2013 1:00
100     3 (shares)        0      1/1/2013 1:00
100     1                 12     1/1/2013 3:00
100     2                 3      1/1/2013 3:00
100     3                 5      1/1/2013 3:00
Value holds the total for each engagement type at the time of collection.
I got a requirement to create a report that shows the new value per day in different time zones.
Currently, I'm doing the calculation in a stored procedure that takes in a time zone offset, and based on that I calculate the delta for each day. For someone in California, the report will show 12 likes, 3 comments, and 5 shares for 12/31/2012. But someone with a time zone offset of -1 will see 10 likes on 12/31/2012 and 2 likes on 1/1/2013.
The problem I'm having is that doing the calculation on the fly can be slow if we have a lot of data and a big date range. We're talking about having the delta pre-calculated for each day and stored in a table so I can just query from that (we're considering SSAS, but that's for the next phase). But to do this, I would need to have the data for each day for 24 time zones. Am I correct (and if so, this is not ideal), or is there a better way to approach this?
I'm using SQL 2012.
Thank you!
You need to convert the UTC datetime stored in your column to a date based on the user's UTC offset. That way you don't have to worry about any table that has to be populated with data. To get the user's date from your UTC column, you can use something like this:
SELECT CONVERT(DATE, DATEADD(mi, DATEDIFF(mi, GETUTCDATE(), GETDATE()), '01/29/2014 04:00'))
AS MyLocalDate
The SELECT statement above figures out the local date based on the difference between the UTC date and the local date. You will need to replace GETDATE() with the user's datetime that is passed in to your procedure, and replace '01/29/2014 04:00' with your column. That way, when you select any date from your table, it will be according to what that date was in the user's local time. Then you can calculate the other fields accordingly.
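Building on that, a sketch of the daily grouping step (the EngagementSnapshots table name and @offsetMinutes parameter are assumptions, standing in for whatever the stored procedure receives):

DECLARE @offsetMinutes int = -480;  -- e.g. US Pacific standard time

SELECT CONVERT(date, DATEADD(minute, @offsetMinutes, CollectedDate)) AS LocalDate,
       EngagementTypeId,
       MAX(Value) AS EndOfDayTotal  -- Value is cumulative, so the day's max is its closing total
FROM EngagementSnapshots
GROUP BY CONVERT(date, DATEADD(minute, @offsetMinutes, CollectedDate)),
         EngagementTypeId;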

Storing occurrences for reporting

What is the best way to store occurrences of an event in a database so you can quickly pull reports on it? ie (total number of occurrences, number of occurrences between date range).
right now I have two database tables, one which holds all individual timestamps of the event - so I can query on a date range, and one which holds a total count so I can quickly pull that number for a tally
Table 1:
Event | Total_Count
------+------------
bar | 1
foo | 3
Table 2:
Event | Timestamp
------+----------
bar | 1/1/2010
foo | 1/1/2010
foo | 1/2/2010
foo | 1/2/2010
Is there a better approach to this problem? I'm thinking of converting Table 2 to hold date tallies; that should be more efficient, since my date-range queries are only done on whole dates, not timestamps (1/1/2010 vs 1/1/2010 00:01:12).
ie:
Updated Table 2
Event | Date | Total_Count
------+----------+------------
bar | 1/1/2010 | 1
foo | 1/1/2010 | 1
foo | 1/2/2010 | 2
Perhaps there's an even smarter way to tackle this problem? Any ideas?
Your approach seems good. I see Table 2 as more of a detail table and Table 1 as a summary table. For the most part, you would be doing inserts only to Table 2, and inserts and updates on Table 1.
The updated Table 2 may not give you much additional benefit. However, you should consider it if aggregation by day is what matters most to you.
You may consider adding more attributes (columns) to the tables. For example, you could add first_date and last_date columns to Table 1.
I would just have the one table with the timestamps of your event(s). Then your reporting is simply a matter of setting up your WHERE clause correctly...
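For instance, a sketch of both reports against a single hypothetical Events(Event, EventTime) table:

-- Total number of occurrences
SELECT COUNT(*) FROM Events WHERE Event = 'foo';

-- Occurrences in a date range (a half-open range keeps it index-friendly)
SELECT COUNT(*)
FROM Events
WHERE Event = 'foo'
  AND EventTime >= '20100101' AND EventTime < '20100103';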
Or am I missing something in your question?
It seems like you haven't really pinned down your requirements:
Changing from a timestamp to just the date portion is a big deal. Do you never want to do a time-of-day analysis, like finding the best time of day to do maintenance if that stops "foo" from happening?
And you're not worried about size? You say you have millions of records (as if that's a lot) and then you extend every single row by an extra column. One column isn't much, until the row count skyrockets; then you really have to think about each column.
So, to get the sum of an event for the last 3 days, you'd rather do this:
SELECT SUM(totcnt) FROM (
SELECT MAX(Total_count) as totcnt from table where date = today and event = 'Foo'
UNION ALL
SELECT MAX(Total_count) from table where date = today-1 and event = 'Foo'
UNION ALL
SELECT MAX(Total_count) from table where date = today-2 and event = 'Foo'
)
Yeah, that looks much easier than:
SELECT COUNT(*) FROM table WHERE DATE BETWEEN today-2 and today and event = 'foo'
And think about the trigger it would take to add a row... get the max for that day and event and add one... every time you insert?
Not sure what kind of server you have, but I summed 1 million rows in 285 ms. So... how many millions will you have, how many times do you need to sum them, and is each query for the same date range or completely random?

DATE lookup table (1990/01/01:2041/12/31)

I use a DATE master table for looking up dates and other values in order to control several events, intervals, and calculations within my app. It has a row for every single day, beginning 01/01/1990 and ending 12/31/2041.
One example of how I use this lookup table:
A customer pawned an item on JAN-31-2010. The customer returns on MAY-03-2010 to make an interest payment to avoid forfeiting the item. If he pays 1 month's interest, the employee enters a "1" and the app looks up the pawn date (JAN-31-2010) in the date master table and puts FEB-28-2010 in the applicable interest payment date. FEB-28 is returned because FEB-31 doesn't exist! If 2010 were a leap year, it would have returned FEB-29.
If the customer pays 2 months, MAR-31-2010 is returned; 3 months, APR-30. If the customer pays more than 3 months, or another period not covered by the date lookup table, the employee manually enters the applicable date.
Here's what the date lookup table looks like:
{ Copyright 1990:2010, Frank Computer, Inc. }
{ DBDATE=YMD4- (correctly sorted for faster lookup) }
CREATE TABLE datemast
(
dm_lookup DATE, {lookup col used for obtaining values below}
dm_workday CHAR(2), {NULL=Normal Working Date,}
{NW=National Holiday(Working Date),}
{NN=National Holiday(Non-Working Date),}
{NH=National Holiday(Half-Day Working Date),}
{CN=Company Proclaimed(Non-Working Date),}
{CH=Company Proclaimed(Half-Day Working Date)}
{several other columns omitted}
dm_description CHAR(30), {NULL, holiday description or any comments}
dm_day_num SMALLINT, {number of elapsed days since beginning of year}
dm_days_left SMALLINT, {number of remaining days until end of year}
dm_plus1_mth DATE, {plus 1 month from lookup date}
dm_plus2_mth DATE, {plus 2 months from lookup date}
dm_plus3_mth DATE, {plus 3 months from lookup date}
dm_fy_begins DATE, {fiscal year begins on for lookup date}
dm_fy_ends DATE, {fiscal year ends on for lookup date}
dm_qtr_begins DATE, {quarter begins on for lookup date}
dm_qtr_ends DATE, {quarter ends on for lookup date}
dm_mth_begins DATE, {month begins on for lookup date}
dm_mth_ends DATE, {month ends on for lookup date}
dm_wk_begins DATE, {week begins on for lookup date}
dm_wk_ends DATE, {week ends on for lookup date}
{several other columns omitted}
)
IN "S:\PAWNSHOP.DBS\DATEMAST";
Is there a better way of doing this or is it a cool method?
This is a reasonable way of doing things. If you look into data warehousing, you'll find that those systems often use a similar structure for their date dimension table. Since there are fewer than 20K rows in the fifty-year span you're using, there isn't a huge amount of data.
There's an assumption that the storage gives better performance than doing the computations; that is by no means clear cut, since the computations are not that hard (though neither are they trivial) and any disk access is very slow in computational terms. However, the convenience of having all the information in one table may well be worth it, compared with keeping track of an appropriate computation for each of the values stored in the table.
It depends on which database you are using. SQL Server has horrible support for temporal data and I almost always end up using a date fact table there. But databases like Oracle, Postgres and DB2 have really good support and it is typically more efficient to calculate dates on the fly for OLTP applications.
For instance, Oracle has a last_day() function to get the last day of a month and an add_months() function to, well, add months. Typically in Oracle I'll use a pipelined function that takes start and end dates and returns a nested table of dates.
The cool way of generating a rowset of dates in Oracle is to use the hierarchical query functionality, connect by. I have posted an example of this usage in another thread.
It gives a lot of flexibility without the PL/SQL overhead of a pipelined function.
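A minimal sketch of that technique, generating one row per day of 2011 from dual:

SELECT DATE '2011-01-01' + LEVEL - 1 AS day_date
FROM dual
CONNECT BY LEVEL <= DATE '2012-01-01' - DATE '2011-01-01';  -- 365 rows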
OK, so I tested my app using 31 days/month to calculate interest rates, and the pawnshops are happy with it! Local law reads as follows: from the pawn or last interest payment date to 5 elapsed days, 5% interest on principal; 6 to 10 days = 10%; 11 to 15 days = 15%; and 16 days to 1 "month" = 20%.
So the interest table is now defined as follows:
NUMBER OF ELAPSED DAYS SINCE
PAWN DATE OR LAST INTEREST PYMT
FROM   TO    ACCUMULATED
DAY    DAY   INTEREST
-----  ----  -----------
  0      5       5.00%
  6     10      10.00%
 11     15      15.00%
 16     31      20.00%
 32     36      25.00%
 37     41      30.00%
 42     46      35.00%
 47     62      40.00%
[... until day 90 (forfeiture allowed)]
From day 91 to 999, daily prorate based on 20%/month.
Did something bad happen in the UK on MAR-13 or SEP-1752?
