I have 2 Tables with the following fields:
Table_1:
Date, Month, Dayname
2013-01-01, January, Tuesday
This goes from 2013-01-01 to 2013-12-31
Table_2:
Workers_ID, Monday, Monday_Hours, Tuesday, Tuesday_Hours and so forth
1, TRUE, 8, TRUE, 8 ...
2, False, 0, FALSE, 4
What I need is a table:
Date, Month, Dayname, Hours_to_work, Workers_ID
This table has (number of workers) x (days in the year) rows, and each row shows how many hours a specific worker works on that date.
My problem is that I have no clue how to accomplish that. It would be great if someone could help me with it.
Ian P gives good advice as far as revising your schema.
If you're absolutely stuck with the current design, here's a strategy you could use:
SELECT Date, Month, Dayname, Workers_ID,
    CASE Dayname
        WHEN 'Monday' THEN Monday_Hours
        WHEN 'Tuesday' THEN Tuesday_Hours
        -- etc...
    END AS Hours_to_work
FROM Table_1 CROSS JOIN Table_2
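To see what the CROSS JOIN plus CASE approach produces, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite stands in for SQL Server here, the two sample dates and the trimmed-down Table_2 (hours columns only) are assumptions for illustration, and only Monday and Tuesday are handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_1 (Date TEXT, Month TEXT, Dayname TEXT);
CREATE TABLE Table_2 (Workers_ID INTEGER, Monday_Hours INTEGER, Tuesday_Hours INTEGER);
INSERT INTO Table_1 VALUES ('2013-01-01', 'January', 'Tuesday'),
                           ('2013-01-07', 'January', 'Monday');
INSERT INTO Table_2 VALUES (1, 8, 8), (2, 0, 4);
""")

# Every date is paired with every worker; CASE picks the hours column
# that matches the day name of that date.
rows = conn.execute("""
SELECT Date, Month, Dayname,
       CASE Dayname
           WHEN 'Monday'  THEN Monday_Hours
           WHEN 'Tuesday' THEN Tuesday_Hours
       END AS Hours_to_work,
       Workers_ID
FROM Table_1 CROSS JOIN Table_2
ORDER BY Date, Workers_ID
""").fetchall()

for row in rows:
    print(row)
```

With 2 dates and 2 workers the result has 4 rows, which matches the "workers x days" shape asked for in the question.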
Well, there are a few things here, but it's all part of the learning curve! First, MS SQL has functions that return the month and day name; look up DATENAME and DATEPART. If they aren't already, Month and Dayname should be computed columns based on a datetime or smalldatetime column. Second, Table_2 is not an ideal design. In general, tables should be narrow and indexes wide. So Table_2 should be of the form WorkersID, Date, Hours, and may well have a covering index or, at worst, a composite clustered index on Date, EmployeeID. It is then (relatively) easy to write well-performing queries on your data.
If you keep your current structure, I would also say a pivot would not help you; I think you want a union against date and employee ID. MS SQL also now has functionality to produce a sequence of values in a range (for example, a contiguous range of dates): http://www.sqlperformance.com/2013/01/t-sql-queries/generate-a-set-1
Related
Is there a way to create a table with the following?
Label
“week #1: 1/1/18 - 1/7/18”
“week #2: 1/8/18 - 1/14/18”
And so forth?
Basically, I’m looking for the week number and the date range that week includes.
I think what you want as a starting point is a "date dimension" or "calendar table". Here's one of many examples for creating them (creating them is not really the issue, though; how you use them is more important).
In your example, it looks like you want to pivot the data (create a crosstab). As a rule of thumb, you're generally better off pivoting on the client application, than you are persisting that denormalised anti-pattern in a relational database.
Here's a fictitious example:
DECLARE @start_date as datetime = '20180301';
DECLARE @end_date as datetime = dateadd(dd,datediff(dd,0,GETDATE()),0);--midnight last night
SELECT cal.week_starting --The date of the start of the week eg 15 April 2018.
,dateadd(d,6,cal.week_starting) as week_ending -- The date of the last day of the week eg 21 April 2018. You can cast as varchar, format and concatenate to the previous field to suit yourself.
,my_events.my_category
,count(*) as recs
FROM my.CALENDAR cal
JOIN dbo.big_list_of_events my_events ON cal.census_dttm = my_events.event_date
WHERE my_events.event_date >= @start_date
and my_events.event_date < @end_date
GROUP BY cal.week_starting
,my_events.my_category
ORDER BY cal.week_starting
,my_events.my_category
;
Once you get to this point you're ready to query it with your client application (eg Pivot Tables in Excel) and slice and dice to your heart's content. Again, you probably don't want data stored in your db as a crosstab.
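For the label format in the question itself, a small sketch in Python shows how the week number and date range can be derived from a start date. The function name `week_labels` is hypothetical, and it assumes weeks are consecutive 7-day blocks counted from the given start date rather than ISO weeks.

```python
from datetime import date, timedelta


def week_labels(start, weeks):
    """Return labels like 'week #1: 1/1/18 - 1/7/18' for consecutive 7-day weeks."""

    def short(d):
        # Month/day without zero padding, two-digit year.
        return f"{d.month}/{d.day}/{d.year % 100:02d}"

    labels = []
    for n in range(weeks):
        first = start + timedelta(days=7 * n)
        last = first + timedelta(days=6)  # a week spans 7 days inclusive
        labels.append(f"week #{n + 1}: {short(first)} - {short(last)}")
    return labels


labels = week_labels(date(2018, 1, 1), 3)
for label in labels:
    print(label)
```

The same first-day and last-day pair is what a calendar table would store per week, so the label can be built at query time or in the client.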
I have these two columns on separate tables:
A column on table 1, which shows the year and the week. There are duplicates as there are different sets of data belonging to each row in this column on the right side of the table, but I have left this out to make the problem clearer:
|.Year Week.|
|..2015 52..|
|..2015 52..|
|..2015 53..|
|..2016 01..|
|....etc....|
|....etc....|
|..2017 23..|
|..2017 23..|
|..2017 23..|
|..2017 24..|
|..2017 25..|
|..2017 25..|
Then there's this column on table 2, which shows chronologically the length of a week as you go down the years in 1-week increments. This is more like a calendar table/database:
|----Week-----|
|080115_140115|
|150115_210115|
|220115_280115|
|.....etc.....|
|.....etc.....|
|090217_150217|
|160217_220217|
|230217_010317|
My question is how do I most efficiently join the weekly column in table 2 with the year week column in table 1 without hard coding any join statements
where [year week] = 'yyyy ww'
etc etc?
I was thinking of a join statement at first, but realised I have no common column in both tables.
I have to keep the formats for both tables the same. I was thinking: is there a way to extract the weeks from the |year week| column into a separate column and then join that way?
I'm not sure where to begin.
Try extracting the year and week from table 1 using string functions, then join on the year and week derived from table 2. Extract the year and week from table 2 as shown below.
year is
select DATEPART( year,CONVERT(DATE,SUBSTRING(SUBSTRING('080115_140115',0,7),5,2) + SUBSTRING(SUBSTRING('080115_140115',0,7),3,2) +SUBSTRING(SUBSTRING('080115_140115',0,7),0,3),102))
week is
select DATEPART( wk,CONVERT(DATE,SUBSTRING(SUBSTRING('080115_140115',0,7),5,2) + SUBSTRING(SUBSTRING('080115_140115',0,7),3,2) +SUBSTRING(SUBSTRING('080115_140115',0,7),0,3),102))
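The same parsing idea can be sketched outside SQL. The Python snippet below turns a table 2 value such as '080115_140115' into a 'yyyy ww' key using the first date of the range. It assumes ISO week numbering, which may or may not match how table 1 was numbered, and the function name `year_week_key` is hypothetical.

```python
from datetime import datetime


def year_week_key(week_range):
    """Map 'ddmmyy_ddmmyy' to 'yyyy ww' based on the range's first date.

    ISO week numbering is an assumption; adjust if table 1 uses another scheme.
    """
    start_text = week_range.split("_")[0]
    start = datetime.strptime(start_text, "%d%m%y").date()
    iso_year, iso_week, _ = start.isocalendar()
    return f"{iso_year} {iso_week:02d}"


print(year_week_key("080115_140115"))
print(year_week_key("230217_010317"))
```

Once both tables expose a key in the same format, the join condition becomes a plain equality on that key.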
OK, so this might not be the most efficient answer, but I ran a SELECT DISTINCT on table 1 and then joined it to table 2 in Excel.
Since this is a static/historical table (and I only need to do it once), I imported the resulting data from Excel back into SQL and used year week as the common column to join the two tables.
How can I calculate the working time in SQL Server between two datetime variables, excluding holidays?
Any ideas?
Holidays aren't universal; they depend very much on your location. Even which days of the week count as "working" days depends on your location.
Because of that, a general, universal answer is not possible, and for that reason there's also no system-provided function in T-SQL for doing this. How would SQL Server know what holidays you have in your corner of the world?
You need to have a table of your holidays somewhere in your system and handle it yourself.
Some posts that might be of some help to you:
Calculate Number of Working Days in SQL Server: this just basically removes any Saturdays and Sundays - but doesn't include other holidays
How do I count the number of business days between two dates? : shows the same main approach, with the addition of a table that contains other holidays like Easter, 4th of July (US National Holiday) and so on
Like marc_s says, you currently need a custom solution. I really hope Microsoft adds some standard functionality: it's tough to get right, and holidays are pretty much standardized by location.
Here's an example:
declare @start_date datetime
declare @end_date datetime
set @start_date = '2010-12-20'
set @end_date = '2010-12-26'
-- A table with all non-working days. This just adds Christmas, but you
-- probably should add weekends as well.
declare @non_working_days table (dt datetime)
insert @non_working_days values ('2010-12-25'), ('2010-12-26')
-- Remove the time part
set @start_date = DATEADD(D, 0, DATEDIFF(D, 0, @start_date))
set @end_date = DATEADD(D, 0, DATEDIFF(D, 0, @end_date))
-- Find the number of non-working days in [start, end)
declare @nwd_count int
select @nwd_count = count(*)
from @non_working_days
where dt >= @start_date and dt < @end_date
-- Print result
select datediff(DAY, @start_date, @end_date) - @nwd_count
This prints 5, because the 25th is not a working day.
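The counting logic above can be checked with a short Python sketch. It uses the same half-open range (start included, end excluded) and the same two non-working days; the function name `working_days` and the optional `skip_weekends` flag are additions for illustration, not part of the T-SQL example.

```python
from datetime import date, timedelta


def working_days(start, end, non_working, skip_weekends=False):
    """Count days in [start, end) that are not listed as non-working.

    With skip_weekends=True, Saturdays and Sundays are excluded as well.
    """
    count = 0
    day = start
    while day < end:
        is_weekend = day.weekday() >= 5  # 5 = Saturday, 6 = Sunday
        if day not in non_working and not (skip_weekends and is_weekend):
            count += 1
        day += timedelta(days=1)
    return count


holidays = {date(2010, 12, 25), date(2010, 12, 26)}
result = working_days(date(2010, 12, 20), date(2010, 12, 26), holidays)
print(result)
```

Because the end date is excluded, the 26th never enters the count, which is why only the 25th is subtracted.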
Have a table which has a row for every date you're interested in, and, say, a "working hours" column, or just a "working day" indicator if you want to do it at day granularity. (I find this approach makes the final SQL simpler, plus enables all sorts of other useful queries, but then I'm into data warehousing, rather than operational databases, so you may find the "just list the holidays" approach better, depending...)
You will, of course, have to create that table yourself, working from some feed of holiday dates for the region you're interested in.
Typically you can project these forward at least a year, as most public holidays are agreed a long way in advance (though some pop up at the "last minute": in the UK, for example, 29 April will be an extra public holiday in 2011, as there's a royal wedding taking place, and we got less than a year's notice of that).
Then you just
SELECT
    SUM(working_hours)
FROM
    all_dates
WHERE
    the_date BETWEEN @start_date AND @end_date
If you want to do this internationally, it gets incredibly difficult to get your data; there's no sensible source that I know of for international holiday dates, and different regions in a "country" might have different dates -- e.g. you may know that someone's in the United Kingdom, but unless you know if they're in Scotland or not, you won't know if the first two days of the year are a public holiday, or just the first...
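As a rough illustration of the "one row per date" approach, the sketch below builds a tiny all_dates table in an in-memory SQLite database from Python and sums the working hours over a range. SQLite stands in for SQL Server, and the 8-hour working day, the date range, and the two holiday dates are assumptions made up for the example.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE all_dates (the_date TEXT PRIMARY KEY, working_hours INTEGER)")

# Illustrative non-working days; a real table would be loaded from a holiday feed.
holidays = {date(2010, 12, 25), date(2010, 12, 26)}

day = date(2010, 12, 20)
while day <= date(2010, 12, 31):
    # Weekends and holidays get 0 hours, every other day an assumed 8 hours.
    hours = 0 if day.weekday() >= 5 or day in holidays else 8
    conn.execute("INSERT INTO all_dates VALUES (?, ?)", (day.isoformat(), hours))
    day += timedelta(days=1)

total = conn.execute(
    "SELECT SUM(working_hours) FROM all_dates WHERE the_date BETWEEN ? AND ?",
    ("2010-12-20", "2010-12-26"),
).fetchone()[0]
print(total)
```

Note that BETWEEN includes both endpoints, so the end date's hours are counted too.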
I have an appointments table with appointments for a number of 'resources'.
What I need to do is query that and return (for a particular resource) all free appointment slots across a date range.
I had thought the best way to approach this would be to generate a temp table of possible appointment times (the appointment length may be 30/60/90 minutes and would be specified for the query) and then select the intersect of those two recordsets, i.e. all of those, across the date range, where there are NOT appointments in the appointments table, thus returning all possible appointments for that resource.
Or maybe just, again, generate the records of possible appointment datetimes, and then EXCEPT the actual appointments already booked?
Unless, of course, someone can suggest an easier option?
I'm also not entirely sure how to generate the table of possibles, i.e. a table with records for 2010-12-08 09:00, 2010-12-08 10:00, and so on (for 1-hour appointments).
Any ideas?
Edit: I have a vague idea on the possibles...
DECLARE @startDate DateTime
DECLARE @endDate DateTime
set @startDate = '2010-12-08 09:00'
set @endDate = '2010-12-11 09:00';
with mycte as
(
    select cast(@startDate as datetime) DateValue
    union all
    select dateadd(mi,30,DateValue)
    from mycte
    where DateValue <= @endDate
    and datepart(hh, dateadd(mi,30,DateValue)) Between 9 AND 16
)
select DateValue
from mycte
This is a classic gaps and islands problem: a common class of problem where you need to identify missing values (gaps) in a sequence. Fortunately, there is a free sample chapter on this very topic from the Manning book SQL Server MVP Deep Dives. Hopefully it will provide inspiration, as it gives guidance on a number of possible approaches.
http://www.manning.com/nielsen/SampleChapter5.pdf
Here is Itzik Ben-Gan's description of the problem, quoted from the above chapter.
Gaps and islands problems involve missing values in a sequence ... The sequences involved can also be temporal, such as order dates, some of which are missing due to inactive periods (weekends, holidays). Finding periods of inactivity is an example of the gaps problem, and finding periods of activity is an example of the islands problem.
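The "generate the candidate slots, then remove the booked ones" idea from the question can be sketched in Python as follows. This is only an illustration of the logic, not the approach from the linked chapter: the function name `free_slots`, the 09:00 to 17:00 opening hours, and the sample booking are all assumptions.

```python
from datetime import datetime, timedelta


def free_slots(start, end, length_minutes, booked, open_hour=9, close_hour=17):
    """List slot start times within [start, end] that overlap no booking.

    booked is a list of (start, end) datetime pairs for a single resource.
    Slots are laid out back to back from open_hour on each day.
    """
    length = timedelta(minutes=length_minutes)
    slots = []
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day < end:
        slot = day + timedelta(hours=open_hour)
        closing = day + timedelta(hours=close_hour)
        while slot + length <= closing:
            slot_end = slot + length
            in_range = start <= slot and slot_end <= end
            # Two intervals overlap when each starts before the other ends.
            clashes = any(b_start < slot_end and slot < b_end for b_start, b_end in booked)
            if in_range and not clashes:
                slots.append(slot)
            slot = slot_end
        day += timedelta(days=1)
    return slots


booked = [(datetime(2010, 12, 8, 10, 0), datetime(2010, 12, 8, 11, 0))]
slots = free_slots(datetime(2010, 12, 8, 9, 0), datetime(2010, 12, 8, 13, 0), 60, booked)
for s in slots:
    print(s)
```

In SQL the same shape is a generated table of candidate start times filtered with NOT EXISTS (or EXCEPT) against the appointments table, using the same overlap test.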
I don't really know how to handle facts that happened over a period of time. I usually deal with facts that happened on a specific date.
To be clear, my facts have a start_date and an end_date. So, let's say my start_date is 01/01/2008 and my end_date is 01/01/2011. I need to get the number of those facts that happened in 2009 and those that happened this year. The same fact can have happened in both years. The way to determine whether a fact is part of 2009 is to check whether it was active on 12/31/2009.
I was thinking about a StartDate and EndDate dimensions, using date ranges (so from the first date of my StartDate dimension to 12/31/2009 and from 12/31/2009 to the last date in my EndDate dimension). I would cross join those.
I tried it, it works, but it's REALLY slow.
Any thoughts?
I found the solution to what I wanted. Thanks to David and Chris for the answers, though! Here's what I wanted to achieve, but I was lacking the MDX syntax:
SELECT [Measures].[NumberX] ON COLUMNS
FROM [MyCube]
WHERE ([START DATE].[REFDATE].FirstMember:[START DATE].[REFDATE].&[2009-12-31T00:00:00],
[END DATE].[REFDATE].&[2010-01-01T00:00:00]:[END DATE].[REFDATE].LastMember)
Pretty simple. I think my question was not clear, that's why I got different answers ;-)
You can always use a date range with the two time dimensions like:
Select [start date].[year].[2009]:[end date].[year].[2010] on 0
from cube
If I'm understanding the question correctly, two time dimensions should work together fine. I have two in a project I'm doing at work and they work rather fast together. Make sure that you set them up in the dimension usage section of the cube so you can differentiate the two dates.
You just need one Date dimension DimDate. Your fact table can have 2 foreign keys to the DimDate, one for startdate and one for enddate.
FactTable
{
FactID int,
StartDate int,
EndDate int
-- Other fields
}
DimDate
{
DimDateID int,
Year int,
Month int,
Day int,
-- Other fields if needed
}
To get all facts whose period overlaps the year 2009, use:
SELECT f.FactID FROM FactTable f
INNER JOIN DimDate dStart ON dStart.DimDateID = f.StartDate
INNER JOIN DimDate dEnd ON dEnd.DimDateID = f.EndDate
WHERE dStart.Year <= 2009
AND dEnd.Year >= 2009
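To check that this overlap condition behaves as intended, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite stands in for the real warehouse, the yyyymmdd-style key values and the three sample facts are invented for illustration, and the helper name `facts_active_in` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimDate (DimDateID INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER, Day INTEGER);
CREATE TABLE FactTable (FactID INTEGER PRIMARY KEY, StartDate INTEGER, EndDate INTEGER);
INSERT INTO DimDate VALUES (20080101, 2008, 1, 1), (20090615, 2009, 6, 15),
                           (20100301, 2010, 3, 1), (20110101, 2011, 1, 1);
INSERT INTO FactTable VALUES (1, 20080101, 20110101),  -- spans 2009
                             (2, 20090615, 20100301),  -- starts in 2009
                             (3, 20100301, 20110101);  -- starts after 2009
""")


def facts_active_in(year):
    """Return IDs of facts whose start year <= year <= end year."""
    rows = conn.execute("""
        SELECT f.FactID
        FROM FactTable f
        INNER JOIN DimDate dStart ON dStart.DimDateID = f.StartDate
        INNER JOIN DimDate dEnd   ON dEnd.DimDateID   = f.EndDate
        WHERE dStart.Year <= ? AND dEnd.Year >= ?
        ORDER BY f.FactID
    """, (year, year)).fetchall()
    return [r[0] for r in rows]


print(facts_active_in(2009))
print(facts_active_in(2010))
```

A fact that spans several years is returned for each of them, which matches the requirement that the same fact can count in both 2009 and the current year.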