I have a question about how the aggregate function is applied inside PIVOT.
The table OCCUPATIONS looks like this:
+-----------+------------+
| Name | Occupation |
+-----------+------------+
| Ashley | Professor |
| Samantha | Actor |
| Julia | Doctor |
| Britney | Professor |
| Maria | Professor |
| Meera | Professor |
| Priya | Doctor |
| Priyanka | Professor |
| Jennifer | Actor |
| Ketty | Actor |
| Belvet | Professor |
| Naomi | Professor |
| Jane | Singer |
| Jenny | Singer |
| Kristeen | Singer |
| Christeen | Singer |
| Eve | Actor |
| Aamina | Doctor |
+-----------+------------+
The first column is name and second is occupation.
Now I want to make a pivot table in which each column is one occupation, the names are sorted alphabetically, and NULL is printed when there are no more names for an occupation.
The output should look like this:
+--------+-----------+-----------+----------+
| Doctor | Professor | Singer | Actor |
+--------+-----------+-----------+----------+
| Aamina | Ashley | Christeen | Eve |
| Julia | Belvet | Jane | Jennifer |
| Priya | Britney | Jenny | Ketty |
| NULL | Maria | Kristeen | Samantha |
| NULL | Meera | NULL | NULL |
| NULL | Naomi | NULL | NULL |
| NULL | Priyanka | NULL | NULL |
+--------+-----------+-----------+----------+
Here the first column is Doctor, the second is Professor, the third is Singer and the fourth is Actor. The code that generates this result is:
select [Doctor],[Professor],[Singer],[Actor]
from (
    select o.Name, o.Occupation,
           row_number() over(partition by o.Occupation order by o.Name) id
    from OCCUPATIONS o
) as src
pivot
(max(src.Name)
for src.Occupation in ([Doctor],[Professor],[Singer],[Actor])
) as m
But when I replace the derived table
(select o.Name, o.Occupation, row_number() over(partition by o.Occupation order by o.Name) id from OCCUPATIONS o) as src
with the plain table OCCUPATIONS, the result is a single row:
Priya Priyanka Kristeen Samantha
I understand why this happens: MAX() is taken over each occupation group. However, the first query also uses MAX(), and I expected it to return only the maximum value as well. Instead it returns every name, with NULL where an occupation has no more names.
Why does this happen?
Thank you!
This could be the source of the issue:
row_number() over(partition by o.Occupation order by
o.Name) id from OCCUPATIONS o
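The ROW_NUMBER() here uses PARTITION BY o.Occupation, so the id restarts at 1 for every occupation and the same id values repeat across occupations. PIVOT groups by every source column that is not used in the aggregate or the FOR clause, which here is id, so MAX() is evaluated once per id rather than once for the whole table. If you remove the PARTITION BY and keep only the ORDER BY, every id is unique and each name ends up on its own row.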
Try this approach:
1. Find the occupation with the most people.
2. Generate a table with a sequence of numbers from 1 to the number of people found in step 1.
3. Join the table generated in step 2 four times with the original table, each time filtering on a different occupation.
This is the query:
declare @tmp table([Name] varchar(50),[Occupation] varchar(50))
insert into @tmp values
('Ashley','Professor') ,('Samantha','Actor') ,('Julia','Doctor') ,('Britney','Professor') ,('Maria','Professor') ,('Meera','Professor')
,('Priya','Doctor') ,('Priyanka','Professor') ,('Jennifer','Actor') ,('Ketty','Actor') ,('Belvet','Professor') ,('Naomi','Professor')
,('Jane','Singer') ,('Jenny','Singer') ,('Kristeen','Singer') ,('Christeen','Singer') ,('Eve','Actor') ,('Aamina','Doctor')
--this variable contains the occupation that has the most names (rows) in the table
--its row count is the total number of rows in the output table
declare @Occupation_with_max_rows varchar(50)
--populate the @Occupation_with_max_rows variable
select top 1 @Occupation_with_max_rows=Occupation
from @tmp
group by Occupation
order by count(*) desc
--generate the final results by joining the original table 4 times with the sequence table
select D.Name as Doctor,P.Name as Professor,S.Name as Singer,A.Name as Actor
from
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord from @tmp where Occupation = @Occupation_with_max_rows) O
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Doctor') D on O.ord = D.ord
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Professor') P on O.ord = P.ord
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Singer') S on O.ord = S.ord
left join
(select ROW_NUMBER() OVER (ORDER BY [Name]) as ord, [Name] from @tmp where Occupation='Actor') A on O.ord = A.ord
The following code works as expected:
select [Doctor],[Professor],[Singer],[Actor]
from
(
select row_number() over (partition by occupation order by name)[A],name,occupation
from occupations
)src
pivot
(
max(Name)
for occupation in ([Doctor],[Professor],[Singer],[Actor])
)piv;
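Why the id column matters: PIVOT groups by every source column that is neither aggregated nor named in the FOR clause. With the derived table that leaves id, so MAX(Name) runs once per id; with the bare OCCUPATIONS table nothing is left to group by and the whole table collapses into one row. The sketch below is not SQL Server code. It assumes Python's sqlite3 module with an SQLite build that supports window functions (3.25 or later), and because SQLite has no PIVOT it spells out the equivalent GROUP BY with MAX(CASE ...).

```python
import sqlite3

# Sample rows from the question.
rows = [
    ("Ashley", "Professor"), ("Samantha", "Actor"), ("Julia", "Doctor"),
    ("Britney", "Professor"), ("Maria", "Professor"), ("Meera", "Professor"),
    ("Priya", "Doctor"), ("Priyanka", "Professor"), ("Jennifer", "Actor"),
    ("Ketty", "Actor"), ("Belvet", "Professor"), ("Naomi", "Professor"),
    ("Jane", "Singer"), ("Jenny", "Singer"), ("Kristeen", "Singer"),
    ("Christeen", "Singer"), ("Eve", "Actor"), ("Aamina", "Doctor"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE OCCUPATIONS (Name TEXT, Occupation TEXT)")
conn.executemany("INSERT INTO OCCUPATIONS VALUES (?, ?)", rows)

# One MAX(CASE ...) per output column plays the role of PIVOT's IN list.
columns = """
    MAX(CASE WHEN Occupation = 'Doctor'    THEN Name END) AS Doctor,
    MAX(CASE WHEN Occupation = 'Professor' THEN Name END) AS Professor,
    MAX(CASE WHEN Occupation = 'Singer'    THEN Name END) AS Singer,
    MAX(CASE WHEN Occupation = 'Actor'     THEN Name END) AS Actor
"""

# With id in the source, the aggregate is evaluated once per id value.
with_id = conn.execute(f"""
    SELECT {columns}
    FROM (SELECT Name, Occupation,
                 ROW_NUMBER() OVER (PARTITION BY Occupation ORDER BY Name) AS id
          FROM OCCUPATIONS) AS src
    GROUP BY id
    ORDER BY id
""").fetchall()

# Without id there is nothing to group by, so everything collapses to one row.
without_id = conn.execute(f"SELECT {columns} FROM OCCUPATIONS").fetchall()

for row in with_id:
    print(row)
print(without_id)
```

The first query yields one row per id (seven rows, because Professor has seven names), and the second yields the single row of per-occupation maxima shown in the question.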
I have three tables. I want to get data from all of them and put it in a virtual table. I am using SQL Server 2012.
Booking Table
BookingId | date
======================
2 | 7/1/2017 (MM/dd/yyyy)
3 | 7/1/2017
BookingCost Table
Id | BookingId | Cost
==========================
1 | 2 | 2000
2 | 3 | 4000
Expense Table
Id | ExpenseCost | Date
======================
1 | 1400 | 7/2/2017 (MM/dd/yyyy)
2 | 1422 | 7/1/2017
3 | 4000 | 6/3/2017
I want to get a monthly result like the following table:
Date | Expense | Bookings
===================================
jan/2017 | 0 | 0
feb/2017 | 0 | 0
. | . | .
. | . | .
. | . | .
jun/2017 | 4000 | 0
jul/2017 | 2822 | 6000
. | . | .
. | . | .
. | . | .
How about something like this (assuming your dates are DATE types and not VARCHAR; otherwise you could convert them)?
SELECT COALESCE(EXPENSE.MONTH, BOOKINGS.MONTH) [Date], EXPENSE.Cost Expense, BOOKINGS.Cost Bookings
FROM (
SELECT DATEADD(DD,1-DAY([date]),[date]) MONTH, SUM(Cost) Cost
FROM Booking
INNER JOIN BookingCost
ON Booking.BookingID = BookingCost.BookingID
GROUP BY DATEADD(DD,1-DAY([date]),[date])
) BOOKINGS
FULL JOIN (
SELECT DATEADD(DD,1-DAY([date]),[date]) MONTH, SUM(ExpenseCost) Cost
FROM Expense
GROUP BY DATEADD(DD,1-DAY([date]),[date])
) EXPENSE
ON EXPENSE.MONTH = BOOKINGS.MONTH
ORDER BY 1
To also get the rows with 0, you could left join the totals to a tally table that has all the months of the year. The SQL below uses FORMAT to render the month as text.
For example:
;WITH MONTHS AS
(
select
[Year], [Month],
format(datefromparts([Year],[Month],1),'MMM/yyyy') as [MonthYear]
from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) m([Month])
cross join (values (2017)) y([Year])
)
select
m.[MonthYear] as [Date],
coalesce(e.TotalExpense,0) as Expense,
coalesce(bc.TotalCost,0) as Bookings
from MONTHS m
left join (
select
datepart(year,[Date]) as [Year],
datepart(month,[Date]) as [Month],
sum(ExpenseCost) as TotalExpense
from Expense
where datepart(year,[Date]) in (select distinct [Year] from MONTHS)
group by datepart(year,[Date]), datepart(month,[Date])
)e on (e.[Year] = m.[Year] and e.[Month] = m.[Month])
left join (
select
datepart(year,b.[date]) as [Year],
datepart(month,b.[date]) as [Month],
sum(c.Cost) as TotalCost
from Booking b
join BookingCost c on c.BookingId = b.BookingId
where datepart(year,b.[date]) in (select distinct [Year] from MONTHS)
group by datepart(year,b.[date]), datepart(month,b.[date])
) bc
on (bc.[Year] = m.[Year] and bc.[Month] = m.[Month])
order by m.[Year], m.[Month];
Test data I used:
declare @Booking table (BookingId int, [date] date);
insert into @Booking (BookingId,[date]) values (2,'2017-07-01'),(3,'2017-07-01');
declare @BookingCost table (Id int, BookingId int, Cost int);
insert into @BookingCost (Id, BookingId, Cost) values (1,2,2000),(2,3,4000);
declare @Expense table (Id int, ExpenseCost int, [Date] date);
insert into @Expense (Id, ExpenseCost, [Date]) values
(1,1400,'2017-07-02'),(2,1422,'2017-07-01'),(3,4000,'2017-06-03');
Due to company policies I cannot give the actual query I am working with, but here is the breakdown and general idea. We have an attendance register that records, for each day, whether an employee was at work and where the employee works. I am trying to summarise this, to say that between two dates the employee worked 5 shifts. The problem is that one particular employee worked at workplace A for 2 days and was then transferred to workplace B. After a few days at workplace B the employee was transferred back to workplace A.
My attempt shows that the employee began working at workplace A on 1-Jan and ended on 10-Jan with only 2 working shifts. I group by the workplace, and the begin and end dates are a MIN and MAX selection.
SELECT att.Employee, att.Workplace, dte.BeginDate, dte.EndDate, shf.WorkShift
FROM (SELECT * FROM Attendance WHERE WorkDate BETWEEN '1-Jan' AND '30-Jan') att
CROSS APPLY (SELECT COUNT(Shift) WorkShift FROM Attendance WHERE WorkDate BETWEEN '1-Jan' AND '30-Jan' AND Employee = att.Employee AND WorkPlace = att.WorkPlace AND Shift = 'Worked') shf
CROSS APPLY (SELECT MIN(WorkDate) BeginDate, MAX(WorkDate) EndDate FROM Attendance WHERE WorkDate BETWEEN '1-Jan' AND '30-Jan' AND Employee = att.Employee AND WorkPlace = att.WorkPlace) dte
So this employee's records should appear like this:
| Name | Workplace | beginDate | endDate | WorkShift |
| Jane | WorkPlaceA | 1-Jan | 2-Jan | 2 |
| Jane | WorkPlaceB | 3-Jan | 8-Jan | 5 |
| Jane | WorkPlaceA | 9-Jan | 10-Jan | 2 |
The attendance table looks something like this
| Name | Workplace | Date | Shift |
| Jane | WorkplaceA | 1-Jan | Worked |
| Jane | WorkplaceA | 2-Jan | Worked |
| Jane | WorkplaceB | 3-Jan | Worked |
| Jane | WorkplaceB | 4-Jan | Worked |
| Jane | WorkplaceB | 5-Jan | Worked |
| Jane | WorkplaceA | 6-Jan | Absent |
| Jane | WorkplaceA | 7-Jan | Absent |
| Jane | WorkplaceA | 8-Jan | Worked |
| Jane | WorkplaceB | 9-Jan | Worked |
| Jane | WorkplaceB | 10-Jan | Worked |
I believe you can accomplish this using CTEs. Here is a working sample that groups consecutive worked shifts at the same workplace.
;WITH CTE1 AS (
    SELECT Employee, WorkPlace, TransactionDate,
        -- row number within each workplace of an employee
        ROW_NUMBER() OVER(PARTITION BY Employee, WorkPlace ORDER BY TransactionDate) AS WP,
        -- row number across all worked days of an employee
        ROW_NUMBER() OVER(PARTITION BY Employee ORDER BY TransactionDate) AS RN
    FROM Attendance WHERE Shift = 'Worked'),
-- WP - RN stays constant for as long as the workplace does not change
CTE2 AS (SELECT Employee, WorkPlace, TransactionDate, WP, RN, WP-RN AS GB FROM CTE1),
CTE3 AS (SELECT Employee, WorkPlace, MIN(TransactionDate) AS TransactionDate, COUNT(1) AS Shifts FROM CTE2 GROUP BY Employee, WorkPlace, GB)
SELECT Employee, WorkPlace, TransactionDate AS [Start Date], DATEADD(DAY,Shifts - 1,TransactionDate) AS [End Date], Shifts FROM CTE3 ORDER BY TransactionDate ASC
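The row-number difference used above can be checked against the sample attendance rows from the question. This is a sketch under assumptions, not the answer's exact query: it runs on SQLite through Python's sqlite3 (window functions need SQLite 3.25 or later), the year 2016 is an assumption because the question gives only day and month, and it is a variant that keeps the Absent rows inside a block and takes the end date with MAX instead of adding the shift count to the start date.

```python
import sqlite3

# Attendance rows from the question; the year is assumed.
attendance = [
    ("Jane", "WorkplaceA", "2016-01-01", "Worked"),
    ("Jane", "WorkplaceA", "2016-01-02", "Worked"),
    ("Jane", "WorkplaceB", "2016-01-03", "Worked"),
    ("Jane", "WorkplaceB", "2016-01-04", "Worked"),
    ("Jane", "WorkplaceB", "2016-01-05", "Worked"),
    ("Jane", "WorkplaceA", "2016-01-06", "Absent"),
    ("Jane", "WorkplaceA", "2016-01-07", "Absent"),
    ("Jane", "WorkplaceA", "2016-01-08", "Worked"),
    ("Jane", "WorkplaceB", "2016-01-09", "Worked"),
    ("Jane", "WorkplaceB", "2016-01-10", "Worked"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Attendance (Employee TEXT, WorkPlace TEXT, WorkDate TEXT, Shift TEXT)"
)
conn.executemany("INSERT INTO Attendance VALUES (?, ?, ?, ?)", attendance)

# The difference of the two row numbers is constant within one unbroken
# stay at a workplace, so it can serve as a grouping key.
islands = conn.execute("""
    WITH numbered AS (
        SELECT Employee, WorkPlace, WorkDate, Shift,
               ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY WorkDate)
             - ROW_NUMBER() OVER (PARTITION BY Employee, WorkPlace ORDER BY WorkDate) AS grp
        FROM Attendance
    )
    SELECT Employee, WorkPlace,
           MIN(WorkDate) AS BeginDate,
           MAX(WorkDate) AS EndDate,
           SUM(CASE WHEN Shift = 'Worked' THEN 1 ELSE 0 END) AS WorkShift
    FROM numbered
    GROUP BY Employee, WorkPlace, grp
    ORDER BY BeginDate
""").fetchall()

for row in islands:
    print(row)
```

With these rows the query returns four blocks: WorkplaceA on 1-2 Jan with 2 shifts, WorkplaceB on 3-5 Jan with 3, WorkplaceA on 6-8 Jan with 1 (two days absent), and WorkplaceB on 9-10 Jan with 2.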
I think your expected output is wrong, or the way you populated the table is wrong.
Check my query. It can be optimized further, and it does not count absent days.
declare @t table(Name varchar(100),Workplace varchar(100), AttnDate date ,Shifts varchar(100))
insert into @t values
('Jane','WorkplaceA',' 1-Jan-16','Worked')
,('Jane','WorkplaceA',' 2-Jan-16','Worked')
,('Jane','WorkplaceB',' 3-Jan-16','Worked')
,('Jane','WorkplaceB',' 4-Jan-16','Worked')
,('Jane','WorkplaceB',' 5-Jan-16','Worked')
,('Jane','WorkplaceA',' 6-Jan-16','Absent')
,('Jane','WorkplaceA',' 7-Jan-16','Absent')
,('Jane','WorkplaceA',' 8-Jan-16','Worked')
,('Jane','WorkplaceB',' 9-Jan-16','Worked')
,('Jane','WorkplaceB','10-Jan-16','Worked')
DECLARE @Name VARCHAR(100) = 'Jane'
DECLARE @FromDate DATE = '01-Jan-16'
DECLARE @ToDate DATE = '31-Jan-16';
WITH CTE
AS (
SELECT *
,row_number() OVER (
ORDER BY attndate
) rn
FROM @t
WHERE NAME = @Name
AND (
AttnDate BETWEEN @FromDate
AND @ToDate
)
)
,CTE1
AS (
SELECT A.NAME
,A.workplace
,A.AttnDate
,Shifts
,rn
,1 RN1
FROM cte A
WHERE rn = 1
UNION ALL
SELECT a.NAME
,a.workplace
,a.AttnDate
,a.Shifts
,CASE
WHEN a.workplace = b.workplace
THEN b.rn
ELSE b.rn + 1
END rn
,RN1 + 1
FROM CTE A
INNER JOIN CTE1 b ON a.attndate > b.attndate
WHERE a.rn = RN1 + 1
)
,CTE2
AS (
SELECT NAME
,Workplace
,AttnDate beginDate
,(
SELECT max(AttnDate)
FROM CTE1 b
WHERE b.rn = a.rn
) endDate
,(
SELECT count(*)
FROM CTE1 b
WHERE b.rn = a.rn
AND Shifts = 'Worked'
) WorkShift
,rn
,ROW_NUMBER() OVER (
PARTITION BY rn ORDER BY rn
) rn3
FROM cte1 a
)
SELECT NAME
,workplace
,beginDate
,endDate
,WorkShift
FROM cte2
WHERE rn3 = 1
I need to write a statement joining two tables based on dates.
Table 1 contains time recording entries.
+----+-----------+--------+---------------+
| ID | Date | UserID | DESC |
+----+-----------+--------+---------------+
| 1 | 1.10.2010 | 5 | did some work |
| 2 | 1.10.2011 | 5 | did more work |
| 3 | 1.10.2012 | 4 | me too |
| 4 | 1.11.2012 | 4 | me too |
+----+-----------+--------+---------------+
Table 2 contains the position of each user in the company. The ValidFrom date is the date at which the user has been or will be promoted.
+----+-----------+--------+------------+
| ID | ValidFrom | UserID | Pos |
+----+-----------+--------+------------+
| 1 | 1.10.2009 | 5 | PM |
| 2 | 1.5.2010 | 5 | Senior PM |
| 3 | 1.10.2010 | 4 | Consultant |
+----+-----------+--------+------------+
I need a query that outputs table 1 with one added column: the position the user held at the time the entry was made (the Date column).
All date fields are of type date.
I hope someone can help. I have tried a lot but can't get it working.
Try this using a subselect in the where clause:
SQL Fiddle
MS SQL Server 2008 Schema Setup:
CREATE TABLE TimeRecord
(
ID INT,
[Date] Date,
UserID INT,
Description VARCHAR(50)
)
INSERT INTO TimeRecord
VALUES (1,'2010-01-10',5,'did some work'),
(2, '2011-01-10',5,'did more work'),
(3, '2012-01-10', 4, 'me too'),
(4, '2012-11-01',4,'me too')
CREATE TABLE UserPosition
(
ID Int,
ValidFrom Date,
UserId INT,
Pos VARCHAR(50)
)
INSERT INTO UserPosition
VALUES (1, '2009-01-10', 5, 'PM'),
(2, '2010-05-01', 5, 'Senior PM'),
(3, '2010-01-10', 4, 'Consultant ')
Query 1:
SELECT TR.ID,
TR.[Date],
TR.UserId,
TR.Description,
UP.Pos
FROM TimeRecord TR
INNER JOIN UserPosition UP
ON UP.UserId = TR.UserId
WHERE UP.ValidFrom = (SELECT MAX(ValidFrom)
FROM UserPosition UP2
WHERE UP2.UserId = UP.UserID AND
UP2.ValidFrom <= TR.[Date])
Results:
| ID | Date | UserId | Description | Pos |
|----|------------|--------|---------------|-------------|
| 1 | 2010-01-10 | 5 | did some work | PM |
| 2 | 2011-01-10 | 5 | did more work | Senior PM |
| 3 | 2012-01-10 | 4 | me too | Consultant |
| 4 | 2012-11-01 | 4 | me too | Consultant |
You can do it using OUTER APPLY:
SELECT ID, [Date], UserID, [DESC], x.Pos
FROM table1 AS t1
OUTER APPLY (
SELECT TOP 1 Pos
FROM table2 AS t2
WHERE t2.UserID = t1.UserID AND t2.ValidFrom <= t1.[Date]
ORDER BY t2.ValidFrom DESC) AS x(Pos)
For every row of table1, the OUTER APPLY fetches all table2 rows of the same user whose ValidFrom date is earlier than or equal to [Date]. These rows are sorted by ValidFrom in descending order and only the most recent one is returned.
Note: If no match is found by the OUTER APPLY sub-query then a NULL value is returned, meaning that no valid position exists in table2 for the corresponding record in table1.
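The lookup described above (for each entry, the latest position whose ValidFrom is not after the entry's date, or NULL when there is none) can be tried outside SQL Server. The sketch below assumes Python's sqlite3 module; SQLite has no OUTER APPLY or TOP, so a correlated scalar subquery with ORDER BY ... LIMIT 1 stands in for it. The ISO dates are the ones from the schema setup shown earlier in this thread, the column name Descr is an assumption, and the entry with ID 5 is hypothetical, added only to show the NULL case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, Date TEXT, UserID INTEGER, Descr TEXT);
    CREATE TABLE table2 (ID INTEGER, ValidFrom TEXT, UserID INTEGER, Pos TEXT);

    INSERT INTO table1 VALUES
        (1, '2010-01-10', 5, 'did some work'),
        (2, '2011-01-10', 5, 'did more work'),
        (3, '2012-01-10', 4, 'me too'),
        (4, '2012-11-01', 4, 'me too'),
        (5, '2009-06-01', 4, 'hypothetical early entry');  -- not in the original data

    INSERT INTO table2 VALUES
        (1, '2009-01-10', 5, 'PM'),
        (2, '2010-05-01', 5, 'Senior PM'),
        (3, '2010-01-10', 4, 'Consultant');
""")

# For each entry, pick the newest position that was already valid on that date.
result = conn.execute("""
    SELECT t1.ID, t1.Date, t1.UserID,
           (SELECT t2.Pos
            FROM table2 AS t2
            WHERE t2.UserID = t1.UserID
              AND t2.ValidFrom <= t1.Date
            ORDER BY t2.ValidFrom DESC
            LIMIT 1) AS Pos
    FROM table1 AS t1
    ORDER BY t1.ID
""").fetchall()

for row in result:
    print(row)
```

The hypothetical entry 5 predates user 4's only position, so its Pos comes back as None, mirroring the NULL that OUTER APPLY would produce.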
This works by using a rank function and a subquery. The rank is computed per time entry, and only positions that were already valid on the entry's date are considered.
select sub.ID,sub.Date,sub.UserID,sub.Description,sub.Position
from(
select rank() over(partition by t1.ID order by t2.validfrom desc)
as [rank], t1.ID as ID,t1.Date as [Date],t1.UserID as UserID,t1.Descr
as Description,t2.pos as Position, t2.validfrom as validfrom
from temployee t1 inner join jobs t2 on -- replace the joined tables with your own table names
t1.UserID=t2.UserID and t2.validfrom <= t1.Date
) as sub
where sub.[rank]=1
This query works only when an entry's Date exactly equals a ValidFrom date; every other row gets NULL for pos:
select t1.*,t2.pos from Table1 t1 left outer join Table2 t2 on
t1.Date=t2.ValidFrom and t1.UserID=t2.UserID
First of all, excuse the long question; I will try to put it as simply as possible.
I'm trying to write a kind of a reporting query, but I'm having a problem getting the desired results. The problem:
Employee table
Id | Name
---------------
1 | John Smith
2 | Alan Jones
3 | James Jones
Task table
Id | Title | StartDate | EmployeeId | Estimate (integer - ticks)
----------------------------------------------------------------------------
1 | task1 | 21.08.2011 | 1 | 90000000000
2 | task2 | 21.08.2011 | 1 | 150000000
3 | task3 | 22.08.2011 | 2 | 1230000000
Question:
How to get the estimate summary per day, grouped, but to include all the employees?
Like this:
Date | EmployeeId | EmployeeName | SummaryEstimate
-------------------------------------------------------------
19.08.2011 | 1 | John Smith | NULL
19.08.2011 | 2 | Alan Jones | NULL
19.08.2011 | 3 | James Jones | NULL
20.08.2011 | 1 | John Smith | NULL
20.08.2011 | 2 | Alan Jones | NULL
20.08.2011 | 3 | James Jones | NULL
21.08.2011 | 1 | John Smith | 90150000000
21.08.2011 | 2 | Alan Jones | NULL
21.08.2011 | 3 | James Jones | NULL
22.08.2011 | 1 | John Smith | NULL
22.08.2011 | 2 | Alan Jones | 1230000000
22.08.2011 | 3 | James Jones | NULL
What I currently do is keep a "dates" table with 30 years of days. I left join and group by that table to get the other dates included too. Here is the query:
SELECT dates.value, employee.Id, employee.Name, sum(task.Estimate)
FROM TableOfDates as dates
left join Tasks as task on (dates.value = convert(varchar(10), task.StartTime, 101))
left join Employees as employee on (employee.Id = task.EmployeeId)
WHERE dates.value >= '2011-08-19' and dates.value < '2011-08-22'
GROUP BY dates.value, employee.Id, employee.Name
ORDER BY dates.value, employee.Id
The convert call is to get the date part of the DateTime column.
The result that I get is:
Date | EmployeeId | EmployeeName | SummaryEstimate
-------------------------------------------------------------
19.08.2011 | NULL | NULL | NULL
20.08.2011 | NULL | NULL | NULL
21.08.2011 | 1 | John Smith | 90150000000
22.08.2011 | 2 | Alan Jones | 1230000000
I am halfway there: I get the dates that have no tasks, but I cannot also get all the employees included, as in the table shown before this one.
I've tried cross joins and subqueries, with little luck. Any help would be very much appreciated! Thank you for taking the time to go through all of this; I hope I was clear enough.
SELECT DE.DateValue, DE.EmployeeId, DE.EmployeeName, sum(task.Estimate)
FROM
( SELECT
D.value AS DateValue
, E.Id AS EmployeeId
, E.Name AS EmployeeName
FROM
TableOfDates D
CROSS JOIN Employees E ) DE
left join Tasks as task on DE.DateValue = convert(varchar(10), task.StartTime, 101)
AND DE.EmployeeId = task.EmployeeId
WHERE DE.DateValue >= '2011-08-19' and DE.DateValue < '2011-08-22'
GROUP BY DE.DateValue, DE.EmployeeId, DE.EmployeeName
ORDER BY DE.DateValue, DE.EmployeeId
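The derived table DE in the query above pairs every date with every employee before the tasks are joined, which is what keeps employees without tasks in the result. A minimal sketch of that idea follows, assuming Python's sqlite3 module rather than SQL Server: a small recursive CTE stands in for TableOfDates, dates are stored as ISO text, and the task column is called StartDate as in the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (Id INTEGER, Name TEXT);
    CREATE TABLE Tasks (Id INTEGER, Title TEXT, StartDate TEXT,
                        EmployeeId INTEGER, Estimate INTEGER);

    INSERT INTO Employees VALUES
        (1, 'John Smith'), (2, 'Alan Jones'), (3, 'James Jones');

    INSERT INTO Tasks VALUES
        (1, 'task1', '2011-08-21', 1, 90000000000),
        (2, 'task2', '2011-08-21', 1, 150000000),
        (3, 'task3', '2011-08-22', 2, 1230000000);
""")

report = conn.execute("""
    WITH RECURSIVE dates(value) AS (
        SELECT '2011-08-19'
        UNION ALL
        SELECT date(value, '+1 day') FROM dates WHERE value < '2011-08-22'
    )
    SELECT d.value, e.Id, e.Name, SUM(t.Estimate) AS SummaryEstimate
    FROM dates AS d
    CROSS JOIN Employees AS e          -- every date paired with every employee
    LEFT JOIN Tasks AS t               -- tasks attach only where both keys match
           ON t.StartDate = d.value
          AND t.EmployeeId = e.Id
    GROUP BY d.value, e.Id, e.Name
    ORDER BY d.value, e.Id
""").fetchall()

for row in report:
    print(row)
```

Four dates times three employees gives twelve rows; only John Smith on 2011-08-21 and Alan Jones on 2011-08-22 carry a sum, and every other SummaryEstimate is None.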
Note that this solution lets you drop the table of days, since a recursive CTE can generate the dates dynamically.
The other CTEs (Employees and Tasks) can be replaced with the real tables.
DECLARE @startDate DATETIME = '2011-08-01'
DECLARE @endDate DATETIME = '2011-09-01'
;WITH Employees(Id,Name)
AS
(
SELECT 1, 'John Smith'
UNION ALL
SELECT 2, 'Alan Jones'
UNION ALL
SELECT 3, 'James Jones'
)
,Tasks (Id, Title, StartDate, EmployeeId, Estimate)
AS
(
SELECT 1, 'task1', '2011-08-21', 1, 90000000000
UNION ALL
SELECT 2, 'task2', '2011-08-21', 1, 150000000
UNION ALL
SELECT 3, 'task3', '2011-08-22', 2, 1230000000
)
,TableOfDates(value)
AS
(
SELECT DATEADD(DAY, DATEDIFF(DAY, 0, @startDate), 0)
UNION ALL
SELECT DATEADD(DAY, 1, value)
FROM TableOfDates
WHERE value < @endDate
)
SELECT dates.value
,employee.Id
,employee.Name
,SUM(task.Estimate) AS SummaryEstimate
FROM TableOfDates dates
CROSS JOIN Employees employee
LEFT JOIN Tasks task
ON dates.value = task.StartDate
AND (employee.Id = task.EmployeeId)
WHERE dates.value >= '2011-08-19'
AND dates.value < '2011-08-26'
GROUP BY
dates.value
,employee.Id
,employee.Name
ORDER BY
dates.value
,employee.Id
Use this query:
create table #T_dates (id_date int identity(1,1),inp_date datetime)
create table #T_tasks (id_task int identity(1,1),key_date int, key_emp int, est int)
create table #T_emp (id_emp int identity(1,1),name varchar(50))
insert #T_dates (inp_date) values ('08.19.2011')
insert #T_dates (inp_date) values ('08.20.2011')
insert #T_dates (inp_date) values ('08.21.2011')
insert #T_dates (inp_date) values ('08.22.2011')
insert #T_dates (inp_date) values ('08.23.2011')
insert #T_dates (inp_date) values ('08.24.2011')
--select * from #T_dates
insert #T_emp (name) values ('John Smith')
insert #T_emp (name) values ('Alan Jones')
insert #T_emp (name) values ('James Jones')
--select * from #T_emp
insert #T_tasks (key_date,key_emp,est) values (3,1,900000)
insert #T_tasks (key_date,key_emp,est) values (3,1,15000)
insert #T_tasks (key_date,key_emp,est) values (4,2,123000)
--select * from #T_tasks
select inp_date,id_emp,name,EST
from #T_emp
cross join #T_dates
left join
(
select key_date,key_emp,SUM(est) 'EST' from #T_tasks group by key_date,key_emp
) Gr
ON Gr.key_emp = id_emp and Gr.key_date = id_date
where inp_date >= '2011-08-19' and inp_date <= '2011-08-22'
order by inp_date,id_emp