Row to Column Group by ID SQL Server

I have a table like this:
id name date
-------------------
1 Adam 2018-10-01
1 Adam 2018-08-01
2 Eve 2018-07-01
2 Eve 2018-05-01
I want it to become like this:
id name firstdate lastdate
--------------------------------
1 Adam 2018-08-01 2018-10-01
2 Eve 2018-05-01 2018-07-01
I tried to use this query but it failed:
SELECT * FROM View_MySource
PIVOT (
MIN(mydate)
FOR id IN ([firstdate], [lastdate])
) piv
I am new to pivot, can someone help me?

I might even avoid using PIVOT in this case:
SELECT
id,
name,
MIN(date) AS firstdate,
MAX(date) AS lastdate
FROM View_MySource
GROUP BY
id,
name;

This has actually nothing to do with pivoting data. All you want to do is get the minimum and maximum date per ID; a simple aggregation:
select id, name, min(date) as firstdate, max(date) as lastdate
from view_mysource
group by id, name
order by id;
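
For anyone who wants to check the aggregation without a SQL Server instance, here is a minimal sketch in Python using the standard-library sqlite3 module. It assumes an in-memory table named View_MySource loaded with the sample rows from the question. SQLite stands in for SQL Server here, and the helper name first_last_dates is made up for illustration.

```python
import sqlite3

def first_last_dates(rows):
    """Return (id, name, firstdate, lastdate) per id using MIN/MAX with GROUP BY."""
    conn = sqlite3.connect(":memory:")
    # Dates are stored as ISO-8601 text, so MIN/MAX compare them correctly as strings.
    conn.execute("CREATE TABLE View_MySource (id INTEGER, name TEXT, date TEXT)")
    conn.executemany("INSERT INTO View_MySource VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT id, name, MIN(date) AS firstdate, MAX(date) AS lastdate
        FROM View_MySource
        GROUP BY id, name
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return result

# Sample rows copied from the question.
sample = [
    (1, "Adam", "2018-10-01"),
    (1, "Adam", "2018-08-01"),
    (2, "Eve", "2018-07-01"),
    (2, "Eve", "2018-05-01"),
]

for row in first_last_dates(sample):
    print(row)
```

No PIVOT is involved: each id collapses to one row because the two aggregates are computed per group.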

Related

Bigquery - Add full date range to each id

How can I apply GENERATE_DATE_ARRAY(start_date, end_date[, INTERVAL INT64_expr date_part]) to each record in a dataset? I understand how to use it to get a single date range from start to end, but I don't know how to build a separate date array for each id.
Say I have two distinct IDs, x and y, with the following dates:
|id|date
--------------
1 |x |2021-01-01
2 |x |2021-01-03
3 |y |2021-01-06
4 |y |2021-01-09
and I want to fill in the date gaps for each ID.
How can I achieve the following output?
|id|date
--------------
1 |x |2021-01-01
2 |x |2021-01-02
3 |x |2021-01-03
4 |y |2021-01-06
5 |y |2021-01-07
6 |y |2021-01-08
7 |y |2021-01-09

Below is for BigQuery Standard SQL
select id, date from (
select id, date, lead(date) over(partition by id order by date) next_date
from `project.dataset.table`
), unnest(generate_date_array(date, next_date)) date
where not next_date is null
-- order by date
Applied to the sample data from your question, the output matches the result you asked for.

Try the following in standard SQL in BigQuery:
with data as (
select 'x' as id, date '2021-01-01' as date
UNION ALL
select 'x' as id, date '2021-01-03' as date
UNION ALL
select 'y' as id, date '2021-01-06' as date
UNION ALL
select 'y' as id, date '2021-01-09' as date
)
select d1.id, date
from data d1
join data d2
on d1.id = d2.id
and d1.date < d2.date, unnest(GENERATE_DATE_ARRAY(d1.date, d2.date, INTERVAL 1 DAY)) as date;
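
To make the expected result easy to verify outside BigQuery, here is a small Python sketch of the same idea. It is not a translation of either query: it takes the earliest and latest date per id and emits every day in between, which is what GENERATE_DATE_ARRAY does for one start/end pair. The function name fill_date_gaps is hypothetical.

```python
from datetime import date, timedelta

def fill_date_gaps(rows):
    """rows: iterable of (id, date). Return every day from the min to the max date per id."""
    bounds = {}
    for key, day in rows:
        lo, hi = bounds.get(key, (day, day))
        bounds[key] = (min(lo, day), max(hi, day))

    filled = []
    for key in sorted(bounds):
        lo, hi = bounds[key]
        # Inclusive range: one entry per day from lo through hi.
        for offset in range((hi - lo).days + 1):
            filled.append((key, lo + timedelta(days=offset)))
    return filled

# Sample rows copied from the question.
sample = [
    ("x", date(2021, 1, 1)),
    ("x", date(2021, 1, 3)),
    ("y", date(2021, 1, 6)),
    ("y", date(2021, 1, 9)),
]

for key, day in fill_date_gaps(sample):
    print(key, day.isoformat())
```

Because it works from the min and max per id, this sketch also behaves sensibly when an id has more than two rows, where pairing consecutive rows would repeat the boundary dates.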

return amount per year/month records based on start and enddate

I have a table with, for example this data:
ID |start_date |end_date |amount
---|------------|-----------|--------
1 |2019-03-21 |2019-05-09 |10000.00
2 |2019-04-02 |2019-04-10 |30000.00
3 |2018-11-01 |2019-01-08 |20000.00
I would like to get the records back split per year/month, with the amount calculated correctly for each month.
I expect the outcome to be like this:
ID |month |year |amount
---|------|-------|--------
1 |3 | 2019 | 2200.00
1 |4 | 2019 | 6000.00
1 |5 | 2019 | 1800.00
2 |4 | 2019 |30000.00
3 |11 | 2018 | 8695.65
3 |12 | 2018 | 8985.51
3 |1 | 2019 | 2318.84
What would be the best way to achieve this? I think you would have to use DATEDIFF to get the number of days between the start_date and end_date to calculate the amount per day, but I'm not sure how to return it as records per month/year.
Tnx in advance!

This is one idea. I use a Tally to create a row for every day the amount is relevant for that ID. Then I aggregate the Amount divided by the number of days, grouped by month and year:
CREATE TABLE dbo.YourTable(ID int,
StartDate date,
EndDate date,
Amount decimal(12,2));
GO
INSERT INTO dbo.YourTable (ID,
StartDate,
EndDate,
Amount)
VALUES(1,'2019-03-21','2019-05-09',10000.00),
(2,'2019-04-02','2019-04-10',30000.00),
(3,'2018-11-01','2019-01-08',20000.00);
GO
--Create a tally
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP (SELECT MAX(DATEDIFF(DAY, t.StartDate, t.EndDate)+1) FROM dbo.YourTable t) --Limits the rows, might be needed in a large dataset, might not be, remove as required
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1, N N2, N N3), --1000 days, is that enough?
--Create the dates
Dates AS(
SELECT YT.ID,
DATEADD(DAY, T.I, YT.StartDate) AS [Date],
YT.Amount,
COUNT(T.I) OVER (PARTITION BY YT.ID) AS [Days]
FROM Tally T
JOIN dbo.YourTable YT ON T.I <= DATEDIFF(DAY, YT.StartDate, YT.EndDate))
--And now aggregate
SELECT D.ID,
DATEPART(MONTH,D.[Date]) AS [Month],
DATEPART(YEAR,D.[Date]) AS [Year],
CONVERT(decimal(12,2),SUM(D.Amount / D.[Days])) AS Amount
FROM Dates D
GROUP BY D.ID,
DATEPART(MONTH,D.[Date]),
DATEPART(YEAR,D.[Date])
ORDER BY D.ID,
[Year],
[Month];
GO
DROP TABLE dbo.YourTable;
GO
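
The arithmetic behind the expected output can be checked with a short Python sketch. It assumes the same rule as the question (the amount is spread evenly over every day from start to end, inclusive) and rounds each month to two decimals. The function name split_by_month is hypothetical, and unlike the T-SQL above it multiplies the daily share by the day count instead of summing per-day quotients, so the last cent could differ in edge cases.

```python
from collections import defaultdict
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP

def split_by_month(start, end, amount):
    """Spread amount evenly over the days start..end and total it per (year, month)."""
    total_days = (end - start).days + 1  # both endpoints count
    days_per_month = defaultdict(int)
    for offset in range(total_days):
        day = start + timedelta(days=offset)
        days_per_month[(day.year, day.month)] += 1

    cent = Decimal("0.01")
    return [
        (year, month, (amount * days / total_days).quantize(cent, rounding=ROUND_HALF_UP))
        for (year, month), days in sorted(days_per_month.items())
    ]

# Sample rows copied from the question: (ID, start_date, end_date, amount).
sample = [
    (1, date(2019, 3, 21), date(2019, 5, 9), Decimal("10000.00")),
    (2, date(2019, 4, 2), date(2019, 4, 10), Decimal("30000.00")),
    (3, date(2018, 11, 1), date(2019, 1, 8), Decimal("20000.00")),
]

for row_id, start, end, amount in sample:
    for year, month, share in split_by_month(start, end, amount):
        print(row_id, month, year, share)
```

For ID 1 the range covers 50 days (11 in March, 30 in April, 9 in May), so each day carries 200.00, which gives the 2200 / 6000 / 1800 split from the question.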

SQL query to calculate Throughput based "subtracting" two Select statements using Group By

I'm trying to formulate a SQL query to calculate the difference in the number of people "arriving" and "departing" grouped by City and Date.
TravelerID ArrivalDate DepartureDate City
1 2015-10-01 2015-10-03 New York
2 2015-10-02 2015-10-03 New York
3 2015-10-02 2015-10-04 Chicago
4 2015-10-01 2015-10-02 Chicago
I'm hoping to get a table that looks like
NumOfTravelers Date City
1 2015-10-01 New York
1 2015-10-02 New York
-2 2015-10-03 New York
1 2015-10-01 Chicago
0 2015-10-02 Chicago
-1 2015-10-04 Chicago
A positive number for NumOfTravelers means that more people arrived in that city on that particular date. A negative number for NumOfTravelers means that more people left that city on that particular date.
In trying to break down this SQL query, I've tried
SELECT COUNT(TravelerID) as NumTravelersArriving, ArrivalDate, City FROM TravelTable GROUP BY ArrivalDate, City;
SELECT COUNT(TravelerID) as NumTravelersDeparting, DepartureDate, City FROM TravelTable GROUP BY DepartureDate, City;
I'm trying to get "NumTravelersArriving" - "NumTravelersDeparting" into a column that represents "traveler throughput" grouped by City and Date.
I've been so stumped on this. I'm using SQL Server, and having a frustrating time using Table aliases and Column aliases.

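Try this. The FULL JOIN keeps dates that have only arrivals or only departures, and COALESCE fills in the missing side before the two counts are added:
SELECT COALESCE(a.NumOfTravelers, 0) + COALESCE(b.NumOfTravelers, 0) As NumOfTravelers,
COALESCE(a.Date, b.Date) As Date,
COALESCE(a.City, b.City) As City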
FROM (
SELECT City, ArrivalDate As Date, COUNT(TravelerID) As NumOfTravelers
FROM TravelTable
GROUP BY City, ArrivalDate
) a
FULL JOIN (
SELECT City, DepartureDate As Date, COUNT(TravelerID) * -1 As NumOfTravelers
FROM TravelTable
GROUP BY City, DepartureDate
) b ON b.City = a.City AND b.Date = a.Date
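
As a cross-check on the expected table, here is a minimal Python sketch using the standard-library sqlite3 module. Note that it swaps in a different technique from the join above: every arrival counts as +1 and every departure as -1, the two sets are stacked with UNION ALL, and the deltas are summed per city and date. SQLite stands in for SQL Server, and the names net_travelers, movements and Delta are made up for illustration.

```python
import sqlite3

def net_travelers(rows):
    """Return (NumOfTravelers, Date, City): arrivals minus departures per city and date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TravelTable "
        "(TravelerID INTEGER, ArrivalDate TEXT, DepartureDate TEXT, City TEXT)"
    )
    conn.executemany("INSERT INTO TravelTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT SUM(Delta) AS NumOfTravelers, Date, City
        FROM (
            SELECT City, ArrivalDate AS Date, 1 AS Delta FROM TravelTable
            UNION ALL
            SELECT City, DepartureDate AS Date, -1 AS Delta FROM TravelTable
        ) AS movements
        GROUP BY City, Date
        ORDER BY City, Date
        """
    ).fetchall()
    conn.close()
    return result

# Sample rows copied from the question.
sample = [
    (1, "2015-10-01", "2015-10-03", "New York"),
    (2, "2015-10-02", "2015-10-03", "New York"),
    (3, "2015-10-02", "2015-10-04", "Chicago"),
    (4, "2015-10-01", "2015-10-02", "Chicago"),
]

for row in net_travelers(sample):
    print(row)
```

Stacking the movements avoids having to reconcile NULLs from an outer join, and a date with one arrival and one departure (Chicago on 2015-10-02) nets out to 0.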

Combine two results in one row

EmpID Name Date Earn
1 A 7/1/2014 2
1 A 7/1/2014 4
1 A 7/2/2014 1
1 A 7/2/2014 2
2 B 7/1/2014 5
2 B 7/2/2014 5
I would like to combine two results in one row, as shown below. Here is my statement, but I want to find a way to get the Total_Earn. Thanks.
"SELECT EmpID, Name, Date, Sum(earn) FROM employee WHERE Date between DateFrom and DateTo
GROUP BY EmpID, Name, Date"
EmpID Name Date Earn Total_Earn
1 A 7/2/2014 3 9
2 B 7/2/2014 5 10

It looks like you want the Max date and the Sum of Earn for each employee. Assuming you want one record for each ID/Name, you would do this:
select EmpID, Name, Max(Date), Sum(Earn)
from YourTableName
group by EmpID, Name

Try this. Substitute the date for whatever value you want.
SELECT table1.EmpID, table1.Name, table1.Date, table1.Earn, table2.Total_Earn
FROM
(SELECT EmpID, Name, Date, SUM(Earn) AS Earn
FROM yourtablename
WHERE Date = '2014-07-02'
GROUP BY EmpID, Name, Date) table1
LEFT JOIN
(SELECT EmpID, SUM(Earn) AS Total_Earn
FROM yourtablename
WHERE Date <= '2014-07-02'
GROUP BY EmpID) table2
ON table1.EmpID = table2.EmpID
This will perform two SELECTs and join their results. The first select (defined as table1) will select the employee ID and the summed earnings for the specified date.
The second select (defined as table2) will select the total earnings for an employee up to and including that date.
The two results are then joined together on the employee ID.
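
The two-subquery join described above can be tried against the sample data with a short Python sketch built on the standard-library sqlite3 module. It assumes a table named employee (the name used in the question), dates stored as ISO-8601 text so that `<=` compares them correctly, and SQLite in place of whatever database the asker uses. The function name earn_with_total and the aliases per_day and to_date are hypothetical.

```python
import sqlite3

def earn_with_total(rows, as_of):
    """Return (EmpID, Name, Date, Earn, Total_Earn) for the given date per employee."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (EmpID INTEGER, Name TEXT, Date TEXT, Earn INTEGER)")
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT per_day.EmpID, per_day.Name, per_day.Date, per_day.Earn, to_date.Total_Earn
        FROM (
            -- earnings on the requested date only
            SELECT EmpID, Name, Date, SUM(Earn) AS Earn
            FROM employee
            WHERE Date = ?
            GROUP BY EmpID, Name, Date
        ) AS per_day
        LEFT JOIN (
            -- earnings up to and including the requested date
            SELECT EmpID, SUM(Earn) AS Total_Earn
            FROM employee
            WHERE Date <= ?
            GROUP BY EmpID
        ) AS to_date ON to_date.EmpID = per_day.EmpID
        ORDER BY per_day.EmpID
        """,
        (as_of, as_of),
    ).fetchall()
    conn.close()
    return result

# Sample rows copied from the question, with dates rewritten as ISO-8601 text.
sample = [
    (1, "A", "2014-07-01", 2),
    (1, "A", "2014-07-01", 4),
    (1, "A", "2014-07-02", 1),
    (1, "A", "2014-07-02", 2),
    (2, "B", "2014-07-01", 5),
    (2, "B", "2014-07-02", 5),
]

for row in earn_with_total(sample, "2014-07-02"):
    print(row)
```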

How do I perform the following multi-layered pivot with TSQL in Access 2010?

I have looked at the following relevant posts:
How to create a PivotTable in Transact/SQL?
SQL Server query - Selecting COUNT(*) with DISTINCT
SQL query to get field value distribution
Desire: To have the data change from State #1 to State #2.
Data: This data is a collection of the year(s) in which a person (identified by their PersonID) has been recorded performing a certain activity, at a certain place.
My data currently looks as follows:
State #1
Row | Year | PlaceID | ActivityID | PersonID
001 2011 Park Read 201a
002 2011 Library Read 202b
003 2012 Library Read 202b
004 2013 Library Read 202b
005 2013 Museum Read 202b
006 2011 Park Read 203c
006 2010 Library Read 203c
007 2012 Library Read 204d
008 2014 Library Read 204d
Edit (4/2/2014): I decided that I want State #2 to just be distinct counts.
What I want my data to look like:
State #2
Row | PlaceID | Column1 | Column2 | Column3
001 | Park    | 2       |         |
002 | Library | 1       | 1       | 1
003 | Museum  | 1       |         |
Where:
Column1: The count of people that attended the PlaceID to read in only one year.
Column2: The count of people that attended the PlaceID to read in two different years.
Column3: The count of people that attended the PlaceID to read in three different years.
In the State #2 schema, a person cannot be counted in more than one column for each row (place). If a person reads at a particular place for 2010, 2011, 2012, they appear in Row 001, Column3 only. However, that person can appear in other rows, but once again, in only one column of that row.
My methodology (please correct me if I am doing this wrong):
I believe that the first step is to extract distinct counts of the number of years each person attended the place to perform the activity of interest.
As such, this is where I am with the T-SQL:
SELECT
PlaceID
,PersonID
,[ActivityID]
,COUNT(DISTINCT [Year]) AS UNIQUE_YEAR_COUNT
FROM (
SELECT
Year
,PlaceID
,ActivityID
,PersonID
FROM [my].[linkeddatabasetable]
WHERE ActivityID = 'Read') t1
GROUP BY
PlaceID
,PersonID
,[ActivityID]
ORDER BY 1,2
Unfortunately, I do not know where to take it from here.

I think you have two options.
A traditional pivot
select placeID
, Column1 = [1]
, Column2 = [2]
, Column3 = [3]
from
(
SELECT
PlaceID
,COUNT(DISTINCT [Yearvalue]) AS UNIQUE_YEAR_COUNT
FROM (
SELECT
yearValue
,PlaceID
,ActivityID
,PersonID
FROM #SO
WHERE ActivityID = 'Read') t1
GROUP BY
PlaceID
,PersonID
,[ActivityID]) up
pivot (count(UNIQUE_YEAR_COUNT) for UNIQUE_YEAR_COUNT in ([1],[2],[3]) ) as pvt
or as a case/when style pivot.
select I.PlaceID
, Column1 = count(case when UNIQUE_YEAR_COUNT = 1 then PersonID else null end)
, Column2 = count(case when UNIQUE_YEAR_COUNT = 2 then PersonID else null end)
, Column3 = count(case when UNIQUE_YEAR_COUNT = 3 then PersonID else null end)
from (
SELECT
PlaceID
, PersonID
,COUNT(DISTINCT [Yearvalue]) AS UNIQUE_YEAR_COUNT
FROM (
SELECT
yearValue
,PlaceID
,ActivityID
,PersonID
FROM #SO
WHERE ActivityID = 'Read') t1
GROUP BY
PlaceID
,PersonID
,[ActivityID]) I
group by I.PlaceID
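
To see the case/when approach produce State #2 from the State #1 sample, here is a minimal Python sketch using the standard-library sqlite3 module. SQLite stands in for SQL Server, the table name Visits and the function name place_year_distribution are assumptions for illustration, and empty cells come back as 0 instead of blank.

```python
import sqlite3

def place_year_distribution(rows):
    """Per place, count people who read there in exactly 1, 2 or 3 distinct years."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Visits (Year INTEGER, PlaceID TEXT, ActivityID TEXT, PersonID TEXT)")
    conn.executemany("INSERT INTO Visits VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT PlaceID,
               COUNT(CASE WHEN YearCount = 1 THEN PersonID END) AS Column1,
               COUNT(CASE WHEN YearCount = 2 THEN PersonID END) AS Column2,
               COUNT(CASE WHEN YearCount = 3 THEN PersonID END) AS Column3
        FROM (
            -- step 1: distinct years per person and place
            SELECT PlaceID, PersonID, COUNT(DISTINCT Year) AS YearCount
            FROM Visits
            WHERE ActivityID = 'Read'
            GROUP BY PlaceID, PersonID
        ) AS per_person
        GROUP BY PlaceID
        ORDER BY PlaceID
        """
    ).fetchall()
    conn.close()
    return result

# Sample rows copied from State #1 in the question (the Row column is omitted).
sample = [
    (2011, "Park", "Read", "201a"),
    (2011, "Library", "Read", "202b"),
    (2012, "Library", "Read", "202b"),
    (2013, "Library", "Read", "202b"),
    (2013, "Museum", "Read", "202b"),
    (2011, "Park", "Read", "203c"),
    (2010, "Library", "Read", "203c"),
    (2012, "Library", "Read", "204d"),
    (2014, "Library", "Read", "204d"),
]

for row in place_year_distribution(sample):
    print(row)
```

COUNT ignores NULL, so a CASE with no ELSE branch counts only the people whose year count matches that column, and each person lands in exactly one column per place.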

Since you are in Access, I would think using an aggregate function would do the work.
Try DCOUNT() to begin with: http://office.microsoft.com/en-us/access-help/dcount-function-HA001228817.aspx.
Replace your count() with dcount("year", "linkeddatabasetable", "placeid='" & [placeid] & "'"). The single quotes are needed because PlaceID holds text values.