How can I select first inserted row in SQL Server? - sql-server

Maybe this question is off-topic for Stack Overflow, but this is my problem and I do not know the syntax.
With this query I select the persons who had transactions, along with the date and time of each transaction.
This is my query
How can I write a query that selects only each person's first TransactionsTimeStamp?

I assume that you are looking for ranking functions like ROW_NUMBER. You could use them, for example, with a common table expression (CTE):
WITH CTE AS
(
SELECT ..., RN = ROW_NUMBER() OVER (PARTITION BY FirstName, LastName
ORDER BY TransactionsTimeStamp ASC)
FROM dbo.TableName ... (join tables here)
)
SELECT ....
FROM CTE
WHERE RN = 1
Here ... stands for the columns that you want to select; unlike with GROUP BY, you can select all of them.
But if you just want the first TransactionsTimeStamp for every user:
SELECT MIN(TransactionsTimeStamp) AS TransactionsTimeStamp, FirstName, LastName
FROM dbo.tableName ... (join tables here)
GROUP BY FirstName, LastName
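To make the difference between the two approaches concrete, here is a minimal runnable sketch of the ROW_NUMBER variant. The table variable @Transactions, the Amount column and the sample rows are made up for illustration; only FirstName, LastName and TransactionsTimeStamp come from the question. It assumes SQL Server 2008 or later for the multi-row VALUES list.

```sql
-- Hypothetical sample data; table and column names other than
-- FirstName, LastName and TransactionsTimeStamp are placeholders.
DECLARE @Transactions TABLE
(
    FirstName NVARCHAR(50),
    LastName NVARCHAR(50),
    Amount DECIMAL(10, 2),
    TransactionsTimeStamp DATETIME
);

INSERT INTO @Transactions VALUES
    ('A', 'B', 10.00, '20150101'),
    ('A', 'B', 25.00, '20150120'),
    ('C', 'D', 40.00, '20150103'),
    ('C', 'D', 15.00, '20150119');

-- Number each person's transactions from oldest to newest,
-- then keep only the first one together with its other columns.
WITH CTE AS
(
    SELECT FirstName, LastName, Amount, TransactionsTimeStamp,
           RN = ROW_NUMBER() OVER (PARTITION BY FirstName, LastName
                                   ORDER BY TransactionsTimeStamp ASC)
    FROM @Transactions
)
SELECT FirstName, LastName, Amount, TransactionsTimeStamp
FROM CTE
WHERE RN = 1;
```

Unlike the MIN/GROUP BY query, this returns the Amount that belongs to the earliest transaction of each person.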

The problem in your query is that you are grouping by the Date column, so you get every distinct Date value in the result. You should group only by FirstName and LastName and apply an aggregate function to the Date column.
If only the minimum date is needed, you can get it with an aggregate function such as MIN:
DECLARE @test TABLE
(
    first_name NVARCHAR(MAX) ,
    last_name NVARCHAR(MAX) ,
    transaction_date DATETIME
)
INSERT INTO @test
VALUES ( 'A', 'B', '20150101' )
INSERT INTO @test
VALUES ( 'A', 'B', '20150120' )
INSERT INTO @test
VALUES ( 'C', 'D', '20150103' )
INSERT INTO @test
VALUES ( 'C', 'D', '20150119' )
-- Earliest transaction date per person
SELECT first_name ,
       last_name ,
       MIN(transaction_date) AS min_transaction_date
FROM @test
GROUP BY first_name ,
         last_name
Output:
first_name last_name min_transaction_date
A B 2015-01-01 00:00:00.000
C D 2015-01-03 00:00:00.000

Select firstname, lastname, min(date) as minimum_date from clubprofile_cp
group by firstname, lastname

Related

Snowflake : IN operator

I want something like the following in my query:
select * from table a
where a.id in(select id, max(date) from table a group by id)
I get an error here, since IN is equivalent to = and the subquery returns two columns.
How can I do this?
Example:
id date
1 2022-31-01
1 2022-21-03
2 2022-01-01
2 2022-02-01
I need to get only one record per id, based on the max date. The table has more columns than just id and date,
so I need something like this in Snowflake:
select * from table a
where id in(select id,max(date) from table a group by id)
All of the solutions work if I select from the table,
but I have a CASE expression in a view, and it returns duplicate records.
Example:
create or replace view v_test
as
select * from
(
select id,lastdatetime,*,
case when start_date < timestamp and timestamp < end
and move_date = '9999-12-31' then 'Y'
else 'N' end as IND
from table a
) a
So if anyone selects from the view where IND = 'Y', more than one record comes back.
What I want is to select the latest record per ID where IND = 'Y', i.e. the one with max(lastdatetime).
How can I incorporate this logic in the view?
I think you are trying to get the latest record for each id?
select *
from table a
qualify row_number() over (partition by id order by date desc) = 1
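For the view in the question, one option is to apply the same QUALIFY filter inside the view definition, partitioning by both the id and the flag so that filtering on IND = 'Y' afterwards returns one row per id. This is only a sketch: demo_src, v_demo_latest and the sample rows are made-up names and data, and the CASE expression from the question is replaced by a plain ind column for brevity.

```sql
-- Hypothetical table and view names; sample rows are invented for illustration.
create or replace table demo_src (id int, lastdatetime timestamp, ind varchar);

insert into demo_src values
    (1, '2022-01-01 10:00:00', 'Y'),
    (1, '2022-01-02 10:00:00', 'Y'),
    (1, '2022-01-03 10:00:00', 'N'),
    (2, '2022-01-01 10:00:00', 'Y');

-- Keep only the latest row per (id, ind) inside the view itself.
create or replace view v_demo_latest as
select *
from demo_src
qualify row_number() over (partition by id, ind order by lastdatetime desc) = 1;

-- Callers can then filter on the flag and get at most one row per id.
select * from v_demo_latest where ind = 'Y';
```

With these rows, the final query should return the 2022-01-02 row for id 1 and the single row for id 2.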
So if we look at your sub-select, using this "data" CTE for the examples:
with data (id, _date) as (
select column1, to_date(column2, 'yyyy-dd-mm') from values
(1, '2022-31-01'),
(1, '2022-21-03'),
(2, '2022-01-01'),
(2, '2022-02-01')
)
select id, max(_date)
from data
group by 1;
it gives:
ID MAX(_DATE)
1 2022-03-21
2 2022-01-02
which makes it seem you want "the last date, per id",
which can classically (ANSI SQL) be written:
select d.*
from data as d
join (
select
id,
max(_date) as max_date
from data
group by 1
) as c
on d.id = c.id and d._date = c.max_date
;
gives:
ID _DATE
1 2022-03-21
2 2022-01-02
which gives you all the column values of the matching rows. BUT if several rows share the same last date, you will get all of them in the output.
Another method is to use ROW_NUMBER to pick one and only one row per id, which is the style of answer Mike has given:
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify row_number() over (partition by id order by _date desc) =1 ;
gives:
ID _DATE EXTRA
1 2022-03-21 extra_b_double_a
2 2022-01-02 extra_d
Now, if you want "all rows of the last day", your method works, albeit the QUALIFY/ROW_NUMBER form is faster. For that you can use RANK (the example below uses DENSE_RANK; for rank 1 the two are equivalent):
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify dense_rank() over (partition by id order by _date desc) =1 ;
gives:
ID _DATE EXTRA
1 2022-03-21 extra_b_double_a
1 2022-03-21 extra_b_double_b
2 2022-01-02 extra_d
The last thing that it almost seems you are asking for is "how do I find the ID with the most recent date (here 1) and get all rows for it":
with data (id, _date, extra) as (
select column1, to_date(column2, 'yyyy-dd-mm'), column3 from values
(1, '2022-31-01', 'extra_a'),
(1, '2022-21-03', 'extra_b_double_a'),
(1, '2022-21-03', 'extra_b_double_b'),
(2, '2022-01-01', 'extra_c'),
(2, '2022-02-01', 'extra_d')
)
select *
from data
qualify id = last_value(id) over (order by _date);
Here is an example of how to use the IN operator with a single-column subquery:
select * from table1 t1 where t1.id in (select t2.id from table2 t2);
IN can also match on both columns at once by comparing a tuple:
select *
from tab AS a
where (a.id, a.date) in (select id, max(date) from tab group by id);
For sample data:
CREATE TABLE tab (id, date)
AS
SELECT column1, to_date(column2, 'yyyy-dd-mm')
FROM VALUES
(1, '2022-31-01'),
(1, '2022-21-03'),
(2, '2022-01-01'),
(2, '2022-02-01');

Variable within SQL query

I have this:
SELECT NEWID() as id,
'OwnerReassign' as name,
1 as TypeId,
'MyOrganisation' as OrgName,
'07DA8E53-74BD-459C-AF94-A037897A51E3' as SystemUserId,
0 as StatusId,
GETDATE() as CreatedAt,
'{"EntityName":"account","Ids":["'+CAST(AccountId as varchar(50))+'"],"OwnerId":"0C01C994-1205-E511-988E-26EE4189191B"}' as [Parameters]
FROM Account
WHERE OwnerIdName IN ('John Smith') AND New_AccountType = 1
Within the Parameters field is an ID (0C01C994-1205-E511-988E-26EE4189191B). Is it possible to sequentially assign a different ID from a list to each row? There are 5 IDs in total.
What I'm trying to get is this result set split equally between the 5 different IDs.
Thanks
You can add one more NEWID() in the subquery and use it in the outer SELECT as below:
SELECT id, [name], TypeId, OrgName, SystemUserId, StatusId, CreatedAt,
'{"EntityName":"account","Ids":["' + AccountId +'"],"OwnerId":"' + ParamId + '"}' as [Parameters]
FROM (
SELECT NEWID() as id,
'OwnerReassign' as name,
1 as TypeId,
'MyOrganisation' as OrgName,
'07DA8E53-74BD-459C-AF94-A037897A51E3' as SystemUserId,
0 as StatusId,
GETDATE() as CreatedAt,
CAST(NEWID() AS VARCHAR (36)) as ParamId,
CAST(AccountId as varchar(50)) as AccountId
FROM Account
WHERE OwnerIdName IN ('John Smith') AND New_AccountType = 1
) A
You can use something like the following. Basically, assign a row number both to your IDs and to the rows of the data table to update, then apply a modulo (%) operation with the number of IDs you want to assign, so that the data table is split into N groups. Then use that group number to assign each ID.
IF OBJECT_ID('tempdb..#IDsToAssign') IS NOT NULL
DROP TABLE #IDsToAssign
CREATE TABLE #IDsToAssign (
IDToAssign VARCHAR(100))
-- 3 IDs example
INSERT INTO #IDsToAssign (
IDToAssign)
SELECT IDToAssign = NEWID()
UNION ALL
SELECT IDToAssign = NEWID()
UNION ALL
SELECT IDToAssign = NEWID()
DECLARE @AmountIDsToAssign INT = (SELECT COUNT(1) FROM #IDsToAssign)
IF OBJECT_ID('tempdb..#Account') IS NOT NULL
DROP TABLE #Account
CREATE TABLE #Account (
PrimaryKey INT PRIMARY KEY,
AssignedID VARCHAR(100))
-- 10 Rows example
INSERT INTO #Account (
PrimaryKey)
VALUES
(100),
(200),
(351),
(154),
(194),
(345),
(788),
(127),
(124),
(14)
;WITH DataRowNumber AS
(
SELECT
A.*,
RowNumber = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM
#Account AS A
),
IDsRowNumbers AS
(
SELECT
D.IDToAssign,
RowNumber = ROW_NUMBER() OVER (ORDER BY D.IDToAssign)
FROM
#IDsToAssign AS D
),
NewIDAssignation AS
(
SELECT
R.*,
IDRowNumberAssignation = (R.RowNumber % @AmountIDsToAssign) + 1
FROM
DataRowNumber AS R
FROM
DataRowNumber AS R
)
UPDATE A SET
AssignedID = R.IDToAssign
FROM
NewIDAssignation AS N
INNER JOIN IDsRowNumbers AS R ON N.IDRowNumberAssignation = R.RowNumber
INNER JOIN #Account AS A ON N.PrimaryKey = A.PrimaryKey
SELECT
*
FROM
#Account AS A
ORDER BY
A.AssignedID
/* Results:
PrimaryKey AssignedID
----------- ------------------------------------
124 1CC7F0F1-7EDE-4F7F-B0A3-739D74A62390
194 1CC7F0F1-7EDE-4F7F-B0A3-739D74A62390
351 1CC7F0F1-7EDE-4F7F-B0A3-739D74A62390
788 2A58A573-EDCB-428E-A87A-6BFCED265A9C
200 2A58A573-EDCB-428E-A87A-6BFCED265A9C
127 2A58A573-EDCB-428E-A87A-6BFCED265A9C
14 2A58A573-EDCB-428E-A87A-6BFCED265A9C
100 FD8036DA-0E15-453E-8A59-FA3C2BDB8FB1
154 FD8036DA-0E15-453E-8A59-FA3C2BDB8FB1
345 FD8036DA-0E15-453E-8A59-FA3C2BDB8FB1
*/
The ORDER BY inside the ROW_NUMBER() function determines how the IDs are assigned.
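To show the effect of that ordering, here is a compact sketch of the same modulo idea with a deterministic ORDER BY. The table variables @Owner and @Account, the Slot column and all values are hypothetical stand-ins, not the real schema or IDs from the question.

```sql
-- Hypothetical data: account keys and owner IDs are made up for illustration.
DECLARE @Owner TABLE (Slot INT PRIMARY KEY, OwnerId CHAR(36));
INSERT INTO @Owner (Slot, OwnerId) VALUES
    (0, '00000000-0000-0000-0000-000000000001'),
    (1, '00000000-0000-0000-0000-000000000002'),
    (2, '00000000-0000-0000-0000-000000000003');

DECLARE @Account TABLE (AccountId INT PRIMARY KEY);
INSERT INTO @Account (AccountId) VALUES (10), (20), (30), (40), (50), (60), (70);

DECLARE @OwnerCount INT = (SELECT COUNT(*) FROM @Owner);

-- Number the accounts in a fixed order, then map row 1, 2, 3, 4, ...
-- onto slots 0, 1, 2, 0, ... so the owners are assigned round-robin.
WITH Numbered AS
(
    SELECT AccountId,
           RowNum = ROW_NUMBER() OVER (ORDER BY AccountId)
    FROM @Account
)
SELECT n.AccountId, o.OwnerId
FROM Numbered AS n
JOIN @Owner AS o
  ON o.Slot = (n.RowNum - 1) % @OwnerCount
ORDER BY n.AccountId;
```

Because the row numbers follow AccountId, rerunning the query gives the same assignment each time, whereas ORDER BY (SELECT NULL) leaves the order up to the engine.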
You could potentially do this by using the ROW_NUMBER() function in a subquery; for example:
SELECT NEWID() as id, 'OwnerReassign' as name, 1 as TypeId,
'MyOrganisation' as OrgName,
'07DA8E53-74BD-459C-AF94-A037897A51E3' as SystemUserId,
0 as StatusId, GETDATE() as CreatedAt,
case (B.RowNum - 1) % 5 -- cycles through 0..4, one value per ID
when 0 then '0C01C994-1205-E511-988E-26EE4189191B'
when 1 then '12345677-1205-E511-988E-26EE4189191B'
when 2 then '66666666-1205-E511-988E-26EE4189191B'
-- etc...
end as OwnerId
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY A.AccountId) AS RowNum
FROM Account A
WHERE OwnerIdName IN ('John Smith') AND New_AccountType = 1
) AS B
If you want the system to pick those values, then you could put them in their own temporary table, too.

SQL Server or functionality in where condition

I have a SQL query where I want to get the rows with the value "all" or "female" in the [gender] column and the value "A" in the [group] column. If there are two such rows, one with [group] = A and [gender] = all and the other with [group] = A and [gender] = female, I want to get only the row with [gender] = all. Currently I use:
where [group] = 'A' and (gender = 'all' or gender = 'female')
But I get both rows.
In the example table below I want to get only the row: A all
With the WHERE clause above, however, I get both rows for group A:
group gender
A all
A female
B all
C female
C all
You can use something like row_number() to prioritize the various subsets of records you're looking at and then select only one record from each. From the wording of your question I assume there is some other field in the table on which you're "grouping" records together—in other words, a field whose every distinct value should produce at most one record in the result set whose group and gender values match your criteria. In the following example I've assumed that this field is called Category; if you share the actual schema of your table then I can improve the example, but this should suffice to illustrate the idea.
declare @SampleData table
(
    Category bigint,
    [Group] char(1),
    Gender varchar(16)
);
insert @SampleData values
    (1, 'A', 'Female'), -- include
    (2, 'B', 'Female'), -- exclude; wrong group
    (3, 'A', 'Female'), -- exclude; right group and gender but superseded by (3, 'A', 'All')
    (3, 'A', 'All'), -- include
    (4, 'A', 'All'), -- include
    (5, 'A', 'Male'); -- exclude; wrong gender
with PrioritizedData as
(
    select
        D.*,
        -- 'All' sorts first within each Category, so it gets Priority 1 when present
        [Priority] = row_number() over (partition by D.Category order by case D.Gender when 'All' then 0 else 1 end)
    from
        @SampleData D
    where
        D.[Group] = 'A' and
        D.Gender in ('Female', 'All')
)
select * from PrioritizedData P where P.[Priority] = 1;
You can use the RANK() window function with results partitioned by group and ordered by gender (this works because "all" sorts alphabetically before "female" or "male"). If your ordering gets more complex than that, you'll have to look at another way to order them.
/* TEST DATA */
; WITH a AS (
SELECT 'A' AS thegroup, 'all' AS gender UNION ALL
SELECT 'A' AS thegroup, 'all' AS gender UNION ALL
SELECT 'A' AS thegroup, 'female' AS gender UNION ALL
SELECT 'B' AS thegroup, 'all' AS gender UNION ALL
SELECT 'C' AS thegroup, 'female' AS gender UNION ALL
SELECT 'C' AS thegroup, 'all' AS gender UNION ALL
SELECT 'D' AS thegroup, 'female' AS gender
)
/* THE QUERY */
SELECT b.*
FROM (
SELECT thegroup, gender, RANK() OVER (PARTITION BY thegroup ORDER BY gender) AS rn /* Sets the ranked groups of 'thegroup' */
FROM a
) b
WHERE b.rn = 1 /* Gets first group. */
AND thegroup = 'A'
Data script:
declare @data table ([group] char(1), [gender] varchar(16));
insert into @data values ('A', 'all'), ('A', 'female') ,('B', 'all') ,('C', 'female') ,('C', 'all');
Query:
select
[group] = [d].[group]
,[gender] = [x].[gender]
from
@data as [d]
cross apply
(
select top 1 [gender] from @data where [group] = [d].[group] order by iif([gender] = 'all', 0, 1) asc
) as [x]
group by
[d].[group]
,[x].[gender];

Suggestions for creating a Unique ID when doing a UNION of more tables

I have a view made like this
CREATE VIEW PEOPLE
AS
SELECT CustomerId, CustomerName FROM Customers
UNION ALL
SELECT EmployeeId, EmployeeName FROM Employees
UNION ALL
SELECT FriendId, FriendName From Friends
Now I need to add a unique ID to the view, because of course I can have a CustomerId = 15 and an EmployeeId = 15.
So the trick I am using is the following:
SELECT
CAST('1' + CAST(CustomerId AS VARCHAR(30)) AS INT) as UniqueCustomerId,
CustomerId, CustomerName FROM Customers
UNION ALL
SELECT
CAST('2' + CAST(EmployeeId AS VARCHAR(30)) AS INT) as UniqueEmployeeId,
EmployeeId, EmployeeName FROM Employees
UNION ALL
SELECT
CAST('3' + CAST(FriendId AS VARCHAR(30)) AS INT) as UniqueFriendId,
FriendId, FriendName From Friends
However, this casting to varchar(30) and back to int adds overhead, since I have many records.
Could you suggest a better approach?
If you've got to have a single id, just do math:
SELECT 1000000 + CustomerID AS UniqueCustomerId
, CustomerId
, CustomerName
FROM Customers
UNION ALL
SELECT 2000000 + EmployeeId AS UniqueEmployeeId
, EmployeeId
, EmployeeName
FROM Employees
UNION ALL
SELECT 3000000 + FriendId AS UniqueFriendId
, FriendId
, FriendName
FROM Friends
I prefer this approach, as it can still use the indexes on CustomerId, EmployeeId, and FriendId:
SELECT 1 [PersonType],CustomerId [PersonId], CustomerName [PersonName] FROM Customers
UNION ALL
SELECT 2, EmployeeId, EmployeeName FROM Employees
UNION ALL
SELECT 3, FriendId, FriendName From Friends
This should work fine:
CREATE VIEW PEOPLE AS
SELECT CustomerId, null as EmployeeId, null as FriendId, CustomerName FROM Customers
UNION ALL
SELECT null, EmployeeId, null, EmployeeName FROM Employees
UNION ALL
SELECT null, null, FriendId, FriendName From Friends
Clean and well readable, I use this in Postgres.
select CAST(row_number() OVER () AS bigint) AS id,
*
from (
select mtm.id as mkt_tech_map_id,
'vesivagu' as type,
mv.geom,
'' as label
from mkt_tech_map mtm
left join mkt_vesivagu mv on mtm.id = mv.mkt_tech_map_id
union
select mtm.id as mkt_tech_map_id,
'kraavisete' as type,
mk.geom,
'' as label
from mkt_tech_map mtm
left join mkt_kraavisete mk on mtm.id = mk.mkt_tech_map_id
) as q

Date Range Intersection Splitting in SQL

I have a SQL Server 2005 database which contains a table called Memberships.
The table schema is:
PersonID int, Surname nvarchar(30), FirstName nvarchar(30), Description nvarchar(100), StartDate datetime, EndDate datetime
I'm currently working on a grid feature which shows a breakdown of memberships by person. One of the requirements is to split membership rows where there is an intersection of date ranges. The intersection must be bound by the Surname and FirstName, i.e. splits only occur between membership records with the same Surname and FirstName.
Example table data:
18 Smith John Poker Club 01/01/2009 NULL
18 Smith John Library 05/01/2009 18/01/2009
18 Smith John Gym 10/01/2009 28/01/2009
26 Adams Jane Pilates 03/01/2009 16/02/2009
Expected result set:
18 Smith John Poker Club 01/01/2009 04/01/2009
18 Smith John Poker Club / Library 05/01/2009 09/01/2009
18 Smith John Poker Club / Library / Gym 10/01/2009 18/01/2009
18 Smith John Poker Club / Gym 19/01/2009 28/01/2009
18 Smith John Poker Club 29/01/2009 NULL
26 Adams Jane Pilates 03/01/2009 16/02/2009
Does anyone have any idea how I could write a stored procedure that returns a result set with the breakdown described above?
The difficulty with this problem is that, as the data set grows, T-SQL solutions won't scale well. The code below uses a series of temporary tables built on the fly. It splits each date range entry into its individual days using a numbers table. This is where it won't scale, primarily because of your open-ended NULL values, which appear to mean infinity, so you have to swap in a fixed date far in the future that limits the range to a feasible length of time. You could likely see better performance by building a table of days or a calendar table with appropriate indexing, so that each day can be looked up efficiently.
Once the ranges are split, the descriptions are merged using XML PATH so that each day in the range series has all of its descriptions listed. Numbering the rows by PersonID and date allows the first and last row of each range to be found using two NOT EXISTS checks: one finds rows where no previous row exists for the same PersonID and description set, the other finds rows where no next row exists for the same PersonID and description set.
This result set is then renumbered using ROW_NUMBER so that the start and end rows can be paired up to build the final results.
/*
SET DATEFORMAT dmy
USE tempdb;
GO
CREATE TABLE Schedule
( PersonID int,
Surname nvarchar(30),
FirstName nvarchar(30),
Description nvarchar(100),
StartDate datetime,
EndDate datetime)
GO
INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Poker Club', '01/01/2009', NULL)
INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Library', '05/01/2009', '18/01/2009')
INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Gym', '10/01/2009', '28/01/2009')
INSERT INTO Schedule VALUES (26, 'Adams', 'Jane', 'Pilates', '03/01/2009', '16/02/2009')
GO
*/
SELECT
PersonID,
Description,
theDate
INTO #SplitRanges
FROM Schedule, (SELECT DATEADD(dd, number, '01/01/2008') AS theDate
FROM master..spt_values
WHERE type = N'P') AS DayTab
WHERE theDate >= StartDate
AND theDate <= isnull(EndDate, '31/12/2012')
SELECT
ROW_NUMBER() OVER (ORDER BY PersonID, theDate) AS rowid,
PersonID,
theDate,
STUFF((
SELECT '/' + Description
FROM #SplitRanges AS s
WHERE s.PersonID = sr.PersonID
AND s.theDate = sr.theDate
FOR XML PATH('')
), 1, 1,'') AS Descriptions
INTO #MergedDescriptions
FROM #SplitRanges AS sr
GROUP BY PersonID, theDate
SELECT
ROW_NUMBER() OVER (ORDER BY PersonID, theDate) AS ID,
*
INTO #InterimResults
FROM
(
SELECT *
FROM #MergedDescriptions AS t1
WHERE NOT EXISTS
(SELECT 1
FROM #MergedDescriptions AS t2
WHERE t1.PersonID = t2.PersonID
AND t1.RowID - 1 = t2.RowID
AND t1.Descriptions = t2.Descriptions)
UNION ALL
SELECT *
FROM #MergedDescriptions AS t1
WHERE NOT EXISTS
(SELECT 1
FROM #MergedDescriptions AS t2
WHERE t1.PersonID = t2.PersonID
AND t1.RowID = t2.RowID - 1
AND t1.Descriptions = t2.Descriptions)
) AS t
SELECT DISTINCT
PersonID,
Surname,
FirstName
INTO #DistinctPerson
FROM Schedule
SELECT
t1.PersonID,
dp.Surname,
dp.FirstName,
t1.Descriptions,
t1.theDate AS StartDate,
CASE
WHEN t2.theDate = '31/12/2012' THEN NULL
ELSE t2.theDate
END AS EndDate
FROM #DistinctPerson AS dp
JOIN #InterimResults AS t1
ON t1.PersonID = dp.PersonID
JOIN #InterimResults AS t2
ON t2.PersonID = t1.PersonID
AND t1.ID + 1 = t2.ID
AND t1.Descriptions = t2.Descriptions
DROP TABLE #SplitRanges
DROP TABLE #MergedDescriptions
DROP TABLE #DistinctPerson
DROP TABLE #InterimResults
/*
DROP TABLE Schedule
*/
The above solution will also handle gaps between additional Descriptions as well, so if you were to add another Description for PersonID 18 leaving a gap:
INSERT INTO Schedule VALUES (18, 'Smith', 'John', 'Gym', '10/02/2009', '28/02/2009')
It will fill the gap appropriately. As pointed out in the comments, you shouldn't have name information in this table, it should be normalized out to a Persons Table that can be JOIN'd to in the final result. I simulated this other table by using a SELECT DISTINCT to build a temp table to create that JOIN.
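The calendar-table suggestion made earlier in this answer could be sketched as follows. The table name dbo.Calendar and the ten-year range are assumptions for illustration, and the multi-row VALUES constructor needs SQL Server 2008 or later (on SQL Server 2005, master..spt_values as used above would be an alternative source of numbers).

```sql
-- Hypothetical calendar table: one row per day, clustered on the date.
CREATE TABLE dbo.Calendar
(
    theDate DATETIME NOT NULL PRIMARY KEY
);

-- Build the numbers 0..9999 from four cross-joined digit sets,
-- then keep ten years of days starting at 2008-01-01.
WITH Digits AS
(
    SELECT d FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(d)
),
Numbers AS
(
    SELECT n = a.d + 10 * b.d + 100 * c.d + 1000 * e.d
    FROM Digits AS a
    CROSS JOIN Digits AS b
    CROSS JOIN Digits AS c
    CROSS JOIN Digits AS e
)
INSERT INTO dbo.Calendar (theDate)
SELECT DATEADD(dd, n, '20080101')
FROM Numbers
WHERE n < 3653;   -- 2008-01-01 through 2017-12-31
```

The #SplitRanges step could then join Schedule to dbo.Calendar on theDate between StartDate and the substituted end date, instead of generating days from spt_values on every run.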
Try this
SET DATEFORMAT dmy
DECLARE @Membership TABLE(
PersonID int,
Surname nvarchar(16),
FirstName nvarchar(16),
Description nvarchar(16),
StartDate datetime,
EndDate datetime)
INSERT INTO @Membership VALUES (18, 'Smith', 'John', 'Poker Club', '01/01/2009', NULL)
INSERT INTO @Membership VALUES (18, 'Smith', 'John','Library', '05/01/2009', '18/01/2009')
INSERT INTO @Membership VALUES (18, 'Smith', 'John','Gym', '10/01/2009', '28/01/2009')
INSERT INTO @Membership VALUES (26, 'Adams', 'Jane','Pilates', '03/01/2009', '16/02/2009')
--Program Starts
declare @enddate datetime
--Handle the extreme case where all the enddates are null (i.e. all the memberships for all members are in progress)
-- in that case take an arbitrary date, e.g. '31/12/2009' here; otherwise add 1 more day to the highest enddate
select @enddate = case when max(enddate) is null then '31/12/2009' else max(enddate) + 1 end from @Membership
--Fill the null enddates
; with fillNullEndDates_cte as
(
select
row_number() over(partition by PersonId order by PersonId) RowNum
,PersonId
,Surname
,FirstName
,Description
,StartDate
,isnull(EndDate,@enddate) EndDate
from @Membership
)
--Generate a date calendar
, generateCalender_cte as
(
select
1 as CalenderRows
,min(startdate) DateValue
from @Membership
union all
select
CalenderRows+1
,DateValue + 1
from generateCalender_cte
where DateValue + 1 <= @enddate
)
--Generate Missing Dates based on Membership
,datesBasedOnMemberships_cte as
(
select
t.RowNum
,t.PersonId
,t.Surname
,t.FirstName
,t.Description
, d.DateValue
,d.CalenderRows
from generateCalender_cte d
join fillNullEndDates_cte t ON d.DateValue between t.startdate and t.enddate
)
--Generate Description Based On Membership Dates
, descriptionBasedOnMembershipDates_cte as
(
select
PersonID
,Surname
,FirstName
,stuff((
select '/' + Description
from datesBasedOnMemberships_cte d1
where d1.PersonID = d2.PersonID
and d1.DateValue = d2.DateValue
for xml path('')
), 1, 1,'') as Description
, DateValue
,CalenderRows
from datesBasedOnMemberships_cte d2
group by PersonID, Surname,FirstName,DateValue,CalenderRows
)
--Grouping based on membership dates
,groupByMembershipDates_cte as
(
select d.*,
CalenderRows - row_number() over(partition by Description order by PersonID, DateValue) AS [Group]
from descriptionBasedOnMembershipDates_cte d
)
select PersonId
,Surname
,FirstName
,Description
,convert(varchar(10), convert(datetime, min(DateValue)), 103) as StartDate
,case when max(DateValue)= @enddate then null else convert(varchar(10), convert(datetime, max(DateValue)), 103) end as EndDate
from groupByMembershipDates_cte
group by [Group],PersonId,Surname,FirstName,Description
order by PersonId,StartDate
option(maxrecursion 0)
[Only many, many years later.]
I created a stored procedure that aligns and breaks segments by a partition within a single table. You can then use those aligned breaks to pivot the description into a ragged column using a subquery and XML PATH.
See if the links below help:
Documentation: https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.md
Stored Procedure: https://github.com/Quebe/SQL-Algorithms/blob/master/Temporal/Date%20Segment%20Manipulation/DateSegments_AlignWithinTable.sql
For example, your call might look like:
EXEC dbo.DateSegments_AlignWithinTable
@tableName = 'tableName',
@keyFieldList = 'PersonID',
@nonKeyFieldList = 'Description',
@effectiveDateFieldName = 'StartDate',
@terminationDateFieldName = 'EndDate'
You will want to capture the result (which is a table) into another table or a temporary table (assumed to be called "AlignedDataTable" in the example below). Then you can pivot using a subquery.
SELECT
PersonID, StartDate, EndDate,
SUBSTRING ((SELECT ',' + [Description] FROM AlignedDataTable AS innerTable
WHERE
innerTable.PersonID = AlignedDataTable.PersonID
AND (innerTable.StartDate = AlignedDataTable.StartDate)
AND (innerTable.EndDate = AlignedDataTable.EndDate)
ORDER BY id
FOR XML PATH ('')), 2, 999999999999999) AS IdList
FROM AlignedDataTable
GROUP BY PersonID, StartDate, EndDate
ORDER BY PersonID, StartDate
