I am trying to assign a group number to distinct groups of rows in a dataset whose values change over time. The changing fields in my example are tran_seq, prog_id, deg_id, cur_id, and enroll_status. When any of those fields differs from the previous row, I need a new group number. When all of them match the prior row, the group number should stay the same. When I try ROW_NUMBER(), RANK(), or DENSE_RANK(), I get increasing values within the same group (e.g. the first 2 rows in the example). I think I need to ORDER BY start_date, as it is temporal data.
+----+----------+---------+--------+--------+---------------+------------+------------+---------+
| | tran_seq | prog_id | deg_id | cur_id | enroll_status | start_date | end_date | desired |
+----+----------+---------+--------+--------+---------------+------------+------------+---------+
| 1 | 1 | 6 | 9 | 3 | ENRL | 2004-08-22 | 2004-12-11 | 1 |
| 2 | 1 | 6 | 9 | 3 | ENRL | 2006-01-10 | 2006-05-06 | 1 |
| 3 | 1 | 6 | 9 | 59 | ENRL | 2006-08-29 | 2006-12-16 | 2 |
| 4 | 2 | 12 | 23 | 45 | ENRL | 2014-01-21 | 2014-05-16 | 3 |
| 5 | 2 | 12 | 23 | 45 | ENRL | 2014-08-18 | 2014-12-05 | 3 |
| 6 | 2 | 12 | 23 | 45 | LOAP | 2015-01-20 | 2015-05-15 | 4 |
| 7 | 2 | 12 | 23 | 45 | ENRL | 2015-08-25 | 2015-12-11 | 5 |
| 8 | 2 | 12 | 23 | 45 | LOAP | 2016-01-12 | 2016-05-06 | 6 |
| 9 | 2 | 12 | 23 | 45 | ENRL | 2016-05-16 | 2016-08-05 | 7 |
| 10 | 2 | 12 | 23 | 45 | LOAJ | 2016-08-23 | 2016-12-02 | 8 |
| 11 | 2 | 12 | 23 | 45 | ENRL | 2017-01-18 | 2017-05-05 | 9 |
| 12 | 2 | 12 | 23 | 45 | ENRL | 2018-01-17 | 2018-05-11 | 9 |
+----+----------+---------+--------+--------+---------------+------------+------------+---------+
Once I have grouping numbers, I think I can group by those to get what I'm ultimately after: a timeline of different statuses with start dates and end dates. For the example data above, that would be:
+---+----------+---------+--------+--------+---------------+------------+------------+
| | tran_seq | prog_id | deg_id | cur_id | enroll_status | start_date | end_date |
+---+----------+---------+--------+--------+---------------+------------+------------+
| 1 | 1 | 6 | 9 | 3 | ENRL | 2004-08-22 | 2006-05-06 |
| 2 | 1 | 6 | 9 | 59 | ENRL | 2006-08-29 | 2006-12-16 |
| 3 | 2 | 12 | 23 | 45 | ENRL | 2014-01-21 | 2014-12-05 |
| 4 | 2 | 12 | 23 | 45 | LOAP | 2015-01-20 | 2015-05-15 |
| 5 | 2 | 12 | 23 | 45 | ENRL | 2015-08-25 | 2015-12-11 |
| 6 | 2 | 12 | 23 | 45 | LOAP | 2016-01-12 | 2016-05-06 |
| 7 | 2 | 12 | 23 | 45 | ENRL | 2016-05-16 | 2016-08-05 |
| 8 | 2 | 12 | 23 | 45 | LOAJ | 2016-08-23 | 2016-12-02 |
| 9 | 2 | 12 | 23 | 45 | ENRL | 2017-01-18 | 2018-05-11 |
+---+----------+---------+--------+--------+---------------+------------+------------+
This is a classic XY problem: you are asking about an intermediate step rather than about the result you actually need.
Since you did include your end goal as an addendum, here is how you can reach it without the intermediate step:
create table #t(tran_seq int, prog_id int, deg_id int, cur_id int, enroll_status varchar(4), start_date date, end_date date, desired int);
insert into #t values
(1,6,9,3 ,'ENRL','2004-08-22','2004-12-11',1)
,(1,6,9,3 ,'ENRL','2006-01-10','2006-05-06',1)
,(1,6,9,59 ,'ENRL','2006-08-29','2006-12-16',2)
,(2,12,23,45,'ENRL','2014-01-21','2014-05-16',3)
,(2,12,23,45,'ENRL','2014-08-18','2014-12-05',3)
,(2,12,23,45,'LOAP','2015-01-20','2015-05-15',4)
,(2,12,23,45,'ENRL','2015-08-25','2015-12-11',5)
,(2,12,23,45,'LOAP','2016-01-12','2016-05-06',6)
,(2,12,23,45,'ENRL','2016-05-16','2016-08-05',7)
,(2,12,23,45,'LOAJ','2016-08-23','2016-12-02',8)
,(2,12,23,45,'ENRL','2017-01-18','2017-05-05',9)
,(2,12,23,45,'ENRL','2018-01-17','2018-05-11',9)
;
select tran_seq
,prog_id
,deg_id
,cur_id
,enroll_status
,min(start_date) as start_date
,max(end_date) as end_date
from(select *
,row_number() over (order by end_date) - row_number() over (partition by tran_seq,prog_id,deg_id,cur_id,enroll_status order by end_date) as grp
from #t
) AS g
group by tran_seq
,prog_id
,deg_id
,cur_id
,enroll_status
,grp
order by start_date;
Output
+----------+---------+--------+--------+---------------+------------+------------+
| tran_seq | prog_id | deg_id | cur_id | enroll_status | start_date | end_date |
+----------+---------+--------+--------+---------------+------------+------------+
| 1 | 6 | 9 | 3 | ENRL | 2004-08-22 | 2006-05-06 |
| 1 | 6 | 9 | 59 | ENRL | 2006-08-29 | 2006-12-16 |
| 2 | 12 | 23 | 45 | ENRL | 2014-01-21 | 2014-12-05 |
| 2 | 12 | 23 | 45 | LOAP | 2015-01-20 | 2015-05-15 |
| 2 | 12 | 23 | 45 | ENRL | 2015-08-25 | 2015-12-11 |
| 2 | 12 | 23 | 45 | LOAP | 2016-01-12 | 2016-05-06 |
| 2 | 12 | 23 | 45 | ENRL | 2016-05-16 | 2016-08-05 |
| 2 | 12 | 23 | 45 | LOAJ | 2016-08-23 | 2016-12-02 |
| 2 | 12 | 23 | 45 | ENRL | 2017-01-18 | 2018-05-11 |
+----------+---------+--------+--------+---------------+------------+------------+
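If the group numbers themselves are still wanted, one possible sketch flags each row that differs from the previous one with LAG and then takes a running SUM of the flags. It is wrapped in Python's sqlite3 module only to make it self-contained, which assumes an SQLite build with window functions (3.25 or later). The table name t is hypothetical, and the SQL itself should need little change for SQL Server 2012 or later.

```python
import sqlite3

# Sample rows from the question (without the "desired" column).
rows = [
    (1, 6, 9, 3, "ENRL", "2004-08-22", "2004-12-11"),
    (1, 6, 9, 3, "ENRL", "2006-01-10", "2006-05-06"),
    (1, 6, 9, 59, "ENRL", "2006-08-29", "2006-12-16"),
    (2, 12, 23, 45, "ENRL", "2014-01-21", "2014-05-16"),
    (2, 12, 23, 45, "ENRL", "2014-08-18", "2014-12-05"),
    (2, 12, 23, 45, "LOAP", "2015-01-20", "2015-05-15"),
    (2, 12, 23, 45, "ENRL", "2015-08-25", "2015-12-11"),
    (2, 12, 23, 45, "LOAP", "2016-01-12", "2016-05-06"),
    (2, 12, 23, 45, "ENRL", "2016-05-16", "2016-08-05"),
    (2, 12, 23, 45, "LOAJ", "2016-08-23", "2016-12-02"),
    (2, 12, 23, 45, "ENRL", "2017-01-18", "2017-05-05"),
    (2, 12, 23, 45, "ENRL", "2018-01-17", "2018-05-11"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (tran_seq int, prog_id int, deg_id int, cur_id int,"
    " enroll_status text, start_date text, end_date text)"
)
conn.executemany("insert into t values (?,?,?,?,?,?,?)", rows)

query = """
with flagged as (
    select *,
           -- 1 when any tracked column differs from the previous row.
           -- The first row compares against NULL, so it falls to ELSE 1.
           case when tran_seq      = lag(tran_seq)      over (order by start_date)
                 and prog_id       = lag(prog_id)       over (order by start_date)
                 and deg_id        = lag(deg_id)        over (order by start_date)
                 and cur_id        = lag(cur_id)        over (order by start_date)
                 and enroll_status = lag(enroll_status) over (order by start_date)
                then 0 else 1 end as is_new
    from t
)
select start_date,
       sum(is_new) over (order by start_date) as grp  -- running count of changes
from flagged
order by start_date
"""

group_numbers = [grp for _, grp in conn.execute(query)]
print(group_numbers)  # [1, 1, 2, 3, 3, 4, 5, 6, 7, 8, 9, 9]
```

Unlike the row_number() difference used in the query above, this value increases strictly along start_date, so it matches the "desired" column in the question.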
This is similar to a previous question of mine, only this time it concerns minor issue numbers.
I am wondering how I can correct this table:
+-----------+--------------+-------------+-------------+
| pkProduct | fkProductID |intMajorIssue|intMinorIssue|
+-----------+--------------+-------------+-------------+
| 1 | 10 | 1 | 0 |
| 2 | 10 | 2 | 0 |
| 3 | 10 | 2 | 1 |
| 4 | 10 | 2 | 1 |
| 5 | 10 | 2 | 2 |
| 6 | 11 | 1 | 0 |
| 7 | 11 | 1 | 1 |
| 8 | 11 | 1 | 1 |
| 9 | 11 | 1 | 1 |
| 10 | 11 | 2 | 0 |
| 11 | 11 | 2 | 1 |
| 12 | 12 | 1 | 0 |
| 13 | 12 | 2 | 1 |
+-----------+--------------+-------------+-------------+
To look like this:
+-----------+--------------+-------------+-------------+
| pkProduct | fkProductID |intMajorIssue|intMinorIssue|
+-----------+--------------+-------------+-------------+
| 1 | 10 | 1 | 0 |
| 2 | 10 | 2 | 0 |
| 3 | 10 | 2 | 1 |
| 4 | 10 | 2 | 2 |
| 5 | 10 | 2 | 3 |
| 6 | 11 | 1 | 0 |
| 7 | 11 | 1 | 1 |
| 8 | 11 | 1 | 2 |
| 9 | 11 | 1 | 3 |
| 10 | 11 | 2 | 0 |
| 11 | 11 | 2 | 1 |
| 12 | 12 | 1 | 0 |
| 13 | 12 | 2 | 0 |
+-----------+--------------+-------------+-------------+
Basically I need to fix the minor issue numbers so that they run in order for each product.
I have been trying to adapt the answer I was given on the previous question to do this, but so far I have had no luck.
Any help or advice would be much appreciated!
With the ROW_NUMBER() window function:
with cte as (
select *,
row_number() over (partition by fkProductID, intMajorIssue order by pkProduct) rn
from tablename
)
update cte
set intMinorIssue = rn - 1
See the demo.
Results:
> pkProduct | fkProductID | intMajorIssue | intMinorIssue
> --------: | ----------: | ------------: | ------------:
> 1 | 10 | 1 | 0
> 2 | 10 | 2 | 0
> 3 | 10 | 2 | 1
> 4 | 10 | 2 | 2
> 5 | 10 | 2 | 3
> 6 | 11 | 1 | 0
> 7 | 11 | 1 | 1
> 8 | 11 | 1 | 2
> 9 | 11 | 1 | 3
> 10 | 11 | 2 | 0
> 11 | 11 | 2 | 1
> 12 | 12 | 1 | 0
> 13 | 12 | 2 | 0
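For anyone who wants to check the renumbering without a SQL Server instance, here is a rough sketch using Python's sqlite3 module (assuming SQLite 3.25 or later for window functions). SQLite cannot update through a CTE the way the statement above does, so this version computes the new numbers with the same ROW_NUMBER() expression and writes them back in a second step. The table name product is hypothetical.

```python
import sqlite3

# (pkProduct, fkProductID, intMajorIssue, intMinorIssue) from the question.
rows = [
    (1, 10, 1, 0), (2, 10, 2, 0), (3, 10, 2, 1), (4, 10, 2, 1),
    (5, 10, 2, 2), (6, 11, 1, 0), (7, 11, 1, 1), (8, 11, 1, 1),
    (9, 11, 1, 1), (10, 11, 2, 0), (11, 11, 2, 1), (12, 12, 1, 0),
    (13, 12, 2, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table product (pkProduct int primary key, fkProductID int,"
    " intMajorIssue int, intMinorIssue int)"
)
conn.executemany("insert into product values (?,?,?,?)", rows)

# Number the rows within each (product, major issue) pair, starting at 0.
renumbered = conn.execute(
    """
    select pkProduct,
           row_number() over (partition by fkProductID, intMajorIssue
                              order by pkProduct) - 1 as new_minor
    from product
    """
).fetchall()

# Write the computed values back, one row per primary key.
conn.executemany(
    "update product set intMinorIssue = ? where pkProduct = ?",
    [(new_minor, pk) for pk, new_minor in renumbered],
)

minor_issues = [
    m for (m,) in conn.execute(
        "select intMinorIssue from product order by pkProduct"
    )
]
print(minor_issues)  # [0, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 0, 0]
```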
I have a SQL Server table T1 that has orders by product_id, brand, and size for each day.
T1:
+------------+-------+------+----------+--------+
| product_id | Brand | Size | Date | Orders |
+------------+-------+------+----------+--------+
| 1 | 1 | 11 | 10/18/18 | 1 |
| 1 | 1 | 6 | 10/18/18 | 2 |
| 1 | 1 | 10 | 10/18/18 | 1 |
| 1 | 1 | 7 | 10/18/18 | 3 |
| 1 | 1 | 8.5 | 10/18/18 | 5 |
| 1 | 1 | 9.5 | 10/18/18 | 2 |
| 2 | 1 | 8 | 10/19/18 | 3 |
| 2 | 1 | 7 | 10/19/18 | 6 |
| 2 | 1 | 9 | 10/19/18 | 2 |
| 3 | 2 | 5 | 10/19/18 | 23 |
| 3 | 2 | 6 | 10/19/18 | 6 |
| 3 | 2 | 10 | 10/19/18 | 7 |
+------------+-------+------+----------+--------+
I also have a table, T2, that has the launch date for each product_id. A product_id may have more than one launch date, signifying that it was "restocked".
T2:
+------------+-------------+
| product_id | launch_date |
+------------+-------------+
| 1 | 8/18/18 |
| 1 | 10/18/18 |
| 2 | 10/18/18 |
| 3 | 4/18/18 |
+------------+-------------+
My goal is to create a table that is just the first 10 days of orders in each launch date (for each product_id, brand, and size). So if launch date for product 1 is 8/18/18 and 10/18/18, then I want the daily orders from 8/18/18 to 8/28/18, and from 10/18/18 to 10/28/18.
How would I go about creating this table?
Example output:
+------------+-------+------+----------+--------+
| product_id | Brand | Size | Date | Orders |
+------------+-------+------+----------+--------+
| 1 | 1 | 11 | 10/18/18 | 1 |
| 1 | 1 | 6 | 10/18/18 | 2 |
| 1 | 1 | 10 | 10/18/18 | 1 |
| 1 | 1 | 7 | 10/18/18 | 3 |
| 1 | 1 | 8.5 | 10/18/18 | 5 |
| 1 | 1 | 9.5 | 10/18/18 | 2 |
| … | | | | |
| 1 | 1 | 11 | 10/22/18 | 4 |
| 1 | 1 | 6 | 10/22/18 | 6 |
| 1 | 1 | 10 | 10/22/18 | 2 |
| 1 | 1 | 7 | 10/22/18 | 2 |
| 1 | 1 | 8.5 | 10/22/18 | 2 |
| 1 | 1 | 9.5 | 10/22/18 | 5 |
| … | | | | |
| 1 | 1 | 11 | 10/28/18 | 7 |
| 1 | 1 | 6 | 10/28/18 | 4 |
| 1 | 1 | 10 | 10/28/18 | 2 |
| 1 | 1 | 7 | 10/28/18 | 2 |
| 1 | 1 | 8.5 | 10/28/18 | 8 |
| 1 | 1 | 9.5 | 10/28/18 | 7 |
| … | | | | |
| 2 | 1 | 8 | 10/19/18 | 3 |
| 2 | 1 | 7 | 10/19/18 | 6 |
| 2 | 1 | 9 | 10/19/18 | 2 |
| 3 | 2 | 5 | 10/19/18 | 23 |
| 3 | 2 | 6 | 10/19/18 | 6 |
| 3 | 2 | 10 | 10/19/18 | 7 |
+------------+-------+------+----------+--------+
Thank you!
EDIT: including what I have tried so far:
My thought process is to join in the launch_date and then create a column holding the number of days between the launch date and the order date. Then I can filter WHERE that column is less than or equal to 10.
This is the query I am using:
with temp as (
select
t1.product_id, t1.brand, t1.size, t1.date, t1.orders, t2.launch_date
from t1
left join t2 on t1.product_id = t2.product_id and t1.date = t2.launch_date
)
select product_id,
brand,
size,
date,
orders,
launch_date
from temp
;
In order for my reasoning to work, I would need to forward-fill the launch_date wherever it is null. I am not sure how to accomplish this. Here is the output I have so far:
+------------+-------+------+----------+--------+-------------+
| product_id | Brand | Size | Date | Orders | launch_date |
+------------+-------+------+----------+--------+-------------+
| 1 | 1 | 11 | 10/18/18 | 1 | 10/18/18 |
| 1 | 1 | 6 | 10/18/18 | 2 | 10/18/18 |
| 1 | 1 | 10 | 10/18/18 | 1 | 10/18/18 |
| 1 | 1 | 7 | 10/18/18 | 3 | 10/18/18 |
| 1 | 1 | 8.5 | 10/18/18 | 5 | 10/18/18 |
| 1 | 1 | 9.5 | 10/18/18 | 2 | 10/18/18 |
| … | | | | | |
| 1 | 1 | 11 | 10/22/18 | 4 | NULL |
| 1 | 1 | 6 | 10/22/18 | 6 | NULL |
| 1 | 1 | 10 | 10/22/18 | 2 | NULL |
| 1 | 1 | 7 | 10/22/18 | 2 | NULL |
| 1 | 1 | 8.5 | 10/22/18 | 2 | NULL |
| 1 | 1 | 9.5 | 10/22/18 | 5 | NULL |
| … | | | | | |
| 1 | 1 | 11 | 10/28/18 | 7 | NULL |
| 1 | 1 | 6 | 10/28/18 | 4 | NULL |
| 1 | 1 | 10 | 10/28/18 | 2 | NULL |
| 1 | 1 | 7 | 10/28/18 | 2 | NULL |
| 1 | 1 | 8.5 | 10/28/18 | 8 | NULL |
| 1 | 1 | 9.5 | 10/28/18 | 7 | NULL |
| … | | | | | |
| 2 | 1 | 8 | 10/19/18 | 3 | 10/18/18 |
| 2 | 1 | 7 | 10/19/18 | 6 | 10/18/18 |
| 2 | 1 | 9 | 10/19/18 | 2 | 10/18/18 |
| 3 | 2 | 5 | 10/19/18 | 23 | 10/18/18 |
| 3 | 2 | 6 | 10/19/18 | 6 | 10/18/18 |
| 3 | 2 | 10 | 10/19/18 | 7 | 10/18/18 |
+------------+-------+------+----------+--------+-------------+
If I can forward-fill the launch_date wherever it is NULL to be the most recent launch_date of that product_id, then I would be able to create a column to subtract the dates.
If anyone can point me in the right direction, I would appreciate it.
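One direction that avoids the forward-fill altogether is to join each order to every launch whose 10-day window contains the order date. The sketch below is only an illustration of that idea using Python's sqlite3 module: the order rows are made-up sample data, the column name order_date and the trimmed-down schemas are assumptions, and in SQL Server the upper bound would be written with DATEADD(day, 10, launch_date) instead of SQLite's date() modifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (product_id int, order_date text, orders int)")
conn.execute("create table t2 (product_id int, launch_date text)")

# Hypothetical orders (ISO dates so that string comparison orders correctly).
conn.executemany("insert into t1 values (?,?,?)", [
    (1, "2018-08-20", 3),   # inside the 2018-08-18 launch window
    (1, "2018-09-15", 4),   # between windows, should be dropped
    (1, "2018-10-18", 1),   # launch day of the restock
    (1, "2018-10-28", 7),   # day 10 of the restock window
    (1, "2018-10-29", 2),   # one day too late, should be dropped
    (2, "2018-10-19", 3),   # inside product 2's window
    (3, "2018-10-19", 23),  # product 3 launched 2018-04-18, too late
])

# Launch dates from T2 in the question.
conn.executemany("insert into t2 values (?,?)", [
    (1, "2018-08-18"), (1, "2018-10-18"), (2, "2018-10-18"), (3, "2018-04-18"),
])

query = """
select o.product_id, o.order_date, o.orders, l.launch_date
from t1 as o
join t2 as l
  on l.product_id = o.product_id
 and o.order_date between l.launch_date and date(l.launch_date, '+10 days')
order by o.product_id, o.order_date
"""

kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

Because the join condition is a range rather than an equality, every kept row already carries the launch date it belongs to, so no NULLs need filling.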
This is the result of a CTE query on multiple tables. I need to reshape the output, and the only approach I can think of is a pivot.
Id | Parent_Id | Description | Account_Number | Year_of_Entry | Amount
-----------------------------------------------------------------------
1 | NULL | V | 001 | 2017 | 4
2 | 1 | W | 002 | 2017 | 2
3 | 2 | X | 003 | 2017 | 1
4 | 2 | Y | 004 | 2017 | 1
5 | 1 | Z | 005 | 2017 | 2
6 | 5 | T | 006 | 2017 | 2
7 | 6 | X | 007 | 2017 | 1
8 | 6 | Y | 008 | 2017 | 1
1 | NULL | V | 001 | 2016 | 8
2 | 1 | W | 002 | 2016 | 4
3 | 2 | X | 003 | 2016 | 2
4 | 2 | Y | 004 | 2016 | 2
5 | 1 | Z | 005 | 2016 | 4
6 | 5 | X | 006 | 2016 | 2
7 | 5 | Y | 007 | 2016 | 2
I would like to get an output that matches this one.
Id | Parent_Id | Description | Account_Number | Year_of_entry| Amount| X | Y
---------------------------------------------------------------------------------
1 | NULL | V | 001 | 2017 | 4 | 2 | 2
2 | 1 | W | 002 | 2017 | 2 | 1 | 1
5 | 1 | Z | 005 | 2017 | 2 | 1 | 1
6 | 5 | T | 006 | 2017 | 2 | 1 | 1
1 | NULL | V | 001 | 2016 | 8 | 4 | 4
2 | 1 | W | 002 | 2016 | 4 | 2 | 2
5 | 1 | Z | 005 | 2016 | 4 | 2 | 2
Current output with the CTE recursion query
Id | Parent_Id | Description | Account_Number | Year_of_entry| Amount| X | Y
---------------------------------------------------------------------------------
1 | NULL | V | 001 | 2017 | 4 | 0 | 0
2 | 1 | W | 002 | 2017 | 2 | 1 | 1
5 | 1 | Z | 005 | 2017 | 2 | 0 | 0
6 | 5 | T | 006 | 2017 | 2 | 1 | 1
1 | NULL | V | 001 | 2016 | 8 | 0 | 0
2 | 1 | W | 002 | 2016 | 4 | 2 | 2
5 | 1 | Z | 005 | 2016 | 4 | 2 | 2
Current output with @Daniel's code
Id | Parent_Id | Description | Account_Number | Year_of_entry| Amount| X | Y
---------------------------------------------------------------------------------
2 | 1 | W | 002 | 2017 | 2 | 1 | 1
6 | 5 | T | 006 | 2017 | 2 | 1 | 1
2 | 1 | W | 002 | 2016 | 4 | 2 | 2
5 | 1 | Z | 005 | 2016 | 4 | 2 | 2
I have used ISNULL to convert NULLs to 0.
EDIT: Thanks for the help.
I ended up using 2 recursive CTEs to resolve this.
The first gets the X and Y values to the parent.
The second passes all the totals up the tree to the root.
Thanks again for the assistance.
Regards
MJK
Use conditional logic with aggregation to create your x and y columns:
select a.Id, a.Parent_Id, a.Description, a.Account_Number, a.Year_of_Entry, a.Amount,
       max(case when b.description in ('x','y') then null else b.amount end) amount,
       sum(case when b.description='x' then b.amount else null end) X,
       sum(case when b.description='y' then b.amount else null end) y
from yourtable a
join yourtable b on (a.id=b.parent_id or a.parent_id is null) and a.Year_of_Entry=b.Year_of_Entry
where b.description in ('x','y')
group by a.Id, a.Parent_Id, a.Description, a.Account_Number, a.Year_of_Entry, a.Amount
order by a.Year_of_Entry desc, a.parent_id
I have this table of information
fk_studentID | fk_courseID | fk_educationalSemesterID | value
-------------+-------------+--------------------------+------
1 | 1 | 1 | 18
1 | 2 | 1 | 18
1 | 3 | 1 | 14
1 | 4 | 1 | 17
1 | 5 | 1 | 14
1 | 6 | 1 | 17
1 | 8 | 1 | 18
1 | 1 | 2 | 19
1 | 2 | 2 | 19
1 | 3 | 2 | 18
1 | 4 | 2 | 15
1 | 4 | 2 | 19
1 | 5 | 2 | 20
1 | 1 | 3 | 17
1 | 8 | 3 | 20
I need to produce this output:
fk_studentID | fk_courseID | 1st Semester | 2nd Semester | 3rd Semester
-------------+-------------+--------------+--------------+-------------
1 | 1 | 18 | 19 | 17
1 | 2 | 18 | 19 |
1 | 3 | 14 | 18 |
1 | 4 | 17 | 15 |
1 | 5 | 14 | 19 |
1 | 6 | 17 | |
1 | 8 | 18 | 20 | 20
Please help
I think you are just looking for a standard pivot query:
SELECT fk_studentID,
fk_courseID,
MAX(CASE WHEN fk_educationalSemesterID = 1 THEN value END) AS [1st_semester],
MAX(CASE WHEN fk_educationalSemesterID = 2 THEN value END) AS [2nd_semester],
MAX(CASE WHEN fk_educationalSemesterID = 3 THEN value END) AS [3rd_semester]
FROM yourTable
GROUP BY fk_studentID,
fk_courseID
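As a quick check of the conditional-aggregation pivot, here is a small sketch run against the sample rows as posted, using Python's sqlite3 module. The table name yourTable comes from the query above. The double-quoted aliases are an assumption made for SQLite, since an alias that starts with a digit has to be quoted or bracketed in most engines.

```python
import sqlite3

# (fk_studentID, fk_courseID, fk_educationalSemesterID, value) as posted.
rows = [
    (1, 1, 1, 18), (1, 2, 1, 18), (1, 3, 1, 14), (1, 4, 1, 17),
    (1, 5, 1, 14), (1, 6, 1, 17), (1, 8, 1, 18),
    (1, 1, 2, 19), (1, 2, 2, 19), (1, 3, 2, 18), (1, 4, 2, 15),
    (1, 4, 2, 19), (1, 5, 2, 20),
    (1, 1, 3, 17), (1, 8, 3, 20),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourTable (fk_studentID int, fk_courseID int,"
    " fk_educationalSemesterID int, value int)"
)
conn.executemany("insert into yourTable values (?,?,?,?)", rows)

query = """
select fk_studentID,
       fk_courseID,
       -- CASE without ELSE yields NULL, which MAX ignores
       max(case when fk_educationalSemesterID = 1 then value end) as "1st_semester",
       max(case when fk_educationalSemesterID = 2 then value end) as "2nd_semester",
       max(case when fk_educationalSemesterID = 3 then value end) as "3rd_semester"
from yourTable
group by fk_studentID, fk_courseID
order by fk_studentID, fk_courseID
"""

pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

Note that course 4 has two semester-2 rows in the posted data (15 and 19), so MAX returns 19 for it. If duplicates like that are possible, the aggregate decides which value survives.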