How to generate IDs based on column values - sql-server

I will provide examples and code where I can. Assume every column except [CycleStart] and [CycleEnd] is VARCHAR; I'm not too fussed about data types at this stage.
Table A consists of the following RAW sample data:
+-------+---------+----------------+------------+------------+
| JobID | JobName | CycleDesc | CycleStart | CycleEnd |
+-------+---------+----------------+------------+------------+
| 10003 | Run1 | January 2019 | 31/12/2018 | 31/12/2018 |
| 10005 | Run2 | December 2018 | 31/12/2017 | 31/11/2018 |
| 10006 | Run3 | March 2019 | 31/12/2018 | 31/02/2019 |
| 10007 | Run4 | September 2019 | 31/12/2018 | 31/09/2019 |
| 10008 | Run5 | November 2019 | 31/12/2018 | 31/10/2019 |
+-------+---------+----------------+------------+------------+
Table B consists of the following sample data and the code used to generate this data is below:
+-------+----------+--------+
| JobID | PeriodID | Entity |
+-------+----------+--------+
| 10003 | 202101   | XYZ1   |
| 10003 | 202112   | XYZ2   |
| 10007 | 202008   | XYZ3   |
| 10007 | 202003   | XYZ4   |
| 10008 | 201904   | XYZ5   |
+-------+----------+--------+
DECLARE @Counter3 INT
SET @Counter3 = 1
WHILE @Counter3 <= 1000
BEGIN
-- insert one row of random sample data per iteration
INSERT INTO [dbo].[TableB]
SELECT
FLOOR(RAND()*(33979-1+1))+1 [JobID]
,CAST(ROUND(((2021 - 2019 -1) * RAND() + 2020), 0) AS VARCHAR) + RIGHT('0'+CAST(FLOOR(RAND()*(12-1+1))+1 AS VARCHAR),2) [PeriodID]
,FLOOR(RAND()*(23396-1+1))+1 [Entity]
SET @Counter3 = @Counter3 + 1
END
The issue lies within Table B column [PeriodID]. This column represents an ID generated from [CycleStart] in Table A e.g. 31/12/2018 = 201812 (YYYYMM).
What I want to show in Table B is, for each Job ID, one Period ID per month, starting at the [CycleStart] month and running 30 years ahead. Example table of what I am looking to achieve:
+-------+----------+--------+
| JobID | PeriodID | Entity |
+-------+----------+--------+
| 10006 | 201812   | XYZ1   |
| 10006 | 201901   | XYZ2   |
| 10006 | 201902   | XYZ3   |
| 10006 | 201903   | XYZ4   |
| 10006 | 201904   | XYZ5   |
| 10006 | 201905   | XYZ5   |
| 10006 | 201906   | XYZ5   |
| 10006 | 201907   | XYZ5   |
| ...   | +30yrs   | ...    |
| 10006 | 204812   | XYZ5   |
+-------+----------+--------+
How can I achieve this? Currently I am just randomly generating IDs that are not related to the [CycleStart] date, which skews my data, but this is the only way I can think of doing it.

The best way is to create a calendar table / date dimension. You can use this table to solve this issue, and reuse it for other problems later. (Search online for some examples on how to build one).
Once you have this table, you only need to join to it.
e.g.
INSERT INTO TableB ( JobID , PeriodID)
SELECT DISTINCT A.JobID , D.TheYear * 100 + D.TheMonth
FROM tableA A
JOIN myDateTable D
ON D.TheDate BETWEEN CONVERT(date , A.CycleStart , 103) AND DATEADD(YEAR,30, CONVERT(date , A.CycleStart , 103));
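If no calendar table is available yet, the month list can also be generated on the fly. The following is only a minimal sketch, not the answerer's method: @TableA and the Months CTE are hypothetical stand-ins, and CycleStart is assumed to already be a date. It builds the offsets 0 to 360 with a recursive CTE and turns each one into a YYYYMM value.

```sql
-- Hypothetical stand-in for TableA; CycleStart is assumed to be a date column.
DECLARE @TableA TABLE (JobID int, CycleStart date);
INSERT INTO @TableA VALUES (10006, '20181231');

-- Months produces the offsets 0..360 (the start month plus 30 years of months).
WITH Months AS (
    SELECT 0 AS n
    UNION ALL
    SELECT n + 1 FROM Months WHERE n < 360
)
SELECT a.JobID,
       YEAR(DATEADD(MONTH, m.n, a.CycleStart)) * 100
         + MONTH(DATEADD(MONTH, m.n, a.CycleStart)) AS PeriodID
FROM @TableA a
CROSS JOIN Months m
ORDER BY a.JobID, PeriodID
OPTION (MAXRECURSION 400);  -- the default limit of 100 is too low for 360 steps
```

For the sample row this yields 361 rows, from 201812 through 204812. A persistent calendar table is still the better choice when the same month list is needed in several queries.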

Related

MySQLServer: Check if conditions exist in group, then label entire group

My goal is to add another column to an existing table, to see if the value/conditions exists in a group and appropriately labeling the entire group if it is present or not.
If a Team has one project with a Budget >= 20M or Actual_Spend >= 2.5M, I want to label the Team and all its projects as Table 1 in the Category column, irrespective of whether the other projects within the same Team fit this criteria.
I will provide a SQL fiddle link w/ my solution: http://sqlfiddle.com/#!18/3ddaf/12/0
I'm ending up with two extra columns of "Team" and "Category" and not sure how they're ending up there.
Below is the end result I'm looking for. I'm open to better solutions than the one I provided.
Thank you for your time
| Team | ProjectID | Budget | Actual_Spend | State | Category |
|------|-----------|----------|--------------|------------|----------|
| Cyan | 2 | NULL | NULL | Utah | Table 1 |
| Blue | 1 | NULL | 3000000 | California | Table 1 |
| Cyan | 1 | 20000000 | 1000000 | Utah | Table 1 |
| Blue | 2 | 22000000 | NULL | California | Table 1 |
| Red | 1 | 7000000 | 1000000 | Washington | Table 2 |
| Red | 2 | 19999000 | 2490000 | Oregon | Table 2 |
| Gray | 1 | 19000000 | 2500000 | Utah | Table 1 |
| Gray | 1 | 10000000 | 500000 | Utah | Table 1 |
Providing code to create the dataset:
Create Table Source_Data
(
Team varchar(50),
ProjectID INT,
BUDGET INT,
Actual_Spend INT,
State varchar(max),
)
INSERT INTO Source_Data
VALUES
('Blue',1,NULL,3000000,'California'),
('Cyan',1,20000000,1000000,'Utah'),
('Blue',2,22000000,NULL,'California'),
('Cyan',2,NULL,NULL,'Utah'),
('Red',1,7000000,1000000,'Washington'),
('Red',2,19999000,2490000,'Oregon'),
('Gray',1,19000000,2500000,'Utah'),
('Gray',1,10000000,500000,'Utah');
I think that you are looking for window functions:
select
s.*,
min(case when Budget>=20000000 or Actual_Spend>=2500000 then 'Table1' else 'Table2' end)
over(partition by team) Category
from Source_Data s
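If any record for a team satisfies the condition Budget>=20000000 or Actual_Spend>=2500000, the new column yields Table1 for every row of that team; otherwise it yields Table2. This works because min() returns the alphabetically smallest label within the partition, and 'Table1' sorts before 'Table2'. Rows where both values are NULL fall into the else branch.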
Demo on DB Fiddle:
Team | ProjectID | Budget | Actual_Spend | State | Category
:--- | --------: | -------: | -----------: | :--------- | :-------
Blue | 2 | 22000000 | null | California | Table1
Blue | 1 | null | 3000000 | California | Table1
Cyan | 1 | 20000000 | 1000000 | Utah | Table1
Cyan | 2 | null | null | Utah | Table1
Gray | 1 | 19000000 | 2500000 | Utah | Table1
Gray | 1 | 10000000 | 500000 | Utah | Table1
Red | 1 | 7000000 | 1000000 | Washington | Table2
Red | 2 | 19999000 | 2490000 | Oregon | Table2
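The same idea can be written with a numeric flag instead of relying on the sort order of the label strings. This is only a sketch of an alternative, assuming the Source_Data table from the question already exists; the 'Table 1' and 'Table 2' labels follow the spelling in the question's desired output.

```sql
-- Assumes the Source_Data table from the question has been created and populated.
SELECT s.*,
       CASE
           -- MAX over the team is 1 as soon as any project of the team qualifies
           WHEN MAX(CASE WHEN s.Budget >= 20000000 OR s.Actual_Spend >= 2500000
                         THEN 1 ELSE 0 END)
                OVER (PARTITION BY s.Team) = 1
           THEN 'Table 1'
           ELSE 'Table 2'
       END AS Category
FROM Source_Data s;
```

Because the flag is numeric, the labels can be renamed freely without affecting which one wins.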

Can recursion start from a specific record in a table?

I'm trying to calculate depreciation on vehicles. If there is a rebate on a vehicle, I need to stop the depreciation, factor in the rebate based on the month it took effect, and resume the depreciation calculation.
A vehicle depreciates at a flat rate of 2% every month, with 50 months being the point of 100% depreciation. When a rebate appears, I can stop the depreciation, but I don't know how to make it start again from a certain month.
Below is an example of the table's depreciation up to directly before the rebate:
+----------+-------+------------+--------------+------------+------------+
| Vehicle# | month | depDate | Initial Cost | Monthlydep | totaldep |
+----------+-------+------------+--------------+------------+------------+
| 12451 | 1 | 2015-08-01 | 44953.24 | 899.06 | 899.0648 |
| 12451 | 2 | 2015-09-01 | 44953.24 | 899.06 | 1798.1296 |
| ------- | ----- | ----- | ----- | ----- | ----- |
| 12451 | 42 | 2019-01-01 | 44953.24 | 899.06 | 37760.7216 |
| 12451 | 43 | 2019-02-01 | 44953.24 | 899.06 | 38659.7864 |
+----------+-------+------------+--------------+------------+------------+
Then let's say that a rebate comes in this month (2019-03-01). It needs to be factored in, and then the depreciation needs to be recalculated from that month onwards. How do I restart the depreciation from month 43 instead of it going through everything?
For example, let's say that we get a rebate of $200 in month 44. The table should look something like the one below:
+----------+-------+------------+--------------+------------+------------+
| Vehicle# | month | depDate | Initial Cost | Monthlydep | totaldep |
+----------+-------+------------+--------------+------------+------------+
| 12451 | 43 | 2019-02-01 | 44953.24 | 899.06 | 38659.7864 |
| 12451 | 44 | 2019-03-01 | 44953.24 | 1099.06 | 39758.8464 |
| 12451 | 45 | 2019-04-01 | 44953.24 | 1099.06 | 40857.9064 |
| 12451 | 46 | 2019-05-01 | 44953.24 | 1099.06 | 41956.9664 |
| 12451 | 47 | 2019-06-01 | 44953.24 | 1099.06 | 43056.0264 |
| 12451 | 48 | 2019-07-01 | 44953.24 | 1099.06 | 44155.0864 |
| 12451 | 49 | 2019-08-01 | 44953.24 | 1099.06 | 45254.1464 |
+----------+-------+------------+--------------+------------+------------+
So month 49 would be the final month, because the totaldep is equal to or higher than the initial cost.
My sample code is below. If you remove the first CTE and the inner join in the top part of the union, what remains is the working depreciation calculation:
;With cte As( Select bd.[VehicleID]
,Max(bd.[Month]) As month
,Max(DateAdd(DAY,1,EOMONTH(DepreciationReportDate,-1))) As DepreciationReportDate
,Max(bd.MonthlyDepreciation) As MonthlyDepreciation
,Max(bd.AdjustedPurchaseCost) As AdjustedPurchaseCost
,Max(AccumulatedDepreciation) As AccumulatedDepreciation
From Work.dbo.DepreciationSchedule bd
Group By bd.VehicleID
)
,cte_CreateRows As
(
Select bd.[VehicleID]
,bd.[Month]
,DATEADD(DAY,1,EOMONTH(bd.DepreciationReportDate,-1)) As DepreciationReportDate
,bd.MonthlyDepreciation
,bd.AdjustedPurchaseCost
,bd.AccumulatedDepreciation
From Work.dbo.DepreciationSchedule bd
Inner Join cte cte
On cte.VehicleID = bd.VehicleID
And cte.month = bd.Month
Union All
Select bd.[VehicleID]
,[Month] = Cast(cr.[Month]+1 As int)
,DATEADD(DAY,1,EOMONTH(DateAdd(Month, 1, cr.DepreciationReportDate),-1)) As DepreciationReportDate
,bd.MonthlyDepreciation
,bd.AdjustedPurchaseCost
,AccumulatedDepreciation = cr.AccumulatedDepreciation + cr.MonthlyDepreciation
From Work.dbo.DepreciationSchedule bd
Inner Join cte_CreateRows cr On bd.[VehicleID] = cr.[VehicleID]
Where cr.AccumulatedDepreciation < cr.AdjustedPurchaseCost
And DateAdd(Month,1, DateAdd(DAY,1,EOMONTH(cr.DepreciationReportDate,-1))) < DATEADD(DAY,1,EOMONTH(GetDate(),-1))
)
Select a.VehicleID
,a.Month
,a.DepreciationReportDate
,Cast(a.MonthlyDepreciation As Decimal(12,2)) As 'Monthly Depreciation Expense'
,a.AdjustedPurchaseCost
,a.AccumulatedDepreciation
From [cte_CreateRows] As a
Order By a.VehicleID, a.Month

Split Time Frequency To Rows

I am trying to split a time frequency that has a start time, an end time, a frequency and a duration into separate rows. Here is some example data:
+------+------------+----------+-----------------+---------------+
| Name | Start_Time | End_Time | Frequency_Hours | Duration_Mins |
+------+------------+----------+-----------------+---------------+
| A | 08:00:00 | 18:00:00 | 2 | 2 |
| B | 00:00:00 | 23:59:59 | 1 | 5 |
| C | 00:00:00 | 23:59:59 | 4 | 15 |
+------+------------+----------+-----------------+---------------+
Can be created using the following query:
DECLARE @Tmp AS TABLE(Name VARCHAR(128)
,Start_Time VARCHAR(8)
,End_Time VARCHAR(8)
,Frequency_Hours INT
,Duration_Mins INT)
INSERT INTO @Tmp VALUES ('A','08:00:00', '18:00:00', 2,2)
,('B','00:00:00', '23:59:59', 1,5)
,('C','00:00:00', '23:59:59', 4,15)
Here is my desired output (I will then use this to drive a gantt chart visualisation):
+------+------------+----------+
| Name | Start_Time | End_Time |
+------+------------+----------+
| A | 08:00:00 | 08:02:00 |
| A | 10:00:00 | 10:02:00 |
| A | 12:00:00 | 12:02:00 |
| A | 14:00:00 | 14:02:00 |
| A | 16:00:00 | 16:02:00 |
| A | 18:00:00 | 18:02:00 |
| B | 00:00:00 | 00:05:00 |
| B | 01:00:00 | 01:05:00 |
| B | 02:00:00 | 02:05:00 |
| B | 03:00:00 | 03:05:00 |
| B | 04:00:00 | 04:05:00 |
| B | 05:00:00 | 05:05:00 |
| B | 06:00:00 | 06:05:00 |
| B | 07:00:00 | 07:05:00 |
| B | 08:00:00 | 08:05:00 |
| B | 09:00:00 | 09:05:00 |
| B | 10:00:00 | 10:05:00 |
| B | 11:00:00 | 11:05:00 |
| B | 12:00:00 | 12:05:00 |
| B | 13:00:00 | 13:05:00 |
| B | 14:00:00 | 14:05:00 |
| B | 15:00:00 | 15:05:00 |
| B | 16:00:00 | 16:05:00 |
| B | 17:00:00 | 17:05:00 |
| B | 18:00:00 | 18:05:00 |
| B | 19:00:00 | 19:05:00 |
| B | 20:00:00 | 20:05:00 |
| B | 21:00:00 | 21:05:00 |
| B | 22:00:00 | 22:05:00 |
| B | 23:00:00 | 23:05:00 |
| C | 00:00:00 | 00:15:00 |
| C | 04:00:00 | 04:15:00 |
| C | 08:00:00 | 08:15:00 |
| C | 12:00:00 | 12:15:00 |
| C | 16:00:00 | 16:15:00 |
| C | 20:00:00 | 20:15:00 |
+------+------------+----------+
I am hoping to be able to create a view out of this so I am trying to do it without cursors or other cpu intensive methods.
Any ideas?
Thanks,
Dan.
You could use a recursive cte like this
;WITH temp AS
(
SELECT t.Name, CAST(t.Start_Time AS time) AS CurrentStart_Time, dateadd(minute,t.Duration_Mins,CAST(t.Start_Time AS time)) AS CurrentEnd_Time, t.Frequency_Hours, CAST(t.End_Time AS time) AS End_Time
FROM @Tmp t
UNION ALL
SELECT t.Name, dateadd(hour,t.Frequency_Hours,t.CurrentStart_Time), dateadd(hour,t.Frequency_Hours,t.CurrentEnd_Time), t.Frequency_Hours, t.End_Time
FROM temp t
WHERE t.CurrentStart_Time < t.End_Time AND t.CurrentStart_Time < dateadd(hour,t.Frequency_Hours,t.CurrentStart_Time)
)
SELECT t.Name, t.CurrentStart_Time, t.CurrentEnd_Time
FROM temp t
ORDER BY t.Name
OPTION (MAXRECURSION 0)
Demo link: http://rextester.com/XJK25805
It can also be done without a recursive CTE.
The example below takes its numbers from master..spt_values (hence the SELECT DISTINCT). If you create a dedicated numbers table instead, performance will be far better.
For example, the numbers table can be populated from 1 to 100.
Try this with various sample data:
declare @t table(Name varchar(20), Start_Time time(0),End_Time time(0)
, Frequency_Hours int,Duration_Mins int)
insert into @t VALUES
('A','08:00:00','18:00:00', 2 , 2 )
,('B','00:00:00','23:59:59', 1 , 5 )
,('C','00:00:00','23:59:59', 4 ,15 )
SELECT NAME
,dateadd(hour, n, Start_Time) Start_Time
,dateadd(minute, Duration_Mins, (dateadd(hour, n, Start_Time))) End_Time
FROM @t t
CROSS APPLY (
SELECT DISTINCT number * Frequency_Hours n
FROM master..spt_values
WHERE number >= 0
AND number <= datediff(HOUR, t.Start_Time, t.End_Time) / Frequency_Hours
) ca
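The numbers-table suggestion above could be sketched with an inline list of numbers rather than master..spt_values. This is an illustration under assumptions, not the answerer's code: the Numbers CTE is a hypothetical name, and 0 to 23 is assumed to be enough because all slots fall within a single day.

```sql
DECLARE @t TABLE (Name varchar(20), Start_Time time(0), End_Time time(0),
                  Frequency_Hours int, Duration_Mins int);
INSERT INTO @t VALUES
 ('A','08:00:00','18:00:00', 2,  2)
,('B','00:00:00','23:59:59', 1,  5)
,('C','00:00:00','23:59:59', 4, 15);

-- Numbers is a hypothetical inline numbers list (0..23); a permanent
-- numbers table would serve the same purpose.
WITH Numbers AS (
    SELECT v.n
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),
                 (12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23)) AS v(n)
)
SELECT t.Name,
       DATEADD(HOUR, x.n * t.Frequency_Hours, t.Start_Time) AS Start_Time,
       DATEADD(MINUTE, t.Duration_Mins,
               DATEADD(HOUR, x.n * t.Frequency_Hours, t.Start_Time)) AS End_Time
FROM @t t
JOIN Numbers x
  -- keep only as many slots as fit between Start_Time and End_Time
  ON x.n <= DATEDIFF(HOUR, t.Start_Time, t.End_Time) / t.Frequency_Hours
ORDER BY t.Name, x.n;
```

Since the numbers are unique, no DISTINCT is needed, and the query contains no recursion, so it can be wrapped in a view once @t is replaced by a real table.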

Group Non-Contiguous Dates By Criteria In Column

I have a table with start and end dates for team consultations with customers.
I need to merge certain consultations based on a number of days specified in another column (sometimes the consultations may overlap, sometimes they are contiguous, sometimes they aren't), Team and Type.
Some example data is as follows:
DECLARE @TempTable TABLE([CUSTOMER_ID] INT
,[TEAM] VARCHAR(1)
,[TYPE] VARCHAR(1)
,[START_DATE] DATETIME
,[END_DATE] DATETIME
,[GROUP_DAYS_CRITERIA] INT)
INSERT INTO @TempTable VALUES (1,'A','A','2013-08-07','2013-12-31',28)
,(2,'B','A','2015-05-15','2015-05-28',28)
,(2,'B','A','2015-05-15','2016-05-12',28)
,(2,'B','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2013-05-27','2014-07-23',28)
,(3,'C','A','2015-01-12','2015-05-28',28)
,(3,'B','A','2015-01-12','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-12-17',28)
,(4,'A','B','2013-07-09','2014-04-21',7)
,(4,'A','B','2014-04-29','2014-08-01',7)
Which looks like this:
+-------------+------+------+------------+------------+---------------------+
| CUSTOMER_ID | TEAM | TYPE | START_DATE | END_DATE | GROUP_DAYS_CRITERIA |
+-------------+------+------+------------+------------+---------------------+
| 1 | A | A | 07/08/2013 | 31/12/2013 | 28 |
| 2 | B | A | 15/05/2015 | 28/05/2015 | 28 |
| 2 | B | A | 15/05/2015 | 12/05/2016 | 28 |
| 2 | B | A | 28/05/2015 | 28/05/2015 | 28 |
| 3 | C | A | 27/05/2013 | 23/07/2014 | 28 |
| 3 | C | A | 12/01/2015 | 28/05/2015 | 28 |
| 3 | B | A | 12/01/2015 | 28/05/2015 | 28 |
| 3 | C | A | 28/05/2015 | 28/05/2015 | 28 |
| 3 | C | A | 28/05/2015 | 17/12/2015 | 28 |
| 4 | A | B | 09/07/2013 | 21/04/2014 | 7 |
| 4 | A | B | 29/04/2014 | 01/08/2014 | 7 |
+-------------+------+------+------------+------------+---------------------+
My desired output is as follows:
+-------------+------+------+------------+------------+---------------------+
| CUSTOMER_ID | TEAM | TYPE | START_DATE | END_DATE | GROUP_DAYS_CRITERIA |
+-------------+------+------+------------+------------+---------------------+
| 1 | A | A | 07/08/2013 | 31/12/2013 | 28 |
| 2 | B | A | 15/05/2015 | 12/05/2016 | 28 |
| 3 | C | A | 27/05/2013 | 23/07/2014 | 28 |
| 3 | C | A | 12/01/2015 | 17/12/2015 | 28 |
| 3 | B | A | 12/01/2015 | 28/05/2015 | 28 |
| 4 | A | B | 09/07/2013 | 21/04/2014 | 7 |
| 4 | A | B | 29/04/2014 | 01/08/2014 | 7 |
+-------------+------+------+------------+------------+---------------------+
I am struggling to do this at all, let alone with any efficiency! Any ideas / code will be greatly received.
Server version is MS SQL Server 2014
Thanks,
Dan
If I am understanding your question correctly, we want to return rows only when a second, third, etc consultation has not occurred within group_days_criteria number of days after the previous consultation end date.
We can get the previous consultation end date and eliminate rows (since we are not concerned with the number of consultations) where a consultation occurred for the same customer by the same team and of the same consultation type within our date range.
DECLARE @TempTable TABLE([CUSTOMER_ID] INT
,[TEAM] VARCHAR(1)
,[TYPE] VARCHAR(1)
,[START_DATE] DATETIME
,[END_DATE] DATETIME
,[GROUP_DAYS_CRITERIA] INT)
INSERT INTO @TempTable VALUES (1,'A','A','2013-08-07','2013-12-31',28)
,(2,'B','A','2015-05-15','2015-05-28',28)
,(2,'B','A','2015-05-15','2016-05-12',28)
,(2,'B','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2013-05-27','2014-07-23',28)
,(3,'C','A','2015-01-12','2015-05-28',28)
,(3,'B','A','2015-01-12','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-12-17',28)
,(4,'A','B','2013-07-09','2014-04-21',7)
,(4,'A','B','2014-04-29','2014-08-01',7)
;with prep as (
select Customer_ID,
Team,
[Type],
[Start_Date],
[End_Date],
Group_Days_Criteria,
ROW_NUMBER() over (partition by customer_id, team, [type] order by [start_date] asc, [end_date] desc) as rn, -- earliest start date with latest end date
lag([End_Date] + Group_Days_Criteria, 1, 0) over (partition by customer_id, team, [type] order by [start_date] asc, [end_date] desc) as PreviousEndDate -- previous end date + Group_Days_Criteria days
from @TempTable
)
select p.Customer_Id,
p.[Team],
p.[Type],
p.[Start_Date],
p.[End_Date],
p.Group_Days_Criteria
from prep p
where p.rn = 1
or (p.rn != 1 and p.[Start_date] > p.PreviousEndDate)
order by p.Customer_Id, p.[Team], p.[Start_Date], p.[Type]
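This returns one row per merged group for the sample data. Note that each surviving row keeps its own End_Date rather than the latest End_Date of the rows it absorbed, so for customer 3, team C the second row ends on 28/05/2015 instead of the 17/12/2015 shown in the desired output.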
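If the merged row should also carry the latest END_DATE of its group, a gaps-and-islands variant is one option. The following is a sketch under assumptions rather than a tuned solution: the CTE names flagged and grouped and the columns is_new_group and grp are hypothetical, and a row is treated as part of the current group when its START_DATE falls within GROUP_DAYS_CRITERIA days of the latest END_DATE seen so far for the same customer, team and type.

```sql
DECLARE @TempTable TABLE([CUSTOMER_ID] INT, [TEAM] VARCHAR(1), [TYPE] VARCHAR(1),
                         [START_DATE] DATETIME, [END_DATE] DATETIME,
                         [GROUP_DAYS_CRITERIA] INT);
INSERT INTO @TempTable VALUES
 (1,'A','A','2013-08-07','2013-12-31',28)
,(2,'B','A','2015-05-15','2015-05-28',28)
,(2,'B','A','2015-05-15','2016-05-12',28)
,(2,'B','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2013-05-27','2014-07-23',28)
,(3,'C','A','2015-01-12','2015-05-28',28)
,(3,'B','A','2015-01-12','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-05-28',28)
,(3,'C','A','2015-05-28','2015-12-17',28)
,(4,'A','B','2013-07-09','2014-04-21',7)
,(4,'A','B','2014-04-29','2014-08-01',7);

WITH flagged AS (
    SELECT *,
           -- 1 when the row starts a new group, 0 when it continues the current one
           CASE WHEN [START_DATE] <= DATEADD(DAY, GROUP_DAYS_CRITERIA,
                         MAX([END_DATE]) OVER (PARTITION BY CUSTOMER_ID, TEAM, [TYPE]
                                               ORDER BY [START_DATE], [END_DATE]
                                               ROWS BETWEEN UNBOUNDED PRECEDING
                                                        AND 1 PRECEDING))
                THEN 0 ELSE 1 END AS is_new_group
    FROM @TempTable
), grouped AS (
    SELECT *,
           -- running total of the flags numbers the groups within each partition
           SUM(is_new_group) OVER (PARTITION BY CUSTOMER_ID, TEAM, [TYPE]
                                   ORDER BY [START_DATE], [END_DATE]
                                   ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
SELECT CUSTOMER_ID, TEAM, [TYPE],
       MIN([START_DATE]) AS [START_DATE],
       MAX([END_DATE])   AS [END_DATE],
       MIN(GROUP_DAYS_CRITERIA) AS GROUP_DAYS_CRITERIA
FROM grouped
GROUP BY CUSTOMER_ID, TEAM, [TYPE], grp
ORDER BY CUSTOMER_ID, TEAM, [START_DATE];
```

On the sample data this gives seven rows, with customer 3, team C merged into 2015-01-12 to 2015-12-17. The windowed MAX with a ROWS frame is available on SQL Server 2012 and later, so it fits the 2014 server mentioned in the question.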

Select a specific line if i have the same information

I have a table with data as below:
+--------+----------+-------+------------+--------------+
| month | code | type | date | PersonID |
+--------+----------+-------+------------+--------------+
| 201501 | 178954 | 3 | 2014-12-3 | 10 |
| 201501 | 178954 | 3 | 2014-12-3 | 10 |
| 201501 | 178955 | 2 | 2014-12-13 | 10 |
| 201501 | 178955 | 2 | 2014-12-13 | 10 |
| 201501 | 178956 | 2 | 2014-12-11 | 10 |
| 201501 | 178958 | 1 | 2014-12-10 | 10 |
| 201501 | 178959 | 2 | 2014-12-12 | 15 |
| 201501 | 178959 | 2 | 2014-12-12 | 15 |
| 201501 | 178954 | 1 | 2014-12-11 | 13 |
| 201501 | 178954 | 1 | 2014-12-11 | 13 |
+--------+----------+-------+------------+--------------+
The first six rows have the same PersonID in the same month. When a PersonID appears more than once in the same month, I want to select the row whose type is 2 with the most recent date. In my case the output would be as below:
+--------+--------+------+------------+----------+
| month | code | type| date | PersonID |
+--------+--------+------+------------+----------+
| 201501 | 178955 | 2 | 2014-12-13 | 10 |
| 201501 | 178959 | 2 | 2014-12-12 | 15 |
| 201501 | 178954 | 2 | 2014-12-11 | 13 |
+--------+--------+------+------------+----------+
Also, if there are duplicate rows, I don't want to display them.
Is there any solution to that?
Simply use GROUP BY:
https://msdn.microsoft.com/de-de/library/ms177673(v=sql.120).aspx
SELECT month, code, ... FROM tablename GROUP BY PersonID, date, ...
Note that you have to specify all selected columns in the GROUP BY.
SELECT DISTINCT A.month, A.code, A.type, B.date, B.PersonID FROM YourTable A
INNER JOIN (SELECT PersonID, MAX(date) as date FROM YourTable
GROUP BY PersonID) B
ON (A.PersonID = B.PersonID
AND A.date = B.date)
WHERE A.type = 2 ORDER BY B.date DESC, A.PersonID
Just in case you/others are still wondering.

Resources