How to UNPIVOT to normalize output SQL Server 2012

Is it possible to UNPIVOT data like this?
The column names CGL, CPL and EO should become values in a Coverage Type column, the values of CGL, CPL and EO should go into a Premium column, and the values of CGLTria, CPLTria and EOTria should go into a Tria Premium column.
Also, if the value of CGL, CPL or EO is 0, I don't need it in the output.
I am able to perform a simple UNPIVOT, but I get confused when I need to bring in more columns:
SELECT TOP 3
QuoteGUID, CoverageType, Premium
FROM
Align_EnvionmentalRating_PremiumHistory
UNPIVOT
(Premium FOR CoverageType IN (CGL, CPL, EO)) AS up
UPDATE: Adding more data
Columns: Policy Number, Policy Effective Date, Annual Statement Line, Fees, Risk State should stay the same.
Columns: CGL, CGLTria,CPL,CPLTria,EO,EOTria should be UNPIVOTED
select top 3 [Policy Number],
[Policy Effective Date],
'17.1' as [Annual Statement Line],
CGL,
CGLTria,
CPL,
CPLTria,
EO,
EOTria,
Fees,
[Risk State]
from #Test
UPDATE:
Adding consumable data:
create table dbo.TestDate (
PolicyNumber varchar(50),
PolicyEffectiveDate datetime,
AnnualStatementLine decimal(5,1),
CGL money,
CGLTria money,
CPL money,
CPLTria money,
EO money,
EOTria money,
Fees money,
RiskState varchar(2)
)
INSERT INTO dbo.TestDate (PolicyNumber, PolicyEffectiveDate , AnnualStatementLine, CGL , CGLTria , CPL ,CPLTria ,EO ,EOTria ,Fees ,RiskState )
values ('ENV560000001-00','2018-01-11 23:21:00',17.1,2000,160,674,54,341,0,250,'TX'),
('ENV560000002-00','2018-01-11 00:56:00',17.1,0,0,3238,259,0,0,250,'NV'),
('ENV560000003-00','2018-01-12 01:10:00',17.1,0,0,6045,484,0,0,250,'ND'),
('ENV560000004-00','2018-01-14 01:18:00',17.1,0,0,0,0,0,0,0,'ND')
select * from dbo.TestDate

The query below should work. It unpivots the premium columns and the Tria columns separately, then joins the two results on QuoteGUID and on the coverage name with the "Tria" suffix stripped.
Select A.QuoteGUID, A.CoverageType, A.Premium, B.TriaPremium
From
(select QuoteGUID, CoverageType, Premium
from Align_EnvionmentalRating_PremiumHistory
unpivot
(
Premium for CoverageType in ( CGL,CPL,EO)
) as up
) A
INNER JOIN
(select QuoteGUID, CoverageType, TriaPremium
from Align_EnvionmentalRating_PremiumHistory
unpivot
(
TriaPremium for CoverageType in ( CGLTria,CPLTria,EOTria)
) as up
) as B
ON A.QuoteGUID=B.QuoteGUID AND A.CoverageType=substring(B.CoverageType,1,Len(B.CoverageType)-4)

I'm not sure if this answers your question, but you could do this with simple UNION queries like this:
SELECT
ID = a1.ID,
CoverageType = 'CGL',
Premium = a1.CGL,
TriaPremium = a1.CGLTria
FROM
Align_EnvionmentalRating_PremiumHistory AS a1
WHERE
a1.CGL <> 0
UNION ALL
SELECT
ID = a2.ID,
CoverageType = 'CPL',
Premium = a2.CPL,
TriaPremium = a2.CPLTria
FROM
Align_EnvionmentalRating_PremiumHistory AS a2
WHERE
a2.CPL <> 0
UNION ALL
SELECT
ID = a3.ID,
CoverageType = 'EO',
Premium = a3.EO,
TriaPremium = a3.EOTria
FROM
Align_EnvionmentalRating_PremiumHistory AS a3
WHERE
a3.EO <> 0

This produces the exact output you state you want from the sample data provided. It could get kind of nasty with as many as 60 columns to normalize, but it should be a one-time thing to write. Having worked through this a bit, it seems you really need to separate this into at least two tables, but that is a whole other can of worms.
with NormalizedData
(
PolicyNumber
, CoverageType
, Premium
, TriaPremium
) as
(
SELECT PolicyNumber
, 'CGL'
, CGL
, CGLTria
FROM TestDate
UNION ALL
SELECT PolicyNumber
, 'CPL'
, CPL
, CPLTria
FROM TestDate
UNION ALL
SELECT PolicyNumber
, 'EO'
, EO
, EOTria
FROM TestDate
)
select td.PolicyNumber
, td.PolicyEffectiveDate
, td.AnnualStatementLine
, nd.CoverageType
, nd.Premium
, nd.TriaPremium
, td.RiskState
from TestDate td
join NormalizedData nd on nd.PolicyNumber = td.PolicyNumber
order by td.PolicyNumber
, nd.CoverageType

I used the CROSS APPLY operator to unpivot the data, then included the unpivoted result in an INNER JOIN. That gave me the desired outcome.
declare @TestDate table (
QuoteGUID varchar(8000),
CGL money,
CGLTria money,
CPL money,
CPLTria money,
EO money,
EOTria money
)
INSERT INTO @TestDate (QuoteGUID, CGL , CGLTria , CPL ,CPLTria ,EO ,EOTria )
values ('2D62B895-92B7-4A76-86AF-00138C5C8540',2000,160,674,54,341,0),
('BE7F9483-174F-4238-8931-00D09F99F398',0,0,3238,259,0,0),
('BECFB9D8-D668-4C06-9971-0108A15E1EC2',0,0,0,0,0,0)
select A.QuoteGUID
,B.*
From @TestDate A
Cross Apply ( values ('CGL',CGL,CGLTria)
,('CPL',CPL,CPLTria)
,('EO',EO,EOTria)
) B (CoverageType,Premium,TriaPremium)
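The question also asks to leave out coverages whose premium is 0. A minimal sketch of how the CROSS APPLY (VALUES ...) form could handle that is shown below. The table variable @TestData and its trimmed column list are assumptions made for illustration; the rows are copied from the question's dbo.TestDate sample.

```sql
-- Hypothetical table variable holding a subset of the question's sample rows.
DECLARE @TestData TABLE (
    PolicyNumber varchar(50),
    CGL money, CGLTria money,
    CPL money, CPLTria money,
    EO  money, EOTria  money
);

INSERT INTO @TestData (PolicyNumber, CGL, CGLTria, CPL, CPLTria, EO, EOTria)
VALUES ('ENV560000001-00', 2000, 160,  674,  54, 341, 0),
       ('ENV560000002-00',    0,   0, 3238, 259,   0, 0),
       ('ENV560000004-00',    0,   0,    0,   0,   0, 0);

-- One output row per (policy, coverage); each VALUES row pairs a
-- premium column with its matching Tria column.
SELECT t.PolicyNumber, v.CoverageType, v.Premium, v.TriaPremium
FROM @TestData AS t
CROSS APPLY (VALUES ('CGL', t.CGL, t.CGLTria),
                    ('CPL', t.CPL, t.CPLTria),
                    ('EO',  t.EO,  t.EOTria)) AS v (CoverageType, Premium, TriaPremium)
WHERE v.Premium <> 0          -- drop coverages with no premium
ORDER BY t.PolicyNumber, v.CoverageType;
```

With these rows, the first policy should yield three rows (CGL, CPL, EO), the second only CPL, and the all-zero policy none.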

Related

How can I use calculated alias in Order by clause?

The table is simple:
ID start_date end_date
1 2015-10-01 2015-10-02
2 2015-10-02 2015-10-03
3 2015-10-05 2015-10-06
4 2015-10-07 2015-10-08
IDs 1 and 2 belong to one project because the end_date of one equals the start_date of the next; IDs 3 and 4 are separate projects.
Here is the query to find the projects and sort them by the time they take:
select P1.Start_Date, (
select min(P.End_Date)
from Projects as P
where P.End_Date not in (select Start_Date from Projects )
and P.End_Date > P1.Start_Date
) as ED
from Projects as P1
where P1.Start_Date not in (select End_Date from Projects )
order by datediff(day, P1.Start_Date, ED)
The problem is that ED is invalid in the order by clause when used inside datediff, but it is valid when used on its own:
order by ED
Is datediff calculated after the select clause? Could anyone explain? Thanks.

You can simply use CROSS APPLY to calculate this column like this:
DECLARE @Projects TABLE
(
[ID] SMALLINT
,[start_date] DATETIME
,[end_date] DATETIME
);
INSERT INTO @Projects ([ID], [start_date], [end_date])
VALUES (1, '2015-10-01', '2015-10-02')
,(2, '2015-10-02', '2015-10-03')
,(3, '2015-10-05', '2015-10-06')
,(4, '2015-10-07', '2015-10-08');
select P1.Start_Date, ED
from @Projects as P1
CROSS APPLY
(
select min(P.End_Date)
from @Projects as P
where P.End_Date not in (select Start_Date from @Projects )
and P.End_Date > P1.Start_Date
) DS(ED)
where P1.Start_Date not in (select End_Date from @Projects )
order by datediff(day, P1.Start_Date, ED);
SQL Server accepts a select-list alias in ORDER BY only when the alias stands on its own; it cannot be referenced inside an expression such as datediff(...). That is why order by ED works and order by datediff(day, P1.Start_Date, ED) does not. If you replace ED inside datediff with the sub-query itself, it will work. Be careful with column ordinals here: the following runs without error, but it is bad practice and it does not sort by duration:
select P1.Start_Date, (
select min(P.End_Date)
from @Projects as P
where P.End_Date not in (select Start_Date from @Projects )
and P.End_Date > P1.Start_Date
) as ED
from @Projects as P1
where P1.Start_Date not in (select End_Date from @Projects )
order by datediff(day, P1.Start_Date, 2)
Inside datediff the 2 is not the position of the second column; it is an integer literal that is implicitly converted to a datetime (1900-01-03). An ordinal refers to a select-list column only when it appears by itself in ORDER BY. So there is nothing wrong with the logic of your query; ED just has to come from the FROM clause, as with CROSS APPLY, before it can be used inside an ORDER BY expression.
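Another way to make the alias usable in an ORDER BY expression is to wrap the original query in a derived table, so that ED becomes an ordinary column of the outer query. The sketch below assumes the sample rows from the question, loaded into a table variable named @Projects; the derived-table alias x and the tie-breaking sort on start_date are illustrative choices.

```sql
DECLARE @Projects TABLE (ID smallint, start_date date, end_date date);

INSERT INTO @Projects (ID, start_date, end_date)
VALUES (1, '2015-10-01', '2015-10-02'),
       (2, '2015-10-02', '2015-10-03'),
       (3, '2015-10-05', '2015-10-06'),
       (4, '2015-10-07', '2015-10-08');

SELECT x.start_date, x.ED
FROM (
    -- The original query, unchanged apart from the missing ORDER BY.
    SELECT P1.start_date,
           (SELECT MIN(P.end_date)
            FROM @Projects AS P
            WHERE P.end_date NOT IN (SELECT start_date FROM @Projects)
              AND P.end_date > P1.start_date) AS ED
    FROM @Projects AS P1
    WHERE P1.start_date NOT IN (SELECT end_date FROM @Projects)
) AS x
-- ED is now a real column of x, so it can appear inside an expression.
ORDER BY DATEDIFF(day, x.start_date, x.ED), x.start_date;
```

For the sample rows this should list the two one-day projects (starting 2015-10-05 and 2015-10-07) before the two-day project that starts on 2015-10-01.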

SQL Server + retrieve the data without overlapping datetimes

I need some thoughts on the best implementation for this case.
I have data where there can be multiple rows with start and end dates, and I need to pull the rows without overlapping dates. Below is the sample data.
CREATE TABLE table2 (
start_date DATE NOT NULL,
end_date DATE NOT NULL,
comments VARCHAR(100) NULL ,
id int
);
INSERT INTO table2 (start_date, end_date, id) VALUES
('2011-12-01', '2012-01-02', 5),
('2012-01-01', '2012-01-06', 5),
('2012-01-05', '2012-01-10', 5),
('2012-01-09', '2012-01-11', 5);
From this I need the rows that do not overlap, for each id:
('2011-12-01', '2012-01-02', 5),
('2012-01-05', '2012-01-10', 5)
Please share your thoughts on the best way to implement this.
Thanks for the support,
Manoj.

The output you provide is very unclear. At first sight you are looking for intervals in which no other interval starts (which would lead to a continued interval). But your second expected row, which ends on 2012-01-10, still overlaps the row that starts on 2012-01-09?
The following query returns a row only if no other row's end_date falls within its interval. It does not return your two expected rows, though, just the first.
SELECT * FROM table2 AS t
WHERE NOT EXISTS(SELECT 1
FROM table2 AS x
WHERE x.start_date<>t.start_date
AND x.end_date BETWEEN t.start_date AND t.end_date
);
I hope this points you the right direction...

The following will do it:
WITH cte
AS
(
SELECT
[start_date]
, end_date
, comments
, id
FROM
(
SELECT
[start_date]
, end_date
, comments
, id
, ROW_NUMBER() OVER (PARTITION BY id ORDER BY [start_date]) R
FROM table2
) Q
WHERE R = 1
UNION ALL
SELECT
[start_date]
, end_date
, comments
, id
FROM
(
SELECT
T.[start_date]
, T.end_date
, T.comments
, T.id
, ROW_NUMBER() OVER (PARTITION BY T.id ORDER BY T.[start_date]) R
FROM
cte C
JOIN table2 T ON
C.id = T.id
AND T.[start_date] > C.end_date
) Q
WHERE R = 1
)
SELECT
[start_date]
, end_date
, comments
, id
FROM cte

Get list of dates without entries in SQL Server

I am looking to find a solution to this problem. I have a table called LogEntry that stores information used by multiple offices, where they have to log any visitors that come in to their office on any given day. If no visitors come in, they are still required to log "No Visitors" for the day. How do I run a query that pulls all dates where an office failed to create even a "No Visitors" log?
I've looked at this question (and the article linked within), but even adapting that query, I'm only able to create a blank row for a date where an office is missing an entry for a date, not specify the actual office that did not create an entry. Is there a way to do what I'm trying to do?
create table #temp (
CDate datetime,
loc_id varchar(50)
)
insert into #temp SELECT DISTINCT entryDate, locationID FROM LogEntry WHERE entryDate >= '05/01/2017' AND entryDate <= '07-31-2017'
;with d(date) as (
select cast('05/01/2017' as datetime)
union all
select date+1
from d
where date < '07/31/2017'
)
select DISTINCT t.loc_id, CONVERT(date, d.date)
FROM d LEFT OUTER JOIN #temp t ON d.date = t.CDate
GROUP BY t.loc_id, d.date
ORDER BY t.loc_id
As I said, this query returns me a list of dates in the date range, and all locations that submitted entries on that date, but I'd like to find a way to extract essentially the opposite information: if an office (specified by locationID) did not submit an entry on a given day, return only those locationIDs and the dates that they missed.
Sample data
EntryID | locationID | entryDate
=================================
1 1 07-01-2017
2 1 07-02-2017
3 2 07-02-2017
4 1 07-04-2017
Expected Result (for date range of 07-01 to 07-04)
locationID | missedEntryDate
============================
1 07-03-2017
2 07-01-2017
2 07-03-2017
2 07-04-2017

Your first step was good: you create a list of all dates, but you also need a list of all locations. Then you cross join the two to get all combinations, and finally you left join to find out what is missing.
;with allDates(date) as (
select cast('05/01/2017' as datetime)
union all
select date+1
from allDates
where date < '07/31/2017'
), allLocations as (
SELECT DISTINCT loc_id
FROM #temp
), allCombinations as (
SELECT date, loc_id
FROM allDates
CROSS JOIN allLocations
)
SELECT AC.loc_id, AC.date
FROM allCombinations AC
LEFT JOIN #temp t
ON AC.date = t.CDate
AND AC.loc_id = t.loc_id
WHERE t.loc_id IS NULL -- no match found in #temp
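To check the idea against the question's four sample rows, here is a self-contained sketch of the same "all dates x all locations, minus existing entries" pattern. The table variable @LogEntry stands in for the real LogEntry table, and the use of NOT EXISTS in place of LEFT JOIN ... IS NULL is a stylistic assumption; both express the same anti-join.

```sql
-- Stand-in for LogEntry, filled with the sample data from the question.
DECLARE @LogEntry TABLE (EntryID int, locationID int, entryDate date);

INSERT INTO @LogEntry (EntryID, locationID, entryDate)
VALUES (1, 1, '2017-07-01'),
       (2, 1, '2017-07-02'),
       (3, 2, '2017-07-02'),
       (4, 1, '2017-07-04');

WITH allDates (d) AS (
    -- Recursive calendar covering the requested range.
    SELECT CAST('2017-07-01' AS date)
    UNION ALL
    SELECT DATEADD(day, 1, d) FROM allDates WHERE d < '2017-07-04'
), allLocations AS (
    SELECT DISTINCT locationID FROM @LogEntry
)
SELECT L.locationID, D.d AS missedEntryDate
FROM allDates AS D
CROSS JOIN allLocations AS L          -- every date for every office
WHERE NOT EXISTS (SELECT 1
                  FROM @LogEntry AS E
                  WHERE E.locationID = L.locationID
                    AND E.entryDate = D.d)
ORDER BY L.locationID, D.d;
```

This should return the four rows listed in the question's expected result. A location that never logged anything in the range will not appear, because the list of locations is derived from the log itself; a separate offices table would be needed to cover that case.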

If your dataset is not too large you can try this (it reuses the d CTE from your query):
select Locs.loc_id, CONVERT(date, d.date) AS missedEntryDate
FROM d
-- Cross join dates to all available locs
CROSS JOIN (SELECT DISTINCT loc_id FROM #temp ) AS Locs
LEFT JOIN
( SELECT loc_id, CDate
FROM #temp
GROUP BY loc_id, CDate ) AS t ON d.date = t.CDate AND Locs.loc_id = t.loc_id
-- Keep only the date/location pairs that have no entry
WHERE t.loc_id IS NULL
ORDER BY Locs.loc_id
This should be a bit faster:
;WITH cte AS (
SELECT a.LocID, RangeStart.CDate, ( CASE WHEN Input.LocID IS NULL THEN 1 ELSE 0 END ) AS IsMissing
FROM ( SELECT DISTINCT LocID FROM #temp ) AS a
CROSS JOIN ( SELECT CONVERT( DATETIME, '2017-05-01' ) AS CDate ) AS RangeStart
LEFT JOIN
( SELECT LocID, MIN( CDate ) AS CDate
FROM #temp
WHERE CDate = '2017-05-01'
GROUP BY LocID ) AS Input ON a.LocID = Input.LocID AND RangeStart.CDate = Input.CDate
UNION ALL
SELECT a.LocID, a.CDate + 1 AS CDate,
-- No matching entry for the next day means that day is missing
( CASE WHEN c.ItExists IS NULL THEN 1 ELSE 0 END ) AS IsMissing
FROM cte AS a
OUTER APPLY( SELECT LocID, 1 AS ItExists FROM #temp AS b WHERE a.LocID = b.LocID AND a.CDate + 1 = b.CDate ) AS c
WHERE a.CDate < '2017-07-31' -- end of the requested range
)
SELECT LocID, CDate FROM cte WHERE IsMissing = 1 OPTION( MAXRECURSION 0 )
You can also add an index:
CREATE INDEX IX_tmp_LocID_CDate ON #temp( LocID, CDate )
Sample data set for the second query:
CREATE TABLE #temp( LocID VARCHAR( 50 ), CDate DATETIME )
INSERT INTO #temp
VALUES
( '1', '2017-05-01' ), ( '1', '2017-05-02' ), ( '1', '2017-05-03' ), ( '1', '2017-05-04' ), ( '1', '2017-05-05' ),
( '2', '2017-05-01' ), ( '2', '2017-05-02' ), ( '2', '2017-05-03' ), ( '2', '2017-05-04' ), ( '2', '2017-05-05' )
;WITH d AS (
SELECT CAST( '05/01/2017' AS DATETIME ) AS date
UNION ALL
SELECT date + 2
FROM d
WHERE date < '2018-07-31'
)
INSERT INTO #temp
SELECT LocID, d.date
FROM ( SELECT DISTINCT LocID FROM #temp ) AS a
CROSS JOIN d
OPTION( MAXRECURSION 0 )

Multi-statement Scalar to Multi-statement TVF

I have a question about these few lines of code, particularly how this @workTable is being populated with the StartingCost and EndingCost values:
DECLARE @workTable TABLE
(
ProductId INT ,
StartingCost MONEY ,
EndingCost MONEY
) ;
The full query is listed below.
IF OBJECT_ID(N'Production.ms_tvf_ProductCostDifference',N'TF' ) IS NOT NULL
--SELECT * FROM sys.objects WHERE name LIKE 'm%'
DROP FUNCTION Production.ms_tvf_ProductCostDifference ;
GO
CREATE FUNCTION Production.ms_tvf_ProductCostDifference
(
@StartDate DATETIME ,
@EndDate DATETIME
)
RETURNS @retCostDifference TABLE
(
ProductId INT ,
CostDifference MONEY
)
AS
BEGIN
DECLARE @workTable TABLE
(
ProductId INT ,
StartingCost MONEY ,
EndingCost MONEY
) ;
INSERT INTO @retCostDifference
( ProductId ,
CostDifference
)
SELECT ProductID ,
StandardCost
FROM ( SELECT pch.ProductID ,
pch.StandardCost ,
ROW_NUMBER() OVER
( PARTITION BY ProductID
ORDER BY StartDate DESC ) AS rn
FROM Production.ProductCostHistory AS pch
WHERE EndDate BETWEEN
@StartDate AND @EndDate
) AS x
WHERE x.rn = 1 ;
UPDATE @retCostDifference
SET CostDifference = CostDifference - StandardCost
FROM @retCostDifference cd
JOIN ( SELECT ProductID ,
StandardCost
FROM ( SELECT pch.ProductID ,
pch.StandardCost ,
ROW_NUMBER() OVER
( PARTITION BY ProductID
ORDER BY StartDate ASC )
AS rn
FROM Production.ProductCostHistory
AS pch
WHERE EndDate BETWEEN
@StartDate AND @EndDate
) AS x
WHERE x.rn = 1
) AS y ON cd.ProductId = y.ProductID ;
RETURN ;
END
GO
/********************************************************************************
The code above represents
Listing 17: A multi-statement TVF
This TVF, instead of retrieving a single row from the database and calculating
the price difference, pulls back all rows from the database and calculates the
price difference for all rows at once.
*********************************************************************************/
SELECT p.ProductID ,
p.Name ,
p.ProductNumber ,
pcd.CostDifference
FROM Production.Product AS p
INNER JOIN Production.ms_tvf_ProductCostDifference
('2001-01-01', GETDATE()) AS pcd
ON p.ProductID = pcd.ProductID ;

In your code, the block below is only declared; it is never used.
DECLARE @workTable TABLE
(
ProductId INT ,
StartingCost MONEY ,
EndingCost MONEY
) ;
@retCostDifference is the return table of this function, and all the data modifications (INSERT, UPDATE) are made against @retCostDifference only.
@workTable is not used anywhere in the function, so it is never populated.

t-sql test data warehouse type 2 changes

I need to look at a data warehouse and check that a type 2 change works correctly.
I need to check that the valid-to date on a row is the same as the valid-from date on the next row.
This check is to make sure that a row that has been ended has also been started again correctly.
Thanks, Marc

The following relates to a Kimball type-2 dimension table.
Note that this assumes
3000-01-01 as a date in the far future for the current entries.
CustomerKey is an auto-incrementing integer.
This example should give you the list of rows with missing or mismatched next entries.
;
with
q_00 as (
select
CustomerKey
, CustomerBusinessKey
, rw_ValidFrom
, rw_ValidTo
, row_number() over (partition by CustomerBusinessKey order by CustomerKey asc) as rn
from dimCustomer
)
select
a.CustomerKey
, a.CustomerBusinessKey
, a.rw_ValidFrom
, a.rw_ValidTo
, b.CustomerKey as b_key
, b.CustomerBusinessKey as b_bus_key
, b.rw_ValidFrom as b_ValidFrom
, b.rw_ValidTo as b_ValidTo
from q_00 as a
left join q_00 as b on b.CustomerBusinessKey = a.CustomerBusinessKey and (b.rn = a.rn + 1)
where a.rw_ValidTo < '3000-01-01'
and (b.CustomerKey is null or a.rw_ValidTo != b.rw_ValidFrom) ;
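On SQL Server 2012 or later, the same check can be sketched with LEAD instead of a self-join. The table variable @dimCustomer and its five rows are invented for illustration; only the column names follow the query above.

```sql
-- Hypothetical miniature of dimCustomer.
DECLARE @dimCustomer TABLE (
    CustomerKey int,
    CustomerBusinessKey varchar(20),
    rw_ValidFrom date,
    rw_ValidTo date
);

INSERT INTO @dimCustomer (CustomerKey, CustomerBusinessKey, rw_ValidFrom, rw_ValidTo)
VALUES (1, 'C-1', '2015-01-01', '2016-01-01'),
       (2, 'C-1', '2016-01-01', '3000-01-01'),  -- correct hand-over
       (3, 'C-2', '2015-01-01', '2016-06-01'),
       (4, 'C-2', '2016-07-01', '3000-01-01'),  -- gap after key 3
       (5, 'C-3', '2015-01-01', '2016-01-01');  -- ended, never restarted

WITH q AS (
    SELECT CustomerKey, CustomerBusinessKey, rw_ValidFrom, rw_ValidTo,
           -- valid-from of the next version of the same customer, if any
           LEAD(rw_ValidFrom) OVER (PARTITION BY CustomerBusinessKey
                                    ORDER BY CustomerKey) AS next_ValidFrom
    FROM @dimCustomer
)
SELECT CustomerKey, CustomerBusinessKey, rw_ValidFrom, rw_ValidTo, next_ValidFrom
FROM q
WHERE rw_ValidTo < '3000-01-01'
  AND (next_ValidFrom IS NULL OR next_ValidFrom <> rw_ValidTo)
ORDER BY CustomerBusinessKey, CustomerKey;
```

With these rows the query should flag CustomerKey 3 (next row starts a month late) and CustomerKey 5 (no next row at all).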
Also useful
-- Make sure there are no nulls
-- for rw_ValidFrom, rw_ValidTo
select
CustomerKey
, rw_ValidFrom
, rw_ValidTo
from dimCustomer
where rw_ValidFrom is null
or rw_ValidTo is null ;
-- make sure there are no duplicates in rw_ValidFrom
-- for the same customer
select
CustomerBusinessKey
, rw_ValidFrom
, count(1) as cnt
from dimCustomer
group by CustomerBusinessKey, rw_ValidFrom
having count(1) > 1 ;
-- make sure there are no duplicates in rw_ValidTo
-- for the same customer
select
CustomerBusinessKey
, rw_ValidTo
, count(1) as cnt
from dimCustomer
group by CustomerBusinessKey, rw_ValidTo
having count(1) > 1 ;
