Retrieve line from ValidationDate Column - sql-server

I am having difficulty writing a SQL script.
I have a table like this:
And I want to have a result like this:
I tried the MIN and MAX functions, but that didn't work.
Do you have any idea?
Thank you for your help

MIN() and MAX() do appear to get you what you want. FYI, I have converted your dates to yyyy-MM-dd format.
IF OBJECT_ID('tempdb..#YourTable','U') IS NOT NULL DROP TABLE #YourTable; --SELECT * FROM #YourTable
CREATE TABLE #YourTable (
Business_Key int NOT NULL,
[Name] varchar(10) NOT NULL,
[Attribute] varchar(10) NOT NULL,
ValidFrom date NOT NULL,
ValidTo date NOT NULL,
Primary_Key int NOT NULL,
);
INSERT INTO #YourTable (Business_Key, [Name], Attribute, ValidFrom, ValidTo, Primary_Key)
VALUES (1, 'Toto', 'Child', '2020-01-01', '2020-01-03', 1)
, (1, 'Toto', 'Child', '2020-01-03', '2020-01-10', 2)
, (1, 'Toto', 'Man' , '2020-01-10', '2020-01-15', 3)
, (2, 'Tata', 'Woman', '2020-01-01', '2020-01-15', 4)
, (3, 'Titi', 'Man' , '2020-01-01', '2020-01-15', 5)
, (3, 'Titi', 'Man' , '2020-01-05', '2020-01-17', 6)
SELECT Business_Key
, [Name]
, [Attribute]
, ValidFrom = MIN(ValidFrom)
, ValidTo = MAX(ValidTo)
, Primary_Key = MAX(Primary_Key)
FROM #YourTable yt
GROUP BY Business_Key, [Name], [Attribute]
Returns:
| Business_Key | Name | Attribute | ValidFrom | ValidTo | Primary_Key |
|--------------|------|-----------|------------|------------|-------------|
| 1 | Toto | Child | 2020-01-01 | 2020-01-10 | 2 |
| 1 | Toto | Man | 2020-01-10 | 2020-01-15 | 3 |
| 2 | Tata | Woman | 2020-01-01 | 2020-01-15 | 4 |
| 3 | Titi | Man | 2020-01-01 | 2020-01-17 | 6 |
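One caveat with a plain GROUP BY: if an attribute can recur for the same key (for example Child, then Man, then Child again), every row with that attribute collapses into a single range. Below is a minimal gaps-and-islands sketch, assuming rows should be merged only when they are consecutive for the same Business_Key. The table variable @Periods and its fourth row are hypothetical and exist only to illustrate that case.

```sql
-- Hypothetical sample: same columns as #YourTable, plus a recurring attribute.
DECLARE @Periods TABLE (
    Business_Key int NOT NULL,
    [Name] varchar(10) NOT NULL,
    [Attribute] varchar(10) NOT NULL,
    ValidFrom date NOT NULL,
    ValidTo date NOT NULL,
    Primary_Key int NOT NULL
);
INSERT INTO @Periods (Business_Key, [Name], [Attribute], ValidFrom, ValidTo, Primary_Key)
VALUES (1, 'Toto', 'Child', '2020-01-01', '2020-01-03', 1)
     , (1, 'Toto', 'Child', '2020-01-03', '2020-01-10', 2)
     , (1, 'Toto', 'Man'  , '2020-01-10', '2020-01-15', 3)
     , (1, 'Toto', 'Child', '2020-01-15', '2020-01-20', 4); -- hypothetical recurring row

WITH Marked AS (
    SELECT Business_Key, [Name], [Attribute], ValidFrom, ValidTo, Primary_Key
         -- The difference between the two row numbers stays constant
         -- within a run of consecutive rows that share an attribute.
         , grp = ROW_NUMBER() OVER (PARTITION BY Business_Key ORDER BY ValidFrom, Primary_Key)
               - ROW_NUMBER() OVER (PARTITION BY Business_Key, [Attribute] ORDER BY ValidFrom, Primary_Key)
    FROM @Periods
)
SELECT Business_Key
     , [Name]
     , [Attribute]
     , ValidFrom = MIN(ValidFrom)
     , ValidTo = MAX(ValidTo)
     , Primary_Key = MAX(Primary_Key)
FROM Marked
GROUP BY Business_Key, [Name], [Attribute], grp
ORDER BY Business_Key, ValidFrom;
```

With this sample the two Child runs stay separate (2020-01-01 to 2020-01-10 and 2020-01-15 to 2020-01-20), whereas grouping on the attribute alone would merge them into one range. On the original six rows both queries should return the same four rows.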

Related

SQL Server get missing records information

I have a question about SQL Server.
For each row, I need to report which columns have no value.
If one column has no value, the output should say that this column's value is not available.
If more than one column has no value, the output should say that those columns' values are not available.
The names of the missing columns should be concatenated.
Sample data :
CREATE TABLE [dbo].[EmpDetails]
(
[Empid] [int] NULL,
[Empname] [varchar](50) NULL,
[Location] [varchar](50) NULL,
[Deptid] [int] NULL,
[Deptname] [varchar](50) NULL
)
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (1, NULL, N'che', 10, N'hr')
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (2, N'hari', N'pune', NULL, N'pm')
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (3, N'var', NULL, 30, NULL)
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (4, NULL, NULL, NULL, N'hr')
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (NULL, N'venu', N'pune', NULL, NULL)
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (NULL, N'kumar', N'pune', 20, NULL)
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (8, 'ravi', NULL, 10, N'hr')
INSERT INTO [dbo].[EmpDetails] ([Empid], [Empname], [Location], [Deptid], [Deptname])
VALUES (10, N'k', N'pune', 20, N'hr')
Based on the above data, I want output like this:
empid | Empname | Location | Deptid| Deptname | Validate
------+---------+----------+-------+------------+---------------------------------------
1 | NULL |Che |10 | hr | Empname value is not available
2 | hari |pune |NULL | pm | Deptid value is not available
3 | var |NULL |30 | NULL | location and deptname values are not available
4 | NULL |NULL |NULL | hr | empname and location and deptid values are not available
NULL | venu |pune |NULL | NULL | empid and deptid and deptname values are not available
NULL | kumar |pune |20 | NULL | empid and deptname values are not available
8 | ravi |NULL |10 | hr | location value is not available
10 | k |pune |20 | hr |
I tried like below :
SELECT
empid, empname, location, deptid, deptname,
CASE
WHEN COALESCE(empid, '') = '' THEN 'Empid'
ELSE ''
END + ' '+
CASE
WHEN COALESCE(empname, '') = ''
THEN 'Empname'
ELSE ''
END + ' '+
CASE
WHEN COALESCE(Location, '') = ''
THEN 'Location'
ELSE ''
END + ' '+
CASE
WHEN COALESCE(Deptid, '') = ''
THEN 'Deptid'
ELSE ''
END + ' '+
CASE
WHEN COALESCE(Deptname, '') = ''
THEN 'Deptname'
ELSE ''
END + ' ' +
+ 'value not available' AS Validate
FROM
[Test].[dbo].[EmpDetails]
But this query does not return the expected output.
Please tell me how to write a query to achieve this in SQL Server.
Please try the following solution.
It is based on XML and XQuery.
It will work starting from SQL Server 2012 onwards.
SQL
-- DDL and sample data population, start
CREATE TABLE #tbl
(
Empid int NULL,
Empname varchar(50) NULL,
Location varchar(50) NULL,
Deptid int NULL,
Deptname varchar(50) NULL
);
INSERT INTO #tbl (Empid, Empname, Location, Deptid, Deptname) VALUES
(1, NULL, N'che', 10, N'hr'),
(2, N'hari', N'pune', NULL, N'pm'),
(3, N'var', NULL, 30, NULL),
(4, NULL, NULL, NULL, N'hr'),
(NULL, N'venu', N'pune', NULL, NULL),
(NULL, N'kumar', N'pune', 20, NULL),
(8, 'ravi', NULL, 10, N'hr'),
(10, N'k', N'pune', 20, N'hr');
-- DDL and sample data population, end
SELECT t.*
, s
, Validate = REPLACE(c.query('
for $x in /root/source/*
let $name := local-name($x)
return if (/root/target/*[local-name(.)=$name]) then ()
else $name
').value('.','VARCHAR(MAX)'),SPACE(1), ' and ') +
CASE WHEN s=5 THEN ''
WHEN s=4 THEN ' value is not available'
WHEN s<4 THEN ' values are not available'
END
FROM #tbl AS t
CROSS APPLY (SELECT TRY_CAST('<root><source><Empid/><Empname/><Location/><Deptid/><Deptname/></source>' +
(SELECT Empid, Empname, Location, Deptid, Deptname
FOR XML PATH(''), ROOT('target')) + '</root>' AS XML)) AS t1(c)
CROSS APPLY (SELECT c.value('count(/root/target/*)', 'INT')) AS t2(s);
Output
+-------+---------+----------+--------+----------+---+----------------------------------------------------------+
| Empid | Empname | Location | Deptid | Deptname | s | Validate |
+-------+---------+----------+--------+----------+---+----------------------------------------------------------+
| 1 | NULL | che | 10 | hr | 4 | Empname value is not available |
| 2 | hari | pune | NULL | pm | 4 | Deptid value is not available |
| 3 | var | NULL | 30 | NULL | 3 | Location and Deptname values are not available |
| 4 | NULL | NULL | NULL | hr | 2 | Empname and Location and Deptid values are not available |
| NULL | venu | pune | NULL | NULL | 2 | Empid and Deptid and Deptname values are not available |
| NULL | kumar | pune | 20 | NULL | 3 | Empid and Deptname values are not available |
| 8 | ravi | NULL | 10 | hr | 4 | Location value is not available |
| 10 | k | pune | 20 | hr | 5 | |
+-------+---------+----------+--------+----------+---+----------------------------------------------------------+
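For comparison, here is a minimal sketch of the same idea without XML, assuming SQL Server 2017 or later (CONCAT_WS is not available before that). CONCAT_WS skips NULL arguments, so each CASE contributes a column name only when that column is NULL. The table variable @Emp and the aliases names and cnt are hypothetical.

```sql
-- Hypothetical table variable mirroring dbo.EmpDetails, with three of the sample rows.
DECLARE @Emp TABLE (
    Empid int NULL,
    Empname varchar(50) NULL,
    Location varchar(50) NULL,
    Deptid int NULL,
    Deptname varchar(50) NULL
);
INSERT INTO @Emp (Empid, Empname, Location, Deptid, Deptname) VALUES
(1, NULL, 'che', 10, 'hr'),
(3, 'var', NULL, 30, NULL),
(10, 'k', 'pune', 20, 'hr');

SELECT e.*
     , Validate = CASE m.cnt
                      WHEN 0 THEN ''
                      WHEN 1 THEN m.names + ' value is not available'
                      ELSE m.names + ' values are not available'
                  END
FROM @Emp AS e
CROSS APPLY (
    SELECT -- CONCAT_WS ignores NULLs, so only the missing columns are listed.
           names = CONCAT_WS(' and ',
                       CASE WHEN e.Empid    IS NULL THEN 'Empid'    END,
                       CASE WHEN e.Empname  IS NULL THEN 'Empname'  END,
                       CASE WHEN e.Location IS NULL THEN 'Location' END,
                       CASE WHEN e.Deptid   IS NULL THEN 'Deptid'   END,
                       CASE WHEN e.Deptname IS NULL THEN 'Deptname' END)
           -- Count of missing columns, used to pick singular or plural wording.
         , cnt = IIF(e.Empid    IS NULL, 1, 0)
               + IIF(e.Empname  IS NULL, 1, 0)
               + IIF(e.Location IS NULL, 1, 0)
               + IIF(e.Deptid   IS NULL, 1, 0)
               + IIF(e.Deptname IS NULL, 1, 0)
) AS m;
```

The trade-off is that the column list is spelled out twice, while the XQuery version derives it from the row itself.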

Identify the Column changed in SCD Type 2 in SSIS SQL server

I am tracking data in my SCD table, as shown in the image below, using an SSIS package.
I need to add a new column, "Column Updated" (as depicted above), which shows which columns were updated between transactions N-1 and N. This can be achieved with a cursor, but I am looking for a more efficient way. Is it possible to do this within the SCD component, or with a built-in SQL Server function?
Adding the script:
Create table SCDtest
(
id int ,
empid int ,
Deptid varchar(10),
Ename varchar(50),
DeptName varchar(50),
city varchar(50),
startdate datetime,
Enddate datetime ,
ColumnUpdated varchar(500)
)
Insert into SCDtest values (1, 1, 'D1', 'Mike', 'Account', 'Atlanta', '7/31/2020', '8/3/2020','' )
Insert into SCDtest values (2, 2, 'D2', 'Roy', 'IT', 'New York', '7/31/2020', '8/5/2020','' )
Insert into SCDtest values (3, 1, 'D1', 'Ross', 'Account', 'Atlanta', '8/4/2020', '8/7/2020','' )
Insert into SCDtest values (4, 2, 'D2', 'Roy', 'IT', 'Los angeles', '8/5/2020',NULL ,'' )
Insert into SCDtest values (5, 1, 'D1', 'John', 'Marketing', 'Boston', '8/8/2020', NULL,'')
Thank you
Honestly, I don't really know why you need this functionality, as you can very easily just look at the two rows to see any changes on the off chance that you actually need to. I've never needed a ColumnUpdated type of value, and I don't think the processing required to generate one and the storage to hold it are worth it.
That said, here is one way to calculate the desired output from your test data. Ideally you would do this more efficiently as part of the ETL process that updates the rows as they come in, rather than all at once, though that would require information about your ETL that you haven't included in your question:
Query
create table #SCDtest (id int,empid int,Deptid varchar(10),Ename varchar(50),DeptName varchar(50),city varchar(50),startdate datetime,Enddate datetime);
Insert into #SCDtest values(1, 1, 'D1', 'Mike', 'Account', 'Atlanta', '7/31/2020', '8/3/2020'),(2, 2, 'D2', 'Roy', 'IT', 'New York', '7/31/2020', '8/5/2020'),(3, 1, 'D1', 'Ross', 'Account', 'Atlanta', '8/4/2020', '8/7/2020'),(4, 2, 'D2', 'Roy', 'IT', 'Los angeles', '8/5/2020',NULL),(5, 1, 'D1', 'John', 'Marketing', 'Boston', '8/8/2020', NULL);
with l as
(
select *
,lag(id,1) over (partition by empid order by id) as l
from #SCDtest
)
select l.id
,l.empid
,l.Deptid
,l.Ename
,l.DeptName
,l.city
,l.startdate
,l.Enddate
,stuff(concat(case when l.Deptid <> t.Deptid then ', Deptid' end
,case when l.Ename <> t.Ename then ', Ename' end
,case when l.DeptName <> t.DeptName then ', DeptName' end
,case when l.city <> t.city then ', city' end
)
,1,2,''
) as ColumnUpdated
from l
left join #SCDtest as t
on l.l = t.id
order by l.empid
,l.startdate;
Output
+----+-------+--------+-------+-----------+-------------+-------------------------+-------------------------+-----------------------+
| id | empid | Deptid | Ename | DeptName | city | startdate | Enddate | ColumnUpdated |
+----+-------+--------+-------+-----------+-------------+-------------------------+-------------------------+-----------------------+
| 1 | 1 | D1 | Mike | Account | Atlanta | 2020-07-31 00:00:00.000 | 2020-08-03 00:00:00.000 | NULL |
| 3 | 1 | D1 | Ross | Account | Atlanta | 2020-08-04 00:00:00.000 | 2020-08-07 00:00:00.000 | Ename |
| 5 | 1 | D1 | John | Marketing | Boston | 2020-08-08 00:00:00.000 | NULL | Ename, DeptName, city |
| 2 | 2 | D2 | Roy | IT | New York | 2020-07-31 00:00:00.000 | 2020-08-05 00:00:00.000 | NULL |
| 4 | 2 | D2 | Roy | IT | Los angeles | 2020-08-05 00:00:00.000 | NULL | city |
+----+-------+--------+-------+-----------+-------------+-------------------------+-------------------------+-----------------------+
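One thing to be aware of: a `<>` comparison evaluates to unknown when either side is NULL, so a change to or from NULL would not be reported by the query above. The sketch below is one possible NULL-safe variant, using LAG on each column and the EXISTS (SELECT ... EXCEPT SELECT ...) idiom, which treats NULLs as equal to each other. The table variable @Scd, the Prev* aliases, and the NULL city in the second row are hypothetical.

```sql
-- Hypothetical sample: one employee whose city becomes NULL and is later set again.
DECLARE @Scd TABLE (
    id int, empid int, Deptid varchar(10), Ename varchar(50),
    DeptName varchar(50), city varchar(50), startdate datetime
);
INSERT INTO @Scd (id, empid, Deptid, Ename, DeptName, city, startdate) VALUES
(1, 1, 'D1', 'Mike', 'Account',   'Atlanta', '20200731'),
(3, 1, 'D1', 'Ross', 'Account',   NULL,      '20200804'), -- hypothetical NULL city
(5, 1, 'D1', 'John', 'Marketing', 'Boston',  '20200808');

WITH p AS (
    SELECT *
         , PrevId       = LAG(id)       OVER (PARTITION BY empid ORDER BY startdate, id)
         , PrevDeptid   = LAG(Deptid)   OVER (PARTITION BY empid ORDER BY startdate, id)
         , PrevEname    = LAG(Ename)    OVER (PARTITION BY empid ORDER BY startdate, id)
         , PrevDeptName = LAG(DeptName) OVER (PARTITION BY empid ORDER BY startdate, id)
         , PrevCity     = LAG(city)     OVER (PARTITION BY empid ORDER BY startdate, id)
    FROM @Scd
)
SELECT id, empid, Deptid, Ename, DeptName, city, startdate
       -- Only compare when a previous version exists; the first row stays NULL.
     , ColumnUpdated = CASE WHEN PrevId IS NOT NULL THEN
           STUFF(CONCAT(
               -- EXCEPT returns a row only when the two values differ, NULLs included.
               CASE WHEN EXISTS (SELECT Deptid   EXCEPT SELECT PrevDeptid)   THEN ', Deptid'   END,
               CASE WHEN EXISTS (SELECT Ename    EXCEPT SELECT PrevEname)    THEN ', Ename'    END,
               CASE WHEN EXISTS (SELECT DeptName EXCEPT SELECT PrevDeptName) THEN ', DeptName' END,
               CASE WHEN EXISTS (SELECT city     EXCEPT SELECT PrevCity)     THEN ', city'     END
           ), 1, 2, '')
       END
FROM p
ORDER BY empid, startdate;
```

Using LAG on the columns directly also removes the need for the self-join on the previous id.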

How to separate one column to multiple using conditions in SQL Server

I want to separate one column into multiple columns based on a condition.
Table : emp
CREATE TABLE [dbo].[emp]
(
[name] [varchar](200) NULL,
[id] [int] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[emp] ([name], [id])
VALUES (N'lux-pen-oxo-mobile', 1),
(N'pne-soap', 2),
(N'hop-pen-mobile-soap-jad', 3),
(N'pen-soap-box', 4)
Based on the above data, I want output like this:
id |prod1 |prod2 |prod3 |prod4 | Prod5
1 |lux |pen |oxo |mobile |
2 |pne |soap | | |
3 |hop |pen |mobile |soap |jad
4 |pen |soap |box | |
I tried like this:
select
id,
case
when charindex('-', name) > 0
then substring(name, 1, charindex('-', [name]) - 1)
end prod1,
substring(name, charindex('-', [name], 2) + 1, len(name)) prod2,
substring(name, charindex('-', [name], 3) + 1, len(name)) prod3,
substring(name, charindex('-', [name], 4) + 1, len(name)) prod4,
substring(name, charindex('-', [name], 5) + 1, len(name)) prod4
from
[emp]
This query is not returning the expected result.
Please tell me how to write a query to achieve this task in SQL Server.
You can do it using a CTE, as in the following example.
;WITH Split_Names (ID,xmlname)
AS
(
SELECT ID,
CONVERT(XML,'<Names><name>'
+ REPLACE(Name,'-', '</name><name>') + '</name></Names>') AS xmlname
FROM [dbo].[emp]
)
SELECT ID,
xmlname.value('/Names[1]/name[1]','varchar(100)') AS prod1,
xmlname.value('/Names[1]/name[2]','varchar(100)') AS prod2,
xmlname.value('/Names[1]/name[3]','varchar(100)') AS prod3,
xmlname.value('/Names[1]/name[4]','varchar(100)') AS prod4,
xmlname.value('/Names[1]/name[5]','varchar(100)') AS prod5
FROM Split_Names
OUTPUT
+----+-------+-------+--------+--------+-------+
| ID | prod1 | prod2 | prod3 | prod4 | prod5 |
+----+-------+-------+--------+--------+-------+
| 1 | lux | pen | oxo | mobile | NULL |
+----+-------+-------+--------+--------+-------+
| 2 | pne | soap | NULL | NULL | NULL |
+----+-------+-------+--------+--------+-------+
| 3 | hop | pen | mobile | soap | jad |
+----+-------+-------+--------+--------+-------+
| 4 | pen | soap | box | NULL | NULL |
+----+-------+-------+--------+--------+-------+
If you want to replace NULL with '', you can change the columns in the SELECT list as follows.
ISNULL(xmlname.value('/Names[1]/name[1]','varchar(100)'),'') AS prod1 ,
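Note that the XML conversion raises a parse error if a name contains characters that are special in XML, such as & or <. As an alternative, here is a minimal sketch that splits through OPENJSON instead, assuming SQL Server 2016 or later with database compatibility level 130 or higher. The table variable @emp and the 'r&d-pen' row are hypothetical.

```sql
-- Hypothetical sample, including a value that would break the XML conversion.
DECLARE @emp TABLE ([name] varchar(200) NULL, id int NULL);
INSERT INTO @emp ([name], id) VALUES
('lux-pen-oxo-mobile', 1),
('pne-soap', 2),
('r&d-pen', 3); -- hypothetical row with an ampersand

SELECT e.id
       -- OPENJSON returns the zero-based array position in [key].
     , prod1 = MAX(CASE WHEN j.[key] = '0' THEN j.[value] END)
     , prod2 = MAX(CASE WHEN j.[key] = '1' THEN j.[value] END)
     , prod3 = MAX(CASE WHEN j.[key] = '2' THEN j.[value] END)
     , prod4 = MAX(CASE WHEN j.[key] = '3' THEN j.[value] END)
     , prod5 = MAX(CASE WHEN j.[key] = '4' THEN j.[value] END)
FROM @emp AS e
-- Turn 'a-b-c' into the JSON array ["a","b","c"]; STRING_ESCAPE protects quotes and backslashes.
CROSS APPLY OPENJSON('["' + REPLACE(STRING_ESCAPE(e.[name], 'json'), '-', '","') + '"]') AS j
GROUP BY e.id
ORDER BY e.id;
```

This sketch assumes id is unique, since it groups by id, and a row whose name is NULL is dropped by the CROSS APPLY; OUTER APPLY would keep it.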

SQL DATEDIFF in an sql query

I have two tables Customers and Purchases:
Customers table:
+------------+-----------+----------+
| CustomerID | FirstName | Surname |
+------------+-----------+----------+
| 101 | Jeff | Smith |
| 102 | Alex | Jones |
| 103 | Pam | Clark |
| 104 | Zola | Lona |
| 105 | Simphele | Ndima |
| 106 | Andre | Williams |
| 107 | Wayne | Shelton |
| 108 | Bob | Banard |
| 109 | Ken | Davidson |
| 110 | Sally | Ivan |
+------------+-----------+----------+
Purchases table:
+------------+--------------+------------+-----------+
| PurchaseId | PurchaseDate | CustomerID | ProductID |
+------------+--------------+------------+-----------+
| 1 | 2012-08-15 | 105 | a510 |
| 2 | 2012-08-15 | 102 | a510 |
| 3 | 2012-08-15 | 103 | a506 |
| 4 | 2012-08-16 | 105 | a510 |
| 5 | 2012-08-17 | 106 | a507 |
| 6 | 2012-08-17 | 107 | a509 |
| 7 | 2012-08-18 | 108 | a502 |
| 8 | 2012-08-19 | 108 | a510 |
| 9 | 2012-08-19 | 109 | a502 |
| 10 | 2012-08-20 | 110 | a503 |
| 11 | 2012-08-21 | 101 | a510 |
| 12 | 2012-08-22 | 102 | a507 |
+------------+--------------+------------+-----------+
My question (which I have been struggling with for the last 2 days): create a query that displays all the customers who made a purchase five or more days after their previous purchase.
Desired outputs:
+-----------+------------------+
| Firstname | Daysdifference |
+-----------+------------------+
| Alex | 7 |
+-----------+------------------+
select c.FirstName, t.dif as Daysdifference from customer c
inner join
(
select p1.CustomerID,
datediff(day,p1.PurchaseDate,p2.PurchaseDate) as dif
from purchases p1
inner join purchases p2
on p1.CustomerID=p2.CustomerID
where datediff(day,p1.PurchaseDate,p2.PurchaseDate)>=5
) t
on t.CustomerID= c.CustomerID
Here you go:
CREATE TABLE #Customers (CustomerID INT, FirstName VARCHAR(30), Surname VARCHAR(30));
CREATE TABLE #Purchases (PurchaseId INT, PurchaseDate DATE, CustomerID INT, ProductID VARCHAR(10) );
/**/
INSERT INTO #Customers VALUES
(101,'Jeff ' , 'Smith '),
(102,'Alex ' , 'Jones '),
(103,'Pam ' , 'Clark '),
(104,'Zola ' , 'Lona '),
(105,'Simphele' , 'Ndima '),
(106,'Andre ' , 'Williams'),
(107,'Wayne ' , 'Shelton '),
(108,'Bob ' , 'Banard '),
(109,'Ken ' , 'Davidson'),
(110,'Sally ' , 'Ivan ');
INSERT INTO #Purchases VALUES
(1, '2012-08-15' ,105, 'a510'),
(2, '2012-08-15' ,102, 'a510'),
(3, '2012-08-15' ,103, 'a506'),
(4, '2012-08-16' ,105, 'a510'),
(5, '2012-08-17' ,106, 'a507'),
(6, '2012-08-17' ,107, 'a509'),
(7, '2012-08-18' ,108, 'a502'),
(8, '2012-08-19' ,108, 'a510'),
(9, '2012-08-19' ,109, 'a502'),
(10,'2012-08-20' ,110, 'a503'),
(11,'2012-08-21' ,101, 'a510'),
(12,'2012-08-22' ,102, 'a507');
--
WITH CTE AS (
SELECT Pur1.CustomerID, DATEDIFF(DAY, Pur1.PurchaseDate, Pur2.PurchaseDate) Daysdifference
FROM #Purchases Pur1 INNER JOIN #Purchases Pur2 ON Pur1.CustomerID = Pur2.CustomerID
)
SELECT Cus.FirstName, CTE.Daysdifference
FROM #Customers Cus INNER JOIN CTE ON Cus.CustomerID = CTE.CustomerID
WHERE CTE.Daysdifference >= 5;
Result:
+-----------+------------------+
| Firstname | Daysdifference |
+-----------+------------------+
| Alex | 7 |
+-----------+------------------+
You can solve it like this:
Create a ranking based on date desc and partitioned by customer id
Next check date diff between consecutive ranks to find those customers
Query below
; with cte as
(
select
*,
row_number() over(partition by CustomerID order by PurchaseDate desc) r
from
Purchases
)
select
Name= c.FirstName,
Daysdifference =datediff(d,c1.PurchaseDate, c2.PurchaseDate)
from
Customers c join
cte c1
on c.customerid=c1.customerid
join cte c2
on c1.CustomerID=c2.CustomerId
and c1.r-1=c2.r
and datediff(d,c1.PurchaseDate, c2.PurchaseDate) >=5
Since SQL Server 2012 and the addition of the LAG & LEAD functions, there is no reason at all to do a self join for something like this...
Note: window functions can be extremely efficient compared to other methods, but they need a proper index to perform at their best (note the additional POC index in the test script).
CREATE TABLE #Customers (
CustomerID INT PRIMARY KEY,
FirstName VARCHAR(30),
Surname VARCHAR(30)
);
CREATE TABLE #Purchases (
PurchaseId INT PRIMARY KEY,
PurchaseDate DATE,
CustomerID INT,
ProductID VARCHAR(10)
);
INSERT INTO #Customers VALUES
(101,'Jeff ' , 'Smith '),
(102,'Alex ' , 'Jones '),
(103,'Pam ' , 'Clark '),
(104,'Zola ' , 'Lona '),
(105,'Simphele' , 'Ndima '),
(106,'Andre ' , 'Williams'),
(107,'Wayne ' , 'Shelton '),
(108,'Bob ' , 'Banard '),
(109,'Ken ' , 'Davidson'),
(110,'Sally ' , 'Ivan ');
INSERT INTO #Purchases VALUES
(1, '2012-08-15' ,105, 'a510'),
(2, '2012-08-15' ,102, 'a510'),
(3, '2012-08-15' ,103, 'a506'),
(4, '2012-08-16' ,105, 'a510'),
(5, '2012-08-17' ,106, 'a507'),
(6, '2012-08-17' ,107, 'a509'),
(7, '2012-08-18' ,108, 'a502'),
(8, '2012-08-19' ,108, 'a510'),
(9, '2012-08-19' ,109, 'a502'),
(10,'2012-08-20' ,110, 'a503'),
(11,'2012-08-21' ,101, 'a510'),
(12,'2012-08-22' ,102, 'a507');
-- add POC index...
CREATE NONCLUSTERED INDEX ix_POC ON #Purchases (CustomerID, PurchaseDate);
--===========================================================
SELECT
c.FirstName,
p2.Daysdifference
FROM
#Customers c
JOIN (
SELECT
p.CustomerID,
Daysdifference = DATEDIFF(DAY, p.PurchaseDate, LEAD(p.PurchaseDate, 1) OVER (PARTITION BY p.CustomerID ORDER BY p.PurchaseDate))
FROM
#Purchases p
) p2
ON c.CustomerID = p2.CustomerID
WHERE
p2.Daysdifference >= 5;
Results...
FirstName Daysdifference
------------------------------ --------------
Alex 7

Create sql query to inherits value from nearest parent row

I have a self-referencing table called Units with a "BossId" column that refers to the unit's manager.
A business rule determines each unit's boss, as follows:
1. The unit has its own BossId (nothing more to do).
2. The BossId is NULL. In this case we use the nearest parent that has a BossId value.
I want to create an efficient SQL view in which every unit's boss is determined according to these rules.
Below is the structure of my Units table:
CREATE TABLE [dbo].[Unit](
[Id] [int] IDENTITY(1,1) NOT NULL,
[ParentId] [int] NULL,
[BossId] [int] NULL,
Here is the sample data:
INSERT INTO Units (ID, ParentID, BossId) VALUES (1, NULL, 1000)
INSERT INTO Units (ID, ParentID, BossId) VALUES (2, 1, NULL)
INSERT INTO Units (ID, ParentID, BossId) VALUES (3, 2, NULL)
INSERT INTO Units (ID, ParentID, BossId) VALUES (4, 1, 3000)
INSERT INTO Units (ID, ParentID, BossId) VALUES (5, 4, NULL)
Selecting the data as follows:
Select ID,ParentId,BossId from Units
result would be:
+----+-------+----------+
| ID | ParentId| BossId|
+----+-------+----------+
| 1 | NULL | 1000 |
| 2 | 1 | NULL |
| 3 | 2 | NULL |
| 4 | 1 | 3000 |
| 5 | 4 | NULL |
+----+-------+----------+
I need some view to produce something like this:
+----+-------+----------+
| ID | ParentId| BossId|
+----+-------+----------+
| 1 | NULL | 1000 |
| 2 | 1 | 1000 |
| 3 | 2 | 1000 |
| 4 | 1 | 3000 |
| 5 | 4 | 3000 |
+----+-------+----------+
So every unit's boss ID is determined according to the rules.
With _cte (ParentId, Id, BossId)
As
(
Select ParentId, Id, BossId
From Units
Union All
Select U.parentId, U.Id, c.BossId
From Units As U
Join _cte As c
On u.ParentId = c.Id
)
Select Id, ParentId, Max(BossId) As BossId
From _cte
Where BossId Is Not Null
Group
By Id, ParentId
Order
By Id, ParentId
Produces
Id ParentId BossId
----------- ----------- -----------
1 NULL 1000
2 1 1000
3 2 1000
4 1 3000
5 4 3000
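The MAX(BossId) step gives the right answer for this sample, but it picks the largest inherited value rather than the nearest one, so it can differ from the stated rule when a unit's own boss has a smaller ID than an ancestor's boss. A minimal sketch of a top-down variant follows, assuming every hierarchy starts at a row whose ParentId is NULL. The table variable @Units, the CTE name Resolved, and the BossId of 500 on unit 4 are hypothetical.

```sql
-- Hypothetical sample: unit 4 has a boss ID (500) smaller than the root's (1000).
DECLARE @Units TABLE (Id int NOT NULL, ParentId int NULL, BossId int NULL);
INSERT INTO @Units (Id, ParentId, BossId) VALUES
(1, NULL, 1000),
(2, 1, NULL),
(3, 2, NULL),
(4, 1, 500),
(5, 4, NULL);

WITH Resolved AS (
    -- Anchor: root units keep their own BossId.
    SELECT Id, ParentId, BossId
    FROM @Units
    WHERE ParentId IS NULL
    UNION ALL
    -- Each child keeps its own BossId, or else inherits its parent's resolved one.
    SELECT u.Id, u.ParentId, COALESCE(u.BossId, r.BossId)
    FROM @Units AS u
    JOIN Resolved AS r
        ON u.ParentId = r.Id
)
SELECT Id, ParentId, BossId
FROM Resolved
ORDER BY Id;
```

Here units 4 and 5 resolve to 500, while the MAX-based query would report 1000 for both. Because each unit is visited once, no GROUP BY is needed. In a view, the CTE would reference the real Units table instead of a table variable.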
