T-SQL prepare dynamic COALESCE - sql-server

As shown in the attached screenshot, there are two tables: Configuration and Detail.
Using the Configuration and Detail tables, I would like to populate the IdentificationType and IDerivedIdentification columns in the Detail table.
The following logic should be used when deriving these columns:
The Configuration table holds an order of preference, which the user can change dynamically (e.g. if the country is Austria, the ID preference is LEI, then TIN if LEI is blank, then CONCAT, i.e. some other logic, if both are blank).
For ContactId = 3, the country is BG, so LEI is checked first; since it is NULL, CCPT = 456 is picked.
I could have used COALESCE and a CASE statement if hardcoding were allowed.
Can you please suggest an alternative approach?
Regards
Digant

Assuming that this is some horrendous data dump that you are trying to clean up, here is some SQL to throw at it. :) I was able to capture the text in your image via Adobe Acrobat > Excel.
(I also built the schema for you at: http://sqlfiddle.com/#!6/8f404/12)
Firstly, the correct thing to do is to fix the glaring problem, which is the table structure. Assuming you can't, here's a solution.
The query unpivots the columns LEI, NIND, CCPT and TIN from the Detail table, as well as FirstPref, SecondPref and ThirdPref from the Configuration table. Doing this helps to normalize the data, although it costs you major performance if there are no plans to fix the data structure, or if you cannot fix it.
After that you are simply joining Detail.ContactId to DerivedTypes.ContactId, then DerivedPrefs.ISOCountryCode to Detail.CountrylSOCountryCode and DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType.
If you use an inner join rather than the left join, you can remove the RANK() function, but the query will not show all ContactIds, only those that have a value in their LEI, NIND, CCPT or TIN columns. I think that's a better solution anyway, because why would you want to see an error mixed into a report? Write a separate report for those with no values in those columns.
Lastly, TOP (1) WITH TIES allows you to display one record per ContactId and allows the record with the error to still display. Hope this helps.
CREATE TABLE Configuration
(ISOCountryCode varchar(2), CountryName varchar(8), FirstPref varchar(6), SecondPref varchar(6), ThirdPref varchar(6))
;
INSERT INTO Configuration
(ISOCountryCode, CountryName, FirstPref, SecondPref, ThirdPref)
VALUES
('AT', 'Austria', 'LEI', 'TIN', 'CONCAT'),
('BE', 'Belgium', 'LEI', 'NIND', 'CONCAT'),
('BG', 'Bulgaria', 'LEI', 'CCPT', 'CONCAT'),
('CY', 'Cyprus', 'LEI', 'NIND', 'CONCAT')
;
CREATE TABLE Detail
(ContactId int, FirstName varchar(1), LastName varchar(3), BirthDate varchar(4), CountrylSOCountryCode varchar(2), Nationality varchar(2), LEI varchar(9), NIND varchar(9), CCPT varchar(9), TIN varchar(9))
;
INSERT INTO Detail
(ContactId, FirstName, LastName, BirthDate, CountrylSOCountryCode, Nationality, LEI, NIND, CCPT, TIN)
VALUES
(1, 'A', 'DES', NULL, 'AT', 'AT', '123', '4345', NULL, NULL),
(2, 'B', 'DEG', NULL, 'BE', 'BE', NULL, '890', NULL, NULL),
(3, 'C', 'DEH', NULL, 'BG', 'BG', NULL, '123', '456', NULL),
(4, 'D', 'DEi', NULL, 'BG', 'BG', NULL, NULL, NULL, NULL)
;
SELECT TOP (1) with ties Detail.ContactId,
FirstName,
LastName,
BirthDate,
CountrylSOCountryCode,
Nationality,
LEI,
NIND,
CCPT,
TIN,
ISNULL(DerivedPrefs.ldentificationType, 'ERROR') ldentificationType,
IDerivedIdentification,
RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END) AS PrefRank
FROM
Detail
LEFT JOIN
(
SELECT
ContactId,
LEI,
NIND,
CCPT,
TIN
FROM Detail
) DetailUNPVT
UNPIVOT
(IDerivedIdentification FOR ldentificationType IN
(LEI, NIND, CCPT, TIN)
)AS DerivedTypes
ON DerivedTypes.ContactId = Detail.ContactId
LEFT JOIN
(
SELECT
ISOCountryCode,
CountryName,
FirstPref,
SecondPref,
ThirdPref
FROM
Configuration
) ConfigUNPIVOT
UNPIVOT
(ldentificationType FOR Pref IN
(FirstPref, SecondPref, ThirdPref)
)AS DerivedPrefs
ON DerivedPrefs.ISOCountryCode = Detail.CountrylSOCountryCode
and DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType
ORDER BY RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END)
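As an alternative to unpivoting, the preference lookup can also be sketched as a COALESCE over CASE expressions that are driven by the Configuration columns, so no country is hardcoded. The sketch below is not the answer's method; it runs on SQLite through Python's sqlite3 module purely so it can be executed anywhere, and the shortened column name CountryCode and the helper pick() are assumptions made for illustration. The CONCAT preference is left unhandled (it falls through to 'ERROR'), since the question only describes it as "some other logic".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Configuration (ISOCountryCode TEXT, FirstPref TEXT, SecondPref TEXT, ThirdPref TEXT);
INSERT INTO Configuration VALUES
  ('AT', 'LEI', 'TIN',  'CONCAT'),
  ('BE', 'LEI', 'NIND', 'CONCAT'),
  ('BG', 'LEI', 'CCPT', 'CONCAT');
-- CountryCode is a shortened, hypothetical name for the country column
CREATE TABLE Detail (ContactId INTEGER, CountryCode TEXT, LEI TEXT, NIND TEXT, CCPT TEXT, TIN TEXT);
INSERT INTO Detail VALUES
  (1, 'AT', '123', '4345', NULL,  NULL),
  (2, 'BE', NULL,  '890',  NULL,  NULL),
  (3, 'BG', NULL,  '123',  '456', NULL),
  (4, 'BG', NULL,  NULL,   NULL,  NULL);
""")

def pick(pref_column):
    """Build a CASE that maps the type named in a preference column to its value."""
    return (f"CASE c.{pref_column} WHEN 'LEI' THEN d.LEI WHEN 'NIND' THEN d.NIND "
            f"WHEN 'CCPT' THEN d.CCPT WHEN 'TIN' THEN d.TIN END")

query = f"""
SELECT ContactId,
       CASE WHEN v1 IS NOT NULL THEN FirstPref
            WHEN v2 IS NOT NULL THEN SecondPref
            ELSE 'ERROR' END    AS IdentificationType,
       COALESCE(v1, v2)         AS DerivedIdentification
FROM (SELECT d.ContactId, c.FirstPref, c.SecondPref,
             {pick('FirstPref')}  AS v1,
             {pick('SecondPref')} AS v2
      FROM Detail d
      LEFT JOIN Configuration c ON c.ISOCountryCode = d.CountryCode) AS x
ORDER BY ContactId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Changing a row in Configuration changes the order in which the columns are tried, without touching the query. The same CASE/COALESCE shape should carry over to T-SQL, but that has not been verified here.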

Related

Select all Main table rows with detail table column constraints with GROUP BY

I have 2 tables, tblMain and tblDetail, on SQL Server that are linked by tblMain.id=tblDetail.OrderID and used for orders. I have not found exactly the same situation on StackOverflow.
Here is the sample table design:
/* create and populate tblMain: */
CREATE TABLE tblMain (
ID int IDENTITY(1,1) NOT NULL,
DateOrder datetime NULL,
CONSTRAINT PK_tblMain PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblMain (DateOrder) VALUES('2021-05-20T12:12:10');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-21T09:13:13');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-22T21:30:28');
GO
/* create and populate tblDetail: */
CREATE TABLE tblDetail (
ID int IDENTITY(1,1) NOT NULL,
OrderID int NULL,
Gencod VARCHAR(255),
Quantity float,
Price float,
CONSTRAINT PK_tblDetail PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '1234567890123', 8, 12.30);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '5825867890321', 2, 2.88);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '7788997890333', 1, 1.77);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9882254656215', 3, 5.66);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9665464654654', 4, 10.64);
GO
Here is my SELECT with grouping:
SELECT tblMain.id,SUM(tblDetail.Quantity*tblDetail.Price) AS TotalPrice
FROM tblMain LEFT JOIN tblDetail ON tblMain.id=tblDetail.orderid
WHERE (tblDetail.Quantity<>0) GROUP BY tblMain.id;
GO
This gives: [output not shown]
The desired output: [output not shown]
We see that id=2 is not shown even with the LEFT JOIN, as there are no records with OrderID=2 in tblDetail.
How can I design a new query that shows tblMain.id = 2? Meanwhile, I must keep the WHERE (tblDetail.Quantity<>0) constraint. Many thanks.
EDIT:
The above query serves as a CTE (Common Table Expression) for a main query that also takes the payments table tblPayments into account.
After testing, both solutions work.
In my case, the main table has 15K records, while the detail table has some millions. With (tblDetail.Quantity<>0 OR tblDetail.Quantity IS NULL) AND tblDetail.IsActive=1 added to the JOIN's ON clause, the query takes 37s to run, while with @pwilcox's first solution, where the condition is added to the WHERE clause, it finishes in 29s. So a time gain of about 20%.
The tblDetail.IsActive column lets me temporarily ignore detail rows by setting it to false.
So for me the solution is @pwilcox's answer:
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
Change
WHERE (tblDetail.Quantity<>0)
to
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
as the former will omit id = 2, because the corresponding quantity is NULL after the left join.
And as HABO mentions, you can also make the condition part of your join logic rather than your WHERE clause, avoiding the need for the 'or' condition.
select m.id,
totalPrice = sum(d.quantity * d.price)
from tblMain m
left join tblDetail d
on m.id = d.orderid
and d.quantity <> 0
group by m.id;
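To see the difference between filtering in the WHERE clause and filtering in the ON clause, here is a small sketch using the question's sample data. It runs on SQLite through Python's sqlite3 module rather than on SQL Server (an assumption made only so the example is self-contained), and the variable names filter_in_where and filter_in_join are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblMain (ID INTEGER PRIMARY KEY, DateOrder TEXT);
INSERT INTO tblMain (ID, DateOrder) VALUES
  (1, '2021-05-20T12:12:10'),
  (2, '2021-05-21T09:13:13'),
  (3, '2021-05-22T21:30:28');
CREATE TABLE tblDetail (ID INTEGER PRIMARY KEY, OrderID INTEGER, Quantity REAL, Price REAL);
INSERT INTO tblDetail (OrderID, Quantity, Price) VALUES
  (1, 8, 12.30), (1, 2, 2.88), (3, 1, 1.77), (3, 3, 5.66), (3, 4, 10.64);
""")

# Filter in WHERE: the NULL-extended row for order 2 fails "Quantity <> 0"
# (NULL <> 0 is not true), so the order disappears from the result.
filter_in_where = conn.execute("""
SELECT m.ID, SUM(d.Quantity * d.Price) AS TotalPrice
FROM tblMain m
LEFT JOIN tblDetail d ON m.ID = d.OrderID
WHERE d.Quantity <> 0
GROUP BY m.ID
ORDER BY m.ID
""").fetchall()

# Filter in ON: the condition only decides which detail rows match,
# so every tblMain row survives and order 2 gets a NULL total.
filter_in_join = conn.execute("""
SELECT m.ID, SUM(d.Quantity * d.Price) AS TotalPrice
FROM tblMain m
LEFT JOIN tblDetail d ON m.ID = d.OrderID AND d.Quantity <> 0
GROUP BY m.ID
ORDER BY m.ID
""").fetchall()

print([row[0] for row in filter_in_where])  # ids returned with the WHERE filter
print([row[0] for row in filter_in_join])   # ids returned with the ON filter
```

If a zero is preferred over NULL for orders without details, wrapping the sum in COALESCE(..., 0) is one option.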

Can I pull Max and Min values in SQL without using group by for non aggregate values?

I have a table of user data for when they enroll in a program. The fields include a user ID, start date, end date, entry reason, exit reason and program type. For each year the user is enrolled in a specific program they will have an entry and exit date for that year along with an entry reason. They only get an exit reason when they are exited from the program completely. Here is an example of the data in the table.
Data Table
Desired Result
I need to pull one line for each user that has their original start date in the program, most recent start date, and most recent end date. I also need to pull the exit reason if one exists and entry reason associated with the most recent start date and this is what is getting me hung up. I’m assuming the problem is related to having to group by the entry reason. Is there any way around using an aggregate function to get the min/max dates?
My query is:
Select
Table1.userID,
CAST(Min(table2.startdate) as date) as Originalstartdate,
CAST(Max(table2.startdate) as date) as Maxstartdate,
CAST(Max(table2.enddate) as date) as ExitDate,
CASE
WHEN table2.exitreason = NULL then ''
ELSE table2.exitreason
END as Exitcode,
Table2.entryreason
From
Table1 left outer join
Table2 on Table1.userID = Table2.userID
Where
Table1.status = 'active' and Table2.programID = 'Program1' and (Table2.exitreason <> 'NULL' or Table2.entryreason <> 'NULL')
Group By
Table1.userID, Table2.exitreason, Table2.entryreason
I used the below sample code in order to generate this.
The idea here is to utilize the userID as the anchor (you want one row per user, right?), aggregating the rest of the information but with the situation you requested.
CREATE TABLE SCRIPT:
CREATE TABLE table1
(
userID INT IDENTITY(1, 1) PRIMARY KEY,
name VARCHAR(200),
stat CHAR(1) NOT NULL
DEFAULT 'A');
CREATE TABLE table2
(
t2ID INT IDENTITY(1, 1),
StartDate DATE,
UserID INT FOREIGN KEY REFERENCES table1(userID),
ProgramID VARCHAR(150) DEFAULT 'Program1',
EndDate DATE,
EntryReason VARCHAR(2000),
ExitReason VARCHAR(2000));
INSERT INTO Table1
(name)
SELECT *
FROM(VALUES
(
'First name'),
(
'Second name'),
(
'Third name')) x("name");
INSERT INTO Table2
SELECT *
FROM(VALUES
(
'20180101', 1, 'Program1', '20181231', 11, NULL),
(
'20190101', 1, 'Program1', '20191231', 12, NULL),
(
'20200101', 1, 'Program1', NULL, 11, NULL),
(
'20170101', 2, 'Program1', '20171231', 11, NULL),
(
'20180101', 2, 'Program1', '20171231', 14, 2),
(
'20200101', 3, 'Program1', NULL, 11, NULL)
) x(StartDate, UserID, ProgramID, EndDate, EntryReason, ExitReason);
QUERY:
SELECT t1.userID,
CAST(MIN(t2.StartDate) AS DATE)
AS OriginalStartDate, -- This uses your logic to grab the earliest date
CAST(MAX(t2.StartDate) AS DATE)
AS RecentStartDate, -- This utilizes your logic to grab the last start date
CAST(MAX(t2.enddate) AS DATE)
AS ExitDate,
-- MAX ignores NULLs, so this is the latest populated end date; it stays NULL
-- for users who have no end date recorded yet
ISNULL(MAX(t2.exitreason), '')
AS ExitCode, -- This is just a cleaner way to handle nulls.
STUFF(
(
SELECT CONCAT(',', EntryReason)
FROM Table2
WHERE Table2.UserID=t1.UserID FOR XML PATH('')
), 1, 1, '')
AS EntryReasonList
-- this solution creates a list of entry reasons; we could pick a best winner
-- (e.g. first entry code, last entry code..) but I created a list because
-- I didn't understand your intent.
FROM Table1
AS t1
LEFT JOIN
Table2
AS t2
ON T1.userID=T2.userID
WHERE t1.stat='A' -- you would use status= 'active'
AND t2.programID='Program1' -- same as before
AND NOT EXISTS
-- a not exists clause will do what you want to filter graduates out
(
SELECT 1
FROM Table2
AS t2self
WHERE t2.userID=t2self.userID
AND t2self.exitreason IS NOT NULL
)
GROUP BY t1.userID;
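If the intent is the entry reason that belongs to the most recent start date (rather than a list of all entry reasons), one possible sketch is a correlated subquery that picks the latest row per user while the dates are still aggregated with MIN/MAX. This is an assumption about what the asker wants, not part of the answer above. It runs on SQLite via Python's sqlite3 module; in T-SQL the subquery would use TOP (1) instead of LIMIT 1. The table is trimmed to the columns needed, the ProgramID and status filters are left out, and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table2 (UserID INTEGER, StartDate TEXT, EndDate TEXT,
                     EntryReason TEXT, ExitReason TEXT);
INSERT INTO table2 VALUES
  (1, '2018-01-01', '2018-12-31', '11', NULL),
  (1, '2019-01-01', '2019-12-31', '12', NULL),
  (1, '2020-01-01', NULL,         '11', NULL),
  (2, '2017-01-01', '2017-12-31', '11', NULL),
  (2, '2018-01-01', '2017-12-31', '14', '2'),
  (3, '2020-01-01', NULL,         '11', NULL);
""")

rows = conn.execute("""
SELECT t2.UserID,
       MIN(t2.StartDate)                AS OriginalStartDate,
       MAX(t2.StartDate)                AS RecentStartDate,
       MAX(t2.EndDate)                  AS ExitDate,
       -- entry reason of the row with the latest start date for this user
       (SELECT x.EntryReason
        FROM table2 x
        WHERE x.UserID = t2.UserID
        ORDER BY x.StartDate DESC
        LIMIT 1)                        AS RecentEntryReason,
       COALESCE(MAX(t2.ExitReason), '') AS ExitCode
FROM table2 t2
GROUP BY t2.UserID
ORDER BY t2.UserID
""").fetchall()

for row in rows:
    print(row)
```

Only UserID is in the GROUP BY, so each user stays on one line; the entry reason is looked up separately instead of being grouped on.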

How to group by with NULL values

What would be the simplest way to group by when NULL values?
declare @MyTable Table (ID int, Name varchar(50),Coverage varchar(50), Premium money)
insert into @MyTable values (1,'Robert', 'AutoBI', 100),
(1,'Robert', NULL, 300),
(2,'Neill','AutoBIPD',150),
(2,'Neill','AutoBI',200),
(3,'Kim', 'Collision',50),
(3,'Kim',NULL,100),
(4,'Rick','AutoBI',70),
(5,'Lukasz','Comprehensive',50),
(5,'Lukasz','NULL',25)
select ID,
Name,
Coverage,
sum(Premium) as Premium
from @MyTable
group by ID
,Name
,Premium
,Coverage
The outcome looks like this:
As you can see, there is a NULL Coverage value for the name 'Robert'.
How can I get the summed premium ($400) on only one line, without the NULL Coverage?
I need to make it look like this:
I cannot use the MAX() function in this case.
This solution assumes that a NULL will be grouped with one arbitrary NOT NULL value within the same ID/Name. If more than one such value is possible, this query won't return stable result sets between executions:
select ID,
Name,
ISNULL(m1.Coverage, sub.Coverage) AS Coverage,
sum(Premium) as Premium
FROM @MyTable m1
cross apply (SELECT TOP 1 m2.Coverage FROM @MyTable m2 WHERE Coverage IS NOT NULL
AND m1.ID = m2.ID AND m1.Name = m2.Name) sub
group by ID
,Name
,ISNULL(m1.Coverage, sub.Coverage);
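One way to make the result stable is to give the lookup an explicit ORDER BY, so a NULL Coverage is always folded into the same non-NULL value for that ID/Name. The sketch below shows the idea on SQLite through Python's sqlite3 module, with a correlated subquery standing in for the CROSS APPLY; it assumes the last Lukasz row was meant to hold a real NULL rather than the string 'NULL', and the alias FilledCoverage is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (ID INTEGER, Name TEXT, Coverage TEXT, Premium INTEGER);
INSERT INTO MyTable VALUES
  (1, 'Robert', 'AutoBI',        100),
  (1, 'Robert', NULL,            300),
  (2, 'Neill',  'AutoBIPD',      150),
  (2, 'Neill',  'AutoBI',        200),
  (3, 'Kim',    'Collision',      50),
  (3, 'Kim',    NULL,            100),
  (4, 'Rick',   'AutoBI',         70),
  (5, 'Lukasz', 'Comprehensive',  50),
  (5, 'Lukasz', NULL,             25);  -- assumed to be a real NULL
""")

rows = conn.execute("""
SELECT ID, Name, FilledCoverage, SUM(Premium) AS Premium
FROM (SELECT m1.ID, m1.Name, m1.Premium,
             -- replace a NULL Coverage with the alphabetically first
             -- non-NULL Coverage of the same ID/Name
             COALESCE(m1.Coverage,
                      (SELECT m2.Coverage
                       FROM MyTable m2
                       WHERE m2.Coverage IS NOT NULL
                         AND m2.ID = m1.ID AND m2.Name = m1.Name
                       ORDER BY m2.Coverage
                       LIMIT 1)) AS FilledCoverage
      FROM MyTable m1) AS filled
GROUP BY ID, Name, FilledCoverage
ORDER BY ID, FilledCoverage
""").fetchall()

for row in rows:
    print(row)
```

Unlike CROSS APPLY, this form also keeps rows whose ID/Name has no non-NULL Coverage at all; they simply stay NULL. In T-SQL the equivalent would be OUTER APPLY with TOP 1 and an ORDER BY.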

How can i model groups of persons?

I need to model groups of persons and I can't find a way to design tables to do it efficiently.
Groups can be thought of as sets: unordered collections of one or more persons. Each group should be uniquely identified by its components.
Edit: and a person can be part of more than one group.
My first attempt looks like this.
A table which contains all "persons" managed by the system.
table Persons(
id int,
name varchar,
(other data...)
)
a table that contains groups and all group properties:
table Groups(
group_id int,
group_name varchar,
(other data...)
)
and a table with the association between persons and groups
table gropus_persons (
person_id int,
group_id int
)
This design doesn't fit these requirements well, because it is hard to write the query that retrieves the group id from a list of components.
The only query I could come up with to find the group composed of persons (1, 2, 3) looks like this:
select *
from groups g
where
g.group_id in (select group_id from gropus_persons where person_id = 1)
and g.group_id in (select group_id from gropus_persons where person_id = 2)
and g.group_id in (select group_id from gropus_persons where person_id = 3)
and not exists (select 1 from gropus_persons where group_id = g.group_id and person_id not in (1,2,3))
The problem is that the number of components is variable, so I can only use a dynamically generated query and add a subquery for each component each time I need to find a new group.
Is there a better solution?
Thank you in advance for the help!
You need to group by the "group" and count how many hits you receive. For this, you only need the intersection table:
select GroupID, count(*) as MemberCount
from GroupsPersons
where PersonID in( 1, 2, 3 )
group by GroupID
having count(*) = 3;
The problem comes with making this query suitable for a varying list of person id values. As you seem to already realize, this will require dynamic SQL; the pseudo-code will look something like this:
stmt := 'select GroupID, count(*) as MemberCount '
|| 'from GroupsPersons '
|| 'where PersonID in( ' || CSVList || ' ) '
|| 'group by GroupID '
|| 'having count(*) = ' || length( CSVList );
The one potential bug you have to be wary of is the same id repeating in the list. For example: CSVList := '1, 2, 3, 2';
This will generate a correct count(*) value of 3, but the having clause will be looking for 4.
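A rough sketch of how the dynamic part could be built from application code, assuming Python with sqlite3 and a Membership(group_id, person_id) table whose pairs are unique. The list is de-duplicated before counting, which sidesteps the repeated-id problem, and the ids are passed as bound parameters instead of being concatenated into the SQL text. The function name find_exact_group is hypothetical. The extra COUNT(*) condition goes one step beyond the query above: it restricts the match to groups made up of exactly the given persons, which is what the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Membership (group_id INTEGER, person_id INTEGER,
                         PRIMARY KEY (group_id, person_id));
INSERT INTO Membership VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 2),
  (3, 1), (3, 2), (3, 3), (3, 4);
""")

def find_exact_group(conn, person_ids):
    """Return ids of groups whose member set equals the given person ids."""
    ids = sorted(set(person_ids))          # drop repeated ids before counting
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT group_id
        FROM Membership
        GROUP BY group_id
        -- every wanted person is a member ...
        HAVING SUM(CASE WHEN person_id IN ({placeholders}) THEN 1 ELSE 0 END) = ?
        -- ... and the group has no other members
           AND COUNT(*) = ?
        ORDER BY group_id
    """
    params = [*ids, len(ids), len(ids)]
    return [row[0] for row in conn.execute(sql, params)]

print(find_exact_group(conn, [1, 2, 3, 2]))  # the repeated 2 is ignored
print(find_exact_group(conn, [2]))
print(find_exact_group(conn, [1, 2]))
```

Only the number of placeholders varies with the input, so the statement text never contains user-supplied values.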
Another solution to consider is to pivot/xpath the set of person IDs, in alpha sequence, into a delimited string, store that string in your groups table, and compare it with your target.
For your example, you'd use Select group_id from groups where personIDs = '1,2,3,'
How about this, I think the schema is the same as yours, not sure:
create table Groups(
group_id int primary key,
group_name varchar(100)
);
create table Persons(
person_id int primary key,
name varchar(100)
);
create table Membership(
group_id int REFERENCES Groups (group_id),
person_id int REFERENCES Persons (person_id)
);
INSERT INTO Persons
VALUES (1, 'p1'),
(2, 'p2'),
(3, 'p2'),
(4, 'p2');
INSERT INTO Groups
VALUES (1, 'group1'),
(2, 'group2');
INSERT INTO Membership
VALUES (1, 1),
(1, 2),
(2, 2),
(1, 3);
Then select:
select p.name, g.group_name
from Persons as p
join Membership as m on p.person_id = m.person_id
join Groups as g on g.group_id = m.group_id
where m.group_id in (1, 2);
Obviously data would need to be adjusted to suit yours.

Sql Query - One to Many

I have three tables in my database.
Employees
Skill
EmployeeSkill
Here is the structure
I am unable to think of a query that would return me results in the following format
Employee [Skill1] [Skill2] [Skill3] ...
Mr.Abc true false true
Mr.Xyz false true true
Here [Skill1], [Skill2] etc. list all the skills for a specific category defined in the Skill table, and the column value (true/false) depends on the records in the EmployeeSkill table. For example, if there is an entry in the table that links an employee with [Skill1], the column shows true, and if there is no entry it shows false.
Additionally, the number of skills selected (and displayed as column headers) can change based on the skillCategory.
Help appreciated.
Use SQL PIVOT function
http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
example from that site:
-- Pivot table with one row and five columns
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days,
[0], [1], [2], [3], [4]
FROM
(SELECT DaysToManufacture, StandardCost
FROM Production.Product) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;
See also this answer for a decent example on pivot:
How to create a pivot query in sql server without aggregate function
UPDATE: I've created a SQL Fiddle for you, showing how to do it. The only issue is that with PIVOT you cannot make the columns dynamic. There is a way to make them dynamic, but only by using dynamic queries (there is an example showing how at http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx, in the comment 'Como hacer PIVOT con consultas dinámicas', i.e. 'How to do PIVOT with dynamic queries').
This is the SQL Fiddle I made for you: http://sqlfiddle.com/#!3/cb979/5/1
Assuming this base code:
CREATE TABLE EMPLOYEES
(
Id int NOT NULL IDENTITY (1, 1),
FirstName varchar(50) NULL,
LastName varchar(50) NULL,
Salary float(53) NULL,
Department varchar(50) NULL
) ON [PRIMARY]
CREATE TABLE Skill
(
Id int NOT NULL IDENTITY (1, 1),
Skill varchar(50) NULL,
) ON [PRIMARY]
CREATE TABLE EmployeeSkill
(
SkillId int,
EmployeeId varchar(50) NULL,
)
INSERT INTO EMPLOYEES (FirstName, LastName, Salary, Department)
VALUES ('Alex','T',200,'IT');
INSERT INTO EMPLOYEES (FirstName, LastName, Salary, Department)
VALUES ('Zed','Bee',300,'IT');
INSERT INTO Skill (Skill) VALUES ('SQL Skill');
INSERT INTO Skill (Skill) VALUES ('HTML Skill');
INSERT INTO Skill (Skill) VALUES ('PHP Skill');
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(1,1);
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(2,1);
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(3,1);
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(1,2);
And this SQL to create the pivot:
SELECT *
FROM
(
SELECT EmployeeID, FirstName + ' ' + LastName as FullName, SkillID, Skill
FROM EmployeeSkill LEFT JOIN Skill ON Skill.ID = SkillID
LEFT JOIN Employees ON Employees.ID = EmployeeID
) AS source
PIVOT
(
COUNT([SkillID])
FOR [Skill] IN ([SQL Skill], [HTML Skill], [PHP Skill])
) as pvt;
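For the dynamic-columns case, one possible sketch is to read the skill names first and generate one conditional-aggregate column per skill from application code. This is not the PIVOT operator: it uses MAX(CASE ...) columns, which give the same 1/0 shape, and it runs on SQLite via Python's sqlite3 module so it can be tried without SQL Server. The helper quote_ident and the trimmed table definitions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (Id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Skill (Id INTEGER PRIMARY KEY, Skill TEXT);
CREATE TABLE EmployeeSkill (SkillId INTEGER, EmployeeId INTEGER);
INSERT INTO Employees VALUES (1, 'Alex', 'T'), (2, 'Zed', 'Bee');
INSERT INTO Skill VALUES (1, 'SQL Skill'), (2, 'HTML Skill'), (3, 'PHP Skill');
INSERT INTO EmployeeSkill VALUES (1, 1), (2, 1), (3, 1), (1, 2);
""")

def quote_ident(name):
    """Quote a value for use as a column alias (doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'

# Step 1: fetch the skills that should become columns.
skills = [row[0] for row in conn.execute("SELECT Skill FROM Skill ORDER BY Id")]

# Step 2: build one 1/0 column per skill; skill names are bound as parameters,
# only the quoted aliases are placed into the SQL text.
columns = ", ".join(
    f"MAX(CASE WHEN s.Skill = ? THEN 1 ELSE 0 END) AS {quote_ident(skill)}"
    for skill in skills
)
query = f"""
SELECT e.FirstName || ' ' || e.LastName AS FullName, {columns}
FROM Employees e
LEFT JOIN EmployeeSkill es ON es.EmployeeId = e.Id
LEFT JOIN Skill s ON s.Id = es.SkillId
GROUP BY e.Id, e.FirstName, e.LastName
ORDER BY e.Id
"""
cursor = conn.execute(query, skills)
headers = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(headers)
for row in rows:
    print(row)
```

Filtering the first query by skill category would change which columns are generated, which is the "dynamic" part the question asks about.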

Resources