Sql Query - One to Many - sql-server

I have three tables in my database.
Employees
Skill
EmployeeSkill
Here is the structure
I am unable to think of a query that would return me results in the following format
Employee [Skill1] [Skill2] [Skill3] ...
Mr.Abc true false true
Mr.Xyz false true true
Where [Skill1], [Skill2] etc will list all the skills for a specific category defined inside skill table and the column value (true/false) will depend on records in EmployeeSkill table. For example if there is an entry in the table that links Employee with [skill1], it would list true and if there is no entry it will list false.
Additionally, the number of skills selected (and displayed as column headers) can change based on the skillCategory.
Help appreciated.

Use SQL PIVOT function
http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
example from that site:
-- Pivot table with one row and five columns
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days,
[0], [1], [2], [3], [4]
FROM
(SELECT DaysToManufacture, StandardCost
FROM Production.Product) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;
See also this answer for a decent example on pivot:
How to create a pivot query in sql server without aggregate function
UPDATE: I've created a SQL Fiddle for you, showing how to do it. The only issue is that with PIVOT alone you cannot make the columns dynamic. There is a way to make them dynamic, but only by using dynamic queries (there is an example showing how here: http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx -> comment 'Como hacer PIVOT con consultas dinámicas').
This is the SQL Fiddle I made for you: http://sqlfiddle.com/#!3/cb979/5/1
Assuming this base code:
CREATE TABLE EMPLOYEES
(
Id int NOT NULL IDENTITY (1, 1),
FirstName varchar(50) NULL,
LastName varchar(50) NULL,
Salary float(53) NULL,
Department varchar(50) NULL
) ON [PRIMARY]
CREATE TABLE Skill
(
Id int NOT NULL IDENTITY (1, 1),
Skill varchar(50) NULL,
) ON [PRIMARY]
CREATE TABLE EmployeeSkill
(
SkillId int,
EmployeeId varchar(50) NULL,
)
INSERT INTO EMPLOYEES (FirstName, LastName, Salary, Department)
VALUES ('Alex','T',200,'IT');
INSERT INTO EMPLOYEES (FirstName, LastName, Salary, Department)
VALUES ('Zed','Bee',300,'IT');
INSERT INTO Skill (Skill) VALUES ('SQL Skill');
INSERT INTO Skill (Skill) VALUES ('HTML Skill');
INSERT INTO Skill (Skill) VALUES ('PHP Skill');
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(1,1);
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(2,1);
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(3,1);
INSERT INTO EmployeeSkill (SkillID, EmployeeID) VALUES(1,2);
And this SQL to create the pivot:
SELECT *
FROM
(
SELECT EmployeeID, FirstName + ' ' + LastName as FullName, SkillID, Skill
FROM EmployeeSkill LEFT JOIN Skill ON Skill.ID = SkillID
LEFT JOIN Employees ON Employees.ID = EmployeeID
) AS source
PIVOT
(
COUNT([SkillID])
FOR [Skill] IN ([SQL Skill], [HTML Skill], [PHP Skill])
) as pvt;
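The dynamic-query approach mentioned above can be sketched roughly as follows. This is only an illustration built on the sample tables from this answer, not the code from the linked comment: the variable names @cols and @sql are made up here, and the source query starts from Employees so that people without any skill still appear.

```sql
-- Sketch only: build the IN (...) column list from the Skill table at run time.
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- QUOTENAME wraps each skill name in brackets, so spaces and odd characters are safe.
-- A category filter (hypothetical column, not in the schema above) would go in this SELECT.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(Skill)
    FROM Skill
    ORDER BY Id
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '');

-- Same pivot as the static version, with the column list spliced in.
SET @sql = N'
SELECT *
FROM
(
    SELECT e.Id AS EmployeeID, e.FirstName + '' '' + e.LastName AS FullName, es.SkillId, s.Skill
    FROM Employees e
    LEFT JOIN EmployeeSkill es ON es.EmployeeId = e.Id
    LEFT JOIN Skill s ON s.Id = es.SkillId
) AS source
PIVOT
(
    COUNT(SkillId)
    FOR Skill IN (' + @cols + N')
) AS pvt;';

EXEC sys.sp_executesql @sql;
```

Each skill column then holds 1 or 0; wrapping the result in a CASE expression would turn that into true/false if needed.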

Related

Select all Main table rows with detail table column constraints with GROUP BY

I've 2 tables tblMain and tblDetail on SQL Server that are linked with tblMain.id=tblDetail.OrderID for orders usage. I've not found exactly the same situation in StackOverflow.
Here below is the sample table design:
/* create and populate tblMain: */
CREATE TABLE tblMain (
ID int IDENTITY(1,1) NOT NULL,
DateOrder datetime NULL,
CONSTRAINT PK_tblMain PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblMain (DateOrder) VALUES('2021-05-20T12:12:10');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-21T09:13:13');
INSERT INTO tblMain (DateOrder) VALUES('2021-05-22T21:30:28');
GO
/* create and populate tblDetail: */
CREATE TABLE tblDetail (
ID int IDENTITY(1,1) NOT NULL,
OrderID int NULL,
Gencod VARCHAR(255),
Quantity float,
Price float,
CONSTRAINT PK_tblDetail PRIMARY KEY
(
ID ASC
)
)
GO
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '1234567890123', 8, 12.30);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(1, '5825867890321', 2, 2.88);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '7788997890333', 1, 1.77);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9882254656215', 3, 5.66);
INSERT INTO tblDetail (OrderID, Gencod, Quantity, Price) VALUES(3, '9665464654654', 4, 10.64);
GO
Here is my SELECT with grouping:
SELECT tblMain.id,SUM(tblDetail.Quantity*tblDetail.Price) AS TotalPrice
FROM tblMain LEFT JOIN tblDetail ON tblMain.id=tblDetail.orderid
WHERE (tblDetail.Quantity<>0) GROUP BY tblMain.id;
GO
This gives:
The wished output:
We see that id=2 is not shown even with LEFT JOIN, as there are no records with OrderID=2 in tblDetail.
How can I design a new query that shows tblMain.id = 2? Meanwhile, I must keep the WHERE (tblDetail.Quantity<>0) constraint. Many thanks.
EDIT:
The above query serves as CTE (Common Table Expression) for a main query that takes into account payments table tblPayments again.
After testing, both solutions work.
In my case, the main table has 15K records, while the detail table has several million. With (tblDetail.Quantity<>0 OR tblDetail.Quantity IS NULL) AND tblDetail.IsActive=1 added to the JOIN ON clause, the query takes 37s to run, while with @pwilcox's first solution, where the condition is added to the WHERE clause, it finishes in 29s. That is a time saving of roughly 20%.
The tblDetail.IsActive column lets me ignore detail rows that are temporarily disabled by setting it to false.
So the solution for me is @pwilcox's answer:
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
Change
WHERE (tblDetail.Quantity<>0)
to
where (tblDetail.quantity <> 0 or tblDetail.quantity is null)
as the former will omit id = 2, because the corresponding quantity is null in a left join when no detail row matches.
And as HABO mentions, you can also make the condition part of your join logic instead of your WHERE clause, avoiding the need for the 'or' condition.
select m.id,
totalPrice = sum(d.quantity * d.price)
from tblMain m
left join tblDetail d
on m.id = d.orderid
and d.quantity <> 0
group by m.id;
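Note that the two variants are not strictly equivalent. The sketch below uses made-up table variables (@Main, @Detail) with one order whose only detail row has a quantity of 0, which is an assumption not present in the sample data above, to show where they differ.

```sql
-- Hypothetical data: order 2 has no detail rows, order 3 only has a zero-quantity row.
DECLARE @Main TABLE (ID int PRIMARY KEY);
DECLARE @Detail TABLE (OrderID int, Quantity float, Price float);

INSERT INTO @Main (ID) VALUES (1), (2), (3);
INSERT INTO @Detail (OrderID, Quantity, Price) VALUES (1, 8, 12.30), (3, 0, 5.00);

-- WHERE variant: ids 1 and 2 are returned, but id 3 is dropped, because its
-- joined row has Quantity = 0 (neither <> 0 nor NULL).
SELECT m.ID, SUM(d.Quantity * d.Price) AS TotalPrice
FROM @Main m
LEFT JOIN @Detail d ON m.ID = d.OrderID
WHERE d.Quantity <> 0 OR d.Quantity IS NULL
GROUP BY m.ID;

-- ON variant: ids 1, 2 and 3 are all returned; 2 and 3 get a NULL total,
-- which COALESCE turns into 0 here.
SELECT m.ID, COALESCE(SUM(d.Quantity * d.Price), 0) AS TotalPrice
FROM @Main m
LEFT JOIN @Detail d ON m.ID = d.OrderID AND d.Quantity <> 0
GROUP BY m.ID;
```

If every row of the main table must appear in the result, the ON variant is the safer of the two.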

JSON_VALUE is not working in where clause in SQL Server

I have JSON data in a column in my table. I am trying to apply where condition on the JSON column and fetch records.
Employee table:
Here is my SQL query:
SELECT ID, EMP_NAME
FROM EMPLOYEE
WHERE JSON_VALUE(TEAM, '$') IN (2, 3, 4, 5, 7, 10)
I am getting an empty result when I use this query. Any help on how to do this?
You need to parse the JSON in the TEAM column with OPENJSON():
Table:
CREATE TABLE EMPLOYEE (
ID int,
EMP_NAME varchar(50),
TEAM varchar(1000)
)
INSERT INTO EMPLOYEE (ID, EMP_NAME, TEAM)
VALUES
(1, 'Name1', '[2,11]'),
(2, 'Name2', '[2,3,4,5,7,10]'),
(3, 'Name3', NULL)
Statement:
SELECT DISTINCT e.ID, e.EMP_NAME
FROM EMPLOYEE e
CROSS APPLY OPENJSON(e.TEAM) WITH (TEAM int '$') j
WHERE j.TEAM IN (2,3,4,5,7,10)
Result:
ID EMP_NAME
1 Name1
2 Name2
As an additional option, if you want to get the matches as an aggregated text, you may use the following statement (SQL Server 2017 is needed):
SELECT e.ID, e.EMP_NAME, a.TEAM
FROM EMPLOYEE e
CROSS APPLY (
SELECT STRING_AGG(TEAM, ',') AS TEAM
FROM OPENJSON(e.TEAM) WITH (TEAM int '$')
WHERE TEAM IN (2,3,4,5,7,10)
) a
WHERE a.TEAM IS NOT NULL
Result:
ID EMP_NAME TEAM
1 Name1 2
2 Name2 2,3,4,5,7,10
JSON_VALUE returns a scalar value, not a data set, which you appear to think it would. If you run SELECT JSON_VALUE('[2,3,4,5,7,10]','$') you'll see that it returns NULL, so yes, no rows will be returned.
You need to treat the JSON like a data set, not a single value:
SELECT ID, EMP_NAME
FROM EMPLOYEE E
WHERE EXISTS (SELECT 1
FROM OPENJSON (E.TEAM) OJ
WHERE OJ.Value IN (2,3,4,5,7,10))
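To see the difference between the scalar and the data-set view of the same JSON, a small experiment along these lines may help. The variable @team is just an illustration, and the JSON functions assume SQL Server 2016 or later.

```sql
DECLARE @team varchar(1000) = '[2,3,4,5,7,10]';

-- JSON_VALUE can only extract a single scalar:
-- '$' points at the whole array, so the result is NULL (default lax mode);
-- '$[0]' points at the first element, so the result is 2.
SELECT JSON_VALUE(@team, '$')    AS WholeArray,
       JSON_VALUE(@team, '$[0]') AS FirstElement;

-- OPENJSON without a WITH clause returns one row per array element,
-- with the columns [key], [value] and [type], which can then be filtered.
SELECT [key], [value], [type]
FROM OPENJSON(@team);
```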

How to group by with NULL values

What would be the simplest way to group by when there are NULL values?
declare @MyTable Table (ID int, Name varchar(50),Coverage varchar(50), Premium money)
insert into @MyTable values (1,'Robert', 'AutoBI', 100),
(1,'Robert', NULL, 300),
(2,'Neill','AutoBIPD',150),
(2,'Neill','AutoBI',200),
(3,'Kim', 'Collision',50),
(3,'Kim',NULL,100),
(4,'Rick','AutoBI',70),
(5,'Lukasz','Comprehensive',50),
(5,'Lukasz','NULL',25)
select ID,
Name,
Coverage,
sum(Premium) as Premium
from @MyTable
group by ID
,Name
,Premium
,Coverage
The outcome looks like this:
As you can see there is NULL value for name 'Robert'.
How can I have summed premium ($400) and only one line without NULL Coverage?
But I need to make it look like this:
I cannot use MAX() function in this case.
This solution assumes that NULL will be grouped with one "random" NOT NULL value within the same ID/Name. If more than one such value is possible, this query won't return stable result sets between executions:
select ID,
Name,
ISNULL(m1.Coverage, sub.Coverage) AS Coverage,
sum(Premium) as Premium
FROM @MyTable m1
cross apply (SELECT TOP 1 m2.Coverage FROM @MyTable m2 WHERE Coverage IS NOT NULL
AND m1.ID = m2.ID AND m1.Name = m2.Name) sub
group by ID
,Name
,ISNULL(m1.Coverage, sub.Coverage);
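If stable results matter, one possible variation is to give the TOP (1) an explicit ORDER BY, so the same non-NULL Coverage is chosen on every run. The sketch below uses a cut-down copy of the sample data; ordering by Coverage alphabetically is an arbitrary choice made here, not something the question asks for.

```sql
DECLARE @MyTable TABLE (ID int, Name varchar(50), Coverage varchar(50), Premium money);

INSERT INTO @MyTable (ID, Name, Coverage, Premium) VALUES
    (1, 'Robert', 'AutoBI', 100),
    (1, 'Robert', NULL, 300),
    (2, 'Neill', 'AutoBIPD', 150),
    (2, 'Neill', 'AutoBI', 200);

SELECT m1.ID,
       m1.Name,
       ISNULL(m1.Coverage, sub.Coverage) AS Coverage,
       SUM(m1.Premium) AS Premium
FROM @MyTable m1
CROSS APPLY (
    -- ORDER BY makes the picked value deterministic (alphabetically first).
    SELECT TOP (1) m2.Coverage
    FROM @MyTable m2
    WHERE m2.Coverage IS NOT NULL
      AND m2.ID = m1.ID
      AND m2.Name = m1.Name
    ORDER BY m2.Coverage
) sub
GROUP BY m1.ID, m1.Name, ISNULL(m1.Coverage, sub.Coverage);
```

With this data, Robert's two rows collapse into a single AutoBI row with a premium of 400, while Neill keeps one row per coverage. Switching CROSS APPLY to OUTER APPLY would also keep people whose only rows have a NULL Coverage.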

T-SQL prepare dynamic COALESCE

As attached in screenshot, there are two tables.
Configuration:
Detail
Using Configuration and Detail table I would like to populate IdentificationType and IDerivedIdentification column in the Detail table.
Following logic should be used, while deriving above columns
The Configuration table holds the order of preference, which the user can change dynamically (i.e. if the country is Austria, the ID preference should be LEI, then TIN if LEI is blank, then CONCAT; if both are blank, some other logic applies).
In the case of contract ID = 3, the country is BG, so LEI should be checked first; since it's NULL, CCPT = 456 will be picked.
I could have used COALESCE and a CASE statement if hardcoding were allowed.
Can you please suggest an alternative approach?
Regards
Digant
Assuming that this is some horrendous data dump and you are trying to clean it up here is some SQL to throw at it. :) Firstly, I was able to capture your image text via Adobe Acrobat > Excel.
(I also built the schema for you at: http://sqlfiddle.com/#!6/8f404/12)
The correct thing to do is fix the glaring problem, and that's the table structure. Assuming you can't, here's a solution.
The query below unpivots the columns LEI, NIND, CCPT and TIN from the Detail table, as well as FirstPref, SecondPref and ThirdPref from the Configuration table. Doing this helps to normalize the data, although it costs you major performance if there are no plans to fix the data structure or you cannot fix it.
After that you are simply joining Detail.ContactId to DerivedTypes.ContactId, then DerivedPrefs.ISOCountryCode to Detail.CountrylSOCountryCode, with DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType.
If you use an inner join rather than the left join you can remove the RANK() function, but it will not show all ContactIds, only those that have a value in their LEI, NIND, CCPT or TIN columns. I think that's a better solution anyway, because why would you want to see an error mixed into a report? Write a separate report for those with no values in those columns.
Lastly, the TOP (1) with ties lets you display one record per ContactId and still allows the record with the error to display. Hope this helps.
CREATE TABLE Configuration
(ISOCountryCode varchar(2), CountryName varchar(8), FirstPref varchar(6), SecondPref varchar(6), ThirdPref varchar(6))
;
INSERT INTO Configuration
(ISOCountryCode, CountryName, FirstPref, SecondPref, ThirdPref)
VALUES
('AT', 'Austria', 'LEI', 'TIN', 'CONCAT'),
('BE', 'Belgium', 'LEI', 'NIND', 'CONCAT'),
('BG', 'Bulgaria', 'LEI', 'CCPT', 'CONCAT'),
('CY', 'Cyprus', 'LEI', 'NIND', 'CONCAT')
;
CREATE TABLE Detail
(ContactId int, FirstName varchar(1), LastName varchar(3), BirthDate varchar(4), CountrylSOCountryCode varchar(2), Nationality varchar(2), LEI varchar(9), NIND varchar(9), CCPT varchar(9), TIN varchar(9))
;
INSERT INTO Detail
(ContactId, FirstName, LastName, BirthDate, CountrylSOCountryCode, Nationality, LEI, NIND, CCPT, TIN)
VALUES
(1, 'A', 'DES', NULL, 'AT', 'AT', '123', '4345', NULL, NULL),
(2, 'B', 'DEG', NULL, 'BE', 'BE', NULL, '890', NULL, NULL),
(3, 'C', 'DEH', NULL, 'BG', 'BG', NULL, '123', '456', NULL),
(4, 'D', 'DEi', NULL, 'BG', 'BG', NULL, NULL, NULL, NULL)
;
SELECT TOP (1) with ties Detail.ContactId,
FirstName,
LastName,
BirthDate,
CountrylSOCountryCode,
Nationality,
LEI,
NIND,
CCPT,
TIN,
ISNULL(DerivedPrefs.ldentificationType, 'ERROR') ldentificationType,
IDerivedIdentification,
RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END) AS PrefRank
FROM
Detail
LEFT JOIN
(
SELECT
ContactId,
LEI,
NIND,
CCPT,
TIN
FROM Detail
) DetailUNPVT
UNPIVOT
(IDerivedIdentification FOR ldentificationType IN
(LEI, NIND, CCPT, TIN)
)AS DerivedTypes
ON DerivedTypes.ContactId = Detail.ContactId
LEFT JOIN
(
SELECT
ISOCountryCode,
CountryName,
FirstPref,
SecondPref,
ThirdPref
FROM
Configuration
) ConfigUNPIVOT
UNPIVOT
(ldentificationType FOR Pref IN
(FirstPref, SecondPref, ThirdPref)
)AS DerivedPrefs
ON DerivedPrefs.ISOCountryCode = Detail.CountrylSOCountryCode
and DerivedTypes.ldentificationType = DerivedPrefs.ldentificationType
ORDER BY RANK() OVER (PARTITION BY Detail.ContactId ORDER BY
CASE WHEN Pref = 'FirstPref' THEN 1
WHEN Pref = 'SecondPref' THEN 2
WHEN Pref = 'ThirdPref' THEN 3
ELSE 99 END)
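As a possible alternative to the double UNPIVOT, the same preference lookup can be sketched with table value constructors inside an APPLY. This is not the query from the fiddle; the aliases p, v and x and the output column names are invented for illustration, and it assumes the Configuration and Detail tables created above.

```sql
SELECT d.ContactId,
       ISNULL(x.IdType, 'ERROR') AS ldentificationType,
       x.IdValue                 AS IDerivedIdentification
FROM Detail d
LEFT JOIN Configuration c
       ON c.ISOCountryCode = d.CountrylSOCountryCode
OUTER APPLY (
    -- p lists the country's preferences in order, v lists this contact's identifiers.
    SELECT TOP (1) p.IdType, v.IdValue
    FROM (VALUES (1, c.FirstPref), (2, c.SecondPref), (3, c.ThirdPref)) AS p (PrefOrder, IdType)
    JOIN (VALUES ('LEI', d.LEI), ('NIND', d.NIND), ('CCPT', d.CCPT), ('TIN', d.TIN)) AS v (IdType, IdValue)
      ON v.IdType = p.IdType
    WHERE v.IdValue IS NOT NULL
    ORDER BY p.PrefOrder   -- the first preference that has a value wins
) x;
```

For ContactId 3 (country BG, preferences LEI, CCPT, CONCAT) LEI is NULL, so the CCPT value 456 is picked, as the question describes. A preference such as CONCAT that has no matching column is simply skipped, so its "other logic" would still need to be added separately.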

SQL - How To Return A Result Without Multiples

I am trying to write a Business Object report to show a list of the people who have not returned a timesheet on a selected date, but I can't figure out how to stop the SQL query returning multiple entries for individuals.
My Staff_Table contains 2 columns - Employee No & Name
My Timesheet_Table contains, among other things, Employee No & Week_Ending_Date.
I can easily write a statement to return all users who have entered a timesheet with a Week_Ending_Date of e.g. 10/08/2012. However, if I try to return a list of all those who have not entered a timesheet for 10/08/2012, I pick up every single timesheet in the table which does not have that date. So, for example, if a person has submitted 100 timesheets and only 1 of them is for 10/08/2012, the results will show him 99 times.
What I need is a fixed list of everyone on the Staff_Table who has not submitted for that date, showing once only.
I tried a Union with NOT EXISTS but either I'm doing it wrong or it simply isn't appropriate.
Can anyone point me in the right direction?
First select the employee numbers that did submit a timesheet for the chosen date. Then filter the staff list against them using NOT IN.
DECLARE @Week_Ending_Date DATETIME = '2012-08-10'
DECLARE @Staff TABLE
(
EmployeeNo INT NOT NULL,
EmployeeName NVARCHAR(100) NOT NULL
)
DECLARE @TimeSheet TABLE
(
EmployeeNo INT NOT NULL,
Week_Ending_Date DATETIME
)
INSERT INTO @Staff (EmployeeNo, EmployeeName)
VALUES (1, 'Alan'), (2, 'Peter')
INSERT INTO @TimeSheet (EmployeeNo, Week_Ending_Date)
VALUES (1, '2012-08-10'), (1, '2012-08-17'), (2, '2012-08-03')
SELECT
S.EmployeeName
FROM
@Staff S
WHERE
EmployeeNo NOT IN (SELECT EmployeeNo FROM @TimeSheet WHERE Week_Ending_Date = @Week_Ending_Date)
Try adding DISTINCT to your query
ie
SELECT DISTINCT ...
Your query should look something like
SELECT *
FROM Staff
WHERE NOT EXISTS
(
Select EmployeeNo
from Timesheet
where WeekEndingDate='2012-08-10'
and TimeSheet.EmployeeNo = Staff.EmployeeNo
)
or
SELECT *
FROM Staff
WHERE EmployeeNo NOT IN
(
Select EmployeeNo
from Timesheet
where WeekEndingDate='2012-08-10'
)
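One caveat when choosing between the two forms: NOT IN returns no rows at all if the subquery produces a NULL, whereas NOT EXISTS is unaffected. The sketch below uses made-up table variables and assumes, purely for illustration, that Timesheet.EmployeeNo is nullable.

```sql
DECLARE @Staff TABLE (EmployeeNo int NOT NULL, EmployeeName nvarchar(100) NOT NULL);
DECLARE @TimeSheet TABLE (EmployeeNo int NULL, Week_Ending_Date date NOT NULL);

INSERT INTO @Staff (EmployeeNo, EmployeeName) VALUES (1, 'Alan'), (2, 'Peter');
-- Hypothetical bad row: a timesheet with no employee number.
INSERT INTO @TimeSheet (EmployeeNo, Week_Ending_Date) VALUES (1, '20120810'), (NULL, '20120810');

-- NOT IN: returns no rows, because "2 NOT IN (1, NULL)" evaluates to UNKNOWN.
SELECT s.EmployeeName
FROM @Staff s
WHERE s.EmployeeNo NOT IN (SELECT t.EmployeeNo
                           FROM @TimeSheet t
                           WHERE t.Week_Ending_Date = '20120810');

-- NOT EXISTS: returns Peter, the only staff member without a timesheet for that date.
SELECT s.EmployeeName
FROM @Staff s
WHERE NOT EXISTS (SELECT 1
                  FROM @TimeSheet t
                  WHERE t.EmployeeNo = s.EmployeeNo
                    AND t.Week_Ending_Date = '20120810');
```

If the column is declared NOT NULL, both forms return the same rows.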
You could use a not exists clause to find staff that has not submitted a particular timesheet:
select *
from Staff s
where not exists
(
select *
from Timesheet t
where t.EmployeeNo = s.EmployeeNo
and t.Week_Ending_Date = '2012-08-19'
)
Can you not just select all employees who haven't submitted a timesheet and group the results by their name?
select Name
from Staff_Table
left join Timesheet_Table on Staff_Table.[Employee No] = Timesheet_Table.[Employee No] and Timesheet_Table.Week_Ending_Date = '20120810'
-- keep only staff with no matching timesheet for that date
where Timesheet_Table.Week_Ending_Date is null
group by Name
I haven't tested this, but something along these lines.
