SQL Server: how to count values from dynamic columns?
I have data:
+ Subject
___________________
| SubID | SubName |
|-------|---------|
| 1 | English |
|-------|---------|
| 2 | Spanish |
|-------|---------|
| 3 | Korean |
|_______|_________|
+ Student
______________________________________
| StuID | StuName | Gender | SubID |
|---------|---------|--------|--------|
| 1 | David | M | 1,2 |
|---------|---------|--------|--------|
| 2 | Lucy | F | 2,3 |
|_________|_________|________|________|
I want to query result as:
____________________________________
| SubID | SubName | Female | Male |
|--------|---------|--------|------|
| 1 | English | 0 | 1 |
|--------|---------|--------|------|
| 2 | Spanish | 1 | 1 |
|--------|---------|--------|------|
| 3 | Korean | 1 | 0 |
|________|_________|________|______|
This is my query:
SELECT
SubID, SubName, 0 AS Female, 0 AS Male
FROM Subject
I don't know how to replace the 0s with the real counts.

Because you made the mistake of storing CSV data in your tables, we will have to do some SQL Olympics to get your result set. We can join the two tables on the condition that the SubID from the Subject table appears somewhere in the CSV list of IDs in the Student table. Then we aggregate by subject and count the number of males and females.
SELECT
s.SubID,
s.SubName,
COUNT(CASE WHEN st.Gender = 'F' THEN 1 END) Female,
COUNT(CASE WHEN st.Gender = 'M' THEN 1 END) Male
FROM Subject s
LEFT JOIN Student st
ON ',' + CONVERT(varchar(10), st.SubID) + ',' LIKE
'%,' + CONVERT(varchar(10), s.SubID) + ',%'
GROUP BY
s.SubID,
s.SubName;
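The join condition wraps both the CSV list and the ID in commas so that an ID only matches a whole list element, not part of a longer number. A minimal Python sketch of the same string test (the helper name csv_contains is hypothetical and only mirrors the LIKE expression above):

```python
def csv_contains(csv_list: str, sub_id: int) -> bool:
    """Mimic: ',' + csv_list + ',' LIKE '%,' + sub_id + ',%'."""
    # Wrapping both sides in commas forces a match on whole elements only.
    return f",{sub_id}," in f",{csv_list},"


# A plain substring test gives a false positive: "1" occurs inside "11".
naive_match = "1" in "11,12"

# The comma-wrapped test does not.
wrapped_match = csv_contains("11,12", 1)
```

Note that this test assumes the list contains no spaces after the commas, as in the sample data.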
But you would be better off refactoring your table design to normalize the data. Here is an example of a student table which looks a bit better:
+---------+---------+--------+--------+
| StuID | StuName | Gender | SubID |
+---------+---------+--------+--------+
| 1 | David | M | 1 |
+---------+---------+--------+--------+
| 1 | David | M | 2 |
+---------+---------+--------+--------+
| 2 | Lucy | F | 2 |
+---------+---------+--------+--------+
| 2 | Lucy | F | 3 |
+---------+---------+--------+--------+
We could go a bit further and store the student metadata (name, gender) separately from the StuID and SubID relationship. But even just the table above would have avoided the ugly join condition.
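As a rough sketch of that further step, the relationship can live in its own table and the counts then need only plain equality joins. This uses Python's sqlite3 module with an in-memory SQLite database rather than SQL Server, purely so it runs anywhere; the junction table name StudentSubject is an assumption, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Subject (SubID INTEGER PRIMARY KEY, SubName TEXT);
    CREATE TABLE Student (StuID INTEGER PRIMARY KEY, StuName TEXT, Gender TEXT);
    -- Hypothetical junction table: one row per (student, subject) pair.
    CREATE TABLE StudentSubject (
        StuID INTEGER REFERENCES Student(StuID),
        SubID INTEGER REFERENCES Subject(SubID),
        PRIMARY KEY (StuID, SubID)
    );
    INSERT INTO Subject VALUES (1, 'English'), (2, 'Spanish'), (3, 'Korean');
    INSERT INTO Student VALUES (1, 'David', 'M'), (2, 'Lucy', 'F');
    INSERT INTO StudentSubject VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

# Same conditional aggregation as before, but with simple equality joins.
rows = conn.execute("""
    SELECT s.SubID, s.SubName,
           COUNT(CASE WHEN st.Gender = 'F' THEN 1 END) AS Female,
           COUNT(CASE WHEN st.Gender = 'M' THEN 1 END) AS Male
    FROM Subject s
    LEFT JOIN StudentSubject ss ON ss.SubID = s.SubID
    LEFT JOIN Student st ON st.StuID = ss.StuID
    GROUP BY s.SubID, s.SubName
    ORDER BY s.SubID
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOINs keep subjects that nobody takes, and COUNT ignores the NULLs produced by the CASE expressions, so such subjects show 0 in both columns.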

If your SQL Server version is 2016 or later (with database compatibility level 130 or higher), you can use the STRING_SPLIT function to get the expected results.
create table Subjects
(
SubID int,
SubName varchar(30)
)
insert into Subjects values
(1,'English'),
(2,'Spanish'),
(3,'Korean')
create table student
(
StuID int,
StuName varchar(30),
Gender varchar(10),
SubID varchar(10)
)
insert into student values
(1,'David','M','1,2'),
(2,'Lucy','F','2,3')
--Query
;WITH CTE AS
(
SELECT
S.Gender,
S1.value AS SubID
FROM student S
CROSS APPLY STRING_SPLIT(S.SubID,',') S1
)
select
T.SubID,
T.SubName,
COUNT(CASE T1.Gender WHEN 'F' THEN 1 END) AS Female,
COUNT(CASE T1.Gender WHEN 'M' THEN 1 END) AS Male
from Subjects T
LEFT JOIN CTE T1 ON T.SubID=T1.SubID
GROUP BY T.SubID,T.SubName
ORDER BY T.SubID
--Output
/*
SubID SubName Female Male
----------- ------------------------------ ----------- -----------
1 English 0 1
2 Spanish 1 1
3 Korean 1 0
*/

Related

How to convert column to row in sql without using pivot

I have been assigned to get the data in required format from two tables.
TableStaff :
StaffID | Staff Name
--------+-----------
1 | John
2 | Jack
and TableLead
LeadID | LeadValue | LeadStaus | StaffID
-------+-----------+-----------+--------
1 | 5000 | New | 1
2 | 8000 | Qualified | 1
3 | 3000 | New | 2
As you will notice, StaffID is a foreign key referencing TableStaff.
I have to represent the data in the following format:
StaffID | StaffName | NewLeadCount | QualifiedLeadCount
--------+-----------+--------------+-------------------
1 | John | 1 | 1
2 | Jack | 1 | 0
What I have tried so far:
SELECT
COUNT([LeadID ]) AS LdCount, 'New' AS StageName
FROM
[dbo].[TableLead]
WHERE
[LeadStaus] = 'New'
UNION
SELECT
COUNT([LeadID ]) AS LdCount, 'Qualified' AS StageName
FROM
[dbo].[TableLead]
WHERE
[LeadStaus] = 'Qualified '
Any NULL spots should be replaced by 0. Can anyone point me in the right direction?
I would recommend conditional aggregation:
select s.staffid, s.staffname,
sum(case when l.leadstatus = 'New' then 1 else 0 end) as newLeadCount,
sum(case when l.leadstatus = 'Qualified' then 1 else 0 end) as qualifiedLeadCount
from TableStaff s
inner join TableLead l on l.staffid = s.staffid
group by s.staffid, s.staffname

SSIS Lookup Multiple Columns in one table to the same ID column in another

I have the following table:
EventValue | Person1 | Person2 | Person3 | Person4 | Meta1 | Meta2
-------------------------------------------------------------------------------------------
123 | joePerson01 | samRock01 | nancyDrew01 | steveRogers01 | 505 | 606
321 | steveRogers02 | yoMama01 | ruMo01 | lukeJedi01 | 707 | 808
I want to transform the Person columns into IDs for my destination table, so all of the ID's would be coming from the same Person table in my Destination DB:
ID | FirstName | LastName | DatabaseOneID | DatabaseTwoID
----------------------------------------------------------
1 | Joe | Person | joePerson01 | personJoe01
2 | Sam | Rockwell | samRock01 | rockSam01
3 | Nancy | Drew | nancyDrew01 | drewNancy01
4 | Steve | Rogers | steveRogers01 | rogersSteve01
5 | Steve R | Rogers | steveRogers02 | rogersSteve02
6 | Yo | Mama | yoMama01 | mamaYo01
7 | Rufus | Murdock | ruMo01 | moRu01
8 | Luke | Skywalker | lukeJedi01 | jediLuke01
With results like so:
MetaID | EventValue | Person1ID | Person2ID | Person3ID | Person4ID
------------------------------------------------------------------------
1 | 123 | 1 | 2 | 3 | 4
2 | 321 | 5 | 6 | 7 | 8
I currently have a Lookup Transform looking up the first Person column, but couldn't figure out how to convert all 4 Person columns into IDs within the same lookup.
You could do it in one query, or use UNPIVOT, or use a scalar function if you think that would be more flexible for your implementation. Then you can wrap it in a view for easy access.
Here is a quick example:
DECLARE
@tb1 TABLE
(
EventValue INT
, Person1 VARCHAR(250)
, Person2 VARCHAR(250)
, Person3 VARCHAR(250)
, Person4 VARCHAR(250)
, Meta1 INT
, Meta2 INT
)
DECLARE
@Person TABLE
(
ID INT
, FirstName VARCHAR(250)
, LastName VARCHAR(250)
, DatabaseOneID VARCHAR(250)
, DatabaseTwoID VARCHAR(250)
)
INSERT INTO @tb1
VALUES
(123,'joePerson01','samRock01','nancyDrew01','steveRogers01',505,606),
(321,'steveRogers02','yoMama01','ruMo01','lukeJedi01',707,808)
INSERT INTO @Person
VALUES
(1,'Joe','Person','joePerson01','personJoe01'),
(2,'Sam','Rockwell','samRock01','rockSam01'),
(3,'Nancy','Drew','nancyDrew01','drewNancy01'),
(4,'Steve','Rogers','steveRogers01','rogersSteve01'),
(5,'SteveR','Rogers','steveRogers02','rogersSteve02'),
(6,'Yo','Mama','yoMama01','mamaYo01'),
(7,'Rufus','Murdock','ruMo01','moRu01'),
(8,'Luke','Skywalker','lukeJedi01','jediLuke01')
SELECT ROW_NUMBER() OVER(ORDER BY EventValue) AS MetaID, *
FROM (
SELECT
t.EventValue
, MAX(CASE WHEN t.Person1 IN(p.DatabaseOneID, p.DatabaseTwoID) THEN p.ID ELSE NULL END) AS Person1ID
, MAX(CASE WHEN t.Person2 IN(p.DatabaseOneID, p.DatabaseTwoID) THEN p.ID ELSE NULL END) AS Person2ID
, MAX(CASE WHEN t.Person3 IN(p.DatabaseOneID, p.DatabaseTwoID) THEN p.ID ELSE NULL END) AS Person3ID
, MAX(CASE WHEN t.Person4 IN(p.DatabaseOneID, p.DatabaseTwoID) THEN p.ID ELSE NULL END) AS Person4ID
FROM @tb1 t
LEFT JOIN @Person p
ON p.DatabaseOneID IN(t.Person1, t.Person2, t.Person3, t.Person4)
OR p.DatabaseTwoID IN(t.Person1, t.Person2, t.Person3, t.Person4)
GROUP BY t.EventValue
) D
"I currently have a Lookup Transform looking up the first Person column, but couldn't figure out how to convert all 4 Person columns into IDs within the same lookup."
You cannot do this within a single lookup; you have to add a Lookup Transformation for each column. In your case you should add 4 Lookup Transformations.
If the source and destination databases are on the same server, you can use a SQL query as shown in the other answer. If each database is on a separate server, you have to either use Lookup Transformations or import the data into a staging table and perform the joins in SQL.

How to Select all Entrys from one Table, and SUM a subset of another table

I have a larger Database with Times that employees entered. They enter an activity, when it was and how long they spent on it, as well as a customer.
I'm now trying to return a table with all employees that sums their times, but only the times booked for a subset of customers. I can get either a table with the correct times, in which employees who didn't enter any time are omitted, or a table with all employees but with the summed time from all customers.
The tables I have are:
EMPLOYEE for the employees
ACTIVITY for all activities
CUSTOMER for the customers
To have some "example Data":
| EMPLOYEE | | ACTIVITY |
+------------+---------+ +------------+------------+------------+
| I_EMPLOYEE | S_NAME1 | | I_EMPLOYEE | I_CUSTOMER | N_DURETIME |
+------------+---------+ +------------+------------+------------+
| 1 | A | | 1 | 1 | 5 |
| 2 | B | | 2 | 3 | 10 |
| 3 | C | | 1 | 3 | 15 |
+------------+---------+ | 3 | 2 | 10 |
| 1 | 2 | 10 |
+------------+------------+------------+
What I'd expect to get when I want all times except Customer 2:
+----------+----------+
| EMPLOYEE | DURETIME |
+----------+----------+
| 1 | 20 |
| 2 | 10 |
| 3 | - |
+----------+----------+
Instead I get one of these two results:
+----------+----------+ +----------+----------+
| EMPLOYEE | DURETIME | | EMPLOYEE | DURETIME |
+----------+----------+ +----------+----------+
| 1 | 20 | | 1 | 30 |
| 2 | 10 | | 2 | 10 |
+----------+----------+ | 3 | 10 |
+----------+----------+
To get the correct times I use the following:
SELECT emp.S_NAME1 AS Mitarbeiter, SUM(act.N_DURETIME)/60 as Zeit
FROM EMPLOYEE AS emp
LEFT JOIN ACTIVITY AS act on act.I_EMPLOYEE = emp.I_EMPLOYEE
LEFT JOIN CUSTOMER AS cust on cust.I_CUSTOMER = act.I_CUSTOMER
WHERE cust.CUSTNO <> '2'
GROUP BY emp.S_NAME1
To get the full list of employees I used:
SELECT emp.S_NAME1 AS Mitarbeiter, SUM(act.N_DURETIME)/60 as Zeit
FROM EMPLOYEE AS emp
LEFT JOIN ACTIVITY AS act on act.I_EMPLOYEE = emp.I_EMPLOYEE
LEFT JOIN CUSTOMER AS cust on cust.I_CUSTOMER = act.I_CUSTOMER AND cust.CUSTNO <> '2'
GROUP BY emp.S_NAME1
So, depending on whether I put my "Customer Filter" in the JOIN or the WHERE statement, I get half of the correct table. How can I combine those to get the correct output?
Create Table #emp
(
i_emp Int,
s_name1 Char(1)
)
Insert Into #emp Values
(1,'A'),
(2,'B'),
(3,'C')
Create Table #Activity
(
i_emp Int,
i_cust Int,
n_duretime Int
)
Insert Into #Activity Values
(1,1,5),
(2,3,10),
(1,3,15),
(3,2,10),
(1,2,10)
Query
Select
e.i_emp,
Sum(Case When a.i_cust = 2 Then Null Else a.n_duretime End) As durationTot
From
#emp e Left Join
#Activity a On e.i_emp = a.i_emp
Group By
e.i_emp
Result:
i_emp durationTot
1 20
2 10
3 NULL
You can try the following query:
create table Employee(I_EMPLOYEE int, S_NAME1 char(1))
insert into Employee Values (1, 'A'),(2, 'B'),(3, 'C')
create table ACTIVITY (I_EMPLOYEE int, I_CUSTOMER int, N_DURETIME int)
insert into ACTIVITY Values(1, 1, 5 ),( 2, 3, 10), (1, 3, 15), ( 3, 2, 10), ( 1 , 2 , 10 )
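-- Start from Employee and LEFT JOIN so employees without any activity are kept;
-- I_EMPLOYEE is selected so the result matches the output shown below.
select I_EMPLOYEE, EMPLOYEE, sum(isnull(DURETIME, 0)) as DURETIME from(
select Employee.I_EMPLOYEE, Employee.S_NAME1 as EMPLOYEE, case I_CUSTOMER when 2 then 0 else N_DURETIME end as DURETIME from Employee
left join ACTIVITY on ACTIVITY.I_EMPLOYEE = Employee.I_EMPLOYEE
)a group by I_EMPLOYEE, EMPLOYEE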
Below is the output
I_EMPLOYEE EMPLOYEE DURETIME
--------------------------------
1 A 20
2 B 10
3 C 0

Get all categories with number of associated records with where clause

So I have two tables:
Categories
-------------------
| Id | Name |
-------------------
| 1 | Category1 |
-------------------
| 2 | Category2 |
-------------------
| 3 | Category3 |
-------------------
Products
--------------------------------------------
| Id | CategoryId | Name | CreatedDate |
--------------------------------------------
| 1 | 1 | Product1 | 2017-05-05 |
--------------------------------------------
| 1 | 1 | Product2 | 2017-05-06 |
--------------------------------------------
| 2 | 2 | Product3 | 2017-12-21 |
--------------------------------------------
I need a query to select all categories along with the number of products for each for a specific time range in which those products were created (CreatedDate).
What I currently have is this:
SELECT c.[Name], COUNT(p.[Id]) AS ProductCount
FROM Categories AS c
LEFT JOIN Products AS p ON p.[CategoryId] = c.[Id]
WHERE p.[CreatedDate] BETWEEN '2017-05-01' AND '2017-06-01'
GROUP BY c.[Name]
My issue is that I'm not seeing Category2 and Category3 in the results set because they don't pass the WHERE clause. I want to see all categories in the results set.
Move the WHERE condition into the ON clause of the LEFT JOIN:
SELECT c.[Name], COUNT(p.[Id]) AS ProductCount
FROM Categories AS c
LEFT JOIN Products AS p ON p.[CategoryId] = c.[Id]
AND p.[CreatedDate] BETWEEN '2017-05-01' AND '2017-06-01'
GROUP BY c.[Name]
This way it is applied to the join only and not to the complete result set.
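The difference is easy to see by running both variants side by side. The sketch below uses Python's sqlite3 module with an in-memory SQLite database (an assumption made so it is runnable without SQL Server; the join semantics shown are the same) and the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Categories (Id INTEGER, Name TEXT);
    CREATE TABLE Products (Id INTEGER, CategoryId INTEGER, Name TEXT, CreatedDate TEXT);
    INSERT INTO Categories VALUES (1, 'Category1'), (2, 'Category2'), (3, 'Category3');
    INSERT INTO Products VALUES
        (1, 1, 'Product1', '2017-05-05'),
        (1, 1, 'Product2', '2017-05-06'),
        (2, 2, 'Product3', '2017-12-21');
""")

# The two queries differ only in the keyword that introduces the date filter.
template = """
    SELECT c.Name, COUNT(p.Id) AS ProductCount
    FROM Categories AS c
    LEFT JOIN Products AS p ON p.CategoryId = c.Id
    {keyword} p.CreatedDate BETWEEN '2017-05-01' AND '2017-06-01'
    GROUP BY c.Name
    ORDER BY c.Name
"""

# WHERE runs after the join, so rows with NULL or out-of-range dates are dropped.
filter_in_where = conn.execute(template.format(keyword="WHERE")).fetchall()

# AND extends the ON clause, so unmatched categories survive with NULL products.
filter_in_on = conn.execute(template.format(keyword="AND")).fetchall()

print(filter_in_where)
print(filter_in_on)
```

With the filter in WHERE only Category1 is returned; with the filter in ON all three categories come back, and the ones without matching products get a count of 0 because COUNT(p.Id) skips NULLs.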

How do you create a query which returns dynamic column names in Postgresql?

I have two tables in a reporting database, one for orders, and one for order items. Each order can have multiple order items, along with a quantity for each:
Orders
+----------+---------+
| order_id | email |
+----------+---------+
| 1 | 1@1.com |
+----------+---------+
| 2 | 2@2.com |
+----------+---------+
| 3 | 3@3.com |
+----------+---------+
Order Items
+---------------+----------+----------+--------------+
| order_item_id | order_id | quantity | product_name |
+---------------+----------+----------+--------------+
| 1 | 1 | 1 | Tee Shirt |
+---------------+----------+----------+--------------+
| 2 | 1 | 3 | Jeans |
+---------------+----------+----------+--------------+
| 3 | 1 | 1 | Hat |
+---------------+----------+----------+--------------+
| 4 | 2 | 2 | Tee Shirt |
+---------------+----------+----------+--------------+
| 5 | 3 | 3 | Tee Shirt |
+---------------+----------+----------+--------------+
| 6 | 3 | 1 | Jeans |
+---------------+----------+----------+--------------+
For reporting purposes, I'd love to denormalise this data into a separate PostgreSQL view (or just run a query) that turns the data above into something like this:
+----------+---------+-----------+-------+-----+
| order_id | email | Tee Shirt | Jeans | Hat |
+----------+---------+-----------+-------+-----+
| 1 | 1@1.com | 1 | 3 | 1 |
+----------+---------+-----------+-------+-----+
| 2 | 2@2.com | 2 | 0 | 0 |
+----------+---------+-----------+-------+-----+
| 3 | 3@3.com | 3 | 1 | 0 |
+----------+---------+-----------+-------+-----+
That is, each cell is the sum of the quantity of that product within the order, and the product names become the column titles. Do I need to use something like crosstab to do this, or is there a clever way using subqueries, even if I don't know the list of distinct product names before the query runs?
This is one possible answer:
create table orders
(
orders_id int PRIMARY KEY,
email text NOT NULL
);
create table orders_items
(
order_item_id int PRIMARY KEY,
orders_id int REFERENCES orders(orders_id) NOT NULL,
quantity int NOT NULL,
product_name text NOT NULL
);
insert into orders VALUES (1, '1@1.com');
insert into orders VALUES (2, '2@2.com');
insert into orders VALUES (3, '3@3.com');
insert into orders_items VALUES (1,1,1,'T-Shirt');
insert into orders_items VALUES (2,1,3,'Jeans');
insert into orders_items VALUES (3,1,1,'Hat');
insert into orders_items VALUES (4,2,2,'T-Shirt');
insert into orders_items VALUES (5,3,3,'T-Shirt');
insert into orders_items VALUES (6,3,1,'Jeans');
select
orders.orders_id,
email,
COALESCE(tshirt.quantity, 0) as "T-Shirts",
COALESCE(jeans.quantity,0) as "Jeans",
COALESCE(hat.quantity, 0) as "Hats"
from
orders
left join (select orders_id, quantity from orders_items where product_name = 'T-Shirt')
as tshirt ON (tshirt.orders_id = orders.orders_id)
left join (select orders_id, quantity from orders_items where product_name = 'Jeans')
as jeans ON (jeans.orders_id = orders.orders_id)
left join (select orders_id, quantity from orders_items where product_name = 'Hat')
as hat ON (hat.orders_id = orders.orders_id)
;
Tested with postgresql. Result:
orders_id | email | T-Shirts | Jeans | Hats
-----------+---------+----------+-------+------
1 | 1@1.com | 1 | 3 | 1
2 | 2@2.com | 2 | 0 | 0
3 | 3@3.com | 3 | 1 | 0
(3 rows)
Based on your comment, you can try the tablefunc extension like this:
CREATE EXTENSION tablefunc;
-- The output columns must be listed in the same order as the category query
-- returns them (sorted by name: Hat, Jeans, T-Shirt).
SELECT * FROM crosstab
(
'SELECT orders_id, product_name, quantity FROM orders_items ORDER BY 1',
'SELECT DISTINCT product_name FROM orders_items ORDER BY 1'
)
AS
(
orders_id text,
Hat text,
Jeans text,
TShirt text
);
But I think you are thinking about SQL the wrong way. You usually know which columns you want and have to tell SQL what they are. "Rotating" a table 90 degrees is not part of SQL and should be avoided.
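If the column list really is unknown in advance, one common workaround is to build the pivot query in the application: first fetch the distinct product names, then generate one conditional-aggregation column per name. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module with an in-memory SQLite database instead of PostgreSQL, the table and column names follow the question, and quote_ident is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE order_items (
        order_item_id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        quantity INTEGER NOT NULL,
        product_name TEXT NOT NULL
    );
    INSERT INTO orders VALUES (1, '1@1.com'), (2, '2@2.com'), (3, '3@3.com');
    INSERT INTO order_items VALUES
        (1, 1, 1, 'Tee Shirt'), (2, 1, 3, 'Jeans'), (3, 1, 1, 'Hat'),
        (4, 2, 2, 'Tee Shirt'), (5, 3, 3, 'Tee Shirt'), (6, 3, 1, 'Jeans');
""")


def quote_ident(name: str) -> str:
    """Quote a value for use as a column alias (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'


# Step 1: discover the pivot columns at run time.
products = [row[0] for row in conn.execute(
    "SELECT DISTINCT product_name FROM order_items ORDER BY product_name")]

# Step 2: one SUM(CASE ...) column per product; names are bound as parameters.
pivot_cols = ",\n       ".join(
    f"COALESCE(SUM(CASE WHEN i.product_name = ? THEN i.quantity END), 0) AS {quote_ident(p)}"
    for p in products)

sql = f"""
SELECT o.order_id AS order_id, o.email AS email,
       {pivot_cols}
FROM orders o
LEFT JOIN order_items i ON i.order_id = o.order_id
GROUP BY o.order_id, o.email
ORDER BY o.order_id
"""

cur = conn.execute(sql, products)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(columns)
for row in rows:
    print(row)
```

The product names are passed as bound parameters for the comparisons and quoted when used as column aliases, so unusual names cannot break the generated statement. The result columns come out in alphabetical order of product name.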
