How to efficiently self-join so the query does not time out? - sql-server

| id1 | id2 | id3 | id4 | val1 | val2 | age | race | zip |
----------------------------------------------------------
| 11 | 222 | 333 | 44 | 789 | abc | 45 | AA |12345|
| 11 | 222 | 333 | 44 | 567 | def | 45 | AA |12345|
| 11 | 333 | 444 | 44 | 789 | xyz | 30 | AS |23456|
| 22 | 555 | 666 | 77 | 012 | abc | 38 | W |34567|
| 22 | 555 | 666 | 77 | 789 | GHI | 38 | W |34567|
| 34 | 333 | 777 | 99 | 012 | GHI | 75 | W |34567|
I want to get ALL rows for each combination of id1, id2, id3, id4 in which at least one row has val1 = 789. The output should look similar to this:
| id1 | id2 | id3 | id4 | val1 | val2 | age | race | zip |
----------------------------------------------------------
| 11 | 222 | 333 | 44 | 789 | abc | 45 | AA |12345|
| 11 | 222 | 333 | 44 | 567 | def | 45 | AA |12345|
| 11 | 333 | 444 | 44 | 789 | xyz | 30 | AS |23456|
| 22 | 555 | 666 | 77 | 012 | abc | 38 | W |34567|
| 22 | 555 | 666 | 77 | 789 | GHI | 38 | W |34567|
I can get the self-join to work on a small dataset. However, since the table I'm working with is huge, I want an efficient solution that will not time out. Here is the query I'm using:
select t1.*
from t t1
join t t2 on
t1.id1=t2.id1 and
t1.id2=t2.id2 and
t1.id3=t2.id3 and
t1.id4=t2.id4
where t1.val1='789'
Again, I want ALL rows for each ID1 through ID4 combination as long as one of its VAL1 values is '789'.

You don't need to join to any tables - what you want is EXISTS, correlated on all four ID columns:
Select *
From t t1
Where Exists (Select *
From t t2
Where t1.ID1 = t2.ID1
And t1.ID2 = t2.ID2
And t1.ID3 = t2.ID3
And t1.ID4 = t2.ID4
And t2.val1 = '789')
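A minimal runnable sketch of the EXISTS approach, using the sample rows from the question. It assumes an in-memory SQLite database as a stand-in for SQL Server, and the index name ix_t_ids_val1 is hypothetical; whether such an index helps on the real table depends on its size and existing indexes.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id1 INT, id2 INT, id3 INT, id4 INT, "
    "val1 TEXT, val2 TEXT, age INT, race TEXT, zip TEXT)"
)
rows = [
    (11, 222, 333, 44, "789", "abc", 45, "AA", "12345"),
    (11, 222, 333, 44, "567", "def", 45, "AA", "12345"),
    (11, 333, 444, 44, "789", "xyz", 30, "AS", "23456"),
    (22, 555, 666, 77, "012", "abc", 38, "W", "34567"),
    (22, 555, 666, 77, "789", "GHI", 38, "W", "34567"),
    (34, 333, 777, 99, "012", "GHI", 75, "W", "34567"),
]
conn.executemany("INSERT INTO t VALUES (?,?,?,?,?,?,?,?,?)", rows)

# Hypothetical supporting index covering the correlated columns and val1.
conn.execute("CREATE INDEX ix_t_ids_val1 ON t (id1, id2, id3, id4, val1)")

query = """
SELECT t1.*
FROM t AS t1
WHERE EXISTS (
    SELECT 1
    FROM t AS t2
    WHERE t2.id1 = t1.id1
      AND t2.id2 = t1.id2
      AND t2.id3 = t1.id3
      AND t2.id4 = t1.id4
      AND t2.val1 = '789'
)
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Every row of a group is returned once, no matter how many rows in that group have val1 = '789', which is the main difference from a self-join that can multiply rows.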

Related

Avoid multiple left joins in MSSQL

I have the following database structure:
Users
----------------------
| User_ID | Username |
|--------------------|
| 14590 | Sam |
| 14591 | Michael |
| 14592 | Albert |
----------------------
Addresses
----------------------------------------------
| Adr_ID | City | Street |
|--------------------------------------------|
| 62 | New York | Perfect Street 1 |
| 63 | New York | Another Street 12 |
| 64 | Prague | Zlata Ulicka 52 |
| 65 | Berlin | Alexanderplatz 36 |
| 66 | Berlin | Am Bahnhof 49 |
| 67 | Warsaw | Poniatowskiego 74 |
| 68 | Paris | Rue Des Barres 33 |
| 69 | Paris | Rue De L’abreuvoir 63 |
| 70 | Lisbon | Rua Augusta |
----------------------------------------------
Addresses_Link
------------------------------------------------------------
| Link_ID | Adr_ID | User_ID | Main_Address | Address_Type |
|----------------------------------------------------------|
| 570 | 62 | 14590 | 1 | 1 |
| 571 | 63 | 14590 | 1 | 2 |
| 572 | 64 | 14590 | 0 | 3 |
| 573 | 65 | 14591 | 1 | 1 |
| 574 | 66 | 14591 | 1 | 2 |
| 575 | 67 | 14591 | 0 | 3 |
| 576 | 68 | 14592 | 1 | 1 |
| 577 | 69 | 14592 | 1 | 2 |
| 578 | 70 | 14592 | 0 | 3 |
------------------------------------------------------------
The result I want to get:
-----------------------------------------------------------------------------------------------------
| User_ID | Username | Adr_Private_City | Adr_Private_Street | Adr_Job_City | Adr_Job_Street |
|---------------------------------------------------------------------------------------------------|
| 14590 | Sam | New York | Perfect Street 1 | New York | Another Street 12 |
| 14591 | Michael | Berlin | Alexanderplatz 36 | Berlin | Am Bahnhof 49 |
| 14592 | Albert | Paris | Rue Des Barres 33 | Paris | Rue De L’abreuvoir 63 |
-----------------------------------------------------------------------------------------------------
Columns:
Adr_Private_City / Adr_Private_Street - when Main_Address = 1 and Address_Type = 1
Adr_Job_City / Adr_Job_Street - when Main_Address = 1 and Address_Type = 2
I created an SQL query like this:
SELECT
u.User_ID,
u.Username,
a1.City AS Adr_Private_City,
a1.Street AS Adr_Private_Street,
a2.City AS Adr_Job_City,
a2.Street AS Adr_Job_Street
FROM Users u
LEFT JOIN Addresses_Link al1 ON al1.User_ID = u.User_ID
LEFT JOIN Addresses_Link al2 ON al2.User_ID = u.User_ID
LEFT JOIN Addresses a1 ON a1.Adr_ID = al1.Adr_ID
LEFT JOIN Addresses a2 ON a2.Adr_ID = al2.Adr_ID
WHERE
al1.Main_Address = 1 AND al1.Address_Type = 1 AND
al2.Main_Address = 1 AND al2.Address_Type = 2
Is it possible to avoid multiple left joins and make the query not too slow?
You can achieve what you want with the query below.
Your original query
FROM Users u
LEFT JOIN Addresses_Link al1 ON al1.User_ID = u.User_ID
....
WHERE al1.Main_Address = 1
is effectively an INNER JOIN, because the condition al1.Main_Address = 1 in the WHERE clause discards every row where al1 is NULL.
Since you used LEFT JOIN, I have turned it into a true LEFT JOIN query. Addresses_Link and Addresses are joined on Adr_ID, so I use an INNER JOIN between them.
SELECT *
FROM Users u
LEFT JOIN
(
SELECT al.User_ID,
MAX(CASE WHEN al.Address_Type = 1 THEN a.City END) AS Adr_Private_City,
MAX(CASE WHEN al.Address_Type = 1 THEN a.Street END) AS Adr_Private_Street,
MAX(CASE WHEN al.Address_Type = 2 THEN a.City END) AS Adr_Job_City,
MAX(CASE WHEN al.Address_Type = 2 THEN a.Street END) AS Adr_Job_Street
FROM Addresses_Link al
INNER JOIN Addresses a ON a.Adr_ID = al.Adr_ID
WHERE al.Main_Address = 1
AND al.Address_Type IN (1, 2)
GROUP BY al.User_ID
) a ON a.User_ID = u.User_ID
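The conditional-aggregation pivot can be tried end to end with the sample data. This is a sketch that assumes SQLite in place of SQL Server; the user 'Dana' is hypothetical and was added only to show that the LEFT JOIN keeps users without any address.

```python
import sqlite3

# SQLite stands in for SQL Server (an assumption made so the sketch is runnable).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (User_ID INT, Username TEXT);
CREATE TABLE Addresses (Adr_ID INT, City TEXT, Street TEXT);
CREATE TABLE Addresses_Link (Link_ID INT, Adr_ID INT, User_ID INT,
                             Main_Address INT, Address_Type INT);
INSERT INTO Users VALUES (14590, 'Sam'), (14591, 'Michael'), (14592, 'Albert');
INSERT INTO Addresses VALUES
  (62, 'New York', 'Perfect Street 1'), (63, 'New York', 'Another Street 12'),
  (64, 'Prague', 'Zlata Ulicka 52'), (65, 'Berlin', 'Alexanderplatz 36'),
  (66, 'Berlin', 'Am Bahnhof 49'), (67, 'Warsaw', 'Poniatowskiego 74'),
  (68, 'Paris', 'Rue Des Barres 33'), (69, 'Paris', 'Rue De L’abreuvoir 63'),
  (70, 'Lisbon', 'Rua Augusta');
INSERT INTO Addresses_Link VALUES
  (570, 62, 14590, 1, 1), (571, 63, 14590, 1, 2), (572, 64, 14590, 0, 3),
  (573, 65, 14591, 1, 1), (574, 66, 14591, 1, 2), (575, 67, 14591, 0, 3),
  (576, 68, 14592, 1, 1), (577, 69, 14592, 1, 2), (578, 70, 14592, 0, 3);
""")
# Hypothetical user with no linked addresses, not part of the original data.
conn.execute("INSERT INTO Users VALUES (14593, 'Dana')")

query = """
SELECT u.User_ID, u.Username,
       p.Adr_Private_City, p.Adr_Private_Street,
       p.Adr_Job_City, p.Adr_Job_Street
FROM Users u
LEFT JOIN (
    SELECT al.User_ID,
           MAX(CASE WHEN al.Address_Type = 1 THEN a.City END)   AS Adr_Private_City,
           MAX(CASE WHEN al.Address_Type = 1 THEN a.Street END) AS Adr_Private_Street,
           MAX(CASE WHEN al.Address_Type = 2 THEN a.City END)   AS Adr_Job_City,
           MAX(CASE WHEN al.Address_Type = 2 THEN a.Street END) AS Adr_Job_Street
    FROM Addresses_Link al
    INNER JOIN Addresses a ON a.Adr_ID = al.Adr_ID
    WHERE al.Main_Address = 1
      AND al.Address_Type IN (1, 2)
    GROUP BY al.User_ID
) p ON p.User_ID = u.User_ID
ORDER BY u.User_ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Addresses_Link and Addresses are each read once, and the user without addresses still appears with NULL address columns.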

Sum, Group by and Null

I'm dipping my toes into SQL. I have the following table
+------+----+------+------+-------+
| Type | ID | QTY | Rate | Name |
+------+----+------+------+-------+
| B | 1 | 1000 | 21 | Jack |
| B | 2 | 2000 | 12 | Kevin |
| B | 1 | 3000 | 24 | Jack |
| B | 1 | 1000 | 23 | Jack |
| B | 3 | 200 | 13 | Mary |
| B | 2 | 3000 | 12 | Kevin |
| B | 4 | 4000 | 44 | Chris |
| B | 4 | 5000 | 43 | Chris |
| B | 3 | 1000 | 26 | Mary |
+------+----+------+------+-------+
I don't know how I would leverage Sum and Group by to achieve the following result.
+------+----+------+------+-------+------------+
| Type | ID | QTY | Rate | Name | Sum of QTY |
+------+----+------+------+-------+------------+
| B | 1 | 1000 | 21 | Jack | 5000 |
| B | 1 | 3000 | 24 | Jack | Null |
| B | 1 | 1000 | 23 | Jack | Null |
| B | 2 | 2000 | 12 | Kevin | 5000 |
| B | 2 | 3000 | 12 | Kevin | Null |
| B | 3 | 200 | 13 | Mary | 1200 |
| B | 3 | 1000 | 26 | Mary | Null |
| B | 4 | 4000 | 44 | Chris | 9000 |
| B | 4 | 5000 | 43 | Chris | Null |
+------+----+------+------+-------+------------+
Any help is appreciated!
You can use window functions:
select t.*,
(case when row_number() over (partition by type, id order by name) = 1
then sum(qty) over (partition by type, id)
end) as Sum_of_QTY
from table t;
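A runnable sketch of the window-function idea, assuming SQLite 3.25 or later (the first version with window functions) instead of SQL Server. The table name orders is hypothetical, and SQLite's implicit rowid is used as the tie-breaker because the sample data has no column that defines row order; on SQL Server a real ordering column would be needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "orders" is a hypothetical table name; the question does not name the table.
conn.execute("CREATE TABLE orders (type TEXT, id INT, qty INT, rate INT, name TEXT)")
conn.executemany("INSERT INTO orders VALUES (?,?,?,?,?)", [
    ("B", 1, 1000, 21, "Jack"),
    ("B", 2, 2000, 12, "Kevin"),
    ("B", 1, 3000, 24, "Jack"),
    ("B", 1, 1000, 23, "Jack"),
    ("B", 3, 200, 13, "Mary"),
    ("B", 2, 3000, 12, "Kevin"),
    ("B", 4, 4000, 44, "Chris"),
    ("B", 4, 5000, 43, "Chris"),
    ("B", 3, 1000, 26, "Mary"),
])

query = """
SELECT type, id, qty, rate, name,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY type, id ORDER BY rowid) = 1
            THEN SUM(qty) OVER (PARTITION BY type, id)  -- total for the whole group
       END AS sum_of_qty
FROM orders
ORDER BY id, rowid
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The group total is shown only on the first row of each (type, id) group and is NULL on the others, while the detail rows are preserved, which a plain GROUP BY cannot do.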

Lateral flatten two columns without repetition in snowflake

I have a query that groups by two variables to get a total of another. To maintain my table structure for later computations, I listagg() two other variables to save them for the next stage of the query. However, when I attempt two lateral flattens of the listagg() columns, my data is repeated too many times.
Example: my_table
id | list1 | code| list2 | total
--------|-----------------|-----|----------|---
2434166 | 735,768,769,746 | 124 | 21,2,1,6 | 30
select
id,
list1_table.value::int as list1_val,
code,
list2_table.value::int as list2_val,
total
from my_table,
lateral flatten(input=>split(list1, ',')) list1_table,
lateral flatten(input=>split(list2, ',')) list2_table
Result:
id | list1 | code| list2 | total
--------|-----------------|-----|----------|---
2434166 | 768 | 124 | 2 | 30
2434166 | 735 | 124 | 2 | 30
2434166 | 746 | 124 | 2 | 30
2434166 | 769 | 124 | 2 | 30
2434166 | 768 | 124 | 21 | 30
2434166 | 735 | 124 | 21 | 30
2434166 | 746 | 124 | 21 | 30
2434166 | 769 | 124 | 21 | 30
2434166 | 768 | 124 | 6 | 30
2434166 | 735 | 124 | 6 | 30
2434166 | 746 | 124 | 6 | 30
2434166 | 769 | 124 | 6 | 30
2434166 | 768 | 124 | 1 | 30
2434166 | 735 | 124 | 1 | 30
2434166 | 746 | 124 | 1 | 30
2434166 | 769 | 124 | 1 | 30
I understand what is going on, but I'm wondering how to get my desired result:
id | list1 | code| list2 | total
--------|-----------------|-----|----------|---
2434166 | 768 | 124 | 2 | 30
2434166 | 735 | 124 | 21 | 30
2434166 | 746 | 124 | 6 | 30
2434166 | 769 | 124 | 1 | 30
As you noticed yourself, you want 4 records. There are 2 ways to do it; both exploit the index column produced by flatten, which represents the position of the produced value in the input (see the Flatten Documentation).
Using 2 flattens and index-selection
The first way is to take your query and add the index columns. Here's an example:
select id,
list1_table.value::int as list1_val, list1_table.index as list1_index, code,
list2_table.value::int as list2_val, list2_table.index as list2_index, total
from my_table,
lateral flatten(input=>split(list1, ',')) list1_table,
lateral flatten(input=>split(list2, ',')) list2_table;
---------+-----------+-------------+------+-----------+-------------+-------+
ID | LIST1_VAL | LIST1_INDEX | CODE | LIST2_VAL | LIST2_INDEX | TOTAL |
---------+-----------+-------------+------+-----------+-------------+-------+
2434166 | 735 | 0 | 124 | 21 | 0 | 30 |
2434166 | 735 | 0 | 124 | 2 | 1 | 30 |
2434166 | 735 | 0 | 124 | 1 | 2 | 30 |
2434166 | 735 | 0 | 124 | 6 | 3 | 30 |
2434166 | 768 | 1 | 124 | 21 | 0 | 30 |
2434166 | 768 | 1 | 124 | 2 | 1 | 30 |
2434166 | 768 | 1 | 124 | 1 | 2 | 30 |
2434166 | 768 | 1 | 124 | 6 | 3 | 30 |
2434166 | 769 | 2 | 124 | 21 | 0 | 30 |
2434166 | 769 | 2 | 124 | 2 | 1 | 30 |
2434166 | 769 | 2 | 124 | 1 | 2 | 30 |
2434166 | 769 | 2 | 124 | 6 | 3 | 30 |
2434166 | 746 | 3 | 124 | 21 | 0 | 30 |
2434166 | 746 | 3 | 124 | 2 | 1 | 30 |
2434166 | 746 | 3 | 124 | 1 | 2 | 30 |
2434166 | 746 | 3 | 124 | 6 | 3 | 30 |
---------+-----------+-------------+------+-----------+-------------+-------+
As you can see, the rows you are interested in are the ones with the same index.
So you get your result by selecting these rows after the lateral joins happen:
select id,
list1_table.value::int as list1_val, code,
list2_table.value::int as list2_val, total
from my_table,
lateral flatten(input=>split(list1, ',')) list1_table,
lateral flatten(input=>split(list2, ',')) list2_table
where list1_table.index = list2_table.index;
---------+-----------+------+-----------+-------+
ID | LIST1_VAL | CODE | LIST2_VAL | TOTAL |
---------+-----------+------+-----------+-------+
2434166 | 735 | 124 | 21 | 30 |
2434166 | 768 | 124 | 2 | 30 |
2434166 | 769 | 124 | 1 | 30 |
2434166 | 746 | 124 | 6 | 30 |
---------+-----------+------+-----------+-------+
Using 1 flatten + lookup-by-index
An easier, more efficient, and more flexible way (useful if you have multiple arrays like that or e.g. array indices are related but not 1-to-1) is to flatten only one array, and then use the index of the produced elements to look up values in the other arrays.
Here's an example:
select id, list1_table.value::int as list1_val, code,
split(list2,',')[list1_table.index]::int as list2_val, -- array lookup here
total
from my_table, lateral flatten(input=>split(list1, ',')) list1_table;
---------+-----------+------+-----------+-------+
ID | LIST1_VAL | CODE | LIST2_VAL | TOTAL |
---------+-----------+------+-----------+-------+
2434166 | 735 | 124 | 21 | 30 |
2434166 | 768 | 124 | 2 | 30 |
2434166 | 769 | 124 | 1 | 30 |
2434166 | 746 | 124 | 6 | 30 |
---------+-----------+------+-----------+-------+
See how we simply use the index produced when flattening list1 to look up the value from list2.
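The difference between the two flattens and the lookup-by-index can be illustrated outside Snowflake. The sketch below is plain Python, not Snowflake SQL: it only mimics the row counts and pairings, using the list values from the question.

```python
from itertools import product

list1 = "735,768,769,746".split(",")
list2 = "21,2,1,6".split(",")

# Two independent flattens behave like a cross product: 4 x 4 = 16 rows.
cross = [(int(a), int(b)) for a, b in product(list1, list2)]

# Flattening only list1 and using each element's index to look up list2
# yields exactly one row per position.
paired = [(int(a), int(list2[i])) for i, a in enumerate(list1)]

print(len(cross))
print(paired)
```

Filtering the cross product on equal indexes gives the same four pairs, but it builds all sixteen rows first, which is why the single flatten is the cheaper of the two.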
To get elements from multiple arrays with the same index we could use an array accessor:
SELECT t.id, t.code, t.total, s.ind,
STRTOK_TO_ARRAY(t.list1, ',')[s.ind]::int AS list1_val,
STRTOK_TO_ARRAY(t.list2, ',')[s.ind]::int AS list2_val
FROM t
,(SELECT ROW_NUMBER() OVER(ORDER BY seq4()) - 1 AS ind
FROM TABLE(GENERATOR(ROWCOUNT => 10))) s -- here up to 10 elements
WHERE list1_val IS NOT NULL
ORDER BY t.id, s.ind;
The idea is to generate tally numbers and then access the elements.
Sample data:
CREATE OR REPLACE TABLE t(id INT, list1 TEXT, code INT, list2 TEXT, total INT) AS
SELECT 6, '735,768,769,746', 124, '21,2,1,6', 30 UNION
SELECT 7, '1,2,3' , 1, '10,20,30', 50;

How to merge rows which are not present in another table in SQL?

I have tables like this:
CL_Client
cl_id | cl_name |cl_system
1 | a |Dpo
2 | b | Dpo
3 | c |Dpo
4 | d
CLOI_ClientOrderItems
Cl_id|cl_name|orderid| date |status |masterid
1 | a | 123 | 27/5/2015 | 12 | 111
1 | a | 123 | 27/5/2015 | 15 | 111
2 | b | 213 | 27/5/2015 | 12 | 222
3 | c | 452 | 27/5/2015 | 16 | 333
4 | d | 458 | 27/5/2015 | 20 | 444
4 | d | 452 | 27/5/2015 | 22 | 333
Invoice table
orderid|rate|master id|invoice_date
123 |10 | 111 |27/5/2015
213 |10 | 222 |27/5/2015
458 |10 | 444 |27/5/2015
The invoice table has no row for master order id 333, but I want to show that order in the result as well.
I have tried this query but it's not working correctly:
SELECT distinct
C.cl_id,
C.cl_name,
[dbo].getOrderCountbyMasterorderID(CO.masterorderid) as No_Of_Orders,
CONVERT(VARCHAR(5),CO.cloi_order_date,108) as OrderTime,
I.in_total,
CO.MasterOrderId,
CO.cloi_current_status
from
dbo.CL_Clients C
INNER JOIN dbo.CLOI_ClientOrderItems CO
ON C.cl_id = CO.cl_id
LEFT OUTER JOIN dbo.IN_Invoices I
ON CO.MasterOrderId = I.MasterOrderId
where
CO.cloi_current_status in(7,8,160,163,167,170,250,251,162) and
C.cl_system='Dpo' and
datepart(yyyy,I.in_date_issued)=2015 and
datepart(mm,I.in_date_issued)=05 and
datepart(dd,I.in_date_issued)=27
group by
C.cl_id,
C.cl_name,
CO.masterorderid,
CO.cloi_order_date,
CO.cloi_current_status,
I.in_total,
CO.MasterOrderId
order by
OrderTime
expected result
cl_id | cl_name |No_Of_Orders| OrderTime|in_total|MasterOrderId|status
1 | a |2 | 09:45 | 65.33 |111 |12
2 | b |1 | 09:53 | 65.33 |222 |15
3 | c |1 | 09:54 | 43.21 |333 |16
4 | d |2 | 09:56 | 43.21 |444 |20
You have not shown data for CL_Clients table.
Try executing this query first, with the invoice join and its date filters commented out:
SELECT distinct
C.cl_id,
C.cl_name,
[dbo].getOrderCountbyMasterorderID(CO.masterorderid) as No_Of_Orders,
CONVERT(VARCHAR(5),CO.cloi_order_date,108) as OrderTime,
--I.in_total,
CO.MasterOrderId,
CO.cloi_current_status
from
dbo.CL_Clients C
INNER JOIN dbo.CLOI_ClientOrderItems CO
ON C.cl_id = CO.cl_id
--LEFT OUTER JOIN dbo.IN_Invoices I
--ON CO.MasterOrderId = I.MasterOrderId
where
CO.cloi_current_status in(7,8,160,163,167,170,250,251,162)
and C.cl_system='Dpo'
--and datepart(yyyy,I.in_date_issued)=2015
--and datepart(mm,I.in_date_issued)=05
--and datepart(dd,I.in_date_issued)=27
group by
C.cl_id,
C.cl_name,
CO.masterorderid,
CO.cloi_order_date,
CO.cloi_current_status,
--I.in_total,
CO.MasterOrderId
order by
OrderTime
If masterorderid = 333 is not even displayed here, then check your where clauses.
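One possible cause, which the thread does not confirm, is the date filter on I.in_date_issued in the WHERE clause: for orders without an invoice that column is NULL, so the filter removes them and the LEFT OUTER JOIN behaves like an inner join. The sketch below shows the effect on a simplified schema; the table and column names (order_items, invoices, master_id, invoice_date) are hypothetical, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema loosely based on the sample data in the question.
conn.executescript("""
CREATE TABLE order_items (cl_id INT, master_id INT, status INT);
CREATE TABLE invoices (master_id INT, rate INT, invoice_date TEXT);
INSERT INTO order_items VALUES
  (1, 111, 12), (1, 111, 15), (2, 222, 12),
  (3, 333, 16), (4, 444, 20), (4, 333, 22);
INSERT INTO invoices VALUES
  (111, 10, '2015-05-27'), (222, 10, '2015-05-27'), (444, 10, '2015-05-27');
""")

# Filter on the outer-joined table in WHERE: rows without an invoice are dropped.
filter_in_where = conn.execute("""
SELECT DISTINCT o.master_id, i.rate
FROM order_items o
LEFT JOIN invoices i ON i.master_id = o.master_id
WHERE i.invoice_date = '2015-05-27'
ORDER BY o.master_id
""").fetchall()

# Same filter moved into the ON clause: rows without an invoice are kept with NULLs.
filter_in_on = conn.execute("""
SELECT DISTINCT o.master_id, i.rate
FROM order_items o
LEFT JOIN invoices i ON i.master_id = o.master_id
                    AND i.invoice_date = '2015-05-27'
ORDER BY o.master_id
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

If this is what happens in the original query, moving the three datepart conditions from WHERE into the ON clause of the LEFT OUTER JOIN would keep master order 333 in the output.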

Reformat existing table from paired columns into rows

For example, I have a table with 5 rows and 7 columns. I wish to move the last two columns into the previous two columns, so the new table would have 10 rows and 5 columns.
Present Table format
+-----+------------+----------+------------+---------------+------------+---------------+
| id | VisitDate | fkFamily | child1.DOB | child1.Gender | child2.DOB | child2.Gender |
+-----+------------+----------+------------+---------------+------------+---------------+
| 78 | 19/04/2010 | 277 | 14/03/2009 | 0 | NULL | NULL |
| 79 | 20/04/2010 | 289 | 12/08/2007 | 0 | NULL | NULL |
| 107 | 20/04/2010 | 191 | NULL | NULL | NULL | NULL |
| 108 | 20/04/2010 | 259 | NULL | NULL | 31/03/2010 | 1 |
| 109 | 20/04/2010 | 126 | NULL | NULL | NULL | NULL |
+-----+------------+----------+------------+---------------+------------+---------------+
New table format
+-----+------------+----------+------------+----------------------+
| id | VisitDate | fkFamily | child.DOB | child.Gender |
+-----+------------+----------+------------+----------------------+
| 78 | 19/04/2010 | 277 | 14/03/2009 | 0 |
| 79 | 20/04/2010 | 289 | 12/08/2007 | 0 |
| 107 | 20/04/2010 | 191 | NULL | NULL |
| 108 | 20/04/2010 | 259 | NULL | NULL |
| 109 | 20/04/2010 | 126 | NULL | NULL |
| 78 | 19/04/2010 | 277 | NULL | NULL |
| 79 | 20/04/2010 | 289 | NULL | NULL |
| 107 | 20/04/2010 | 191 | NULL | NULL |
| 108 | 20/04/2010 | 259 | 31/03/2010 | 1 |
| 109 | 20/04/2010 | 126 | NULL | NULL |
+-----+------------+----------+------------+----------------------+
You can get the final result by unpivoting the columns Child1_DOB, Child1_Gender, etc. Starting in SQL Server 2005, the unpivot function was made available but for your case I'd actually use CROSS APPLY so you can unpivot the Child1, and Child2 values in pairs.
The syntax would be:
select
t.id,
t.visitdate,
t.fkFamily,
c.child_DOB,
c.child_Gender
from yourtable t
cross apply
(
select child1_DOB, child1_Gender union all
select child2_DOB, child2_Gender
) c (child_DOB, child_Gender);
See SQL Fiddle with Demo
Then you could also include an identifier for each of the values so you know if it belonged to child one or two:
select
t.id,
t.visitdate,
t.fkFamily,
c.child,
c.child_DOB,
c.child_Gender
from yourtable t
cross apply
(
select 'Child1', child1_DOB, child1_Gender union all
select 'Child2', child2_DOB, child2_Gender
) c (child, child_DOB, child_Gender)
See SQL Fiddle with Demo. These give a result similar to:
| ID | VISITDATE | FKFAMILY | CHILD_DOB | CHILD_GENDER |
|-----|------------|----------|------------|--------------|
| 78 | 19/04/2010 | 277 | 14/03/2009 | 0 |
| 78 | 19/04/2010 | 277 | (null) | (null) |
| 79 | 20/04/2010 | 289 | 12/08/2007 | 0 |
| 79 | 20/04/2010 | 289 | (null) | (null) |
| 107 | 20/04/2010 | 191 | (null) | (null) |
| 107 | 20/04/2010 | 191 | (null) | (null) |
| 108 | 20/04/2010 | 259 | (null) | (null) |
| 108 | 20/04/2010 | 259 | 31/03/2010 | 1 |
| 109 | 20/04/2010 | 126 | (null) | (null) |
| 109 | 20/04/2010 | 126 | (null) | (null) |
You could reformat the table into something like this by using UNION ALL (a plain UNION would remove duplicate rows, such as the two all-NULL rows for id 107, and return fewer than 10 rows):-
SELECT * FROM (
SELECT id, VisitDate, fkFamily, child1_DOB as child_DOB, child1_Gender as child_Gender
FROM yourtable
UNION ALL
SELECT id, VisitDate, fkFamily, child2_DOB, child2_Gender
FROM yourtable) as temp
FIDDLE
You could use SELECT INTO if you wanted to create a new table from the results, for example:-
SELECT * INTO yournewtable FROM (
SELECT id, VisitDate, fkFamily, child1_DOB as child_DOB, child1_Gender as child_Gender
FROM yourtable
UNION ALL
SELECT id, VisitDate, fkFamily, child2_DOB, child2_Gender
FROM yourtable) as temp
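A short sketch of how UNION and UNION ALL differ on the sample data, assuming SQLite as a stand-in for SQL Server and the underscore column names used in the answers. Both operators are run against the same five rows so the row counts can be compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INT, VisitDate TEXT, fkFamily INT, "
    "child1_DOB TEXT, child1_Gender INT, child2_DOB TEXT, child2_Gender INT)"
)
conn.executemany("INSERT INTO yourtable VALUES (?,?,?,?,?,?,?)", [
    (78, "19/04/2010", 277, "14/03/2009", 0, None, None),
    (79, "20/04/2010", 289, "12/08/2007", 0, None, None),
    (107, "20/04/2010", 191, None, None, None, None),
    (108, "20/04/2010", 259, None, None, "31/03/2010", 1),
    (109, "20/04/2010", 126, None, None, None, None),
])

# The same unpivot query, parameterised only by the set operator.
template = """
SELECT id, VisitDate, fkFamily, child1_DOB AS child_DOB, child1_Gender AS child_Gender
FROM yourtable
{op}
SELECT id, VisitDate, fkFamily, child2_DOB, child2_Gender
FROM yourtable
"""
union_all_rows = conn.execute(template.format(op="UNION ALL")).fetchall()
union_rows = conn.execute(template.format(op="UNION")).fetchall()

print(len(union_all_rows))  # one row per child slot
print(len(union_rows))      # duplicates removed
```

UNION ALL keeps one output row per child slot, which matches the 10-row target format, whereas UNION collapses the identical all-NULL rows for ids 107 and 109.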

Resources