How to optimize this SQL Server query

How to optimize this SQL Server query - sql-server

I have following query, which is working as expected but taking approx 3 seconds to execute. Reason is large number of records. Can somebody please suggest any steps in order to improve performance?
Explanation :
Check to see value using Comp id and Default_Comp = 1
If not found, ignore the Default_Comp and check only based on Comp id
Still not found, ignore the join with table 2 and try to get by Comp id.
My code:
DECLARE #Finished_Comp VARCHAR(MAX) = NULL;
SELECT #Finished_Comp = MIN(tbl2.Finished_Comp)
FROM Table1 tbl1
INNER JOIN Table2 tbl2 ON tbl1.Sav_ID = tbl2.Sav_ID
WHERE Comp_ID = #Comp_ID AND tbl1.Default_Comp = 1
IF #Finished_Comp IS NULL
BEGIN
SELECT #Finished_Comp = MIN(tbl2.Finished_Comp)
FROM Table1 tbl1
INNER JOIN Table2 tbl2 ON tbl1.Sav_ID = tbl2.Sav_ID
WHERE Comp_ID = #Comp_ID
END
IF #Finished_Comp IS NULL
BEGIN
SELECT #Finished_Comp = MIN(Finished_Comp)
FROM Table1 tbl1
WHERE Comp_ID = #Comp_ID AND #Finished_Comp != ''
END
I tried to use COALESCE, but it's returning wrong results for Finished_Comp

You say in the comments
I strongly believe the query can be changed to some extent so that no
multiple queries need to be executed.
Yes you're right.
SELECT #Finished_Comp = COALESCE(MIN(CASE WHEN tbl1.Default_Comp = 1 THEN tbl2.Finished_Comp END),
MIN(tbl2.Finished_Comp),
MIN(CASE WHEN tbl1.Finished_Comp <> '' THEN tbl1.Finished_Comp END))
FROM Table1 tbl1
LEFT JOIN Table2 tbl2
ON tbl1.Sav_ID = tbl2.Sav_ID
WHERE tbl1.Comp_ID = #Comp_IDV
But at best this will only reduce execution time to a third of current (for the case that all three queries need to be executed).
You should consider adding indexes on
Table1 - Comp_ID, Sav_ID INCLUDE (Default_Comp, Finished_Comp)
Table2 - Sav_ID INCLUDE (Finished_Comp)
For potentially much larger improvements.

Related

Duplicate and incorrect output sql query

I need to select grade_name from tblgrade, subject_name from tblsubject, count (subscribe_id) from tblsubcription, count (sub_status) from tblsubcription where sub_status=1 and count (sub_status) from tblsubcription where sub_status is null.
This is what i have tried:
SELECT t2.grade_name,
t.subject_name,
(SELECT COUNT(*)
FROM tblsubcription
WHERE sub_status IS NULL
AND teacher_id = 2) AS pending,
(SELECT COUNT(*)
FROM tblsubcription
WHERE sub_status = '1'
AND teacher_id = 2) AS appoved,
COUNT(t1.subscribe_id) AS totalsub
FROM tblsubject t
INNER JOIN tblsubject_grade tg ON (t.subject_id = tg.subject_id)
INNER JOIN tblsubcription t1 ON (tg.subject_garde_id = t1.subject_garde_id)
INNER JOIN tblgrade t2 ON (tg.grade_id = t2.grade_id)
AND tg.grade_id = t2.grade_id
AND tg.subject_id = t.subject_id
AND t2.admin_id = t.admin_id
WHERE t1.teacher_id = 2
GROUP BY t.subject_name,
t2.grade_name;
See result obtained when the above query is executed and the expected result i need is in red

Looking at this subquery:
(SELECT COUNT(*)
FROM tblsubcription
WHERE sub_status IS NULL
AND teacher_id = 2) AS pending,
There is nothing here to relate (correlate) it to the specific row. You need an additional condition in the WHERE clause that tells you which Grade/Subject pair to look at. The other (approved) subquery is the same way.
Alternatively, you may be able to solve this with another join to tblsubscription and conditional aggregation.
I'd post code to fix this, but I find the images too blurry to read well, so I can't easily infer which fields to use. Next time post formatted text, and you'll get a better answer in less time.

Update records SQL?

First when I started this project seemed very simple. Two tables, field tbl1_USERMASTERID in Table 1 should be update from field tbl2_USERMASTERID Table 2. After I looked deeply in Table 2, there is no unique ID that I can use as a key to join these two tables. Only way to match the records from Table 1 and Table 2 is based on FIRST_NAME, LAST_NAME AND DOB. So I have to find records in Table 1 where:
tbl1_FIRST_NAME equals tbl2_FIRST_NAME
AND
tbl1_LAST_NAME equals tbl2_LAST_NAME
AND
tbl1_DOB equals tbl2_DOB
and then update USERMASTERID field. I was afraid that this can cause some duplicates and some users will end up with USERMASTERID that does not belong to them. So if I find more than one record based on first,last name and dob those records would not be updated. I would like just to skip and leave them blank. That way I wouldn't populate invalid USERMASTERID. I'm not sure what is the best way to approach this problem, should I use SQL or ColdFusion (my server side language)? Also how to detect more than one matching record?
Here is what I have so far:
UPDATE Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
Here is query where I tried to detect duplicates:
SELECT DISTINCT
tbl1.FName,
tbl1.LName,
tbl1.dob,
COUNT(*) AS count
FROM Table1 AS tbl1
LEFT OUTER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.FName = tbl2.first
AND tbl1.LName = tbl2.last
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND LTRIM(RTRIM(tbl1.first)) <> ''
AND LTRIM(RTRIM(tbl1.last)) <> ''
AND LTRIM(RTRIM(tbl1.dob)) <> ''
GROUP BY tbl1.FName,tbl1.LName,tbl1.dob
Some data after I tested query above:
First Last DOB Count
John Cook 2008-07-11 2
Kate Witt 2013-06-05 1
Deb Ruis 2016-01-22 1
Mike Bennet 2007-01-15 1
Kristy Cruz 1997-10-20 1
Colin Jones 2011-10-13 1
Kevin Smith 2010-02-24 1
Corey Bruce 2008-04-11 1
Shawn Maiers 2016-08-28 1
Alenn Fitchner 1998-05-17 1
If anyone have idea how I can prevent/skip updating duplicate records or how to improve this query please let me know. Thank you.

You could check for and avoid duplicate matches using with common_table_expression (Transact-SQL)
along with row_number()., like so:
with cte as (
select
t.fname
, t.lname
, t.dob
, t.usermasterid
, NewUserMasterId = t2.usermasterid
, rn = row_number() over (partition by t.fname, t.lname, t.dob order by t2.usermasterid)
from table1 as t
inner join table2 as t2 on t.dob = t2.dob
and t.fname = t2.fname
and t.lname = t2.lname
and ltrim(rtrim(t.usermasterid)) = ''
)
--/* confirm these are the rows you want updated
select *
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
/* update those where only 1 usermasterid matches this record
update t
set t.usermasterid = t.NewUserMasterId
from cte as t
where t.NewUserMasterId != ''
and not exists (
select 1
from cte as i
where t.dob = i.dob
and t.fname = i.fname
and t.lname = i.lname
and i.rn>1
);
--*/
I use the cte to extract out the sub query for readability. Per the documentation, a common table expression (cte):
Specifies a temporary named result set, known as a common table expression (CTE). This is derived from a simple query and defined within the execution scope of a single SELECT, INSERT, UPDATE, or DELETE statement.
Using row_number() to assign a number for each row, starting at 1 for each partition of t.fname, t.lname, t.dob. Having those numbered allows us to check for the existence of duplicates with the not exists() clause with ... and i.rn>1

You could use a CTE to filter out the duplicates from Table1 before joining:
; with CTE as (select *
, count(ID) over (partition by LastName, FirstName, DoB) as IDs
from Table1)
update a
set a.ID = b.ID
from Table2 a
left join CTE b
on a.FirstName = b.FirstName
and a.LastName = b.LastName
and a.Dob = b.Dob
and b.IDs = 1
This will work provided there are no exact duplicates (same demographics and same ID) in table 1. If there are exact duplicates, they will also be excluded from the join, but you can filter them out before the CTE to avoid this.

Please try below SQL:
UPDATE Table1 AS tbl1
INNER JOIN Table2 AS tbl2
ON tbl1.dob = tbl2.dob
AND tbl1.fname = tbl2.fname
AND tbl1.lname = tbl2.lname
LEFT JOIN Table2 AS tbl3
ON tbl3.dob = tbl2.dob
AND tbl3.fname = tbl2.fname
AND tbl3.lname = tbl2.lname
AND tbl3.usermasterid <> tbl2.usermasterid
SET tbl1.usermasterid = tbl2.usermasterid
WHERE LTRIM(RTRIM(tbl1.usermasterid)) = ''
AND tbl3.usermasterid is null

Conditional JOIN Statement SQL Server

Is it possible to do the following:
IF [a] = 1234 THEN JOIN ON TableA
ELSE JOIN ON TableB
If so, what is the correct syntax?

I think what you are asking for will work by joining the Initial table to both Option_A and Option_B using LEFT JOIN, which will produce something like this:
Initial LEFT JOIN Option_A LEFT JOIN NULL
OR
Initial LEFT JOIN NULL LEFT JOIN Option_B
Example code:
SELECT i.*, COALESCE(a.id, b.id) as Option_Id, COALESCE(a.name, b.name) as Option_Name
FROM Initial_Table i
LEFT JOIN Option_A_Table a ON a.initial_id = i.id AND i.special_value = 1234
LEFT JOIN Option_B_Table b ON b.initial_id = i.id AND i.special_value <> 1234
Once you have done this, you 'ignore' the set of NULLS. The additional trick here is in the SELECT line, where you need to decide what to do with the NULL fields. If the Option_A and Option_B tables are similar, then you can use the COALESCE function to return the first NON NULL value (as per the example).
The other option is that you will simply have to list the Option_A fields and the Option_B fields, and let whatever is using the ResultSet to handle determining which fields to use.

This is just to add the point that query can be constructed dynamically based on conditions.
An example is given below.
DECLARE #a INT = 1235
DECLARE #sql VARCHAR(MAX) = 'SELECT * FROM [sourceTable] S JOIN ' + IIF(#a = 1234,'[TableA] A ON A.col = S.col','[TableB] B ON B.col = S.col')
EXEC(#sql)
--Query will be
/*
SELECT * FROM [sourceTable] S JOIN [TableB] B ON B.col = S.col
*/

You can solve this with union
select a, b
from tablea
join tableb on tablea.a = tableb.a
where b = 1234
union
select a, b
from tablea
join tablec on tablec.a = tableb.a
where b <> 1234

I disagree with the solution suggesting 2 left joins. I think a table-valued function is more appropriate so you don't have all the coalescing and additional joins for each condition you would have.
CREATE FUNCTION f_GetData (
#Logic VARCHAR(50)
) RETURNS #Results TABLE (
Content VARCHAR(100)
) AS
BEGIN
IF #Logic = '1234'
INSERT #Results
SELECT Content
FROM Table_1
ELSE
INSERT #Results
SELECT Content
FROM Table_2
RETURN
END
GO
SELECT *
FROM InputTable
CROSS APPLY f_GetData(InputTable.Logic) T

I think it will be better to think about your query in a different way and treat them more like sets.
I do believe if you make two separate queries then join them using UNION, It will be much better in performance and more readable.

What indenting style do you use in SQL Server stored procedures?

None of my SQL Server stored procedure editing IDEs seem to have any tools to enforce indentation styles, so I find that a lot of the stored procedures I see are all over the place. I find indenting really improves readability though. I would like to codify some stored procedure indenting standards in our company's coding style guide, and I'm wondering if anyone has any best practices they would like to share.
For instance, in a normal SELECT statement, I try to keep the SELECT, FROM, WHERE, ORDER BY, and GROUP BY clauses all on the same level, and indent anything below that. I also try to indent each JOIN one level from the table it's logically joining into.
Does anyone else have similar advice or best practices?

My select formattings:
--
-- SELECT statements
--
select
t1.field1, t1.field2,
t2.field3,
t3.fieldn
from
tblOne t1
inner join tblTwo t2 on t1.field = t2.field and t1.field2 = t2.field2
left join tblThree t3 on t2.field = t3.field and t2.field2 = t3.field2
left join (
select id, sum(quantity) as quantity
from tbl4
group by id
) t4 on t4.id=t3.id
where
t1.field = 'something'
and t2.field = 'somethin else'
order by
fieldn
Optionally (when lines get too long) I split lines at logical boundaries and indent splitted parts:
inner join tblTwo as t2
on t1.field = t2.field and t1.field2 = t2.field2
Sometimes I'm using different syntax for very simple (sub)selects.
Main goal is to make code readable and relatively easily modifyable.
--edit--
IMHO (at least in small team) it is not needed to enforce very strict rules, this helps support and maintainig :) In our team, where about 3-4 people write most sql, it is very easy to establish code author, just looking at sql statement - all people are using somewhat different style (capitalzing, aliases, indenting etc).

SELECT T1.Field1,
T1.Field2,
T2.Field1 As Field 3
FROM Table1 AS T1
LEFT JOIN Table2 AS T2
ON T1.Field1 = T2.Field7
WHERE T1.Field9 = 5
AND T2.Field1 < 900
ORDER BY T2.Field1 DESC
INSERT INTO Table1 (
Field1,
Filed2,
Field3 )
VALUES ( 'Field1',
'Field2',
'Field3' )
UPDATE Table1
SET Field1 = SomeValue,
Field2 = AnotherValue,
FIeld134567 = A ThirdValue
WHERE Field9 = A Final Value
I find that I dont necessarily has a set indentation length and instead I try to indent based on the length of the field names and values. I like my left margins to line up along any given vertical plane and I like my Evaluators (such as equal signs) to line up. I always have any command term on a different vertical plane than its accompanying values and fields. I also tend to try to make the space between my SELECT command and the Field list equal in length to the space used by a SELECT DISTINCT Field or INSERT INTO Table.
But in the end, all that is just my preferences. I like neat looking code.

I prefer the following style:
--
-- SELECT statements
--
select field1,
field2,
field3,
fieldn
from tblOne as t1
inner join tblTwo as t2
on t1.field = t2.field
and t1.field2 = t2.field2
left outer join tblThree as t3
on t2.field = t3.field
and t2.field2 = t3.field2
where t1.field = 'something'
and t2.field = 'somethin else'
order by fieldn
--
-- IF statements
--
if #someVar = 'something'
begin
-- statements here
set #someVar2 = 'something else'
end
--
-- WHILE statements
--
while #count < #max
begin
set #count = #count + 1
end

I prefair the following style...
Select
Id = i.Identity,
User = u.UserName,
From
tblIdentities i
Inner Join
tblUsers u On i.UserId = u.UserId
Where
(
u.IsActive = 'True'
And
i.Identity > 100
)
Also I try and not to use the As keyword. I prefair equals instead. Probably upset a few people but I find this code much easier to read...
Select
Id = tbl.Identity,
User = tbl.UserName,
Age = tbl.Age,
DOB = tbl.DateOfBirth
From
tbl
Rather than...
Select
tbl.Id As Identity,
tbl.UserName As User,
tbl.Age As Age,
tbl.DateOfBirth As DOB
From
tbl

SELECT T1.Field1,
T1.Field2,
T2.Field1 AS Field 3
FROM Table1 AS T1
LEFT JOIN Table2 AS T2 ON T1.Field1 = T2.Field7
WHERE T1.Field9 = 5
AND T2.Field1 < 900
ORDER BY T2.Field1 DESC
INSERT INTO Table1 (Field1, Field2, Field3)
VALUES ('Field1', 'Field2', 'Field3' ) /* for values trivial in length */
UPDATE Table1
SET Field1 = SomeValue,
Field2 = AnotherValue,
FIeld134567 = A ThirdValue
WHERE Field9 = A Final Value
I think my preferred format comes from that one COBOL class I took back in college. Something about code in pretty aligned columns that makes me happy inside.

My style is almost identical to Justin's. I indent the "and" so the "d" in "and" lines up with the "e" in "where".
Sometimes I capitalize keywords. When I have a sub-select, I indent the whole sub-select and format it the same as a regular select.
One place where I may deviate is if I have dozens of fields being selected. In that case, I put several fields to a line and add white space to make even columns of text.

I tend to right-justify the keywords:
SELECT T1.Field1, T2.Field2
FROM Table1 AS T1
LEFT JOIN Table2 AS T2 ON T1.Field1 = T2.Field7
WHERE T1.Field9 = 5
AND T2.Field1 < 900
ORDER BY T2.Field1 DESC
Note that it's not hard-and-fast. I favor having the SELECT being left-most that I will break the justification (INNER JOIN, ORDER BY). I'll wrap on ON and its ilk if necessary, preferring to start a line with a keyword, if possible.
LEFT JOIN Table2 AS T2
ON T1.Field1 = T2.Field7 AND T2.Field8 IS NOT NULL

Update a table using JOIN in SQL Server?

I want to update a column in a table making a join on other table e.g.:
UPDATE table1 a
INNER JOIN table2 b ON a.commonfield = b.[common field]
SET a.CalculatedColumn= b.[Calculated Column]
WHERE
b.[common field]= a.commonfield
AND a.BatchNO = '110'
But it is complaining :
Msg 170, Level 15, State 1, Line 2
Line 2: Incorrect syntax near 'a'.
What is wrong here?

You don't quite have SQL Server's proprietary UPDATE FROM syntax down. Also not sure why you needed to join on the CommonField and also filter on it afterward. Try this:
UPDATE t1
SET t1.CalculatedColumn = t2.[Calculated Column]
FROM dbo.Table1 AS t1
INNER JOIN dbo.Table2 AS t2
ON t1.CommonField = t2.[Common Field]
WHERE t1.BatchNo = '110';
If you're doing something silly - like constantly trying to set the value of one column to the aggregate of another column (which violates the principle of avoiding storing redundant data), you can use a CTE (common table expression) - see here and here for more details:
;WITH t2 AS
(
SELECT [key], CalculatedColumn = SUM(some_column)
FROM dbo.table2
GROUP BY [key]
)
UPDATE t1
SET t1.CalculatedColumn = t2.CalculatedColumn
FROM dbo.table1 AS t1
INNER JOIN t2
ON t1.[key] = t2.[key];
The reason this is silly, is that you're going to have to re-run this entire update every single time any row in table2 changes. A SUM is something you can always calculate at runtime and, in doing so, never have to worry that the result is stale.

Try it like this:
UPDATE a
SET a.CalculatedColumn= b.[Calculated Column]
FROM table1 a INNER JOIN table2 b ON a.commonfield = b.[common field]
WHERE a.BatchNO = '110'

Answer given above by Aaron is perfect:
UPDATE a
SET a.CalculatedColumn = b.[Calculated Column]
FROM Table1 AS a
INNER JOIN Table2 AS b
ON a.CommonField = b.[Common Field]
WHERE a.BatchNo = '110';
Just want to add why this problem occurs in SQL Server when we try to use alias of a table while updating that table, below mention syntax will always give error:
update tableName t
set t.name = 'books new'
where t.id = 1
case can be any if you are updating a single table or updating while using join.
Although above query will work fine in PL/SQL but not in SQL Server.
Correct way to update a table while using table alias in SQL Server is:
update t
set t.name = 'books new'
from tableName t
where t.id = 1
Hope it will help everybody why error came here.

MERGE table1 T
USING table2 S
ON T.CommonField = S."Common Field"
AND T.BatchNo = '110'
WHEN MATCHED THEN
UPDATE
SET CalculatedColumn = S."Calculated Column";

UPDATE mytable
SET myfield = CASE other_field
WHEN 1 THEN 'value'
WHEN 2 THEN 'value'
WHEN 3 THEN 'value'
END
From mytable
Join otherTable on otherTable.id = mytable.id
Where othertable.somecolumn = '1234'
More alternatives here.

Seems like SQL Server 2012 can handle the old update syntax of Teradata too:
UPDATE a
SET a.CalculatedColumn= b.[Calculated Column]
FROM table1 a, table2 b
WHERE
b.[common field]= a.commonfield
AND a.BatchNO = '110'
If I remember correctly, 2008R2 was giving error when I tried similar query.

I find it useful to turn an UPDATE into a SELECT to get the rows I want to update as a test before updating. If I can select the exact rows I want, I can update just those rows I want to update.
DECLARE #expense_report_id AS INT
SET #expense_report_id = 1027
--UPDATE expense_report_detail_distribution
--SET service_bill_id = 9
SELECT *
FROM expense_report_detail_distribution erdd
INNER JOIN expense_report_detail erd
INNER JOIN expense_report er
ON er.expense_report_id = erd.expense_report_id
ON erdd.expense_report_detail_id = erd.expense_report_detail_id
WHERE er.expense_report_id = #expense_report_id

Another approach would be to use MERGE
;WITH cteTable1(CalculatedColumn, CommonField)
AS
(
select CalculatedColumn, CommonField from Table1 Where BatchNo = '110'
)
MERGE cteTable1 AS target
USING (select "Calculated Column", "Common Field" FROM dbo.Table2) AS source ("Calculated Column", "Common Field")
ON (target.CommonField = source."Common Field")
WHEN MATCHED THEN
UPDATE SET target.CalculatedColumn = source."Calculated Column";
-Merge is part of the SQL Standard
-Also I'm pretty sure inner join updates are non deterministic..
Similar question here where the answer talks about that
http://ask.sqlservercentral.com/questions/19089/updating-two-tables-using-single-query.html

I think, this is what you are looking for.
UPDATE
Table1
SET
Table1.columeName =T1.columeName * T2.columeName
FROM
Table1 T1
INNER JOIN Table2 T2
ON T1.columeName = T2.columeName;

I had the same issue.. and you don't need to add a physical column.. cuz now you will have to maintain it..
what you can do is add a generic column in the select query:
EX:
select tb1.col1, tb1.col2, tb1.col3 ,
(
select 'Match' from table2 as tbl2
where tbl1.col1 = tbl2.col1 and tab1.col2 = tbl2.col2
)
from myTable as tbl1

Aaron's approach above worked perfectly for me. My update statement was slightly different because I needed to join based on two fields concatenated in one table to match a field in another table.
--update clients table cell field from custom table containing mobile numbers
update clients
set cell = m.Phone
from clients as c
inner join [dbo].[COSStaffMobileNumbers] as m
on c.Last_Name + c.First_Name = m.Name

Those who are using MYSQL
UPDATE table1 INNER JOIN table2 ON table2.id = table1.id SET table1.status = 0 WHERE table1.column = 20

Try:
UPDATE table1
SET CalculatedColumn = ( SELECT [Calculated Column]
FROM table2
WHERE table1.commonfield = [common field])
WHERE BatchNO = '110'

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

How to optimize this SQL Server query - sql-server

Related

Duplicate and incorrect output sql query

Update records SQL?

Conditional JOIN Statement SQL Server

What indenting style do you use in SQL Server stored procedures?

Update a table using JOIN in SQL Server?

Categories

Resources