Joining on varchar(50) foreign key slows query

I have this query, which is pretty long, but adding a WHERE clause to it, or joining on a string, makes it take an extra 2 seconds to run. I can't figure out why.
Here's the query in full:
ALTER PROCEDURE [dbo].[RespondersByPracticeID]
@practiceID int = null,
@activeOnly bit = 1
AS
BEGIN
SET NOCOUNT ON;
select
isnull(sum(isResponder),0) as [Responders]
,isnull(count(*) - sum(isResponder),0) as [NonResponders]
,isnull((select
count(p.patientID)
from patient p
inner join practice on practice.practiceid = p.practiceid
inner join [lookup] l on p.dosing = l.lookupid and l.lookupid = 'da_ncd'
where
p.practiceID = isnull(@practiceID, p.practiceID)
and p.active = case @activeOnly when 1 then 1 else p.active end
) - (isnull(sum(isResponder),0) + isnull(count(*) - sum(isResponder),0)),0)
as [Undetermined]
from (
select
v.patientID
,firstVisit.hbLevel as startHb
,maxHbVisit.hblevel as maxHb
, case when (maxHbVisit.hblevel - firstVisit.hbLevel >= 1) then 1 else 0 end as isResponder
,count(v.patientID) as patientCount
from patient p
inner join visit v on v.patientid = v.patientid
inner join practice on practice.practiceid = p.practiceid
inner join [lookup] l on p.dosing = l.lookupid and l.lookupid = 'da_ncd'
inner join (
SELECT
p.PatientID
,v.VisitID
,v.hblevel
,v.VisitDate
FROM Patient p
INNER JOIN Visit v ON p.PatientID = v.PatientID
WHERE
v.VisitDate = (
SELECT MIN(VisitDate)
FROM Visit
WHERE PatientId = p.PatientId
)
) firstVisit on firstVisit.patientID = v.patientID
inner join (
select
p.patientID
,max(v.hbLevel) as hblevel
from Patient p
INNER JOIN Visit v ON p.PatientID = v.PatientID
group by
p.patientID
) MaxHbVisit on maxHbVisit.patientid = v.patientId
where
p.practiceID = isnull(@practiceID, p.practiceID)
and p.active = case @activeOnly when 1 then 1 else p.active end
group by
v.patientID
,firstVisit.hbLevel
,maxHbVisit.hblevel
having
datediff(
d,
dateadd(
day
,-DatePart(
dw
,min(v.visitDate)
) + 1
,min(v.visitDate)
)
, max(v.visitDate)
) >= (7 * 8) -- Eight weeks.
) responders
END
The line that slows it down is:
inner join [lookup] l on p.dosing = l.lookupid and l.lookupid = 'da_ncd'
Also, moving it to the where clause has the same effect:
where p.dosing = 'da_ncd'
Otherwise, the query runs almost instantly. >.<

Ah, sorry, I figured it out. Patient.Dosing was set to allow nulls. I guess that made it a different sort of index.

For the record, even though the question is answered:
Things like this usually happen because the execution plan has changed. Compare the plans in Query Analyzer.
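One way to make that comparison concrete is to capture I/O and timing statistics for each variant of the query. This is only a sketch: it assumes the procedure from the question lives in the dbo schema, and the argument values are placeholders.

```sql
-- Report logical reads and CPU/elapsed time for each statement (Messages tab in SSMS).
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder arguments; substitute values that exist in your data.
EXEC dbo.RespondersByPracticeID @practiceID = 1, @activeOnly = 1;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Run it once with the lookup join in place and once with it commented out, then compare the reads per table alongside the two plans.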

Another gotcha is data types: if p.dosing and l.lookupid differ (nvarchar vs. varchar, for example), it can have a huge impact.
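A quick way to check for such a mismatch is to read the column definitions from the catalog. The sketch below assumes the tables are named patient and lookup, as in the question; it also shows nullability, which turned out to matter for the asker.

```sql
-- List declared type, length, nullability and collation of the two join columns.
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE (TABLE_NAME = 'patient' AND COLUMN_NAME = 'dosing')
   OR (TABLE_NAME = 'lookup'  AND COLUMN_NAME = 'lookupid');
```

If the two rows differ in DATA_TYPE, the join needs an implicit conversion on one side, which typically appears as CONVERT_IMPLICIT in the execution plan.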

Try creating an index on that table, being sure to properly include that VARCHAR field in the list of fields.
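As a rough illustration of that suggestion, something like the following could be tried. The index name is made up, and the sketch assumes the patient table is in the dbo schema and that practiceID and active are the other patient columns the query filters on.

```sql
-- Hypothetical index name. Key on the joined/filtered varchar column,
-- with the other columns the query reads from patient carried as included columns.
CREATE NONCLUSTERED INDEX IX_patient_dosing
ON dbo.patient (dosing)
INCLUDE (practiceID, active);
```

Whether the optimizer actually uses it depends on the data distribution, so it is worth re-checking the plan afterwards.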

Related

Compare and identify new records in Select query for different dates

I have a query as part of a stored procedure, and the results are correct. However, I need to add code to compare the results to those of the same query when it last ran, e.g. 2 days ago. Any new records that were not in the results from 2 days ago should then appear at the top (first) of the results. I tried using the EXCEPT clause, but the results are wrong.
Any help on how to identify new records compared to the records from GETDATE()-1 will be appreciated. Thank you.
SELECT distinct rt.sPortfolio, CAST(rt.iRuleId AS VARCHAR(15)) AS iRuleId, sDescription, sComment, rt.sPortfolio AS BlankRow
FROM thinktank..ruletests AS rt
INNER JOIN thinktank..rules AS r
ON r.id = rt.iruleid
AND dttest > getdate()-0.5
AND iresult <> 4
AND ichecktype = 3
AND r.iCategory = 0
AND sRuleSet <> 'NOTIFY'
INNER JOIN thinktank..GroupPortfolios AS gp
ON rt.sPortfolio = gp.sPortfolioId
AND sGroupID = 'EC'
LEFT OUTER JOIN thinktank..RuleUCs AS ruc
ON r.id = ruc.iruleid
AND ruc.icategory = '636'
WHERE ISNULL(ruc.sValue,'N') <> 'Y'
EXCEPT
SELECT distinct rt.sPortfolio, CAST(rt.iRuleId AS VARCHAR(15)) AS iRuleId, sDescription, sComment, rt.sPortfolio AS BlankRow
FROM thinktank..ruletests AS rt
INNER JOIN thinktank..rules AS r
ON r.id = rt.iruleid
AND dttest > getdate()-1
AND iresult <> 4
AND ichecktype = 3
AND r.iCategory = 0
AND sRuleSet <> 'NOTIFY'
INNER JOIN thinktank..GroupPortfolios AS gp
ON rt.sPortfolio = gp.sPortfolioId
AND sGroupID = 'EC'
LEFT OUTER JOIN thinktank..RuleUCs AS ruc
ON r.id = ruc.iruleid
AND ruc.icategory = '636'
WHERE ISNULL(ruc.sValue,'N') <> 'Y'

Convert PostgreSQL to MS SQL

I need help converting a PostgreSQL query to MS SQL.
Below is what I have done so far, but I am having issues with the function and array parts, which I do not think are allowed in MS SQL.
Is there something I need to do to change the function? It looks like the WHERE clause has an array in it too.
I have added the select statement for the #temp table, but when I create the #temp table I get errors saying incorrect syntax.
CREATE FUNCTION pm_aggregate_report
(
_facility_ids uuid[]
, _risk_ids uuid[] DEFAULT NULL::uuid[]
, _assignee_ids uuid[] DEFAULT NULL::uuid[]
, _start_date date DEFAULT NULL::date
, _end_date date DEFAULT NULL::date
)
RETURNS TABLE
(
facility character varying
, pm_id uuid, grouped_pm boolean
, risk_id uuid
, risk character varying
, pm_status_id uuid
, user_id uuid
, assignee text
, completed_by uuid
, total_labor bigint
)
CREATE TABLE #tmp_pm_aggregate
(
facility_id VARCHAR(126),
pm_id VARCHAR(126),
grouped_pm VARCHAR(126),
risk_id VARCHAR(126),
pm_status_id VARCHAR(126),
user_id VARCHAR(126),
completed_by VARCHAR(126)
)
SELECT DISTINCT
COALESCE(gp.facility_id, a.facility_id) as facility_id,
COALESCE(p.grouped_pm_id, p.id) as pm_id,
CASE WHEN p.grouped_pm_id IS NULL THEN false ELSE true END as grouped_pm,
COALESCE(gp.risk_id, a.risk_id) as risk_id,
COALESCE(gp.pm_status_id, p.pm_status_id) as pm_status_id,
COALESCE(gass.user_id, sass.user_id) as user_id,
COALESCE(gp.completed_by, p.completed_by) as completed_by
FROM pms p
JOIN assets a
ON p.asset_id = a.id
LEFT JOIN grouped_pms gp
ON p.grouped_pm_id = gp.id
LEFT JOIN assignees sass
ON p.id = sass.record_id
AND sass.type = 'single_pm'
LEFT JOIN assignees gass
ON p.grouped_pm_id = gass.record_id
AND gass.type = 'grouped_pm'
LEFT JOIN users u
ON (sass.user_id = u.id OR gass.user_id = u.id)
WHERE a.facility_id = ANY(_facility_ids)
AND NOT a.is_component
AND COALESCE(gp.pm_status_id, p.pm_status_id) in ('f9bdfc17-3bb5-4ec0-8477-24ef05ea3b9b', '06fc910c-3d07-4284-8f6e-8fb3873f5333')
AND COALESCE(gp.completion_date, p.completion_date) BETWEEN COALESCE(_start_date, '1/1/2000') AND COALESCE(_end_date, '1/1/3000')
AND COALESCE(gp.show_date, p.show_date) <= CURRENT_TIMESTAMP
AND COALESCE(gass.user_id, sass.user_id) IS NOT NULL
AND u.user_type_id != 'ec823d98-7023-4908-8006-2e33ddf2c11b'
AND (_risk_ids IS NULL OR COALESCE(gp.risk_id, a.risk_id) = ANY(_risk_ids)
AND (_assignee_ids IS NULL OR COALESCE(gass.user_id, sass.user_id) = ANY(_assignee_ids);
SELECT
f.name as facility,
t.pm_id,
t.grouped_pm,
t.risk_id,
r.name as risk,
t.pm_status_id,
t.user_id,
u.name_last + ', ' + u.name_first as assignee,
t.completed_by,
ISNULL(gwl.total_labor, swl.total_labor) as total_labor
FROM #tmp_pm_aggregate t
JOIN facilities f
ON t.facility_id = f.id
JOIN risks r
ON t.risk_id = r.id
JOIN users u
ON t.user_id = u.id
LEFT JOIN (SELECT wl.record_id, wl.user_id, SUM(wl.labor_time) as total_labor
FROM work_logs wl
WHERE wl.type = 'single_pm'
GROUP BY wl.record_id, wl.user_id) as swl
ON t.pm_id = swl.record_id
AND t.user_id = swl.user_id
AND t.grouped_pm = false
LEFT JOIN (SELECT wl.record_id, wl.user_id, SUM(wl.labor_time) as total_labor
FROM work_logs wl
WHERE wl.type = 'grouped_pm'
GROUP BY wl.record_id, wl.user_id) as gwl
ON t.pm_id = gwl.record_id
AND t.user_id = gwl.user_id
AND t.grouped_pm = true
ORDER BY facility,
assignee,
risk;
DROP TABLE #tmp_pm_aggregate;

You can create an inline table-valued function and simply return a result set from it. You do not need (and cannot use) a temp table, and you do not declare the shape of the returned rowset.
For the array parameters, you can use a table type:
CREATE TYPE dbo.GuidList AS TABLE (value uniqueidentifier NOT NULL PRIMARY KEY);
Because the table parameters are actual tables, you must query them like this: (NOT EXISTS (SELECT 1 FROM @risk_ids) OR ISNULL(gp.risk_id, a.risk_id) IN (SELECT r.value FROM @risk_ids r))
The parameters must start with @.
There is no boolean type; you must use bit.
Always use deterministic date formats for literals. yyyymmdd works for dates. Do you need to take hours and minutes into account? At the moment you haven't.
ISNULL generally performs better than COALESCE in SQL Server, as the compiler understands it better.
You may want to pass a separate parameter showing whether you passed in anything for the optional table parameters.
I suggest you look carefully at the actual query: why does it need DISTINCT? It performs poorly, and is usually a code smell indicating poorly thought-out joins. Perhaps you need to combine the two joins on assignees, or perhaps you should use a row-numbering strategy somewhere.
CREATE FUNCTION dbo.pm_aggregate_report
(
@facility_ids dbo.GuidList READONLY -- table-valued parameters must be declared READONLY
, @risk_ids dbo.GuidList READONLY
, @assignee_ids dbo.GuidList READONLY
, @start_date date
, @end_date date
)
RETURNS TABLE AS RETURN
SELECT DISTINCT -- why DISTINCT, perhaps rethink your joins
ISNULL(gp.facility_id, a.facility_id) as facility_id,
ISNULL(p.grouped_pm_id, p.id) as pm_id,
CASE WHEN p.grouped_pm_id IS NULL THEN CAST(0 AS bit) ELSE CAST(1 AS bit) END as grouped_pm,
ISNULL(gp.risk_id, a.risk_id) as risk_id,
ISNULL(gp.pm_status_id, p.pm_status_id) as pm_status_id,
ISNULL(gass.user_id, sass.user_id) as user_id,
ISNULL(gp.completed_by, p.completed_by) as completed_by
FROM pms p
JOIN assets a
ON p.asset_id = a.id
LEFT JOIN grouped_pms gp
ON p.grouped_pm_id = gp.id
LEFT JOIN assignees sass
ON p.id = sass.record_id
AND sass.type = 'single_pm'
LEFT JOIN assignees gass
ON p.grouped_pm_id = gass.record_id
AND gass.type = 'grouped_pm'
LEFT JOIN users u
ON (sass.user_id = u.id OR gass.user_id = u.id) -- is this doubling up your rows?
WHERE a.facility_id IN (SELECT f.value FROM @facility_ids f)
AND a.is_component = 0
AND ISNULL(gp.pm_status_id, p.pm_status_id) in ('f9bdfc17-3bb5-4ec0-8477-24ef05ea3b9b', '06fc910c-3d07-4284-8f6e-8fb3873f5333')
AND ISNULL(gp.completion_date, p.completion_date) BETWEEN ISNULL(@start_date, '20000101') AND ISNULL(@end_date, '30000101') -- perhaps use >= AND <
AND ISNULL(gp.show_date, p.show_date) <= CURRENT_TIMESTAMP
AND ISNULL(gass.user_id, sass.user_id) IS NOT NULL
AND u.user_type_id != 'ec823d98-7023-4908-8006-2e33ddf2c11b'
AND (NOT EXISTS (SELECT 1 FROM @risk_ids) OR ISNULL(gp.risk_id, a.risk_id) IN (SELECT r.value FROM @risk_ids r))
AND (NOT EXISTS (SELECT 1 FROM @assignee_ids) OR ISNULL(gass.user_id, sass.user_id) IN (SELECT aid.value FROM @assignee_ids aid));
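Calling such a function could look roughly like the sketch below. It assumes the dbo.GuidList type and the function have already been created; the GUID is a placeholder, not a real facility id.

```sql
-- Table variables of the table type play the role of the PostgreSQL arrays.
DECLARE @facilities dbo.GuidList;
DECLARE @risks      dbo.GuidList;  -- left empty: no risk filter
DECLARE @assignees  dbo.GuidList;  -- left empty: no assignee filter

-- Placeholder GUID; replace with a real facility id.
INSERT INTO @facilities (value)
VALUES ('00000000-0000-0000-0000-000000000001');

-- Every argument must be supplied when calling a function; NULL dates fall back
-- to the ISNULL defaults inside the function body.
SELECT *
FROM dbo.pm_aggregate_report(@facilities, @risks, @assignees, NULL, NULL);
```

An empty table variable stands in for PostgreSQL's NULL array here, which is why the function tests the optional lists with NOT EXISTS.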

How to increase performance of this query?

I have a SQL query running on MSSQL 2008 R2.
The view vMobileLastMobileHistory has about 1000 rows, and
select * from vMobileLastMobileHistory takes 0.2 sec,
but this query takes 5 seconds. How can I optimize this code?
(I think the problem is INTERSECT, but I don't know how to change it.)
SELECT DISTINCT *
FROM
(
SELECT vMobileLastMobileHistory.*
FROM vMobileLastMobileHistory
LEFT OUTER JOIN MobileType_DomainAction ON
MobileType_DomainAction.tiMobileType = vMobileLastMobileHistory.tiMobileType
LEFT OUTER JOIN MobileType_User ON
MobileType_User.MobileID = MobileType_DomainAction.ID
WHERE MobileType_User.UserID = @UserID OR @UserID = - 1
INTERSECT
SELECT vMobileLastMobileHistory.*
FROM vMobileLastMobileHistory
LEFT OUTER JOIN dbo.Region_User ON
dbo.vMobileLastMobileHistory.strRegion = dbo.Region_User.strRegion
WHERE Region_User.iSystemUser = @UserID OR @UserID = - 1
INTERSECT
SELECT vMobileLastMobileHistory.*
FROM vMobileLastMobileHistory
LEFT OUTER JOIN Contractor_User ON
vMobileLastMobileHistory.strContractor = Contractor_User.strContractor
WHERE Contractor_User.iSystemUser = @UserID OR @UserID = - 1
)

The problem is that if you have any indexes on your iSystemUser columns, the optimizer is unable to use them, because a single plan has to cover both a specific user ID being passed and the case of returning all results. It would be better to logically separate your two cases. In addition, since you don't need any columns from the auxiliary tables, you could use EXISTS for the specific-user case to take advantage of a semi join:
IF (@UserID = -1)
BEGIN
SELECT DISTINCT *
FROM vMobileLastMobileHistory;
END
ELSE
BEGIN
SELECT DISTINCT *
FROM vMobileLastMobileHistory AS mh
WHERE EXISTS
( SELECT 1
FROM Contractor_User AS cu
WHERE cu.strContractor = mh.strContractor
AND cu.iSystemUser = @UserID
)
AND EXISTS
( SELECT 1
FROM Region_User AS ru
WHERE ru.strRegion = mh.strRegion
AND ru.iSystemUser = @UserID
)
AND EXISTS
( SELECT 1
FROM MobileType_DomainAction AS da
INNER JOIN MobileType_User AS mu
ON mu.MobileID = da.ID
WHERE da.tiMobileType = mh.tiMobileType
AND mu.UserID = @UserID -- the question filters MobileType_User on UserID
);
END
Now you can have a separate execution plan for each case (returning all results, or results for a specific user). In each case you only need to read from vMobileLastMobileHistory once, and you also limit the sorts required by replacing the INTERSECTs with three EXISTS clauses.
If they don't already exist, you may also wish to consider some indexes on your tables. A good way of finding out which indexes would help is to run the query in SQL Server Management Studio with the option "Include Actual Execution Plan" enabled; the plan will then show any missing-index suggestions.
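For the EXISTS version, candidate indexes might look like the following. This is a sketch only: the index names are invented, the tables are assumed to be in the dbo schema, and MobileType_User is assumed to filter on UserID as in the question.

```sql
-- Hypothetical index names. Each index leads with the user column used for the
-- equality filter, followed by the column correlated to the view.
CREATE NONCLUSTERED INDEX IX_Contractor_User_User
    ON dbo.Contractor_User (iSystemUser, strContractor);

CREATE NONCLUSTERED INDEX IX_Region_User_User
    ON dbo.Region_User (iSystemUser, strRegion);

CREATE NONCLUSTERED INDEX IX_MobileType_User_User
    ON dbo.MobileType_User (UserID, MobileID);
```

With indexes shaped like this, each semi join can seek on the user ID instead of scanning the whole table, but the missing-index hints in the actual plan are the better guide for your data.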

Most of the time, INTERSECT and INNER JOIN give the same result. You have not shared your data, so based on my knowledge and this link, I have just replaced the INTERSECT query with an INNER JOIN query:
--I think you don't need the DISTINCT in the outer query. Let me know if you have any issues.
SELECT DISTINCT vml.*
FROM vMobileLastMobileHistory vml
LEFT OUTER JOIN MobileType_DomainAction mtda ON mtda.tiMobileType = vml.tiMobileType
LEFT OUTER JOIN MobileType_User ON MobileType_User.MobileID = mtda.ID
LEFT OUTER JOIN dbo.Region_User ON vml.strRegion = dbo.Region_User.strRegion
LEFT OUTER JOIN Contractor_User ON vml.strContractor = Contractor_User.strContractor
WHERE
(MobileType_User.UserID = @UserID
and Region_User.iSystemUser = @UserID or Contractor_User.iSystemUser = @UserID
) OR @UserID = - 1

TSQL - Return recent date

Having issues getting a dataset to return with one date per client in the query.
Requirements:
Must have the most recent transaction date per client in the user's client list
Will need to have the capability to run through EXEC
Current Query:
SELECT
c.client_uno
, c.client_code
, c.client_name
, c.open_date
into #AttyClnt
from hbm_client c
join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = @login
and c.status_code = 'C'
select
ba.payr_client_uno as client_uno
, max(ba.tran_date) as tran_date
from blt_bill_amt ba
left outer join #AttyClnt ac on ba.payr_client_uno = ac.client_uno
where ba.tran_type IN ('RA', 'CR')
group by ba.payr_client_uno
Currently, this query produces at least 1 row per client with a date. The problem is that some clients have between 2 and 10 dates associated with them, bloating the result to about 30,000 rows instead of the ideal 246 rows or fewer.
When I try max(tran_uno) to get the most recent transaction number, I get the same result: some clients have 1 value and others have multiple values.
The bigger picture has 4 other queries doing other parts; I have only included the parts that pertain to the question.
Edit (2011-10-14 @ 1:45PM):
select
ba.payr_client_uno as client_uno
, max(ba.row_uno) as row_uno
into #Bills
from blt_bill_amt ba
inner join hbm_matter m on ba.matter_uno = m.matter_uno
inner join hbm_client c on m.client_uno = c.client_uno
inner join hbm_persnl p on c.resp_empl_uno = p.empl_uno
where p.login = @login
and c.status_code = 'C'
and ba.tran_type in ('CR', 'RA')
group by ba.payr_client_uno
order by ba.payr_client_uno
--Obtain list of Transaction Date and Amount for the Transaction
select
b.client_uno
, ba.tran_date
, ba.tc_total_amt
from blt_bill_amt ba
inner join #Bills b on ba.row_uno = b.row_uno
Not quite sure what was going on, but it seems the temp tables were not behaving correctly at all. Ideally I would have 246 rows of data, but the previous query syntax produced anywhere from 400 to 5000 rows, obviously with duplicated data.

I think you can use ranking to achieve what you want:
WITH ranked AS (
SELECT
client_uno = ba.payr_client_uno,
ba.tran_date,
ba.tc_total_amt,
rnk = ROW_NUMBER() OVER (
PARTITION BY ba.payr_client_uno
ORDER BY ba.tran_uno DESC
)
FROM blt_bill_amt ba
INNER JOIN hbm_matter m ON ba.matter_uno = m.matter_uno
INNER JOIN hbm_client c ON m.client_uno = c.client_uno
INNER JOIN hbm_persnl p ON c.resp_empl_uno = p.empl_uno
WHERE p.login = @login
AND c.status_code = 'C'
AND ba.tran_type IN ('CR', 'RA')
)
SELECT
client_uno,
tran_date,
tc_total_amt
FROM ranked
WHERE rnk = 1
ORDER BY client_uno
Useful reading:
Ranking Functions (Transact-SQL)
ROW_NUMBER (Transact-SQL)
WITH common_table_expression (Transact-SQL)
Using Common Table Expressions
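To see the ranking approach in isolation, here is a small self-contained sketch. The table variable and its rows are made-up sample data, not the real blt_bill_amt table, and the syntax assumes SQL Server 2008 or later.

```sql
-- Made-up sample data: two clients with several transactions each.
DECLARE @bill TABLE (client_uno int, tran_uno int, tran_date date, tc_total_amt money);

INSERT INTO @bill (client_uno, tran_uno, tran_date, tc_total_amt)
VALUES (1, 10, '20110901', 100.00),
       (1, 11, '20111001', 250.00),
       (2, 20, '20110815', 75.00),
       (2, 21, '20110820', 80.00),
       (2, 22, '20111005', 60.00);

-- Number each client's rows from newest to oldest, then keep only row 1.
WITH ranked AS (
    SELECT client_uno, tran_date, tc_total_amt,
           rnk = ROW_NUMBER() OVER (PARTITION BY client_uno ORDER BY tran_uno DESC)
    FROM @bill
)
SELECT client_uno, tran_date, tc_total_amt
FROM ranked
WHERE rnk = 1
ORDER BY client_uno;
-- Returns one row per client: (1, 2011-10-01, 250.00) and (2, 2011-10-05, 60.00).
```

Because ROW_NUMBER assigns exactly one row the value 1 within each partition, the result cannot contain more than one row per client, unlike joining back on an aggregated value.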

JOIN condition in SQL Server

After applying the join condition to two tables, I want only the record that is the maximum among the records of the left table.
My query
SELECT a1.*,
t.*,
( a1.trnratefrom - t.trnratefrom )AS minrate,
( a1.trnrateto - t.trnrateto ) AS maxrate
FROM (SELECT a.srno,
trndate,
b.trnsrno,
Upper(Rtrim(Ltrim(b.trnstate))) AS trnstate,
Upper(Rtrim(Ltrim(b.trnarea))) AS trnarea,
Upper(Rtrim(Ltrim(b.trnquality))) AS trnquality,
Upper(Rtrim(Ltrim(b.trnlength))) AS trnlength,
Upper(Rtrim(Ltrim(b.trnunit))) AS trnunit,
b.trnratefrom,
b.trnrateto,
a.remark,
entdate
FROM mstprodrates a
INNER JOIN trnprodrates b
ON a.srno = b.srno)a1
INNER JOIN (SELECT c.srno,
trndate,
d.trnsrno,
Upper(Rtrim(Ltrim(d.trnstate))) AS trnstate,
Upper(Rtrim(Ltrim(d.trnarea))) AS trnarea,
Upper(Rtrim(Ltrim(d.trnquality))) AS trnquality,
Upper(Rtrim(Ltrim(d.trnlength))) AS trnlength,
Upper(Rtrim(Ltrim(d.trnunit))) AS trnunit,
d.trnratefrom,
d.trnrateto,
c.remark,
entdate
FROM mstprodrates c
INNER JOIN trnprodrates d
ON c.srno = d.srno) AS t
ON a1.trnstate = t.trnstate
AND a1.trnquality = t.trnquality
AND a1.trnunit = t.trnunit
AND a1.trnlength = t.trnlength
AND a1.trnarea = t.trnarea
AND a1.remark = t.remark
WHERE t.srno = (SELECT MAX(srno)
FROM a1
WHERE srno < a1.srno)

If you mean that you want the records that exist in the left table but not in the right one, then use a LEFT OUTER JOIN.
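A minimal sketch of that pattern follows. It uses two made-up table variables rather than the tables from the question, and assumes SQL Server 2008 or later for the multi-row VALUES syntax.

```sql
-- Made-up sample data.
DECLARE @left_t  TABLE (id int, name varchar(20));
DECLARE @right_t TABLE (id int);

INSERT INTO @left_t (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO @right_t (id) VALUES (1), (3);

-- LEFT OUTER JOIN keeps every left row; unmatched rows carry NULLs for the right side,
-- so filtering on r.id IS NULL leaves only the left rows with no match.
SELECT l.id, l.name
FROM @left_t AS l
LEFT OUTER JOIN @right_t AS r ON r.id = l.id
WHERE r.id IS NULL;
-- Returns the single row (2, 'b').
```

The IS NULL test must be on a right-side column that cannot be NULL in a matched row, such as the join key.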
