SUM vs EXIST in SqlServer


The intent is to return all 'Unprocessed' TransactionSets if they have NO PaymentUid and NO ProcessStatus.value('(/CPI/#ProcessItem)[1]'... relations, and also pick up 'No-Matched-Payments' TransactionSets if they have ANY PaymentUid AND ANY ProcessStatus.value('(/CPI/#ProcessItem)[1]'... relations.
The SUM functions in the HAVING clause seem clunky and don't allow SQL Server to stop as soon as it encounters any match or none. So it seems inefficient, and at the very least quite clunky to read and deal with. Is there a way to write this with something like EXISTS?
select ts.TransactionSetUid
from TransactionSet ts
join TransactionHeader eh on ts.TransactionSet = eh.TransactionSet
join TransactionPayment tp on eh.TransactionHeaderUid = tp.TransactionHeaderUid
left join ServicePayment sp on tp.TransactionPaymentUid = sp.TransactionPaymentUid
where TransactionStatus in ('Unprocessed', 'No-Matched-Payments')
group by ts.TransactionSet
having (TransactionStatus = 'Unprocessed'
and SUM( CASE WHEN sp.TransactionItem is null THEN 0 ELSE 1 END) = 0
and SUM( CASE WHEN tp.ProcessStatus.value('(/CPI/#ProcessItem)[1]', 'varchar(50)') IS NULL THEN 0 ELSE 1 END) = 0)
or (ts.RuleStatus = 'No-Matched-Payments'
and (SUM( CASE WHEN sp.TransactionItem is null THEN 0 ELSE 1 END) <> 0
or SUM( CASE WHEN tp.ProcessStatus.value('(/CPI/#ProcessItem)[1]', 'varchar(50)') IS NULL THEN 0 ELSE 1 END) <> 0))
UPDATE to answer questions. The relationship between TransactionSet and the other tables is one-to-many. There could be many TransactionPayment records, but the query is only concerned with ProcessStatus.value that has an xml node at (/CPI/#processItem)[1]. But with ServicePayment, any non-null TransactionItem will do.
As I understand it, the group by is only in there because of the SUM functions. The intent is to flag any TransactionSet that meets one of two conditions.
The first condition is:
the Transaction Status is 'Unprocessed'
and
there are no Process Status values
and
there are no Transaction Items.
The second condition is:
the Transaction Status is 'No-Matched-Payments'
and
there is at least one Process Status value
or
there is at least one Transaction Item.
So the query was set up to use SUM to count the number of times the left join on ServicePayment comes up NULL or when the XML value in TransactionPayment doesn't contain a '/CPI/#processItem'.
It seems to me that instead of using a SUM, the query could use EXISTS or some other mechanism to short-circuit the test condition. The value of the SUM is not really important; it just needs to know whether there is at least one or none.
--
Thank you to everyone: I know I'm not a database expert by any means, and I've been programming in the seven C's (C, C++, C#, Java, etc.) for so long that I sometimes forget that SQL is not an imperative language, or more likely, I just don't think in declarative terms.

I think something like this should do the trick:
select ts.TransactionSetUid
from TransactionSet ts
where CASE WHEN EXISTS(SELECT * FROM TransactionHeader eh
join TransactionPayment tp on eh.TransactionHeaderUid = tp.TransactionHeaderUid
left join ServicePayment sp on tp.TransactionPaymentUid = sp.TransactionPaymentUid
where ts.TransactionSet = eh.TransactionSet and
(
sp.TransactionItem is not null or
tp.ProcessStatus.value('(/CPI/#ProcessItem)[1]', 'varchar(50)') IS not NULL
)
) THEN 1 ELSE 0 END =
CASE TransactionStatus
WHEN 'Unprocessed' THEN 0
WHEN 'No-Matched-Payments' THEN 1
END
That is, I've put the EXISTS check in to test for either condition and put it inside a CASE expression so that we don't have to write it out twice for which result we want (for Unprocessed and No-Matched-Payments).
I've also crafted the second CASE expression to return 0, 1 or NULL so that if the TransactionStatus is something else, it doesn't matter what result the EXISTS produces.
I hope I've followed the correct chains of 0/1, true/false, and/or, NULL/NOT NULL logic here - if it's not 100%, it's hopefully just tweaks to those options. I've also assumed I can shift all of the tables except TransactionSet into the EXISTS - it may be that TransactionHeader has to stay outside if that's where TransactionStatus is coming from.
If this isn't correct, you should probably add bare-bones tables and sample data to your question, alongside the expected results.
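As a minimal sketch of the CASE/EXISTS comparison above, here is a cut-down version that runs on its own. The table variables @TransactionSet and @Payment and their columns are hypothetical stand-ins for the real schema, and the XML check is left out for brevity.

```sql
-- Hypothetical, simplified schema: one parent table and one child table.
DECLARE @TransactionSet TABLE (TransactionSetUid int PRIMARY KEY,
                               TransactionStatus varchar(30));
DECLARE @Payment TABLE (PaymentUid int PRIMARY KEY,
                        TransactionSetUid int,
                        TransactionItem varchar(10) NULL);

INSERT INTO @TransactionSet VALUES
    (1, 'Unprocessed'), (2, 'Unprocessed'),
    (3, 'No-Matched-Payments'), (4, 'No-Matched-Payments'),
    (5, 'Other');

INSERT INTO @Payment VALUES
    (10, 1, NULL),  -- set 1: child row, but no item
    (11, 2, 'A'),   -- set 2: has an item
    (12, 3, 'B'),   -- set 3: has an item
    (13, 4, NULL),  -- set 4: child row, but no item
    (14, 5, 'C');   -- set 5: has an item, status is neither of the two

SELECT ts.TransactionSetUid
FROM @TransactionSet AS ts
WHERE CASE WHEN EXISTS (SELECT 1
                        FROM @Payment AS p
                        WHERE p.TransactionSetUid = ts.TransactionSetUid
                          AND p.TransactionItem IS NOT NULL)
           THEN 1 ELSE 0 END
    = CASE ts.TransactionStatus
           WHEN 'Unprocessed' THEN 0          -- wants no matching child
           WHEN 'No-Matched-Payments' THEN 1  -- wants at least one
      END;                                    -- any other status gives NULL
```

With this sample data the query returns sets 1 and 3. Set 5 drops out because its second CASE yields NULL, and a comparison with NULL is never true.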

Yes, this might work. Your query did not include a select distinct, but if this produces duplicate TransactionSetUids, add the keyword distinct:
select [distinct] ts.TransactionSetUid from TransactionSet ts
join TransactionHeader th
on th.TransactionSet = ts.TransactionSet
join TransactionPayment tp
on tp.TransactionHeaderUid = th.TransactionHeaderUid
where not exists
( Select * from ServicePayment
Where TransactionPaymentUid = tp.TransactionPaymentUid
and tp.ProcessStatus.value(
'(/CPI/#ProcessItem)[1]', 'varchar(50)') IS NULL
and TransactionStatus = 'Unprocessed')
Or exists
( Select * from ServicePayment
Where TransactionPaymentUid = tp.TransactionPaymentUid
and ts.RuleStatus = 'No-Matched-Payments'
and tp.ProcessStatus.value(
'(/CPI/#ProcessItem)[1]', 'varchar(50)') IS not NULL
and ts.RuleStatus = 'No-Matched-Payments')

Related

The multi-part identifier "[column name]" could not be bound in UPDATE of TEMP Table

I am trying to create a stored procedure whereupon I input a (simple for now) query into a temp table, and then replace some of the data with data from a different table based on a key.
Here is the complete code:
CREATE PROCEDURE GetInquiryList
AS
BEGIN
SET NOCOUNT ON
IF OBJECT_ID('tempdb..#Inq ') IS NOT NULL
DROP TABLE #Inq
SELECT i.*,q.QuoteID INTO #Inq FROM Inquiries i left join Quotes q on i.InquiryId = q.InquiryId
WHERE i.YNDeleted = 0
--SELECT * FROM #Inq
UPDATE #Inq
SET j.InquiryCustomerName = c.CustomerName,
j.InquiryCustomerEmail = c.CustomerEmail,
j.InquiryCustomerPhone = c.CustomerPhone1,
j.InquiryBestTimetoCall = c.CustomerBestTimetoCall,
j.InquiryDay = c.customerDay,
j.InquiryNight = c.CustomerNight
SELECT c.CustomerName,
c.CustomerEmail,
c.CustomerPhone1,
c.CustomerBestTimetoCall,
c.customerDay,
c.CustomerNight
FROM Customers c
INNER JOIN #Inq j ON
j.InquiryCustomerID = c.CustomerID
SELECT * FROM #Inq
END
I get the following error:
Msg 4104, Level 16, State 1, Line 15 The multi-part identifier "j.InquiryCustomerName" could not be bound
I get this error for whatever column is placed first after the SET command.
Both query pieces of this work independently (the first select creating the temp table and the joined query at the bottom). The data returned is correct. I have tried using aliases (SELECT c.CustomerName AS Name, ...).
Originally, I used "#Inq i" in the second command, but changed to "j" out of an abundance of caution.
I have also run the command against the original table (substituting the Inquiry table for the temp table #Inq, and that fails as well).
Shortening it to this:
UPDATE #Inq
SET j.InquiryCustomerName = c.CustomerName,
j.InquiryCustomerEmail = c.CustomerEmail,
j.InquiryCustomerPhone = c.CustomerPhone1,
j.InquiryBestTimetoCall = c.CustomerBestTimetoCall,
j.InquiryDay = c.customerDay,
j.InquiryNight = c.CustomerNight
FROM Customers c
INNER JOIN #Inq j ON
j.InquiryCustomerID = c.CustomerID
I get a different error:
Msg 4104, Level 16, State 1, Line 15 The multi-part identifier "j.InquiryCustomerName" could not be bound
I'm sure it's probably something simple,(so simple that I can't find any references in any of my searches).
I'm sure it has something to do with the fact that you can't update the same instance of the table used in the join (I'm going to have to re-join again with a "k" alias). How do I do this?
[Screenshots omitted: data from the first query; data from the second select statement on the actual temp table]
Here is what I updated the stored procedure to, which works exactly how I need it to:
SET NOCOUNT ON
IF OBJECT_ID('tempdb..#Inq ') IS NOT NULL
DROP TABLE #Inq
SELECT i.* INTO #Inq FROM (
select inquiries.InquiryId,
inquiries.InquiryDateReceived,
inquiries.InquiryCustomerID,
cust.CustomerName as InquiryCustomerName,
cust.CustomerEmail as InquiryCustomerEmail,
cust.CustomerPhone1 as InquiryCustomerPhone,
cust.CustomerBestTimeToCall as InquiryBestTimeToCall,
cust.CustomerDay as InquiryDay,
cust.CustomerNight as InquiryNight,
inquiries.InquiryServiceType,
inquiries.InquiryServiceID,
inquiries.InquiryTimeframe,
inquiries.InquiryProjectDescription,
inquiries.InquiryDateResponded,
inquiries.InquiryCustomerReply,
inquiries.YNMigrated,
inquiries.InquiryDateClosed,
inquiries.YNClosed,
inquiries.YNDeleted
from inquiries inner join dbo.Customers as cust
on inquiries.InquiryCustomerID = cust.CustomerID and inquiries.InquiryCustomerID > 0
UNION ALL
select inquiries.InquiryId,
inquiries.InquiryDateReceived,
inquiries.InquiryCustomerID,
InquiryCustomerName,
InquiryCustomerEmail,
InquiryCustomerPhone,
InquiryBestTimeToCall,
InquiryDay,
InquiryNight,
inquiries.InquiryServiceType,
inquiries.InquiryServiceID,
inquiries.InquiryTimeframe,
inquiries.InquiryProjectDescription,
inquiries.InquiryDateResponded,
inquiries.InquiryCustomerReply,
inquiries.YNMigrated,
inquiries.InquiryDateClosed,
inquiries.YNClosed,
inquiries.YNDeleted
from inquiries WHERE inquiries.InquiryCustomerID = 0
) i
select i.*, q.QuoteID
FROM #Inq i left join dbo.Quotes as q
on i.InquiryId = q.InquiryId
WHERE i.YNDeleted = 0
END

Just stop using this pattern without a really good reason. Here it only appears to create more work for the database engine with no obvious benefit. Your procedure - as posted - has trivially simple queries so why bother with the temp table and the update?
It is also time to start learning and using best practices. Terminate EVERY statement - eventually it will be required. Does order of the rows in your resultset matter? Usually it does and that is only guaranteed when that resultset is produced by a query that includes an ORDER BY clause.
As a developing/debugging short cut, you can harness the power of CTEs to help you build a working query. In this case, you can "stuff" your first query into a CTE and then simply join the CTE to Customers and "adjust" the columns you need in that resultset.
WITH inquiries as (
select inq.*, qt.QuoteID
FROM dbo.Inquiries as inq left join dbo.Quotes as qt
on inq.InquiryId = qt.InquiryId
WHERE inq.YNDeleted = 0
)
select inquiries.<col>,
...,
cust.CustomerName as "InquiryCustomerName",
...
from inquiries inner join dbo.Customers as cust -- inner join is a guess
on inquiries.InquiryCustomerID = cust.CustomerID
order by ...
;
Schema names added as best practice. Listing the columns you actually need in your resultset is another best practice. Note I did not do that for the query in the CTE but you should. You can choose to create aliases for your resultset columns as needed. I listed one example that corresponds to your UPDATE attempt.
It is odd and very suspicious that all of the columns you intended to UPDATE exist in the Inquiries table. Are you certain you need to do that at all? Do they actually differ from the related columns in the Customers table? It is also odd that the value 0 exists in InquiryCustomerID, suggesting you might not have an FK to enforce the relationship. Perhaps that means you need an outer join rather than the inner join I wrote. If an outer join is needed, then you will need to use CASE expressions to "choose" which value (the CTE value or the Customers value) to use for those columns.
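If an UPDATE against the temp table is still wanted, the 4104 error in the question most likely comes from the SET clause using the alias j while the UPDATE target is named as #Inq. The usual fix is to name the alias as the UPDATE target and list both tables in the FROM clause. A minimal sketch, assuming hypothetical table variables @Inq and @Customers that carry only one of the columns:

```sql
-- Hypothetical cut-down tables standing in for #Inq and Customers.
DECLARE @Customers TABLE (CustomerID int PRIMARY KEY,
                          CustomerName varchar(50));
DECLARE @Inq TABLE (InquiryId int PRIMARY KEY,
                    InquiryCustomerID int,
                    InquiryCustomerName varchar(50) NULL);

INSERT INTO @Customers VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO @Inq VALUES (10, 1, NULL), (11, 2, NULL), (12, 0, 'Walk-in');

-- The UPDATE target is the alias, so j.<column> in SET can be bound.
UPDATE j
SET j.InquiryCustomerName = c.CustomerName
FROM @Inq AS j
INNER JOIN @Customers AS c
    ON j.InquiryCustomerID = c.CustomerID;

SELECT InquiryId, InquiryCustomerName
FROM @Inq
ORDER BY InquiryId;
```

Inquiries 10 and 11 pick up 'Alice' and 'Bob'; inquiry 12 has no matching customer, so the inner join leaves its name untouched.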

After learning a lot more about how things get bound to models, and how to further use sql, here is what my stored procedure looks like:
ALTER PROCEDURE [dbo].[GetInquiryList]
@InquiryID int = 0
AS
BEGIN
SET NOCOUNT ON
select i.InquiryId,
i.InquiryDateReceived,
i.InquiryCustomerID,
InquiryCustomerName =
CASE i.InquiryCustomerID
WHEN 0 THEN i.InquiryCustomerName
ELSE c.CustomerName
END,
InquiryCustomerEmail =
CASE i.InquiryCustomerID
WHEN 0 THEN i.InquiryCustomerEmail
ELSE c.CustomerEmail
END,
InquiryCustomerPhone =
CASE i.InquiryCustomerID
WHEN 0 THEN i.InquiryCustomerPhone
ELSE c.CustomerPhone1
END,
InquiryBestTimetoCall =
CASE i.InquiryCustomerID
WHEN 0 THEN i.InquiryBestTimetoCall
ELSE c.CustomerBestTimetoCall
END,
InquiryDay =
CASE i.InquiryCustomerID
WHEN 0 THEN i.InquiryDay
ELSE c.CustomerDay
END,
InquiryNight =
CASE i.InquiryCustomerID
WHEN 0 THEN i.InquiryNight
ELSE c.CustomerNight
END,
i.InquiryServiceType,
i.InquiryServiceID,
i.InquiryTimeframe,
i.InquiryProjectDescription,
i.InquiryDateResponded,
i.InquiryCustomerReply,
i.YNMigrated,
i.InquiryDateClosed,
i.YNClosed,
i.YNDeleted, ISNULL(q.QuoteId,0) AS Quoteid
FROM dbo.Inquiries i
LEFT JOIN dbo.Quotes q ON i.InquiryId = q.InquiryId
LEFT JOIN dbo.Customers c ON i.InquiryCustomerID = c.CustomerId
WHERE i.YNDeleted = 0
END
I'm sure there are additional enhancements that could be made, but avoiding the union is a big savings. Thanks, everyone.

TSQL Select Statement using Case or Join

I am a little stuck on a situation that I have been trying to fight through. I have a page that allows a user to select all the filter options they want to search by and then it runs the query on that data.
Every field requires something to be picked but on a new field I am introducing, it's going to be optional.
It allows you to provide a list of supervisors, and it will then return all records where the agent's supervisor is in the list provided; pretty straightforward. However, I am trying to make this optional, as I don't always want to search by users. If I don't provide a name in the UI to pass to the stored procedure, then I want to ignore this part of the statement and get everything regardless of the manager.
Here is the query I am working with:
SELECT a.[escID],
a.[escReasonID],
b.[ArchibusLocationName],
c.[ArchibusLocationName],
b.[DepartmentDesc],
c.[DepartmentDesc],
a.[escCreatedBy],
a.[escWorkedBy],
a.[escNotes],
a.[preventable],
a.[escalationCreated],
a.[escalationTracked],
a.[feedbackID],
typ.[EscalationType],
typ.[EscalationTypeText] AS escalationType,
d.reasonText AS reasonText
FROM [red].[dbo].[TFS_Escalations] AS a
LEFT OUTER JOIN
red.dbo.EmployeeTable AS b
ON a.escCreatedBy = b.QID
LEFT OUTER JOIN
red.dbo.EmployeeTable AS c
ON a.escWorkedBy = c.QID
LEFT OUTER JOIN
red.dbo.TFS_Escalation_Reasons AS d
ON a.escReasonID = d.ReasonID
INNER JOIN
dbo.TFS_EscalationTypes AS typ
ON d.escType = typ.EscalationType
WHERE B.[ArchibusLocationName] IN (SELECT location
FROM #tmLocations)
AND C.[ArchibusLocationName] IN (SELECT location
FROM #subLocations)
AND B.[DepartmentDesc] IN (SELECT department
FROM #tmDepartments)
AND C.[DepartmentDesc] IN (SELECT department
FROM #subDepartments)
AND DATEDIFF(second, '19700101', CAST (CONVERT (DATETIME, A.[escalationCreated], 121) AS INT)) >= @startDate
AND DATEDIFF(second, '19700101', CAST (CONVERT (DATETIME, A.[escalationCreated], 121) AS INT)) <= @endDate
AND a.[PREVENTABLE] IN (SELECT PREVENTABLE FROM #preventable)
AND b.MgrQID IN (SELECT leaderQID FROM #sourceLeaders)
The part that I am trying to make optional is the very last line of the query:
AND b.MgrQID IN (SELECT leaderQID FROM #sourceLeaders)
Essentially, if there is no data in the temp table #sourceLeaders then it should ignore that piece of the query.
In all of the other instances of the WHERE clause, something is always required for those fields, which is why that all works fine. I just can't figure out the best way to make this piece optional depending on whether the temp table has data in it (the temp table is populated by the names entered in the UI that a user COULD search by).

So this line should be TRUE if something matches data in the table variable OR there is nothing in the table variable
AND
(
b.MgrQID IN (SELECT leaderQID FROM #sourceLeaders)
OR
NOT EXISTS (SELECT 1 FROM #sourceLeaders)
)
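A minimal, self-contained sketch of this optional filter. The table variables @Employees and @sourceLeaders and the sample values are hypothetical; the real query would use the temp table from the question.

```sql
-- Hypothetical sample data: three employees reporting to two managers.
DECLARE @Employees TABLE (QID int PRIMARY KEY, MgrQID int);
DECLARE @sourceLeaders TABLE (leaderQID int PRIMARY KEY);

INSERT INTO @Employees VALUES (1, 100), (2, 100), (3, 200);

-- Filter table is empty: NOT EXISTS is true, so every row passes.
SELECT e.QID
FROM @Employees AS e
WHERE (e.MgrQID IN (SELECT leaderQID FROM @sourceLeaders)
       OR NOT EXISTS (SELECT 1 FROM @sourceLeaders));

INSERT INTO @sourceLeaders VALUES (200);

-- Filter table has a row: only employees of manager 200 pass.
SELECT e.QID
FROM @Employees AS e
WHERE (e.MgrQID IN (SELECT leaderQID FROM @sourceLeaders)
       OR NOT EXISTS (SELECT 1 FROM @sourceLeaders));
```

The first SELECT returns QIDs 1, 2 and 3; the second returns only QID 3.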

Similar to Nick.McDermaid's, but uses a CASE expression instead:
AND
(
1 = CASE WHEN NOT EXISTS(SELECT 1 FROM #sourceLeaders) THEN 1
WHEN b.MgrQID IN (SELECT leaderQID FROM #sourceLeaders) THEN 1
ELSE 0
END
)

Maybe at the top, so you have a single check:
DECLARE @EmptySourceLeaders CHAR(1)
IF EXISTS (SELECT 1 FROM #sourceLeaders)
SET @EmptySourceLeaders = 'N'
ELSE
SET @EmptySourceLeaders = 'Y'
Then in the joins:
LEFT OUTER JOIN #SourceLeaders SL
ON b.MgrQID = SL.leaderQID
Then in the WHERE:
AND (@EmptySourceLeaders = 'Y' OR SL.leaderQID IS NOT NULL)
There are lots of ways to do it.

SQL Case Statement - If first 'when' returns null then complete the 2nd when

Just wondering if anyone can help me. I am running a CASE statement that references a different table. It needs to look up the make, model and year of a car as well as the position (FL, FR, BL, BR) and return the kit number.
Up to 4 entries can exist in the table for the same vehicle, with the fitting position column specifying which kit number is selected. To return only 1 result, I believe I need to put this in the WHERE section of the query; if I add it anywhere else, more than 1 value is returned.
However, 4 entries won't always exist for the vehicle. A kit can exist for FL & BL but not FR and BR. Because I added the position column to the WHERE section, nothing is returned. Rather than returning nothing, I want it to return the next part of the CASE statement.
This is where the sql works because a kit is available for FL
SELECT CAST (CASE WHEN '002' != 'UNI' THEN T0.U_MPLFK ELSE 'NOKIT' END AS VARCHAR)
FROM
[#CSOL_MILFORD] T0 INNER JOIN [dbo].[#CSOL_VEHICLES] T1 ON T0.[U_VehicleRef] = T1.[U_VehicleRef]
WHERE
T1.U_Manufacturer = 'Ford'
AND
T1.U_Model = 'Galaxy'
AND
T0.U_MPLFK > 1
AND
T0.U_FittingPosition = 'FL'
However when it changes to
SELECT CAST (CASE WHEN '002' != 'UNI' THEN T0.U_MPLFK ELSE 'NOKIT' END AS VARCHAR)
FROM
[#CSOL_MILFORD] T0 INNER JOIN [dbo].[#CSOL_VEHICLES] T1 ON T0.[U_VehicleRef] = T1.[U_VehicleRef]
WHERE
T1.U_Manufacturer = 'Ford'
AND
T1.U_Model = 'Galaxy'
AND
T0.U_MPLFK > 1
AND
T0.U_FittingPosition = 'FR'
I get no value returned; I want it to return 'NOKIT'.
Many Thanks,
Roisin

A left join returns a row with null columns if its on condition fails. So you could move the conditions to the on part of a left join. Something like:
...
FROM #CSOL_VEHICLES T1
LEFT JOIN
#CSOL_MILFORD T0
ON T0.U_VehicleRef = T1.U_VehicleRef
AND T0.U_MPLFK > 1
AND T0.U_FittingPosition = 'FR'
WHERE T1.U_Manufacturer = 'Ford'
AND T1.U_Model = 'Galaxy'
Having the condition in the on clause instead of the where clause means you'll get a row with nulls instead of no row.
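To turn that row of nulls into 'NOKIT', the kit column can be wrapped in COALESCE. A minimal sketch, assuming hypothetical table variables @Vehicles and @Kits with a varchar kit number (the real column names and types may differ):

```sql
-- Hypothetical stand-ins for the vehicle and kit tables.
DECLARE @Vehicles TABLE (VehicleRef int PRIMARY KEY,
                         Manufacturer varchar(20),
                         Model varchar(20));
DECLARE @Kits TABLE (VehicleRef int,
                     FittingPosition char(2),
                     KitNumber varchar(20));

INSERT INTO @Vehicles VALUES (1, 'Ford', 'Galaxy');
INSERT INTO @Kits VALUES (1, 'FL', 'K-100'), (1, 'BL', 'K-200');

-- No 'FR' kit exists, so the left join yields NULL and COALESCE
-- substitutes the fallback value.
SELECT COALESCE(k.KitNumber, 'NOKIT') AS KitNumber
FROM @Vehicles AS v
LEFT JOIN @Kits AS k
    ON k.VehicleRef = v.VehicleRef
   AND k.FittingPosition = 'FR'
WHERE v.Manufacturer = 'Ford'
  AND v.Model = 'Galaxy';
```

This returns a single row containing 'NOKIT'; changing 'FR' to 'FL' returns 'K-100' instead.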

Finding aggregate by Joining Table - SQL SERVER

Question: Find the percentage of people who died out of the cases reported for each country.
Data is in two tables- cases and death. There are country columns in both the tables. deaths is the column with no. of deaths in death table. And cases is the cases reported, which is in the cases table.
With the following query,
SELECT (sum(isnull(d.deaths,0))/sum(isnull(c.cases,0)))*100
FROM cases as c
JOIN death as d ON c.country=d.country
I'm getting an answer of 0.54493487236178.
==
Aggregating separately and averaging, I'm getting the average as the following. (same in excel)
SELECT sum(cases) FROM cases
The value is 106036635
SELECT sum(deaths) FROM death
The value is 716111
(716111/106036635)*100= 0.675343008.
How come the two values differ?
==
ALSO
SELECT c.country, (sum(d.deaths)/sum(c.cases))*100
FROM cases as c
JOIN death as d ON c.country1=d.country1 AND c.cases IS NOT NULL AND d.deaths IS NOT NULL
GROUP BY c.country
is giving me a "Divide by zero error encountered" error. I understand my code is quite ugly and long since I'm a newbie. Please help.

This is actually all correct. Your statements are all returning correct information; you just need to adjust your queries to see it.
We will start with the ratios you feel are incorrect. Remember that your initial query does an inner join, which is why your sums seem to be incorrect. If you have cases without deaths, then you will not count those cases. Your check queries should look like
SELECT sum(cases) FROM cases where cases.country in (Select death.country from death)
and
SELECT sum(deaths) FROM death where death.country in (Select cases.country from cases)
Using these queries should show you the matching ratio.
For your second problem, there is a chance that you have countries listed without any cases. In that case we will want to add a conditional expression to handle the zero:
Select c.country,
Case Sum(c.cases)
When 0 Then
Case Sum(d.deaths)
When 0 Then 0
Else 100
End
Else (sum(d.deaths)/sum(c.cases))*100
End As Ratio
From cases c
Inner Join death d
On c.country = d.country
Where c.cases Is Not Null
And d.deaths Is Not Null
Group By c.country
Or, if you also want to exclude rows with 0 cases, you can simplify the query to this:
Select c.country,
(sum(d.deaths)/sum(c.cases))*100 As Ratio
From cases c
Inner Join death d
On c.country = d.country
Where c.cases <> 0
And c.cases Is Not Null
And d.deaths Is Not Null
Group By c.country

Try this:
SELECT c.country, cast(sum(d.deaths)/(case when sum(c.cases) = 0 then 1 else sum(c.cases) end) as float) *100
FROM cases as c
JOIN death as d ON c.country1=d.country1 AND c.cases IS NOT NULL AND d.deaths IS NOT NULL
GROUP BY c.country
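Another common way to sidestep the divide-by-zero error is NULLIF, which turns a zero denominator into NULL so the ratio comes back as NULL instead of failing. A minimal sketch, assuming hypothetical table variables with one row per country and integer counts:

```sql
-- Hypothetical sample data, one row per country in each table.
DECLARE @cases TABLE (country varchar(20) PRIMARY KEY, cases int);
DECLARE @death TABLE (country varchar(20) PRIMARY KEY, deaths int);

INSERT INTO @cases VALUES ('A', 200), ('B', 0);
INSERT INTO @death VALUES ('A', 5),   ('B', 0);

-- 100.0 forces decimal arithmetic; NULLIF(x, 0) returns NULL when x = 0,
-- and dividing by NULL yields NULL rather than raising an error.
SELECT c.country,
       100.0 * SUM(d.deaths) / NULLIF(SUM(c.cases), 0) AS DeathPct
FROM @cases AS c
JOIN @death AS d
    ON c.country = d.country
GROUP BY c.country;
```

Country A comes back as 2.5 percent and country B as NULL. Whether NULL, 0 or some other value is the right answer for a country with no cases is a reporting decision; wrap the expression in COALESCE if a default is wanted.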

Optimize TSQL query with 3 tables

I need to get all the runs from the database, but need to mark if there is an error for this run.
3 Tables:
Runs: contains the runs
Runfiles: contains the file ids that were processed during a run
Messages: contains errors, warnings, ...
Can this query be optimized any further?
SELECT TOP 1000 runid,
start,
[end],
userid,
CASE
WHEN EXISTS(SELECT rf.fk_fileid
FROM runfiles rf
WHERE rf.fk_runid = r.runid
AND EXISTS(SELECT m.messageid
FROM messages m
WHERE m.fk_fileid =
rf.fk_fileid
AND m.fk_statusid = 4))
THEN 1
ELSE 0
END AS ContainsError
FROM runs r
ORDER BY start DESC
Please don't comment on the table names, they were translated for this question.
Thanks!

Try this:
SELECT TOP 1000
r.runid
,r.start
,r.[end]
,r.userid
,CASE WHEN m.messageid IS NOT NULL THEN 1 ELSE 0 END AS ContainsError
FROM runs r
LEFT JOIN runfiles rf
ON rf.fk_runid = r.runid
LEFT JOIN [messages] m
ON m.fk_fileid = rf.fk_fileid
AND m.fk_statusid = 4
ORDER BY r.start DESC
Anything in the select list is evaluated for each row in the result set. This means that the nested subquery in your CASE expression is logically executed for each of those TOP 1000 rows.
Using left joins and a CASE expression to check whether the primary key is null allows the entire statement to be evaluated as a set, which SQL Server is built to do. It should perform better this way.
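One caveat with the left joins: a run with several files or several matching messages comes back once per match, so the row count can differ from the original query. A minimal sketch that keeps one row per run by folding both lookups into a single EXISTS, using hypothetical table variables in place of the real tables:

```sql
-- Hypothetical cut-down versions of the three tables.
DECLARE @runs TABLE (runid int PRIMARY KEY, start datetime);
DECLARE @runfiles TABLE (fk_runid int, fk_fileid int);
DECLARE @messages TABLE (messageid int PRIMARY KEY,
                         fk_fileid int,
                         fk_statusid int);

INSERT INTO @runs VALUES (1, '2020-01-01'), (2, '2020-01-02');
INSERT INTO @runfiles VALUES (1, 10), (1, 11), (2, 20);
-- Run 1 has two error messages (status 4); run 2 has only a non-error.
INSERT INTO @messages VALUES (100, 10, 4), (101, 11, 4), (102, 20, 1);

SELECT r.runid,
       r.start,
       CASE WHEN EXISTS (SELECT 1
                         FROM @runfiles AS rf
                         JOIN @messages AS m
                             ON m.fk_fileid = rf.fk_fileid
                         WHERE rf.fk_runid = r.runid
                           AND m.fk_statusid = 4)
            THEN 1 ELSE 0 END AS ContainsError
FROM @runs AS r
ORDER BY r.start DESC;
```

This returns exactly two rows: run 2 with ContainsError = 0, then run 1 with ContainsError = 1. Which form is faster on real data depends on indexes and row counts, so it is worth comparing the execution plans.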