T-SQL not equal operator vs Case statement - sql-server

Assume I have a T-SQL statement:
select * from MyTable
where Code != 'RandomCode'
I've been tasked with making this kind of where statement perform more quickly. Books Online says that positive queries (=) are faster than negative (!= , <>).
So, one option is make this into a CASE statement e.g.
select * from MyTable
where
case when Code = 'RandomCode' then 0
else 1 end = 1
Does anyone know if this can be expected to be faster or slower than the original T-SQL ?
Thanks in advance.

You have to be more specific at what information you are interested in the table and what are the possible values of the Code column. Then you can create appropriate indexes to speed up the query.
For example, if values in the Code column could only be one of 'RandomCode', 'OtherCode', 'YetAnotherCode', you can re-write the query as:
SELECT * FROM MyTable WHERE Code = 'OtherCode' OR Code = 'YetAnotherCode'
And of course you need an index on the Code column.
If you have to do an inequality query, you can change SELECT * to a more narrow query like:
SELECT Id, Name, Whatever FROM MyTable WHERE Code != 'RandomCode'
Then create an index like:
CREATE INDEX idx_Code ON MyTable(Code) INCLUDE (Id,Name,Whatever)
This can reduce I/O by replacing a table scan with an index scan.

Related

Switching a case statement to a constant greatly slows query

I have run into an issue with SQL Server 2017 where replacing:
a CASE statement that assigns a numerical value
with a constant numerical value
slows down the query be a factor of 6+.
The rather complicated query has the general form of:
WITH CTE1 AS
(
...
),
WITH CTE2 AS
(
SELECT
--conditions based on below
FROM
(SELECT
--various math,
CASE
--statement assigning values to different runID combinations for samples with matching siteIDs and dates (due to the ON statement below)
ELSE NULL
....
END AS whichCombination
FROM
CTE1 AS value1
JOIN
CTE1 AS value2 ON (value1.siteID = value2.siteID,
value1.date = value2.date,
value1.sampleID <> value2.sampleID)
) AS combinations
WHERE combinations.whichCombination IS NOT NULL
)
SELECT various data
FROM dataTable
LEFT JOIN
(stuff from CTE2) AS pairTable ON dataTable.sampleID = pairTable.sampleID
The CASE statement assigns a pair number to different combinations of rows from the self join.
This then is used to select only the combinations that I want.
However, when the CASE statement is replaced with: 1 AS whichCombination (a constant value so no rows are assigned NULL) the query slows dramatically. This also occurs if CASE WHEN 1 = 1 THEN 1 is used.
This makes no sense to me as either way the values are:
numerical
not unique
not an index
The only thing that is unique is that each combination of rows is a assigned a unique value.
Is SQL Server somehow using this as an index that speeds things up?
And how would I replicate this behavior without the CASE statement as this answer says you cannot create indices for CTE's?
EDIT: Also of note is that the slowdown occurs only if main select statement (the last 5 lines) is included (i.e. if CTE2 is run as the main select statement instead of being a CTE)
Best, JD
One workaround would be spliting these CTE's to temp tables, then you could add indexes if needed.

Wildcard character in value in MERGE ON statement

I have a merge statement that starts like this:
MERGE INTO TEMSPASA
USING (SELECT *
FROM OPENQUERY(orad, 'SELECT * FROM CDAS.TDWHCORG')) AS TDWHPASA ON TEMSPASA.pasa_cd = LTRIM(RTRIM(TDWHPASA.corg_id)) AND
TEMSPASA.pasa_active_ind = TDWHPASA.corg_active_ind
WHEN MATCHED THEN
UPDATE
SET
TEMSPASA.pasa_desc = LTRIM(RTRIM(TDWHPASA.corg_nm)),
TEMSPASA.pasa_active_ind = TDWHPASA.corg_active_ind
WHEN NOT MATCHED THEN
INSERT (pasa_cd, pasa_desc, pasa_active_ind)
VALUES (LTRIM(RTRIM(TDWHPASA.corg_id)), TDWHPASA.corg_nm, TDWHPASA.corg_active_ind);
There are pasa_cd's like ('H04', 'H04*') where that * is NOT a wildcard. But I think the on statement is treating it like it is a wildcard because when I try to run the merge statement, I get the following error:
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
I have verified that there are no duplicates in my table. The only thing I can think of is what I mentioned above, that the ON part of the merge statement is seeing that * as a wildcard.
I have tried searching, saw something about an escape character, but that was in the where clause. Any ideas how to deal with this?
This means you have more than 1 row that is matching between the source and target tables. You need to find out what row(s) are the issue here. It could be from either table. Something like this should help you identify where the problem is coming from.
SELECT LTRIM(RTRIM(TDWHPASA.corg_id))
, TDWHPASA.corg_active_ind
FROM CDAS.TDWHCORG as TDWHPASA
group by LTRIM(RTRIM(TDWHPASA.corg_id))
, TDWHPASA.corg_active_ind
having count(*) > 1
select t.pasa_cd
, t.pasa_active_ind
from TEMSPASA t
group by t.pasa_cd
, t.pasa_active_ind
having count(*) > 1
So my theory in my initial post was incorrect. There were duplicates in my source table, but it was based on the case sensitivity. There were values H04F and H04f. Both different rows, but because of the case insensitivity in my sql, it was seeing them as duplicates. To resolve the issue I added COLLATE Latin1_General_CS_AS to the end of the ON clause and it did the trick

Optimize SQL in MS SQL Server that returns more than 90% of records in the table

I have the below sql
SELECT Cast(Format(Sum(COALESCE(InstalledSubtotal, 0)), 'F') AS MONEY) AS TotalSoldNet,
BP.BoundProjectId AS ProjectId
FROM BoundProducts BP
WHERE ( BP.IsDeleted IS NULL
OR BP.IsDeleted = 0 )
GROUP BY BP.BoundProjectId
I already have an index on the table BoundProducts on this column order (BoundProjectId, IsDeleted)
Currently this query takes around 2-3 seconds to return the result. I am trying to reduce it to zero seconds.
This query returns 25077 rows as of now.
Please provide me any ideas to improvise the query.
Looking at this in a bit different point of view, I can think that your OR condition is screwing up your query, why not to rewrite it like this?
SELECT CAST(FORMAT(SUM(COALESCE(BP.InstalledSubtotal, 0)), 'F') AS MONEY) AS TotalSoldNet
, BP.BoundProjectId AS ProjectId
FROM (
SELECT BP.BoundProjectId, BP.InstalledSubtotal
FROM dbo.BoundProducts AS BP
WHERE BP.IsDeleted IS NULL
UNION ALL
SELECT BP.BoundProjectId, BP.InstalledSubtotal
FROM dbo.BoundProducts AS BP
WHERE BP.IsDeleted = 0
) AS BP
GROUP BY BP.BoundProjectId;
I've had better experience with UNION ALL rather than OR.
I think it should work totally the same. On top of that, I'd create this index:
CREATE NONCLUSTERED INDEX idx_BoundProducts_IsDeleted_BoundProjectId_iInstalledSubTotal
ON dbo.BoundProducts (IsDeleted, BoundProjectId)
INCLUDE (InstalledSubTotal);
It should satisfy your query conditions and seek index quite well. I know it's not a good idea to index bit fields, but it's worth trying.
P.S. Why not to default your IsDeleted column value to 0 and make it NOT NULLABLE? By doing that, it should be enough to do a simple check WHERE IsDeleted = 0, that'd boost your query too.
If you really want to try index seek, it should be possible using query hint forceseek, but I don't think it's going to make it any faster.
The options I suggested last time are still valid, remove format and / or create an indexed view.
You should also test if the problem is the query itself or just displaying the results after that, for example trying it with "select ... into #tmp". If that's fast, then the problem is not the query.
The index name in the screenshot is not the same as in create table statement, but I assume that's just a name you changed for the question. If the scan is happening to another index, then you should include that too.

SQL Server where clause using In() vs Wildcard

Is there any performance difference between query A and query B?
Query A
SELECT * FROM SomeTable
WHERE 1 = 1 AND (SomeField LIKE '[1,m][6,e][n]%')
Query B
SELECT * FROM SomeTable
WHERE 1 = 1 AND (SomeField IN ('16', 'Mens'))
The first could be much slower. An index can't be used with LIKE unless there is a constant prefix, for example LIKE 'foo%'. The first query will therefore require a table scan. The second query however could use an index on SomeField if one is available.
The first query will also give the wrong results as it matches '1en'.

The fastest way to check if some records in a database table?

I have a huge table to work with . I want to check if there are some records whose parent_id equals my passing value .
currently what I implement this is by using "select count(*) from mytable where parent_id = :id"; if the result > 0 , means the they do exist.
Because this is a very huge table , and I don't care what's the exactly number of records that exists , I just want to know whether it exists , so I think count(*) is a bit inefficient.
How do I implement this requirement in the fastest way ? I am using Oracle 10.
#
According to hibernate Tips & Tricks https://www.hibernate.org/118.html#A2
It suggests to write like this :
Integer count = (Integer) session.createQuery("select count(*) from ....").uniqueResult();
I don't know what's the magic of uniqueResult() here ? why does it make this fast ?
Compare to "select 1 from mytable where parent_id = passingId and rowrum < 2 " , which is more efficient ?
An EXISTS query is the one to go for if you're not interested in the number of records:
select 'Y' from dual where exists (select 1 from mytable where parent_id = :id)
This will return 'Y' if a record exists and nothing otherwise.
[In terms of your question on Hibernate's "uniqueResult" - all this does is return a single object when there is only one object to return - instead of a set containing 1 object. If multiple results are returned the method throws an exception.]
There's no real difference between:
select 'y'
from dual
where exists (select 1
from child_table
where parent_key = :somevalue)
and
select 'y'
from mytable
where parent_key = :somevalue
and rownum = 1;
... at least in Oracle10gR2 and up. Oracle's smart enough in that release to do a FAST DUAL operation where it zeroes out any real activity against it. The second query would be easier to port if that's ever a consideration.
The real performance differentiator is whether or not the parent_key column is indexed. If it's not, then you should run something like:
select 'y'
from dual
where exists (select 1
from parent_able
where parent_key = :somevalue)
select count(*) should be lighteningly fast if you have an index, and if you don't, allowing the database to abort after the first match won't help much.
But since you asked:
boolean exists = session.createQuery("select parent_id from Entity where parent_id=?")
.setParameter(...)
.setMaxResults(1)
.uniqueResult()
!= null;
(Some syntax errors to be expected, since I don't have a hibernate to test against on this computer)
For Oracle, maxResults is translated into rownum by hibernate.
As for what uniqueResult() does, read its JavaDoc! Using uniqueResult instead of list() has no performance impact; if I recall correctly, the implementation of uniqueResult delegates to list().
First of all, you need an index on mytable.parent_id.
That should make your query fast enough, even for big tables (unless there are also a lot of rows with the same parent_id).
If not, you could write
select 1 from mytable where parent_id = :id and rownum < 2
which would return a single row containing 1, or no row at all. It does not need to count the rows, just find one and then quit. But this is Oracle-specific SQL (because of rownum), and you should rather not.
For DB2 there is something like select * from mytable where parent_id = ? fetch first 1 row only. I assume that something similar exists for oracle.
This query will return 1 if any record exists and 0 otherwise:
SELECT COUNT(1) FROM (SELECT 1 FROM mytable WHERE ROWNUM < 2);
It could help when you need to check table data statistics, regardless table size and any performance issue.

Resources