SQL slow subquery - sql-server

I am trying to retrieve data from a very large Audits table (millions of rows). So I need to make the query run as efficiently as possible.
First I am playing with a subquery to return the ObjectTypeId and use this to limit the query on the Audit table
This query is taking 4 minutes to run:
select distinct Audits.ObjectTypeID, COUNT(*) as Count
from Audits as Audits
where Audits.ObjectTypeID =
(select distinct ObjectType.ObjectTypeID from ObjectType where ObjectName = 'Data')
group by Audits.ObjectTypeID
If I default in the ObjectTypeID the query runs in 42 seconds
select distinct(Audits.ObjectTypeID), COUNT(*) as Count
from Audits
where Audits.ObjectTypeID = 1
group by Audits.ObjectTypeID
But the subquery when run in isolation only takes only a second to run. So why should the first query take so long?

I can see three things that might help:
Pull the ObjectTypeID into a variable: since there should be only one value for it
Take out the DISTINCT on both queries since they should be unnecessary (the subquery should only have one value and you are grouping by that value in the outer query
Take out the GROUP BY since you are only querying for one ObjectTypeID
So the final query would be:
DECLARE #ObjectTypeID INT
SELECT #ObjectTypeID = (select ObjectType.ObjectTypeID
from ObjectType
where ObjectName = 'Data')
select Audits.ObjectTypeID, COUNT(*) as Count
from Audits as Audits
where Audits.ObjectTypeID = #ObjectTypeID
If you are executing this as a single statement and not as a batch or stored procedure (meaning you can't use variables) thne you can keep the subquery:
select Audits.ObjectTypeID, COUNT(*) as Count
from Audits as Audits
where Audits.ObjectTypeID =
(select ObjectType.ObjectTypeID
from ObjectType
where ObjectName = 'Data')

The part where you are getting the most performance hit could be this line:
where Audits.ObjectTypeID =
(select distinct ObjectType.ObjectTypeID from ObjectType where ObjectName = 'Data')
You are actually calling the same query on every row of your table and it will search the ENTIRE ObjectType table and return the ENTIRE result of that subquery. This will be a big performance hit if your ObjectType table is HUGE. You could speed up that section of the query by using EXISTS so that it will return early once a result was found. Here is an example:
SELECT a.ObjectTypeID, COUNT(*) as Count
FROM Audits a
WHERE EXISTS
(
SELECT ot.ObjectTypeID
FROM ObjectType ot
WHERE ot.ObjectName = 'Data' AND ot.ObjectTypeID = a.ObjectTypeID
)
GROUP BY a.ObjectTypeID

Can you try this
SELECT DISTINCT Audits.ObjectTypeID, COUNT(*) as Count
FROM Audits as Audits
INNER JOIN
(SELECT DISTINCT ObjectTypeId, ObjectName FROM ObjectType
WHERE ObjectName = 'Data') as ObjectType ON Audits.ObjectTypeID = ObjectType.ObjectTypeID
GROUP BY Audits.ObjectTypeID

Related

Can you cache the results of a subquery that's used repeatedly in a stored procedure?

I have a series of queries being UNION'd together. Each query has a WHERE... IN clause that compares against the same list of IDs.
In a simplified form for example purposes it looks like this:
SELECT * FROM MyTable
WHERE AuthorUserId IN (SELECT UserId FROM Users WHERE TeamId = #teamId)
UNION
SELECT * FROM MyTable
WHERE PublisherUserId IN (SELECT UserId FROM USERS WHERE TeamId = #teamId)
UNION...
and so on. #teamId is an int stored procedure parameter.
Is there a way to tell SQL Server to hold on to the result set of
SELECT UserId FROM USERS WHERE TeamId = #teamId
so it doesn't fetch it for each SELECT?
One way you could do this is by capturing the results of that query and storing it in a temp table, and JOINing to those results in your query:
Declare #UserIds Table (UserId Int)
Insert #UserIds (UserId)
SELECT UserId
FROM Users
WHERE TeamId = #teamId
SELECT M.*
FROM MyTable M
JOIN #UserIds U ON M.AuthorUserId = U.UserId
UNION
SELECT M.*
FROM MyTable M
JOIN #UserIds U ON M.PublisherUserId = U.UserId
UNION...
SQL server is smart enough to store results of same query if the parameters are same. If your subquery is not using session variables, you will be fine. You can also put index on TeamId which will make this query faster.
If you are still worried about it and using any programming language to get data from SP, then you should store it in a variable in code and then pass that result set into SP.
Second option would be to store it in temp table and then query it from temp table

Multiple Select against one CTE

I have a CTE query filtering a table Student
Student
(
StudentId PK,
FirstName ,
LastName,
GenderId,
ExperienceId,
NationalityId,
CityId
)
Based on a lot filters (multiple cities, gender, multiple experiences (1, 2, 3), multiple nationalites), I create a CTE by using dynamic sql and joining the student table with a user defined tables (CityTable, NationalityTable,...)
After that I have to retrieve the count of student by each filter like
CityId City Count
NationalityId Nationality Count
Same thing the other filter.
Can I do something like
;With CTE(
Select
FROM Student
Inner JOIN ...
INNER JOIN ....)
SELECT CityId,City,Count(studentId)
FROm CTE
GROUP BY CityId,City
SELECT GenderId,Gender,Count
FROM CTE
GROUP BY GenderId,Gender
I want to something like what LinkedIn is doing with search(people search,job search)
http://www.linkedin.com/search/fpsearch?type=people&keywords=sales+manager&pplSearchOrigin=GLHD&pageKey=member-home
It's so fast and do the same thing.
You can not use multiple select but you can use more than one CTE like this.
WITH CTEA
AS
(
SELECT 'Coulmn1' A,'Coulmn2' B
),
CETB
AS
(
SELECT 'CoulmnX' X,'CoulmnY' Y
)
SELECT * FROM CTEA, CETB
For getting count use RowNumber and CTE some think like this.
ROW_NUMBER() OVER ( ORDER BY COLUMN NAME )AS RowNumber,
Count(1) OVER() AS TotalRecordsFound
Please let me know if you need more information on this.
Sample for your reference.
With CTE AS (
Select StudentId, S.CityId, S.GenderId
FROM Student S
Inner JOIN CITY C
ON S.CityId = C.CityId
INNER JOIN GENDER G
ON S.GenderId = G.GenderId)
,
GENDER
AS
(
SELECT GenderId
FROM CTE
GROUP BY GenderId
)
SELECT * FROM GENDER, CTE
It is not possible to get multiple result sets from a single CTE.
You can however use a table variable to cache some of the information and use it later instead of issuing the same complex query multiple times:
declare #relevantStudent table (StudentID int);
insert into #relevantStudent
select s.StudentID from Students s
join ...
where ...
-- now issue the multiple queries
select s.GenderID, count(*)
from student s
join #relevantStudent r on r.StudentID = s.StudentID
group by s.GenderID
select s.CityID, count(*)
from student s
join #relevantStudent r on r.StudentID = s.StudentID
group by s.CityID
The trick is to store only the minimum required information in the table variable.
As with any query whether this will actually improve performance vs. issuing the queries independently depends on many things (how big the table variable data set is, how complex is the query used to populate it and how complex are the subsequent joins/subselects against the table variable, etc.).
Do a UNION ALL to do multiple SELECT and concatenate the results together into one table.
;WITH CTE AS(
SELECT
FROM Student
INNER JOIN ...
INNER JOIN ....)
SELECT CityId,City,Count(studentId),NULL,NULL
FROM CTE
GROUP BY CityId,City
UNION ALL
SELECT NULL,NULL,NULL,GenderId,Gender,Count
FROM CTE
GROUP BY GenderId,Gender
Note: The NULL values above just allow the two results to have matching columns, so the results can be concatenated.
I know this is a very old question, but here's a solution I just used. I have a stored procedure that returns a PAGE of search results, and I also need it to return the total count matching the query parameters.
WITH results AS (...complicated foo here...)
SELECT results.*,
CASE
WHEN #page=0 THEN (SELECT COUNT(*) FROM results)
ELSE -1
END AS totalCount
FROM results
ORDER BY bar
OFFSET #page * #pageSize ROWS FETCH NEXT #pageSize ROWS ONLY;
With this approach, there's a small "hit" on the first results page to get the count, and for the remaining pages, I pass back "-1" to avoid the hit (I assume the number of results won't change during the user session). Even though totalCount is returned for every row of the first page of results, it's only computed once.
My CTE is doing a bunch of filtering based on stored procedure arguments, so I couldn't just move it to a view and query it twice. This approach allows avoid having to duplicate the CTE's logic just to get a count.

how to count number of tables/views/index in my database

how to count number of tables/views/index in my database
I am using sybase 11
select count(*) from sysobjects where type = 'U'
should get you the number of user tables. You can also use type = 'V' to count views.
select count(*) from sysindexes
will give you an index count. You may need to further filter both though, depending on which types of indexes you want.
sysobjects reference here.
sysindexes reference here.
For oracle
Count Tables:
SELECT COUNT(*) FROM user_tables;
Count Sequences
SELECT COUNT(*) FROM user_sequences;
Count Views
SELECT COUNT(*) FROM user_views;
Count Indexes
SELECT COUNT(*) FROM user_indexes;
Hi Hope this below sql works
SELECT COUNT(*) FROM USER_TABLES;
will return you number of tables in respective database.

FreeText COUNT query on multiple tables is super slow

I have two tables:
**Product**
ID
Name
SKU
**Brand**
ID
Name
Product table has about 120K records
Brand table has 30K records
I need to find count of all the products with name and brand matching a specific keyword.
I use freetext 'contains' like this:
SELECT count(*)
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE (contains(Product.Name, 'pants')
or
contains(Brand.Name, 'pants'))
This query takes about 17 secs.
I rebuilt the FreeText index before running this query.
If I only check for Product.Name. They query is less then 1 sec. Same, if I only check the Brand.Name. The issue occurs if I use OR condition.
If I switch query to use LIKE:
SELECT count(*)
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE Product.Name LIKE '%pants%'
or
Brand.Name LIKE '%pants%'
It takes 1 secs.
I read on MSDN that: http://msdn.microsoft.com/en-us/library/ms187787.aspx
To search on multiple tables, use a
joined table in your FROM clause to
search on a result set that is the
product of two or more tables.
So I added an INNER JOINED table to FROM:
SELECT count(*)
FROM (select Product.Name ProductName, Product.SKU ProductSKU, Brand.Name as BrandName FROM Product
inner join Brand
on product.BrandID = Brand.ID) as TempTable
WHERE
contains(TempTable.ProductName, 'pants')
or
contains(TempTable.BrandName, 'pants')
This results in error:
Cannot use a CONTAINS or FREETEXT predicate on column 'ProductName' because it is not full-text indexed.
So the question is - why OR condition could be causing such as slow query?
After a bit of trial an error I found a solution that seems to work. It involves creating an indexed view:
CREATE VIEW [dbo].[vw_ProductBrand]
WITH SCHEMABINDING
AS
SELECT dbo.Product.ID, dbo.Product.Name, dbo.Product.SKU, dbo.Brand.Name AS BrandName
FROM dbo.Product INNER JOIN
dbo.Brand ON dbo.Product.BrandID = dbo.Brand.ID
GO
CREATE UNIQUE CLUSTERED INDEX IX_VW_PRODUCTBRAND_ID
ON vw_ProductBrand (ID);
GO
If I run the following query:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
GO
SELECT count(*)
FROM Product
inner join vw_ProductBrand
on Product.BrandID = vw_ProductBrand.ID
WHERE (contains(vw_ProductBrand.Name, 'pants')
or
contains( vw_ProductBrand.BrandName, 'pants'))
It now takes 1 sec again.
I ran into a similar problem but i fixed it with union, something like:
SELECT *
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE contains(Product.Name, 'pants')
UNION
SELECT *
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE contains(Brand.Name, 'pants'))
Have you tried something like:
SELECT count(*)
FROM Product
INNER JOIN Brand ON Product.BrandID = Brand.ID
WHERE CONTAINS((Product.Name, Brand.Name), 'pants')

Is there a way to optimize the query given below

I have the following Query and i need the query to fetch data from SomeTable based on the filter criteria present in the Someothertable. If there is nothing present in SomeOtherTable Query should return me all the data present in SomeTable
SQL SERVER 2005
SomeOtherTable does not have any indexes or any constraint all fields are char(50)
The Following Query work fine for my requirements but it causes performance problems when i have lots of parameters.
Due to some requirement of Client, We have to keep all the Where clause data in SomeOtherTable. depending on subid data will be joined with one of the columns in SomeTable.
For example the Query can can be
SELECT
*
FROM
SomeTable
WHERE
1=1
AND
(
SomeTable.ID in (SELECT DISTINCT ID FROM SomeOtherTable WHERE Name = 'ABC' and subid = 'EF')
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC' and subid = 'EF')
)
AND
(
SomeTable.date =(SELECT date FROM SomeOtherTable WHERE Name = 'ABC' and subid = 'Date')
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC' and subid = 'Date')
)
EDIT----------------------------------------------
I think i might have to explain my problem in detail:
We have developed an ASP.net application that is used to invoke parametrize crystal reports, parameters to the crystal reports are not passed using the default crystal reports method.
In ASP.net application we have created wizards which are used to pass the parameters to the Reports, These parameters are not directly consumed by the crystal report but are consumed by the Query embedded inside the crystal report or the Stored procedure used in the Crystal report.
This is achieved using a table (SomeOtherTable) which holds parameter data as long as report is running after which the data is deleted, as such we can assume that SomeOtherTable has max 2 to 3 rows at any given point of time.
So if we look at the above query initial part of the Query can be assumed as the Report Query and the where clause is used to get the user input from the SomeOtherTable table.
So i don't think it will be useful to create indexes etc (May be i am wrong).
SomeOtherTable does not have any
indexes or any constraint all fields
are char(50)
Well, there's your problem. There's nothing you can do to a query like this which will improve its performance if you create it like this.
You need a proper primary or other candidate key designated on all of your tables. That is to say, you need at least ONE unique index on the table. You can do this by designating one or more fields as the PK, or you can add a UNIQUE constraint or index.
You need to define your fields properly. Does the field store integers? Well then, an INT field may just be a better bet than a CHAR(50).
You can't "optimize" a query that is based on an unsound schema.
Try:
SELECT
*
FROM
SomeTable
LEFT JOIN SomeOtherTable ON SomeTable.ID=SomeOtherTable.ID AND Name = 'ABC'
WHERE
1=1
AND
(
SomeOtherTable.ID IS NOT NULL
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC')
)
also put 'with (nolock)' after each table name to improve performance
The following might speed you up
SELECT *
FROM SomeTable
WHERE
SomeTable.ID in
(SELECT DISTINCT ID FROM SomeOtherTable Where Name = 'ABC')
UNION
SELECT *
FROM SomeTable
Where
NOT EXISTS (Select spName From SomeOtherTable Where spName = 'ABC')
The UNION will effectivly split this into two simpler queries which can be optiomised separately (depends very much on DBMS, table size etc whether this will actually improve performance -- but its always worth a try).
The "EXISTS" key word is more efficient than the "SELECT COUNT(1)" as it will return true as soon as the first row is encountered.
Or check if the value exists in db first
And you can remove the distinct keyword in your query, it is useless here.
if EXISTS (Select spName From SomeOtherTable Where spName = 'ABC')
begin
SELECT *
FROM SomeTable
WHERE
SomeTable.ID in
(SELECT ID FROM SomeOtherTable Where Name = 'ABC')
end
else
begin
SELECT *
FROM SomeTable
end
Aloha
Try
select t.* from SomeTable t
left outer join SomeOtherTable o
on t.id = o.id
where (not exists (select id from SomeOtherTable where spname = 'adbc')
OR spname = 'adbc')
-Edoode
change all your select statements in the where part to inner jons.
the OR conditions should be union all-ed.
also make sure your indexing is ok.
sometimes it pays to have an intermediate table for temp results to which you can join to.
It seems to me that there is no need for the "1=1 AND" in your query. 1=1 will always evaluate to be true, leaving the software to evaluate the next part... why not just skip the 1=1 and evaluate the juicy part?
I am going to stick to my original Query.

Resources