FreeText COUNT query on multiple tables is super slow

I have two tables:
**Product**
ID
Name
SKU
**Brand**
ID
Name
The Product table has about 120K records and the Brand table has about 30K.
I need to count all the products whose name or brand matches a specific keyword.
I use the full-text CONTAINS predicate like this:
SELECT count(*)
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE (contains(Product.Name, 'pants')
or
contains(Brand.Name, 'pants'))
This query takes about 17 secs.
I rebuilt the FreeText index before running this query.
If I only check Product.Name, the query takes less than 1 sec. The same is true if I only check Brand.Name. The issue occurs only when I combine the two with OR.
If I switch query to use LIKE:
SELECT count(*)
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE Product.Name LIKE '%pants%'
or
Brand.Name LIKE '%pants%'
It takes 1 sec.
I read on MSDN (http://msdn.microsoft.com/en-us/library/ms187787.aspx) that:
"To search on multiple tables, use a joined table in your FROM clause to search on a result set that is the product of two or more tables."
So I moved the INNER JOIN into a derived table in the FROM clause:
SELECT count(*)
FROM (select Product.Name ProductName, Product.SKU ProductSKU, Brand.Name as BrandName FROM Product
inner join Brand
on product.BrandID = Brand.ID) as TempTable
WHERE
contains(TempTable.ProductName, 'pants')
or
contains(TempTable.BrandName, 'pants')
This results in error:
Cannot use a CONTAINS or FREETEXT predicate on column 'ProductName' because it is not full-text indexed.
So the question is: why would the OR condition cause such a slow query?

After a bit of trial and error I found a solution that seems to work. It involves creating an indexed view:
CREATE VIEW [dbo].[vw_ProductBrand]
WITH SCHEMABINDING
AS
SELECT dbo.Product.ID, dbo.Product.Name, dbo.Product.SKU, dbo.Brand.Name AS BrandName
FROM dbo.Product INNER JOIN
dbo.Brand ON dbo.Product.BrandID = dbo.Brand.ID
GO
CREATE UNIQUE CLUSTERED INDEX IX_VW_PRODUCTBRAND_ID
ON vw_ProductBrand (ID);
GO
If I run the following query:
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE
GO
SELECT count(*)
FROM Product
inner join vw_ProductBrand
on Product.BrandID = vw_ProductBrand.ID
WHERE (contains(vw_ProductBrand.Name, 'pants')
or
contains( vw_ProductBrand.BrandName, 'pants'))
It now takes 1 sec again.
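For CONTAINS to be accepted on the view's columns, the view itself needs a full-text index, and the unique clustered index created above can serve as its key index. The following is a minimal sketch of that step, not the poster's confirmed setup. It assumes SQL Server 2008 or later, and the catalog name ftProductCatalog is hypothetical.

```sql
-- Hypothetical catalog name; reuse an existing full-text catalog if you have one.
CREATE FULLTEXT CATALOG ftProductCatalog;
GO
-- The key index must be a unique, single-column, non-nullable index on the view.
CREATE FULLTEXT INDEX ON dbo.vw_ProductBrand (Name, BrandName)
    KEY INDEX IX_VW_PRODUCTBRAND_ID
    ON ftProductCatalog;
GO
-- Population runs in the background, so results may be incomplete
-- until the full-text index has finished populating.
SELECT COUNT(*)
FROM dbo.vw_ProductBrand
WHERE CONTAINS((Name, BrandName), 'pants');
```

Because both columns now live in one full-text indexed object, a single CONTAINS with a column list can replace the two predicates joined by OR.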

I ran into a similar problem, but I fixed it with a UNION, something like:
SELECT *
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE contains(Product.Name, 'pants')
UNION
SELECT *
FROM Product
inner join Brand
on Product.BrandID = Brand.ID
WHERE contains(Brand.Name, 'pants')
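Since the original question asks for a count rather than the rows themselves, the UNION can be wrapped in a derived table. This is a sketch under the assumption that Product.ID is unique; the alias names p, b and matches are illustrative.

```sql
SELECT COUNT(*)
FROM (
    -- Products whose own name matches.
    SELECT p.ID
    FROM Product AS p
    INNER JOIN Brand AS b ON p.BrandID = b.ID
    WHERE CONTAINS(p.Name, 'pants')
    UNION
    -- Products whose brand name matches.
    SELECT p.ID
    FROM Product AS p
    INNER JOIN Brand AS b ON p.BrandID = b.ID
    WHERE CONTAINS(b.Name, 'pants')
) AS matches;
```

UNION (as opposed to UNION ALL) removes duplicates, so a product that matches on both its name and its brand is counted once, which is the same result the OR version produces.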

Have you tried something like:
SELECT count(*)
FROM Product
INNER JOIN Brand ON Product.BrandID = Brand.ID
WHERE CONTAINS((Product.Name, Brand.Name), 'pants')

Related

Does NOT EXISTS (in the absence of a WHERE clause in the subquery) implicitly assume that the WHERE clause is on the column selected by the subquery?

Link: https://blog.udemy.com/sql-not-exists/
select * from customers
where NOT EXISTS (
select customerID from orders)
I had always thought that the subquery should have a WHERE clause that looks up, in the subquery table (orders), the value from the outer table (customers).
However, the example above has no WHERE clause in the subquery.
For example, I would have written it like this:
select * from customers c
where NOT EXISTS (
select 1 from orders o
where c.customerID=o.customerID)
So does the subquery in the first example implicitly apply a WHERE clause on customerID?

The queries do different things. The first one returns records only if a condition that is independent of each row is true. It is like the following:
select * from customers
where 1 = 0
select * from customers
where 1 = 1
In such cases, the condition does not refer to the data in the outer table. You can use this, for example, to return or not return rows based on a specific input parameter.
In your case, I expect that NOT EXISTS (select customerID from orders) will always be false as long as orders contains at least one row, so no data is returned.
The second query does what you actually want: it returns customers that do not have any orders. I prefer using LEFT JOINs for such queries. In AdventureWorks2012 the three variants look like this:
select count(*)
from Production.Product
where NOT EXISTS (
select ProductID from Sales.SalesOrderDetail)
select count(*)
from Production.Product c
where NOT EXISTS (
select 1 from Sales.SalesOrderDetail o
where c.ProductID=o.ProductID)
SELECT count(*)
from Production.Product c
LEFT JOIN Sales.SalesOrderDetail o
ON c.ProductID=o.ProductID
WHERE o.ProductID IS NULL
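The difference between the three forms is easier to see on a tiny data set. The sketch below uses table variables with made-up rows (the names and values are illustrative, not taken from the question) so that it runs on its own.

```sql
DECLARE @customers TABLE (customerID int PRIMARY KEY, name varchar(20));
DECLARE @orders    TABLE (orderID int PRIMARY KEY, customerID int);

INSERT INTO @customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO @orders    VALUES (10, 1), (11, 1), (12, 3);

-- Uncorrelated: @orders is not empty, so NOT EXISTS is false for every row.
SELECT COUNT(*) AS uncorrelated_count
FROM @customers
WHERE NOT EXISTS (SELECT customerID FROM @orders);   -- 0

-- Correlated: evaluated per customer; only customer 2 has no orders.
SELECT c.customerID
FROM @customers AS c
WHERE NOT EXISTS (SELECT 1 FROM @orders AS o
                  WHERE o.customerID = c.customerID); -- 2

-- LEFT JOIN form: unmatched customers come back with NULL order columns.
SELECT c.customerID
FROM @customers AS c
LEFT JOIN @orders AS o ON o.customerID = c.customerID
WHERE o.customerID IS NULL;                            -- 2
```

The column list inside an EXISTS subquery is irrelevant; only whether the subquery returns any row matters, which is why the first form never looks at customerID at all.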

SQL queries combined into one row

I'm having some difficulty combining the following queries, so that the results display in one row rather than in multiple rows:
SELECT value FROM dbo.parameter WHERE name='xxxxx.name'
SELECT dbo.contest.name AS Event_Name
FROM contest
INNER JOIN open_box on open_box.contest_id = contest.id
GROUP BY dbo.contest.name
SELECT COUNT(*) FROM open_option AS total_people
SELECT SUM(scanned) AS TotalScanned,SUM(number) AS Totalnumber
FROM dbo.open_box
GROUP BY contest_id
SELECT COUNT(*) FROM open AS reff
WHERE refer = 'True'
I would like to display data from the fields in each column similar to what is shown in the image below. Any help is appreciated!

Tab's solution is fine, I just wanted to show an alternative way of doing this. The following statement uses subqueries to get the information in one row:
SELECT
[xxxx.name]=(SELECT value FROM dbo.parameter WHERE name='xxxxx.name'),
[Event Name]=(SELECT dbo.contest.name
FROM contest
INNER JOIN open_box on open_box.contest_id = contest.id
GROUP BY dbo.contest.name),
[Total People]=(SELECT COUNT(*) FROM open_option),
[Total Scanned]=(SELECT SUM(scanned)
FROM dbo.open_box
GROUP BY contest_id),
[Total Number]=(SELECT SUM(number)
FROM dbo.open_box
GROUP BY contest_id),
Ref=(SELECT COUNT(*) FROM open WHERE refer = 'True');
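This requires Total Scanned and Total Number to be queried separately. Each subquery must also return exactly one value, so the ones with GROUP BY will fail if they produce more than one group.
Update: if you then want to INSERT that into another table, there are essentially two ways to do that.
Create the table directly from the SELECT statement: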
SELECT
-- the fields from the first query
INTO
[database_name].[schema_name].[new_table_name]; -- creates table new_table_name
Or insert into a table that already exists with INSERT ... SELECT:
INSERT INTO [database_name].[schema_name].[existing_table_name](
-- the fields in the existing_table_name
)
SELECT
-- the fields from the first query

Just CROSS JOIN the five queries as derived tables:
SELECT * FROM (
Query1
) AS q1
CROSS JOIN (
Query2
) AS q2
CROSS JOIN (...
Assuming that each of your individual queries only returns one row, then this CROSS JOIN should result in only one row.
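A small self-contained sketch of the CROSS JOIN approach, using a table variable with invented rows in place of the real open_box table; the BoxCount column is only there to provide a second single-row query.

```sql
DECLARE @open_box TABLE (contest_id int, scanned int, number int);
INSERT INTO @open_box VALUES (1, 5, 10), (1, 7, 12);

SELECT q1.TotalScanned, q1.TotalNumber, q2.BoxCount
FROM (
    -- Single-row aggregate: no GROUP BY, so exactly one row comes back.
    SELECT SUM(scanned) AS TotalScanned, SUM(number) AS TotalNumber
    FROM @open_box
) AS q1
CROSS JOIN (
    SELECT COUNT(*) AS BoxCount
    FROM @open_box
) AS q2;
-- One row: TotalScanned = 12, TotalNumber = 22, BoxCount = 2
```

Unlike the scalar-subquery version, a derived table can contribute several columns at once, so both sums come from a single pass over the table. If any derived table returned more than one row, the result would be multiplied accordingly.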

How to optimize view performance in SQL Server 2012 by indexing

I have a view like this:
create view dbo.VEmployeeSalesOrders
as
select
employees.employeeID, Products.productID,
Sum(Price * Quantity) as Total,
salesDate,
COUNT_BIG() as [RecordCount]
from
dbo.Employees
inner join
dbo.sales on employees.employeeID = sales.employeeID
inner join
dbo.products on sales.productID = products.ProductID
group by
Employees.employeeID, products.ProductID, salesDate
When I select * from dbo.VEmployeeSalesOrders, it accounts for 97% of the execution plan cost. I need it to be faster.
When I try to create an index on the view, it fails with the following message:
select list doesn't include a proper use on count_Big()
Why am I getting this error?

1- First, alter your view so that it contains COUNT_BIG(*). Because the view uses an aggregate function with GROUP BY, SQL Server requires COUNT_BIG(*) in the select list so that it can track the number of rows in each group. A view must also be created WITH SCHEMABINDING before it can be indexed.
Like this:
create view dbo.VEmployeeSalesOrders
with schemabinding
as
select employees.employeeID,Products.productID,Sum(Price*Quantity)
as Total,salesDate,COUNT_BIG(*) as [RecordCount]
from dbo.Employees
inner join dbo.sales on employees.employeeID=sales.employeeID
inner join dbo.products on sales.productID=products.ProductID
group by Employees.employeeID,products.ProductID,salesDate
2- Then create the index like this:
Create Unique Clustered Index Cidx_IndexName
on dbo.VEmployeeSalesOrders(employeeID,ProductID,SalesDate)
Hope it works.

Small ms Sql query to get the max of an id with some criteria

I want a SQL query to get the above result. The result is the maximum Id in TableA whose s_id has Stat = true (i.e. 1) in TableB.
The following does not do what I want:
select i.category_id,i.image_id,i.image_original,i.image_title,i.photographer
from images i
inner join schedule s
on i.scheduleid=s.scheduleid
and s.status='live'
where image_id=(select max(image_id) from images)

Use TOP to retrieve only 1 row
Use ORDER BY to control the sorting, so you get the single row you want
SELECT TOP(1) a.id, a.[image], a.s_id, b.stat, b.[desc]
FROM TableA a
JOIN TableB b on a.s_id = b.s_id
WHERE b.stat = 1
ORDER BY A.ID DESC
An SQLFiddle showing this.

Updating a table with missing records from another table

I am running a query that returns the records from the left side of the EXCEPT that are not returned by the right-hand query:
USE AdventureWorks;
GO
SELECT ProductID
FROM Production.Product
EXCEPT
SELECT ProductID
FROM Production.WorkOrder;
Let's say 6 records are returned (there are 6 records in the Production.Product table that are not in Production.WorkOrder).
How would I write the query to insert those 6 records into the Production.WorkOrder table?

insert into workorder (productid)
select productid from product where productid not in (select productid from workorder)
This will insert into workorder all the productid's in the product table that aren't already in workorder.

I'd use a left join, like this:
USE AdventureWorks;
GO
SELECT p.ProductID
FROM Production.Product p
LEFT
JOIN Production.WorkOrder wo
ON p.ProductID = wo.ProductID
WHERE wo.ProductID IS NULL;
This will return all the ProductID values from Product that do not appear in WorkOrder. The concern with the other answer (WHERE NOT IN) is that the sub-query may be evaluated once per row, and if the tables are large, that can be really slow. A LEFT JOIN matches the rows up in a single join. On a small table there won't be much difference in practice, but on a larger table or in a production database the difference can be significant, depending on the execution plan the optimizer chooses.
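The SELECT above can be turned into the insert the question asks for. The sketch below uses table variables with invented rows so it runs on its own; the nullable ProductID column and the NULL row are assumptions added purely to show how the LEFT JOIN form and NOT IN differ, and a real insert into Production.WorkOrder would also have to supply that table's other required columns.

```sql
DECLARE @Product   TABLE (ProductID int PRIMARY KEY);
DECLARE @WorkOrder TABLE (WorkOrderID int IDENTITY(1,1) PRIMARY KEY,
                          ProductID int NULL);

INSERT INTO @Product VALUES (1), (2), (3);
INSERT INTO @WorkOrder (ProductID) VALUES (1), (NULL);

-- Insert every ProductID that has no matching row in @WorkOrder.
INSERT INTO @WorkOrder (ProductID)
SELECT p.ProductID
FROM @Product AS p
LEFT JOIN @WorkOrder AS wo ON wo.ProductID = p.ProductID
WHERE wo.ProductID IS NULL;

-- @WorkOrder now also contains ProductID 2 and 3.
-- With the NULL row present, "WHERE ProductID NOT IN (SELECT ProductID
-- FROM @WorkOrder)" would have matched no rows and inserted nothing.
SELECT ProductID FROM @WorkOrder;
```

The NULL behaviour is the more dependable reason to prefer LEFT JOIN ... IS NULL or NOT EXISTS over NOT IN when the subquery column is nullable.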

Just turn your query into an INSERT?
INSERT INTO Production.WorkOrder(ProductID, ...)
SELECT ProductID, ...
FROM Production.Product
EXCEPT
SELECT ProductID
FROM Production.WorkOrder;
