What does a select statement inside a select statement mean? - sql-server

What does a SELECT statement inside a FROM clause mean in Transact-SQL?
I mean something like this:
.. from (
select ..
)
Also, I need to know if the statement is bad for performance. Can you provide me a link to the official documentation about this topic in Transact SQL?

I think you are talking about a subquery. A subquery is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved.
Please refer to this link: http://www.tutorialspoint.com/sql/sql-sub-queries.htm

See this link on MSDN about Subquery Fundamentals.
Subqueries can be fine, but be warned that they are not indexed. If the outer part of the query must join to the results of the subquery, performance will likely suffer. Note that the query optimizer may also choose a different execution order for your query, so even if you "start from" a subquery, the optimizer may start the query somewhere else and join to your subquery.
Correlated Subqueries (Joe Stefanelli linked here first in the comments above) are another performance problem. Any time you have a query that must be run repeatedly for the results of an outer query, performance will suffer.
See this link about Common Table Expressions (CTEs). CTEs may be a better way to write your query. Other alternatives to subqueries include @table variables and #temporary tables.
One of the most common uses of subqueries is when updating a table. You cannot have an aggregate function in the SET list of an UPDATE statement. You have to calculate the aggregate in a subquery, then join back to the main query to update the table. For example:
-- A table of state and the total sales per state
declare @States table
(
    ID varchar(2) primary key,
    totalSales decimal(10,2)
)
-- Individual sales per state
declare @Sales table
(
    salesKey int identity(1,1) primary key,
    stateID varchar(2),
    sales decimal(10,2)
)
-- Generate test data with no sales totalled
insert into @States (ID, totalSales)
select 'CA', 0
union select 'NY', 0
-- Test sales
insert into @Sales (stateID, sales)
select 'CA', 5000
union select 'NY', 5500
-- This query will cause an error:
-- Msg 157, Level 15, State 1, Line 13
-- An aggregate may not appear in the set list of an UPDATE statement.
update @States
set totalSales = SUM(sales)
from @States
inner join @Sales on stateID = ID
-- This query will succeed, because the subquery performs the aggregate
update @States
set totalSales = sumOfSales
from
(
    select stateID, SUM(sales) as sumOfSales
    from @Sales
    group by stateID
) salesSubQuery
inner join @States on ID = stateID
select * from @States

You'll find lots of information on this with a quick search. For example, see
Subquery Fundamentals from MSDN
A subquery is a query that is nested inside a SELECT, INSERT, UPDATE,
or DELETE statement, or inside another subquery. A subquery can be
used anywhere an expression is allowed.
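To make the "select inside a from" shape concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of SQL Server, purely so it can be run anywhere, and the table and column names (sales, state_id, amount) are made up for illustration. The inner SELECT in the FROM clause is usually called a derived table: it gets an alias, and the outer query treats it like any other table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (state_id TEXT, amount REAL);
    INSERT INTO sales VALUES ('CA', 5000), ('CA', 1500), ('NY', 5500);
""")

# The inner SELECT is a derived table aliased as totals;
# the outer query filters its rows like rows of any other table.
derived_rows = conn.execute("""
    SELECT totals.state_id, totals.total_amount
    FROM (
        SELECT state_id, SUM(amount) AS total_amount
        FROM sales
        GROUP BY state_id
    ) AS totals
    WHERE totals.total_amount > 6000
    ORDER BY totals.state_id
""").fetchall()

print(derived_rows)
```

The same SELECT ... FROM ( SELECT ... ) AS alias structure is valid in Transact-SQL; only the Python wrapper is specific to this sketch.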

Related

What is the "lifespan" of a postgres CTE expression? e.g. WITH... AS

I have a CTE I am using to pull some data from two tables then stick in an intermediate table called cte_list, something like
with cte_list as (
select pl.col_val from prune_list pl join employees.employee emp on pl.col_val::uuid = emp.id
where pl.col_nm = 'employee_ref_id' limit 100
)
Then, I am doing an insert to move records from the cte_list to another archive table (if they don't exist) called employee_arch_test
insert into employees.employee_arch_test (
select * from employees.employee where id in (select col_val::uuid from cte_list)
and not exists (select 1 from employees.employee_arch_test where employees.employee_arch_test.id=employees.employee.id)
);
This seems to work fine. The problem is when I add another statement after, to do some deletions from the main employee table using this aforementioned cte_list - the cte_list apparently no longer exists?
SQL Error [42P01]: ERROR: relation "cte_list" does not exist
the actual delete query:
delete from employees.employee where id in (select col_val::uuid from cte_list);
Can the cte_list CTE table only be used once or something? I'm running these statements in a LOOP and I need to run the exact same calls for about 2 or 3 other tables but hit a sticking point here.
A CTE only exists for the duration of the statement of which it's a part. I gather you have an INSERT statement with the CTE preceding it:
with cte_list
as (select pl.col_val
from prune_list pl
join employees.employee emp
on pl.col_val::uuid = emp.id
where pl.col_nm = 'employee_ref_id'
limit 100
)
insert into employees.employee_arch_test
(select *
from employees.employee
where id in (select col_val::uuid from cte_list)
and not exists (select 1
from employees.employee_arch_test
where employees.employee_arch_test.id = employees.employee.id)
);
The CTE is part of the INSERT statement - it is not a separate statement by itself. It only exists for the duration of the INSERT statement.
If you need something which lasts longer your options are:
Add the same CTE to each of your following statements. Note that because data may be changing in your database each invocation of the CTE may return different data.
Create a view which performs the same operations as the CTE, then use the view in place of the CTE. Note that because data may be changing in your database each invocation of the view may return different data.
Create a temporary table to hold the data from your CTE query, then use the temporary table in place of the CTE. This has the advantage of providing a consistent set of data to all operations.
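The statement-level lifetime of a CTE, and the temporary-table option, can be sketched as follows. This is an illustration only: it runs against SQLite through Python's sqlite3 module rather than Postgres, the tables are simplified (integer ids instead of uuids, no schema prefix), and tmp_list is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee_arch (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE prune_list (col_val INTEGER);
    INSERT INTO employee VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO prune_list VALUES (1), (2);
""")

# The CTE is visible only inside the single INSERT statement it prefixes.
conn.execute("""
    WITH cte_list AS (SELECT col_val FROM prune_list)
    INSERT INTO employee_arch
    SELECT * FROM employee WHERE id IN (SELECT col_val FROM cte_list)
""")

# A later statement cannot see cte_list: the name no longer exists.
try:
    conn.execute("DELETE FROM employee WHERE id IN (SELECT col_val FROM cte_list)")
    cte_visible_later = True
except sqlite3.OperationalError:
    cte_visible_later = False

# A temporary table lasts for the session, so later statements can reuse it.
conn.execute("CREATE TEMP TABLE tmp_list AS SELECT col_val FROM prune_list")
conn.execute("DELETE FROM employee WHERE id IN (SELECT col_val FROM tmp_list)")

remaining = conn.execute("SELECT id FROM employee ORDER BY id").fetchall()
archived = conn.execute("SELECT id FROM employee_arch ORDER BY id").fetchall()
print(cte_visible_later, remaining, archived)
```

Because the temporary table is filled once, the archive insert and the delete can both read exactly the same set of ids, which is the consistency advantage mentioned above.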

sub-query return more than 1 value this is not permitted when the subquery follows = , >=,>,<,<= [duplicate]

I run the following query:
SELECT
orderdetails.sku,
orderdetails.mf_item_number,
orderdetails.qty,
orderdetails.price,
supplier.supplierid,
supplier.suppliername,
supplier.dropshipfees,
cost = (SELECT supplier_item.price
FROM supplier_item,
orderdetails,
supplier
WHERE supplier_item.sku = orderdetails.sku
AND supplier_item.supplierid = supplier.supplierid)
FROM orderdetails,
supplier,
group_master
WHERE invoiceid = '339740'
AND orderdetails.mfr_id = supplier.supplierid
AND group_master.sku = orderdetails.sku
I get the following error:
Msg 512, Level 16, State 1, Line 2
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Any ideas?
Try this:
SELECT
od.Sku,
od.mf_item_number,
od.Qty,
od.Price,
s.SupplierId,
s.SupplierName,
s.DropShipFees,
si.Price as cost
FROM
OrderDetails od
INNER JOIN Supplier s on s.SupplierId = od.Mfr_ID
INNER JOIN Group_Master gm on gm.Sku = od.Sku
INNER JOIN Supplier_Item si on si.SKU = od.Sku and si.SupplierId = s.SupplierID
WHERE
od.invoiceid = '339740'
This will return multiple rows that are identical except for the cost column. Look at the different cost values that are returned and figure out what is causing the different values. Then ask somebody which cost value they want, and add the criteria to the query that will select that cost.
Check to see if there are any triggers on the table you are trying to execute queries against. They can sometimes throw this error as they are trying to run the update/select/insert trigger that is on the table.
You can modify your query to disable then enable the trigger if the trigger DOES NOT need to be executed for whatever query you are trying to run.
ALTER TABLE your_table DISABLE TRIGGER [the_trigger_name]
UPDATE your_table
SET Gender = 'Female'
WHERE (Gender = 'Male')
ALTER TABLE your_table ENABLE TRIGGER [the_trigger_name]
SELECT COLUMN
FROM TABLE
WHERE columns_name
IN ( SELECT COLUMN FROM TABLE WHERE columns_name = 'value');
Note: when using a subquery, keep these points in mind:
If the subquery returns exactly one value, use a comparison operator (=, !=, <>, <, >, ...).
If it can return more than one value, use IN, ANY, ALL, or SOME.
cost = Select Supplier_Item.Price from Supplier_Item,orderdetails,Supplier
where Supplier_Item.SKU=OrderDetails.Sku and
Supplier_Item.SupplierId=Supplier.SupplierID
This subquery returns multiple values, SQL is complaining because it can't assign multiple values to cost in a single record.
Some ideas:
Fix the data such that the existing subquery returns only 1 record
Fix the subquery such that it only returns one record
Add a top 1 and order by to the subquery (nasty solution that DBAs hate - but it "works")
Use a user defined function to concatenate the results of the subquery into a single string
The fix is to stop using correlated subqueries and use joins instead. Correlated subqueries are essentially cursors, as they cause the query to run row by row, and should be avoided.
If you want only one record to match, you may need a derived table in the join to get the value you want in the field. If you need both values, an ordinary join will return them, but you will get multiple records for the same id in the result set. If you only want one, you need to decide which one and express that in the code: you could use TOP 1 with an ORDER BY, MAX(), MIN(), and so on, depending on what your real requirement for the data is.
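As a rough sketch of the "derived table in the join" idea, the following uses SQLite through Python's sqlite3 module as a stand-in for SQL Server. The tables are cut down to a few invented columns, and MIN is an arbitrary choice of rule for picking one price.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_details (sku TEXT, supplier_id INTEGER, qty INTEGER);
    CREATE TABLE supplier_item (sku TEXT, supplier_id INTEGER, price REAL);
    INSERT INTO order_details VALUES ('A1', 10, 2);
    -- Two price rows for the same sku and supplier: the cause of the ambiguity.
    INSERT INTO supplier_item VALUES ('A1', 10, 4.0), ('A1', 10, 3.5);
""")

# A plain join repeats the order line once per matching price row.
plain_join = conn.execute("""
    SELECT od.sku, si.price
    FROM order_details od
    JOIN supplier_item si
      ON si.sku = od.sku AND si.supplier_id = od.supplier_id
    ORDER BY si.price
""").fetchall()

# Joining to a derived table that aggregates first yields one row per order line.
# MIN is an arbitrary choice here; the right rule depends on the real requirement.
one_cost = conn.execute("""
    SELECT od.sku, best.cost
    FROM order_details od
    JOIN (
        SELECT sku, supplier_id, MIN(price) AS cost
        FROM supplier_item
        GROUP BY sku, supplier_id
    ) AS best
      ON best.sku = od.sku AND best.supplier_id = od.supplier_id
""").fetchall()

print(plain_join, one_cost)
```

The plain join shows why the scalar subquery failed (two candidate prices for one order line), and the derived table makes the tie-breaking rule explicit instead of leaving it to chance.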
I had the same problem. I used IN instead of =. Here is an example from the Northwind database.
Query: find the companies that placed orders in 1997.
Try this:
SELECT CompanyName
FROM Customers
WHERE CustomerID IN (
SELECT CustomerID
FROM Orders
WHERE YEAR(OrderDate) = '1997'
);
Instead of this:
SELECT CompanyName
FROM Customers
WHERE CustomerID =
(
SELECT CustomerID
FROM Orders
WHERE YEAR(OrderDate) = '1997'
);
Either your data is bad, or it's not structured the way you think it is. Possibly both.
To prove/disprove this hypothesis, run this query:
SELECT * from
(
SELECT count(*) as c, Supplier_Item.SKU
FROM Supplier_Item
INNER JOIN orderdetails
ON Supplier_Item.sku = orderdetails.sku
INNER JOIN Supplier
ON Supplier_item.supplierID = Supplier.SupplierID
GROUP BY Supplier_Item.SKU
) x
WHERE c > 1
ORDER BY c DESC
If this returns just a few rows, then your data is bad. If it returns lots of rows, then your data is not structured the way you think it is. (If it returns zero rows, I'm wrong.)
I'm guessing that you have orders containing the same SKU multiple times (two separate line items, both ordering the same SKU).
The select statement in the cost part of your select is returning more than one value. You need to add more where clauses, or use an aggregation.
The error implies that this subquery is returning more than 1 row:
(Select Supplier_Item.Price from Supplier_Item,orderdetails,Supplier where Supplier_Item.SKU=OrderDetails.Sku and Supplier_Item.SupplierId=Supplier.SupplierID )
You probably don't want to include the orderdetails and supplier tables in the subquery, because you want to reference the values selected from those tables in the outer query. So I think you want the subquery to be simply:
(Select Supplier_Item.Price from Supplier_Item where Supplier_Item.SKU=OrderDetails.Sku and Supplier_Item.SupplierId=Supplier.SupplierID )
I suggest you read up on correlated vs. non-correlated subqueries.
As others have suggested, the best way to do this is to use a join instead of variable assignment. Re-writing your query to use a join (and using the explicit join syntax instead of the implicit join, which was also suggested--and is the best practice), you would get something like this:
select
OrderDetails.Sku,
OrderDetails.mf_item_number,
OrderDetails.Qty,
OrderDetails.Price,
Supplier.SupplierId,
Supplier.SupplierName,
Supplier.DropShipFees,
Supplier_Item.Price as cost
from
OrderDetails
join Supplier on OrderDetails.Mfr_ID = Supplier.SupplierId
join Group_Master on Group_Master.Sku = OrderDetails.Sku
join Supplier_Item on
Supplier_Item.SKU=OrderDetails.Sku and Supplier_Item.SupplierId=Supplier.SupplierID
where
invoiceid='339740'
Even after 9 years of the original post, this helped me.
If you are receiving this kind of error without any clue as to its source, there is probably a trigger or function attached to the table, which in turn ends up in a stored procedure or function that selects or filters data WITHOUT using a primary or unique column. If you search or filter on a primary or unique column, there cannot be multiple results. This matters especially when you are assigning the value to a declared variable. The stored procedure raises no error when it is created, only a runtime error:
"System.Data.SqlClient.SqlException (0x80131904): Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
The statement has been terminated."
In my case there was no clue other than this error message. There was a trigger on the table, and the table updated by that trigger had a trigger of its own, so it ended up as a chain of two triggers and finally a stored procedure. The stored procedure had a SELECT that returned multiple rows.
SET @Variable1 = (
    SELECT column_gonna_asign
    FROM dbo.your_db
    WHERE Non_primary_non_unique_key = @Variable2
)
If this returns multiple rows, you are in trouble.

SQL WHERE NOT EXISTS (skip duplicates)

Hello, I'm struggling to get the query below right. What I want is to return rows with unique names and surnames. What I get is all rows, including the duplicates.
This is my SQL:
DECLARE @tmp AS TABLE (Name VARCHAR(100), Surname VARCHAR(100))
INSERT INTO @tmp
SELECT CustomerName,CustomerSurname FROM Customers
WHERE
NOT EXISTS
(SELECT Name,Surname
FROM @tmp
WHERE Name=CustomerName
AND Surname=CustomerSurname
GROUP BY Name,Surname )
Please can someone point me in the right direction here.
//Desperate (I tried without GROUP BY as well but get same result)
DISTINCT would do the trick.
SELECT DISTINCT CustomerName, CustomerSurname
FROM Customers
If you only want the records that really don't have duplicates (as opposed to getting duplicates represented as a single record) you could use GROUP BY and HAVING:
SELECT CustomerName, CustomerSurname
FROM Customers
GROUP BY CustomerName, CustomerSurname
HAVING COUNT(*) = 1
First, I thought that @David's answer is what you want. But rereading your comments, perhaps you want all combinations of Names and Surnames:
SELECT n.CustomerName, s.CustomerSurname
FROM
( SELECT DISTINCT CustomerName
FROM Customers
) AS n
CROSS JOIN
( SELECT DISTINCT CustomerSurname
FROM Customers
) AS s ;
Are you doing that while your @tmp table is still empty?
If so: your entire SELECT is fully evaluated before the INSERT happens. It doesn't run the query and get one row, insert the row, run the query and get another row, insert the row, and so on.
If you want to insert unique customers only, use that same Customers table in your NOT EXISTS clause:
SELECT c.CustomerName,c.CustomerSurname FROM Customers c
WHERE
NOT EXISTS
(SELECT 1
FROM Customers c1
WHERE c.CustomerName = c1.CustomerName
AND c.CustomerSurname = c1.CustomerSurname
AND c.Id <> c1.Id)
If you want to insert a unique set of customers, use "distinct"
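The difference between these options is easy to see on a tiny data set. The sketch below runs on SQLite via Python's sqlite3 module rather than SQL Server, with a simplified, made-up customers table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, surname TEXT);
    INSERT INTO customers (name, surname) VALUES
        ('Ann', 'Lee'), ('Ann', 'Lee'), ('Bob', 'Ray');
""")

# DISTINCT: every name/surname pair once, duplicates collapsed.
distinct_pairs = conn.execute("""
    SELECT DISTINCT name, surname FROM customers ORDER BY name
""").fetchall()

# GROUP BY ... HAVING COUNT(*) = 1: only pairs that occur exactly once.
unduplicated = conn.execute("""
    SELECT name, surname FROM customers
    GROUP BY name, surname
    HAVING COUNT(*) = 1
""").fetchall()

# Correlated NOT EXISTS against the same table gives the same rows as HAVING.
not_exists = conn.execute("""
    SELECT c.name, c.surname FROM customers c
    WHERE NOT EXISTS (
        SELECT 1 FROM customers c1
        WHERE c1.name = c.name AND c1.surname = c.surname AND c1.id <> c.id
    )
""").fetchall()

print(distinct_pairs, unduplicated, not_exists)
```

So DISTINCT keeps one copy of Ann Lee, while the HAVING and NOT EXISTS forms drop her entirely because she is duplicated.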
Typically, if you're doing a WHERE NOT EXISTS or WHERE EXISTS, or WHERE NOT IN subquery,
you should use what is called a "correlated subquery", as in ypercube's answer above, where table aliases are used for both inside and outside tables (where inside table is joined to outside table). ypercube gave a good example.
And often, NOT EXISTS is preferred over NOT IN (unless the WHERE NOT IN is selecting from a totally unrelated table that you can't join on.)
Sometimes, if you're tempted to do a WHERE EXISTS (SELECT from a small table with no duplicate values in the column), you could do the same thing by joining the main query with that table on the column you want in the EXISTS. This is not always the best or safest solution: it might make the query slower if there are many rows in that table, and it could cause many duplicate rows if there are duplicate values for that column in the joined table, in which case you'd have to add DISTINCT to the main query, which causes it to sort the data on all columns. That is not efficient at all.
And, similarly, the WHERE NOT IN or NOT EXISTS correlated subqueries can be accomplished (often with an equivalent execution plan) if you LEFT OUTER JOIN the table you were going to subquery and add a WHERE <joined column> IS NULL.
You have to be careful using that, but you don't need a DISTINCT. Frankly, I prefer to use the WHERE NOT IN subqueries or NOT EXISTS correlated subqueries, because the syntax makes the intention clear and it's hard to go wrong.
And you do not need a DISTINCT in the SELECT inside such subqueries (correlated or not). It would be a waste of processing (and for WHERE EXISTS or WHERE IN subqueries, the SQL optimizer would ignore it anyway and just use the first value that matched for each row in the outer query). (Hope that makes sense.)
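Here is a small sketch of the three ways of expressing "rows with no match" discussed above, run on SQLite through Python's sqlite3 module as a stand-in for SQL Server. The tables outer_t and inner_t are invented for the example, and inner_t deliberately contains a NULL, which is one common reason NOT EXISTS is preferred over NOT IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outer_t (id INTEGER);
    CREATE TABLE inner_t (id INTEGER);
    INSERT INTO outer_t VALUES (1), (2), (3);
    INSERT INTO inner_t VALUES (1), (NULL);
""")

# Correlated NOT EXISTS: rows of outer_t with no match in inner_t.
via_not_exists = conn.execute("""
    SELECT o.id FROM outer_t o
    WHERE NOT EXISTS (SELECT 1 FROM inner_t i WHERE i.id = o.id)
    ORDER BY o.id
""").fetchall()

# Anti-join: LEFT OUTER JOIN and keep rows where the joined side is NULL.
via_left_join = conn.execute("""
    SELECT o.id FROM outer_t o
    LEFT OUTER JOIN inner_t i ON i.id = o.id
    WHERE i.id IS NULL
    ORDER BY o.id
""").fetchall()

# NOT IN returns nothing once the subquery yields a NULL.
via_not_in = conn.execute("""
    SELECT o.id FROM outer_t o
    WHERE o.id NOT IN (SELECT id FROM inner_t)
""").fetchall()

print(via_not_exists, via_left_join, via_not_in)
```

NOT EXISTS and the LEFT JOIN form agree, while NOT IN silently returns no rows because comparing against NULL is neither true nor false.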

Is there a way to optimize the query given below

I have the following query, and I need it to fetch data from SomeTable based on the filter criteria present in SomeOtherTable. If there is nothing present in SomeOtherTable, the query should return all the data present in SomeTable.
SQL Server 2005
SomeOtherTable does not have any indexes or any constraint; all fields are char(50).
The following query works fine for my requirements, but it causes performance problems when I have lots of parameters.
Due to a client requirement, we have to keep all the WHERE clause data in SomeOtherTable. Depending on subid, the data will be joined with one of the columns in SomeTable.
For example, the query can be:
SELECT
*
FROM
SomeTable
WHERE
1=1
AND
(
SomeTable.ID in (SELECT DISTINCT ID FROM SomeOtherTable WHERE Name = 'ABC' and subid = 'EF')
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC' and subid = 'EF')
)
AND
(
SomeTable.date =(SELECT date FROM SomeOtherTable WHERE Name = 'ABC' and subid = 'Date')
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC' and subid = 'Date')
)
EDIT
I think I need to explain my problem in more detail:
We have developed an ASP.NET application that is used to invoke parameterized Crystal Reports. Parameters are not passed to the reports using the default Crystal Reports method.
In the ASP.NET application we have created wizards that pass the parameters to the reports. These parameters are not consumed directly by the Crystal Report but by the query embedded inside the report, or by the stored procedure the report uses.
This is achieved using a table (SomeOtherTable) which holds the parameter data for as long as the report is running, after which the data is deleted. As such, we can assume that SomeOtherTable has at most 2 to 3 rows at any given point in time.
So, looking at the query above, the initial part can be regarded as the report query, and the WHERE clause is used to get the user input from SomeOtherTable.
So I don't think it will be useful to create indexes etc. (maybe I am wrong).
SomeOtherTable does not have any
indexes or any constraint all fields
are char(50)
Well, there's your problem. There's nothing you can do to a query like this which will improve its performance if you create it like this.
You need a proper primary or other candidate key designated on all of your tables. That is to say, you need at least ONE unique index on the table. You can do this by designating one or more fields as the PK, or you can add a UNIQUE constraint or index.
You need to define your fields properly. Does the field store integers? Well then, an INT field may just be a better bet than a CHAR(50).
You can't "optimize" a query that is based on an unsound schema.
Try:
SELECT
*
FROM
SomeTable
LEFT JOIN SomeOtherTable ON SomeTable.ID=SomeOtherTable.ID AND Name = 'ABC'
WHERE
1=1
AND
(
SomeOtherTable.ID IS NOT NULL
OR
0=(SELECT Count(1) FROM SomeOtherTable WHERE spName = 'ABC')
)
You can also put WITH (NOLOCK) after each table name, which can reduce blocking at the cost of allowing dirty reads.
The following might speed you up
SELECT *
FROM SomeTable
WHERE
SomeTable.ID in
(SELECT DISTINCT ID FROM SomeOtherTable Where Name = 'ABC')
UNION
SELECT *
FROM SomeTable
Where
NOT EXISTS (Select spName From SomeOtherTable Where spName = 'ABC')
The UNION will effectively split this into two simpler queries which can be optimised separately (whether this actually improves performance depends very much on the DBMS, table size, etc., but it's always worth a try).
The "EXISTS" key word is more efficient than the "SELECT COUNT(1)" as it will return true as soon as the first row is encountered.
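A minimal sketch of the "filter if parameters exist, otherwise return everything" pattern using NOT EXISTS instead of a COUNT comparison. It runs on SQLite through Python's sqlite3 module rather than SQL Server 2005, and the snake_case table and column names and the filtered_ids helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE some_table (id INTEGER, label TEXT);
    CREATE TABLE some_other_table (sp_name TEXT, id INTEGER);
    INSERT INTO some_table VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO some_other_table VALUES ('ABC', 2);
""")

def filtered_ids(sp_name):
    """Apply the filter rows stored for sp_name, or return every id if none exist."""
    rows = conn.execute("""
        SELECT t.id FROM some_table t
        WHERE t.id IN (SELECT id FROM some_other_table WHERE sp_name = ?)
           OR NOT EXISTS (SELECT 1 FROM some_other_table WHERE sp_name = ?)
        ORDER BY t.id
    """, (sp_name, sp_name)).fetchall()
    return [r[0] for r in rows]

print(filtered_ids("ABC"), filtered_ids("XYZ"))
```

The result is the same as with 0 = (SELECT COUNT(1) ...); whether it is actually faster depends on the engine and the data, so it is worth comparing execution plans on the real tables.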
Or check if the value exists in db first
And you can remove the distinct keyword in your query, it is useless here.
if EXISTS (Select spName From SomeOtherTable Where spName = 'ABC')
begin
SELECT *
FROM SomeTable
WHERE
SomeTable.ID in
(SELECT ID FROM SomeOtherTable Where Name = 'ABC')
end
else
begin
SELECT *
FROM SomeTable
end
Aloha
Try
select t.* from SomeTable t
left outer join SomeOtherTable o
on t.id = o.id
where (not exists (select id from SomeOtherTable where spname = 'adbc')
OR spname = 'adbc')
-Edoode
Change all your SELECT statements in the WHERE part to inner joins.
The OR conditions should be UNION ALL-ed.
Also make sure your indexing is OK.
Sometimes it pays to have an intermediate table for temp results that you can join to.
It seems to me that there is no need for the "1=1 AND" in your query. 1=1 will always evaluate to be true, leaving the software to evaluate the next part... why not just skip the 1=1 and evaluate the juicy part?
I am going to stick to my original Query.

Is recursion good in SQL Server?

I have a table in SQL server that has the normal tree structure of Item_ID, Item_ParentID.
Suppose I want to iterate and get all CHILDREN of a particular Item_ID (at any level).
Recursion seems an intuitive candidate for this problem and I can write an SQL Server function to do this.
Will this affect performance if my table has many many records?
How do I avoid recursion and simply query the table? Please any suggestions?
With the new MS SQL 2005 you could use the WITH keyword.
Check out this question and particularly this answer.
With Oracle you could use CONNECT BY keyword to generate hierarchical queries (syntax).
AFAIK with MySQL you'll have to use the recursion.
Alternatively you could always build a cache table for your records parent->child relationships
As a general answer, it is possible to do some pretty sophisticated stuff in SQL Server that normally needs recursion, simply by using an iterative algorithm. I managed to do an XHTML parser in Transact-SQL that worked surprisingly well. The code prettifier I wrote was done in a stored procedure. It ain't elegant (it is rather like watching buffalo doing ballet), but it works.
Are you using SQL 2005?
If so you can use Common Table Expressions for this. Something along these lines:
;
with CTE (Some, Columns, ItemId, ParentId) as
(
select Some, Columns, ItemId, ParentId
from myTable
where ItemId = @itemID
union all
select a.Some, a.Columns, a.ItemId, a.ParentId
from myTable as a
inner join CTE as b on a.ParentId = b.ItemId
where a.ItemId <> b.ItemId
)
select * from CTE
The problem you will face with recursion and performance is how many times it will have to recurse to return the results. Each recursive call is another separate call that will have to be joined into the total results.
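For a runnable feel of the recursive CTE approach, here is a sketch on SQLite via Python's sqlite3 module. SQLite spells it WITH RECURSIVE, whereas SQL Server uses plain WITH; the items table, its sample rows, and the descendants helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, parent_id INTEGER);
    -- 1 is the root; 2 and 3 are its children; 4 is a child of 2; 5 is unrelated.
    INSERT INTO items VALUES (1, NULL), (2, 1), (3, 1), (4, 2), (5, NULL);
""")

def descendants(item_id):
    """Return the ids of all descendants of item_id, at any depth."""
    rows = conn.execute("""
        WITH RECURSIVE tree(item_id) AS (
            SELECT item_id FROM items WHERE parent_id = ?   -- anchor: direct children
            UNION ALL
            SELECT i.item_id                                -- recursive step
            FROM items i
            JOIN tree t ON i.parent_id = t.item_id
        )
        SELECT item_id FROM tree ORDER BY item_id
    """, (item_id,)).fetchall()
    return [r[0] for r in rows]

print(descendants(1), descendants(2), descendants(5))
```

Each pass of the recursive step adds one more level of the tree, so the number of passes grows with the depth of the hierarchy rather than with the number of rows.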
In SQL 2k5 you can use a common table expression to handle this recursion:
WITH Managers AS
(
--initialization
SELECT EmployeeID, LastName, ReportsTo
FROM Employees
WHERE ReportsTo IS NULL
UNION ALL
--recursive execution
SELECT e.employeeID,e.LastName, e.ReportsTo
FROM Employees e INNER JOIN Managers m
ON e.ReportsTo = m.employeeID
)
SELECT * FROM Managers
Another solution is to flatten the hierarchy into another table:
Employee_Managers
ManagerId (PK, FK to Employee table)
EmployeeId (PK, FK to Employee table)
All the parent-child relationships would be stored in this table, so if manager 1 manages manager 2, who manages employee 3, the table would look like:
ManagerId EmployeeId
1 2
1 3
2 3
This allows the hierarchy to be easily queried:
select * from employee_managers em
inner join employee e on e.employeeid = em.employeeid and em.managerid = 42
This would return all employees that have manager 42. The upside is greater performance; the downside is having to maintain the hierarchy.
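A small sketch of the flattened-hierarchy table, again on SQLite through Python's sqlite3 module as a stand-in for SQL Server. The sample employees and the reports_of helper are assumptions; the table holds one row for every manager/subordinate pair, direct or indirect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee_managers (
        manager_id INTEGER REFERENCES employee(employee_id),
        employee_id INTEGER REFERENCES employee(employee_id),
        PRIMARY KEY (manager_id, employee_id)
    );
    INSERT INTO employee VALUES (1, 'top'), (2, 'middle'), (3, 'staff');
    -- 1 manages 2, and 2 manages 3, so 1 also (indirectly) manages 3.
    INSERT INTO employee_managers VALUES (1, 2), (1, 3), (2, 3);
""")

def reports_of(manager_id):
    """Everyone under manager_id, direct or indirect, with a single join."""
    rows = conn.execute("""
        SELECT e.name
        FROM employee_managers em
        JOIN employee e ON e.employee_id = em.employee_id
        WHERE em.manager_id = ?
        ORDER BY e.employee_id
    """, (manager_id,)).fetchall()
    return [r[0] for r in rows]

print(reports_of(1), reports_of(2), reports_of(3))
```

Reads need no recursion at all, but every insert, move, or delete in the hierarchy has to keep the extra rows in employee_managers in sync.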
Joe Celko has a book specifically on tree structures in SQL databases. While you would need recursion for your model, and there would definitely be a potential for performance issues there, there are alternative ways to model a tree structure, depending on what your specific problem involves, which could avoid recursion and give better performance.
Perhaps some more detail is in order.
If you have a master-detail relationship as you describe, then won't a simple JOIN get what you need?
As in:
SELECT
SOME_FIELDS
FROM
MASTER_TABLE MT
,CHILD_TABLE CT
WHERE CT.PARENT_ID = MT.ITEM_ID
You shouldn't need recursion for children - you're only looking at the level directly below (i.e. select * from T where ParentId = @parent) - you only need recursion for all descendants.
In SQL2005 you can get the descendants with:
with AllDescendants (ItemId, ItemText) as (
    -- anchor: the starting item
    select t.ItemId, t.ItemText
    from [TableName] t
    where t.ItemId = @ancestorId
    union all
    -- recursive step: children of rows already in AllDescendants
    select sub.ItemId, sub.ItemText
    from [TableName] sub
    inner join AllDescendants tree
        on tree.ItemId = sub.ParentItemId
)
select * from AllDescendants
You don't need recursion at all....
Note, I changed columns to ItemID and ItemParentID for ease of typing...
DECLARE @intLevel INT
SET @intLevel = 1
INSERT INTO TempTable(ItemID, ItemParentID, Level)
SELECT ItemID, ItemParentID, @intLevel
FROM ItemTable -- hypothetical name: the original omitted the source table
WHERE ItemParentID IS NULL
WHILE @intLevel < @TargetLevel
BEGIN
    SET @intLevel = @intLevel + 1
    INSERT INTO TempTable(ItemID, ItemParentID, Level)
    SELECT ItemID, ItemParentID, @intLevel
    FROM ItemTable -- hypothetical name, as above
    WHERE ItemParentID IN (SELECT ItemID FROM TempTable WHERE Level = @intLevel-1)
    -- If no rows are inserted then there are no children
    IF @@ROWCOUNT = 0
        BREAK
END
SELECT ItemID FROM TempTable WHERE Level = @TargetLevel
