Nested Subquery vs Derived table - sql-server

I'm learning SQL Server and have a question about the difference between a nested subquery and a derived table, both used in the FROM clause.
Example of a nested subquery used in the FROM clause.
The example was taken from: https://www.tutorialgateway.org/sql-subquery/
USE [SQL Tutorial]
GO
SELECT subquery.FirstName + ' ' + subquery.LastName AS [Full Name]
,subquery.[Occupation]
,subquery.[YearlyIncome]
,subquery.[Sales]
FROM (
SELECT [Id]
,[FirstName]
,[LastName]
,[Education]
,[Occupation]
,[YearlyIncome]
,[Sales]
FROM [Employee Table]
WHERE [Sales] > 500
) AS [subquery]
Example of a derived table used in the FROM clause.
The example was taken from: https://www.tutorialgateway.org/sql-derived-table/
USE [SQLTEST]
GO
SELECT *
FROM (
SELECT [EmpID]
,[FirstName]
,[LastName]
,[Education]
,[YearlyIncome]
,[Sales]
,[DeptID]
FROM [EmployeeDetails]
) AS [Derived Employee Details]
WHERE [Sales] > 500
What makes the nested subquery different from the derived table?
Thank you for your time.

A derived table is specifically a subquery used in the from clause that returns a result set with an arbitrary number of columns and rows. By that definition, both of your examples are derived tables.
A subquery is more generic and refers to any query-within-a-query. One type of subquery, for instance, is a scalar subquery. Such a subquery returns at most one row and one column. It can be used in select and where (and some other places) where a scalar value can be used. A scalar subquery can also be used in the from clause.
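To make the distinction concrete, here is a minimal sketch that runs both kinds of subquery. It assumes SQLite (through Python's built-in sqlite3 module) rather than SQL Server, and the employee table and its rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical employee table (not from the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, first_name TEXT, sales REAL);
    INSERT INTO employee (first_name, sales) VALUES
        ('Ann', 700), ('Bob', 300), ('Cy', 900);
""")

# Derived table: a subquery in FROM that yields a multi-row, multi-column result set.
derived_rows = conn.execute("""
    SELECT d.first_name, d.sales
    FROM (SELECT first_name, sales FROM employee WHERE sales > 500) AS d
    ORDER BY d.first_name
""").fetchall()

# Scalar subquery: one row and one column, used where a single value is expected.
scalar_rows = conn.execute("""
    SELECT first_name
    FROM employee
    WHERE sales = (SELECT MAX(sales) FROM employee)
""").fetchall()

print(derived_rows)
print(scalar_rows)
```

Both are subqueries; only the first one is a derived table.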

Related

Snowflake: Unsupported subquery for DISTINCT - Column order matters?

I have two related tables (unnecessary columns not listed):
LOCATION
VENUE_ID - NUMBER(38,0)
VISIT
ID - NUMBER(38,0)
VENUE_ID - NUMBER(38,0)
DEVICE_ID - VARCHAR(16777216)
The tables are related such that visits are associated with a location via VENUE_ID.
I'm attempting to get the count of unique device ids by location, so I wrote the following query:
SELECT "d"."VENUE_ID"
, (
SELECT COUNT(*)
FROM (
SELECT DISTINCT "f0"."DEVICE_ID"
FROM "MAIN"."VISIT" AS "f0"
WHERE "d"."VENUE_ID" = "f0"."VENUE_ID"
) AS "t")
FROM "MAIN"."LOCATION" AS "d"
Unfortunately, this query resulted in the cryptic error SQL compilation error: Unsupported subquery type cannot be evaluated.
Through a bit of experimentation, I've found that I can get the query to return without error, but only if I add an additional (useless) subquery prior to the existing one in the SELECT:
SELECT "d"."VENUE_ID"
-- New Useless Subquery
, (
SELECT COUNT(*)
FROM "MAIN"."VISIT" AS "f"
WHERE "d"."VENUE_ID" = "f"."VENUE_ID")
--
, (
SELECT COUNT(*)
FROM (
SELECT DISTINCT "f0"."DEVICE_ID"
FROM "MAIN"."VISIT" AS "f0"
WHERE "d"."VENUE_ID" = "f0"."VENUE_ID"
) AS "t")
FROM "MAIN"."LOCATION" AS "d"
If I move the new subquery to anywhere in the select after the distinct subquery, the error returns. I've reviewed the documentation on subqueries in Snowflake and either I am not understanding how that applies to my query here or I'm facing undocumented behavior. Anyone have any idea what's going on here?
I think you're making this more complex than it needs to be. Below should be all you need:
SELECT l.venue_id
, count(distinct v.device_id)
FROM location l
LEFT JOIN visit v
on l.venue_id = v.venue_id
GROUP BY l.venue_id
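As a rough check of the join-based query above, the sketch below runs the same shape of statement. It assumes SQLite via Python's sqlite3 instead of Snowflake, and the sample rows are invented; the point is only that the LEFT JOIN keeps locations with no visits and reports them with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (venue_id INTEGER PRIMARY KEY);
    CREATE TABLE visit (id INTEGER PRIMARY KEY, venue_id INTEGER, device_id TEXT);
    -- Hypothetical data: venue 1 has a repeated device, venue 3 has no visits.
    INSERT INTO location (venue_id) VALUES (1), (2), (3);
    INSERT INTO visit (venue_id, device_id) VALUES
        (1, 'a'), (1, 'a'), (1, 'b'), (2, 'c');
""")

# COUNT(DISTINCT ...) ignores NULLs, so unmatched locations get 0.
device_counts = conn.execute("""
    SELECT l.venue_id, COUNT(DISTINCT v.device_id) AS device_count
    FROM location AS l
    LEFT JOIN visit AS v ON l.venue_id = v.venue_id
    GROUP BY l.venue_id
    ORDER BY l.venue_id
""").fetchall()

print(device_counts)
```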
The answer is a little cryptic, but what happens is this:
You are asking for ONE value and you need to guarantee that only ONE value is returned by your subquery. A distinct clause cannot guarantee that. In some databases that will work as long as the data returns one row, but the moment you get two rows then the database will throw an error.
Snowflake is strict in its subquery analysis, so you need a subquery that is guaranteed to always return one value, for example select sum(..) or select count(..).

Cannot use WITH statement two times

I want to create a view that is constructed like this:
(simplified)
Create VIEW viewAll AS
With TempLevel1 AS
(
SELECT statement
)
With TempLevel2 AS (SELECT * from TempLevel1)
SELECT * from TempLevel2
The problem is that I cannot use the With statement like this because of the following error:
Incorrect syntax near the keyword 'With'.
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
I have to specify that the SELECT queries are way more complex and I do have to use With two times.
Would it be a better practice to create the first with statement as another view like viewTempLevel1 (and use it in the With TempLevel2 statement)?
From the documentation for Common table Expressions (CTE), you can
Use a comma to separate multiple CTE definitions
Example is (taken straight out of the docs)
WITH Sales_CTE (SalesPersonID, TotalSales, SalesYear)
AS
-- Define the first CTE query.
(
SELECT SalesPersonID, SUM(TotalDue) AS TotalSales, YEAR(OrderDate) AS SalesYear
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY SalesPersonID, YEAR(OrderDate)
)
, -- Use a comma to separate multiple CTE definitions.
-- Define the second CTE query, which returns sales quota data by year for each sales person.
Sales_Quota_CTE (BusinessEntityID, SalesQuota, SalesQuotaYear)
AS
(
SELECT BusinessEntityID, SUM(SalesQuota)AS SalesQuota, YEAR(QuotaDate) AS SalesQuotaYear
FROM Sales.SalesPersonQuotaHistory
GROUP BY BusinessEntityID, YEAR(QuotaDate)
)
-- Define the outer query by referencing columns from both CTEs.
SELECT SalesPersonID...
In your case, the syntax would be...
With TempLevel1 AS
( SELECT statement [...]),
TempLevel2 AS
(SELECT * from TempLevel1)
SELECT * from TempLevel2
You don't need to repeat the WITH keyword. Separate the CTE expressions by comma:
With CTE_Level1 AS
(
SELECT statement
),
CTE_Level2 AS
(
SELECT * from CTE_Level1
)
SELECT * from CTE_Level2
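Here is a small runnable sketch of the comma-separated form inside a view. It assumes SQLite via Python's sqlite3 (which accepts the same WITH a AS (...), b AS (...) shape) rather than SQL Server, and the orders table and the filters are placeholders standing in for your real SELECT statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table standing in for the real, more complex queries.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO orders (amount) VALUES (10), (20), (30);

    -- One WITH keyword, two CTEs separated by a comma; the second reads the first.
    CREATE VIEW view_all AS
    WITH temp_level1 AS (
        SELECT id, amount FROM orders WHERE amount >= 20
    ),
    temp_level2 AS (
        SELECT id, amount * 2 AS doubled FROM temp_level1
    )
    SELECT id, doubled FROM temp_level2;
""")

view_rows = conn.execute("SELECT id, doubled FROM view_all ORDER BY id").fetchall()
print(view_rows)
```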

Why do I need to use "as" keyword in this sql query?

I have this SQL query:
select top(1)
salary
from
(select top(2) salary
from employee
order by salary desc) as b
order by
salary asc
If I don't utilize as b it will give me an error:
Incorrect syntax near ...
Why is it mandatory to use as in this query?
You don't need the as keyword. In fact, I advise using as for column aliases but not for table aliases. So, I would write this as:
select top(1) salary
from (select top(2) salary
from employee
order by salary desc
) b
order by salary asc;
You do need the table alias for the subquery, because SQL Server requires that all subqueries in the from clause be named.
This is T-SQL syntax. A subquery in FROM must have an alias even if it's never used. Oracle, for example, considers this alias optional.
This is because you have a sub-query that, according to the Transact-SQL documentation on FROM, makes the use of an alias mandatory:
When a derived table, rowset or table-valued function, or operator clause (such as PIVOT or UNPIVOT) is used, the required table_alias at the end of the clause is the associated table name for all columns, including grouping columns, returned.
Note that derived_table here means the kind of subquery you use in your SQL statement:
derived_table
Is a subquery that retrieves rows from the database. derived_table is used as input to the outer query.
Because you are using 'salary' twice. Without an alias the interpreter won't know what 'salary' to order the results by. By using an alias it can discern between employee.salary and b.salary.
A different approach to get the 2nd highest salary, since your approach gets much more challenging if you need the 3rd or 4th:
SELECT *
FROM (SELECT salary, row_number() over (order by salary desc) rn
FROM employee) E
WHERE rn = 2
You are creating two queries. The first one selects the top 2 salaries from employee. You are calling this list "b". Then you are selecting the top salary from "b".
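The two-step reading above (take the top 2, then take the lowest of those) can be tried with the sketch below. It assumes SQLite via Python's sqlite3, so it uses LIMIT where T-SQL uses TOP, and the salary values are made up. Note that SQLite, unlike SQL Server, would also accept the derived table without the alias b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical employee rows with distinct salaries.
    CREATE TABLE employee (name TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('a', 100), ('b', 300), ('c', 200);
""")

# Inner query: the two highest salaries, named b.
# Outer query: the lowest of those two, i.e. the second highest overall.
second_highest = conn.execute("""
    SELECT salary
    FROM (SELECT salary FROM employee ORDER BY salary DESC LIMIT 2) AS b
    ORDER BY salary ASC
    LIMIT 1
""").fetchone()[0]

print(second_highest)
```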

What does a select statement inside a select statement mean?

What does a Select statement inside a from clause mean in Transact-SQL?
I mean something like this
.. from (
select ..
)
Also, I need to know if the statement is bad for performance. Can you provide me a link to the official documentation about this topic in Transact SQL?
I think you are talking about a subquery. A subquery is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved.
Please refer to this link: http://www.tutorialspoint.com/sql/sql-sub-queries.htm
See this link on MSDN about Subquery Fundamentals.
Subqueries can be fine, but be warned that they are not indexed. If the outer part of the query must join to the results of the subquery, performance will likely suffer. Note that the query optimizer may also choose a different execution order for your query, so even if you "start from" a subquery, the optimizer may start the query somewhere else and join to your subquery.
Correlated Subqueries (Joe Stefanelli linked here first in the comments above) are another performance problem. Any time you have a query that must be run repeatedly for the results of an outer query, performance will suffer.
See this link about Common Table Expressions (CTEs). CTEs may be a better way to write your query. Other alternatives to subqueries include @table variables and #temporary tables.
One of the most common uses of subqueries is when updating a table. You cannot have an aggregate function in the SET list of an UPDATE statement. You have to calculate the aggregate in a subquery, then join back to the main query to update the table. For example:
-- A table of states and the total sales per state
declare @States table
(
ID varchar(2) primary key,
totalSales decimal(10,2)
)
-- Individual sales per state
declare @Sales table
(
salesKey int identity(1,1) primary key,
stateID varchar(2),
sales decimal(10,2)
)
-- Generate test data with no sales totalled
insert into @States (ID, totalSales)
select 'CA', 0
union select 'NY', 0
-- Test sales
insert into @Sales (stateID, sales)
select 'CA', 5000
union select 'NY', 5500
-- This query will cause an error:
-- Msg 157, Level 15, State 1, Line 13
-- An aggregate may not appear in the set list of an UPDATE statement.
update @States
set totalSales = SUM(sales)
from @States
inner join @Sales on stateID = ID
-- This query will succeed, because the subquery performs the aggregate
update @States
set totalSales = sumOfSales
from
(
select stateID, SUM(sales) as sumOfSales
from @Sales
group by stateID
) salesSubQuery
inner join @States on ID = stateID
select * from @States
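For anyone who wants to try the idea without a SQL Server instance, here is a rough sketch assuming SQLite via Python's sqlite3. It is not the same statement as above: instead of joining to an aggregated derived table with UPDATE ... FROM, it swaps in a correlated scalar subquery in the SET list, which also keeps the aggregate out of the UPDATE itself. Table names mirror the example, and the extra CA row is made up so the sum is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (id TEXT PRIMARY KEY, total_sales REAL);
    CREATE TABLE sales (sales_key INTEGER PRIMARY KEY, state_id TEXT, sales REAL);
    INSERT INTO states VALUES ('CA', 0), ('NY', 0);
    INSERT INTO sales (state_id, sales) VALUES ('CA', 5000), ('CA', 250), ('NY', 5500);

    -- The aggregate lives in the subquery, which runs once per row of states.
    UPDATE states
    SET total_sales = (
        SELECT COALESCE(SUM(s.sales), 0)
        FROM sales AS s
        WHERE s.state_id = states.id
    );
""")

totals = conn.execute("SELECT id, total_sales FROM states ORDER BY id").fetchall()
print(totals)
```

The COALESCE is there so a state with no sales rows ends up with 0 rather than NULL.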
You'll find lots of information on this with a quick search. For example, see
Subquery Fundamentals from MSDN
A subquery is a query that is nested inside a SELECT, INSERT, UPDATE,
or DELETE statement, or inside another subquery. A subquery can be
used anywhere an expression is allowed.

Use of table valued function in order by clause

Can I use my table valued function in the order by clause of my select query?
Like this:
declare @ID int
set @ID=9011
Exec ('select top 10 * from cs_posts order by ' + (select * from dbo.gettopposter(@ID)) desc)
GetTopPoster(ID) is my table valued function.
Please help me on this.
You can use a table-valued function with a join. That also allows you to choose any combination of columns to sort by:
select top 10 *
from cs_posts p
join dbo.gettopposter(@ID) as gtp
on p.poster_id = gtp.poster_id
order by
gtp.col1
, gtp.col2
Yes. You can use a Table Valued Function just as a normal table.
Your query is not valid SQL though, despite the TVF.
For further reference:
http://msdn.microsoft.com/en-us/library/ms191165.aspx
You can't do it like that - how does it know what to order by? It doesn't know how the TVF relates to the original query. You can join the two, however (I assume cs_posts has an id column which relates to the TVF), and then order by the TVF id column.
