Column name Arguments in the "With" Statement in SQL Server - sql-server

The Microsoft documentation (https://learn.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-ver15) mentions that the "With" statement can take a list of column names as an argument, and then says:
"The list of column names is optional only if distinct names for all resulting columns are supplied in the query definition."
What does "if distinct names for all resulting columns are supplied in the query definition" actually mean? I use the "With" statement very often, but I never specify column names in the argument.
I went through the entire document, but nowhere does it seem to explain this in further detail.
Does anyone know in what situation I need to specify the column names?
Thanks in advance!

Quite simply, the resultset of the query that defines the CTE must return a set of columns with distinct names. For example, the following will not work:
with cte as (select 1 as x, 2 as x)
select * from cte;
The resultset has 2 columns named "x". In such a case, you MUST supply the column names in the definition of the cte since the query produces a resultset with duplicate names. So you would need to use the form:
with cte(x, y) as (select 1 as x, 2 as x)
select * from cte;
As a general matter, it is a best practice for any resultset to NOT have duplicate column names.
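As a runnable sketch of the column-list form, the snippet below uses Python's built-in sqlite3 module as a stand-in engine. That is an assumption made for convenience: the question is about SQL Server, and SQLite's rules for duplicate or missing column names may differ. In SQL Server the column list is also needed when an expression in the CTE has no alias at all, which is the case illustrated here.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient stand-in for SQL Server.
conn = sqlite3.connect(":memory:")

# The inner SELECT supplies no column names; the list (x, y) after the
# CTE name provides them, so the outer query can refer to x and y.
rows = conn.execute(
    "WITH cte(x, y) AS (SELECT 1, 2) SELECT x, y FROM cte"
).fetchall()

print(rows)
conn.close()
```

The names in the column list replace whatever names the inner query produces, which is why the same form also resolves the duplicate "x" columns in the answer above.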

Related

I don't understand this UPDATE query, can you explain?

So here is a little MS SQL Server query:
update tblSaleReturnMain set sync=0
from tblSaleOrderMain s join tblSaleReturnMain r on r.ID=s.intReturnOrderId
where s.sync=0
It updates my "tblSaleReturnMain" table just fine, and I wrote this query myself, but I don't know why it works. My question is: with all the joined tables that I could reference after the "from" clause, and all the data they can produce, how does this query know that the tblSaleReturnMain mentioned in "update tblSaleReturnMain .." is the same one that is being filtered in the join? Is that always the protocol: we name a table before the "set" keyword without giving it an alias, then filter/join its data any way we like, and whatever remains in the resultset is what the "set" clause applies to?
My question is specifically for Update statements that have JOINS after the FROM keyword.
Also this question is not about "how to use join in Update statement", because I already did that successfully above.
Yes, as long as you only use the target table once in the FROM clause, SQL Server will assume it is the same table reference. From the docs (emphasis mine):
FROM <table_source>
Specifies that a table, view, or derived table source is used to provide the criteria for the update operation. For more information, see FROM (Transact-SQL).
If the object being updated is the same as the object in the FROM clause and there is only one reference to the object in the FROM clause, an object alias may or may not be specified. If the object being updated appears more than one time in the FROM clause, one, and only one, reference to the object must not specify a table alias. All other references to the object in the FROM clause must include an object alias.
If you reference the same table more than once and try to update it using just the table name rather than the alias, you'll get an error along the lines of:
Msg 8154 Level 16 State 1 Line 2
The table 'tblSaleReturnMain ' is ambiguous.
You can reference the same table more than once, but if doing so you must use the alias as the table_or_view_name, e.g.
UPDATE Alias
SET Col = 1
FROM dbo.T1 AS Alias
INNER JOIN dbo.T1 AS Alias2
ON Alias.ID = Alias2.ID;
I personally always use the alias regardless of whether the full table reference would be ambiguous.
SQL Server's UPDATE ... FROM syntax is non-standard and confusing.
It is much better to write a CTE, examine the results, and then update the CTE, e.g.:
with q as
(
select r.sync
from tblSaleOrderMain s
join tblSaleReturnMain r
on r.ID=s.intReturnOrderId
where s.sync=0
)
update q set sync = 0
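To make it concrete which rows such a join-based update touches, here is a minimal sketch using Python's built-in sqlite3 module. Several things are assumptions: SQLite stands in for SQL Server, the column types and sample rows are made up, and the UPDATE is rewritten with a subquery (a portable form) rather than the UPDATE ... FROM or updatable-CTE syntax discussed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data; only the table and column names come
# from the question.
conn.executescript("""
CREATE TABLE tblSaleReturnMain (ID INTEGER PRIMARY KEY, sync INTEGER);
CREATE TABLE tblSaleOrderMain (
    ID INTEGER PRIMARY KEY,
    intReturnOrderId INTEGER,
    sync INTEGER
);
INSERT INTO tblSaleReturnMain VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO tblSaleOrderMain VALUES (10, 1, 0), (11, 2, 1), (12, NULL, 0);
""")

# Portable rewrite of the join-based UPDATE: a return row is updated only
# if at least one order with sync = 0 points at it.
conn.execute("""
UPDATE tblSaleReturnMain
SET sync = 0
WHERE ID IN (SELECT intReturnOrderId FROM tblSaleOrderMain WHERE sync = 0)
""")

rows = conn.execute(
    "SELECT ID, sync FROM tblSaleReturnMain ORDER BY ID"
).fetchall()
print(rows)
conn.close()
```

Only return row 1 is matched by an order with sync = 0, so it is the only row whose sync changes; rows 2 and 3 keep their original value.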

What does "a" do at the end of a sql statement?

I have some code here in the screenshot. At the end of the code you can see an "a".
When I try to remove the "a" and run the code, it fails, but it works with the "a".
What is the significance of this?
Edit: Question was originally tagged MySQL. However, the explanation below should still apply for all the major RDBMS.
It is an alias for the derived table. A derived table is basically a subquery in the FROM clause. In MySQL, every derived table must have its own alias, so that the outer SELECT can refer to the columns/expressions from the derived table. Without a table name/alias, MySQL cannot determine the origin of a column value unambiguously.
From Docs:
The [AS] tbl_name clause is mandatory because every table in a FROM clause must have a name. Any columns in the derived table must have unique names.
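A small sketch of what the alias is for, using Python's built-in sqlite3 module. This is an assumption-laden stand-in: SQLite happens to accept a derived table without an alias, so the snippet only shows how the alias names the derived table, not the error that MySQL or SQL Server would raise when the alias is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "a" names the derived table, so the outer query can qualify its
# column as a.n.
rows = conn.execute("""
SELECT a.n
FROM (SELECT 1 AS n UNION ALL SELECT 2) AS a
ORDER BY a.n
""").fetchall()

print(rows)
conn.close()
```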

Group by an evaluated field (sql server) [duplicate]

Why are column ordinals legal for ORDER BY but not for GROUP BY? That is, can anyone tell me why this query
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY OrgUnitID
cannot be written as
SELECT OrgUnitID, COUNT(*) FROM Employee AS e GROUP BY 1
When it's perfectly legal to write a query like
SELECT OrgUnitID FROM Employee AS e ORDER BY 1
?
I'm really wondering if there's something subtle about the relational calculus, or something, that would prevent the grouping from working right.
The thing is, my example is pretty trivial. It's common that the column that I want to group by is actually a calculation, and having to repeat the exact same calculation in the GROUP BY is (a) annoying and (b) makes errors during maintenance much more likely. Here's a simple example:
SELECT DATEPART(YEAR,LastSeenOn), COUNT(*)
FROM Employee AS e
GROUP BY DATEPART(YEAR,LastSeenOn)
I would think that SQL's rule of normalizing, so that data is represented only once in the database, ought to extend to code as well. I'd want to write that calculation expression only once (in the SELECT column list), and be able to refer to it by ordinal in the GROUP BY.
Clarification: I'm specifically working on SQL Server 2008, but I wonder about an overall answer nonetheless.
One of the reasons is that ORDER BY is the last thing that runs in a SQL query. Here is the order of operations:
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
So once you have the columns from the SELECT clause, you can use ordinal positioning.
EDIT, added this based on the comment
Take this for example
create table test (a int, b int)
insert test values(1,2)
go
The query below will parse without a problem, but it won't run:
select a as b, b as a
from test
order by 6
here is the error
Msg 108, Level 16, State 1, Line 3
The ORDER BY position number 6 is out of range of the number of items in the select list.
This also parses fine
select a as b, b as a
from test
group by 1
But it blows up with this error
Msg 164, Level 15, State 1, Line 3
Each GROUP BY expression must contain at least one column that is not an outer reference.
There are a lot of elementary inconsistencies in SQL, and the use of scalars is one of them. For example, anyone might expect
select * from countries
order by 1
and
select * from countries
order by 1.00001
to be similar queries (the difference between the two can be made infinitesimally small, after all), but they are not.
I'm not sure if the standard specifies if it is valid, but I believe it is implementation-dependent. I just tried your first example with one SQL engine, and it worked fine.
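Use aliases, where the engine allows a SELECT alias in GROUP BY (SQL Server does not). The alias must not be wrapped in single quotes, or it is treated as a string constant:
SELECT DATEPART(YEAR,LastSeenOn) AS seen_year, COUNT(*) AS total
FROM Employee AS e
GROUP BY seen_year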
** EDIT **
if GROUP BY alias is not allowed for you, here's a solution / workaround:
SELECT seen_year
, COUNT(*) AS Total
FROM (
SELECT DATEPART(YEAR,LastSeenOn) as seen_year, *
FROM Employee AS e
) AS inline_view
GROUP BY seen_year
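The derived-table workaround can be tried end to end with Python's built-in sqlite3 module. This is only a sketch under assumptions: SQLite stands in for SQL Server, so strftime('%Y', ...) replaces DATEPART(YEAR, ...), and the Employee rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data; dates are stored as ISO-8601 text.
conn.execute("CREATE TABLE Employee (OrgUnitID INTEGER, LastSeenOn TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, "2008-01-15"), (1, "2008-06-01"), (2, "2009-03-20")],
)

# The calculation is written once, inside the derived table, and the
# outer query groups by its alias.
rows = conn.execute("""
SELECT seen_year, COUNT(*) AS total
FROM (
    SELECT strftime('%Y', LastSeenOn) AS seen_year
    FROM Employee
) AS inline_view
GROUP BY seen_year
ORDER BY seen_year
""").fetchall()

print(rows)
conn.close()
```

Because the expression lives only in the inner query, changing the calculation later means editing one place instead of keeping the SELECT list and the GROUP BY in sync.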
Databases that don't support this are basically choosing not to. I understand the order of processing of the various steps, but it is very easy (as many databases have shown) to parse the SQL, understand it, and apply the translation for you. Where it's really a pain is when a column is a long CASE expression; having to repeat that in the GROUP BY clause is super annoying. Yes, you can use the nested-query workaround that someone demonstrated above, but at this point it is just a lack of care about your users not to support GROUP BY column numbers.

how to use all columns (*) in select statement from different tables using UNION ALL?

I have 10 tables, of which 4 have 99 columns and 6 have 100 columns. I have to combine them using UNION ALL. When executing the SQL query I get the error below:
Msg 205, Level 16, State 1, Line 6
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.
I understand that the error is caused by the differing number of columns. I tried adding NULL as Column100, but I still get the same error.
Can anyone suggest how to use * and UNION ALL in a SQL query?
Thanks.
If the extra column happens to be at the beginning or end and the other columns are in exactly the same order, then you can add the column manually:
select t99.*, 't99' as col
from t99
union all
select t100.*
from t100;
But really, is it that hard to list the columns? An explicit column list is much less prone to error. And, it will work regardless of where the 100th column appears.
You can get the list in SQL Server Management Studio by clicking on the table name. You can also run a query such as:
select column_name
from information_schema.columns
where table_name = 't99';
And then use the column names to construct the query (I often use a spreadsheet for this purpose).
UNION requires that the queries on both sides of it return the same number of columns.
You cannot union 99 columns with 100 columns. You have to either provide a dummy value for the 100th column that does not exist in that table, or leave that column out of the other queries.
So add this to the select on the smaller table:
NULL AS missing_column_name
Or list all the common columns by hand, omitting the columns that do not exist in both.
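The padding approach can be sketched with Python's built-in sqlite3 module. The two tables below are hypothetical and much narrower than the 99 and 100 column tables in the question, and SQLite is only a stand-in for SQL Server; the point is that every branch of the UNION ALL lists the same number of columns, with NULL filling the one that is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# t_small lacks column c; t_big has it. Both tables are made up.
conn.executescript("""
CREATE TABLE t_small (a INTEGER, b INTEGER);
CREATE TABLE t_big (a INTEGER, b INTEGER, c INTEGER);
INSERT INTO t_small VALUES (1, 2);
INSERT INTO t_big VALUES (3, 4, 5);
""")

# Explicit column lists on both sides; NULL stands in for the column
# that t_small does not have.
rows = conn.execute("""
SELECT a, b, NULL AS c FROM t_small
UNION ALL
SELECT a, b, c FROM t_big
ORDER BY a
""").fetchall()

print(rows)
conn.close()
```

With explicit lists the position of the missing column no longer matters, which is the main advantage over mixing * with a padded column.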

Setting alias name from a subquery in SQL

In my Select query I want to set the alias name of a column based on a sub-query (that is, a value in another table). Is this possible in SQL Server 2008?
Like:
SELECT tax_Amt AS (SELECT tax FROM Purchase.tblTax WHERE tax_ID=#tax_ID)
FROM Table
Any way to achieve the above query?
No, you cannot dynamically set an alias or column name in standard SQL.
You'd have to use dynamic SQL if you want that. But note that the alias applies to the column, so all rows have the same alias; you can't vary the alias row by row.
Personally, I'd have an extra column called "TaxType" or similar, because it sounds like you want to vary the name per row. I'd do that even if all rows have the same alias, so that my client code can expect "TaxType".
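The dynamic SQL idea can be sketched from client code with Python's built-in sqlite3 module: look up the label first, then build the query text with that label as the alias. Everything concrete here is an assumption for illustration: SQLite stands in for SQL Server, and the Purchases table, the sample rows, and the tax_ID value 1 are made up. Inside T-SQL the same idea would typically be expressed with QUOTENAME and sp_executesql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and data.
conn.executescript("""
CREATE TABLE tblTax (tax_ID INTEGER PRIMARY KEY, tax TEXT);
CREATE TABLE Purchases (tax_Amt REAL);
INSERT INTO tblTax VALUES (1, 'VAT');
INSERT INTO Purchases VALUES (12.5);
""")

# Step 1: look up the label that should become the column alias.
(label,) = conn.execute(
    "SELECT tax FROM tblTax WHERE tax_ID = ?", (1,)
).fetchone()

# Step 2: an alias cannot be a bound parameter, so it is spliced into the
# SQL text as a quoted identifier (embedded double quotes are doubled).
quoted = '"' + label.replace('"', '""') + '"'
cur = conn.execute(f"SELECT tax_Amt AS {quoted} FROM Purchases")

column_name = cur.description[0][0]
rows = cur.fetchall()
print(column_name, rows)
conn.close()
```

As the answer notes, the alias is fixed for the whole result set; if the label has to vary per row, an extra column is the better fit.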
Try like this:
SELECT (SELECT tax FROM Purchase.tblTax WHERE tax_ID=#tax_ID) AS tax_Amt
FROM Table
