Why does the stored procedure need a letter as a suffix? - sql-server

I have a stored Procedure as:
CREATE PROCEDURE [dbo].[spGetEmployeesNotInSkill]
AS
BEGIN
SELECT COUNT (*) as Total FROM
(
SELECT tblUser.EmployeeID FROM tblUser where tblUser.FirstName <> 'guest'
EXCEPT
SELECT tblSkillMetrics.EmployeeID FROM tblSkillMetrics
) r -- why is 'r' used here?
END
What I want to know is why we are using this r. If we change r to any other letter (a, b, c, ... x, y, z) it gives the correct output, but if we remove it, it shows an error.
Can anyone please explain this to me?

Whenever you introduce a subquery, CTE, or anything else that will be providing rows as part of a query, you need to provide a name by which that particular set of rows may be referred to elsewhere in the query.
In the case of a table or view, the introduction of an alias is optional, and if omitted, the name of the table or view is used. But for anything else, the name must be explicitly provided.
E.g. you could have had:
SELECT COUNT (*) as Total FROM (
SELECT u.EmployeeID FROM tblUser u where u.FirstName <> 'guest'
EXCEPT
SELECT tblSkillMetrics.EmployeeID FROM tblSkillMetrics) r
Where I've now introduced an alias (u) for tblUser in the inner query. Once a table has an alias, its columns must be referenced through that alias (u.EmployeeID), not through the original table name. And so in your query, r is the name/alias that's being used for the subquery as a whole.

The r is being used as a table alias. The query is saying "select the number of records in this table that I'm going to call r". So you can call it anything you like.
When you select from a virtual table like this you have to give it an alias - it might be more helpful to call it something like SkilledUsers but as it's only used in one place it's quite common only to use a single character.

It's a requirement of the SQL Server implementation of SQL. The "r" is an alias for the subquery: SQL Server requires that a derived table be named, even if the name is not otherwise referenced. You could just as easily have named it FOO or SUBQUERY or EMPLOYEES.
Some other implementations of SQL (Oracle, for example) don't impose this syntactic restriction.

We are using r as a table alias. When you use select * from (select cn from table_name) r you are assigning the name r to the derived table (select cn from table_name). It will be useful when you are using joins.
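To illustrate the point about joins, here is a minimal sketch that reuses the tables from the question. The alias s is what lets the ON and WHERE clauses refer to the derived table's column; the names u and s are arbitrary, and the query is only similar in spirit to the EXCEPT version, not a drop-in replacement.

```sql
-- Users (other than 'guest') that have no row in tblSkillMetrics.
-- The derived table must be named (here: s) so its column can be referenced.
SELECT u.EmployeeID
FROM tblUser AS u
LEFT JOIN (SELECT DISTINCT EmployeeID FROM tblSkillMetrics) AS s
    ON s.EmployeeID = u.EmployeeID
WHERE u.FirstName <> 'guest'
  AND s.EmployeeID IS NULL;
```

Without the alias there would be no way to write s.EmployeeID, which is why the name is mandatory even in queries that never use it.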

Related

How do I use the contents of a SQL column as a database name in another select's from statement?

I have a database that contains a table. wherein one of the columns in that table is [database_name]. I want to use the contents of that column to link to a table in another database and return the contents of one of the columns in that table.
Here's what I'm trying to do, but it doesn't work.
SELECT t1.col1
,t1.col2
,t1.col3
,t1.database_name
,t1.cust_id
,(SELECT t2.ordered_on
FROM [t1.database_name].[dbo].[order_info] t2
WHERE t2.cust_id = t1.cust_id
) as order_placed_on
FROM [my_database].[dbo].[table1] t1
ORDER BY order_placed_on DESC
The problem, of course, is that SQL Server takes the [t1.database_name] thing literally, and doesn't substitute the CONTENTS of the column t1.database_name.
I also tried declaring a variable @databaseName and using @databaseName.[dbo].[order_info] but I can't seem to get that to work either. Here's the example of that non-working code:
DECLARE @databaseName nvarchar(50)
SELECT t1.col1
,t1.col2
,t1.col3
,t1.database_name
,@databaseName = t1.database_name
,t1.cust_id
,(SELECT t2.ordered_on
FROM @databaseName.[dbo].[order_info] t2
WHERE t2.cust_id = t1.cust_id
) as order_placed_on
FROM [my_database].[dbo].[table1] t1
ORDER BY order_placed_on DESC
That code gives me
"Incorrect syntax near '.'"
Clearly it doesn't like the @databaseName.[dbo].[order_info] format. And if I add brackets around the @databaseName variable, so it looks like [@databaseName].[dbo].[order_info], I get an "A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operations." error. No clue why. select @variableName = column_name is precisely how you assign data to a variable in a select statement. So I have no idea what's going on there... Something is getting it confused, but I don't know what, and at this point, I just need this stupid thing to work.
Anyway, I just need to know how to make SQL Server substitute the CONTENTS of the column for the column name (or variable name) in that second select's from statement.
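One common way to approach this is dynamic SQL, because object names in a FROM clause cannot come from a column or a variable directly. The following is only a sketch under assumptions: it handles a single database name at a time, the value 'customer_db_1' is hypothetical, and the table and column names are taken from the question.

```sql
-- Hypothetical database name; in practice it would be read from table1.database_name.
DECLARE @databaseName sysname = N'customer_db_1';

-- QUOTENAME brackets the name safely before it is concatenated into the statement.
DECLARE @sql nvarchar(max) = N'
SELECT t1.col1, t1.col2, t1.col3, t1.database_name, t1.cust_id,
       t2.ordered_on AS order_placed_on
FROM [my_database].[dbo].[table1] AS t1
JOIN ' + QUOTENAME(@databaseName) + N'.[dbo].[order_info] AS t2
    ON t2.cust_id = t1.cust_id
WHERE t1.database_name = @db
ORDER BY order_placed_on DESC;';

-- The database name is also passed as a parameter for the WHERE filter.
EXEC sys.sp_executesql @sql, N'@db sysname', @db = @databaseName;
```

Covering several databases would mean running this once per distinct database_name, or building one statement that combines the per-database queries with UNION ALL.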

Microsoft SQL Server: run arbitrary query and save result into temp table

Given an arbitrary select query, how can I save its results into a temporary table?
To simplify things let's assume the select query does not contain an order by clause at the top level; it's not dynamic SQL; it really is a select (not a stored procedure call), and it's a single query (not something that returns multiple result sets). All of the columns have an explicit name. How can I run it and save the results to a temp table? Either by processing the SQL on the client side, or by something clever in T-SQL.
I am not asking about any particular query -- obviously, given some particular SQL I could rewrite it by hand to save into a temp table -- but about a rule that will work in general and can be programmed.
One possible "answer" that does not work in general
For simple queries you can do
select * into #tmp from (undl) x where undl is the underlying SQL query. But this fails if undl is a more complex query; for example if it uses common table expressions using with.
For similar reasons with x as (undl) select * into #tmp from x does not work in general; with clauses cannot be nested.
My current approach, but not easy to program
The best I've found is to find the top level select of the query and munge it to add into #tmp just before the from keyword. But finding which select to munge is not easy; it requires parsing the whole query in the general case.
Possible solution with user-defined function
One approach may be to create a user-defined function wrapping the query, then select * into #tmp from dbo.my_function() and drop the function afterwards. Is there something better?
More detail on why the simple approach fails when the underlying uses CTEs. Suppose I try the rule select * into #tmp from (undl) x where undl is the underlying SQL. Now let undl be with mycte as (select 5 as mycol) select mycol from mycte. Once the rule is applied, the final query is select * into #tmp from (with mycte as (select 5 as mycol) select mycol from mycte) x which is not valid SQL, at least not on my version (MSSQL 2016). with clauses cannot be nested.
To be clear, CTEs must be defined at the top level before the select. They cannot be nested and cannot appear in subqueries. I fully understand that and it's why I am asking this question. An attempt to wrap the SQL that ends up trying to nest the CTEs will not work. I am looking for an approach that will work.
"Put an into right before the select". This will certainly work but requires parsing the SQL in the general case. It's not always obvious (to a computer program) which select needs to change. I did try the rule of adding it to the last select in the query, but this also fails. For example if the underlying query is
with mycte as (select 5 as mycol) select mycol from mycte except select 6
then the into #x needs to be added to the second select, not to the one that appears after except. Getting this right in the general case involves parsing the SQL into a syntax tree.
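For the example above, the munged query would look like the following sketch. In T-SQL the INTO clause belongs to the first query of a set operation, which here is the select that follows the CTE definition.

```sql
-- INTO goes on the select that follows the CTE, not on the one after EXCEPT.
WITH mycte AS (SELECT 5 AS mycol)
SELECT mycol INTO #x FROM mycte
EXCEPT
SELECT 6;

SELECT mycol FROM #x;
```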
In the end creating a user-defined function appears to be the only general answer. If undl is the underlying select query, then you can say
create function dbo.myfunc() returns table as return (undl)
go
select * into #tmp from dbo.myfunc()
go
drop function dbo.myfunc
go
The pseudo-SQL go indicates starting a new batch. The create function must be executed in one batch before the select, otherwise you get a syntax error. (Just separating them with ; is not enough.)
This approach works even when undl contains subqueries or common table expressions using with. However, it does not work when the query uses temporary tables.

Determine Tables Referenced By Letters In Stored Procedure

I have a stored procedure that references a few tables. However, it refers to the tables with letters.
So let's say a column called Name is from the table Users, then the stored procedure may refer to the column as U.Name.
My question is, how do I get a list of all such mappings i.e all the letters that map to a table?
You are referring to table aliases. Each distinct query can have its own "mapping". These alias values are not specific to stored procedures. Here is an example:
select
a.col1, a.col2
FROM YourTable1 a
select
b.col1, b.col2
from YourTable2 a
inner join YourTable1 b on a.col1=b.col2
YourTable1.col1 and YourTable1.col2 are returned in both of the above queries, although YourTable1 has the "a" alias in the first query and the "b" alias in the second query. See Using Table Aliases.
In examples like the above, people often use a table alias because it is quicker to write a.col1 than YourTable1.col1. There is no way for anyone else to know the aliases used in your stored procedure; you need to work that out from its text. Look at these examples to help:
FROM YourTable a
--   ^table    ^alias
FROM YourTable AS a
--   ^table       ^alias
FROM YourTable1 a
INNER JOIN YourTable2 b ON a.col1=b.col1
--         ^table     ^alias
FROM YourTable1 AS a
INNER JOIN YourTable2 AS b ON a.col1=b.col1
--         ^table        ^alias
Each statement assigns its own aliases for the tables it uses, if at all. The same table used in different statements could be aliased differently (or not aliased). There cannot possibly be a single place to look up every table alias or a simple method to recover them from all the statements in your stored procedure or function or view etc.
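Although the aliases themselves cannot be looked up, the catalog can at least show the procedure text and the objects it references, which narrows down what each alias might stand for. This is a sketch only; dbo.YourProcedure is a hypothetical name, and the alias-to-table mapping still has to be read from the text by hand.

```sql
-- Full text of the procedure, so the FROM/JOIN clauses can be inspected.
SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.YourProcedure')) AS procedure_text;

-- Objects (tables, views, functions) the procedure refers to by name.
SELECT DISTINCT d.referenced_schema_name, d.referenced_entity_name
FROM sys.sql_expression_dependencies AS d
WHERE d.referencing_id = OBJECT_ID(N'dbo.YourProcedure');
```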

Wrong case in subquery column name causes incorrect results, but no error

Using SQL Server Management Studio, I am getting some undesired results (looks like a bug to me..?)
If I use (FIELD rather than field for the other_table):
SELECT * FROM main_table WHERE field IN (SELECT FIELD FROM other_table)
I get all results from main_table.
Using the correct case:
SELECT * FROM main_table WHERE field IN (SELECT field FROM other_table)
I get the expected results where field appears in other.
Running the subquery on its own:
SELECT FIELD FROM other_table
I get an invalid column name error.
Surely I should get this error in the first case?
Is this related to collation?
The DB is binary collation.
The server is case insensitive however.
It seems to me like the server component is saying "this code is OK" and not allowing the DB to say the field is the wrong name..?
What are my options for a solution?
Let's illustrate what is happening using something that doesn't depend on case sensitivity:
USE tempdb;
GO
CREATE TABLE dbo.main_table(column1 INT);
CREATE TABLE dbo.other_table(column2 INT);
INSERT dbo.main_table SELECT 1 UNION ALL SELECT 2;
INSERT dbo.other_table SELECT 1 UNION ALL SELECT 3;
SELECT column1 FROM dbo.main_table
WHERE column1 IN (SELECT column1 FROM dbo.other_table);
Results:
column1
-------
1
2
Why doesn't that raise an error? SQL Server is looking at your queries and seeing that the column1 inside can't possibly be in other_table, so it is extrapolating and "using" the column1 that exists in the outer referenced table (just like you could reference a column that only exists in the outer table without a table reference). Think about this variation:
SELECT [column1] FROM dbo.main_table
WHERE EXISTS (SELECT [column1] FROM dbo.other_table WHERE [column2] = [column1]);
Results:
column1
-------
1
Again SQL Server knows that column1 in the where clause also doesn't exist in the locally referenced table, but it tries to find it in the outer scope. So in an imaginary world you might consider the query to actually be saying:
SELECT m.[column1] FROM dbo.main_table AS m
WHERE EXISTS (SELECT m.[column1] FROM dbo.other_table AS o WHERE o.[column2] = m.[column1]);
(Which is not how I typed it, but if I do type it that way, it still works.)
It doesn't make logical sense in some of the cases but this is the way the query engine does it and the rule has to be applied consistently. In your case (no pun intended), you have an extra complication: case sensitivity. SQL Server didn't find FIELD in your subquery, but it did find it in the outer query. So a couple of lessons:
Always prefix your column references with the table name or alias (and always prefix your table references with the schema).
Always create and reference your tables, columns and other entities using consistent case. Especially when using a binary or case-sensitive collation.
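As a sketch of the first lesson, here is the earlier query against the same tempdb tables with every column qualified by an alias. Because o.column1 can only mean a column of other_table, SQL Server can no longer fall back to the outer table, so the statement fails with an invalid column name error instead of silently returning every row.

```sql
-- Fails: other_table has no column1, and the alias o rules out the outer table.
SELECT m.column1
FROM dbo.main_table AS m
WHERE m.column1 IN (SELECT o.column1 FROM dbo.other_table AS o);

-- The intended query, comparing against the column that actually exists.
SELECT m.column1
FROM dbo.main_table AS m
WHERE m.column1 IN (SELECT o.column2 FROM dbo.other_table AS o);
```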
Very interesting find. The unspoken mandate is that you always should alias tables in your subqueries and use those aliases to be explicit about which table your column comes from. Subqueries allow you to make reference to a field from your outer query which is the cause of your issue, but in your scenario I would agree that either the default should be the internal query's field list, or to give you a column ambiguity error. Regardless, this method below is always preferable:
select * from main_table a where a.field in
(select x.field from other_table x)

Order Of Execution of the SQL query

I am confused about the order of execution of this query; please explain it to me.
I am confused about when the join is applied, when the function is called, when the new column is added by the CASE expression, and when the serial number is added. Please explain the order of execution of all this.
select Row_number() OVER(ORDER BY (SELECT 1)) AS 'Serial Number',
EP.FirstName,Ep.LastName,[dbo].[GetBookingRoleName](ES.UserId,EP.BookingRole) as RoleName,
(select top 1 convert(varchar(10),eventDate,103)from [3rdi_EventDates] where EventId=13) as EventDate,
(CASE [dbo].[GetBookingRoleName](ES.UserId,EP.BookingRole)
WHEN '90 Day Client' THEN 'DC'
WHEN 'Association Client' THEN 'DC'
WHEN 'Autism Whisperer' THEN 'DC'
WHEN 'CampII' THEN 'AD'
WHEN 'Captain' THEN 'AD'
WHEN 'Chiropractic Assistant' THEN 'AD'
WHEN 'Coaches' THEN 'AD'
END) as Category from [3rdi_EventParticipants] as EP
inner join [3rdi_EventSignup] as ES on EP.SignUpId = ES.SignUpId
where EP.EventId = 13
and userid in (
select distinct userid from userroles
--where roleid not in(6,7,61,64) and roleid not in(1,2))
where roleid not in(19, 20, 21, 22) and roleid not in(1,2))
This is the function which is called from the above query.
CREATE function [dbo].[GetBookingRoleName]
(
@UserId as integer,
@BookingId as integer
)
RETURNS varchar(20)
as
begin
declare @RoleName varchar(20)
if @BookingId = -1
Select Top 1 @RoleName=R.RoleName From UserRoles UR inner join Roles R on UR.RoleId=R.RoleId Where UR.UserId=@UserId and R.RoleId not in(1,2)
else
Select @RoleName= RoleName From Roles where RoleId = @BookingId
return @RoleName
end
Queries are generally processed in the following order (SQL Server). I have no idea if other RDBMSs do it this way.
FROM [MyTable]
ON [MyCondition]
JOIN [MyJoinedTable]
WHERE [...]
GROUP BY [...]
HAVING [...]
SELECT [...]
ORDER BY [...]
SQL is a declarative language. The result of a query must be what you would get if you evaluated as follows (from Microsoft):
Logical Processing Order of the SELECT statement
The following steps show the logical processing order, or binding order, for a SELECT statement. This order determines when the objects defined in one step are made available to the clauses in subsequent steps. For example, if the query processor can bind to (access) the tables or views defined in the FROM clause, these objects and their columns are made available to all subsequent steps. Conversely, because the SELECT clause is step 8, any column aliases or derived columns defined in that clause cannot be referenced by preceding clauses. However, they can be referenced by subsequent clauses such as the ORDER BY clause. Note that the actual physical execution of the statement is determined by the query processor and the order may vary from this list.
FROM
ON
JOIN
WHERE
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP
The optimizer is free to choose any order it feels appropriate to produce the best execution time. Given any SQL query, it is basically impossible for anybody to claim to know the execution order. If you add detailed information about the schema involved (exact table and index definitions) and the estimated cardinalities (size of data and selectivity of keys) then one can take a guess at the probable execution order.
Ultimately, the only correct 'order' is the one described in the actual execution plan. See Displaying Execution Plans by Using SQL Server Profiler Event Classes and Displaying Graphical Execution Plans (SQL Server Management Studio).
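Besides the graphical tools, a plan can be requested from a query window with SET options. The sketch below uses a simplified query against one of the tables from the question; SET SHOWPLAN_XML returns the estimated plan without running the statement, while SET STATISTICS XML runs it and returns the actual plan.

```sql
-- Estimated plan: the query is compiled but not executed.
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators.
SET SHOWPLAN_XML ON;
GO
SELECT EP.FirstName, EP.LastName
FROM [3rdi_EventParticipants] AS EP
WHERE EP.EventId = 13;
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan: the query is executed and the plan is returned alongside the rows.
SET STATISTICS XML ON;
SELECT EP.FirstName, EP.LastName
FROM [3rdi_EventParticipants] AS EP
WHERE EP.EventId = 13;
SET STATISTICS XML OFF;
```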
A completely different thing though is how do queries, subqueries and expressions project themselves into 'validity'. For instance if you have an aliased expression in the SELECT projection list, can you use the alias in the WHERE clause? Like this:
SELECT a+b as c
FROM t
WHERE c=...;
Is the use of the c alias valid in the where clause? The answer is NO. Queries form a syntax tree, and a lower branch of the tree cannot reference something defined higher in the tree. This is not necessarily an order of 'execution'; it is more of a syntax parsing issue. It is equivalent to writing this code in C#:
void Select (int a, int b)
{
if (c == ...) {...} // c is used here, before it is declared
int c = a+b;
}
Just as in C# this code won't compile because the variable c is used before it is defined, the SELECT above won't compile properly because the alias c is referenced lower in the tree than where it is actually defined.
Unfortunately, unlike the well-known rules of C/C# language parsing, the SQL rules of how the query tree is built are somewhat esoteric. There is a brief mention of them in Single SQL Statement Processing, but I don't know of any source with a detailed discussion of how the trees are created and which orders are valid. I'm not saying there aren't good sources; I'm sure some of the good SQL books out there cover this topic.
Note that the syntax tree order does not match the visual order of the SQL text. For example, the ORDER BY clause is usually the last in the SQL text, but as a syntax tree it sits above everything else (it sorts the output of the SELECT, so it sits above the SELECTed columns, so to speak) and as such it is valid to reference the c alias:
SELECT a+b as c
FROM t
ORDER BY c;
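When the alias is needed in a WHERE clause, the usual workaround is to define it one level lower in the tree, so that the filter sits above it. The sketch below shows two ways of doing that for the same table t with columns a and b; the threshold 10 and the names x and v are arbitrary.

```sql
-- Derived table: c is defined inside, so the outer WHERE can see it.
SELECT x.c
FROM (SELECT a + b AS c FROM t) AS x
WHERE x.c > 10;

-- CROSS APPLY with VALUES: c is computed per row in the FROM clause.
SELECT v.c
FROM t
CROSS APPLY (VALUES (t.a + t.b)) AS v(c)
WHERE v.c > 10;
```

Both forms rely on the FROM clause being bound before WHERE, which is why the alias is visible there.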
An SQL query is not imperative but declarative, so you cannot know which part of the statement is executed first. However, since SQL is evaluated by query engines, and most engines follow a similar process to obtain the results, you may have to understand how the query engine works internally to understand some SQL execution behavior.
Julia Evans has a great post explaining this; it is worth checking out:
https://jvns.ca/blog/2019/10/03/sql-queries-don-t-start-with-select/
SQL is a declarative language, meaning that it tells the SQL engine what to do, not how. This is in contrast to an imperative language such as C, in which how to do something is clearly laid out.
This means that not all statements will execute as expected. Of particular note are boolean expressions, which may not evaluate from left-to-right as written. For example, the following code is not guaranteed to execute without a divide by zero error:
SELECT 'null' WHERE 1 = 1 OR 1 / 0 = 0
The reason for this is the query optimizer chooses the best (most efficient) way to execute a statement. This means that, for example, a value may be loaded and filtered before a transforming predicate is applied, causing an error. See the second link above for an example
See: here and here.
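Since the evaluation order of predicates is not guaranteed, a defensive option is to write the expression so that it cannot fail whichever predicate runs first. The sketch below assumes a hypothetical table dbo.ratios with integer columns numerator and divisor; NULLIF turns a zero divisor into NULL, so the division yields NULL instead of raising an error.

```sql
-- Hypothetical table: dbo.ratios(numerator INT, divisor INT).
-- Unsafe: nothing guarantees that divisor <> 0 is checked before the division.
-- SELECT r.numerator FROM dbo.ratios AS r
-- WHERE r.divisor <> 0 AND r.numerator / r.divisor = 0;

-- Safe regardless of evaluation order: a zero divisor becomes NULL,
-- the comparison is then unknown, and the row is simply filtered out.
SELECT r.numerator, r.divisor
FROM dbo.ratios AS r
WHERE r.numerator / NULLIF(r.divisor, 0) = 0;
```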
"Order of execution" is probably a bad mental model for SQL queries. It's hard to actually write a single query that would depend on order of execution (this is a good thing). Instead you should think of all join and where clauses happening simultaneously (almost like a template).
That said, you could display the execution plans, which should give you insight into it.
However, since it's not clear why you want to know the order of execution, I'm guessing you're trying to get a mental model for this query so you can fix it in some way. This is how I would "translate" your query. Although I've done well with this kind of analysis, there's some grey area with how precise it is.
FROM AND WHERE CLAUSE
Give me all the Event Participants rows. from [3rdi_EventParticipants] as EP
Also give me all the Event Signup rows that match the Event Participants rows on SignUpID inner join [3rdi_EventSignup] as ES on EP.SignUpId = ES.SignUpId
But Only for Event 13 EP.EventId = 13
And only if the user id has a record in the user roles table where the role id is not in 1,2,19,20,21,22
userid in (
select distinct userid from userroles
--where roleid not in(6,7,61,64) and roleid not in(1,2))
where roleid not in(19, 20, 21, 22) and roleid not in(1,2))
SELECT CLAUSE
For each of the rows give me a unique ID
Row_number() OVER(ORDER BY (SELECT 1)) AS 'Serial Number',
The participant's First Name EP.FirstName
The participant's Last Name Ep.LastName
The Booking Role name GetBookingRoleName
Go look in the Event Dates and return the first eventDate you find where the EventId = 13
(select top 1 convert(varchar(10),eventDate,103)from [3rdi_EventDates] where EventId=13) as EventDate
Finally translate the GetBookingRoleName in Category. I don't have a table for this so I'll map it manually (CASE [dbo].[GetBookingRoleName](ES.UserId,EP.BookingRole)
WHEN '90 Day Client' THEN 'DC'
WHEN 'Association Client' THEN 'DC'
WHEN 'Autism Whisperer' THEN 'DC'
WHEN 'CampII' THEN 'AD'
WHEN 'Captain' THEN 'AD'
WHEN 'Chiropractic Assistant' THEN 'AD'
WHEN 'Coaches' THEN 'AD'
END) as Category
So a couple of notes here. You're not ordering by anything when you select TOP. You should probably have an order by there. You could also just as easily put that in your from clause (the column of a derived table needs a name, so it is aliased here), e.g.
from [3rdi_EventParticipants] as EP
inner join [3rdi_EventSignup] as ES on EP.SignUpId = ES.SignUpId,
(select top 1 convert(varchar(10),eventDate,103) as FirstEventDate
from [3rdi_EventDates] where EventId=13
Order by eventDate) dates
There is a logical order to the evaluation of the query text, but the database engine can choose the order in which to execute the query components based on what is most efficient. The logical text parsing order is listed below. That is, for example, why you can't use an alias from the SELECT clause in a WHERE clause. As far as the query parsing process is concerned, the alias doesn't exist yet.
FROM
ON
OUTER
WHERE
GROUP BY
CUBE | ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP
See the Microsoft documentation (see "Logical Processing Order of the SELECT statement") for more information on this.
Simplified order for T-SQL -> SELECT statement:
1) FROM
2) Cartesian product
3) ON
4) Outer rows
5) WHERE
6) GROUP BY
7) HAVING
8) SELECT
9) Evaluation phase in SELECT
10) DISTINCT
11) ORDER BY
12) TOP
In my experience so far, the same order applies in SQLite.
Source => SELECT (Transact-SQL)
... of course there are (rare) exceptions.

Resources