Understanding an ambiguous column name for inner query - sql-server

I ran into a weird query today that I thought would fail, but it succeeded in an unexpected way. Here's a minimal reproduction.
Tables and data:
CREATE TABLE Employee(ID int, Name varchar(max))
CREATE TABLE Engineer(ID int, Title varchar(max))
GO
INSERT INTO Employee(ID, Name) VALUES (1, 'Bobby')
INSERT INTO Employee(ID, Name) VALUES (2, 'Sue')
INSERT INTO Engineer(ID, Title) VALUES (1, 'Electrical Engineer')
INSERT INTO Engineer(ID, Title) VALUES (2, 'Network Engineer')
Queries:
--Find all Engineers with same title as Bobby has
SELECT * FROM Engineer WHERE Title IN (select Title from Employee WHERE Name = 'Bobby')
This returns all rows in the Engineer table, which was unexpected; I thought it would fail. Note that the query above is incorrect: the inner query uses a column "Title" that doesn't exist in the table being selected from ("Employee"). So it must be binding Title to the Engineer column from the outer query, which is always equal to itself, so I think that is why all rows are returned.
I can force the error by qualifying the column name with the table alias; that fails as expected:
SELECT * FROM Engineer WHERE Title IN
(select Empl.Title from Employee Empl WHERE Name = 'Bobby')
This fails with "Invalid column name 'Title'."
Apparently if I were to add the Title column to the Employee table, it uses the Employee.Title column value instead.
ALTER TABLE Employee ADD Title varchar(max)
GO
UPDATE Employee SET Title = 'Electrical Engineer' WHERE ID = 1
UPDATE Employee SET Title = 'Network Engineer' WHERE ID = 2
SELECT * FROM Engineer WHERE Title IN
(select Title from Employee WHERE Name = 'Bobby')
This returns just one row (as expected).
I kind of understand what is happening here, what I'm looking for is a link to some documentation or some keyword that would help me read up and understand it fully (or even some explanation).

Of course it fails. There is no column named Title in your Employee table. In the query that does work it is a subquery so it is pulling Title from Engineer.
You can avoid this entirely if you develop the habit of ALWAYS referencing columns with 2 part naming instead of just the column name.
But in your queries you should start learning how to use joins instead of subqueries for everything. Your code would be far less confusing.

Since Title is not qualified it uses the Title from table Engineer
SELECT * FROM Engineer WHERE Title IN (select Title from Employee WHERE Name = 'Bobby')
In the last query it uses the closest Title (from Employee).
If you use an alias and a 2 part name, you stay out of this confusion.
As far as documentation goes, finding the closest column is probably an undocumented feature.

I found the documentation on the behavior: Qualifying Column Names in Subqueries
The general rule is that column names in a statement are implicitly qualified by the table referenced in the FROM clause at the same level. If a column does not exist in the table referenced in the FROM clause of a subquery, it is implicitly qualified by the table referenced in the FROM clause of the outer query.
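The rule quoted above can be reproduced in a small sketch. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption made so the example runs anywhere; SQLite happens to resolve unqualified names outward in the same way, but it is not SQL Server, and its error message differs). The tables mirror the question's Employee and Engineer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee(ID int, Name text);
    CREATE TABLE Engineer(ID int, Title text);
    INSERT INTO Employee VALUES (1, 'Bobby'), (2, 'Sue');
    INSERT INTO Engineer VALUES (1, 'Electrical Engineer'), (2, 'Network Engineer');
""")

# Unqualified Title: Employee has no such column, so the name is resolved
# against the outer query's table (Engineer). The predicate becomes
# "Engineer.Title IN (Engineer.Title)", which is true for every row.
unqualified = conn.execute(
    "SELECT ID FROM Engineer "
    "WHERE Title IN (SELECT Title FROM Employee WHERE Name = 'Bobby') "
    "ORDER BY ID"
).fetchall()

# Qualified Empl.Title: the name may only be resolved inside Employee,
# so the statement is rejected when it is prepared.
try:
    conn.execute(
        "SELECT ID FROM Engineer "
        "WHERE Title IN (SELECT Empl.Title FROM Employee Empl WHERE Name = 'Bobby')"
    )
    qualified_error = None
except sqlite3.OperationalError as exc:
    qualified_error = str(exc)

print(unqualified)       # both Engineer rows come back
print(qualified_error)   # an error naming the missing column
```

Qualifying every column with its table alias turns the silent "matches everything" case into an immediate error, which is usually the preferable failure mode.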

Related

SQL Server 2016 : need to rank attributes of char type from different tables

I have a small db for a college SQL class project. The database is a collection of information regarding scuba training.
I have a table which is CLASS and has a column IREQ char(2). This is a list of classes and the IREQ column is the instructor type required for that class.
Another table is INST with a column INSTYPE char(2). This is a table of instructor information and the INSTYPE column is the instructor's current type.
A third table is ITITLE with three columns; TNUM int, TITLE varchar (40), TABR char(2). TNUM is sequential numbers for each row for ranking. TITLE is the full name of each trainer level. TABR is the two character abbreviation of the TITLE and corresponds to INSTYPE and IREQ in the previous tables.
I need to check which instructors have a high enough trainer level to teach a given set of courses for a given month.
I have the Class selection and month with
where
CNUMBER like 'SD____'
and 7 = month(STARTDATE)
The SELECT command includes each instructor's First/Alias/Last names, Instructor Number, Class Number, Class Name, and Start Date
from all three tables.
I've tried a subquery within a subquery but get an error requiring an 'exists' statement after the where.
I've tried a few other things but can't get the conversion to a number value for ranking comparison. I only have the 2 character field, which is an abbreviation for the title. It is common to all three tables but can't be a foreign key (that I can see). The only thing I can think of is to somehow set a value for the IREQ or INSTYPE fields equal to the TNUM value for the matching TABR field. The ITITLE table is exactly 8 rows and the TABR values are not in alphabetical order relative to the TNUM value. I made the table exclusively to assign a number to the TABR so I could do this and a similar comparison between another set of tables.
Thanks!
Try using window functions in SQL:
https://www.sqlshack.com/use-window-functions-sql-server/
Since you haven't provided the source of STARTDATE nor the CNUMBER, you can adapt this to your needs (add the where clause and update the columns you need):
select c.*, t1.*, i.*
from CLASS c
join ITITLE t1 on c.IREQ = t1.TABR
join ITITLE t2 on t1.TNUM <= t2.TNUM
join INST i on t2.TABR = i.INSTYPE
This will list every class with every instructor with high enough rank to teach it.
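A small runnable sketch of that double join, using Python's sqlite3 module as a stand-in engine. The sample rows, the INUM column name, and the three title abbreviations are hypothetical, and the sketch assumes a higher TNUM means a higher trainer level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ITITLE(TNUM int, TITLE text, TABR text);
    CREATE TABLE CLASS(CNUMBER text, IREQ text);
    CREATE TABLE INST(INUM int, INSTYPE text);  -- INUM is a hypothetical column name

    -- Hypothetical ranking: AI (1) < IN (2) < MI (3)
    INSERT INTO ITITLE VALUES
        (1, 'Assistant Instructor', 'AI'),
        (2, 'Instructor', 'IN'),
        (3, 'Master Instructor', 'MI');
    INSERT INTO CLASS VALUES ('SD0001', 'AI'), ('SD0002', 'MI');
    INSERT INTO INST VALUES (10, 'IN'), (20, 'MI');
""")

# t1 translates the class requirement into a rank number; t2 enumerates every
# title at or above that rank; INST then matches instructors holding such a title.
qualified = conn.execute("""
    SELECT c.CNUMBER, i.INUM
    FROM CLASS c
    JOIN ITITLE t1 ON c.IREQ = t1.TABR
    JOIN ITITLE t2 ON t1.TNUM <= t2.TNUM
    JOIN INST i ON t2.TABR = i.INSTYPE
    ORDER BY c.CNUMBER, i.INUM
""").fetchall()

print(qualified)
```

With this data, class SD0001 (requires AI) pairs with both instructors, while SD0002 (requires MI) pairs only with instructor 20.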

SQL Server FullText Search with Weighted Columns from Previous One Column

In the database on which I am attempting to create a FullText Search I need to construct a table with its column names coming from one column in a previous table. In my current implementation attempt the FullText indexing is completed on the first table Data and the search for the phrase is done there, then the second table with the search results is made.
The schema for the database is
**Players**
Id
PlayerName
Blacklisted
...
**Details**
Id
Name -> FirstName, LastName, Team, Substitute, ...
...
**Data**
Id
DetailId
PlayerId
Content
DetailId in the table Data relates to Id in Details, and PlayerId relates to Id in Players. If there are 1k rows in Players and 20 rows in Details, then there are 20k rows in Data.
WITH RankedPlayers AS
(
SELECT PlayerID, SUM(KT.[RANK]) AS Rnk
FROM Data c
INNER JOIN FREETEXTTABLE(dbo.Data, Content, '"Some phrase like team name and player name"')
AS KT ON c.DataID = KT.[KEY]
GROUP BY c.PlayerID
)
…
Then a table is made by selecting the rows in one column. Similar to a pivot.
…
SELECT rc.Rnk,
c.PlayerID,
PlayerName,
TeamID,
…
(SELECT Content FROM dbo.Data data WHERE DetailID = 1 AND data.PlayerID = c.PlayerID) AS [TeamName],
…
FROM dbo.Players c
JOIN RankedPlayers rc ON c.PlayerID = rc.PlayerID
ORDER BY rc.Rnk DESC
I can return a ranked table with this implementation. The aim, however, is to produce results from weighted columns, so that, say, the column PlayerName contributes more to the rank than TeamName.
I have tried making a schema bound view with a pivot, but then I cannot index it because of the pivot. I have tried making a view of that view, but it seems the metadata is inherited, plus that feels like a clunky method.
I then tried to do it as a straight query using sub queries in the select statement, but cannot due to indexing not liking sub queries.
I then tried to join multiple times, again the index on the view doesn't like self-referencing joins.
How to do this?
I have come across this article http://developmentnow.com/2006/08/07/weighted-columns-in-sql-server-2005-full-text-search/ , and other articles here on weighted columns, however nothing as far as I can find addresses weighting columns when the columns were initially row data.
A simple solution that works really well: put a weight on the rows containing the required IDs in another table, left join that table to the table to which the full text search had been applied, and multiply the rank by the weight. Continue as previously implemented.
In code that comes out as
DECLARE @Weight TABLE
(
DetailID INT,
[Weight] FLOAT
);
INSERT INTO @Weight VALUES
(1, 0.80),
(2, 0.80),
(3, 0.50);
WITH RankedPlayers AS
(
SELECT PlayerID, SUM(KT.[RANK] * ISNULL(cw.[Weight], 0.10)) AS Rnk
FROM Data c
INNER JOIN FREETEXTTABLE(dbo.Data, Content, 'Karl Kognition C404') AS KT ON c.DataID = KT.[KEY]
LEFT JOIN @Weight cw ON c.DetailID = cw.DetailID
GROUP BY c.PlayerID
)
SELECT rc.Rnk,
...
I'm using a table variable here as a proof of concept. I am considering adding a column Weights to the table Details to avoid an unnecessary table and left join.
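The weighting arithmetic can be checked in isolation with a sketch like the one below. It uses Python's sqlite3 module, and because SQLite has no FREETEXTTABLE, a plain table named Hits with [KEY] and [RANK] columns stands in for the full-text result set. That table, its rank values, and the sample rows are all hypothetical; COALESCE plays the role of ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Data(DataID int, DetailID int, PlayerID int);
    -- Hypothetical stand-in for the rows FREETEXTTABLE would return
    CREATE TABLE Hits([KEY] int, [RANK] int);
    CREATE TABLE Weight(DetailID int, Weight real);

    INSERT INTO Data VALUES (1, 1, 100), (2, 4, 100), (3, 3, 200);
    INSERT INTO Hits VALUES (1, 10), (2, 10), (3, 10);
    INSERT INTO Weight VALUES (1, 0.80), (2, 0.80), (3, 0.50);
""")

# Each hit's rank is scaled by the weight of its DetailID; details without an
# entry in Weight (DetailID 4 here) fall back to the default 0.10.
ranked = conn.execute("""
    SELECT c.PlayerID,
           SUM(KT.[RANK] * COALESCE(cw.Weight, 0.10)) AS Rnk
    FROM Data c
    JOIN Hits KT ON c.DataID = KT.[KEY]
    LEFT JOIN Weight cw ON c.DetailID = cw.DetailID
    GROUP BY c.PlayerID
    ORDER BY Rnk DESC
""").fetchall()

for player_id, rnk in ranked:
    print(player_id, round(rnk, 2))
```

Player 100 scores 10 * 0.80 + 10 * 0.10 and player 200 scores 10 * 0.50, so a hit in a heavily weighted detail outranks an equally ranked hit in a lightly weighted one.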

Relating 2 columns in a database

I can't find a way to call the database and ask for it to return specific entries.
I have a table called "Barrios" and another called "Localidades."
"Barrios" has 3 columns: Id (pk, int), Localidad (Fk, int), Barrio (varchar(50))
"Localidades" has 2: Id (PK, int) and Zona (varchar(50))
In "Localidades" I have the states, and in the Barrios I have the neighborhoods.
How do I format my database so that when someone inserts a neighborhood, the associated state is populated as well?
You could use a function in combination with an expression for the computed column?
SQL Server computed column select from another table
select * from Barrios inner join Localidades on Barrios.Localidad = Localidades.Id where Localidades.Id = 1
That's the statement; I just wrote it :)

Strongly typed Dataset, Column DBNull

I have a stored procedure that works and returns some columns including "Column_A".
I am able to get the Column_A value in a row when I execute the stored procedure within the SQL Management Tools.
When I try to preview the same row within a typed dataset, Column_A is always null.
Any idea what could be wrong?
Strongly-typed dataset rules of engagement.
The column name must be EXACT.
The data-type should match.
The column_names order should match. Which also means the total number of columns in the result query should equal the number of columns in the datatable definition.
Aka, if you have
Select e.LastName, e.FirstName, e.SSN from dbo.Employee
your strong dataset should be
LastName (string)
FirstName (string)
SSN (string)
It cannot be a deviation like any of the below:
Last_Name (string)
FName (string)
SSNumber (string)
If your strong-dataset-table__column does not allow "nulls", then you must code a dummy-value in the procedure.
example:
Select e.LastName, e.FirstName, IsNull(e.SSN, '') as SSN from dbo.Employee
Every time I have one of these issues, I always find I violated one of these mini rules.
Because I had added the column myself, the "Source" property of the DataColumn was not set; setting it, in other words mapping it to the right column, fixed the issue.

Ordering SQL Results based on Input Params

In conjunction with the fn_split function, I'm returning a list of results from a table based on comma separated values.
The Stored Procedure T-SQL is as follows:
SELECT ProductCode, ProductDesc, ImageAsset, PriceEuros, PriceGBP, PriceDollars,
replace([FileName],' ','_') as [filename],
ID as FileID, weight
from Products
LEFT OUTER JOIN Assets on Assets.ID = Products.ImageAsset
where ProductCode COLLATE DATABASE_DEFAULT IN
(select [value] from fn_split(@txt,','))
and showOnWeb = 1
I pass in to the @txt param the following (as an example):
ABC001,ABC009,ABC098,ABC877,ABC723
This all works fine; however, the results are not returned in any particular order. I need the products returned in the same order as the input param.
Unfortunately this is a live site with a built schema, so I can't change anything on it (but I wish I could) - otherwise I would make it more sensible.
If all of the references that are passed in on the @txt param are unique, you could use CharIndex to find their position within the param, e.g.
order by charindex(ProductCode, @txt)
In the stored procedure, I would create a table which has a numeric key which is the PK for the temp table and set to auto-increment. Then, I would insert the results of fn_split into that table, so that you have the parameters as they are ordered in the list (the auto-increment will take care of that).
Then, join the result set to this temp table, ordering by the numeric key.
If the entries in the parameter list are not unique, then after inserting the records into the temp table, I would delete any duplicates, except for the record where the id is the minimum for that particular parameter.
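The temp-table approach above can be sketched end to end as follows. This uses Python's sqlite3 module as a stand-in engine; Python's str.split takes the place of fn_split, and the Requested table, its Pos and Code columns, and the sample products are hypothetical names chosen for illustration. It assumes the split yields the values in list order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products(ProductCode text, showOnWeb int);
    INSERT INTO Products VALUES
        ('ABC001', 1), ('ABC009', 1), ('ABC098', 1), ('ABC877', 0), ('ABC723', 1);

    -- Auto-incrementing key records the position of each requested code
    CREATE TEMP TABLE Requested(Pos integer PRIMARY KEY AUTOINCREMENT, Code text);
""")

txt = "ABC723,ABC001,ABC098,ABC723,ABC877"

# Insert the codes one by one, in list order, so Pos follows the input order.
conn.executemany(
    "INSERT INTO Requested(Code) VALUES (?)",
    [(code,) for code in txt.split(",")],
)

# Remove duplicates, keeping the row with the lowest Pos for each code.
conn.execute(
    "DELETE FROM Requested "
    "WHERE Pos NOT IN (SELECT MIN(Pos) FROM Requested GROUP BY Code)"
)

# Join the products to the position table and order by the recorded position.
ordered = [
    row[0]
    for row in conn.execute("""
        SELECT p.ProductCode
        FROM Products p
        JOIN Requested r ON r.Code = p.ProductCode
        WHERE p.showOnWeb = 1
        ORDER BY r.Pos
    """)
]

print(ordered)
```

Here the duplicate ABC723 keeps its first position, and ABC877 is dropped by the showOnWeb filter, so the remaining products come back in the order they were requested.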
