Using multipart column identifiers - sql-server

I'm sure this has to be documented SOMEWHERE but for the life of me I just can't seem to find any actual documentation explaining the behavior.
Taking the 4 ways to reference tables (I don't believe there are more but feel free to correct me):
Current Database
Remote Database
Linked Server
Synonym
Their behavior when using multi-part column identifiers seems to differ, and I'm trying to understand the reasoning behind it. I've tested the various types of SELECT statements:
Current Database
Works
SELECT Column FROM Schema.Table;
SELECT Table.Column FROM Schema.Table;
SELECT Schema.Table.Column FROM Schema.Table;
SELECT Alias.Column FROM Schema.Table AS Alias;
Even this works!
(Obviously only when using dbo Schema, but still)
SELECT Schema.Table.Column FROM Table;
Remote Database
Works
SELECT Column FROM RemoteDB.Schema.Table;
SELECT Table.Column FROM RemoteDB.Schema.Table;
SELECT RemoteDB.Schema.Table.Column FROM RemoteDB.Schema.Table;
SELECT Alias.Column FROM RemoteDB.Schema.Table AS Alias;
Fails
SELECT Schema.Table.Column FROM RemoteDB.Schema.Table;
The multi-part identifier "Schema.Table.Column" could not be bound.
Linked Server
Works
SELECT Column FROM LinkedServer.RemoteDB.Schema.Table;
SELECT Table.Column FROM LinkedServer.RemoteDB.Schema.Table;
SELECT Alias.Column FROM LinkedServer.RemoteDB.Schema.Table AS Alias;
Fails
SELECT Schema.Table.Column FROM LinkedServer.RemoteDB.Schema.Table;
The multi-part identifier "Schema.Table.Column" could not be bound.
SELECT RemoteDB.Schema.Table.Column FROM LinkedServer.RemoteDB.Schema.Table;
The multi-part identifier "RemoteDB.Schema.Table.Column" could not be bound.
SELECT LinkedServer.RemoteDB.Schema.Table.Column FROM LinkedServer.RemoteDB.Schema.Table;
The multi-part identifier "LinkedServer.RemoteDB.Schema.Table.Column" could not be bound.
**I believe this fails as you're only allowed a maximum of 4 parts in the identifier?**
**Read this somewhere but nothing authoritative. Would appreciate a reference.**
Synonym
Works
SELECT Column FROM SynonymName;
SELECT Column FROM SynonymSchema.SynonymName;
SELECT SynonymName.ColumnName FROM SynonymSchema.SynonymName;
SELECT SynonymSchema.SynonymName.Column FROM SynonymSchema.SynonymName;
SELECT Alias.Column FROM SynonymSchema.SynonymName AS Alias;
Even this works!
(Obviously only when using dbo Schema, but still)
SELECT SynonymSchema.SynonymName.Column FROM SynonymName;
I'm trying to understand why certain multi-part identifiers work when used one way (like schema against local db) but then fail when used another way (like schema against a remote db/linked server).
Should you rather always use aliases to ensure things will always work?
Any advice would be highly appreciated, especially pointers to official documentation as to the reason behind the design and best practice advice for a "one size fits all" scenario (which I'm currently going to surmise to be the alias route).

Best practice - Alias your tables and use a two-part identifier for column names: the first part is the table alias and the second is the column name.
Why? Because:
Using a single-part identifier will break as soon as the query contains a join (or apply) and that column name happens to belong to more than one table.
Using an identifier with more than two parts for a column forces you to write most of the identifier twice - once for the column and once for the table. If anything changes (the table moves to a different schema, the linked server name changes, the synonym changes) you now have to change your query in (at least) two places.
Using a two-part identifier for a column means you know exactly which table the column belongs to, and even if you add a join / apply clause, or simply add a column with the same name to one of the existing tables in the query, you don't need to change the query at all. Also, you now have only one place that determines where the table comes from.
Using three- or four-part identifiers for columns is deprecated (thanks to Larnu for the link in the comments).
Most importantly - columns belong to tables. They don't belong to servers, databases or schemas.
Please note that the word table in this answer is interchangeable with view, table-valued function, table variable, etc.
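As a minimal sketch of the alias approach described above, assuming hypothetical object names (LinkedServer, RemoteDB, dbo.Orders, dbo.Customers and their columns are illustrative only, not taken from the question):

```sql
-- All object and column names here are hypothetical.
SELECT c.CustomerName,
       o.OrderDate
FROM LinkedServer.RemoteDB.dbo.Orders AS o   -- the four-part name appears exactly once
JOIN dbo.Customers AS c                      -- local table, also aliased
  ON c.CustomerId = o.CustomerId;
```

If the linked server or schema name changes, only the FROM clause needs editing; every column reference keeps its two-part alias.column form.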

Related

Not using aliases when writing nested SQL queries

I asked a question about aliases recently: Discerning between alias, temp table, etc [SQL Server].
I got the impression that tables and resulting queries had to be named using aliases.
select customers.name as 'Customers'
from customers
where customers.id not in
(
select customerid from orders
)
In fact when you use an alias there is a runtime error. What gives?
When working with "tables" - that is, anything that can use a JOIN - a name of some sort is needed. For example, if your query was written as:
select customers.name as 'Customers'
from customers
LEFT JOIN (
select customerid from orders
) ___
WHERE ___ is null
Then you need to name the derived table, and fill in the blanks, because SQL Syntax requires a name in a JOIN statement.
However, in your sample code:
select customers.name as 'Customers'
from customers
where customers.id not in
(
select customerid from orders
)
The syntax does not require a name, and so the nested query does not require naming.
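For comparison, the LEFT JOIN form with the blanks filled in might look like the sketch below. The alias names c and o are arbitrary choices, and the ON clause is an assumption based on the columns used in the question:

```sql
-- Derived table in a JOIN: the alias "o" is required by the syntax.
select c.name as Customers
from customers as c
left join (
    select customerid from orders
) as o
    on o.customerid = c.id
where o.customerid is null;   -- keep only customers with no matching order
```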
Aliases are there for convenience most of the time. There are times when you are required to use them, though.
https://www.geeksforgeeks.org/sql-aliases/
Temporary tables, derived look-ups (sub-queries), common table expressions (CTEs), duplicate table names in JOINs, and a couple of other things I'm sure I'm forgetting are the only times you need to use an alias.
Most of the time, it's simply to rename something because it's long, complex, a duplicate column name, or just to make things simpler or more readable.
The query you post won't likely need an alias, but using one makes things easier when you are using the results in code, as well as when/if you add more columns to the query.
Side note:
You may see a lot of single-letter abbreviations in people's SQL. This is common, but it's bad form. People also tend to abbreviate with the first letter of every word in a table name, such as cal for ClientAddressLookup. That is not great form either, but neither is typing ClientAddressLookup for each of the 12 columns you need when JOINing with other tables. I'm as guilty of this as everyone else; just know that using good aliases is just as necessary and useful as using good names for your variables in code.

Common TVF in SQL Server to get results from different schema

I have been using SQL Server for the past month and I need a suggestion from SQL Server folks to help me on this use case.
The tables below are just to explain the idea that I am looking for.
I have tables in different schema like this:
MyDb.dbo.Festivals
MyDb.India.Festivals
MyDb.China.Festivals
MyDb.USA.Festivals
I am writing a table-valued function without any schema prefix in it, like
CREATE FUNCTION getFestivals()
RETURNS TABLE
AS
RETURN
(SELECT * FROM festivals)
As I haven't applied any schema, it defaults to dbo and creates the TVF as dbo.getFestivals(). Now I have created synonyms for all other schemas
CREATE SYNONYM India.getFestivals FOR dbo.getFestivals;
CREATE SYNONYM USA.getFestivals FOR dbo.getFestivals;
I tried to query like
SELECT *
FROM MyDb.India.getFestivals()
and it returns the festivals from dbo.festivals and not india.festivals.
I understand that although we've created the synonyms, the function just executes the select query in the dbo schema context and not in the India schema context.
I want suggestions on how to have a common table-valued function that will query based on the schema prefix, i.e. MyDB.India.getFestivals() should get festivals from India and MyDB.USA.getFestivals() should return festivals from USA.
Question
Is there a way I can have a table-valued function that queries based on the schema context?
The only possible way I can think of is to create the same table-valued function in all schemas.
Caveats
I have to stick to a table-valued function only, and the above use case is a sample scenario to explain my problem.
I understand that although we've created the synonyms, the function just executes
the select query in the dbo schema context and not in the India schema
context.
You should always schema-qualify objects in your queries. Because you did not, SQL Server first looks for festivals in the schema where the function resides; if it is not found there, the dbo schema is checked; and if it is not found in dbo either, an error is raised.
In your case the function resides in the dbo schema, so only the dbo schema is checked in order to find festivals.
Creating many "similar" tables instead of one may be a design flaw. Can you merge them all into one table, adding country_id to distinguish the country?
If not, can you at least add this field to every table? If so, add the country field to every table, add a check constraint on this field to reflect the only country that is stored in each table, and then use a partitioned view in your function.
A partitioned view is a view composed of a UNION ALL of several tables with the same structure, each of which has a check constraint on the same column that defines the values this column is restricted to. When you use this view with a filter on the country column, all the tables except for the correct one will be eliminated from the execution plan thanks to the check constraint defined on this column.
So you can change your function to accept a single parameter, the country, and it will read only the one table corresponding to the parameter passed.
More on partitioned views here: Using Partitioned Views
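The suggestion above could be sketched roughly as follows. This is an illustration under assumptions, not a tested design: the column festival_name, the two-letter country codes, the constraint names and the view name dbo.AllFestivals are all hypothetical, and only two of the schemas are shown.

```sql
-- Hypothetical: add a country column with a default and a check constraint per table.
ALTER TABLE India.Festivals
    ADD country_id char(2) NOT NULL
        CONSTRAINT DF_India_Festivals_country DEFAULT 'IN'
        CONSTRAINT CK_India_Festivals_country CHECK (country_id = 'IN');
GO
ALTER TABLE USA.Festivals
    ADD country_id char(2) NOT NULL
        CONSTRAINT DF_USA_Festivals_country DEFAULT 'US'
        CONSTRAINT CK_USA_Festivals_country CHECK (country_id = 'US');
GO
-- Partitioned view: UNION ALL over the schema-qualified tables.
CREATE VIEW dbo.AllFestivals
AS
SELECT country_id, festival_name FROM India.Festivals
UNION ALL
SELECT country_id, festival_name FROM USA.Festivals;
GO
-- One function for every country; the caller passes the country code.
CREATE FUNCTION dbo.getFestivals (@country_id char(2))
RETURNS TABLE
AS
RETURN
(
    SELECT f.country_id, f.festival_name
    FROM dbo.AllFestivals AS f
    WHERE f.country_id = @country_id
);
GO
-- Usage: SELECT * FROM dbo.getFestivals('IN');
```

Whether the optimizer actually skips the non-matching tables depends on the constraints being trusted and on the plan it chooses, so it is worth checking the execution plan.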

query 2 tables with exact column names and types

I have two tables with identical column names and types. I would like to query all of the content from both tables into one result set with just one set of column names. So tbl1.ID and tbl2.ID should be in one col.ID, just as tbl1.data and tbl2.data should be in one col.data. There are no common values between the tables and the records are unrelated, so there is nothing to JOIN on.
I am using vb.net to query an Access DB and update a SQL DB.
I believe in SQL I can use a SELECT INTO, but I am not sure how to do this in Access with one query, or whether I need to create a table and just push everything into it first.
thanks,
In Access you can create a "Union Query". How this is done exactly depends on the version of Access you are using. Open that query in "SQL View" and then use the code from there in your VB application.
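A union query for the two tables in the question might look like this minimal sketch, using the table and column names the question mentions (tbl1, tbl2, ID, data):

```sql
-- Stack the rows of both tables into one result set with one set of column names.
SELECT ID, data FROM tbl1
UNION ALL
SELECT ID, data FROM tbl2;
```

UNION ALL keeps every row from both tables; plain UNION would additionally remove duplicate rows, which is unnecessary work if the tables share no values.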

Updateable view in mssql with multiple tables and computed values

Huge database in mssql2005 with big codebase depending on the structure of this database.
I have about 10 similar tables they all contain either the file name or the full path to the file. The full path is always dependent on the item id so it doesn't make sense to store it in the database. Getting useful data out of these tables goes a little like this:
SELECT a.item_id
, a.filename
FROM (
SELECT id_item AS item_id
, path AS filename
FROM xMedia
UNION ALL
-- media_path has a different collation
SELECT item_id AS item_id
, (media_path COLLATE SQL_Latin1_General_CP1_CI_AS) AS filename
FROM yMedia
UNION ALL
-- fullPath contains more than just the filename
SELECT itemId AS item_id
, RIGHT(fullPath, CHARINDEX('/', REVERSE(fullPath))-1) AS filename
FROM zMedia
-- real database has over 10 of these tables
) a
I'd like to create a single view of all these tables so that new code using this data-disaster doesn't need to know about all the different media tables. I'd also like to use this view for insert and update statements. Obviously old code would still rely on the tables being up to date.
After reading the msdn page about creating views in mssql2005 I don't think a view with SCHEMABINDING would be enough.
How would I create such an updateable view?
Is this the right way to go?
Scroll down on the page you linked and you'll see a paragraph about updatable views. You cannot update a view based on unions, amongst other limitations. The logic behind this is probably simple: how should SQL Server decide which source table/view should receive the update/insert?
You can modify partitioned views, provided they satisfy certain conditions.
These conditions include having a partitioning column as part of the primary key on each table, and having a set of non-overlapping check constraints for the partitioning column.
This seems to be not your case.
In your case, you may do either of the following:
Recreate your tables as views (with computed columns) so that your legacy software keeps working, and refer to the whole table from the new software
Use INSTEAD OF triggers to update the tables.
If a view is based on multiple base tables, an UPDATE statement on the view may or may not work, depending on the statement. If the UPDATE affects multiple base tables, SQL Server throws an error. If it affects only one base table in the view, the UPDATE will work (though not always correctly). DELETE statements fail, and INSERT statements work only when they reference columns from a single base table.
INSTEAD OF triggers are used to correctly UPDATE, INSERT and DELETE through a view that is based on multiple base tables. The following links have examples, along with a video tutorial on the same.
INSTEAD OF INSERT Trigger
INSTEAD OF UPDATE Trigger
INSTEAD OF DELETE Trigger
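As a rough sketch of the INSTEAD OF approach for the tables in the question: the view name dbo.AllMedia, the trigger name and the discriminator column media_source are hypothetical additions, only xMedia and yMedia are covered, and the dbo schema is assumed. zMedia is left out because a full path cannot be rebuilt from the filename alone.

```sql
-- Hypothetical view over two of the media tables, with a column saying where each row lives.
CREATE VIEW dbo.AllMedia
AS
SELECT id_item AS item_id,
       path    AS filename,
       'x'     AS media_source
FROM dbo.xMedia
UNION ALL
SELECT item_id,
       media_path COLLATE SQL_Latin1_General_CP1_CI_AS,
       'y'
FROM dbo.yMedia;
GO
-- The trigger replaces the INSERT on the view and routes rows to the right base table.
CREATE TRIGGER dbo.AllMedia_InsteadOfInsert
ON dbo.AllMedia
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    INSERT INTO dbo.xMedia (id_item, path)
    SELECT i.item_id, i.filename
    FROM inserted AS i
    WHERE i.media_source = 'x';

    INSERT INTO dbo.yMedia (item_id, media_path)
    SELECT i.item_id, i.filename
    FROM inserted AS i
    WHERE i.media_source = 'y';
END;
GO
-- Usage: INSERT INTO dbo.AllMedia (item_id, filename, media_source) VALUES (1, 'a.jpg', 'x');
```

INSTEAD OF UPDATE and INSTEAD OF DELETE triggers would follow the same pattern, joining the inserted and deleted pseudo-tables back to each base table.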

How to create column names/descriptors programmatically

In SQL Server given a Table/View how can you generate a definition of the Table/View in the form:
C1 int,
C2 varchar(20),
C3 double
The information required to do it is contained in the meta-tables of SQL Server, but is there a standard script / IDE facility to output the data contained there in the form described above?
For the curious I want this as I have to maintain a number of SP's which contain Table objects (that is a form of temporary table used by SQL Server). The Table objects need to match the definition of Tables or Views already in the database - it would make life a lot easier if these definitions could be generated automatically.
Here is an example of listing the names and types of columns in a table:
select
COLUMN_NAME,
COLUMN_DEFAULT,
IS_NULLABLE,
DATA_TYPE,
CHARACTER_MAXIMUM_LENGTH,
NUMERIC_PRECISION,
NUMERIC_SCALE
from
INFORMATION_SCHEMA.COLUMNS
where
TABLE_NAME = 'YOUR_TABLE_NAME_HERE'
order by
Ordinal_Position
Generating DDL from that information is more difficult. There seem to be some suggestions at SQLTeam
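One way to get closer to the requested "C1 int, C2 varchar(20)" form is to build each column definition as a string from the same INFORMATION_SCHEMA.COLUMNS data. This is a sketch only: it handles length for character and binary types and precision/scale for decimal and numeric, ignores other type-specific details, and the schema name dbo is an assumption. INFORMATION_SCHEMA.COLUMNS lists the columns of views as well as tables.

```sql
-- One row per column, formatted as "[name] type(length) NULL/NOT NULL".
SELECT QUOTENAME(COLUMN_NAME) + ' ' + DATA_TYPE
     + CASE
           WHEN DATA_TYPE IN ('char', 'varchar', 'nchar', 'nvarchar', 'binary', 'varbinary')
               THEN '(' + CASE WHEN CHARACTER_MAXIMUM_LENGTH = -1
                               THEN 'max'
                               ELSE CAST(CHARACTER_MAXIMUM_LENGTH AS varchar(10))
                          END + ')'
           WHEN DATA_TYPE IN ('decimal', 'numeric')
               THEN '(' + CAST(NUMERIC_PRECISION AS varchar(10))
                    + ',' + CAST(NUMERIC_SCALE AS varchar(10)) + ')'
           ELSE ''
       END
     + CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE ' NULL' END AS column_definition
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'                      -- assumed schema
  AND TABLE_NAME = 'YOUR_TABLE_NAME_HERE'
ORDER BY ORDINAL_POSITION;
```

The rows can then be pasted, comma-separated, into a DECLARE @t TABLE (...) or CREATE TABLE #t (...) statement.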
If you want to duplicate a table definition you could use:
select top 0
*
into
newtable
from
mytable
Edit: Sorry, just re-read your question, and realised this might not answer it. Could you be clear on what you are after: do you want an exact duplicate of the table definition, or a table that contains information about the table's definition?
Thanks for your replies. Yes I do want an exact duplicate of the DDL but I've realised I misstated exactly what I needed. It's DDL which will create a temporary table which will match the columns of a view.
I realised this in looking at Duckworth's suggestion - which is good but unfortunately doesn't cover the case of a view.
SELECT VIEW_DEFINITION FROM
INFORMATION_SCHEMA.VIEWS
... will give you the definition of a view, including its list of columns, and (assuming that all columns in the view are derived directly from a table) it should then be possible to use an amended version of Duckworth's suggestion to pull together the relevant DDL.
I'm just amazed it's not easier! I was expecting someone to tell me that there was a well-established routine to do this, given that TABLE objects need to have all columns fully defined (rather than the way Oracle does it, which is to say "give me something which looks like table X").
Anyway thanks again for help and any further suggestions welcomed.
In this posting to another question I've got a DB reverse engineering script that will do tables, views, PK, UK and index definitions and foreign keys. This one is for SQL Server 2005 and is a port of one I originally wrote for SQL Server 2000. If you need a SQL Server 2000 version add a comment to this post and I'll post it up here.
