How does Snowflake handle case sensitivity in object identifiers? - snowflake-cloud-data-platform

In Snowflake, if I run commands like the following:
create temporary table "Tab" (i int);
select * from "Tab"; -- this works
select * from Tab; -- error
I get the following error:
SQL compilation error: Object 'TAB' does not exist or not authorized.
Snowflake's docs claim that Unquoted object identifiers are case-insensitive. If that's true, why is it looking for TAB? And why doesn't it recognize that Tab (or TAB) refers to the same thing as "Tab"?

Snowflake's documentation on this point was technically inaccurate and misleading. It has now been updated to read:
Unquoted object identifiers ... Are stored and resolved as uppercase characters (e.g. id is stored and resolved as ID).
Unquoted identifiers in Snowflake always resolve as if they were in all-capitals.
If the QUOTED_IDENTIFIERS_IGNORE_CASE parameter is changed from its default (FALSE) to TRUE, then quoted identifiers are given the same behavior. Despite the parameter's name, this doesn't mean the identifiers are case-insensitive: they just resolve to all-capitals, which matches the behavior of unquoted identifiers.
Important: That means that if you ever create any table, field, etc., with double-quotes around its name, using the default settings (QUOTED_IDENTIFIERS_IGNORE_CASE = FALSE), and the quoted name is not in all-capitals:
You will never be able to refer to that object without using quoted identifiers.
You will never be able to refer to that object with QUOTED_IDENTIFIERS_IGNORE_CASE set to TRUE, even with quoted identifiers. Or, as the docs say:
If the parameter is then changed to TRUE, the identifier for the newly-created object is not retrievable/resolvable.
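The resolution rules described above can be modeled in a few lines of Python. This is only a simplified sketch of the behavior, not Snowflake's implementation: the function name resolve_identifier and its ignore_case argument (standing in for QUOTED_IDENTIFIERS_IGNORE_CASE) are made up for illustration.

```python
def resolve_identifier(identifier: str, ignore_case: bool = False) -> str:
    """Return the name a lookup would use (simplified model, not Snowflake code).

    `ignore_case` stands in for the QUOTED_IDENTIFIERS_IGNORE_CASE parameter.
    """
    is_quoted = len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"'
    if not is_quoted:
        # Unquoted identifiers always resolve as uppercase.
        return identifier.upper()
    name = identifier[1:-1]
    # Quoted identifiers keep their case unless the parameter is TRUE.
    return name.upper() if ignore_case else name


# Name stored by: create temporary table "Tab" (i int);  (default settings)
stored_name = resolve_identifier('"Tab"')

print(resolve_identifier('Tab') == stored_name)                      # False: looks for TAB
print(resolve_identifier('"Tab"') == stored_name)                    # True
print(resolve_identifier('"Tab"', ignore_case=True) == stored_name)  # False: looks for TAB
```

The third line is the "not retrievable/resolvable" case: once the parameter is TRUE, even the quoted form resolves to TAB, so nothing can reach the object stored as Tab.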
Knowing this, it seems the safest approach would be to set QUOTED_IDENTIFIERS_IGNORE_CASE to TRUE at the Account level when you're first getting started with Snowflake, so it becomes the default policy. As long as you keep that setting consistently, then the statement that "Unquoted object identifiers are case-insensitive" will become effectively true for you.
The docs provide the following Tip:
Due to the impact that changing the parameter can have on resolving identifiers, we highly recommend choosing an identifier resolution method early in your implementation of Snowflake and then dictating the default behavior by setting the parameter at the account level accordingly, which can be done by any account administrator for your account. The parameter can always be overridden at the session level, but we don’t encourage changing the parameter from the default, unless you have an explicit need to do so.
If there's ever a corner case where you need to be able to refer to something case-sensitively, you can change the QUOTED_IDENTIFIERS_IGNORE_CASE setting for that particular session. These situations should be rare, because when consuming "SHOW" results (which is the commonly-cited case for using quoted identifiers) setting QUOTED_IDENTIFIERS_IGNORE_CASE to TRUE obviates the need for quoted identifiers through some magic beyond my understanding.
alter session set quoted_identifiers_ignore_case = false;
show tables;
-- This fails because `name` resolves to `NAME` rather than `"name"`
select name
from table(result_scan(last_query_id()));
alter session set quoted_identifiers_ignore_case = true;
show tables;
-- This succeeds, for some reason, even though `name` still resolves to `NAME`.
select name
from table(result_scan(last_query_id()));
Note: Database Collation settings appear to have no impact on the behaviors of identifiers in Snowflake.

As mentioned above, Snowflake converts unquoted identifiers to uppercase when resolving them.
This is normally not an issue if you build queries or objects within Snowflake, but when you build them from an external source (for example, a Python script or an Alteryx workflow), the tool may quote the object names it sends, in which case Snowflake will not convert them to uppercase. You then need to use the exact casing that was used in the external tool. So if you use Alteryx to create a column named Record_id, the only way to reference it within Snowflake is to respect the casing and quote it: "Record_id". Things like RECORD_ID or record_id will throw an invalid identifier error.
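If the external script controls the SQL it sends, one possible way to avoid mixed-case names is to uppercase them before they are quoted. The sketch below is a hypothetical helper, not part of any Snowflake connector or of Alteryx; the table name LOADS and the column list are invented, and the pattern only covers the basic shape of an unquoted identifier (it does not check reserved words).

```python
import re

# Basic shape of an unquoted identifier: starts with a letter or underscore,
# then letters, digits, underscores, or dollar signs.
_UNQUOTED = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def normalized_column(name: str) -> str:
    """Return `name` in the uppercase form that unquoted references resolve to.

    Hypothetical helper: raises if the name could only exist as a quoted identifier.
    """
    if not _UNQUOTED.fullmatch(name):
        raise ValueError(f"{name!r} cannot be written as an unquoted identifier")
    return name.upper()


columns = ["Record_id", "Amount"]  # names as the external tool spells them
ddl = "create table LOADS ({})".format(
    ", ".join(f'"{normalized_column(c)}" varchar' for c in columns)
)
print(ddl)  # create table LOADS ("RECORD_ID" varchar, "AMOUNT" varchar)
```

Because the quoted names are already all-capitals, record_id, Record_id and RECORD_ID written without quotes would all resolve to the same column.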

Related

Incorrect syntax near 'Case'. Expecting ID, QUOTED_ID, or '.' Error in SQL Server

I have multiple tables in my database. A SELECT query returns output for all of them except one table, "case". That table also has data and columns, but when I use it in a query I get a syntax error. I have also attached a picture showing the list of tables and a simple query. This code was not developed by me, so I am not sure why it shows an error. Is there some kind of restriction that can be set on a table so that it cannot be used in queries?
CASE is a reserved keyword in SQL Server. Therefore, you must escape it in square brackets:
SELECT * FROM dbo.[Case];
But best naming practice dictates that we should avoid naming database objects using reserved keywords. So, don't name your tables CASE.
Reserved words are not recommended for use as database, table, column, variable, or other object names. If a reserved word is used as an object name, it must be enclosed in double quotes (the ANSI standard syntax) or in square brackets "[]" to tell the relational engine that the word is being used as an object name and not as a keyword in the given context. Here is the sample code.
SELECT * FROM dbo."Case"
Or
SELECT * FROM dbo.[Case]
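When the SQL text is assembled by application code, the bracket form can be produced by a small helper. The following is a minimal sketch with made-up function names (bracket_quote, qualified); it is similar in spirit to T-SQL's QUOTENAME with its default delimiter, and relies on the rule that a closing bracket inside a bracketed name is escaped by doubling it.

```python
def bracket_quote(name: str) -> str:
    """Delimit a SQL Server identifier with square brackets.

    A closing bracket inside the name is escaped by doubling it.
    """
    return "[" + name.replace("]", "]]") + "]"


def qualified(schema: str, obj: str) -> str:
    """Build a two-part name with both parts delimited."""
    return f"{bracket_quote(schema)}.{bracket_quote(obj)}"


print(f"SELECT * FROM {qualified('dbo', 'Case')};")  # SELECT * FROM [dbo].[Case];
```

Delimiting every generated name, reserved or not, keeps the generator simple and does no harm to regular identifiers.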

SQL Server and Keyword Schemas

I am in the process of transitioning us from an MSAccess backend to a SQL Server back end.
Without really considering keywords, our plan has an Admin, Order and Address schema. I have always read and been taught that you should never use keywords as schema, function, stored procedure, etc. names and that doing so will really hose you.
If I plan to make it standard practice to always explicitly define my schema (e.g. [Admin].CompanyInformation), then is using a keyword an issue?
Once again, qualifying your object names with the schema is NOT related to the use of reserved words as identifiers. You will still encounter a problem using a reserved word as a name even if you qualify it with the schema name. Example:
set nocount on;
use tempdb;
go
create table dbo.[table] (id int not null);
print 'creating dbo.table as a table';
go
-- the next two statements fail
select * from table;
select * from dbo.table;
go
print '';
print 'select from dbo.[table] works';
select * from dbo.[table];
if object_id('dbo.table') is not null
drop table dbo.[table];
go
So - yes, you should use the schema name. And yes - you should avoid the use of reserved words as object names. Doing the former does not negate the need to do the latter. And there are additional rules for object names that you should know - the rules for regular identifiers are documented at https://learn.microsoft.com/en-us/sql/relational-databases/databases/database-identifiers.
And even if you choose to NOT follow the rules, you will probably use software that you did not develop and that was not written carefully - which will fail to work correctly when it encounters an object name that is not a regular identifier. And THAT reason is the best reason for adhering to the rules for regular identifiers (one of which is to avoid using reserved words as names).
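To make the idea of a "regular identifier" concrete, here is a rough Python sketch of a check that a name can be written without delimiters. It is deliberately simplified and is not the full rule set: the reserved-word set is a tiny illustrative subset, the pattern is ASCII-only, and the names RESERVED_SAMPLE and needs_delimiters are invented for this example.

```python
import re

# Illustrative subset only; the full reserved-word list is in the SQL Server docs.
RESERVED_SAMPLE = {"TABLE", "CASE", "ORDER", "SELECT", "DESC"}

# Simplified ASCII-only shape of a regular identifier.
_REGULAR = re.compile(r"[A-Za-z_@#][A-Za-z0-9_@#$]*")


def needs_delimiters(name: str) -> bool:
    """Return True if `name` must be written as [name] (simplified check)."""
    if name.upper() in RESERVED_SAMPLE:
        return True
    return _REGULAR.fullmatch(name) is None


for candidate in ["table", "CompanyInformation", "Order", "Order Details"]:
    print(candidate, needs_delimiters(candidate))
```

Under these assumptions only CompanyInformation passes; table and Order are flagged as reserved words and Order Details is flagged because of the space.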
No, this is not an issue if you write [Admin] (not Admin).
P.S. Anyway, you should always explicitly define your schema, because the default schema is usually dbo.

Oracle - Table names and columns - could I change the casing?

By default, are all Oracle table names and columns stored in uppercase?
Could I change the casing?
In the data dictionary, yes, identifiers are converted to upper case by default.
You can change that behavior by creating case-sensitive identifiers. It is generally not a good idea to do so, but you can. In order to do so, you would need to enclose the table name and column names in double quotes both when you create the object and every time you want to refer to them. You'll also need to get the casing right because the identifiers will be case-sensitive unlike the normal case-insensitive behavior.
If you
CREATE TABLE "foo" (
"MyMixedCaseColumn" number
);
then the table name and column name will be stored in mixed case in the data dictionary. You'll need to use double-quotes to refer to either identifier in the future. So
SELECT "MyMixedCaseColumn"
FROM "foo"
will work. However, something like
SELECT MyMixedCaseColumn
FROM foo
will not. Nor will
SELECT "MyMixedCaseColumn"
FROM "Foo"
Generally, future developers will be grateful if you don't use case-sensitive identifiers. It's annoying to have to use double-quotes all over the place and not every tool or library has been tested against systems that use case-sensitive identifiers so it's not uncommon for things to break.

SQL Server Column names case sensitivity

The DB I use has French_CI_AS collation (CI should stand for Case-Insensitive) but is case-sensitive anyway. I'm trying to understand why.
The reason I assert this is that bulk inserts with a 'GIVEN' case setup fail, but they succeed with another 'Given' case setup.
For example:
INSERT INTO SomeTable([GIVEN],[COLNAME]) VALUES ("value1", "value2") fails, but
INSERT INTO SomeTable([Given],[ColName]) VALUES ("value1", "value2") works.
EDIT
Just saw this:
http://msdn.microsoft.com/en-us/library/ms190920.aspx
so that means it should be possible to change a column's collation without emptying all the data and recreating the related table?
Given this critical piece of information (that is in a comment on the question and not in the actual question):
In fact I use Microsoft .Net's bulk insert method, so I don't really know the exact query it sends to the DB server.
it makes sense that the column names are being treated as case-sensitive, even in a case-insensitive DB, since that is how the SqlBulkCopy Class works. Please see Column mappings in SqlBulkCopy are case sensitive.
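One way to sidestep case-sensitive mappings is to look up the destination's exact spelling for each source column before building the mappings. The sketch below shows that lookup in Python rather than .NET; it does not use the SqlBulkCopy API, and the function name and sample column lists are assumptions made for illustration.

```python
def exact_case_mappings(source_columns, destination_columns):
    """Map each source column to the destination column with the same name, ignoring case.

    Returns a dict {source_name: destination_name}; raises KeyError if no match exists.
    """
    by_folded = {dest.casefold(): dest for dest in destination_columns}
    return {src: by_folded[src.casefold()] for src in source_columns}


# Destination names as stored in the table; source names as the application spells them.
destination = ["Given", "ColName"]
source = ["GIVEN", "COLNAME"]

print(exact_case_mappings(source, destination))  # {'GIVEN': 'Given', 'COLNAME': 'ColName'}
```

This only makes sense when the destination has no two columns that differ solely by case, which is always true in a database with a case-insensitive collation.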
ADDITIONAL NOTES
When asking about an error, please always include the actual, and full, error message in the question. Simply saying that there was an error leads to a lot of guessing and wild-goose chases that in turn lead to off-topic answers.
When asking a question, please do not change the circumstances that you are dealing with. For example, the question states (emphasis added):
bulk inserts with a 'GIVEN' case setup fail, but they succeed with another 'Given' case setup.
Yet the example statements are single INSERTs. Also, a comment on the question states:
In fact I use Microsoft .Net's bulk insert method, so I don't really know the exact query it sends to the DB server.
Using .NET and SqlBulkCopy is waaaay different than using BULK INSERT or INSERT, making the current question misleading, making it difficult (or even impossible) to answer correctly. This new bit of info also leads to more questions because when using SqlBulkCopy, you don't write any INSERT statements: you just write a SELECT statement and specify the name of the destination Table. If you specify column names at all for the destination Table, it is in the optional column mappings. Is that where the issue is?
Regarding the "EDIT" section of the question:
No, changing the Collation of the column won't help at all, even if you weren't using SqlBulkCopy. The Collation of a column determines how data stored in the column behaves, not how the column names (i.e. meta-data of the Table) behaves. It is the Collation of the Database itself that determines how Database-level object meta-data behaves. And in this case, you claim that the DB is using a case-insensitive Collation (correct, the _CI_ portion of the Collation name does mean "Case Insensitive").
Regarding the following statements made by Jonathan Leffler on the question:
that gets into a very delicate area of the interaction between delimited identifiers (normally case-sensitive) and collations (this one is case-insensitive).
No, delimited identifiers are not normally case-sensitive. The sensitivities (case, accent, kana type, width, and starting in SQL Server 2017 variation selector) of delimited identifiers are the same as for non-delimited identifiers at that same level. "Same level" means that Instance-level names (Databases, Logins, etc) are controlled by the Instance-level Collation, while Database-level names (Schemas, Objects--Tables, Views, Functions, Stored Procedures, etc--, Users, etc) are controlled by the Database-level Collation. And these two levels can have different Collations.
you need to research whether the SQL column names in a database are case-sensitive when delimited. It may also depend on how the CREATE TABLE statement is written (were the names delimited in that?). Normally, SQL is case-insensitive on column and table names; you could write INSERT INTO SoMeTaBlE(GiVeN, cOlNaMe) VALUES("v1", "v2") and if the names were never delimited, it'd be OK.
It does not matter if the column names were delimited or not when creating the Table, at least not in terms of how their resolution is handled. Column names are Database-level meta-data, and that is controlled by the default Collation of the Database. And it is the same for all Database-level meta-data within each Database. You cannot have some column names being case-sensitive while others are case-insensitive.
Also, there is nothing special about Table and column names. They are Database-level meta-data just like User names, Schema names, Index names, etc. All of this meta-data is controlled by the Database's default Collation.
Meta-data (both Instance-level and Database-level) is only "normally" case-insensitive due to the default Collation suggested during installation being a case-insensitive Collation.
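The point that name resolution follows the Collation, and not the delimiters, can be illustrated with a toy model. This is a sketch under stated assumptions, not how SQL Server compares names internally: it reads only the _CI_ / _CS_ flag from the collation name, ignores the other sensitivities, and the function names are hypothetical.

```python
def is_case_insensitive(collation_name: str) -> bool:
    """Return True if the collation name carries the CI (case-insensitive) flag."""
    return "CI" in collation_name.upper().split("_")


def find_name(existing_names, requested, collation_name):
    """Look up `requested` among `existing_names` the way the collation would compare them.

    Simplified: only case sensitivity is modeled (not accent, kana, or width).
    """
    if is_case_insensitive(collation_name):
        folded = requested.casefold()
        return next((n for n in existing_names if n.casefold() == folded), None)
    return requested if requested in existing_names else None


columns = ["Given", "ColName"]
print(find_name(columns, "GIVEN", "French_CI_AS"))          # Given
print(find_name(columns, "GIVEN", "Latin1_General_CS_AI"))  # None
```

Whether the caller wrote GIVEN or [GIVEN] never enters the model, which matches the claim that delimiting does not change case sensitivity.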
a 'delimited identifier' is a column name, table name, or something similar enclosed in double quotes, such as CREATE TABLE "table"(...)
It is more accurate to say that a delimited identifier is an identifier enclosed in whatever character(s) the DBMS in question has defined as its delimiters. And which particular characters are used for delimiters varies between the different DBMSs.
In SQL Server, delimited identifiers are enclosed in square brackets: [GIVEN]
While square brackets always work as delimiters for identifiers, it is possible to use double-quotes as delimiters IF you have the session-level property of QUOTED_IDENTIFIER set to ON (which is best to always do anyway).
There are arcane parts to SQL (and delimited identifier handling is one of them)
Well, delimited identifiers are actually quite simple. The whole point of delimiting an identifier is to effectively ignore the rules of regular (i.e. non-delimited) identifiers. But, in terms of regular identifiers, yes, those rules are rather arcane (mainly due to the official documentation being incomplete and incorrect). So, in order to take the mystery out of how identifiers in SQL Server actually work, I did a bunch of research and published the results here (which includes links to the research itself):
Completely Complete List of Rules for T-SQL Identifiers
For more info on Collations / Encodings / Unicode / ASCII, especially as they relate to Microsoft SQL Server, please visit:
Collations.Info
The fact the column names are case sensitive means that the MASTER database has been created using a case sensitive collation.
In the case I just had that led me to investigate this, someone entered Latin1_CS_AI instead of Latin1_CI_AS when setting up SQL Server.
Check the collation of the columns in your table definition, and the collation of the tempdb database (i.e. the server collation). They may differ from your database collation.

Is there any reason to write a database column between square brackets?

I am maintaining a database created by another person in SQL Server. In one table I found a column whose name is between square brackets. The name of the field is desc and it is stored in the table as [desc]. The other fields are stored without square brackets. Is there any special reason/convention behind this choice?
The applications built on top of the Database are developed either in C# or VB.NET.
Thanks
The brackets (or other delimiters in other database engines) are just an explicit way of telling the query engine that this term is an identifier for an object in the database. Common reasons include:
Object names which contain spaces would otherwise fail to parse as part of the query unless they're wrapped in brackets.
Object names which are reserved words can fail to parse (or, worse, correctly parse and do unexpected things).
(I suppose it's also possible that there may be an ever-so-slight performance improvement since the engine doesn't need to try to identify what that item means, it's been explicitly told that it's an object. It still needs to validate that, of course, but it may be a small help in the inner workings.)
If a name contains either a reserved word (such as SELECT) or spaces, then you need to surround the name with [].
In your example, you have [desc]. DESC is a keyword, short for DESCENDING.
Brackets are needed, for example, if you have a field whose name is a keyword, e.g. [Date] or [Select], or in this case [desc].
