I am in a sticky situation. Originally my database catalog was case-insensitive. I wrote queries happily without minding the capitalization of my variable names. Everything was good. After my database was migrated to a different host, the collation at the server instance level is case-sensitive. Now my variable names need to be spell-checked and case-checked.
That is all right, as I use variables sparingly.
Recently a situation arose where I need to use temp tables to buffer some results from multiple views before referring to them in my main query. In essence:
SELECT * INTO #myview1 FROM vw_myview1
SELECT * INTO #myview2 FROM vw_myview2
and
SELECT *
FROM #myview1 v1
JOIN #myview2 v2 on v1.id = v2.id
All would be good if my database instance had a case-insensitive collation. But no. In my main queries, the column-name capitalization is all over the place: WorkID, workId, workid, you name it. I have more than 50 of these queries where I need the temp-table workaround. It's insane and error-prone to have to fix the capitalization of every reference to a column in the new temp tables. Is there any way I can flip a switch and say, ignore column-name collation for this temp table?
If you have to ensure the temporary table has the column collation you require, there are a couple of ways. It is a PITA, but not so bad. For 50+ tables, it's still a PITA.
In SSMS, use the Tools -> Options dialog to get to "SQL Server Object Explorer". The "Scripting" options are located here. Set the option for "include collation" to true. Then script out the table or view. Convert the script from the table to the temp table syntax you need.
You can also "cheat" and use a new table in the same database rather than a temp table. It's up to you to clean up. If it's large, it will be in the transaction log.
I've seen examples where the insert into uses a subquery for a column name from the source with a where 1=2. This causes the collation to be included. It only creates the table. It seems funky to me and more tedious than scripting it out.
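One reading of that last trick, as a hedged sketch: SELECT ... INTO with a predicate that is never true creates the temp table from the view's column definitions (string columns keep the collation they have in the source) without copying any rows, and the rows are loaded in a second step. The names #myview1 and vw_myview1 come from the question; that this is the pattern the answer has in mind is an assumption.

```sql
-- Assumes the view vw_myview1 from the question exists in the current database.
IF OBJECT_ID('tempdb..#myview1') IS NOT NULL
    DROP TABLE #myview1;

-- WHERE 1 = 2 is never true, so this only creates the table structure.
SELECT *
INTO #myview1
FROM vw_myview1
WHERE 1 = 2;

-- Load the rows in a second step.
INSERT INTO #myview1
SELECT *
FROM vw_myview1;

-- Inspect the collation each string column ended up with.
SELECT name, collation_name
FROM tempdb.sys.columns
WHERE object_id = OBJECT_ID('tempdb..#myview1');
```

If a source column carries the IDENTITY property, the second step will likely need an explicit column list together with SET IDENTITY_INSERT, so treat this as a starting point rather than a drop-in script.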
Problem: Junior SQL dev here, working with a SQL Server database where we have many functions that use temp tables to pull data from various tables to populate Crystal reports etc. We had an issue where a user action in our client caused a string to overflow the defined NVARCHAR(100) character limit of the column. As a quick fix, one of our seniors decided on a schema change to set the column definition to NVARCHAR(255), instead of fixing the issue of the string getting too long. Now we have lots of these table-based functions that use temp tables referencing the column in question, but the temp table's copy of the column is still defined as 100 instead of 255.
Question: Is there an easy way to find and update all of these functions? Some functions might not reference the table/column in question at all, but some heavily rely on this data to feed reports etc. I know I can right click a table and select "View Dependencies" in SQL Server Management Studio, but this seems very tedious to have to go through all of them and then update our master schema before deploying it to all customers.
I thought about a find and replace if there is a way to script or export the functions but I fear a problem I will run into is one variable in one function might be declared as TransItemDescription NVARCHAR(100) and one might be TransItemDesc NVARCHAR (100). I've heard of people avoiding temp tables maybe because of these issues so maybe there is just bad database design here?
Thus far I've been going through them one at a time using "View Dependencies" in SSMS.
I think the best solution would be to script out the whole database into a single script from SSMS. Then use Notepad++ (or equivalent) to either find:
All occurrences of NVARCHAR(100)
All occurrences of the variable name, e.g. TransItemDescription, TransItemDesc.
Once you have found all occurrences then make a list of all of the functions to be fixed. Then you would still need to do a manual fix to all functions, but once complete the issue should be totally resolved.
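As an alternative to scripting out the database and searching the file, the same search can be sketched against the catalog views, since sys.sql_modules stores the source text of each function. This is a different technique from the Notepad++ approach above, not a replacement for the manual fix. The search strings are the ones mentioned in the question; anything else you need to match is up to you.

```sql
-- Sketch: list functions whose source text mentions the column or the old length.
-- The search strings come from the question; adjust them to your naming.
SELECT
    OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
    OBJECT_NAME(m.object_id)        AS function_name,
    o.type_desc
FROM sys.sql_modules AS m
JOIN sys.objects     AS o ON o.object_id = m.object_id
WHERE o.type IN ('FN', 'IF', 'TF')          -- scalar, inline and multi-statement functions
  AND (   m.definition LIKE N'%TransItemDesc%'
       OR m.definition LIKE N'%NVARCHAR(100)%'
       OR m.definition LIKE N'%NVARCHAR (100)%')
ORDER BY schema_name, function_name;
```

The LIKE comparison follows the collation of the definition column, so on a case-sensitive database a lowercase nvarchar(100) would need its own pattern.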
This seems ridiculously easy, but I can't find it anywhere...
I have a VERY simple sequence container with two tasks: Truncate a SQL table, and repopulate it from production. But this container will be repeated for about 50 tables. The container's name (entered manually) = the name of both the source and destination tables.
I have two variables:
"TableName" is entered manually.
"DelTable" is an expression that uses @[User::TableName] to generate a simple SQL statement.
I'm super-lazy and would like to use an expression to set "TableName" = the name of the current scope so I only have to enter it once.
Ideas???
THANK YOU!
If you are truncating all tables in a DB and reloading them from tables with exactly the same structure, how about this approach:
Execute SQL:
select table_name
from INFORMATION_SCHEMA.TABLES --Add a where to limit the tables to the ones you want
Save results to an object variable called TABLES
Add a for each loop:
Loop through ADO Object setting value to a string variable called table
Add Execute SQL to FE LOOP: truncate table ? and map parameter.
Add a 2nd Execute SQL statement:
INSERT INTO SERVER.DB.SCHEMA.?
select * from ?
Again map the parameters.
If you are having trouble mapping parameters (a ? marker generally cannot stand in for a table name), set up variables and use them to build the SQL statements to run.
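The "build the statement from variables" fallback can also be sketched in plain T-SQL rather than SSIS expressions. This is only an illustration of the idea: ProdServer and ProdDb are hypothetical names for the production source (assumed to be reachable as a linked server), and the script prints the generated statements instead of running them.

```sql
-- Sketch only: ProdServer and ProdDb are hypothetical names for the production source.
DECLARE @schema SYSNAME, @table SYSNAME, @sql NVARCHAR(MAX);

DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT TABLE_SCHEMA, TABLE_NAME
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_TYPE = 'BASE TABLE';   -- add further filters to limit the tables

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @schema, @table;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- QUOTENAME guards against odd characters in identifiers.
    SET @sql = N'TRUNCATE TABLE ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table) + N'; '
             + N'INSERT INTO ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table)
             + N' SELECT * FROM [ProdServer].[ProdDb].'
             + QUOTENAME(@schema) + N'.' + QUOTENAME(@table) + N';';

    PRINT @sql;                        -- review the generated statement first
    -- EXEC sys.sp_executesql @sql;    -- uncomment to actually run it

    FETCH NEXT FROM table_cursor INTO @schema, @table;
END;

CLOSE table_cursor;
DEALLOCATE table_cursor;
```

Tables with identity columns or foreign keys referencing them would need extra handling (TRUNCATE and INSERT ... SELECT * both have restrictions there), which this sketch does not attempt.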
@TomPhillips is correct. Unfortunately I cannot comment on that answer or vote it useful, hence commenting here.
There's no easy quick fix using a loop or automation unless all 50 tables have the same structure, which is rare by any stretch of the imagination.
BIML is the way to go if you are lazy :)
SSIS is not dynamic. Data Flows require fixed input and output at compile time, not runtime. You cannot simply change the table name and have it work.
If you have a list of 50 tables to do the same function on, you can use BIML to dynamically generate the SSIS package(s). But the DF itself cannot be dynamic.
I need to run a join query against an MS SQL Server 2014 DB based on a column value. The query runs when I execute it directly against the DB, but when I run it through Mule I get an error. The query looks something like this:
SELECT * FROM sch.emple JOIN sch.dept on sch.emple.empid = sch.dept.empid;
The above query works fine when run directly against the MS SQL Server DB, but gives the following error through MuleSoft:
Record cannot be mapped as it contains multiple columns with the same label. Define column aliases to solve this problem (java.lang.IllegalArgumentException). Message payload is of type: String
Please help me out.
Specify the column list explicitly:
SELECT e.<col1>, e.<col2>, ...., d.<col1>,...
FROM sch.emple AS e
JOIN sch.dept AS d
ON e.empid = d.empid;
Remarks:
You could use aliases instead of schema.table_name.
SELECT * in production code is bad practice in 95% of cases.
The duplicated column is empid (and possibly others). You could alias it, e.g. e.empid AS emple_empid and d.empid AS dept_empid, or just select e.empid once.
To avoid typing every column manually, you could drag and drop them from Object Explorer into the query pane (see "Drag and Drop Column List into query window").
A second way is to use a plugin such as Redgate SQL Prompt to expand SELECT * (see https://www.simple-talk.com/sql/sql-tools/sql-server-intellisense-vs.-red-gate-sql-prompt/).
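To make the aliasing remark concrete, here is a minimal sketch of the join with every output column given a unique label. Only empid is known from the question; emp_name and dept_name are hypothetical columns used purely for illustration.

```sql
-- emp_name and dept_name are hypothetical columns used only for illustration.
SELECT e.empid     AS emple_empid,
       e.emp_name,
       d.empid     AS dept_empid,
       d.dept_name
FROM sch.emple AS e
JOIN sch.dept  AS d
  ON e.empid = d.empid;
```

With no two output columns sharing a label, a consumer that maps each row by column name has nothing to collide on.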
Addendum
But the same query works directly.
It works because you don't bind them. Please read carefully the link I provided on the SELECT * antipattern, especially:
Binding Problems
When you SELECT *, it's possible to retrieve two columns of the same name from two different tables. This can
often crash your data consumer. Imagine a query that joins two
tables, both of which contain a column called "ID". How would a
consumer know which was which? SELECT * can also confuse views (at
least in some versions SQL Server) when underlying table structures
change -- the view is not rebuilt, and the data which comes back can
be nonsense. And the worst part of it is that you can take care
to name your columns whatever you want, but the next guy who comes
along might have no way of knowing that he has to worry about adding a
column which will collide with your already-developed names.
by Dave Markle
I have SQL Server 2012 installed that is used for a few different applications. One of our applications needs to be installed, but the company is saying that:
The SQL collation isn't correct, it needs to be: SQL_Latin1_General_CP1_CI_AS
You can just uninstall the SQL Server Database Engine & upon reinstall select the right collation.
What possible reason would this company have to want to change the collation of the database engine itself?
Yes, you are able to set the collation at the database level. To do so, here is an example:
USE master;
GO
ALTER DATABASE <DatabaseName>
COLLATE SQL_Latin1_General_CP1_CI_AS;
GO
You can alter the database collation even after you have created the database, using the following query:
USE master;
GO
ALTER DATABASE Database_Name
COLLATE Your_New_Collation;
GO
For more information on database collation, read here.
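Before changing anything, it may help to see which collations are actually in play. A small sketch that reports the instance-level collation and the collation of each database (tempdb included):

```sql
-- Instance (server) level collation
SELECT SERVERPROPERTY('Collation') AS instance_collation;

-- Collation of every database on the instance, including tempdb
SELECT name, collation_name
FROM sys.databases
ORDER BY name;
```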
What possible reason would this company have to want to change the collation of the database engine itself?
The other two answers are speaking in terms of Database-level Collation, not Instance-level Collation (i.e. "database engine itself"). The most likely reason the vendor wants one highly specific Collation (not just, for example, a case-insensitive one of your choosing) is that, like most folks, they don't really understand how Collations work. What they do know is that their application works (i.e. does not get Collation conflict errors) when the Instance and Database both have a Collation of SQL_Latin1_General_CP1_CI_AS. That is the Collation of the Instance and Database they develop the app on, because it is the default Collation when installing on an OS having English as its language.
I'm guessing that they have probably had some customers report problems that they didn't know how to fix, but narrowed it down to those Instances not having SQL_Latin1_General_CP1_CI_AS as the Instance / Server -level Collation. The Instance-level Collation controls not just tempdb meta-data (and default column Collation when no COLLATE keyword is specified when creating local or global temporary tables), which has been mentioned by others, but also name resolution for variables / parameters, cursors, and GOTO labels. Even if unlikely that they would be using GOTO statements, they are certainly using variables / parameters, and likely enough to be using cursors.
What this means is that they likely had problems in one or more of the following areas:
Collation conflict errors related to temporary tables:
tempdb being in the Collation of the Instance does not always mean that there will be problems, even if the COLLATE keyword was never used in a CREATE TABLE #[#]... statement. Collation conflicts only occur when attempting to combine or compare two string columns. So assuming that they created a temporary table and used it in conjunction with a table in their Database, they would need to be JOINing on those string columns, or concatenating them, or combining them via UNION, or something along those lines. Under these circumstances, an error will occur if the Collations of the two columns are not identical.
Unexpected behavior:
Comparing a string column of a table to a variable or parameter will use the Collation of the column. Given their requirement for you to use SQL_Latin1_General_CP1_CI_AS, this vendor is clearly expecting case-insensitive comparisons. Since string columns of temp tables (that were not created using the COLLATE keyword) take on the Collation of the Instance, if the Instance is using a binary or case-sensitive Collation, then their application will not be returning all of the data that they were expecting it to return.
Code compilation errors:
Since the Instance-level Collation controls resolution of variable / parameter / cursor names, if they have inconsistent casing in any of their variable / parameter / cursor names, then errors will occur when attempting to execute the code. For example, doing this:
DECLARE @CustomerID INT;
SET @customerid = 5;
would get the following error:
Msg 137, Level 15, State 1, Line XXXXX
Must declare the scalar variable "@customerid".
Similarly, they would get:
Msg 16916, Level 16, State 1, Line XXXXX
A cursor with the name 'Customers' does not exist.
if they did this:
DECLARE customers CURSOR FOR SELECT 1 AS [Bob];
OPEN Customers;
These problems are easy enough to avoid, simply by doing the following:
Specify the COLLATE keyword on string columns when creating temporary tables (local or global). Using COLLATE DATABASE_DEFAULT is handy if the Database itself is not guaranteed to have a particular Collation. But if the Collation of the Database is always the same, then you can specify either DATABASE_DEFAULT or the particular Collation. Though I suppose DATABASE_DEFAULT works in both cases, so maybe it's the easier choice.
Be consistent in casing of identifiers, especially variables / parameters. And to be more complete, I should mention that Instance-level meta-data is also affected by the Instance-level Collation (e.g. names of Logins, Databases, server-Roles, SQL Agent Jobs, SQL Agent Job Steps, etc). So being consistent with casing in all areas is the safest bet.
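A minimal sketch of the temporary-table case, under the assumption that two temp tables with deliberately different collations can stand in for "temp table in the Instance Collation" versus "table in the Database Collation". The table and column names are made up for the illustration.

```sql
-- Two temp tables whose string columns deliberately use different collations.
CREATE TABLE #instance_like (code VARCHAR(20) COLLATE Latin1_General_100_BIN2);
CREATE TABLE #database_like (code VARCHAR(20) COLLATE SQL_Latin1_General_CP1_CI_AS);

INSERT INTO #instance_like (code) VALUES ('abc');
INSERT INTO #database_like (code) VALUES ('ABC');

-- This join would fail with Msg 468 (cannot resolve the collation conflict):
-- SELECT * FROM #instance_like AS i JOIN #database_like AS d ON i.code = d.code;

-- Fix at query time: an explicit COLLATE on one side settles the conflict.
SELECT i.code AS instance_code, d.code AS database_code
FROM #instance_like AS i
JOIN #database_like AS d
  ON i.code COLLATE SQL_Latin1_General_CP1_CI_AS = d.code;

-- Fix at creation time: let the column follow the current database's collation.
CREATE TABLE #portable (code VARCHAR(20) COLLATE DATABASE_DEFAULT);

DROP TABLE #instance_like, #database_like, #portable;
```

Declaring the column with COLLATE DATABASE_DEFAULT up front is usually the tidier option, because the join conditions can then stay free of per-query COLLATE clauses.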
Am I being unfair in assuming that the vendor doesn't understand how Collations work? Well, according to a comment made by the O.P. on M.Ali's answer:
I got this reply from him: "It's the other way around, you need the new SQL instance collation to match the old SQL collation when attaching databases to it. The collation is used in the functioning of the database, not just something that gets set when it's created."
the answer is "no". There are two problems here:
No, the Collations of the source and destination Instances do not need to match when attaching a Database to a new Instance. In fact, you can even attach a system DB to an Instance that has a different Collation, thereby having a mismatch between the attached system DB and the Instance and the other system DBs.
It's unclear if "database" in that last sentence means actual Database or the Instance (sometimes people use the term "database" to refer to the RDBMS as a whole). If it means actual "Database", then that is entirely irrelevant because the issue at hand is the Instance-level Collation. But, if the vendor meant the Instance, then while true that the Collation is used in normal operations (as noted above), this only shows awareness of simple cause-effect relationship and not actual understanding. Actual understanding would lead to doing those simple fixes (noted above) such that the Instance-level Collation was a non-issue.
If needing to change the Collation of the Instance, please see:
Changing the Collation of the Instance, the Databases, and All Columns in All User Databases: What Could Possibly Go Wrong?
For more info on working with Collations / encodings / Unicode / etc, please visit:
Collations.Info
Edit: I'm aware that SELECT * is bad practice, but it's used here just to focus the example SQL on the table statement rather than the rest of the query. Mentally exchange it for some column names if you prefer.
Given a database server MyServer (which we are presently connected to in SSMS), with several databases MyDb1, MyDb2, MyDb3 etc. and default schema dbo, are any of the following equivalent queries (they all return exactly the same result set) more "optimal" than the others?
SELECT * FROM MyServer.MyDb1.dbo.MyTable
I was told that this method (explicitly providing the full database name including server name) treats MyServer as a linked server and causes the query to run slower. Is this true?
SELECT * FROM MyDb1.dbo.MyTable
The server name isn't required as we're already connected to it, but would this run 'faster' than the above?
USE MyDb1
GO
SELECT * FROM dbo.MyTable
State the database we're using initially. I can't imagine that this is any better than the previous for a single query, but would it be more optimal for subsequent queries on the same database (ie, if we had more SELECT statements in the same format below this)?
USE MyDb1
GO
SELECT * FROM MyTable
As above, but omitting the default schema. I don't think this makes any difference. Does it?
SQL Server will always look for the objects you specify within the current "context" if you do not specify a fully qualified name.
Is one faster than the other? Sure, in the same way that a file on your hard drive named "This is a really long name for a file but as long as it is under 254 it is ok.txt" takes up more directory (TOC) space than "x.txt". Will you ever notice it? No!
As for the "USE" keyword, it just sets the context for you so you don't have to fully qualify object names. Note that "USE" cannot appear inside a stored procedure, and it is not the same kind of thing as "GO": "GO" is not SQL at all but a batch separator interpreted by client tools such as SSMS, whereas "USE" is a T-SQL statement that the server executes to change the database context.
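A quick way to see the "context" idea in practice, using the database and table names from the question (assumed to exist on the instance you are connected to):

```sql
-- Which database is the current context?
SELECT DB_NAME() AS current_database;

-- A three-part name resolves the same way whatever the current context is.
SELECT * FROM MyDb1.dbo.MyTable;
```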