SQL Server collation default for US/English - best practice - sql-server

Briefly stated: As a Windows DBA in the US/English locale, is it worth setting Latin1_General_CI_AS as your system-level default on new SQL Server builds, or is it wiser to stick with the SQL_Latin1_General_CP1_CI_AS default for the foreseeable future? Do we think Microsoft will eventually make Latin1_General_CI_AS the default in the US?
More background:
https://learn.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-ver16#Server-level-collations
Per Microsoft, if your Windows locale is any English-speaking country except the US, your default SQL Server collation during setup is Latin1_General_CI_AS; in the US it is still SQL_Latin1_General_CP1_CI_AS for backward compatibility. New databases take the server collation by default (unless you override it), and so on.
We have kept SQL_Latin1_General_CP1_CI_AS as the default on our own SQL Server builds, and our in-house databases have evolved with that same collation. We just received our first database with Latin1_General_CI_AS from a third party, and it got me wondering whether we should get behind the "newer" way of collating, especially if we cross-join or otherwise interact between databases in the same environment.
I know we can use COLLATE clauses for joins between databases with different collations, but I was curious whether folks are preemptively configuring Latin1_General_CI_AS as their server-level default in the US/English locale, either to be more compatible with folks in other countries or to get ahead of any possible "final pivot" on Microsoft's side in future Windows builds.
Thanks!

Briefly stated: As a Windows DBA in the US/English locale, is it worth setting Latin1_General_CI_AS as your system-level default on new SQL server builds
Briefly answered: no.
The only time you're going to care about the instance collation is when an existing workload breaks because of it. And sticking with the old SQL collation instead of the better-and-more-modern Windows collation is the lower-risk path for DBAs.
Individual databases can adopt Latin1_General_CI_AS and accept that tempdb will use SQL_Latin1_General_CP1_CI_AS, or can adopt partial database containment and get contained database collation.
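Both options can be sketched in T-SQL (the database, table, and column names here are invented for illustration):

-- Cross-database join between databases with different collations; the
-- explicit COLLATE clause resolves the "cannot resolve the collation
-- conflict" error:
SELECT o.CustomerName
FROM OurDb.dbo.Customers AS o
JOIN VendorDb.dbo.Customers AS v
    ON o.CustomerName = v.CustomerName COLLATE SQL_Latin1_General_CP1_CI_AS;

-- Or make the third-party database partially contained (SQL Server 2012+),
-- so its temp tables use the database collation rather than tempdb's:
ALTER DATABASE VendorDb SET CONTAINMENT = PARTIAL;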

I cannot speak to a future change in the installation default but, having worked in organizations with varying instance collations, I suggest you stick with whatever collation is the standard for your organization (including a de facto one). This is typically SQL_Latin1_General_CP1_CI_AS in the US, since it's the installation default.
Standardizing on a collation facilitates instance consolidation, replication, etc. Third party applications may require a different collation.

Import MS Access 2019 desktop tables and data into SQL Server 2019 Developer edition

Can anyone please suggest a way to import an MS Access 2019 desktop database (tables and data) into SQL Server 2019 Developer edition to create a new SQL Server database?
I know this question has been asked many times for earlier versions of this software but I am hoping that there may be a 2019+ method for the 2019 versions.
Thanks in advance.
If you're looking to import just one or two tables, then use the SQL Server Management Studio import tools. However, those imports don't preserve your primary keys and indexes, and such an import does not support relationships between tables.
However, if you're looking to move up lots of tables, keep your primary keys, keep your indexing, and also move the related data between tables, then I suggest you use SQL Server Migration Assistant (SSMA) for Access.
SSMA can be found here:
https://www.microsoft.com/en-us/download/details.aspx?id=54255
Note that you can download either an x86 or an x64 version. While your local running copy of SQL Server can (and even should) be 64-bit, it is still very common for your Access/Office install to be 32-bit. In that case, you want to download and run the x86 version of SSMA.
While totally free, it is a relatively complex package, so try a few test migrations first. I highly recommend this package since, as noted, it has the smarts to move not only tables, but also indexing and even the relationships between tables.
I also strongly suggest that you change the default mapping for the data types. By default, SSMA will use datetime2 for any Access Date/Time columns, and I strongly suggest you change that default back to SQL Server's datetime for dates; you really do not want the datetime2 default.
You can also have it try to move your saved Access queries up to SQL Server as views. In most cases I don't recommend this, but each use case is different; that's another long post.
In summary:
To import a table or maybe 2-3 tables, use the SQL Server Management Studio import tools.
To import a whole lot of tables, and keep things like relationships intact, use SSMA for Access.

When we run SQL Server 2008 R2 in compatibility mode 80, what features do we get?

We have SQL Server 2008 R2 running in compatibility mode 80 (2000) because we have a lot of discontinued features in use. Initially I thought I would get only the features of 2000, but as a pleasant surprise I saw CTEs work, so I thought this was a superset case: we have access to all features of 2000, 2005, 2008, and R2. But recently, when I was playing around with DMVs/DMFs, I tried to pass sql_handle to sys.dm_exec_sql_text and it did not work. A bit of googling showed me that I need to change the compatibility mode, as this will not work in compatibility mode 80. So what features do we have access to when we use 2008 R2 in compatibility mode 80 (2000)?
Also does this compatibility mode apply on SSIS?
From the comments I realize it is partial backward compatibility, so in my scenario I get all the features of 2000 and some of 2008.
The exact set of language features that will work in a given compatibility mode depends on the hosting server's version. For example, a database at compatibility level 80 running on SQL Server 2005 can have some differences from the same-level database running on SQL Server 2008 R2.
The underlying query planner and other aspects of the database engine don't change by changing compatibility level, but some undocumented default situations might behave differently. For example, even though a bad choice, some developers depended on the default ordering of rows in a SQL Server 2000 database, but that default ordering, being undocumented, changed in some cases in 2005, causing a problem when that database was run on 2005 with compatibility mode 80. Of course, depending on default ordering is a huge no-no anyway, but this is an example where the underlying engine changed, whereas the actual code executed did not.
What you'd need to fully answer this is, for every version of SQL Server, a list of all the back-level versions it supports as compatibility modes, and, for each level that server version supports, a full list of all language features allowed and all those disallowed (that may be allowed at a later level) at that compatibility level on that specific server version.
While I've found some examples of later features being allowed at lower levels (like SQL Server 2008 R2 allowing TRY...CATCH in a compatibility 80 database), I haven't found anything close to definitive lists, which would be helpful.
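As a quick sketch (YourDb is a placeholder name), you can inspect and raise the level yourself:

-- See the current compatibility level:
SELECT name, compatibility_level
FROM sys.databases
WHERE name = 'YourDb';

-- Raise it so level-restricted constructs such as passing sql_handle to
-- sys.dm_exec_sql_text via CROSS APPLY start working:
ALTER DATABASE YourDb SET COMPATIBILITY_LEVEL = 100;  -- 100 = SQL Server 2008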

What is the real benefit of contained databases

SQL Server 2012 introduces the contained database. What is the real purpose of this feature? What drawbacks of previous versions does it fix?
They are being developed to make migration of databases between systems easier (both your databases, and databases on SQL Azure that they need to move around to balance resources). Anything that has a dependency outside of the database is considered a risk, because it's extra scaffolding that has to go with the database - easy to forget, easy to get wrong, easy to fall out of sync.
For example, in Denali these issues are addressed:
Today when you move a database to another server, you also have to migrate all the SQL logins at the server level - this can be a pain especially when the SIDs get out of sync. With contained databases, database-level users that don't have a tie to a SQL Server login just come along for the ride when a database is backed up, detached, mirrored, replicated, etc. Nice and easy.
If you have a database whose collation differs from the server collation, you may find that you get collation conflicts when you join or perform other operations with #temp tables, because the #temp tables that get created inherit the server collation, not that of the calling database. While you can work around that by specifying a COLLATE clause on every single column reference, with contained databases, #temp tables inherit the collation of the calling database, overriding the server collation.
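In a non-contained database, a common workaround (a sketch; the table and column names are invented) is to stamp #temp table columns with the calling database's collation up front:

-- Create the #temp table with the current database's collation rather
-- than letting its columns default to the server collation:
CREATE TABLE #Staging
(
    CustomerName nvarchar(100) COLLATE DATABASE_DEFAULT NOT NULL
);

-- Joins between #Staging and permanent tables now compare under the same
-- collation, with no per-column COLLATE clauses needed.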
THROW() can almost fall into this category as well - since you no longer have to use sys.messages to store custom messages. This is not as common as the above two issues, but it certainly does make migrating to a new server work better if there is no requirement to also keep sys.messages in sync. This is not restricted to contained databases, but it plays the same role.
For things that don't meet "containment" criteria, there is a DMV that can show you a list of things that will potentially break if you move them to another server. For example, a call to a three- or four-part name.
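That DMV is sys.dm_db_uncontained_entities; a minimal query against it (run in the database you plan to move) might look like this:

-- List entities that would break containment, e.g. objects referencing
-- three- or four-part names or server-level features:
SELECT class_desc,
       OBJECT_NAME(major_id) AS object_name,
       feature_name
FROM sys.dm_db_uncontained_entities;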
In future versions, there are other issues that will be addressed. For example:
SQL Server Agent is an external dependency. When you move a database to a different server, SQL Agent jobs that reference that database do not automatically move with the database, you have to determine which ones are affected and script them out yourself (it is not quite as simple as just bringing along msdb too). In a future version of SQL Server, I envision that either (a) each database will be able to have its own Agent, or (b) Agent will be moved to an OS-level architecture, where some translation layer tells you where the database is, instead of having to have Agent live on the same machine. The latter option can get complicated when we're talking about Azure, geo-disparate networks, etc.
Linked Servers are also an external dependency. This could be easily solved with database-level linked servers - especially since these are little more than synonym containers / pointers.
There are others, but those are the heavy hitters.

SQL Server: how to determine what will break when downgrading a database?

We're building an application for a client with the assumption that they'd be upgrading to a minimum of SQL Server 2005 from SQL Server 2000. We've finished our application, built on 2005, and are ready to integrate. It turns out that they're not going to upgrade their DB server.
So, now we're stuck with trying to sort out what will break.
We don't have access to SQL Server 2000, so we can only change the compatibility of the database to 80.
Aside from complete testing and reviewing every stored procedure (and I've read that changing the compatibility mode is not foolproof - so testing wouldn't be bombproof), is there any other way to determine what will break? Any tools out there? Scripts?
Edit
I'd prefer not to try restoring this onto their production DB server to see what errors are spit out, so that's not a good option.
I suggest you look in Books Online for the page that spells out the differences between the two versions and look for those things. You can go over the list and then search for the new keywords in the table where the stored procedure text is stored. That will give you a starting list.
#rwmnau noted some good ones, I'll add two more
SQL Server 2000 does not have varchar(max) or nvarchar(max); use text/ntext instead.
SQL Server 2000 also does not have SSIS. If you are creating SSIS packages to import data, move data to a data warehouse, or export data, all of those need to be redone in DTS.
Also it looks to me like you can still download the free edition of SQL Server 2000:
http://www.microsoft.com/downloads/details.aspx?familyid=413744d1-a0bc-479f-bafa-e4b278eb9147&displaylang=en
You might want to do that and test on that.
I wouldn't be worried about your ANSI-SQL (setting the database compatibility level should take care of most of that), but there are a few big features you may have used that aren't available in SQL 2000 (there are many more, but these are the ones I've seen that are most popular):
Common Table Expressions (CTE) - http://msdn.microsoft.com/en-us/library/ms190766.aspx
TRY...CATCH blocks
CLR-integrated stored procs
Also, though you shouldn't be, any selections directly from system tables (objects that begin with "sys" or are in the "sys." schema) may have changed dramatically between SQL 2000 and 2005+, so I'd see if you're selecting from any of those:
SELECT *
FROM syscomments --I know, using a sys table to figure it out :)
WHERE text like '%sys%'
Also, it's worth noting that while extended support is available for a hefty fee, Microsoft has officially ended mainstream support for SQL 2000, and will end extended support in the near future. This leaves your client (and you) without any updates from Microsoft in the case of security patches, bugs, or anything else you discover. I'd strongly encourage them to upgrade to a newer version (at least 2005), though I suspect you've already been down that road.

What are the limitations to SQL Server Compact? (Or - how does one choose a database to use on MS platforms?)

The application I want to build using MS Visual C# Express (I'm willing to upgrade to Standard if that becomes required) needs a database.
I was all psyched about the SQL Server Compact - because I don't want the folks who would be installing my application on their computers to have to install the whole of SQL Server or something like that. I want this to be as easy as possible for the end user to install.
So I was all psyched until it seems that there are limitations to what I can do with the columns in my tables. I created a new database, created a table and when I went to create columns it seems that there isn't a "text" datatype - just something called "ntext" that seems to be limited to 255 characters. "int" seems to be limited to 4 (I wanted 11). And there doesn't seem to be an "auto_increment" feature.
Are these the real limitations I would have to live with? (Or is it because I'm using "Express" and not "Standard"). If these are the real limitations, what are my other database options that meet my requirements? (easy installation for user being the biggie - I'm assuming that my end user is just an average user of computers and if it's complicated would get frustrated with my application)
-Adeena
PS: I also want my database data to be encrypted to the end user. I don't want them to be able to access the database tables directly.
PPS. I did read: http://www.microsoft.com/Sqlserver/2005/en/us/compact.aspx and didn't see a discussion on these particular limitations
I'm not sure about encryption, but you'll probably find this link helpful:
http://msdn.microsoft.com/en-us/library/ms171955.aspx
As for the rest of it:
"Text" and "auto_increment" remind me of Access. SQL Server Compact is supposed to be upgrade compatible to the server editions of SQL Server, in that queries and tables used in your compact database should transfer to a full database without modification. With that in mind, you should first look at the SQL Server types and names rather than Access names: in this case namely varchar(max), bigint, and identity columns.
Unfortunately, you'll notice this fails with respect to varchar(max), because Compact Edition doesn't yet have the varchar(max) type. Hopefully they'll fix that soon. However, the ntext type you were looking at supports many more than 255 bytes: 2^30 in fact, which amounts to more than 500 million characters.
Finally, bigint uses 8 bytes for storage. You asked for 11. However, I think you may be confused in thinking that the storage size indicates the number of decimal digits available. This is definitely not the case: 8 bytes of storage allows for 2^64 distinct values, which will accommodate many more than 11 digits. If you have that many items you probably want a server-class database anyway. If you really want to think in terms of digits, there is a numeric type provided as well.
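If the "11" really means eleven decimal digits, a short sketch (the table and column names are invented):

-- numeric(11, 0) holds exactly up to 11 digits (99,999,999,999 max):
CREATE TABLE Orders
(
    OrderNumber numeric(11, 0) NOT NULL
);

-- bigint (8 bytes) also covers 11 digits comfortably; its maximum is
-- 9,223,372,036,854,775,807 (19 digits).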
A few, hopefully helpful comments:
1st - do not use SQLite unless you are fine with the entire database being locked during writes (http://www.sqlite.org/faq.html#q6) and, perhaps more importantly for a .NET application, with the fact that it is not thread safe unless compiled with thread support (http://www.sqlite.org/faq.html#q6).
As an alternate for my current project I looked at Scimore DB (they have an embedded version with ADO.Net provider: http://www.scimore.com/products/embedded.aspx) but I needed to use LINQ To SQL as an O/RM so I had to use Sql Server CE.
The auto increment (if you are referring to automatic key incrementing) is what it always has been - example table:
-- Table Tests
CREATE TABLE Tests (
    Id int IDENTITY(1,1) PRIMARY KEY NOT NULL,
    TestName nvarchar(100) NOT NULL,
    TimeStamp datetime NOT NULL
)
GO
As far as the text size I think that was answered.
Here is a link to information on encryption from microsoft technet: (http://technet.microsoft.com/en-us/library/ms171955.aspx)
Hope this helps a bit....
Had to chime in on two factors:
I use SQL Compact a lot and it's great for what it's designed for: a single-user, embedded database with a single-file data store. It has all the SQL goodness and transactions. It handles parallelism well enough for me. Notice that few of the naysayers on this page use the product regularly. Don't use it on a server; that's not what it's for. Many of my customers don't even know the file is a "database"; that is just an implementation detail.
You want to encrypt the data from your users, presumably so they can only view it from your program. This simply isn't going to happen. If your program can decrypt the data, then you have to store the key somewhere, and a sufficiently dedicated attacker will find it, period.
You may be able to hide the key well enough that the effort to recover it isn't worth the value of the information. Windows has some neat machine- and user-local encryption routines to help. But if your design has a strong requirement that a user never find data you have hidden on their computer (while your program can), you need to redesign; that guarantee simply cannot be accomplished.
SQL CE is a puzzle to me. Did we really need yet another different SQL database platform? And it's the third in the last several years targeted at mobile platforms from MS ... I wouldn't have a lot of confidence that it will be the final one. It doesn't share much if any technology with SQL Server - it's a new one from scratch as far as I can tell.
I've tried it, and then been more successful with both SQLite and Codebase.
EDIT: Here is a list of the (many) differences.
ntext supports very large text data (see MSDN - this is for Compact 4.0, but the same applies to 3.5 for the data types you are mentioning).
int is a numeric data type, so the size of 4 means 4 bytes (32 bits) of storage, covering -2,147,483,648 to 2,147,483,647. If you literally intend to store 11 bytes of raw data in a single column, use the varbinary type with a size of 11; if you want an integer with more digits than int holds, use bigint or numeric.
Automatically incrementing columns in the SQL Server world are done using the IDENTITY keyword. This causes the value of the column to be automatically determined by SQL Server when inserting data into a row, preventing collisions with any other rows.
You can also set a password or encrypt the database when creating it in SQL Compact to prevent users from directly accessing your data. See Securing Databases on MSDN.
All of the items you mention above are not really limitations, so much as they are understanding how to use SQL Server.
Having said that, there are some limitations to SQL Compact.
No support for NVARCHAR(MAX)
NTEXT works just fine for this
No support for VIEWs or PROCEDUREs
This is what I see as the primary limitation
I've used the various SQL Server Compact editions on a few occasions, but only ever as data-capture repositories on mobile platforms, where it works well for syncing with a server database; for that sort of scenario it is undoubtedly the optimal choice.
However, if you need something to do more than that and act as the primary database for your application, then I'd suggest SQLite is probably the better option. It's completely solid, widely supported, and found in all sorts of places (it's used on the iPhone, for example), yet surprisingly capable (the virtual-reality simulator OpenSim uses it as its default database). There are lots of other options too (including Microsoft's).
I must also chime in here with VistaDB as an alternative to SQL CE.
VistaDB does support encryption (Blowfish), it also supports TEXT as well as NTEXT (including FTS indexes on them).
And yes the post above is correct in that you have to look at the SQL Server types to really match them up, VistaDB also uses the SQL Server types (we actually support more than SQL CE does; only missing XML).
To see other comparisons between VistaDB and SQL CE visit the comparison page. Also see the SO thread on Advantages of VistaDB for more information.
(Full disclosure - I am the owner of VistaDB so I may be biased)
According to this post (http://www.nelsonpires.com/web-development/microsoft-webmatrix-the-dawn-of-a-new-era/), because it uses a single database file, only one process can access it at a time for reads/writes (it needs exclusive access to the file), it is limited to 256 connections, and the whole file will most likely have to be loaded into memory. So SQL Server Compact might not be good for your site as it grows.
There are constraints... Joel seems to have addressed the details. SQL CE is really geared for mobile development. Most of the "embedded" database solutions have similar constraints. Check out
SQLite
No TEXT field character limit
Auto increment only on INTEGER PRIMARY KEY column
Some third party encryption support
Esent
(unmanaged code isn't my forte, and I can't decipher the unmanaged docs)
