Microsoft SQL Server 2014 sequence - sql-server

A brief description: I am building a VB.NET program from scratch, including the database. I therefore need to set a unique transaction ID in each table, for example header_id, detail_id, order_number, and many more that require a running increment number. I am upgrading from SQL Server 2005 to SQL Server 2014 so that I can use the built-in SEQUENCE for the running-number job.
My current situation (SQL Server 2005, VB.NET) is that I am using a table to store all the running numbers and a stored procedure that my VB.NET program calls to produce them. For example, for a Sales Order I pass a hard-coded parameter to the stored procedure, which looks up the value in the table, increases the number by 1, and then inserts it into the Sales Order table.
Before I start migrating the database and redesigning the table structure, I would like to know whether I am starting off correctly: do I have to assign a specific sequence to each table? Please guide.

Usually you do not need a SEQUENCE to generate unique, increasing identity values for single tables. Even with SQL Server 2005, you have two simpler options for that:
Define an IDENTITY column. For example:
CREATE TABLE Orders
(
    OrderId INT IDENTITY(1,1) NOT NULL,
    …           -- ^^^^^^^^^^^^^
);              -- very much like an unnamed sequence with START WITH 1 INCREMENT BY 1
When INSERT-ing into this table, you do not need to specify a value for OrderId; it will be chosen for you by the RDBMS. The resulting IDs will be unique (but there is the possibility of gaps).
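For example, a typical insert looks like this (a sketch; CustomerName stands in for whatever real columns Orders has):

-- OrderId is generated by the server; we only supply the other columns.
-- (CustomerName is a hypothetical column for illustration.)
INSERT INTO Orders (CustomerName) VALUES (N'Alice');

-- Retrieve the value that was just generated in this scope:
SELECT SCOPE_IDENTITY() AS NewOrderId;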
Instead of using integer number IDs, use GUIDs:
CREATE TABLE Orders
(
    OrderId UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL DEFAULT (NEWSEQUENTIALID()),
    …
);
The DEFAULT constraint means you don't have to explicitly choose a value for OrderId when INSERT-ing; the RDBMS will generate a value for you.
P.S.: NEWSEQUENTIALID() ensures that the generated GUIDs are steadily increasing. This is important if the GUID column is used for clustering (i.e. when you have a PRIMARY KEY CLUSTERED (OrderId ASC) constraint), as mentioned in a comment below. If the column is not used for clustering and it's only important that GUIDs are unique, but not necessarily increasing, then you can also use NEWID() instead.
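For instance, the clustered-key case mentioned above would look like this (a sketch; the constraint name is made up):

CREATE TABLE Orders
(
    OrderId UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL DEFAULT (NEWSEQUENTIALID()),
    -- NEWSEQUENTIALID() keeps newly inserted rows at the logical end of the
    -- clustered index, avoiding the page splits random NEWID() values cause.
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId ASC)
);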
Of course you can also use a SEQUENCE, but a sequence has no added benefit over the two (simpler) solutions above. This changes when you have to create unique IDs across several tables, for example:
CREATE SEQUENCE OrderIds START WITH 1 INCREMENT BY 1;

CREATE TABLE FlowerOrders
(
    OrderId INT NOT NULL DEFAULT (NEXT VALUE FOR OrderIds),
    …
);

CREATE TABLE FlowerPotOrders
(
    OrderId INT NOT NULL DEFAULT (NEXT VALUE FOR OrderIds),
    …
);
This way it should be impossible that FlowerOrders and FlowerPotOrders contain overlapping OrderIds.
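You can also draw values from the sequence directly, which makes it easy to verify that both tables share one number series:

-- Each call returns the next value from the shared series,
-- no matter which table it will end up in.
SELECT NEXT VALUE FOR OrderIds AS NextOrderId;  -- e.g. 1
SELECT NEXT VALUE FOR OrderIds AS NextOrderId;  -- e.g. 2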

Related

SQL Server non-clustered index is not being used

An application (the underlying code of which can't be modified) is running a select statement against a relatively large table (~1.8M rows, 2GB) and creating a huge performance bottleneck on the DB server.
The table itself has approx. 120 columns of varying datatype. The select statement is selecting about 100 of those columns based on values of 2 columns which have both been indexed individually and together.
e.g.
SELECT
column1,
column2,
column3,
column4,
column5,
and so on.....
FROM
ITINDETAIL
WHERE
(column23 = 1 AND column96 = 1463522)
However, SQL Server chooses to ignore the index and instead scans the clustered (PK) index which takes forever (this was cancelled after 42 seconds, it's been known to take over 8 minutes on a busy production DB).
If I simply change the select statement, replacing all the columns with just a SELECT COUNT(*), the index is used and results return in milliseconds.
EDIT: I believe ITINDETAIL_004 (where column23 and column96 have been indexed together) is the index that should be getting used for the original statement.
Confusingly, if I create a non-clustered index on the table for the two columns in the where clause like this:
CREATE NONCLUSTERED INDEX [ITINDETAIL20220504]
ON [ITINDETAIL] ([column23] ASC, [column96] ASC)
INCLUDE (column1, column2, column3, column4, column5,
and so on..... )
and include ALL the other columns from the select statement in the index (in my mind, this is essentially just creating a second copy of the table!) and then run the original select statement, the new HUGE index is used and results are returned very quickly.
However, I'm not sure this is the right way to address the problem.
How can I force SQL Server to use what I think is the correct index?
Version is: SQL Server 2017
Based on additional details from the comments, it appears that the index you want SQL Server to use isn't a covering index; this means that the index doesn't contain all the columns that are referenced in the query. As such, if SQL Server were to use said index, then it would need to first do a seek on the index, and then perform a key lookup on the clustered index to get the full details of the row. Such lookups can be expensive.
As a result of the index you want not being covering, SQL Server has determined that the index you want it to use would produce an inferior query plan to simply scanning the entire clustered index, which is by definition covering, as it contains all the columns not in the clustering key.
For your index ITINDETAIL20220504 you have INCLUDEd all the columns that are in your SELECT, which means that it is covering. SQL Server can therefore perform a seek on the index and get all the information it needs from that seek, which is far less costly than a seek followed by a key lookup, and quicker than a scan of the entire clustered index. This is why this index works.
We could put this into an analogy using a library scenario, full of books, to help explain the idea:
Let's say that the clustered index is a list of every book in the library, sorted by its ISBN (the primary key). Alongside that ISBN you have the details of the author, title, publication date, publisher, whether it's hardcover or softcover, the colour of the spine, the section of the library the book is located in, the bookcase, and the shelf.
Now let's say you want to obtain any books by the author Brandon Sanderson published on or after 2015-01-01. You could go through the entire list, one by one, finding the books by that author, checking the publication date, and then writing down each book's location so you can go and visit each of those locations and collect it. This is effectively a Clustered Index Scan.
Now let's say you have another list of all the books in the library. This list contains the author, publication date, and the ISBN (the primary key), and is ordered by author and publication date. You want to fulfil the same task: obtain any books by the author Brandon Sanderson published on or after 2015-01-01. Now you can easily go through that list and find all those books, but you don't know where they are. Even after you have gone straight to the Brandon Sanderson "section" of the list, you'll still need to write all the ISBNs down, find each of those ISBNs in the original list, and get their location and title. This is your index ITINDETAIL_004: you can easily find the rows you want to filter to, but you don't have all the information, so you have to go somewhere else afterwards.
Lastly we have a third list, ordered by author and then publication date (like the second list), but which also includes the title, the section of the library the book is located in, the bookcase, and the shelf, as well as the ISBN (primary key). This list is ideal for your task: it's in the right order, as you can easily go to Brandon Sanderson and then the first book published on or after 2015-01-01, and it has the title and location of each book. This is what your index ITINDETAIL20220504 would be: it has the information in the order you want, and contains all the information you asked for.
Saying all this, you can force SQL Server to choose the index, but as I said in my comment:
In truth, if SQL Server thinks otherwise, the index you think is correct rarely is. Despite what some believe, SQL Server is very good at making informed choices about which index(es) it should use, provided your statistics are up to date and the cached plan isn't based on old, out-of-date information.
Let's, however, demonstrate what happens if you do force it, with a simple setup:
CREATE TABLE dbo.SomeTable (ID int IDENTITY(1,1) PRIMARY KEY,
                            SomeGuid uniqueidentifier DEFAULT NEWID(),
                            SomeDate date);
GO
INSERT INTO dbo.SomeTable (SomeDate)
SELECT DATEADD(DAY, T.I, '19000101')
FROM dbo.Tally(100000, 0) T;
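(dbo.Tally is a user-defined numbers function, not something built in to SQL Server; a minimal sketch of a compatible one, assuming the signature Tally(rows, start), might be:)

-- Inline numbers function: returns @Rows + 1 sequential integers
-- starting at @Start, in a column named I.
CREATE FUNCTION dbo.Tally (@Rows int, @Start int)
RETURNS TABLE
AS
RETURN
    SELECT TOP (@Rows + 1)
           @Start + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS I
    FROM sys.all_columns a
    CROSS JOIN sys.all_columns b;
GO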
Now we have a table with 100001 rows and a primary key on ID. Let's run a query which is an overly simplified version of yours:
SELECT ID,
SomeGuid
FROM dbo.SomeTable
WHERE SomeDate > '20220504';
No surprise, this results in a Clustered Index Scan.
Ok, let's add an index on SomeDate and run the query again:
CREATE INDEX IX_NonCoveringIndex ON dbo.SomeTable (SomeDate);
GO
--Index Scan of Clustered index
SELECT ID,
SomeGuid
FROM dbo.SomeTable
WHERE SomeDate > '20220504';
Same result, and SSMS has even suggested an index.
Now, as I mentioned, you can force SQL Server to use a specific index. Let's do that and see what happens:
SELECT ID,
SomeGuid
FROM dbo.SomeTable WITH (INDEX(IX_NonCoveringIndex))
WHERE SomeDate > '20220504';
And this gives exactly the plan I suggested: a key lookup.
This is expensive. In fact, if we turn on the statistics for IO and time, the query without the index hint took 40ms while the one with the hint took 107ms on the first run. Subsequent runs all had the second query taking around double the time of the first. IO-wise, the first query needs a single scan and 398 logical reads; the latter had 5 scans and 114,403 logical reads!
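(Those numbers come from session statistics, which you can switch on yourself to reproduce this kind of comparison:)

-- Emit per-query I/O and elapsed-time figures to the Messages tab.
SET STATISTICS IO, TIME ON;

-- ... run the queries being compared ...

SET STATISTICS IO, TIME OFF;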
Now, finally, let's add that covering Index and run:
CREATE INDEX IX_CoveringIndex ON dbo.SomeTable (SomeDate) INCLUDE (SomeGuid);
GO
SELECT ID,
SomeGuid
FROM dbo.SomeTable
WHERE SomeDate > '20220504';
Here we can see the seek we wanted.
If we look at the IO and times again, compared to the prior two we get 1 scan, 202 logical reads, and a runtime of about 25ms.

The row-limitation in compound primary key in SQL Server 2014

I am going to insert 2.3 billion rows (2,300,000,000) from table_a into table_b. The schemas of table_a and table_b are identical; the only difference is that table_a doesn't have a primary key, while table_b has a 4-column compound primary key set up and 0 rows of data. I encountered this error message after 24 hours:
Msg 666, Level 16, State 2, Line 1
The maximum system-generated unique value for a duplicate group was exceeded for index with partition ID 422223771074560. Dropping and re-creating the index may resolve this; otherwise, use another clustering key.
This is my compound PK in table_b, followed by the sample query code; any help will be appreciated.
column1: varchar(10), not null
column2: nvarchar(50), not null
column3: nvarchar(100), not null
column4: int, not null
Sample code
insert into table_b
select *
from table_a
where date < '2017-01-01' -- some filters here
According to the SQL Server documentation, part of creating a primary key includes creating a unique index on that same table.
When you create a PRIMARY KEY constraint, a unique index on the
column, or columns, is automatically created. By default, this index
is clustered; however, you can specify a nonclustered index when you
create the constraint.
When a unique index is not on the table, each row gets what the docs call a "uniqueifier", which is 4 bytes in length (i.e. ~2.14 billion combinations):
If the clustered index is not created with the UNIQUE property, the
Database Engine automatically adds a 4-byte uniqueifier column to the
table. When it is required, the Database Engine automatically adds a
uniqueifier value to a row to make each key unique. This column and
its values are used internally and cannot be seen or accessed by
users.
From this information and your error message we can tell two things:
There is a clustered index on the table
There is not a primary key on the table
Given the volume of data you're dealing with, I'm betting you have a Clustered Columnstore Index on the table, which in SQL Server 2014 cannot have a primary key on it.
One possible solution is to partition table_b on a particular column value (one with fewer than 15,000 unique values, per the limitations specified in the documentation). As a side note, the same partitioning effort could significantly cut the run time of any queries using table_b, depending on which column is used in the partition function.
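A minimal sketch of that idea, assuming table_b has a date column suitable for partitioning by year (all names here are made up):

-- Hypothetical partitioning of table_b by year on its date column.
CREATE PARTITION FUNCTION pf_ByYear (date)
AS RANGE RIGHT FOR VALUES ('2015-01-01', '2016-01-01', '2017-01-01');

CREATE PARTITION SCHEME ps_ByYear
AS PARTITION pf_ByYear ALL TO ([PRIMARY]);

-- The table's clustered (columnstore) index is then rebuilt on the scheme:
-- CREATE CLUSTERED COLUMNSTORE INDEX cci_table_b ON table_b ON ps_ByYear(date);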
You know that:
If the clustered index is not created with the UNIQUE property, the
Database Engine automatically adds a 4-byte uniqueifier column to the
table. When it is required, the Database Engine automatically adds a
uniqueifier value to a row to make each key unique. This column and
its values are used internally and cannot be seen or accessed by
users.
While it's unlikely that you will face an issue related to uniqueifiers, we have seen rare cases where a customer reaches the uniqueifier limit of 2,147,483,648, generating error 666.
And from this topic about the issue we have:
As of February 2018, the design goal for the storage engine is to not
reset uniqueifiers during REBUILDs. As such, rebuild of the index
ideally would not reset uniqueifiers and the issue would continue to occur
while inserting new data with a key value for which the uniqueifiers
were exhausted. But current engine behavior is different for one
specific case: if you use the statement ALTER INDEX ALL ON <table>
REBUILD WITH (ONLINE = ON), it will reset the uniqueifiers (across all
versions starting from SQL Server 2005 to SQL Server 2017).
So, if this is the cause of your issue, you can add an additional integer column and build the index over it.
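For example (a sketch, not your actual schema): appending an explicit surrogate column and making the clustered index unique removes the need for uniqueifiers entirely:

-- Hypothetical fix: an explicit surrogate makes the clustered key unique,
-- so no uniqueifier is ever generated.
ALTER TABLE table_b ADD RowId bigint IDENTITY(1,1) NOT NULL;

-- If table_b already has a clustered index, drop it first, then
-- recreate it as unique with the new column appended:
CREATE UNIQUE CLUSTERED INDEX cix_table_b
    ON table_b (column1, column2, column3, column4, RowId);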

SQL Server Auto Incrementing Identity Per Customer(Tenant) With No Gaps

We have a multi-tenant database which holds multiple customers, each customer having a collection of users, like so (simplified example omitting the foreign key from Users to Customers):
CREATE TABLE dbo.Customers
(
    CustomerId INT NOT NULL IDENTITY(1, 1),
    Name NVARCHAR(256) NOT NULL
)

CREATE TABLE dbo.Users
(
    UserId INT NOT NULL IDENTITY(1, 1),
    CustomerId INT NOT NULL
)
As part of this design the users are required to have a membership number. When we designed this we decided to use the UserId as the membership number; however, as with all things, this requirement has grown, and it is no longer an option for two reasons:
After we upgraded to 2012, on each server restart the column jumps by 1000 values. We have used the workaround shown here: http://www.codeproject.com/Tips/668042/SQL-Server-2012-Auto-Identity-Column-Value-Jump-Is (-t272) to stop that happening, but it made us realise that IDENTITY(1, 1) isn't good enough.
What we really want now is to ensure that the number is incremented per customer, but it has to be permanent and cannot change once assigned.
Obviously a sequence will not work, as again it needs to be per customer. We also need to enforce a unique constraint on this per customer/user, and to ensure that the value never changes once assigned and does not change if a user is deleted (although this shouldn't happen, as we don't delete users but mark them as archived; however, I want to guarantee this won't affect it).
Below is a sample of what I wrote, which can generate the number. What is the best way to use this, or something similar, to ensure a unique, sequential value per customer/user without a chance of any issues, given that users could be created at the same time from different sessions?
ROW_NUMBER() OVER (ORDER BY i.UserId) + ISNULL((SELECT MAX(users.MembershipNumber)
FROM [User].Users users
WHERE users.Customers_CustomerId = i.Customers_CustomerId), 0)
EDIT: Clarification
I apologise; I just re-read my question and I did not make this clear enough. We are not looking to replace UserId: we are happy with the gaps and with the unique per-database identifier that is used on all foreign keys. What we are looking to add is a MembershipNumber that will be displayed to the user, which is why it needs to be sequential per customer with no gaps. This membership number will be used on cards that are given to the user, so it needs to be unique.
Since you already found the problem with Identity columns and how to fix it, I wouldn't say it's not good enough.
However, it doesn't seem to suit your needs, since you want the user number to increment per customer.
I would suggest keeping the User column as an identity column and the primary key of the table, and adding another column to hold the user number per customer. This column will also be an integer, with a default value of the result of a UDF that calculates the next number per customer (see the example in this post).
You can protect that value from ever changing by using an INSTEAD OF UPDATE trigger on the users table.
This way you keep a single-column primary key, and you have a unique, sequential user number per customer.
Update
Apparently, it is impossible to send column values to a default constraint.
But you can still use an instead of insert trigger to accomplish your goal.
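A minimal sketch of such a trigger (assuming a MembershipNumber int column exists on dbo.Users; the locking hints are there so two concurrent inserts for the same customer can't compute the same number):

CREATE TRIGGER dbo.Users_AssignMembershipNumber
ON dbo.Users
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Re-issue the insert, computing the next per-customer number.
    INSERT INTO dbo.Users (CustomerId, MembershipNumber)
    SELECT i.CustomerId,
           ISNULL((SELECT MAX(u.MembershipNumber)
                   FROM dbo.Users u WITH (UPDLOCK, HOLDLOCK)
                   WHERE u.CustomerId = i.CustomerId), 0)
           + ROW_NUMBER() OVER (PARTITION BY i.CustomerId ORDER BY (SELECT NULL))
    FROM inserted i;
END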
It's because of the default caching SQL Server implements for sequence objects. See this former thread:
Identity increment is jumping in SQL Server database
If the gaps are an issue, SQL Server 2012 introduced the Sequence object. You can declare these with NOCACHE, so restarting the server doesn't create gaps.
I want to share my thoughts on it. Please see below.
Create a separate table which holds CustomerID and Count columns, like below.

CREATE TABLE dbo.CustomerSequence
(
    CustomerID int,
    Count int
);
Write some kind of stored proc like below.
CREATE PROC dbo.usp_GetNextValueByCustomerID
    @CustomerID int,
    @Count int OUTPUT
AS
BEGIN
    -- Atomically increment the counter and capture the new value.
    UPDATE dbo.CustomerSequence
    SET @Count = Count += 1
    WHERE CustomerID = @CustomerID;
END
Just call the above stored proc, passing the CustomerID, and get the next sequence value from it.
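For example (customer 42 is hypothetical and must already have a row in dbo.CustomerSequence):

DECLARE @Next int;
EXEC dbo.usp_GetNextValueByCustomerID @CustomerID = 42, @Count = @Next OUTPUT;
SELECT @Next AS NextMembershipNumber;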
If you have several users adding new records simultaneously, I think the best idea is to create a compound primary key, where the user is a tinyint (if you have fewer than 255 users) and the incremental number is an integer. Then, when adding a new record, you create a string primary key like 'NN.xxxxxx'. Assuming [Number] is your incremental number and [Code] is the user's code (or locally assigned machine number), you assign the new UserId using the DMax function (note this is Access/VBA rather than T-SQL), as follows:
NextNumber = Nz(DMax("Number", "clients", "Code=" & Me!code), 0) + 1
UserId = code & "." & NextNumber
where
NN is the user's code,
"." is used to separate the two fields, and
xxxxxx is the new Number.

SQL Server UNIQUE constraint with duplicate NULLs [duplicate]

Possible Duplicate:
How do I create unique constraint that also allows nulls in sql server
I have a table where I need to force a column to have unique values.
This column must be nullable and by business logic multiple NULL values should be permitted, whereas other duplicate values are not.
The SQL Server UNIQUE constraint is no good in this situation because it treats NULLs as regular values, so it will reject duplicate NULLs.
Currently, value uniqueness is guaranteed by the BLL, so I'm not looking for a dirty hack to make it work.
I just would like to know if there is a clean solution to enforce this constraint in the DB.
And yeah, I know I can write a trigger to do that: is a trigger the only solution? (or the best solution anyway?)
If you're using SQL Server 2008 (this won't work for earlier versions) there is the concept of a filtered index: you can create the index on a filtered subset of the table.

CREATE UNIQUE INDEX indexName ON tableName (columnName)
INCLUDE (includeColumns)
WHERE columnName IS NOT NULL
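A quick demonstration of the behaviour (a sketch using a made-up table):

CREATE TABLE dbo.Demo (Id int IDENTITY PRIMARY KEY, Code varchar(10) NULL);

CREATE UNIQUE INDEX UX_Demo_Code ON dbo.Demo (Code)
WHERE Code IS NOT NULL;

INSERT INTO dbo.Demo (Code) VALUES (NULL), (NULL);  -- OK: NULLs are excluded from the index
INSERT INTO dbo.Demo (Code) VALUES ('A');           -- OK
INSERT INTO dbo.Demo (Code) VALUES ('A');           -- fails with a duplicate key error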
Duplicate of this question?
The calculated column trick is widely known as a "nullbuster"; my notes credit Steve Kass:
CREATE TABLE dupNulls (
    pk int identity(1,1) primary key,
    X int NULL,
    nullbuster as (case when X is null then pk else 0 end),
    CONSTRAINT dupNulls_uqX UNIQUE (X, nullbuster)
)
Works on SQL Server 2000. You may need ARITHABORT on, e.g.
ALTER DATABASE MyDatabase SET ARITHABORT ON
If you're using SQL Server 2008, have a look into Filtered Indexes to achieve what you want.
For older versions of SQL Server, a possible alternative to a trigger involves a computed column:
Create a computed column which uses the value of your "unique" column if it's not NULL, otherwise the value of the row's primary key column (or any column which will be unique).
Apply a UNIQUE constraint to the computed column.
http://www.sqlmag.com/article/articleid/98678/sql_server_blog_98678.html
will work only in Microsoft SQL Server 2008
You can create a view in which you select only the non-null values and create a unique clustered index on it.
Here is the source - Creating Indexed Views
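A sketch of that approach, reusing the hypothetical dbo.Demo table from above (the view must be schema-bound, and its first index must be unique and clustered):

CREATE VIEW dbo.v_UniqueCodes
WITH SCHEMABINDING
AS
SELECT Code
FROM dbo.Demo
WHERE Code IS NOT NULL;
GO

-- Uniqueness of non-NULL Code values is now enforced through the view:
CREATE UNIQUE CLUSTERED INDEX UX_v_UniqueCodes ON dbo.v_UniqueCodes (Code);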
You could use a UNIQUEIDENTIFIER for that column; it can be NULL and is also unique by definition.
Hope that helps.

Can you limit the number of rows in a (database) table?

We have a database (SQL Server 2005) which we would like to get under source control. As part of that we are going to have a version table to store the current version number of the database. Is there a way to limit that table to only holding one row? Or is storing the version number in a table a bad idea?
Ended up using this approach:
CREATE TABLE [dbo].[DatabaseVersion]
(
    [MajorVersionNumber] [int] NOT NULL,
    [MinorVersionNumber] [int] NOT NULL,
    [RevisionNumber] [int] NOT NULL
)
GO

INSERT INTO DataBaseVersion (MajorVersionNumber, MinorVersionNumber, RevisionNumber) VALUES (0, 0, 0)
GO
CREATE TRIGGER DataBaseVersion_Prevent_Delete
ON DataBaseVersion INSTEAD OF DELETE
AS
BEGIN
    RAISERROR ('DatabaseVersion must always have one Row. (source = INSTEAD OF DELETE)', 16, 1)
END
GO

CREATE TRIGGER DataBaseVersion_Prevent_Insert
ON DataBaseVersion INSTEAD OF INSERT
AS
BEGIN
    RAISERROR ('DatabaseVersion must always have one Row. (source = INSTEAD OF INSERT)', 16, 1)
END
GO
GO
Use a trigger.
Generalize the table to hold "settings" and make it a key/value pair
CREATE TABLE Settings ([Key] nvarchar(450), [Value] nvarchar(max)) -- nvarchar(max) can't be an index key, so [Key] is capped at 450 characters
Then make a unique index on Key.
CREATE UNIQUE INDEX SettingsIDX ON Settings ([Key])
That will create a table with unique key value pairs, one of which can be Version.
INSERT INTO Settings ([Key], [Value]) VALUES ('Version', '1');
You can use Joe Celko's default+primary+check technique:
create table database_version (
    lock char(1) primary key default 'x' check (lock = 'x'),
    major_version_number int NOT NULL,
    minor_version_number int NOT NULL,
    revision_number int NOT NULL
);
Fiddle with it
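A quick check of the one-row guarantee (the values are arbitrary):

insert into database_version values ('x', 1, 0, 0);  -- succeeds
insert into database_version values ('y', 1, 0, 1);  -- fails the CHECK constraint
insert into database_version values ('x', 1, 0, 1);  -- fails the PRIMARY KEY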
Not at all. You can simply add another, ascending column to that table (date, id, whatever), then order the query by that column descending and limit the result to 1 row:
SELECT v.version FROM version v ORDER BY v.date DESC LIMIT 1;
This way you even get a history of when each version was reached.
Edit:
The above SQL query wouldn't work on SQL Server, since it doesn't support the LIMIT clause. One has to circumvent that deficiency, possibly as described in this "All Things SQL Server" blog entry.
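The SQL Server equivalent uses TOP (same hypothetical version table with a date column):

SELECT TOP (1) v.version
FROM version v
ORDER BY v.date DESC;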
Based on your comments to other responses, it seems that:
You don't want users to just modify the value.
You only ever want one value returned.
The value is static, and scripted.
So, might I suggest that you script a function that returns the static value? Since you'll have to script an update to the version number anyway, you'll simply drop and recreate the function in your script when you update the database.
This has the advantage of being usable from a view or a procedure, and since a function's return value is read-only, it can't be modified (without modifying the function).
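A minimal sketch of such a function (the name and version string are made up):

-- Redeployed by each upgrade script with a new literal.
CREATE FUNCTION dbo.DatabaseVersion()
RETURNS varchar(20)
AS
BEGIN
    RETURN '1.0.0';
END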
EDIT: You also wouldn't have to worry about convoluted solutions for keeping a table constrained to one row.
Just a suggestion.
Keeping a version number for the database makes total sense. However, I prefer to have a Version table that can contain multiple rows, with fields for the version number, the time the update occurred, and the user that performed the upgrade.
That way you know which upgrade scripts have been run and can easily see if they have been run out of sequence.
When you want to read the current version number you can just read the most recent record.
If you only store one record you have no way of knowing whether a script has been missed. If you want to be really clever, you can put checks in your upgrade scripts so they won't run unless the previous version of the database is correct.
You can do this by creating the one allowable original row as part of the database initialization script, and (also in that script) removing Insert permissions on that table for all logins (only Updates will be allowed).
You might also want to disallow deletes as well...
