Can you limit the number of rows in a (database) table? - sql-server

We have a database (SQL Server 2005) which we would like to get under source control. As part of that we are going to have a version table to store the current version number of the database. Is there a way to limit that table to only holding one row? Or is storing the version number in a table a bad idea?
Ended up using this approach:
CREATE TABLE [dbo].[DatabaseVersion]
(
[MajorVersionNumber] [int] NOT NULL,
[MinorVersionNumber] [int] NOT NULL,
[RevisionNumber] [int] NOT NULL
)
GO
Insert DataBaseVersion (MajorVersionNumber, MinorVersionNumber, RevisionNumber) values (0, 0, 0)
GO
CREATE TRIGGER DataBaseVersion_Prevent_Delete
ON DataBaseVersion INSTEAD OF DELETE
AS
BEGIN
RAISERROR ('DatabaseVersion must always have one Row. (source = INSTEAD OF DELETE)', 16, 1)
END
GO
CREATE TRIGGER DataBaseVersion_Prevent_Insert
ON DataBaseVersion INSTEAD OF INSERT
AS
BEGIN
RAISERROR ('DatabaseVersion must always have one Row. (source = INSTEAD OF INSERT)', 16, 1)
END
GO

Use a trigger.

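Generalize the table to hold "settings" and make it a key/value pair. (Key is a reserved word in SQL Server, so it is bracketed here, and the key column gets a bounded length because an nvarchar(max) column cannot be used as an index key.)
CREATE TABLE Settings ([Key] nvarchar(128) NOT NULL, [Value] nvarchar(max))
Then make a unique index on Key.
CREATE UNIQUE INDEX SettingsIDX ON Settings ([Key])
That will create a table with unique key/value pairs, one of which can be Version.
INSERT INTO Settings ([Key], [Value]) VALUES ('Version','1');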

You can use Joe Celko's default+primary+check technique:
create table database_version (
lock char(1) primary key default 'x' check (lock='x'),
major_version_number int NOT NULL,
minor_version_number int NOT NULL,
revision_number int NOT NULL
);
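To see how the constraint combination behaves, here is a small sketch against a throwaway copy of the table. The name database_version_demo is hypothetical, and the comments describe the behaviour I would expect rather than captured output:

```sql
-- Hypothetical demo table using the same default + primary key + check idea.
CREATE TABLE dbo.database_version_demo (
    [lock] char(1) NOT NULL PRIMARY KEY DEFAULT 'x' CHECK ([lock] = 'x'),
    major_version_number int NOT NULL,
    minor_version_number int NOT NULL,
    revision_number int NOT NULL
);
GO
-- First row: [lock] takes its default 'x', so this should succeed.
INSERT INTO dbo.database_version_demo
    (major_version_number, minor_version_number, revision_number)
VALUES (1, 0, 0);
GO
-- Second row: [lock] would be 'x' again, so the primary key should reject it.
INSERT INTO dbo.database_version_demo
    (major_version_number, minor_version_number, revision_number)
VALUES (2, 0, 0);
GO
-- Any other [lock] value should be rejected by the CHECK constraint.
INSERT INTO dbo.database_version_demo
    ([lock], major_version_number, minor_version_number, revision_number)
VALUES ('y', 2, 0, 0);
GO
-- Updating the single row stays possible.
UPDATE dbo.database_version_demo SET major_version_number = 2;
GO
DROP TABLE dbo.database_version_demo;
```

Note that this caps the table at one row but does not stop that row from being deleted; a delete trigger or a denied DELETE permission would still be needed for that.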

Not at all. You can simply add another, ascending column to that table (date, id, whatever), then order the query by that column in descending order and limit the result to one row:
SELECT v.version FROM version v ORDER by v.date DESC LIMIT 1;
This way you even get a history of when each version was reached.
Edit:
The above SQL query won't work on SQL Server, since it doesn't support the LIMIT clause; SQL Server uses TOP instead (for example, SELECT TOP 1 ... ORDER BY ... DESC). One way around this is described in this "All Things SQL Server" blog entry.

Based on your comments to other responses, it seems that:
You don't want users to just modify the value.
You only ever want one value returned.
The value is static, and scripted.
So, might I suggest that you script a function that returns the static value? Since you'll have to script an update to the version number anyway, you'll simply drop and recreate the function in your script when you update the database.
This has the advantage of being usable from a view or a procedure, and since a function's return value is read-only, it can't be modified (without modifying the function).
EDIT: You also wouldn't have to worry about convoluted solutions for keeping a table constrained to one row.
Just a suggestion.
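A minimal sketch of that suggestion, assuming a hypothetical function name dbo.GetDatabaseVersion and a placeholder version string. Each upgrade script would drop and recreate the function with its own value:

```sql
-- Hypothetical name and version value; both are placeholders.
IF OBJECT_ID('dbo.GetDatabaseVersion', 'FN') IS NOT NULL
    DROP FUNCTION dbo.GetDatabaseVersion;
GO
CREATE FUNCTION dbo.GetDatabaseVersion()
RETURNS varchar(20)
AS
BEGIN
    -- The version lives in the function body, so it can only change
    -- when a script recreates the function.
    RETURN '1.4.2';
END
GO
-- Usable from queries, views and procedures alike.
SELECT dbo.GetDatabaseVersion() AS DatabaseVersion;
```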

Keeping a version number for the database makes total sense. However, I prefer to have a Version table that can contain multiple rows, with fields for the version number, the time the update occurred and the user that performed the upgrade.
That way you know which upgrade scripts have been run and can easily see if they have been run out of sequence.
When you want to read the current version number you can just read the most recent record.
If you only store one record you have no way of knowing if a script has been missed out. If you want to be really clever you can put checks in your upgrade scripts so they won't run unless the previous version of the database is correct.
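As a rough sketch of this multi-row approach: the table name dbo.SchemaVersion, its columns, and the version numbers 4 and 5 are all assumptions made for illustration.

```sql
-- Hypothetical history table: one row per applied upgrade.
CREATE TABLE dbo.SchemaVersion (
    VersionNumber int NOT NULL PRIMARY KEY,
    AppliedAt datetime NOT NULL DEFAULT GETDATE(),
    AppliedBy nvarchar(128) NOT NULL DEFAULT SUSER_SNAME()
);
GO
-- Reading the current version means reading the most recent record.
SELECT TOP 1 VersionNumber, AppliedAt, AppliedBy
FROM dbo.SchemaVersion
ORDER BY VersionNumber DESC;
GO
-- Guard at the top of a hypothetical "upgrade to version 5" script:
-- refuse to run unless the database is currently at version 4.
IF ISNULL((SELECT MAX(VersionNumber) FROM dbo.SchemaVersion), 0) <> 4
BEGIN
    RAISERROR('Expected database version 4 before applying upgrade 5.', 16, 1);
    RETURN;
END
-- ... the upgrade statements would go here ...
INSERT INTO dbo.SchemaVersion (VersionNumber) VALUES (5);
```

RETURN only leaves the current batch, so an upgrade script split by GO would need the guard repeated (or another mechanism) for later batches.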

Create the one allowable row as part of the database initialization script and, in the same script, remove INSERT permission on that table for all logins, so that only updates are allowed.
You might also want to disallow deletes.
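One way this could look, assuming the dbo.DatabaseVersion table from the question and that denying the permissions to the public role is acceptable in your setup:

```sql
-- Run once, after the initialization script has inserted the single row.
-- Denying to public covers every database user; UPDATE is left untouched.
DENY INSERT, DELETE ON dbo.DatabaseVersion TO public;
```

Keep in mind that DENY does not affect members of the sysadmin role or the table's owner, so this guards against ordinary users rather than administrators.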

Related

Creating a history table without using triggers

I have a table, Table A, with 3000 records and 25 columns. I want to have a history table, called Table A history, holding all the changes (updates and deletes) so that I can look them up on any day. I usually use cursors. I have now thought about using triggers, which I was not asked to do. Do you have any other suggestions? Many thanks!
If you're using T-SQL / SQL Server and you can't use triggers, which are the only sure way to capture every change, you could use a stored procedure scheduled in a job to run every x amount of time. The stored procedure uses a MERGE statement against the two tables to pick up new records or changes. I would not suggest this if you need every single change without question.
CREATE TABLE dbo.TableA (id INT, Column1 nvarchar(30))
CREATE TABLE dbo.TableA_History (id INT, Column1 nvarchar(30), TimeStamp DateTime)
(this code isn't production, just the general idea)
Put the following code inside a stored procedure and use a Sql Server Job with a schedule on it.
MERGE INTO dbo.TableA_History
USING dbo.TableA
ON TableA_History.id = TableA.id AND TableA_History.Column1 = TableA.Column1
WHEN NOT MATCHED BY TARGET THEN
INSERT (id,Column1,TimeStamp) VALUES (TableA.id,TableA.Column1,GETDATE());
So basically if the record either doesn't exist or doesn't match meaning a column changed, insert the record into the history table.
It is possible to create history without triggers in some cases, even if you are not using SQL Server 2016 and system-versioned tables are not available.
In some cases, when you can identify for sure which routines are modifying your table, you can create history using the OUTPUT INTO clause.
For example,
INSERT INTO [dbo].[MainTable]
OUTPUT inserted.[]
,...
,'I'
,GETUTCDATE()
,@CurrentUserID
INTO [dbo].[HistoryTable]
SELECT *
FROM ... ;
In routines, when you are using MERGE I like that we can use $action:
Is available only for the MERGE statement. Specifies a column of type
nvarchar(10) in the OUTPUT clause in a MERGE statement that returns
one of three values for each row: 'INSERT', 'UPDATE', or 'DELETE',
according to the action that was performed on that row.
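To make the $action idea concrete, here is a sketch of a MERGE that writes its own history rows. All table and column names (dbo.MainTable, dbo.StagingTable, dbo.HistoryTable and their columns) and the user id value are hypothetical:

```sql
-- Assumed shapes:
--   dbo.MainTable(id, Column1), dbo.StagingTable(id, Column1)
--   dbo.HistoryTable(id, Column1, ActionType, ChangedAtUtc, ChangedBy)
DECLARE @CurrentUserID int;
SET @CurrentUserID = 42;  -- assumed to be supplied by the calling routine

MERGE INTO dbo.MainTable AS target
USING dbo.StagingTable AS source
    ON target.id = source.id
WHEN MATCHED AND target.Column1 <> source.Column1 THEN
    UPDATE SET Column1 = source.Column1
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, Column1) VALUES (source.id, source.Column1)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
-- For deletes only the "deleted" pseudo-table is populated, hence COALESCE.
OUTPUT COALESCE(inserted.id, deleted.id),
       COALESCE(inserted.Column1, deleted.Column1),
       $action,
       GETUTCDATE(),
       @CurrentUserID
INTO dbo.HistoryTable (id, Column1, ActionType, ChangedAtUtc, ChangedBy);
```

The <> comparison treats NULL as "no change", so nullable columns would need an explicit NULL-aware comparison.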
It's very handy that we can add the user who is modifying the table. With triggers, you need to use session context or a session variable to pass the user. With a system-versioned table, you need to add an additional column to the main table in order to log the user, as it only logs the current table columns (at least for now).
So, basically it depends on your data and application. If you have many sources of CRUD over the table, the trigger is the most secure way. If your table is very big and heavily used, using MERGE is not good, as it may cause blocking and harm performance.
In our databases we are using all of the methods depending on the situation:
triggers for legacy
system-versioning for new development
direct OUTPUT in the history, when sure that data is modified only by given set of routines

Updating identity column in SQL Server and setting the seed starting value

I have a table filled with data, and one of the columns, TrackingNumber, is an integer value. I have gotten a request to change it to auto-increment and to have it start the identity seed at 1000000. I know that I cannot alter a column to be an identity column in an existing table with data, so I have two options: either create an entirely new table and then move data from the old table into that new table, or add a new identity column and update it with data from the old column.
The problem is that I need to retain all the existing values in column TrackingNumber. I have tried the following:
1) ALTER TABLE [dbo].[Table1]
ADD [TrackingNumber2] [bigint] IDENTITY (1000000, 1) NOT NULL
2) UPDATE [dbo].[Table1]
SET [TrackingNumber2]=[TrackingNumber]
3) ALTER TABLE [dbo].[Table1]
DROP COLUMN [TrackingNumber]
GO
4) EXEC sp_rename 'Table1.TrackingNumber2', 'TrackingNumber','COLUMN'
I got an error on step 2 - Updating new column with the value from the old column: "Cannot update identity column 'TrackingNumber2'"
Can anyone recommend a workaround?
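You just need to set identity_insert on for this table so you can supply the values explicitly. Make sure you turn it back off when you are done. :) Note that identity_insert only permits explicit values in INSERT statements; an UPDATE of an identity column is still rejected, so the existing rows have to be inserted into the identity column rather than updated.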
https://msdn.microsoft.com/en-us/library/ms188059.aspx
Are you sure you need to use an identity column? There are alternatives. For example, since SQL Server 2012 (and azure, too), there are these things called sequences. You can define a sequence to start at any number you like:
create sequence dbo.TrackingSequence
as int
start with 1000000
increment by 1
no maxvalue
no cycle
no cache
Then, you can alter the table such that the default value for the column in question defaults from the sequence:
alter table dbo.MyTable
add constraint [MyTable.TrackingNumber.Default.TrackingSequence]
default( next value for dbo.TrackingSequence ) for TrackingNumber
(If the column already has a default value, you need to remove that first - in a separate statement.)
A sequence works a lot like an identity, but you don't have to disrupt existing values or the column definition per se.
The only trick is to remember to not specify the value for TrackingNumber, and let the DB do its thing.
Sequences are cool in that you can have one sequence that is used by multiple tables, giving you somewhat shorter db-wide unique IDs than alternatives in the past. For such an application, you'd probably be better off with a bigint column - or know that the tables in question aren't going to be terribly long.
I ended up creating a new table and moving data in there
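The new-table route can be sketched roughly as follows. The name dbo.Table1_New and the SomeData column are hypothetical stand-ins for the real schema, and the final swap is left as comments because it depends on what references the table:

```sql
-- New table with the identity column seeded at 1000000.
CREATE TABLE dbo.Table1_New (
    TrackingNumber bigint IDENTITY(1000000, 1) NOT NULL,
    SomeData nvarchar(100) NULL  -- placeholder for the remaining columns
);
GO
-- IDENTITY_INSERT allows explicit identity values in INSERT statements,
-- which is how the existing TrackingNumber values are preserved.
SET IDENTITY_INSERT dbo.Table1_New ON;

INSERT INTO dbo.Table1_New (TrackingNumber, SomeData)
SELECT TrackingNumber, SomeData
FROM dbo.Table1;

SET IDENTITY_INSERT dbo.Table1_New OFF;
GO
-- Report the current identity value without changing it.
DBCC CHECKIDENT ('dbo.Table1_New', NORESEED);
GO
-- Once the copy is verified, swap the tables:
-- DROP TABLE dbo.Table1;
-- EXEC sp_rename 'dbo.Table1_New', 'Table1';
```

It is worth checking the value reported by DBCC CHECKIDENT before going live, in case any copied TrackingNumber is already at or above the seed.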

SQL Server Auto Incrementing Identity Per Customer(Tenant) With No Gaps

We have a multi-tenant database which holds multiple customers with each customer having a collection of users like so (Simplified example omitting foreign key specification from users to customers):
CREATE TABLE dbo.Customers
(
CustomerId INT NOT NULL IDENTITY(1, 1),
Name NVARCHAR(256) NOT NULL
)
CREATE TABLE dbo.Users
(
UserId INT NOT NULL IDENTITY(1, 1),
CustomerId INT NOT NULL
)
As part of this design the users are required to have a membership number. When we designed this we decided to use the UserId as the membership number; however, the requirement has grown and this is no longer an option, for two reasons:
After we upgraded to 2012, the column jumps by 1000 values on each server restart. We have used the workaround shown here: http://www.codeproject.com/Tips/668042/SQL-Server-2012-Auto-Identity-Column-Value-Jump-Is (-t272) to stop that happening, but it has made us realise that IDENTITY(1, 1) isn't good enough.
What we really want now is to ensure that the number is incremented per customer, and that it is permanent and cannot change once assigned.
Obviously a sequence will not work, as again it needs to be per customer. We also need to enforce a unique constraint on this per customer/user, and ensure that the value never changes once assigned and does not change if a user is deleted (this shouldn't happen, as we don't delete users but mark them as archived; however, I want to guarantee this won't affect it).
Below is a sample of what I wrote which can generate the number, but what is the best way to use this or something similar which ensures a unique, sequential value per customer/user without a chance of any issues as users could be created at the same time from different sessions.
ROW_NUMBER() OVER (ORDER BY i.UserId) + ISNULL((SELECT MAX(users.MembershipNumber)
FROM [User].Users users
WHERE users.Customers_CustomerId = i.Customers_CustomerId), 0)
EDIT: Clarification
I apologise, I just re-read my question and I did not make this clear enough. We are not looking to replace UserId; we are happy with the gaps and with the unique-per-database identifier that is used on all foreign keys. What we are looking to add is a MembershipNumber that will be displayed to the user, which is why it needs to be sequential per customer with no gaps: this membership number will be used on cards that are given to the user, so it needs to be unique.
Since you already found the problem with Identity columns and how to fix it, I wouldn't say it's not good enough.
However, it doesn't seem to suit your needs since you want the user number to increment per customer.
I would suggest keeping the User column as an Identity column and the primary key of the table, and adding another column to hold the user number per customer. This column will also be an integer, with a default value taken from the result of a UDF that calculates the next number per customer (see example in this post).
You can protect that value from ever changing by using an instead of update trigger on the users table.
This way you keep a single-column primary key, and you have a unique, sequential user number per customer.
Update
Apparently, it is impossible to send column values to a default constraint.
But you can still use an instead of insert trigger to accomplish your goal.
It's because of the default caching SQL Server implements for the sequence objects. See this former thread:
Identity increment is jumping in SQL Server database
If the gaps are an issue, SQL Server 2012 introduced the Sequence object. You can declare these with NO CACHE, so restarting the server doesn't create gaps.
I want to share my thoughts on it. Please see below.
Create a separate table which holds CustomerID and Count columns, like below.
CREATE TABLE dbo.CustomerSequence
(
CustomerID int NOT NULL PRIMARY KEY,
[Count] int NOT NULL
);
Write a stored proc like below.
CREATE PROC dbo.usp_GetNextValueByCustomerID
@CustomerID int,
@Count int OUTPUT
AS
BEGIN
-- Increment the counter and capture the new value in one atomic statement
UPDATE dbo.CustomerSequence
SET @Count = [Count] = [Count] + 1
WHERE CustomerID = @CustomerID;
END
Just call the above stored proc, passing CustomerID, and get the next sequence value from it.
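A self-contained sketch of how a per-customer counter might be combined with the user insert so that concurrent sessions neither collide nor leave gaps. The tables dbo.MembershipCounter and dbo.Members are hypothetical and exist only for this illustration:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE dbo.MembershipCounter (
    CustomerId int NOT NULL PRIMARY KEY,
    LastNumber int NOT NULL
);
CREATE TABLE dbo.Members (
    UserId int NOT NULL IDENTITY(1, 1) PRIMARY KEY,
    CustomerId int NOT NULL,
    MembershipNumber int NOT NULL,
    CONSTRAINT UQ_Members_Customer_Number UNIQUE (CustomerId, MembershipNumber)
);
GO
INSERT INTO dbo.MembershipCounter (CustomerId, LastNumber) VALUES (1, 0);
GO
-- Roll the whole transaction back on any runtime error, so a failed
-- insert also undoes the counter increment.
SET XACT_ABORT ON;

DECLARE @CustomerId int, @Next int;
SET @CustomerId = 1;

BEGIN TRANSACTION;
    -- The UPDATE holds a lock on this customer's counter row until COMMIT,
    -- so concurrent sessions for the same customer wait their turn here.
    UPDATE dbo.MembershipCounter
    SET @Next = LastNumber = LastNumber + 1
    WHERE CustomerId = @CustomerId;

    INSERT INTO dbo.Members (CustomerId, MembershipNumber)
    VALUES (@CustomerId, @Next);
COMMIT TRANSACTION;
```

The unique constraint acts as a safety net, and because the increment and the insert share one transaction, a rollback should return the number to the pool instead of burning it.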
If you have several users adding new records simultaneously, I think the best idea is to create a compound primary key, where the user code is a single byte (if you have fewer than 255 users) and the incremental number is an integer. Then, when adding a new record, you create a string primary key like 'NN.XXXX'. Assuming [Number] is your incremental number and [Code] is the user's code (or local machine assigned number), you assign the new UserId using the DMax function, as follows:
NextNumber = Nz(DMax("Number", "clients", "Code=" & Me!code), 0) + 1
UserId = code & "." & NextNumber
where
NN is the user's code
"." is used to separate both fields, and
XXXX is the new Number

SQL server trigger question

I am by no means a sql programmer and I am trying to accomplish something that I am pretty sure has been done a million times before.
I am trying to auto generate a customer number in sql every time a new customer is inserted, but the trigger (or sp?) will only work if at least the first name, last name and another value called case number is entered. If any of these fields are missing, the system generates an error. If the criteria is met, the system generates and assigns a unique id to that customer that begins with letters GL- and then uses 5 digit number so a customer John Doe would be GL-00001 and Jane Doe would be GL-00002.
I am sorry if I am asking too much but I am basically a select insert update guy and nothing more so thanks in advance for any help.
If I were in this situation, I would:
--Alter the table(s) so that first name, last name and case number are required (NOT NULL) columns. Handle your checks for required fields on the application side before submitting the record to the database.
--If it doesn't already exist, add an identity column to the customer table.
--Add a persisted computed column to the customer table that will format the identity column into the desired GL-00000 format.
/* Demo computed column for customer number */
create table #test (
id int identity,
customer_number as 'GL-' + left('00000', 5-len(cast(id as varchar(5)))) + cast(id as varchar(5)) persisted,
name char(20)
)
insert into #test (name) values ('Joe')
insert into #test (name) values ('BobbyS')
select * from #test
drop table #test
This should satisfy your requirements without the need to introduce the overhead of a trigger.
So what do you want to do? Generate a customer number even when these fields aren't populated?
Have you looked at the SQL for the trigger? You can do this in SSMS (SQL Server Management Studio) by going to the table in question in the Object Explorer, expanding the table and then expanding triggers.
If you open up the trigger you'll see what it does to generate the customer number. If you are unsure how this code works, then post the code for the trigger.
If you are making changes to an existing system, I'd advise you to find out the implications of changing the way data is input.
For example, other parts of the application may depend on all of the initial values being populated, so after changing the trigger to allow incomplete data to be added, you may in turn break something else.
You have probably a unique constraint and/or NOT NULL constraints set on the table.
Remove/disable these (for example with the SQL-Server Management Console in Design Mode) and then try again to insert the data. Keep in mind that you will probably not be able to re-enable the constraints after your insert, since the inserted data violates them. Only disable or remove the constraints if you are absolutely sure that they are unnecessary.
Here's example syntax (you need to know the constraint names):
--disable
ALTER TABLE customer NOCHECK CONSTRAINT your_constraint_name
--enable
ALTER TABLE customer CHECK CONSTRAINT your_constraint_name
Caution: If I were you, I'd rather try to insert dummy values for the not null columns like this:
insert into customers select afield , 1 as dummyvalue, 2 as dummyvalue from your datasource
A very easy way to do this would be to create a table of this sort of structure:
CustomerID of type int that is a primary key and set as identity
CustomerIDPrefix of type varchar(3) which stores GL- as a default value.
Then add your other fields and set them to NOT NULL.
If that way is not acceptable and you do need to write a trigger check out these two articles:
http://msdn.microsoft.com/en-us/library/aa258254(SQL.80).aspx
http://www.kodyaz.com/articles/sql-trigger-example-in-sql-server-2008.aspx
Basically it is all about getting the logic right to check if the fields are blank. Experiment with a test database on your local machine. This will help you get it right.

How do you add a NOT NULL Column to a large table in SQL Server?

To add a NOT NULL Column to a table with many records, a DEFAULT constraint needs to be applied. This constraint causes the entire ALTER TABLE command to take a long time to run if the table is very large. This is because:
Assumptions:
The DEFAULT constraint modifies existing records. This means that the db needs to increase the size of each record, which causes it to shift records on full data-pages to other data-pages and that takes time.
The DEFAULT update executes as an atomic transaction. This means that the transaction log will need to be grown so that a roll-back can be executed if necessary.
The transaction log keeps track of the entire record. Therefore, even though only a single field is modified, the space needed by the log will be based on the size of the entire record multiplied by the # of existing records. This means that adding a column to a table with small records will be faster than adding a column to a table with large records even if the total # of records are the same for both tables.
Possible solutions:
Suck it up and wait for the process to complete. Just make sure to set the timeout period to be very long. The problem with this is that it may take hours or days to do depending on the # of records.
Add the column but allow NULL. Afterward, run an UPDATE query to set the DEFAULT value for existing rows. Do not do UPDATE *. Update batches of records at a time or you'll end up with the same problem as solution #1. The problem with this approach is that you end up with a column that allows NULL when you know that this is an unnecessary option. I believe that there are some best practice documents out there that say you should not have columns that allow NULL unless it's necessary.
Create a new table with the same schema. Add the column to that schema. Transfer the data over from the original table. Drop the original table and rename the new table. I'm not certain how this is any better than #1.
Questions:
Are my assumptions correct?
Are these my only solutions? If so, which one is the best? If not, what else could I do?
I ran into this problem at work as well, and my solution is along the lines of #2.
Here are my steps (I am using SQL Server 2005):
1) Add the column to the table with a default value:
ALTER TABLE MyTable ADD MyColumn varchar(40) DEFAULT('')
2) Add a NOT NULL constraint with the NOCHECK option. The NOCHECK does not enforce on existing values:
ALTER TABLE MyTable WITH NOCHECK
ADD CONSTRAINT MyColumn_NOTNULL CHECK (MyColumn IS NOT NULL)
3) Update the values incrementally in table:
GO
UPDATE TOP(3000) MyTable SET MyColumn = '' WHERE MyColumn IS NULL
GO 1000
The update statement updates at most 3000 records at a time, so the data is committed in chunks. I have to use "MyColumn IS NULL" because my table does not have a sequential primary key.
GO 1000 executes the previous batch 1000 times, which updates up to 3 million records; if you need more, just increase this number. Once no NULL values remain, the remaining executions simply update 0 records.
Here's what I would try:
Do a full backup of the database.
Add the new column, allowing nulls - don't set a default.
Set SIMPLE recovery, which truncates the tran log as soon as each batch is committed.
The SQL is: ALTER DATABASE XXX SET RECOVERY SIMPLE
Run the update in batches as you discussed above, committing after each one.
Reset the new column to no longer allow nulls.
Go back to the normal FULL recovery.
The SQL is: ALTER DATABASE XXX SET RECOVERY FULL
Backup the database again.
The use of the SIMPLE recovery model doesn't stop logging, but it significantly reduces its impact. This is because the server discards the recovery information after every commit.
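Put together, the steps above might look like the following sketch. The database name MyDatabase, the table dbo.MyTable, the column MyColumn, the default value 0 and the batch size are all assumptions, and the backups before and after are omitted:

```sql
-- Add the new column, allowing NULLs and without a default.
ALTER TABLE dbo.MyTable ADD MyColumn int NULL;
GO
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;
GO
-- Fill the column in batches; each UPDATE commits on its own,
-- so the log space used by one batch can be reused by the next.
DECLARE @rows int;
SET @rows = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) dbo.MyTable
    SET MyColumn = 0
    WHERE MyColumn IS NULL;

    SET @rows = @@ROWCOUNT;
END
GO
-- No NULLs remain, so the column can now be made NOT NULL.
ALTER TABLE dbo.MyTable ALTER COLUMN MyColumn int NOT NULL;
GO
ALTER DATABASE MyDatabase SET RECOVERY FULL;
GO
```

Switching back to FULL does not restart the log backup chain by itself, which is why the final full backup in the list matters.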
You could:
Start a transaction.
Grab a write lock on your original table so no one writes to it.
Create a shadow table with the new schema.
Transfer all the data from the original table.
execute sp_rename to rename the old table out.
execute sp_rename to rename the new table in.
Finally, you commit the transaction.
The advantage of this approach is that your readers will be able to access the table during the long process and that you can perform any kind of schema change in the background.
Just to update this with the latest information.
In SQL Server 2012 this can now be carried out as an online operation in the following circumstances
Enterprise Edition only
The default must be a runtime constant
For the second requirement examples might be a literal constant or a function such as GETDATE() that evaluates to the same value for all rows. A default of NEWID() would not qualify and would still end up updating all rows there and then.
For defaults that qualify SQL Server evaluates them and stores the result as the default value in the column metadata so this is independent of the default constraint which is created (which can even be dropped if no longer required). This is viewable in sys.system_internals_partition_columns. The value doesn't get written out to the rows until next time they happen to get updated.
More details about this here: online non-null with values column add in sql server 2012
Admittedly, this is an old question. My colleague recently told me that he was able to do it in one single alter table statement on a table with 13.6M rows. It finished within a second in SQL Server 2012. I was able to confirm the same on a table with 8M rows. Did something change in later versions of SQL Server?
Alter table mytable add mycolumn char(1) not null default('N');
I think this depends on the SQL flavor you are using, but what if you took option 2 and, at the very end, altered the table to make the column NOT NULL with the default value?
Would it be fast, since it sees all the values are not null?
If you want the column in the same table, you'll just have to do it. Now, option 3 is potentially the best for this because you can still have the database "live" while this operation is going on. If you use option 1, the table is locked while the operation happens and then you're really stuck.
If you don't really care if the column is in the table, then I suppose a segmented approach is the next best. Though, I really try to avoid that (to the point that I don't do it) because then like Charles Bretana says, you'll have to make sure and find all the places that update/insert that table and modify those. Ugh!
I had a similar problem, and went for your option #2.
It takes 20 minutes this way, as opposed to 32 hours the other way!!! Huge difference, thanks for the tip.
I wrote a full blog entry about it, but here's the important sql:
Alter table MyTable
Add MyNewColumn char(10) null default '?';
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 0 and 1000000
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 1000000 and 2000000
go
update MyTable set MyNewColumn='?' where MyPrimaryKey between 2000000 and 3000000
go
..etc..
Alter table MyTable
Alter column MyNewColumn char(10) not null;
And the blog entry if you're interested:
http://splinter.com.au/adding-a-column-to-a-massive-sql-server-table
I had a similar problem and I went with modified #3 approach. In my case the database was in SIMPLE recovery mode and the table to which column was supposed to be added was not referenced by any FK constraints.
Instead of creating a new table with the same schema and copying contents of original table, I used SELECT…INTO syntax.
According to Microsoft (http://technet.microsoft.com/en-us/library/ms188029(v=sql.105).aspx)
The amount of logging for SELECT...INTO depends on the recovery model
in effect for the database. Under the simple recovery model or
bulk-logged recovery model, bulk operations are minimally logged. With
minimal logging, using the SELECT… INTO statement can be more
efficient than creating a table and then populating the table with an
INSERT statement. For more information, see Operations That Can Be
Minimally Logged.
The sequence of steps :
1. Move data from old table to new while adding new column with default
SELECT table.*, cast ('default' as nvarchar(256)) new_column
INTO table_copy
FROM table
2. Drop old table
DROP TABLE table
3. Rename newly created table
EXEC sp_rename 'table_copy', 'table'
4. Create necessary constraints and indexes on the new table
In my case the table had more than 100 million rows and this approach completed faster than approach #2 and log space growth was minimal.
1) Add the column to the table with a default value:
ALTER TABLE MyTable ADD MyColumn int default 0
2) Update the values incrementally in the table (same effect as accepted answer). Adjust the number of records being updated to your environment, to avoid blocking other users/processes.
declare @rowcount int = 1
while (@rowcount > 0)
begin
UPDATE TOP(10000) MyTable SET MyColumn = 0 WHERE MyColumn IS NULL
set @rowcount = @@ROWCOUNT
end
3) Alter the column definition to require not null. Run the following at a moment when the table is not in use (or schedule a few minutes of downtime). I have successfully used this for tables with millions of records.
ALTER TABLE MyTable ALTER COLUMN MyColumn int NOT NULL
I would use a CURSOR instead of a single UPDATE. The cursor updates the matching records one by one; it takes time but does not lock the table.
If you want to reduce lock contention further, use WAITFOR DELAY between updates.
Also, I am not sure that a DEFAULT constraint changes existing rows.
Probably it is the NOT NULL constraint used together with DEFAULT that causes the case described by the author.
If it does, add NOT NULL at the end.
So the pseudocode will look like:
-- without the NOT NULL constraint; we will add it at the end
ALTER TABLE [table] ADD new_column INT DEFAULT 0
GO
DECLARE fillNullColumn CURSOR LOCAL FAST_FORWARD FOR
SELECT
[key]
FROM
[table] WITH (NOLOCK)
WHERE
new_column IS NULL
OPEN fillNullColumn
DECLARE
@key INT
FETCH NEXT FROM fillNullColumn INTO @key
WHILE @@FETCH_STATUS = 0 BEGIN
UPDATE
[table] WITH (ROWLOCK)
SET
new_column = 0 -- default value
WHERE
[key] = @key
WAITFOR DELAY '00:00:05' -- wait 5 seconds; keep in mind this limits the update to 12 rows per minute
FETCH NEXT FROM fillNullColumn INTO @key
END
CLOSE fillNullColumn
DEALLOCATE fillNullColumn
ALTER TABLE [table] ALTER COLUMN new_column INT NOT NULL
I am sure that there are some syntax errors, but I hope that this
help to solve your problem.
Good luck!
Vertically segment the table. This means you will have two tables, with the same primary key, and exactly the same number of records... One will be the one you already have, the other will have just the key, and the new Non-Null column (with default value) .
Modify all Insert, Update, and delete code so they keep the two tables in synch... If you want you can create a view that "joins" the two tables together to create a single logical combination of the two that appears like a single table for client Select statements...
