How can I set a unique key constraint on the following table to ensure that the time span between Date/BeginTime and Date/EndTime does not overlap with that of another record? If I need to add a computed column, what data type and calculation should it use?
Column Name    Data Type
Date           date
BeginTime      time(7)
EndTime        time(7)
Thanks.
I don't believe that you can do that using a UNIQUE constraint in SQL Server. Postgres has this capability, but to implement it in SQL Server you must use a trigger. Since your question was "how can I do this using a unique key constraint", the correct answer is "you can't". If you had asked "how can I enforce this non-overlapping constraint", there is an answer.
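For illustration, a trigger-based check might look roughly like the sketch below. The table name dbo.Schedule, the ScheduleID column and the trigger name are made up for the example; only Date, BeginTime and EndTime come from the question. It treats spans that merely touch (one ends exactly when the next begins) as non-overlapping, and THROW needs SQL Server 2012 or later.

```sql
-- Sketch only: dbo.Schedule, ScheduleID and the constraint/trigger names are hypothetical.
CREATE TABLE dbo.Schedule (
    ScheduleID int IDENTITY(1,1) NOT NULL CONSTRAINT PK_Schedule PRIMARY KEY,
    [Date]     date    NOT NULL,
    BeginTime  time(7) NOT NULL,
    EndTime    time(7) NOT NULL,
    CONSTRAINT CK_Schedule_Begin_Before_End CHECK (BeginTime < EndTime)
);
GO

CREATE TRIGGER dbo.TR_Schedule_NoOverlap
ON dbo.Schedule
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Two spans on the same date overlap when each one begins before the other ends.
    IF EXISTS (
        SELECT 1
        FROM inserted AS i
        JOIN dbo.Schedule AS s
          ON  s.[Date]     = i.[Date]
          AND s.ScheduleID <> i.ScheduleID   -- do not compare a row with itself
          AND s.BeginTime  < i.EndTime
          AND i.BeginTime  < s.EndTime
    )
    BEGIN
        ROLLBACK TRANSACTION;
        THROW 50000, 'Time span overlaps an existing row.', 1;
    END
END;
GO
```

Under heavy concurrent inserts a check like this may need extra locking or a stricter isolation level to be airtight; that part is not covered here.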
Alexander Kuznetsov shows one possible way in "Storing intervals of time with no overlaps".
See also the article by Joe Celko: "Contiguous Time Periods".
Here is the table and the first interval:
CREATE TABLE dbo.IntegerSettings(
    SettingID INT NOT NULL,
    IntValue INT NOT NULL,
    StartedAt DATETIME NOT NULL,
    FinishedAt DATETIME NOT NULL,
    PreviousFinishedAt DATETIME NULL,
    CONSTRAINT PK_IntegerSettings_SettingID_FinishedAt
        PRIMARY KEY(SettingID, FinishedAt),
    CONSTRAINT UNQ_IntegerSettings_SettingID_PreviousFinishedAt
        UNIQUE(SettingID, PreviousFinishedAt),
    CONSTRAINT FK_IntegerSettings_SettingID_PreviousFinishedAt
        FOREIGN KEY(SettingID, PreviousFinishedAt)
        REFERENCES dbo.IntegerSettings(SettingID, FinishedAt),
    CONSTRAINT CHK_IntegerSettings_PreviousFinishedAt_NotAfter_StartedAt
        CHECK(PreviousFinishedAt <= StartedAt),
    CONSTRAINT CHK_IntegerSettings_StartedAt_Before_FinishedAt
        CHECK(StartedAt < FinishedAt)
);
INSERT INTO dbo.IntegerSettings
    (SettingID, IntValue, StartedAt, FinishedAt, PreviousFinishedAt)
VALUES(1, 1, '20070101', '20070103', NULL);
The constraints enforce these rules:
- There can be only one first interval for a setting
- The next window must begin at or after the end of the previous one
- Two different windows cannot refer to one and the same window as their previous one
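To make the rules concrete, here are a few follow-up inserts against the table above. This is a sketch that assumes only the first interval shown earlier has been inserted; the dates and IntValue numbers are invented for the example.

```sql
-- Accepted: chained to the first interval and starting exactly where it finished.
INSERT INTO dbo.IntegerSettings
    (SettingID, IntValue, StartedAt, FinishedAt, PreviousFinishedAt)
VALUES (1, 2, '20070103', '20070110', '20070103');

-- Rejected by the CHECK on PreviousFinishedAt <= StartedAt:
-- it starts on the 5th, before the previous interval finished on the 10th.
INSERT INTO dbo.IntegerSettings
    (SettingID, IntValue, StartedAt, FinishedAt, PreviousFinishedAt)
VALUES (1, 3, '20070105', '20070120', '20070110');

-- Rejected by the UNIQUE constraint: a second "first" interval
-- (SQL Server treats the two NULLs as duplicates here).
INSERT INTO dbo.IntegerSettings
    (SettingID, IntValue, StartedAt, FinishedAt, PreviousFinishedAt)
VALUES (1, 4, '20061201', '20061215', NULL);
```

The last case also shows a limitation of the design: intervals can only be appended to the end of the chain, not placed before the first one.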
-- This is a unique key that allows NULL in the EndTime field.
-- Note: it rejects only exact duplicates of (Date, BeginTime, EndTime); it does not detect overlapping spans.
-- This unique index could optionally be clustered instead of the traditional primary key being clustered.
CREATE UNIQUE NONCLUSTERED INDEX
[UNQ_IDX_Date_BeginTm_EndTm_UniqueIndex_With_Null_EndTime] ON dbo.MyTable
(
[Date] ASC,
[BeginTime] ASC,
[EndTime] ASC
)
GO
-- This is a traditional PK constraint that is clustered, but EndTime must be NOT NULL.
-- It is possible that this table would not have a traditional primary key.
ALTER TABLE dbo.MyTable ADD CONSTRAINT
PK_Date_BeginTm_EndTm_EndTimeIsNotNull PRIMARY KEY CLUSTERED
(
Date,
BeginTime,
EndTime
)
GO
-- HINT: control the seconds and milliseconds of BeginTime and EndTime at
-- all insert and read points.
-- You want 13:01:42.000 and 13:01:42.333 to evaluate and compare exactly
-- the way you expect from a key perspective.
Related
I am hoping for help with the following problem. Assume we have a table:
CREATE TABLE [dbo].[dummy](
[id] [char](36) NOT NULL,
[name] [varchar](50) NOT NULL
) ON [PRIMARY]
If I create a primary key like this (version 1)
ALTER TABLE dummy ADD CONSTRAINT PK_dummy PRIMARY KEY (ID);
I get a unique name. In this case PK_dummy.
But if I create a primary key like this (version 2)
ALTER TABLE dummy ADD PRIMARY KEY Clustered (ID);
The name changes with every recreation of this primary key.
The format is always PK__dummy__"a dynamic number"
What is the meaning of this number?
And how can I identify primary keys created with version 2 in a huge database?
Thanks for hints.
What is the meaning of this number?
This depends on product version - it is either based on a unique id or generated randomly.
how can I identify primary keys created with version 2 in a huge database?
SELECT *
FROM sys.key_constraints
WHERE is_system_named = 1
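If the raw output is hard to scan in a large database, a slightly extended variant can list the owning schema and table next to each generated name. This is just a sketch built on the same catalog view; the column aliases are my own.

```sql
-- List system-named primary keys together with the table they belong to.
SELECT SCHEMA_NAME(kc.schema_id)         AS schema_name,
       OBJECT_NAME(kc.parent_object_id)  AS table_name,
       kc.name                           AS constraint_name
FROM sys.key_constraints AS kc
WHERE kc.is_system_named = 1
  AND kc.type = 'PK'          -- drop this line to include UNIQUE constraints too
ORDER BY schema_name, table_name;
```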
If you don't define the name of a constraint, index, key, etc., SQL Server will generate one for you. To keep the generated name unique, it appends "random" characters to the end.
If having a consistent name is important then define the name in your statement, as you did in the first statement.
I want to create the following 2 tables, in which the start date has to be before the end date:
CREATE TABLE ParentEntity
(
ID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
StartDate date NOT NULL,
EndDate date NOT NULL,
Description varchar(255)
);
ALTER TABLE ParentEntity
ADD CHECK (DATEDIFF(day,StartDate ,EndDate) > 0)
CREATE TABLE ChildEntity
(
ID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
Parent int NOT NULL FOREIGN KEY REFERENCES ParentEntity (ID),
StartDate date NOT NULL,
EndDate date NOT NULL,
Description varchar(255)
);
ALTER TABLE ChildEntity
ADD CHECK (DATEDIFF(day,StartDate ,EndDate) > 0)
Now I want to add this check: the StartDate and EndDate of each ChildEntity row must occur within the date interval between the StartDate and EndDate of the corresponding ParentEntity row.
How can I make this check? I do not know how to refer to the row which is defined by means of foreign key.
I'd use simple comparisons in the CHECKs rather than DATEDIFF.
If you want to reference values in columns of the parent table, you unfortunately need to duplicate the columns in the child. A FK with ON UPDATE CASCADE can take care of maintaining that data on an ongoing basis. You then need to decide whether you wish to expose the existence of these columns in the child table and how they get initially populated.
So, the basic one is:
CREATE TABLE ParentEntity (
ID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
StartDate date NOT NULL,
EndDate date NOT NULL,
Description varchar(255),
constraint UQ_ParentEntity_Dates UNIQUE (ID,StartDate,EndDate)
);
ALTER TABLE ParentEntity
ADD CHECK (StartDate < EndDate)
CREATE TABLE ChildEntity (
ID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
Parent int NOT NULL FOREIGN KEY REFERENCES ParentEntity (ID),
StartDate date NOT NULL,
EndDate date NOT NULL,
Description varchar(255),
ParentStart date NOT NULL,
ParentEnd date NOT NULL,
constraint FK_ChildParent FOREIGN KEY
(Parent,ParentStart,ParentEnd) references ParentEntity
(ID,StartDate,EndDate) ON UPDATE CASCADE
);
ALTER TABLE ChildEntity
ADD CHECK (StartDate < EndDate)
ALTER TABLE ChildEntity
ADD CHECK (StartDate >= ParentStart AND EndDate <= ParentEnd)
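To see the constraints at work, here is a short sketch against the two tables above. The dates and descriptions are invented for the example, and it assumes the default setting where a constraint violation fails only the offending statement rather than the whole batch.

```sql
DECLARE @ParentID int;

INSERT INTO ParentEntity (StartDate, EndDate, Description)
VALUES ('2020-01-01', '2020-12-31', 'parent');
SET @ParentID = SCOPE_IDENTITY();

-- Accepted: the child lies inside the parent's interval.
INSERT INTO ChildEntity (Parent, StartDate, EndDate, Description, ParentStart, ParentEnd)
VALUES (@ParentID, '2020-02-01', '2020-03-01', 'ok', '2020-01-01', '2020-12-31');

-- Rejected by the CHECK: the child ends after the parent does.
INSERT INTO ChildEntity (Parent, StartDate, EndDate, Description, ParentStart, ParentEnd)
VALUES (@ParentID, '2020-11-01', '2021-01-15', 'too late', '2020-01-01', '2020-12-31');

-- Rejected by FK_ChildParent: the copied dates do not match the parent row,
-- so a caller cannot bypass the CHECK by lying about ParentEnd.
INSERT INTO ChildEntity (Parent, StartDate, EndDate, Description, ParentStart, ParentEnd)
VALUES (@ParentID, '2020-11-01', '2021-01-15', 'cheating', '2020-01-01', '2021-12-31');

-- Should also be rejected: the cascade would push ParentEnd = '2020-02-15'
-- into the existing child row, which then fails the child's CHECK.
UPDATE ParentEntity SET EndDate = '2020-02-15' WHERE ID = @ParentID;
```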
If you wish to hide the existence of these additional columns in the child table, you can create a view and provide an INSTEAD OF trigger that populates these columns. You then have your callers use the view exclusively rather than the base table.
When doing this, you end up with a "super key" declared in the parent (here ID, StartDate and EndDate when just ID alone is a key) and a redundant foreign key constraint (the one on just ID). I usually leave this FK in place to document the "real" FK constraint between the tables. Some may choose to remove the redundant constraint.
(As a complete aside, I'd also recommend you use constraint clauses to introduce all of your constraints, PKs, FKs, UQs and CKs and take the opportunity to name them. It makes ongoing maintenance a lot easier. There's also no need to separate the CHECK constraints away from the initial CREATE TABLE)
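As a sketch of what that aside means in practice, the parent table could be declared in one statement with every constraint named. PK_ParentEntity and CK_ParentEntity_Start_Before_End are names I made up; UQ_ParentEntity_Dates is the one used above. This is an alternative to the earlier definition, not something to run in addition to it.

```sql
CREATE TABLE ParentEntity (
    ID          int NOT NULL IDENTITY(1,1),
    StartDate   date NOT NULL,
    EndDate     date NOT NULL,
    Description varchar(255),
    CONSTRAINT PK_ParentEntity PRIMARY KEY (ID),
    CONSTRAINT UQ_ParentEntity_Dates UNIQUE (ID, StartDate, EndDate),
    CONSTRAINT CK_ParentEntity_Start_Before_End CHECK (StartDate < EndDate)
);
```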
I have two tables. The SQL is:
CREATE TABLE tdegree
(
degree varchar(25) NOT NULL primary key
)
CREATE TABLE tstudy
(
study varchar(25) NOT NULL unique,
degree varchar(25) NOT NULL FOREIGN KEY REFERENCES tdegree (degree)
)
As shown in the figure below, I want to be able to add the records
master it
mphil it
but not
master it
master it
If I apply a unique constraint on the study field, then it does not allow me to add
master it
mphil it
Which constraint do I have to apply?
If I understand correctly, you want the combination of study and degree to be unique, but values can be repeated in individual columns e.g. Row 1 is Master + IT and Row 2 is MPhil + IT, so that both rows are unique but study column has duplicate value IT. In that case, you need to add a unique constraint on both columns together like so:
alter table tstudy add constraint cunique unique(study,degree)
Note that you also need to change the definition of your tstudy table and remove the unique constraint on study column, otherwise you will get a unique constraint violation error, even after adding the above constraint:
CREATE TABLE tstudy
(
study varchar(25) NOT NULL,
degree varchar(25) NOT NULL FOREIGN KEY REFERENCES tdegree (degree)
)
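Putting the two pieces together, a minimal sketch could declare the composite constraint directly in the table definition and then try the rows from the question. It assumes tdegree already exists, tstudy has not been created yet, and a version that supports multi-row VALUES (SQL Server 2008 or later).

```sql
CREATE TABLE tstudy
(
    study  varchar(25) NOT NULL,
    degree varchar(25) NOT NULL FOREIGN KEY REFERENCES tdegree (degree),
    CONSTRAINT cunique UNIQUE (study, degree)
);

INSERT INTO tdegree (degree) VALUES ('master'), ('mphil');

INSERT INTO tstudy (study, degree) VALUES ('it', 'master');  -- accepted
INSERT INTO tstudy (study, degree) VALUES ('it', 'mphil');   -- accepted: same study, different degree
INSERT INTO tstudy (study, degree) VALUES ('it', 'master');  -- rejected: duplicate (study, degree) pair
```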
I am trying to use the following:
CREATE TABLE [dbo].[Application] (
[ApplicationId] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (MAX) NULL,
timestamp
CONSTRAINT [PK_dbo.Application] PRIMARY KEY CLUSTERED ([ApplicationId] ASC)
);
Can someone confirm if this is the correct way. Also can or should I give that column a name of its own?
* Note that I am using Entity Framework. So is it okay to add a column like this but to not add it to the Application object?
I think that timestamp is a poor name for that data type (it does not store time). Somewhere along the way Microsoft came to agree: the timestamp syntax has been deprecated since SQL Server 2008 in favor of rowversion, which was introduced in SQL Server 2000.
Your code relies on a quirk of timestamp: if you omit the column name, the column is named timestamp by default. rowversion does not do that, so you have to name the column yourself.
CREATE TABLE [dbo].[Application] (
[ApplicationId] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (MAX) NULL,
VerCol rowversion,
CONSTRAINT [PK_dbo.Application] PRIMARY KEY CLUSTERED ([ApplicationId] ASC)
);
Ref:
rowversion (Transact-SQL)
timestamp SQL Server 2000
* Note that I know nothing about using Entity Framework.
CREATE TABLE [dbo].[Application] (
[ApplicationId] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (MAX) NULL,
[timestamp] DATETIME NULL DEFAULT GETDATE(),
CONSTRAINT [PK_dbo.Application] PRIMARY KEY CLUSTERED ([ApplicationId] ASC)
);
To add the timestamp / rowversion to an existing table you can do this.
ALTER Table OrderAction ADD [RowVersion] rowversion not null
It will automatically assign values to the column; you don't need to do anything like UPDATE rowversion = getdate()
Please note that if your table is large it can take a while since it needs to add a timestamp for every row. If you have a huge table and you're using a scalable database like Azure SQL you might want to increase capacity first and/or do it during off hours.
timestamp data type is identical to rowversion datatype - it's just up to you what you call the column.
It also doesn't need to be in your data model to be updated by an UPDATE or INSERT. However if it isn't in your data model then you won't actually benefit from the whole point of it which is to get a simplified UPDATE like this:
WHERE ([OrderId] = @p0) AND ([RowVersion] = @p1)
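Written out by hand, that optimistic-concurrency pattern might look like the sketch below. OrderAction and RowVersion come from the ALTER TABLE above and OrderId from the WHERE clause; the Status column, the key value 42 and the variable names are placeholders I invented.

```sql
DECLARE @OrderId int = 42;            -- hypothetical key value
DECLARE @OriginalVersion binary(8);   -- a rowversion value fits in 8 bytes

-- Read the row and remember the version that was seen.
SELECT @OriginalVersion = [RowVersion]
FROM OrderAction
WHERE OrderId = @OrderId;

-- Later: update only if nobody has changed the row in the meantime.
UPDATE OrderAction
SET    Status = 'Shipped'             -- hypothetical column
WHERE  OrderId = @OrderId
  AND  [RowVersion] = @OriginalVersion;

IF @@ROWCOUNT = 0
    PRINT 'Row was changed or deleted by someone else; reload and retry.';
```

Every successful UPDATE bumps RowVersion automatically, so a stale copy of the row can no longer match the WHERE clause.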
We have a table that is going to be say 100 million to a billion rows (Table name: Archive)
This table will be referenced from another table, Users.
We have 2 options for the primary key on the Archive table:
option 1: dataID (bigint)
option 2: userID + datetime (4 byte version).
Schema:
Users
- userID (int)
Archive
- userID
- datetime
OR
Archive
- dataID (bigint)
Which one would be faster?
We are shying away from using Option #1 because bigint is 8 bytes, and with 100 million rows that will add up to a lot of storage.
Update
OK, sorry, I forgot to mention: userID and datetime have to be there regardless, so that was the reason for not adding another column, dataID, to the table.
Some thoughts, but there is probably not a clear cut solution:
- If you have a billion rows, why not use int, which goes from -2.1 billion to +2.1 billion?
- userID (int, 4 bytes) + smalldatetime (4 bytes) = 8 bytes, the same as bigint
- If you are thinking of userID + smalldatetime, then surely this is useful anyway.
- If so, adding a surrogate "archiveID" column will increase space anyway
- Do you require filtering/sorting by userID + smalldatetime?
- Make sure your model is correct, worry about JOINs later...
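The byte counts in the list above are easy to confirm, and the same query can put a rough number on the key storage. This is only a back-of-the-envelope sketch: it counts the key columns alone and ignores row and index overhead. The column aliases are my own.

```sql
SELECT DATALENGTH(CAST(0 AS int))                     AS int_bytes,            -- 4
       DATALENGTH(CAST(0 AS bigint))                  AS bigint_bytes,         -- 8
       DATALENGTH(CAST('20070101' AS smalldatetime))  AS smalldatetime_bytes,  -- 4
       CAST(1000000000 AS bigint) * 8                 AS key_bytes_for_a_billion_rows;  -- 8000000000
```

So either key choice costs about 8 GB of raw key data at a billion rows, before any overhead.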
Concern: Using UserID/[small]datetime carries with it a high risk of not being unique.
Here is some real schema. Is this what you're talking about?
-- Users (regardless of Archive choice)
CREATE TABLE dbo.Users (
userID int NOT NULL IDENTITY,
<other columns>
CONSTRAINT <name> PRIMARY KEY CLUSTERED (userID)
)
-- Archive option 1
CREATE TABLE dbo.Archive (
dataID bigint NOT NULL IDENTITY,
userID int NOT NULL,
[datetime] smalldatetime NOT NULL,
<other columns>
CONSTRAINT <name> PRIMARY KEY CLUSTERED (dataID)
)
-- Archive option 2
CREATE TABLE dbo.Archive (
userID int NOT NULL,
[datetime] smalldatetime NOT NULL,
<other columns>
CONSTRAINT <name> PRIMARY KEY CLUSTERED (userID, [datetime] DESC)
)
CREATE NONCLUSTERED INDEX <name> ON dbo.Archive (
userID,
[datetime] DESC
)
If this were my decision, I would definitely go with option 1. Disk is cheap.
If you go with Option 2, it's likely that you will have to add some other column to your PK to make it unique, then your design starts degrading.
What about option 3: making dataID a 4-byte int?
Also, if I understand it right, the archive table will be referenced from the users table, so it wouldn't even make much sense to have the userID in the archive table.
I recommend that you set up a simulation to validate this in your environment, but my guess would be that the single bigint would be faster in general. However, when you query the table, what are you going to be querying on?
If I were building an archive, I might lean toward having an auto-increment identity field, and then using a partitioning scheme to partition based on DateTime and perhaps userID, but that would depend on the circumstances.
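A rough sketch of that last idea is below, as an alternative definition of the Archive table. The function, scheme and constraint names and the yearly boundary dates are all invented for illustration, and everything is mapped to the PRIMARY filegroup to keep it short. Note that the partitioning column has to be part of the clustered primary key, and that table partitioning is not available in every edition of older SQL Server versions.

```sql
-- Hypothetical yearly boundaries; pick values that match the real data.
CREATE PARTITION FUNCTION pfArchiveByYear (smalldatetime)
AS RANGE RIGHT FOR VALUES ('20100101', '20110101', '20120101');

CREATE PARTITION SCHEME psArchiveByYear
AS PARTITION pfArchiveByYear ALL TO ([PRIMARY]);

CREATE TABLE dbo.Archive (
    dataID     bigint        NOT NULL IDENTITY,
    userID     int           NOT NULL,
    [datetime] smalldatetime NOT NULL,
    -- The partitioning column must be included in the unique clustered key.
    CONSTRAINT PK_Archive PRIMARY KEY CLUSTERED (dataID, [datetime])
) ON psArchiveByYear ([datetime]);
```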