Most efficient design to search for this data in my database? - sql-server

I have the following database tables and a view which represents that data. The tables are hierarchical (if that is how you describe it):
EDIT: I've replaced my 3 tables with FAKE table names/data (for this post) because I'm under NDA not to post anything about our projects, etc. So yeah... I don't really save people's names like this :)
FirstNames
FirstNameId INT PK NOT NULL IDENTITY
Name VARCHAR(100)
MiddleNames
MiddleNameId INT PK NOT NULL IDENTITY
Name VARCHAR(100) NOT NULL
FirstNameId INT FK NOT NULL
Surnames
SurnameId INT PK NOT NULL IDENTITY
Name VARCHAR(100) NOT NULL
FirstNameId INT FK NOT NULL
So, FirstNames is the parent table, with the other two tables being its children.
The view looks like...
PersonNames
FirstNameId
FirstName
MiddleNameId
MiddleName
SurnameId
Surname
Here's some sample data.
FNID FN   MNID MN       SNID SN
-------------------------------
1    Joe  1    BlahBlah 1    Blogs
2    Jane -    -        1    Blogs
3    Jon  -    -        2    Skeet
Now here's the problem: how can I efficiently search for names on the view? I was going to use a Full Text Search/Catalogue, but I can't put that on a view (or at least I can't get it working against a view using the GUI).
EDIT #2: Here are some sample search queries :-
exec uspSearchForPeople 'joe blogs' (1 result)
exec uspSearchForPeople 'joe' (1 result)
exec uspSearchForPeople 'blogs' (2 results)
exec uspSearchForPeople 'jon skeet' (1 result)
exec uspSearchForPeople 'skeet' (1 result)
Should I generate a new table with the full names? How would that look?
Please help!

This doesn't seem like the most logical design decision. Why did you design it like this?
What's your indexing structure currently? An index on Name in each of the 3 tables should speed up the query.
Alternatively, normalizing further by creating a single Name table and having a NameID in each of the three, then indexing the Name table, should also increase performance, but I think indexing the Name field on the 3 tables would be easier and work just as well.
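A minimal sketch of those per-table indexes, using the fake table names from the question:
CREATE INDEX IX_FirstNames_Name ON FirstNames (Name);
CREATE INDEX IX_MiddleNames_Name ON MiddleNames (Name);
CREATE INDEX IX_Surnames_Name ON Surnames (Name);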
What are the stats on updates vs. selects? Adding these indexes might incur a performance hit on writes.

Crazy design; possibly the fake table names make it stranger than it is.
Create indexes based on select usage:
if you are searching on actual first names like "Joe", you need an index on FirstNames.Name
if you are searching on first-name ids like 123, you already have one: FirstNames.FirstNameId (the PK)
if you want to search on FirstNames.Name and/or MiddleNames.Name and/or Surnames.Name, you need indexes on the combinations you will actually use, and the more you create, the harder it is for the optimizer to pick the best one.
Ditch the view and write a dedicated query for the purpose:
Go after first/middle:
SELECT
FirstNames.Name
,MiddleNames.Name
,Surnames.Name
FROM FirstNames
INNER JOIN MiddleNames ON FirstNames.FirstNameId = MiddleNames.FirstNameId
INNER JOIN Surnames ON FirstNames.FirstNameId = Surnames.FirstNameId
WHERE FirstNames.Name = 'John'
AND MiddleNames.Name = 'Q'
Go after last:
SELECT
FirstNames.Name
,MiddleNames.Name
,Surnames.Name
FROM Surnames
INNER JOIN FirstNames ON Surnames.FirstNameId = FirstNames.FirstNameId
INNER JOIN MiddleNames ON FirstNames.FirstNameId = MiddleNames.FirstNameId
WHERE Surnames.Name = 'Public'
Just make sure you have indexes covering the columns your main table uses in the WHERE clause.
Use SET SHOWPLAN_ALL ON to make sure you are using an index ("scans" are bad, "seeks" are good).
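For example (a minimal sketch; SET SHOWPLAN_ALL must be the only statement in its batch, hence the GO separators):
SET SHOWPLAN_ALL ON;
GO
SELECT FirstNames.Name FROM FirstNames WHERE FirstNames.Name = 'John';
GO
SET SHOWPLAN_ALL OFF;
GO
In the output, look for Index Seek rather than Table Scan or Index Scan in the PhysicalOp column.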
EDIT
if possible break the names apart before searching for them:
exec uspSearchForPeople 'joe',null,'blogs' (1 result)
exec uspSearchForPeople 'joe',null,null (1 result)
exec uspSearchForPeople null,null,'blogs' (2 results)
exec uspSearchForPeople 'jon',null,'skeet' (1 result)
exec uspSearchForPeople null,null,'skeet' (1 result)
Within the stored procedure, have three queries:
if @GivenFirstName is not null
    --search from FirstNames where FirstNames.Name = @value & join in other tables
else if @GivenMiddleName is not null
    --search from MiddleNames where MiddleNames.Name = @value & join in other tables
else if @GivenLastName is not null
    --search from Surnames where Surnames.Name = @value & join in other tables
else --error: no names given
Have an index on Name in all three tables.
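A sketch of what that procedure might look like with the three branches filled in, assuming the parameter names used in the sample calls above (illustrative, not the poster's actual code):
create proc uspSearchForPeople
    @GivenFirstName varchar(100) = null,
    @GivenMiddleName varchar(100) = null,
    @GivenLastName varchar(100) = null
as
begin
    set nocount on;
    if @GivenFirstName is not null
    begin
        -- drive the join from FirstNames so its Name index can be seeked
        select f.Name as FirstName, m.Name as MiddleName, s.Name as Surname
        from FirstNames f
        left join MiddleNames m on m.FirstNameId = f.FirstNameId
        left join Surnames s on s.FirstNameId = f.FirstNameId
        where f.Name = @GivenFirstName
          and (@GivenLastName is null or s.Name = @GivenLastName)
    end
    else if @GivenMiddleName is not null
    begin
        select f.Name as FirstName, m.Name as MiddleName, s.Name as Surname
        from MiddleNames m
        inner join FirstNames f on f.FirstNameId = m.FirstNameId
        left join Surnames s on s.FirstNameId = f.FirstNameId
        where m.Name = @GivenMiddleName
    end
    else if @GivenLastName is not null
    begin
        -- drive the join from Surnames so its Name index can be seeked
        select f.Name as FirstName, m.Name as MiddleName, s.Name as Surname
        from Surnames s
        inner join FirstNames f on f.FirstNameId = s.FirstNameId
        left join MiddleNames m on m.FirstNameId = f.FirstNameId
        where s.Name = @GivenLastName
    end
    else
        raiserror('No names given.', 16, 1)
end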
If you cannot split the names apart, I think you are out of luck and you will have to table-scan every row in each table.
Think of a phone book: if you don't use the index and you are looking for a name, you will need to read the entire book.

I would have just one table with a name-type column (first, middle, last) and an FK onto itself, with the clustered index on the name column.
CREATE TABLE [Name] (
NameID INT NOT NULL IDENTITY,
[Name] varchar(100) not null,
NameType varchar(1) not null,
FirstNameID int null
)
ALTER TABLE [Name] ADD CONSTRAINT PK_Name PRIMARY KEY NONCLUSTERED (NameID)
ALTER TABLE [Name] ADD CONSTRAINT FK_Name_FirstNameID FOREIGN KEY (FirstNameID) REFERENCES [Name](NameID)
CREATE CLUSTERED INDEX IC_Name ON [Name] ([Name], NameType)
DECLARE @fid int
INSERT [Name] ([Name], NameType, FirstNameID) VALUES ('Joe', 'F', NULL)
SELECT @fid = scope_identity()
INSERT [Name] ([Name], NameType, FirstNameID) VALUES ('BlahBlah', 'M', @fid)
INSERT [Name] ([Name], NameType, FirstNameID) VALUES ('Blogs', 'L', @fid)
INSERT [Name] ([Name], NameType, FirstNameID) VALUES ('Jane', 'F', NULL)
SELECT @fid = scope_identity()
INSERT [Name] ([Name], NameType, FirstNameID) VALUES ('Blogs', 'L', @fid)
INSERT [Name] ([Name], NameType, FirstNameID) VALUES ('Jon', 'F', NULL)
SELECT @fid = scope_identity()
INSERT [Name] ([Name], NameType, FirstNameID) VALUES ('Skeet', 'L', @fid)
You could then build a dynamic but parameterized WHERE clause based on the number of values to search (or hard-code them, for that matter, assuming there are at most 3) using sp_executesql in a stored proc, LINQ to SQL, or even ugly string manipulation in code.
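A hedged sketch of the sp_executesql route against the single [Name] table above; the @first/@last variable names are made up for illustration:
declare @first varchar(100) = 'Joe', @last varchar(100) = 'Blogs';
declare @sql nvarchar(max) = N'
select f.[Name] as FirstName, m.[Name] as MiddleName, l.[Name] as LastName
from [Name] f
left join [Name] m on m.FirstNameID = f.NameID and m.NameType = ''M''
left join [Name] l on l.FirstNameID = f.NameID and l.NameType = ''L''
where f.NameType = ''F''';
-- append only the predicates that were actually supplied
if @first is not null set @sql += N' and f.[Name] = @first';
if @last is not null set @sql += N' and l.[Name] = @last';
exec sp_executesql @sql,
    N'@first varchar(100), @last varchar(100)',
    @first = @first, @last = @last;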

I think what you are wanting is an index table. It doesn't matter how many tables and columns you have; as stuff is inserted into the database, it gets indexed. For example:
I would recommend one table for your names.
NameTable
----------
Id
FirstName
MiddleName
LastName
You can have as many normal tables as you want...
IndexTable
----------
Id
Text
You could use the text as the primary key, but I always have a separate id column for the primary key (just habit).
IndexItemTable
----------
Id
IndexId // Has a foreign key reference to IndexTable Id
ReferenceId // The record Id of where the text occurs
ReferenceTable // The table where the text occurs
Then, as you insert the name "Jim Barbarovich Fleming", you would also scan your index, find that it's empty, and create 3 new records for Jim, Barbarovich, and Fleming, all with the same ReferenceId and a ReferenceTable of "NameTable". Then you insert another record like "Jim Bradley Fleming": you would scan the index table, see that you already have values for "Jim" and "Fleming", and so only add a new index record for "Bradley", plus IndexItems with a ReferenceId of 2 and a ReferenceTable of "NameTable".
By building an index you can search via a single textbox and find all records/fields in your database that have those values.
Note: you're going to want to change everything to uppercase or lowercase when you insert it into the index, and then use Equals(value, OrdinalIgnoreCase).
Edit:
I can't just upload the image; I'd have to host it somewhere, I guess. But it's not any different from the table diagrams I put above. The only relationship IndexTable has is to IndexItemTable. I would do the rest in code. For example, during an insert or update of a new record in the Name table you would have to:
1. Scan the IndexTable and see if each of the fields in the NameTable exists in it.
2. If they don't, add a new record to the IndexTable with the text that wasn't found. If they do, go on to step 3.
3. Add a record in the IndexItemTable with the ReferenceId (the id of the record in the NameTable), the ReferenceTable ("NameTable"), and the IndexId of the text found in the IndexTable.
Then, when they do a search via your single text box, you search for each word in the IndexTable and return the names from the NameTable that are referenced in it, as sketched below.
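A sketch of that search, using the table and column names from the diagrams above (an any-word match; if the Text column is stored upper-cased as suggested, the comparison stays index-friendly):
select distinct n.Id, n.FirstName, n.MiddleName, n.LastName
from NameTable n
inner join IndexItemTable ii on ii.ReferenceId = n.Id
    and ii.ReferenceTable = 'NameTable'
inner join IndexTable i on i.Id = ii.IndexId
where i.[Text] in (upper('Jim'), upper('Fleming'));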


Using TSQLT FakeTable to test a table created by a Stored Procedure

I am learning to write unit tests for work. I was advised to use TSQLT FakeTable to test some aspects of a table created by a stored procedure.
In other unit tests, we create a temp table for the stored procedure and then test the temp table. I'm not sure how to work the FakeTable into the test.
EXEC tSQLt.NewTestClass 'TestThing';
GO
CREATE OR ALTER PROCEDURE TestThing.[test API_StoredProc to make sure parameters work]
AS
BEGIN
DROP TABLE IF EXISTS #Actual;
CREATE TABLE #Actual ----Do I need to create the temp table and the Fake table? I thought I might need to because I'm testing a table created by a stored procedure.
(
ISO_3166_Alpha2 NVARCHAR(5),
ISO_3166_Alpha3 NVARCHAR(5),
CountryName NVARCHAR(100),
OfficialStateName NVARCHAR(300),
sovereigny NVARCHAR(75),
icon NVARCHAR(100)
);
INSERT #Actual
(
ISO_3166_Alpha2,
ISO_3166_Alpha3,
CountryName,
OfficialStateName,
sovereigny,
icon
)
EXEC Marketing.API_StoredProc @Username = 'AnyValue', -- varchar(100)
@FundId = 0, -- int
@IncludeSalesForceInvestorCountry = NULL, -- bit
@IncludeRegisteredSalesJurisdictions = NULL, -- bit
@IncludeALLCountryForSSRS = NULL, -- bit
@WHATIF = NULL, -- bit
@OUTPUT_DEBUG = NULL -- bit
EXEC tSQLt.FakeTable @TableName = N'#Actual', -- nvarchar(max) -- How do I differentiate between the fake table and the temp table now?
@SchemaName = N'', -- nvarchar(max)
@Identity = NULL, -- bit
@ComputedColumns = NULL, -- bit
@Defaults = NULL -- bit
INSERT INTO #Actual
(
ISO_3166_Alpha2,
ISO_3166_Alpha3,
CountryName,
OfficialStateName,
sovereigny,
icon
)
VALUES
('AF', 'AFG', 'Afghanistan', 'The Islamic Republic of Afghanistan', 'UN MEMBER STATE', 'test')
SELECT * FROM #actual
END;
GO
EXEC tSQLt.Run 'TestThing';
What I'm trying to do with the code above is basically just to get FakeTable working. I get an error: "FakeTable could not resolve the object name #Actual".
What I ultimately want to test is the parameters in the stored procedure. Only certain entries should be returned if, say, IncludeSalesForceInvestorCountry is set to 1. What should be returned may change over time, so that's why I was advised to use FakeTable.
In your scenario, you don't need to fake any temp tables; just fake the table that is referenced by Marketing.API_StoredProc and populate it with values that you expect to be returned, and some you don't. Add what you expect to see to an #expected table, call Marketing.API_StoredProc dumping the results into an #actual table, and compare the results with tSQLt.AssertEqualsTable.
A good starting point might be to review how tSQLt.FakeTable works and a real-world use case.
As you know, each unit test runs within its own transaction started and rolled back by the tSQLt framework. When you call tSQLt.FakeTable within a unit test, it temporarily renames the specified table, then creates an exactly named facsimile of that table. The temporary copy allows NULL in every column, and has no primary or foreign keys, identity column, or check, default or unique constraints (although some of those can be included in the facsimile table depending on parameters passed to tSQLt.FakeTable). For the duration of the test transaction, any object that references the named table will use the fake rather than the real table. At the end of the test, tSQLt rolls back the transaction, the fake table is dropped and the original table is returned to its former state (this all happens automatically). You might ask: what is the point of that?
Imagine you have an [OrderDetail] table which has columns including OrderId and ProductId as the primary key, an OrderStatusId column plus a bunch of other NOT NULL columns. The DDL for this table might look something like this.
CREATE TABLE [dbo].[OrderDetail]
(
OrderDetailId int IDENTITY(1,1) NOT NULL
, OrderId int NOT NULL
, ProductId int NOT NULL
, OrderStatusId int NOT NULL
, Quantity int NOT NULL
, CostPrice decimal(18,4) NOT NULL
, Discount decimal(6,4) NOT NULL
, DeliveryPreferenceId int NOT NULL
, PromisedDeliveryDate datetime NOT NULL
, DespatchDate datetime NULL
, ActualDeliveryDate datetime NULL
, DeliveryDelayReason varchar(500) NOT NULL
/* ... other NULL and NOT NULL columns */
, CONSTRAINT PK_OrderDetail PRIMARY KEY CLUSTERED (OrderId, ProductId)
, CONSTRAINT AK_OrderDetail_AutoIncrementingId UNIQUE NONCLUSTERED (OrderDetailId)
, CONSTRAINT FK_OrderDetail_Order FOREIGN KEY (OrderId) REFERENCES [dbo].[Orders] (OrderId)
, CONSTRAINT FK_OrderDetail_Product FOREIGN KEY (ProductId) REFERENCES [dbo].[Product] (ProductId)
, CONSTRAINT FK_OrderDetail_OrderStatus FOREIGN KEY (OrderStatusId) REFERENCES [dbo].[OrderStatus] (OrderStatusId)
, CONSTRAINT FK_OrderDetail_DeliveryPreference FOREIGN KEY (DeliveryPreferenceId) REFERENCES [dbo].[DeliveryPreference] (DeliveryPreferenceId)
);
As you can see, this table has foreign key dependencies on the Orders, Product, DeliveryPreference and OrderStatus table. Product may in turn have foreign keys that reference ProductType, BrandCategory, Supplier among others. The Orders table has foreign key references to Customer, Address and SalesPerson among others. All of the tables in this chain have numerous columns defined as NOT NULL and/or are constrained by CHECK and other constraints. Some of these tables themselves have more foreign keys.
Now imagine you want to write a stored procedure (OrderDetailStatusUpdate) whose job is to update the order status for a single row on the OrderDetail table. It has three input parameters: @OrderId, @ProductId and @OrderStatusId. Think about what you would need to do to set up a test for this procedure. You would need to add at least two rows to the OrderDetail table, including all the NOT NULL columns. You would also need to add parent records to all the FK-referenced tables, and to any tables above those in the hierarchy, ensuring that all your inserts comply with all the nullability and other constraints on those tables too. By my count that is at least 11 tables that need to be populated, all for one simple test. And even if you bite the bullet and do all that set-up, at some point in the future someone may (probably will) come along and add a new NOT NULL column to one of those tables, or change a constraint, in a way that causes your test to fail - and that failure actually has nothing to do with your test or the stored procedure you are testing. One of the basic tenets of test-driven development is that a test should have only one reason to fail; here I count dozens.
tSQLt.FakeTable to the rescue.
What is the minimum you actually need to do in order to set up a test for that procedure? You need two rows in the OrderDetail table (one that gets updated, one that doesn't) and the only columns you actually "need" to consider are OrderId and ProductId (the identifying key) plus OrderStatusId - the column being updated. The rest of the columns, whilst important in the overall design, have no relevance to the object under test. In your test for OrderDetailStatusUpdate, you would follow these steps:
1. Call tSQLt.FakeTable 'dbo.OrderDetail'.
2. Create an #expected table (with OrderId, ProductId and OrderStatusId columns) and populate it with the two rows you expect to end up with (one will have the expected OrderStatusId, the other can be NULL).
3. Add two rows to the now-mocked OrderDetail table (OrderId and ProductId only).
4. Call the procedure under test, OrderDetailStatusUpdate, passing the OrderId and ProductId for one of the rows inserted plus the OrderStatusId you are changing to.
5. Use tSQLt.AssertEqualsTable to compare the #expected table with the OrderDetail table. This assertion will only compare the columns on the #expected table; the other columns on OrderDetail will be ignored.
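A sketch of those five steps as a tSQLt test; the test-class name is illustrative and the procedure is the hypothetical OrderDetailStatusUpdate discussed above:
CREATE OR ALTER PROCEDURE OrderTests.[test OrderDetailStatusUpdate updates only the matching row]
AS
BEGIN
    -- step 1: mock the table (constraints and NOT NULLs are removed on the fake)
    EXEC tSQLt.FakeTable 'dbo.OrderDetail';
    -- step 2: what we expect to end up with
    CREATE TABLE #expected (OrderId int, ProductId int, OrderStatusId int);
    INSERT #expected VALUES (1, 10, 5), (2, 20, NULL);
    -- step 3: only the columns relevant to the test need values
    INSERT dbo.OrderDetail (OrderId, ProductId) VALUES (1, 10), (2, 20);
    -- step 4: run the procedure under test
    EXEC dbo.OrderDetailStatusUpdate @OrderId = 1, @ProductId = 10, @OrderStatusId = 5;
    -- step 5: compare only the columns present on #expected
    SELECT OrderId, ProductId, OrderStatusId INTO #actual FROM dbo.OrderDetail;
    EXEC tSQLt.AssertEqualsTable '#expected', '#actual';
END;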
Creating this test is really quick, and the only reason it is ever likely to fail is that something pertinent to the code under test has changed in the underlying schema. Changes to any other columns on the OrderDetail table or any of the parent/grandparent tables will not cause this test to break.
So the reason for using tSQLt.FakeTable (or any other kind of mock object) is to provide really robust test isolation and to simplify test data preparation.

SQL Server check constraints - only one particular value per group [duplicate]

How could I set a constraint on a table so that only one of the records has its isDefault bit field set to 1?
The constraint is not table-scoped; rather, there should be one default per set of rows, specified by a FormID.
Use a unique filtered index
On SQL Server 2008 or higher you can simply use a unique filtered index
CREATE UNIQUE INDEX IX_TableName_FormID_isDefault
ON TableName(FormID)
WHERE isDefault = 1
Where the table is
CREATE TABLE TableName(
FormID INT NOT NULL,
isDefault BIT NOT NULL
)
For example, if you try to insert many rows with the same FormID and isDefault set to 1, you will get this error:
Cannot insert duplicate key row in object 'dbo.TableName' with unique
index 'IX_TableName_FormID_isDefault'. The duplicate key value is (1).
Source: http://technet.microsoft.com/en-us/library/cc280372.aspx
Here's a modification of Damien_The_Unbeliever's solution that allows one default per FormID.
CREATE VIEW form_defaults
AS
SELECT FormID
FROM whatever
WHERE isDefault = 1
GO
CREATE UNIQUE CLUSTERED INDEX ix_form_defaults on form_defaults (FormID)
GO
But the serious relational folks will tell you this information should just be in another table.
CREATE TABLE form (
FormID int NOT NULL PRIMARY KEY,
DefaultWhateverID int FOREIGN KEY REFERENCES Whatever(ID)
)
From a normalization perspective, this would be an inefficient way of storing a single fact.
I would opt to hold this information at a higher level, by storing (in a different table) a foreign key to the identifier of the row which is considered to be the default.
CREATE TABLE [dbo].[Foo](
[Id] [int] NOT NULL,
CONSTRAINT [PK_Foo] PRIMARY KEY CLUSTERED
(
[Id] ASC
) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[DefaultSettings](
[DefaultFoo] [int] NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DefaultSettings] WITH CHECK ADD CONSTRAINT [FK_DefaultSettings_Foo] FOREIGN KEY([DefaultFoo])
REFERENCES [dbo].[Foo] ([Id])
GO
ALTER TABLE [dbo].[DefaultSettings] CHECK CONSTRAINT [FK_DefaultSettings_Foo]
GO
You could use an insert/update trigger.
Within the trigger, after an insert or update, if the count of rows with isDefault = 1 is more than 1, roll back the transaction.
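A minimal sketch of such a trigger, assuming the TableName/FormID/isDefault schema from the filtered-index answer above:
CREATE TRIGGER trg_OneDefaultPerForm
ON TableName
AFTER INSERT, UPDATE
AS
BEGIN
    -- reject the statement if any FormID now has more than one default row
    IF EXISTS (SELECT FormID FROM TableName
               WHERE isDefault = 1
               GROUP BY FormID
               HAVING COUNT(*) > 1)
    BEGIN
        ROLLBACK TRANSACTION;
        RAISERROR('Only one default row per FormID is allowed.', 16, 1);
    END
END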
CREATE VIEW vOnlyOneDefault
WITH SCHEMABINDING
AS
SELECT 1 as Lock
FROM <underlying table>
WHERE isDefault = 1
GO
CREATE UNIQUE CLUSTERED INDEX IX_vOnlyOneDefault on vOnlyOneDefault (Lock)
GO
You'll need to have the right ANSI settings turned on for this.
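For reference, these are the SET options that must be in effect when you create the index on the view and when you modify the underlying data:
SET ANSI_NULLS ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;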
I don't know about SQL Server, but if it supports function-based indexes like Oracle does, I hope this can be translated; if not, sorry.
You can create an index like this, supposing the default value is 1234, the column is DEFAULT_COLUMN, and ID_COLUMN is the primary key:
CREATE
UNIQUE
INDEX only_one_default
ON my_table
( DECODE(DEFAULT_COLUMN, 1234, -1, ID_COLUMN) )
This DDL creates a unique index that indexes -1 if the value of DEFAULT_COLUMN is 1234, and ID_COLUMN in any other case. Then, if two rows have the default DEFAULT_COLUMN value, it raises an exception.
The question implies to me that you have a primary table that has some child records, and one of those child records will be the default record. Using addresses and a separate default table, here is an example of how to make that happen using third normal form. Of course I don't know if it's valuable to answer something that is so old, but it struck my fancy.
--drop table dev.defaultAddress;
--drop table dev.addresses;
--drop table dev.people;
CREATE TABLE [dev].[people](
[Id] [int] identity primary key,
name char(20)
)
GO
CREATE TABLE [dev].[Addresses](
id int identity primary key,
peopleId int foreign key references dev.people(id),
address varchar(100)
) ON [PRIMARY]
GO
CREATE TABLE [dev].[defaultAddress](
id int identity primary key,
peopleId int foreign key references dev.people(id),
addressesId int foreign key references dev.addresses(id))
go
create unique index defaultAddress on dev.defaultAddress (peopleId)
go
create unique index idx_addr_id_person on dev.addresses(peopleid,id);
go
ALTER TABLE dev.defaultAddress
ADD CONSTRAINT FK_Def_People_Address
FOREIGN KEY(peopleID, addressesID)
REFERENCES dev.Addresses(peopleId, id)
go
insert into dev.people (name)
select 'Bill' union
select 'John' union
select 'Harry'
insert into dev.Addresses (peopleid, address)
select 1, '123 someplace' union
select 1,'work place' union
select 2,'home address' union
select 3,'some address'
insert into dev.defaultaddress (peopleId, addressesid)
select 1,1 union
select 2,3
-- so two home addresses are default now
-- try adding another default address to Bill and you get an error
select * from dev.people
join dev.addresses on people.id = addresses.peopleid
left join dev.defaultAddress on defaultAddress.peopleid = people.id and defaultaddress.addressesid = addresses.id
insert into dev.defaultaddress (peopleId, addressesId)
select 1,2
GO
You could do it through an INSTEAD OF trigger or, if you want it as a constraint, create a constraint that references a function that checks for a row that has the default set to 1.
EDIT: oops, needs to be <=
Create table mytable(id1 int, defaultX bit not null default(0))
go
create Function dbo.fx_DefaultExists()
returns int as
Begin
Declare @ret int
Set @ret = 0
Select @ret = count(1) from mytable
Where defaultX = 1
Return @ret
End
GO
Alter table mytable add
CONSTRAINT [CHK_DEFAULT_SET] CHECK
(([dbo].fx_DefaultExists()<=(1)))
GO
Insert into mytable (id1, defaultX) values (1,1)
Insert into mytable (id1, defaultX) values (2,1)
This is a fairly complex process that cannot be handled through a simple constraint.
We do this through a trigger. However before you write the trigger you need to be able to answer several things:
do we want to fail the insert if a default exists, change it to 0 instead of 1 or change the existing default to 0 and leave this one as 1?
what do we want to do if the default record is deleted and other non default records are still there? Do we make one the default, if so how do we determine which one?
You will also need to be very, very careful to make the trigger handle multiple row processing. For instance a client might decide that all of the records of a particular type should be the default. You wouldn't change a million records one at a time, so this trigger needs to be able to handle that. It also needs to handle that without looping or the use of a cursor (you really don't want the type of transaction discussed above to take hours locking up the table the whole time).
You also need a very extensive testing scenario for this trigger before it goes live. You need to test:
adding a record with no default and it is the first record for that customer
adding a record with a default and it is the first record for that customer
adding a record with no default and it is not the first record for that customer
adding a record with a default and it is not the first record for that customer
Updating a record to have the default when no other record has it (assuming you don't require one record to always be set as the default)
Updating a record to remove the default
Deleting the record with the default
Deleting a record without the default
Performing a mass insert with multiple situations in the data including two records which both have isdefault set to 1 and all of the situations tested when running individual record inserts
Performing a mass update with multiple situations in the data including two records which both have isdefault set to 1 and all of the situations tested when running individual record updates
Performing a mass delete with multiple situations in the data including two records which both have isdefault set to 1 and all of the situations tested when running individual record deletes
@Andy Jones gave the answer above closest to mine, but bearing in mind the Rule of Three, I placed the logic directly in the stored proc that updates this table. This was my simple solution. If I need to update the table from elsewhere, I will move the logic to a trigger. The one-default rule applies to each set of records specified by a FormID and a ConfigID:
ALTER proc [dbo].[cpForm_UpdateLinkedReport]
@reportLinkId int,
@defaultYN bit,
@linkName nvarchar(150)
as
if @defaultYN = 1
begin
declare @formId int, @configId int
select @formId = FormID, @configId = ConfigID from csReportLink where ReportLinkID = @reportLinkId
update csReportLink set DefaultYN = 0 where isnull(ConfigID, @configId) = @configId and FormID = @formId
end
update
csReportLink
set
DefaultYN = @defaultYN,
LinkName = @linkName
where
ReportLinkID = @reportLinkId

unique index is not enforced if IsActive column is false [duplicate]

I have a situation where I need to enforce a unique constraint on a set of columns, but only for one value of a column.
So for example I have a table like Table(ID, Name, RecordStatus).
RecordStatus can only have a value 1 or 2 (active or deleted), and I want to create a unique constraint on (ID, RecordStatus) only when RecordStatus = 1, since I don't care if there are multiple deleted records with the same ID.
Apart from writing triggers, can I do that?
I am using SQL Server 2005.
Behold, the filtered index. From the documentation (emphasis mine):
A filtered index is an optimized nonclustered index especially suited to cover queries that select from a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. A well-designed filtered index can improve query performance as well as reduce index maintenance and storage costs compared with full-table indexes.
And here's an example combining a unique index with a filter predicate:
create unique index MyIndex
on MyTable(ID)
where RecordStatus = 1;
This essentially enforces uniqueness of ID when RecordStatus is 1.
Following the creation of that index, a uniqueness violation will raise an error:
Msg 2601, Level 14, State 1, Line 13
Cannot insert duplicate key row in object 'dbo.MyTable' with unique index 'MyIndex'. The duplicate key value is (9999).
Note: the filtered index was introduced in SQL Server 2008. For earlier versions of SQL Server, please see this answer.
Add a check constraint like this. The difference is, you'll return false if RecordStatus = 1 and the count > 0.
http://msdn.microsoft.com/en-us/library/ms188258.aspx
CREATE TABLE CheckConstraint
(
Id TINYINT,
Name VARCHAR(50),
RecordStatus TINYINT
)
GO
CREATE FUNCTION CheckActiveCount(
@Id INT
) RETURNS INT AS BEGIN
DECLARE @ret INT;
SELECT @ret = COUNT(*) FROM CheckConstraint WHERE Id = @Id AND RecordStatus = 1;
RETURN @ret;
END;
GO
ALTER TABLE CheckConstraint
ADD CONSTRAINT CheckActiveCountConstraint CHECK (NOT (dbo.CheckActiveCount(Id) > 1 AND RecordStatus = 1));
INSERT INTO CheckConstraint VALUES (1, 'No Problems', 2);
INSERT INTO CheckConstraint VALUES (1, 'No Problems', 2);
INSERT INTO CheckConstraint VALUES (1, 'No Problems', 2);
INSERT INTO CheckConstraint VALUES (1, 'No Problems', 1);
INSERT INTO CheckConstraint VALUES (2, 'Oh no!', 1);
INSERT INTO CheckConstraint VALUES (2, 'Oh no!', 2);
-- Msg 547, Level 16, State 0, Line 14
-- The INSERT statement conflicted with the CHECK constraint "CheckActiveCountConstraint". The conflict occurred in database "TestSchema", table "dbo.CheckConstraint".
INSERT INTO CheckConstraint VALUES (2, 'Oh no!', 1);
SELECT * FROM CheckConstraint;
-- Id   Name         RecordStatus
-- ---- ------------ ------------
-- 1    No Problems  2
-- 1    No Problems  2
-- 1    No Problems  2
-- 1    No Problems  1
-- 2    Oh no!       1
-- 2    Oh no!       2
ALTER TABLE CheckConstraint
DROP CONSTRAINT CheckActiveCountConstraint;
DROP FUNCTION CheckActiveCount;
DROP TABLE CheckConstraint;
You could move the deleted records to a table that lacks the constraint, and perhaps use a view with UNION of the two tables to preserve the appearance of a single table.
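A rough sketch of that split, with illustrative names:
-- active rows carry the uniqueness guarantee; deleted rows do not
CREATE TABLE ActiveRecords (ID int NOT NULL UNIQUE, Name varchar(50) NOT NULL);
CREATE TABLE DeletedRecords (ID int NOT NULL, Name varchar(50) NOT NULL);
GO
CREATE VIEW AllRecords AS
SELECT ID, Name, 1 AS RecordStatus FROM ActiveRecords
UNION ALL
SELECT ID, Name, 2 AS RecordStatus FROM DeletedRecords;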
You can do this in a really hacky way...
Create a schemabound view on your table:
CREATE VIEW dbo.Whatever
WITH SCHEMABINDING
AS
SELECT ID, Name, RecordStatus
FROM dbo.[Table]
WHERE RecordStatus = 1
Now create a unique clustered index on the view with the fields you want (schemabound views can't use SELECT *, hence the explicit column list).
One note about schemabound views, though: if you change the underlying tables, you will have to recreate the view. Plenty of gotchas because of that.
For those still searching for a solution, I came across a nice answer to a similar question, and I think it can still be useful for many. While moving deleted records to another table may be a better solution, those who don't want to move the records can use the idea in the linked answer, which is as follows:
Set deleted = 0 when the record is available/active.
Set deleted = <row_id or some other unique value> when marking the row as deleted.
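A sketch of that idea, assuming a surrogate RowId column (names are illustrative):
CREATE TABLE MyTable (
    RowId int IDENTITY(1,1) PRIMARY KEY,
    ID int NOT NULL,
    Name varchar(50) NOT NULL,
    deleted int NOT NULL DEFAULT 0, -- 0 = active; set to the row's own RowId on delete
    CONSTRAINT UQ_MyTable_ID_deleted UNIQUE (ID, deleted)
);
-- soft delete: stamping the row's own key frees up the (ID, 0) slot for a new active row
UPDATE MyTable SET deleted = RowId WHERE RowId = 42;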
If you can't use NULL as a RecordStatus, as Bill suggested, you could combine his idea with a function-based index. Create a function that returns NULL if the RecordStatus is not one of the values you want to consider in your constraint (and the RecordStatus otherwise), and create an index over that.
That'll have the advantage that you don't have to explicitly examine other rows in the table in your constraint, which could cause you performance issues.
I should say I don't know SQL server at all, but I have successfully used this approach in Oracle.
Because you are going to allow duplicates, a unique constraint will not work. You can create a check constraint for the RecordStatus column and a stored procedure for INSERT that checks for existing active records before inserting duplicate IDs.

How do I create a multiple column unique constraint in SQL Server

I have a table that contains, for example, two fields that I want to make unique within the database. For example:
create table Subscriber (
ID int not null,
DataSetId int not null,
Email nvarchar(100) not null,
...
)
The ID column is the primary key and both DataSetId and Email are indexed.
What I want to be able to do is prevent the same Email and DataSetId combination appearing in the table or, to put it another way, the Email value must be unique for a given DataSetId.
I tried creating a unique index on the columns
CREATE UNIQUE NONCLUSTERED INDEX IX_Subscriber_Email
ON Subscriber (DataSetId, Email)
but I found that this had quite a significant impact on search times (when searching for an email address for example - there are 1.5 million rows in the table).
Is there a more efficient way of achieving this type of constraint?
but I found that this had quite a significant impact on search times
(when searching for an email address for example
The index you defined on (DataSetId, Email) cannot be used for searches based on email. If you create an index with the Email field in the leftmost position, it can be used:
CREATE UNIQUE NONCLUSTERED INDEX IX_Subscriber_Email
ON Subscriber (Email, DataSetId);
This index would serve both as the unique-constraint enforcement and as a means to quickly search for an email. This index, though, cannot be used to quickly search for a specific DataSetId.
The gist of it is that whenever you define a multi-key index, it can be used only for searches in the order of the keys. An index on (A, B, C) can be used to seek values on column A, to search values on both A and B, or to search values on all three columns A, B and C. However, it cannot be used to search values on B or on C alone.
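Applying that rule here, you could keep the unique index for the constraint plus email searches, and add a second index for searches on DataSetId alone (a sketch; the INCLUDE column is optional):
CREATE UNIQUE NONCLUSTERED INDEX IX_Subscriber_Email_DataSetId
    ON Subscriber (Email, DataSetId);
CREATE NONCLUSTERED INDEX IX_Subscriber_DataSetId
    ON Subscriber (DataSetId) INCLUDE (Email);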
I assume the only way to enter data into that table is through SPs. If that's the case, you can implement some logic in your insert and update SPs to find out whether the values you are going to insert/update already exist in the table.
Something like this
create proc spInsert
(
@DataSetId int,
@Email nvarchar(100)
)
as
begin
if exists (select * from tableName where DataSetId = @DataSetId and Email = @Email)
select -1 -- duplicate flag
else
begin
-- insert logic here
select 1 -- success flag
end
end
GO
create proc spUpdate
(
@ID int,
@DataSetId int,
@Email nvarchar(100)
)
as
begin
if exists
(select * from tableName where DataSetId = @DataSetId and Email = @Email and ID <> @ID)
select -1 -- duplicate flag
else
begin
-- update logic here
select 1 -- success flag
end
end
GO

Can a SQL Server 2000 table have no PK, and therefore contain duplicate records?

I have an audit table and instead of defining an identity or ticketed column, I'm considering just pushing in the records of the recorded table (via triggers).
Can a SQL Server 2000 table have no PK, and therefore contain duplicate records?
If yes, does all I have to do consist of CREATING the TABLE without defining any constraint on it?
Yes, this is possible, but not necessarily a good idea. Replication and efficient indexing will be quite difficult without a primary key.
Yes, a table without a primary key or unique constraint can have rows that are duplicated;
for example
CREATE TABLE bla(ID INT)
INSERT bla (ID) VALUES(1)
INSERT bla (ID) VALUES(1)
INSERT bla (ID) VALUES(1)
SELECT * FROM bla
GO
Yes, a SQL Server 2000 table can have no primary key and contain duplicate records, and yes, you can simply create a table without defining any constraint on it. However, I would not suggest this.
Instead, since you are creating an audit table for another table, let's say for this example you have a Person table and a PersonAudit table that tracks changes in the Person table.
Create your audit table like this:
CREATE TABLE dbo.PersonAudit
(
PersonAuditID int NOT NULL IDENTITY (1, 1),
PersonId int NOT NULL,
FirstName nvarchar(50) NOT NULL,
LastName nvarchar(50) NOT NULL,
PersonWhoMadeTheChange nvarchar(100) NOT NULL,
TimeOfChange datetime NOT NULL,
ChangeAction int NOT NULL,
/* any other fields here*/
CONSTRAINT [PK_PersonAudit] PRIMARY KEY NONCLUSTERED
(
[PersonAuditID] ASC
)
) ON [PRIMARY]
This will give you a primary key, and keep records unique to the table. It also provides the ability to track who made the change, when the change was made, and if the change was an insert, update or delete.
Your triggers would look like the following
CREATE TRIGGER Insert_PERSON
ON PERSON
AFTER INSERT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO PERSONAUDIT
(PersonID,
FirstName,
LastName,
PersonWhoMadeTheChange,
TimeOfChange,
ChangeAction,
... other fields here)
SELECT
PersonID,
FirstName,
LastName,
USER_NAME(),
getDate(),
1,
... other fields here
FROM INSERTED
END
CREATE TRIGGER Update_PERSON
ON PERSON
AFTER UPDATE
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO PERSONAUDIT
(PersonID,
FirstName,
LastName,
PersonWhoMadeTheChange,
TimeOfChange,
ChangeAction,
... other fields here)
SELECT
PersonID,
FirstName,
LastName,
USER_NAME(),
getDate(),
2,
... other fields here
FROM INSERTED
END
CREATE TRIGGER Delete_PERSON
ON PERSON
AFTER DELETE
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO PERSONAUDIT
(PersonID,
FirstName,
LastName,
PersonWhoMadeTheChange,
TimeOfChange,
ChangeAction,
... other fields here)
SELECT
PersonID,
FirstName,
LastName,
USER_NAME(),
getDate(),
3,
... other fields here
FROM DELETED
END
SQL Server 2000+ can have tables without a PK. And yes, you create them by simply not defining any key constraint.
For an audit table, you need to think about what you may be using the audit data for. Even if you are not auditing specifically to restore records when unfortunate changes were made, audit tables are inevitably used for this. Will it be easier to identify the record you want to restore if you have a surrogate key that prevents you from accidentally restoring 30 other entries when you only want the most recent? Will a key value help you identify the 32,578 records that were deleted in one batch and need to be restored?
What we do for auditing is have two tables for each audited table: one stores information about the batch of records changed, including an auto-incrementing id, the user, the application, the datetime, and the number of affected records. The child table then uses that id as the FK and stores the details about the old and new values for each record inserted/updated/deleted. This really helps us when a process bug causes many records to be changed by accident.
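A sketch of that two-table layout (column names are illustrative):
CREATE TABLE AuditBatch
(
    AuditBatchID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
    UserName nvarchar(100) NOT NULL,
    ApplicationName nvarchar(100) NOT NULL,
    ChangeDate datetime NOT NULL DEFAULT GETDATE(),
    AffectedRecords int NOT NULL
)
GO
CREATE TABLE AuditDetail
(
    AuditDetailID int NOT NULL IDENTITY(1,1) PRIMARY KEY,
    AuditBatchID int NOT NULL FOREIGN KEY REFERENCES AuditBatch (AuditBatchID),
    TableName sysname NOT NULL,
    ColumnName sysname NOT NULL,
    OldValue nvarchar(4000) NULL,
    NewValue nvarchar(4000) NULL
)
GO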
