Given two tables:
TableA
(
id : primary key,
type : tinyint,
...
)
TableB
(
id : primary key,
tableAId : foreign key to TableA.id,
...
)
There is a check constraint on TableA.type with permitted values of (0, 1, 2, 3). All other values are forbidden.
Due to the known limitations, records in TableB can exist only when TableB.TableAId references the record in TableA with TableA.type=0, 1 or 2 but not 3. The latter case is forbidden and leads the system into an invalid state.
How can I guarantee that the insert into TableB will fail in such a case?
Cross-table constraint using an empty indexed view:
Tables
CREATE TABLE dbo.TableA
(
id integer NOT NULL PRIMARY KEY,
[type] tinyint NOT NULL
CHECK ([type] IN (0, 1, 2, 3))
);
CREATE TABLE dbo.TableB
(
id integer NOT NULL PRIMARY KEY,
tableAId integer NOT NULL
FOREIGN KEY
REFERENCES dbo.TableA
);
The 'constraint view'
-- This view is always empty (limited to error rows)
CREATE VIEW dbo.TableATableBConstraint
WITH SCHEMABINDING AS
SELECT
Error =
CASE
-- Error condition: type = 3 and rows join
WHEN TA.[type] = 3 AND TB.tableAId = TA.id
-- For a more informative error
THEN CONVERT(bit, 'TableB cannot reference type 3 rows in TableA.')
ELSE NULL
END
FROM dbo.TableA AS TA
JOIN dbo.TableB AS TB
ON TB.tableAId = TA.id
WHERE
TA.[type] = 3;
GO
CREATE UNIQUE CLUSTERED INDEX cuq
ON dbo.TableATableBConstraint (Error);
Online demo:
-- All succeed
INSERT dbo.TableA (id, [type]) VALUES (1, 1);
INSERT dbo.TableA (id, [type]) VALUES (2, 2);
INSERT dbo.TableA (id, [type]) VALUES (3, 3);
INSERT dbo.TableB
(id, tableAId)
VALUES
(1, 1),
(2, 2);
-- Fails
INSERT dbo.TableB (id, tableAId) VALUES (3, 3);
-- Fails
UPDATE dbo.TableA SET [type] = 3 WHERE id = 1;
This is similar in concept to the linked answer to Check constraints that ensures the values in a column of tableA is less the values in a column of tableB, but this solution is self-contained (does not require a separate table with more than one row at all times). It also produces a more informational error message, for example:
Msg 245, Level 16, State 1
Conversion failed when converting the varchar value 'TableB cannot reference type 3 rows in TableA.' to data type bit.
Important notes
The error condition must be completely specified in the CASE expression to ensure correct operation in all cases. Do not be tempted to omit conditions implied by the rest of the statement. In this example, it would be an error to omit the join condition between TableB and TableA from the CASE expression, even though it is implied by the join.
The SQL Server query optimizer is free to reorder predicates, and makes no general guarantees about the timing or number of evaluations of scalar expressions. In particular, scalar computations can be deferred.
Completely specifying the error condition(s) within a CASE expression ensures the complete set of tests is evaluated together, and no earlier than correctness requires. From an execution plan perspective, this means the Compute Scalar associated with the CASE tests will appear on the indexed view delta maintenance branch:
The light shaded area highlights the indexed view maintenance region; the Compute Scalar containing the CASE expression is dark-shaded.
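For comparison, a purely declarative alternative (not part of the indexed-view technique above) is to copy the type into the referencing table and let a composite foreign key plus a CHECK constraint do the work. The following is a minimal sketch under stated assumptions: the table names dbo.TableA2 and dbo.TableB2, the column tableAType, and the constraint names are all hypothetical, and the design costs one extra column in the referencing table.

```sql
-- Hypothetical variant of TableA: (id, type) is made referenceable
CREATE TABLE dbo.TableA2
(
    id integer NOT NULL PRIMARY KEY,
    [type] tinyint NOT NULL
        CHECK ([type] IN (0, 1, 2, 3)),
    CONSTRAINT UQ_TableA2_id_type UNIQUE (id, [type])
);

-- Hypothetical variant of TableB: carries a copy of the parent type
CREATE TABLE dbo.TableB2
(
    id integer NOT NULL PRIMARY KEY,
    tableAId integer NOT NULL,
    tableAType tinyint NOT NULL
        CHECK (tableAType IN (0, 1, 2)),   -- type 3 parents are not allowed
    CONSTRAINT FK_TableB2_TableA2
        FOREIGN KEY (tableAId, tableAType)
        REFERENCES dbo.TableA2 (id, [type])
        ON UPDATE CASCADE                  -- keep the copied type in sync
);

INSERT dbo.TableA2 (id, [type]) VALUES (1, 1), (3, 3);

-- Succeeds: parent 1 has type 1
INSERT dbo.TableB2 (id, tableAId, tableAType) VALUES (1, 1, 1);

-- Fails: the CHECK constraint rejects tableAType = 3
INSERT dbo.TableB2 (id, tableAId, tableAType) VALUES (2, 3, 3);

-- Fails: no parent row (3, 0) exists, so the foreign key is violated
INSERT dbo.TableB2 (id, tableAId, tableAType) VALUES (3, 3, 0);

-- Expected to fail: the cascaded update would violate the CHECK on TableB2
UPDATE dbo.TableA2 SET [type] = 3 WHERE id = 1;
```

The trade-off is a redundant column in exchange for constraints that do not depend on indexed-view maintenance plans.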
Yesterday a user suddenly reported that they could no longer retrieve some data because the following error appeared: Msg 2628, Level 16, State 1, Line 57 String or binary data would be truncated in table 'tempdb.dbo.#BC6D141E', column 'string_2'. Truncated value: '!012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678'.
I was unable to create a repro without our tables. This is the closest I can get:
-- Create table variable for results
DECLARE @results TABLE (
string_1 nvarchar(100) NOT NULL,
string_2 nvarchar(100) NOT NULL
);
CREATE TABLE #table (
T_ID BIGINT NULL,
T_STRING NVARCHAR(1000) NOT NULL
);
INSERT INTO #table VALUES
(NULL, '0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'),
(NULL, '!0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789!');
WITH abc AS
(
SELECT
'' AS STRING_1,
T.T_STRING AS STRING_2
FROM
UT
INNER JOIN UTT ON UTT.UT_ID = UT.UT_ID
INNER JOIN MV ON MV.UTT_ID = UTT.UTT_ID
INNER JOIN OT ON OT.OT_ID = MV.OT_ID
INNER JOIN #table AS T ON T.T_ID = OT.T_ID -- this will never get hit because T_ID of #table is NULL
)
INSERT INTO @results
SELECT STRING_1, STRING_2 FROM abc
ORDER BY LEN(STRING_2) DESC;
DROP TABLE #table;
As you can see, the join to #table cannot yield any rows because every T_ID is NULL. Nevertheless, I get the error mentioned above, even though the result set is empty.
The error would be understandable if the result set contained a text longer than 100 characters, but it does not: it is empty. If I remove the INSERT and just display the results, no text longer than 100 characters appears. The ORDER BY was only used to find the offending text value (with the original data).
When I use SELECT STRING_1, LEFT(STRING_2, 100) FROM abc, the statement works, but the result does not contain the text that was supposedly going to be truncated either.
Therefore: What am I missing? Is it a bug of SQL Server?
-- this will never get hit is a bad assumption. It is well known and documented that SQL Server may try to evaluate parts of your query before it's obvious that the result is impossible.
A much simpler repro (from this post and this db<>fiddle):
CREATE TABLE dbo.t1(id int NOT NULL, s varchar(5) NOT NULL);
CREATE TABLE dbo.t2(id int NOT NULL);
INSERT dbo.t1 (id, s) VALUES (1, 'l=3'), (2, 'len=5'), (3, 'l=3');
INSERT dbo.t2 (id) VALUES (1), (3), (4), (5);
GO
DECLARE @t table(dest varchar(3) NOT NULL);
INSERT @t(dest) SELECT t1.s
FROM dbo.t1
INNER JOIN dbo.t2 ON t1.id = t2.id;
Result:
Msg 2628, Level 16, State 1
String or binary data would be truncated in table 'tempdb.dbo.#AC65D70E', column 'dest'. Truncated value: 'len'.
While we should have only retrieved rows with values that fit in the destination column (id is 1 or 3, since those are the only two rows that match the join criteria), the error message indicates that the row where id is 2 was also returned, even though we know it couldn't possibly have been.
Here's the estimated plan:
This shows that SQL Server expected to convert all of the values in t1 before the filter eliminated the longer ones. And it's very difficult to predict or control when SQL Server will process your query in an order you don't expect - you can try with query hints that attempt to either force order or to stay away from hash joins but those can cause other, more severe problems later.
The best fix is to size the temp table to match the source (in other words, make it large enough to fit any value from the source). The blog post and db<>fiddle explain some other ways to work around the issue, but declaring columns to be wide enough is the simplest and least intrusive.
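As an illustration of that fix, here is a minimal sketch against the simplified repro. It assumes dbo.t1 and dbo.t2 exist and are populated as shown earlier; the only change is that the destination column is declared as wide as the source column t1.s (varchar(5)).

```sql
-- Destination sized to match the source column dbo.t1.s (varchar(5))
DECLARE @t table (dest varchar(5) NOT NULL);

-- Even if SQL Server converts every t1.s value before the join
-- eliminates rows, no value can exceed the destination width
INSERT @t (dest)
SELECT t1.s
FROM dbo.t1
INNER JOIN dbo.t2 ON t1.id = t2.id;

-- Only the rows that satisfy the join (id 1 and 3) are stored
SELECT dest FROM @t;
```

With the column wide enough for any source value, the order in which the conversion and the join filter run no longer matters.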
Say I'm trying to return some results where a column in a table matches a condition I set. But I only want to return the first result from a list of possible values in the condition. Is there a quick and easy way to do that? I'm thinking that I can use coalesce somehow, but not sure how I can structure it.
Something like:
select identifier,purpose from table
where identifier = 'letters'
and purpose = coalesce('A','B','C')
group by purpose
So if the A purpose isn't in the table, I only want the B purpose to show up. If that isn't there either, I want C to show up. If none of them are there, I would ideally like a null or no results to be returned. I'd rather not write several case statements where, if A is null, I look to B, and if B is null, I look to C. Is there a quick way to do this syntactically?
Edit: I also want this to work if I have multiple identifiers I list, such as:
select identifier,purpose from table
where identifier in ('letters1', 'letters2')
and purpose = coalesce('A','B','C')
group by purpose
where I return two results if they exist - one purpose for each identifier, with the purpose in the order of importance for A first, then B, then C, or null if none exist.
Unfortunately my reasoning for coalesce above doesn't work: none of the values are null, so the query will just try to return all purposes of 'A' without the fallback I intend. I want to avoid using temp tables if possible.
Sybase ASE does not have support for the row_number() function (else this would be fairly simple), so one option would be to use a #temp table to simulate (to some extent) row_number() functionality.
Some sample data:
create table mytab
(identifier varchar(30)
,purpose varchar(30)
)
go
insert mytab values ('letters1','A')
insert mytab values ('letters1','B')
insert mytab values ('letters1','C')
insert mytab values ('letters2','A')
insert mytab values ('letters2','B')
insert mytab values ('letters2','C')
go
The #temp table is created with an identity column plus a 2nd column to hold the items you wish to prioritize; priority is determined by the order in which the rows are inserted into the #temp table.
create table #priority
(id smallint identity
,purpose varchar(30))
go
insert #priority (purpose)
select 'A' -- first priority
union all
select 'B' -- second priority
union all
select 'C' -- last priority
go
select * from #priority order by id
go
id purpose
------ -------
1 A
2 B
3 C
We'll use a derived table to find the highest priority purpose (ie, minimal id value). We then join this minimal id back to #priority to generate the final result set:
select dt.identifier,
p.purpose
from (-- join mytab with #priority, keeping only the minimal priority id of the rows that exist:
select m.identifier,
min(p.id) as min_id
from mytab m
join #priority p
on p.purpose = m.purpose
group by m.identifier) dt
-- join back to #priority to convert min(id) into the actual purpose:
join #priority p
on p.id = dt.min_id
order by 1
go
Some test runs with different sets of mytab data:
/* contents of mytab:
insert mytab values ('letters1','A')
insert mytab values ('letters1','B')
insert mytab values ('letters1','C')
insert mytab values ('letters2','A')
insert mytab values ('letters2','B')
insert mytab values ('letters2','C')
*/
identifier purpose
---------- -------
letters1 A
letters2 A
/* contents of mytab:
--insert mytab values ('letters1','A')
--insert mytab values ('letters1','B')
insert mytab values ('letters1','C')
--insert mytab values ('letters2','A')
insert mytab values ('letters2','B')
insert mytab values ('letters2','C')
*/
identifier purpose
---------- -------
letters1 C
letters2 B
Returning NULL if a row does not exist is not going to be easy since generating a NULL requires existence of a row ... somewhere ... with which to associate the NULL.
One idea would be to expand on the #temp table idea by creating another #temp table (eg, #identifiers) with the list of desired identifier values you wish to search on. You could then make use of a left (outer) join from #identifiers to mytab to ensure you always generate a result record for each identifier.
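A minimal sketch of that idea follows, assuming the mytab and #priority tables from above. The #identifiers table and the 'letters3' value (which has no rows in mytab) are hypothetical and only there to show the NULL case; it also assumes an ASE version that accepts ANSI outer joins against derived tables.

```sql
create table #identifiers
(identifier varchar(30))
go

insert #identifiers values ('letters1')
insert #identifiers values ('letters2')
insert #identifiers values ('letters3')   -- hypothetical: no rows in mytab
go

select i.identifier,
       p.purpose
from   #identifiers i
-- highest priority (minimal id) found in mytab for each identifier:
left join (select m.identifier,
                  min(p.id) as min_id
           from   mytab m
           join   #priority p
           on     p.purpose = m.purpose
           group by m.identifier) dt
on     dt.identifier = i.identifier
-- convert min_id back into the purpose; stays NULL when nothing matched:
left join #priority p
on     p.id = dt.min_id
order by 1
go
```

Each identifier listed in #identifiers yields exactly one row, and purpose is NULL for identifiers that have none of the prioritized purposes.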
Suppose B and C are both subclasses and A is the superclass. B and C cannot have the same id (they are disjoint).
CREATE TABLE a(id integer primary key);
CREATE TABLE b(id integer references a(id));
CREATE TABLE c(id integer references a(id));
insert into a values('1');
insert into a values('2');
insert into b values('1');
insert into c values('2');
Could I use a trigger to prevent the same id appearing in tables B and C?
"b and c can not have same id"
So you want to enforce a mutually exclusive relationship. In data modelling this is called an arc. Find out more.
We can implement an arc between tables without triggers by using a type column to distinguish the sub-types like this:
create table a (
id integer primary key
, type varchar2(3) not null check (type in ( 'B', 'C'))
, constraint a_uk unique (id, type)
);
create table b (
id integer
, type varchar2(3) not null check (type = 'B')
, constraint b_a_fk foreign key (id, type) references a (id, type)
);
create table c (
id integer
, type varchar2(3) not null check (type = 'C')
, constraint c_a_fk foreign key (id, type) references a (id, type)
);
The super-type table has a unique key in addition to its primary key; this provides a reference point for foreign keys on the sub-type tables. We still keep the primary key to ensure uniqueness of id.
The sub-type tables have a redundant instance of the type column, redundant because it contains a fixed value. But this is necessary to reference the two columns of the compound unique key (and not the primary key, as is more usual).
This combination of keys ensures that if the super-type table has a record id=1, type='B' there can be no record in sub-type table C where id=1.
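To see the arc in action, here is a short sketch of the expected behaviour. It assumes the three tables a, b and c have been created as described above, with the second sub-type table named c; the sample values are illustrative only.

```sql
-- Register id 1 as a 'B' in the super-type table
insert into a (id, type) values (1, 'B');

-- Succeeds: (1, 'B') exists in a
insert into b (id, type) values (1, 'B');

-- Fails with a foreign key violation: there is no (1, 'C') in a,
-- and a's primary key prevents adding a second row with id = 1
insert into c (id, type) values (1, 'C');

-- Fails with a check constraint violation: c only accepts type 'C'
insert into c (id, type) values (1, 'B');
```

Either way the row is rejected, so an id registered as a 'B' can never appear in c.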
Design-wise this is not good, but it can be done with the snippet below. You can create a similar trigger on table b.
CREATE TABLE a(id integer primary key);
CREATE TABLE b(id integer references a(id));
CREATE TABLE c(id integer references a(id));
create or replace trigger table_c_trigger before insert on c for each row
declare
counter number:=0;
begin
select count(*) into counter from b where id=:new.id;
if counter<>0 then
raise_application_error(-20001, 'values cant overlap between c and b');
end if;
end;
insert into a values('1');
insert into a values('2');
insert into b values('1');
insert into b values('2');
insert into c values('2');
You can use Oracle Sequence:
CREATE SEQUENCE multi_table_seq;
INSERT INTO A VALUES (1);
INSERT INTO A VALUES (2);
INSERT INTO B VALUES (multi_table_seq.NEXTVAL); -- Will insert 1 in table B
INSERT INTO C VALUES (multi_table_seq.NEXTVAL); -- Will insert 2 in table C
...
With trigger:
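-- Table B
CREATE OR REPLACE TRIGGER TRG_BEFORE_INSERT_B -- Trigger name
BEFORE INSERT -- When the trigger fires
ON B -- Table name
FOR EACH ROW -- Row-level, so :NEW is available
DECLARE
v_id NUMBER;
v_found NUMBER;
BEGIN
v_id := multi_table_seq.NEXTVAL;
:NEW.id := v_id; -- Use the sequence value as the id of the new row
BEGIN
-- Raises NO_DATA_FOUND when the id is not present in C
SELECT 1 INTO v_found FROM C WHERE id = v_id AND ROWNUM = 1;
RAISE_APPLICATION_ERROR(-20010, v_id || ' already exists in table C');
EXCEPTION
WHEN NO_DATA_FOUND THEN NULL; -- Do nothing if not found
END;
END;
/
Create the same trigger on table C, checking whether the id already exists in table B.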
My query :
INSERT into PriceListRows (PriceListChapterId,[No])
SELECT TOP 250 100943 ,N'2'
FROM #AnyTable
This query works fine and the following exception raises as desired:
The INSERT statement conflicted with the CHECK constraint
"CK_PriceListRows_RowNo_Is_Not_Unqiue_In_PriceList". The conflict
occurred in database "TadkarWeb", table "dbo.PriceListRows".
But after changing SELECT TOP 250 to SELECT TOP 251 (yes, just changing 250 to 251!), the query runs successfully without any check constraint exception!
Why this odd behavior?
NOTES :
My check constraint is a function which checks a kind of uniqueness. It queries about 4 tables.
I checked on both SQL Server 2012 SP2 and SQL Server 2014 SP1
** EDIT 1 **
Check constraint function:
ALTER FUNCTION [dbo].[CheckPriceListRows_UniqueNo] (
@rowNo nvarchar(50),
@rowId int,
@priceListChapterId int,
@projectId int)
RETURNS bit
AS
BEGIN
IF EXISTS (SELECT 1
FROM RowInfsView
WHERE PriceListId = (SELECT PriceListId
FROM ChapterInfoView
WHERE Id = @priceListChapterId)
AND (@rowId IS NULL OR Id <> @rowId)
AND No = @rowNo
AND (@projectId IS NULL OR
(ProjectId IS NULL OR ProjectId = @projectId)))
RETURN 0 -- Error
--It is ok!
RETURN 1
END
** EDIT 2 **
Check constraint code (what SQL Server 2012 produces):
ALTER TABLE [dbo].[PriceListRows] WITH NOCHECK ADD CONSTRAINT [CK_PriceListRows_RowNo_Is_Not_Unqiue_In_PriceList] CHECK (([dbo].[tfn_CheckPriceListRows_UniqueNo]([No],[Id],[PriceListChapterId],[ProjectId])=(1)))
GO
ALTER TABLE [dbo].[PriceListRows] CHECK CONSTRAINT [CK_PriceListRows_RowNo_Is_Not_Unqiue_In_PriceList]
GO
** EDIT 3 **
Execution plans are here : https://www.dropbox.com/s/as2r92xr14cfq5i/execution%20plans.zip?dl=0
** EDIT 4 **
RowInfsView definition is :
SELECT dbo.PriceListRows.Id, dbo.PriceListRows.No, dbo.PriceListRows.Title, dbo.PriceListRows.UnitCode, dbo.PriceListRows.UnitPrice, dbo.PriceListRows.RowStateCode, dbo.PriceListRows.PriceListChapterId,
dbo.PriceListChapters.Title AS PriceListChapterTitle, dbo.PriceListChapters.No AS PriceListChapterNo, dbo.PriceListChapters.PriceListCategoryId, dbo.PriceListCategories.No AS PriceListCategoryNo,
dbo.PriceListCategories.Title AS PriceListCategoryTitle, dbo.PriceListCategories.PriceListClassId, dbo.PriceListClasses.No AS PriceListClassNo, dbo.PriceListClasses.Title AS PriceListClassTitle,
dbo.PriceListClasses.PriceListId, dbo.PriceLists.Title AS PriceListTitle, dbo.PriceLists.Year, dbo.PriceListRows.ProjectId, dbo.PriceListRows.IsTemplate
FROM dbo.PriceListRows INNER JOIN
dbo.PriceListChapters ON dbo.PriceListRows.PriceListChapterId = dbo.PriceListChapters.Id INNER JOIN
dbo.PriceListCategories ON dbo.PriceListChapters.PriceListCategoryId = dbo.PriceListCategories.Id INNER JOIN
dbo.PriceListClasses ON dbo.PriceListCategories.PriceListClassId = dbo.PriceListClasses.Id INNER JOIN
dbo.PriceLists ON dbo.PriceListClasses.PriceListId = dbo.PriceLists.Id
The explanation is that your execution plan is using a "wide" (index by index) update plan.
The rows are inserted into the clustered index at step 1 in the plan. And the check constraints are validated for each row at step 2.
No rows are inserted into the non clustered indexes until all rows have been inserted into the clustered index.
This is because there are two blocking operators between the clustered index insert / constraints checking and the non clustered index inserts. The eager spool (step 3) and the sort (step 4). Both of these produce no output rows until they have consumed all input rows.
The plan for the scalar UDF uses the non clustered index to try and find matching rows.
At the point the check constraint runs no rows have yet been inserted into the non clustered index so this check comes up empty.
When you insert fewer rows you get a "narrow" (row by row) update plan and avoid the problem.
My advice is to avoid this kind of validation in check constraints. It is difficult to be sure that the code will work correctly in all circumstances (such as different execution plans and isolation levels), and scalar UDFs in check constraints additionally block parallelism in queries against the table. Try to do it declaratively (a unique constraint that needs to join onto other tables can often be achieved with an indexed view).
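As a rough sketch of the declarative route, an indexed view can enforce uniqueness of No within a price list. The view and index names below are hypothetical, the column names are taken from the RowInfsView definition in the question, and the sketch deliberately ignores the ProjectId rule in the original function, so it is stricter than the function and would need adapting.

```sql
-- Hypothetical view: one row per price list row, exposing the owning price list
CREATE VIEW dbo.PriceListRowNoUnique
WITH SCHEMABINDING AS
SELECT
    cl.PriceListId,
    r.[No]
FROM dbo.PriceListRows AS r
JOIN dbo.PriceListChapters AS ch
    ON ch.Id = r.PriceListChapterId
JOIN dbo.PriceListCategories AS ca
    ON ca.Id = ch.PriceListCategoryId
JOIN dbo.PriceListClasses AS cl
    ON cl.Id = ca.PriceListClassId;
GO

-- The unique index is what enforces the rule: a second row with the same
-- No in the same price list makes the index maintenance fail
CREATE UNIQUE CLUSTERED INDEX cuq
ON dbo.PriceListRowNoUnique (PriceListId, [No]);
```

Because the rule is enforced by index maintenance rather than by a function called per row, it does not depend on whether the insert gets a narrow or a wide update plan.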
A simplified repro is
CREATE FUNCTION dbo.F(@Z INT)
RETURNS BIT
AS
BEGIN
RETURN CASE WHEN EXISTS (SELECT * FROM dbo.T1 WHERE Z = @Z) THEN 0 ELSE 1 END
END
GO
CREATE TABLE dbo.T1
(
ID INT IDENTITY PRIMARY KEY,
X INT,
Y CHAR(8000) DEFAULT '',
Z INT,
CHECK (dbo.F(Z) = 1),
CONSTRAINT IX_X UNIQUE (X, ID),
CONSTRAINT IX_Z UNIQUE (Z, ID)
)
--Fails with check constraint error
INSERT INTO dbo.T1 (Z)
SELECT TOP (10) 1 FROM master..spt_values;
/*I get a wide update plan for TOP (2000) but this may not be reliable
across instances so using trace flag 8790 to get a wide plan. */
INSERT INTO dbo.T1 (Z)
SELECT TOP (10) 2 FROM master..spt_values
OPTION (QUERYTRACEON 8790);
GO
/*Confirm only the second insert succeeded (Z=2)*/
SELECT * FROM dbo.T1;
DROP TABLE dbo.T1;
DROP FUNCTION dbo.F;
It's possible that you are encountering an incorrect optimization of a query, but without having the data in all the involved tables, we cannot reproduce the bug.
However, for this kind of check, I recommend using triggers instead of check constraints based on functions. In a trigger, you can use a SELECT statement to debug why it's not working as expected. For example:
CREATE TRIGGER trg_PriceListRows_CheckUnicity ON PriceListRows
FOR INSERT, UPDATE
AS
IF @@ROWCOUNT>0 BEGIN
/*
SELECT * FROM inserted i
INNER JOIN RowInfsView r
ON r.PriceListId = (
SELECT c.PriceListId
FROM ChapterInfoView c
WHERE c.Id = i.priceListChapterId
)
AND r.Id <> i.Id
AND r.No = i.No
AND (r.ProjectId=i.ProjectId OR r.ProjectId IS NULL AND i.ProjectId IS NULL)
*/
IF EXISTS (
SELECT * FROM inserted i
WHERE EXISTS (
SELECT * FROM RowInfsView r
WHERE r.PriceListId = (
SELECT c.PriceListId
FROM ChapterInfoView c
WHERE c.Id = i.priceListChapterId
)
AND r.Id <> i.Id
AND r.No = i.No
AND (r.ProjectId=i.ProjectId OR r.ProjectId IS NULL AND i.ProjectId IS NULL)
)
) BEGIN
RAISERROR ('Duplicate rows!',16,1)
ROLLBACK
RETURN
END
END
This way, you can see what is being checked and correct your views and/or existing data.
I have a strange problem: when performing an aggregate function on a type-cast nvarchar column, I receive "Msg 8114, Level 16, State 5, Line 1. Error converting data type nvarchar to bigint." The query's WHERE clause should filter out the non-numeric values.
Table structure is similar to this:
IF EXISTS (SELECT * FROM sys.all_objects ao WHERE ao.name = 'Identifier' AND ao.type = 'U') BEGIN DROP TABLE Identifier END
IF EXISTS (SELECT * FROM sys.all_objects ao WHERE ao.name = 'IdentifierType' AND ao.type = 'U') BEGIN DROP TABLE IdentifierType END
CREATE TABLE IdentifierType
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[Style] [int] NULL,
CONSTRAINT [PK_IdentifierType_ID] PRIMARY KEY CLUSTERED ([ID] ASC)
) ON [PRIMARY]
CREATE TABLE Identifier
(
[ID] [int] IDENTITY(1,1) NOT NULL,
[IdentifierTypeID] [int] NOT NULL,
[Value] [nvarchar](4000) NOT NULL,
CONSTRAINT [PK_Identifier_ID] PRIMARY KEY CLUSTERED ([ID] ASC)
) ON [PRIMARY]
ALTER TABLE Identifier WITH CHECK ADD CONSTRAINT [FK_Identifier_IdentifierTypeID] FOREIGN KEY([IdentifierTypeID]) REFERENCES IdentifierType ([ID])
GO
Identifier.Value is an NVARCHAR column; it can and does contain non-numeric data. Filtering the query to IdentifierType.Style = 0 should mean that 'Value' only returns string representations of integers. The query below fails with "Msg 8114, Level 16, State 5, Line 1. Error converting data type nvarchar to bigint."
SELECT
MAX(CAST(Value AS BIGINT))
FROM
Identifier i,
IdentifierType it
WHERE
i.IdentifierTypeID = it.ID AND
it.Style = 0
If I extend the WHERE clause to include 'AND ISNUMERIC(i.Value) = 1', it returns the maximum integer value. To me, that implies there is a non-numeric string in my result set. Yet I get no rows returned from this:
SELECT
*
FROM
Identifier i,
IdentifierType it
WHERE
i.IdentifierTypeID = it.ID AND
it.Style = 0 AND
ISNUMERIC(i.Value) <> 1
I've been unable to identify the row(s) that are tripping the type cast. The above query should have exposed the exceptional rows. In addition, there are no empty or extremely long strings either (the longest string is 6 characters long).
Is it possible that MSSQL is attempting to do the CAST on all rows rather than filtering via the WHERE clause first?
Or has anyone else seen anything similar?
There is a second workaround: materializing the filtered rows into a separate table and then selecting the MAX value from that.
SELECT
Value
INTO
IdentifierClone
FROM
Identifier i,
IdentifierType it
WHERE
i.IdentifierTypeID = it.ID AND
it.Style = 0
SELECT MAX(CAST(Value as BIGINT)) FROM IdentifierClone
A subquery doesn't work however.
Any help or thoughts would be appreciated.
Try using a pattern match (LIKE with a character class) to find the problem record. Here's an example where ISNUMERIC does not detect the problem but the pattern does:
CREATE TABLE tst (value nvarchar(4000))
INSERT INTO tst SELECT N'£'
-- Record found: the value contains a character that is not a digit
SELECT * FROM tst WHERE value LIKE '%[^0-9]%'
-- No record found: ISNUMERIC treats currency symbols as numeric
SELECT * FROM tst WHERE ISNUMERIC(value) <> 1
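Since SQL Server may evaluate the CAST before the WHERE clause has removed the non-numeric rows, one defensive option is to guard the conversion itself instead of relying on the filter. The sketch below uses the Identifier and IdentifierType tables from the question; it assumes the stored integers are unsigned digit strings, and the TRY_CAST variant assumes SQL Server 2012 or later.

```sql
-- Guard the conversion inside the aggregate: CAST only runs for values
-- made up entirely of digits, whichever rows the plan touches first
SELECT
    MAX(CASE
            WHEN i.Value <> N'' AND i.Value NOT LIKE N'%[^0-9]%'
            THEN CAST(i.Value AS bigint)
        END)
FROM Identifier AS i
JOIN IdentifierType AS it
    ON it.ID = i.IdentifierTypeID
WHERE it.Style = 0;

-- Alternative on SQL Server 2012 or later: TRY_CAST returns NULL
-- instead of raising an error, and MAX ignores NULLs
SELECT
    MAX(TRY_CAST(i.Value AS bigint))
FROM Identifier AS i
JOIN IdentifierType AS it
    ON it.ID = i.IdentifierTypeID
WHERE it.Style = 0;
```

Both forms return the same maximum as the original query would on clean data, but they no longer depend on the join filter being applied before the conversion.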