How to use a recursive CTE in a check constraint?

I'm trying to create a check constraint on a table so that ParentID is never a descendant of the current record.
For instance, I have a table Categories with the fields ID, Name, and ParentID.
I have the following CTE:
WITH Children AS (
    SELECT ID AS AncestorID, ID, ParentID AS NextAncestorID
    FROM Categories
    UNION ALL
    SELECT Categories.ID, Children.ID, Categories.ParentID
    FROM Categories
    JOIN Children ON Categories.ID = Children.NextAncestorID
)
SELECT ID FROM Children WHERE AncestorID = 99
The results here are correct, but when I try to add it as a constraint to the table like this:
ALTER TABLE dbo.Categories ADD CONSTRAINT CK_Categories CHECK (ParentID NOT IN (
    WITH Children AS (
        SELECT ID AS AncestorID, ID, ParentID AS NextAncestorID
        FROM Categories
        UNION ALL
        SELECT Categories.ID, Children.ID, Categories.ParentID
        FROM Categories
        JOIN Children ON Categories.ID = Children.NextAncestorID
    )
    SELECT ID FROM Children WHERE AncestorID = ID))
I get the following error:
Incorrect syntax near the keyword 'with'. If this statement is a
common table expression, an xmlnamespaces clause or a change tracking
context clause, the previous statement must be terminated with a
semicolon.
Adding a semicolon before the WITH didn't help.
What would be the correct way to do this?
Thanks!

Per the SQL Server documentation on column constraints:
CHECK
Is a constraint that enforces domain integrity by limiting the possible values that can be entered into a column or columns.
logical_expression
Is a logical expression used in a CHECK constraint and returns TRUE or FALSE. logical_expression used with CHECK constraints cannot reference another table but can reference other columns in the same table for the same row. The expression cannot reference an alias data type.
(The above was quoted from the SQL Server 2017 version of the documentation, but the general principle applies to all previous versions as well, and you didn't state what version you are working with.)
The important part here is "cannot reference another table but can reference other columns in the same table for the same row". A CTE is a query, not a scalar expression over the current row, so it counts as referencing another table; SQL Server does not accept subqueries inside a CHECK constraint at all.
As such, you can't use a query such as a CTE in a CHECK constraint.
Like Saman suggested, if you want to check against existing data in other rows, and it must be in the DB layer, you could do it as a trigger.
However, triggers have their own drawbacks (e.g. issues with discoverability, behavior that is unexpected by those who are unaware of the trigger's presence).
As Sami suggested in their comment, another option is a UDF, but that's not without its own issues, potentially with both performance and stability, according to the answers on this question about this approach in SQL Server 2008. Those issues probably still apply to later versions as well.
If possible, I would say it's usually best to move that logic into the application layer. I see from your comment that you already have "client-side" validation. If there is an app server between that client and the database server (such as in a web app), I would suggest putting that additional validation there (in the app server) instead of in the database.
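If the check does have to live in the database, the trigger option mentioned above could look roughly like the following. This is only a sketch under assumptions: ID and ParentID are taken to be integer columns with ParentID NULL for root rows, the existing data is assumed to be free of cycles, and the trigger name TR_Categories_NoCycles is made up for illustration.

```sql
-- Hypothetical trigger: rejects any insert/update that would make a row
-- its own ancestor. Assumes dbo.Categories(ID int, Name, ParentID int NULL).
CREATE TRIGGER dbo.TR_Categories_NoCycles
ON dbo.Categories
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @cycles int;

    WITH Ancestors AS (
        -- Anchor: every affected row together with its direct parent.
        SELECT i.ID AS StartID, i.ParentID AS AncestorID
        FROM inserted AS i
        WHERE i.ParentID IS NOT NULL

        UNION ALL

        -- Recursive step: climb one level, but stop as soon as the walk
        -- has arrived back at the row it started from.
        SELECT a.StartID, c.ParentID
        FROM Ancestors AS a
        JOIN dbo.Categories AS c ON c.ID = a.AncestorID
        WHERE a.AncestorID <> a.StartID
          AND c.ParentID IS NOT NULL
    )
    -- A row whose ancestor chain contains itself is part of a cycle.
    SELECT @cycles = COUNT(*)
    FROM Ancestors
    WHERE AncestorID = StartID;

    IF @cycles > 0
    BEGIN
        RAISERROR('ParentID must not be the row itself or one of its descendants.', 16, 1);
        ROLLBACK TRANSACTION;
        RETURN;
    END
END;
```

Note that the walk is subject to the default recursion limit of 100 levels, so a very deep hierarchy (or pre-existing cyclic data) would make the statement fail with a recursion error rather than with the message above.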


Workaround for reproducible SQL Server bug when updating a CTE

I've encountered an issue with SQL Server when updating through an updatable CTE that combines a view containing a derived column with a table that uses system versioning.
It causes a stack dump and disconnects the session with the error:
Msg 596 Level 21 State 1 Line 0
Cannot continue the execution because the session is in the kill state.
Msg 0 Level 20 State 0 Line 0
A severe error occurred on the current command. The results, if any, should be discarded.
I've spent some time getting to the bottom of the cause and am able to reproduce the error on any version of SQL Server.
My query is quite complex; however, I've boiled it down to the following few requirements:
Create two tables, one will be the target of an update, the other a source of data.
Create a view on the table containing source data.
The view must include a derived column, e.g. select 0 as columnName.
The table to update must have system versioning on.
Define a CTE to select columns from the view and join to the target table.
Update the CTE to set a column in the target table to the value of the derived column in the view.
BOOM
If the derived column in the view is replaced with a physical column, or system versioning is disabled, the update works.
It's reproducible and I can demonstrate it with this simple DB<>Fiddle.
I'm looking to try and find a workaround. My actual situation is using the updatable CTE to select top N rows from the view of a staging table in order to batch-update a target table (avoiding lock escalation) with the staging table containing 500k - 1m+ rows.
Has anyone encountered this or can maybe think of a clever workaround / hack?
Thanks to some help from the comments, @lptr's suggestion to apply some sort of function to the offending columns turned out to be a valid workaround.
In the CTE that selects columns from the view, I wrapped each derived column in an expression, 1 * columnname as columnname, and this made SQL Server happy.
The issue was triggered merely by having these columns in the view, regardless of whether they were used in the update or not.
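For readers who want to picture the shape of that workaround, here is a minimal sketch. The original fiddle is not reproduced here, so every object name (dbo.TargetTable, dbo.vStagingSource, ID, Flag, DerivedFlag) and the batch size are hypothetical; whether this sidesteps the crash rests on the report above.

```sql
-- Hypothetical names throughout; dbo.TargetTable is assumed to be
-- system-versioned and dbo.vStagingSource to expose a derived column
-- such as "0 AS DerivedFlag".
WITH batch AS (
    SELECT TOP (1000)
           t.Flag,
           -- Workaround: wrap the view's derived column in an expression
           -- instead of selecting it directly.
           1 * v.DerivedFlag AS DerivedFlag
    FROM dbo.vStagingSource AS v
    JOIN dbo.TargetTable AS t
      ON t.ID = v.ID
)
UPDATE batch
SET Flag = DerivedFlag;
```

A real batching loop would also need a predicate that excludes rows already processed; that part is omitted here.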

I don't understand this UPDATE query, can you explain?

So here is a small MS SQL Server query:
update tblSaleReturnMain set sync=0
from tblSaleOrderMain s join tblSaleReturnMain r on r.ID=s.intReturnOrderId
where s.sync=0
It updates my "tblSaleReturnMain" table just fine. I wrote this query myself, but I don't know why it works. My question is: with all the joined tables that I could reference after the "from" clause, and all the data they can produce, how does this query know that the tblSaleReturnMain mentioned in "update tblSaleReturnMain .." is the same one that is being filtered in the join? Is it always a kind of protocol that we mention a table before the "set" keyword without giving it an alias, then filter/join its data any way we like, and whatever remains in the result set is what the "set" statement applies to?
My question is specifically for Update statements that have JOINS after the FROM keyword.
Also this question is not about "how to use join in Update statement", because I already did that successfully above.
Yes, as long as you only use the target table once in the FROM clause, SQL Server will assume it is the same table reference. From the docs (emphasis mine):
FROM <table_source>
Specifies that a table, view, or derived table source is used to provide the criteria for the update operation. For more information, see FROM (Transact-SQL).
If the object being updated is the same as the object in the FROM clause and there is only one reference to the object in the FROM clause, an object alias may or may not be specified. If the object being updated appears more than one time in the FROM clause, one, and only one, reference to the object must not specify a table alias. All other references to the object in the FROM clause must include an object alias.
If you reference the same table more than once and try to update it using just the table name rather than the alias, you'll get an error along the lines of:
Msg 8154 Level 16 State 1 Line 2
The table 'tblSaleReturnMain ' is ambiguous.
You can reference the same table more than once, but if doing so you must use the alias as the table_or_view_name, e.g.
UPDATE Alias
SET Col = 1
FROM dbo.T1 AS Alias
INNER JOIN dbo.T1 AS Alias2
ON Alias.ID = Alias2.ID;
Examples on DB<>Fiddle
I personally always use the alias regardless of whether the full table reference would be ambiguous.
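Applied to the query from the question, the alias style would look roughly like this (same tables and columns as in the question; only the UPDATE target changes from the table name to the alias r):

```sql
-- The alias after UPDATE names exactly which FROM-clause reference is modified.
UPDATE r
SET    r.sync = 0
FROM   tblSaleOrderMain AS s
JOIN   tblSaleReturnMain AS r
  ON   r.ID = s.intReturnOrderId
WHERE  s.sync = 0;
```

Written this way, there is no doubt about which table reference receives the update, even if tblSaleReturnMain were later joined a second time.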
SQL Server's UPDATE ... FROM syntax is non-standard and confusing.
Much better to write a CTE, examine the results, and then update the CTE, e.g.:
with q as
(
select r.sync
from tblSaleOrderMain s
join tblSaleReturnMain r
on r.ID=s.intReturnOrderId
where s.sync=0
)
update q set sync = 0

What does "a" do at the end of a sql statement?

I have some code here in the screenshot. At the end of the code you can see an "a".
When I try to remove the "a" and run the code, it fails, but it works with the "a".
What is the significance of this?
Edit: Question was originally tagged MySQL. However, the explanation below should still apply for all the major RDBMS.
It is an alias for the derived table. A derived table is basically a sub-select query. In MySQL, every derived table must have its own alias, so that outer SELECT queries can refer to the columns/expressions from the derived table. Without a table name/alias, MySQL cannot determine the origin of a column value unambiguously.
From Docs:
The [AS] tbl_name clause is mandatory because every table in a FROM
clause must have a name. Any columns in the derived table must have
unique names.
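Since the screenshot from the question is not available, here is a small stand-in example. The Orders table and its CustomerID column are hypothetical; the point is only the trailing alias a on the derived table.

```sql
-- Hypothetical table Orders(CustomerID, ...).
SELECT a.CustomerID, a.OrderCount
FROM (
    SELECT CustomerID, COUNT(*) AS OrderCount
    FROM Orders
    GROUP BY CustomerID
) AS a            -- removing this alias makes the statement fail
WHERE a.OrderCount > 1;
```

The keyword AS is optional, which is why the alias can appear as a bare "a" at the end of the subquery.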

Hierarchical SQL select-query

I'm using MS SQL Server 2008, and I have a table 'Users'. This table has a key field ID of type bigint and a field Parents of type varchar that encodes the whole chain of the user's parent IDs.
For example:
User table:
ID | Parents
1 | null
2 | ..
3 | ..
4 | 3,2,1
Here user 1 has no parents and user 4 has the chain of parents 3->2->1. I created a function that parses a user's Parents field and returns a result table of user IDs of type bigint.
Now I need a query that selects the IDs of some requested users together with the IDs of their parents (the order of users and their parents is not important). I'm not an SQL expert, so all I could come up with is the following:
WITH CTE AS (
    SELECT
        ID,
        Parents
    FROM
        [Users]
    WHERE
        [Users].Name = 'John'
    UNION ALL
    SELECT
        [Users].Id,
        [Users].Parents
    FROM [Users], CTE
    WHERE
        [Users].ID IN (SELECT * FROM GetUserParents(CTE.ID, CTE.Parents))
)
SELECT * FROM CTE
And basically it works, but the performance of this query is very poor. I believe the WHERE .. IN .. expression is the bottleneck. As I understand it, instead of just joining the first subquery of the CTE (the IDs of the found users) with the results of GetUserParents (the IDs of the users' parents), it has to enumerate all users in the Users table and check whether each of them is part of the function's result. Judging by the execution plan, SQL Server also applies a distinct sort to the result to speed up the WHERE .. IN .. check, which is logical in itself but not required for my goal, and this distinct sort takes 70% of the query's execution time. So I wonder how this query could be improved, or whether somebody could suggest a different approach to the problem altogether?
Thanks for any help!
The recursive query in the question looks redundant, since GetUserParents already produces the list of IDs you need. Instead, select from Users and join to the result of GetUserParents():
select Users.*
from Users join
    (select ParentId
     from (select * from Users where Users.Name = 'John') as U
          cross apply GetUserParents(U.ID, U.Parents))
    as gup
  on Users.ID = gup.ParentId
Since GetUserParents expects scalars and select ... where produces a table, we need to apply the function to each row of that table (even if we "know" there's only one). That's what apply does.
I used indentation to emphasize the conceptual parts of the query: (select ...) as gup is the entity that Users is joined with; (select ...) as U cross apply fn() is the argument to FROM.
The key to understanding this query is knowing how cross apply works:
It is part of the FROM clause (somewhat unexpectedly, so its syntax is documented under FROM (Transact-SQL)).
It transforms the table expression to its left, and the result becomes the argument for FROM (I emphasized this with indentation).
The transformation is: for each row, it
runs the table expression to its right (in this case, a call to a table-valued function), using this row
adds to the result set the columns from the row, followed by the columns from the call. (In our case, the table returned from the function has a single column named ParentId.)
So, if the call returns multiple rows, the result contains the same left-hand row repeated once for each row from the function.
This is a cross apply, so rows are only added if the function returns anything. With the other flavor, outer apply, a single row would be added anyway, with NULL in the function's column if it returned nothing.
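To make the difference between the two flavors concrete, here is a small sketch using the question's GetUserParents function. It assumes, as this answer does, that the function returns a single column named ParentId.

```sql
-- CROSS APPLY: John appears once per parent ID returned by the function,
-- and not at all if the function returns no rows.
SELECT u.ID, p.ParentId
FROM Users AS u
CROSS APPLY GetUserParents(u.ID, u.Parents) AS p
WHERE u.Name = 'John';

-- OUTER APPLY: John always appears; ParentId is NULL when the function
-- returns no rows for him.
SELECT u.ID, p.ParentId
FROM Users AS u
OUTER APPLY GetUserParents(u.ID, u.Parents) AS p
WHERE u.Name = 'John';
```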
This "parsing" approach violates even 1NF. Make the Parents field contain only the immediate parent (preferably as a foreign key); then an entire subtree can be retrieved with a recursive query.
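As a rough sketch of that normalized design, suppose Users had a hypothetical ParentID column (bigint, NULL for users without a parent) in place of the Parents string. The requested users and their whole chain of parents could then be read with a recursive CTE:

```sql
-- Assumes a hypothetical column Users.ParentID referencing Users.ID.
WITH Chain AS (
    -- Anchor: the requested users.
    SELECT u.ID, u.ParentID
    FROM Users AS u
    WHERE u.Name = 'John'

    UNION ALL

    -- Recursive step: move from each row to its immediate parent.
    SELECT p.ID, p.ParentID
    FROM Users AS p
    JOIN Chain AS c
      ON p.ID = c.ParentID
)
SELECT ID
FROM Chain;
```

Reversing the join condition (p.ParentID = c.ID) walks downwards instead and returns the subtree below the requested users.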

SQL Updatable View with joined tables

I have a view that looks similar to this,
SELECT dbo.Staff.StaffId, dbo.Staff.StaffName, dbo.StaffPreferences.filter_type
FROM dbo.Staff LEFT OUTER JOIN
dbo.StaffPreferences ON dbo.Staff.StaffId = dbo.StaffPreferences.StaffId
I'm trying to update StaffPreferences.filter_type using,
UPDATE vw_Staff SET filter_type=1 WHERE StaffId=25
I have read this in an MSDN article,
Any modifications, including UPDATE, INSERT, and DELETE statements,
must reference columns from only one base table.
Does this mean that I can only update fields in dbo.Staff (which is all I can currently achieve)? In this context, does the definition of 'base table' not extend to any subsequently joined tables?
Your statement should work just fine since you are only modifying column(s) from one table (StaffPreferences).
If you tried to update columns from different tables in the same update statement, you would get an error:
Msg 4405, Level 16, State 1, Line 7
View or function 'v_ViewName' is not updatable because the modification affects multiple base tables.
The rules for updatable join views, quoted here from the Oracle documentation linked below (so they describe Oracle's behavior), are as follows:
General Rule
Any INSERT, UPDATE, or DELETE operation on a join view can modify only
one underlying base table at a time.
UPDATE Rule
All updatable columns of a join view must map to
columns of a key-preserved table. See "Key-Preserved Tables" for a
discussion of key-preserved tables. If the view is defined with the
WITH CHECK OPTION clause, then all join columns and all columns of
repeated tables are non-updatable.
DELETE Rule
Rows from a join view can be deleted as long as there is exactly one
key-preserved table in the join. If the view is defined with the WITH
CHECK OPTION clause and the key preserved table is repeated, then the
rows cannot be deleted from the view.
INSERT Rule
An INSERT statement must not explicitly or
implicitly refer to the columns of a nonkey preserved table. If the
join view is defined with the WITH CHECK OPTION clause, INSERT
statements are not permitted.
http://download.oracle.com/docs/cd/B10501_01/server.920/a96521/views.htm#391
I think you can see some of the problems that might occur if there's a row in Staff with StaffId 25, but no matching row in StaffPreferences. There are various reasonable things you could do: preserve the appearance that this is a table and perform an insert into StaffPreferences, reject the update, and so on.
I think at this point the SQL Server engine will give up, and you'll have to write a trigger that implements the behaviour you want, whatever that may be. You need to consider all of the cases where the join does or does not find a match.
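One possible shape for such a trigger is sketched below. It picks the "insert the missing row" behaviour, which is only one of the options listed above. It assumes the view is dbo.vw_Staff, that StaffPreferences needs no columns other than StaffId and filter_type, and that updates never change StaffId; the trigger name is made up.

```sql
-- Hypothetical INSTEAD OF trigger on the view from the question.
CREATE TRIGGER dbo.TR_vw_Staff_Update
ON dbo.vw_Staff
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Case 1: a preferences row already exists, so update it.
    UPDATE sp
    SET    sp.filter_type = i.filter_type
    FROM   dbo.StaffPreferences AS sp
    JOIN   inserted AS i
      ON   i.StaffId = sp.StaffId;

    -- Case 2: no preferences row yet, so create one.
    INSERT INTO dbo.StaffPreferences (StaffId, filter_type)
    SELECT i.StaffId, i.filter_type
    FROM   inserted AS i
    WHERE  NOT EXISTS (SELECT 1
                       FROM dbo.StaffPreferences AS sp
                       WHERE sp.StaffId = i.StaffId);
END;
```

This sketch silently ignores changes to StaffName; a fuller version would also have to decide how to forward those to dbo.Staff.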
Here is how I solved it.
In my case it was a table, not a view, but I needed to find the schema id that referenced the table in the data construction, in a reference table called, say, our_schema.
I ran the following:
select schemaid from our_schema where name = "MY:Form"
This gave me the id as 778 (example)
Then I looked where this ID was showing up with a prefix of T, B, or H.
In our case we have Table, Base and History tables where the data is stored.
I then ran:
delete from T778
delete from B778
delete from H778
This allowed me to delete the data and bypass that restriction.
