Enable SYSTEM_VERSIONING Error - Overlapping Dates in History Table - sql-server

I recently migrated my SQL 2019 database from a VM into Azure SQL.
I used the MS Data Migration tool, but unfortunately, it wouldn't migrate data from Temporal Tables.
So I just used the tool to create the table schemas and then used SSIS to move the data.
Since my existing history table had data in it, I wanted to keep the SysStartTime and SysEndTime values. In order to do this, I had to disable SYSTEM_VERSIONING in my Azure SQL database as well as DROP the PERIOD on the table.
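For reference, the decoupling was done with statements roughly like these (table names are placeholders, as elsewhere in this question):
ALTER TABLE xxx.xxx SET (SYSTEM_VERSIONING = OFF);
ALTER TABLE xxx.xxx DROP PERIOD FOR SYSTEM_TIME;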
The data migration was a success, so I re-created the PERIOD on the table. But when I try to enable SYSTEM_VERSIONING with a specified history table, I get the following error:
Msg 13573, Level 16, State 0, Line 34
Setting SYSTEM_VERSIONING to ON failed because history table 'xxxxxHistory' contains overlapping records.
I find this odd because the existing tables were originally joined as a temporal table, so I don't understand why there would be a conflict now.
ALTER TABLE xxx.xxx
    ADD PERIOD FOR SYSTEM_TIME (SysStartTime, SysEndTime);

ALTER TABLE xxx.xxx
    SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = xxx.xxxHistory));
I expect the table to become a system-versioned temporal table. Instead, I get the following error:
Msg 13573, Level 16, State 0, Line 34
Setting SYSTEM_VERSIONING to ON failed because history table 'xxxxxHistory' contains overlapping records.
I ran the following query to identify the overlaps but I don't get any:
SELECT
    xxxxKeyNumeric,
    SysStartTime,
    SysEndTime
FROM xxxx.xxxxhistory o
WHERE EXISTS
(
    SELECT 1
    FROM xxxx.xxxxhistory o2
    WHERE o2.xxxxKeyNumeric = o.xxxxKeyNumeric
      AND o2.SysStartTime <= o.SysEndTime
      AND o.SysStartTime <= o2.SysEndTime
      AND o2.xxxxPK != o.xxxxPK
)
ORDER BY
    o.xxxxKeyNumeric,
    o.SysStartTime

I found this explanation for the error on a DBA's blog:
"There are multiple records for the same record with overlapping start and end dates. The end date for the last row in the history table should match the start date for the active record in the parent table."
This happened to me after switching the history table, touching a few rows, then trying to go back to the old history table.
UPDATE: It happened again, and this time the table had millions of rows. I had to write a query comparing the start and end dates of every row in the history table.
Possible causes:
1. For a given PK, the start and end dates of two history rows overlap. The query below finds this specific issue.
2. The end date of the latest history row for a PK is later than the start date of that PK's row in the main table. The query below can be modified to check this; a sketch of that check follows it.
3. Two rows with the same PK cover the same time interval. Even if they overlap by a single millisecond, if someone requests that exact millisecond, SQL Server won't know which of the two versions is the correct one.
For the first issue:
select ant.*, post.*, DATEDIFF(day, ant.end_date, post.start_date)
from
    (SELECT
         PK_column
         , start_date
         , end_date
         , ROW_NUMBER() OVER(PARTITION BY PK_column ORDER BY end_date desc, start_date desc) AS [current]
         , (ROW_NUMBER() OVER(PARTITION BY PK_column ORDER BY end_date desc, start_date desc)) - 1 AS [previous]
     FROM huge_table_HIST
    ) ant
    inner join
    (SELECT
         PK_column
         , start_date
         , end_date
         , ROW_NUMBER() OVER(PARTITION BY PK_column ORDER BY end_date desc, start_date desc) AS [current]
     FROM huge_table_HIST
    ) post
        ON ant.PK_column = post.PK_column AND ant.[previous] = post.[current]
-- post is the next newer version for the same PK; an older version must not end after it starts
WHERE ant.end_date > post.start_date
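For the second issue, a possible variation of the same idea is sketched below. Here huge_table stands for the main temporal table, which is not named above; substitute your own names.
-- huge_table is a hypothetical name for the main (current) table
select m.PK_column, h.max_end_date, m.start_date
from huge_table m
inner join
    (select PK_column, MAX(end_date) as max_end_date
     from huge_table_HIST
     group by PK_column
    ) h
    on h.PK_column = m.PK_column
-- the latest history end date for a PK must not be later than the current row's start date
where h.max_end_date > m.start_date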
Surprisingly, it doesn't fail if:
you have multiple rows with exactly the same start and end dates for the same PK. SQL Server seems to consider them a single point in time instead of an interval. They will only appear if you request the exact millisecond in which they exist.
there are gaps between the end date of a history row and the start date of the next one. SQL Server considers that the PK simply didn't exist in that time interval.

Temporal tables depend on the temporal table's primary key values combined with SysStartTime to determine uniqueness in the history table.
This can very easily happen if you make changes to the primary key definition. Also, if your history table's fields corresponding to the temporal table's PK are not populated, or many / all are populated with a default value, overlaps are detected and you get that error.
Check that your PK is defined on the system versioned temporal table, then check that the corresponding values in your history table's primary key fields are correct (i.e. unique for any given PK & SysStartTime value.)
You may have to update the history table accordingly before applying the system versioning relationship again.
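If the problem turns out to be exact duplicates on the PK plus SysStartTime, one possible clean-up, sketched with hypothetical names (ID as the PK column, dbo.MainTableHistory as the history table), is to keep a single row per pair:
-- ID and dbo.MainTableHistory are hypothetical names; substitute your own
;with dupes as (
    select *,
           ROW_NUMBER() over (partition by ID, SysStartTime order by SysEndTime desc) as rn
    from dbo.MainTableHistory
)
delete from dupes where rn > 1;
Back up the history table before running anything like this.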

This error can also occur when the history table contains multiple records per primary key with the same GENERATED ALWAYS AS ROW START or GENERATED ALWAYS AS ROW END value.
The following queries will help identify those records.
select ID
from dbo.HistoryTable
group by ID, SysStartTime
having count(*) > 1
select ID
from dbo.HistoryTable
group by ID, SysEndTime
having count(*) > 1

Related

SQL Server: Slowly Changing Dimension Type 2 on historical records

I am trying to set up a SCD of Type 2 for historical records within my Customer table. Attached is how the Customer table is set up alongside the expected outcome. Note that the Customer table in practice has 2 million distinct Customer IDs. I tried to use the query below, but the Start_Date and End_Date are repeating for each row.
SELECT t.Customer_ID, t.Lifecyle_ID, t.Date As Start_Date,
LEAD(t.Date) OVER (ORDER BY t.Date) AS End_Date
FROM Customer AS t
I think a three-step query is likely needed.
Use LEAD and LAG, partitioned by Customer and ordered by date, to peek at the next row's values for both Date and Lifecycle.
Use a CASE statement to emit a value for End Date when the current row's Lifecycle <> the next row's lifecycle (otherwise emit NULL). Now do the same using LAG for the Effective Date.
Group By or Distinct on the output from Step #2.
Hopefully that makes sense. I'll try to post a code example later today, but that should be enough to get you started.
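For what it's worth, a rough sketch along those lines: LAG flags lifecycle changes, a running SUM numbers the versions, and LEAD derives each version's end date (table and column names are taken from the question, including the Lifecyle_ID spelling):
;with flagged as (
    select Customer_ID, Lifecyle_ID, [Date],
           -- 1 when this row starts a new lifecycle "version" for the customer
           case when LAG(Lifecyle_ID) over (partition by Customer_ID order by [Date]) = Lifecyle_ID
                then 0 else 1 end as is_new_version
    from Customer
),
versions as (
    select Customer_ID, Lifecyle_ID, [Date],
           -- running count of version starts = version number
           SUM(is_new_version) over (partition by Customer_ID order by [Date]
                                     rows unbounded preceding) as version_no
    from flagged
),
collapsed as (
    select Customer_ID, Lifecyle_ID, MIN([Date]) as Start_Date
    from versions
    group by Customer_ID, Lifecyle_ID, version_no
)
select Customer_ID, Lifecyle_ID, Start_Date,
       -- the next version's start is this version's end; NULL for the current version
       LEAD(Start_Date) over (partition by Customer_ID order by Start_Date) as End_Date
from collapsed
order by Customer_ID, Start_Date;
Each customer should collapse to one row per consecutive run of the same Lifecyle_ID, with End_Date left NULL for the latest run.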

How to get the most recent queries in Oracle DB

I have a web application and I suspect that someone has deleted some records manually. Upon enquiry, nobody is admitting the mistake. How can I find out at what time those records were deleted? Is it possible to get the history of delete queries?
If you have access to the v$sql view, you can use the following query. It contains the time in the FIRST_LOAD_TIME column.
select *
from v$sql v
where upper(sql_text) like '%DELETE%';
If flashback query is enabled for your database (try it with select * from table as of timestamp sysdate - 1), then it may be possible to determine the exact time the records were deleted. Use the as of timestamp clause and adjust the timestamp as necessary to narrow down to a window in which the records still existed and then no longer did.
For example
select *
from table
as of timestamp to_date('21102016 09:00:00', 'DDMMYYYY HH24:MI:SS')
where id = XXX; -- indicates record still exists
select *
from table
as of timestamp to_date('21102016 09:00:10', 'DDMMYYYY HH24:MI:SS')
where id = XXX; -- indicates record does not exist
-- conclusion: record was deleted in this 10 second window

T-SQL Select where Subselect or Default

I have a SELECT that retrieves rows by comparing a DATETIME field to the highest available value in another table.
The two tables have the following structure:
DeletedRecords
- Id (Guid)
- RecordId (Guid)
- TableName (varchar)
- DeletionDate (datetime)
And another table, which keeps track of synchronizations, with the following structure:
SynchronizationLog
- Id (Guid)
- SynchronizationDate (datetime)
In order to get all the RECORDS that have been deleted since the last synchronization, I run the following SELECT:
SELECT
[Id],[RecordId],[TableName],[DeletionDate]
FROM
[DeletedRecords]
WHERE
[TableName] = '[dbo].[Person]'
AND [DeletionDate] >
(SELECT TOP 1 [SynchronizationDate]
FROM [dbo].[SynchronizationLog]
ORDER BY [SynchronizationDate] DESC)
The problem occurs if I do not have any synchronizations yet: the T-SQL SELECT does not return any rows, while it should return all the rows because there are no synchronization records available.
Is there a T-SQL function like COALESCE that I can use with DateTime?
Your subquery should look something like this:
SELECT COALESCE(MAX([SynchronizationDate]), '0001-01-01')
FROM [dbo].[SynchronizationLog]
It says: get the last date, but if there is no record (or all values are NULL), then use '0001-01-01' as the start date.
NOTE: '0001-01-01' is for DATETIME2; if you are using the old DATETIME data type, it should be '1753-01-01'.
Also please note (from https://msdn.microsoft.com/en-us/library/ms187819(v=sql.100).aspx)
Use the time, date, datetime2 and datetimeoffset data types for new work. These types align with the SQL Standard. They are more portable. time, datetime2 and datetimeoffset provide more seconds precision. datetimeoffset provides time zone support for globally deployed applications.
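Plugged into the original statement, the whole query would look roughly like this (using '1753-01-01', since the columns in the question are declared as datetime, per the note above):
SELECT
    [Id], [RecordId], [TableName], [DeletionDate]
FROM
    [DeletedRecords]
WHERE
    [TableName] = '[dbo].[Person]'
    AND [DeletionDate] >
        (SELECT COALESCE(MAX([SynchronizationDate]), '1753-01-01')
         FROM [dbo].[SynchronizationLog])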
EDIT
An alternative solution is to use NOT EXISTS (you will have to test whether its performance is better or not):
SELECT
[Id],[RecordId],[TableName],[DeletionDate]
FROM
[DeletedRecords] DR
WHERE
[TableName] = '[dbo].[Person]'
AND NOT EXISTS (
SELECT 1
FROM [dbo].[SynchronizationLog] SL
WHERE DR.[DeletionDate] <= SL.[SynchronizationDate]
)

MSSql Batch Update of table on Primary Keys

I have migrated an Access DB to MS SQL Server 2008 and found some anomalies in the old database. In both DBs, IDs are auto-incremented and should be in line with the date. But as shown below, some have been saved in the wrong chronological order.
Access:
ID FileID DateOfTransaction SectionID
64490 95900 02/12/1997 100
64491 95900 04/04/1996 80
64492 95900 25/03/1996 90
Desired Correct Format:
ID FileID DateOfTransaction SectionID
64492 95900 02/12/1997 100
64491 95900 04/04/1996 80
64490 95900 25/03/1996 90
The PK (ID) column is linked to several other tables with ON UPDATE CASCADE set.
I need to group by FileID and sort by DateOfTransaction and update IDs accordingly.
I need some suggestions on how best to tackle this as data is quite sensitive. I have about 50K records to update.
Thanks for reading!
Try this query:
with cte as
(
    select * from
    (
        select *,
               ROW_NUMBER() over (partition by FileID order by DateOfTransaction) as row_num
        from t_Transactions
    ) A
    join
    (
        select ID B_ID, FileID B_FileID,
               ROW_NUMBER() over (partition by FileID order by ID) as B_row_num
        from t_Transactions
    ) B
        on A.row_num = B.B_row_num
)
select T.ID [Old_ID], CTE.B_ID [New_ID],
       T.FileID, T.DateOfTransaction, T.SectionID
--update T set T.ID=CTE.B_ID
from t_Transactions T
join cte
    on T.ID = CTE.ID
   and CTE.B_FileID = T.FileID
Before updating, you can run the SELECT and confirm the result.
This query updates the table as per your requirement. You have mentioned that the ID column is linked to several other tables. Please be very careful about this and make sure that updating the ID column doesn't break anything else.
SQL Fiddle Demo
Designing a database to rely on the order of an artificially-generated key to match the date order of another column is a terrible anti-pattern, NOT best practice in the slightest.
Stop relying on it to represent insertion order. That is the answer. If you need that data, it should be another column separate from your PK. Can't you order by date, anyway? If not, create a new column.
It is always a mistake to invest internal database identifiers with meaning of any kind besides relating rows to each other.
I've seen this exact problem before at a former employer--and the database was rife with all sorts of other design problems as well. FK columns were actually named "frnkeyColumnName" to match the "keyColumnName" they pointed to. Never mind a PK that was also an FK...
Stop the madness!
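If you do go the route of a separate ordering column instead of touching the PK, a rough sketch might look like this (t_Transactions is borrowed from the query above; TransactionSequence is a made-up column name):
-- TransactionSequence is a hypothetical new column; the PK stays untouched
ALTER TABLE t_Transactions ADD TransactionSequence int NULL;
GO

;with ordered as (
    select ID,
           ROW_NUMBER() over (partition by FileID order by DateOfTransaction) as seq
    from t_Transactions
)
update t
set TransactionSequence = o.seq
from t_Transactions t
inner join ordered o on o.ID = t.ID;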
I would seriously consider whether you need to do this at all. Is there any logic that depends on higher IDs having a later date? Was the data already out of order in the Access database? In that case, it doesn't matter.
If you do decide to proceed, back up the data first. You're probably going to make mistakes.

Inserting the values with condition

Using SQL Server 2005
When I insert a date, it should be compared with the dates already in the table.
If it is equal to an existing date, it should display an error message, and it should only allow inserting the next date.
For Example
Table1
Date
20091201
20091202
Insert into table1 values('20091202')
The above insert should not be allowed, because the value already exists.
Insert into table1 values('20091204')
The above insert should also not be allowed, because it leaves a gap before the new date.
The insert should only be allowed for the next date.
It should not allow the same date or a date that leaves a gap.
How can I write an insert with this condition?
Is it possible in SQL or VB.Net?
I need help with an SQL query or VB.Net code.
You could use a where clause to ensure that the previous day is present in the table, and the current day is not:
insert into table1 ([dateColumn])
select '20091204'
where exists (
    select * from table1 where [dateColumn] = dateadd(d, -1, '20091204')
)
and not exists (
    select * from table1 where [dateColumn] = '20091204'
)

if @@rowcount <> 1
    raiserror ('Oops', 16, 1)
If the insert succeeds, @@ROWCOUNT will be set to 1. Otherwise, an error is returned to VB using raiserror.
Why not just have a table of dates set up in advance, and update a row once you want to "insert" that date?
I'm not sure I understand the point of inserting a new date only once, and never allowing a gap. Could you describe your business problem in a little more detail?
Of course you could use an IDENTITY column, and then have a computed column or a view that calculates the date from the number of days since (some date). But IDENTITY columns do not guarantee contiguity, nor do they even guarantee uniqueness on their own (unless you set up such a constraint separately).
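Just to illustrate that idea (with the caveats above; a redefined table1 with a made-up base date):
-- DayNumber drives the sequence; the date is derived from an arbitrary base date
CREATE TABLE table1 (
    DayNumber int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [dateColumn] AS DATEADD(day, DayNumber - 1, CONVERT(datetime, '20091201'))
);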
Preventing duplicates should be done at the table level with a unique constraint, not with a query. You can check for duplicates first so that you can handle errors in your own way (rather than let the engine raise an exception for you), but that shouldn't be your only check.
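For example, using the same [dateColumn] placeholder as in the answer above:
ALTER TABLE table1 ADD CONSTRAINT UQ_table1_date UNIQUE ([dateColumn]);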
Sounds like your date field should just be unique with auto-increment.
