Let's say I have this table :
Car
----------------------
Name|Date|Color
The primary key is a combination of Name and Date.
On the update, if the initial Color of the updated row is Blue and the new one is Red, I want to keep a trace of this update.
This is what I did :
ALTER TRIGGER TraceTrigger
ON Car
FOR UPDATE
AS
BEGIN
INSERT INTO TraceTable
SELECT
del.Name,
del.Date,
del.Color,
ins.Name,
ins.Date,
ins.Color
FROM deleted as del
INNER JOIN inserted as ins
ON del.Name = ins.Name AND del.Date = ins.Date
WHERE del.color = 'Blue' AND ins.Color = 'Red'
END
This example is pretty simple. It shows that I need to keep a trace of X old values and X new values from the updated row.
But imagine that the Name can be modified (I know we should not modify a PK, but in this situation it is possible). Given that the primary key can change, the join between the INSERTED and DELETED tables will sometimes just not work.
So, is it possible to keep the relation between the deleted row and the inserted row when the PK can be updated to a different value?
You needn't bother recording both INSERTED and DELETED. Just INSERTED is what I usually do, otherwise you'd end up with 2 of every bit of information. You'll record it when it's inserted, then you'll record the identical data when it's deleted.
Say you've got a table that just has an ID and a Name field, the trace for that recording both INSERTED and DELETED would look like:
OldID OldName NewID NewName
1 Harry 1 Henry
1 Henry 1 James
1 James 1 Thomas
As you can see, you're doubling up data. The left 2 columns are identical to the right columns except shifted up a row.
In terms of the primary key, if you know you might have to change the PK whilst wanting to maintain a history, I'd strongly recommend adding a surrogate key to the table (e.g. ID) that you NEVER change. That way you are free to alter the Name column as you wish.
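As a rough illustration of the surrogate-key idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The Id column and the sample car names are assumptions added for the example. SQLite triggers are row-level and expose OLD and NEW directly, so no join is needed here; in SQL Server the equivalent would be to join inserted and deleted on the unchanging Id column instead of on Name and Date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Car (
    Id    INTEGER PRIMARY KEY,   -- hypothetical surrogate key, never updated
    Name  TEXT NOT NULL,
    Date  TEXT NOT NULL,
    Color TEXT NOT NULL,
    UNIQUE (Name, Date)          -- the natural key stays unique
);
CREATE TABLE TraceTable (
    CarId INTEGER,
    OldName TEXT, OldDate TEXT, OldColor TEXT,
    NewName TEXT, NewDate TEXT, NewColor TEXT
);
-- Row-level trigger: OLD and NEW always describe the same physical row,
-- so the old/new pairing survives a change to Name or Date.
CREATE TRIGGER TraceTrigger AFTER UPDATE ON Car
WHEN OLD.Color = 'Blue' AND NEW.Color = 'Red'
BEGIN
    INSERT INTO TraceTable
    VALUES (OLD.Id, OLD.Name, OLD.Date, OLD.Color,
            NEW.Name, NEW.Date, NEW.Color);
END;
""")

conn.execute("INSERT INTO Car (Name, Date, Color) VALUES ('Civic', '2020-01-01', 'Blue')")
# Change part of the natural key and the color in the same statement.
conn.execute("UPDATE Car SET Name = 'Accord', Color = 'Red' WHERE Name = 'Civic'")

trace = conn.execute("SELECT * FROM TraceTable").fetchall()
```

The trace row keeps both the old and the new Name, tied together by the surrogate Id.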
You never really change a primary key; logically, you actually create a new entity (record / row ). It is, in effect, a completely new thing.
There are a number of ways to keep track of this change, but here are two:
Create a row identifier like an IDENTITY column. It's not really a surrogate key, because a surrogate key should always be 1-1 with the proper natural key. Use this if name + date is not really the primary key and you can't create one (yuck - you have a database design issue).
Update the data in your trace table to match the new value anytime a value in the PK changes. This is the proper solution if your database design is correct. You may be able to implement this with an ON UPDATE CASCADE foreign key constraint.
A very simple example. I have a web API with a table in the database:
Employees
---------
Id
---------
Name
and for example, I have 50 records.
Now I have to implement a feature to add extra info about the department. Because this is a one-to-many relationship, the new database schema has a department ID on the employee:
Employees       Department
------------    ----------
Id              Id
------------    ----------
Name            Name
------------
DepartmentId
For this, I run the query (I use SQL Server):
alter table Employees add constraint fk_employees_departmentid
foreign key (DepartmentId) references Department(Id);
But now I have some issues to handle
1) Now I have the 50 existing records without a DepartmentId. Must I add this value manually? What is the best practice? For 50 records it is possible, but what about 2000 records or more?
2) When I added the DepartmentId column I set it to allow NULL values (is that correct?), but as a foreign key, I don't want to allow NULL values. Can I change it, or how can I handle it?
1) Now I have the 50 existing records without a DepartmentId. Must I add this value manually? What is the best practice? For 50 records it is possible, but what about 2000 records or more?
It depends. You could set up a new department for "unassigned" and assign them all to that; you could send out a spreadsheet to HR saying "the following employees don't have an assigned department; what department are they in? PS: don't remove the EmployeeID column from the sheet before you send it back; I need it to update the DB". It's very much a business context question, not a technical one. X thousand records is easy to handle. It'll just take a bit of time to work through if you (or someone else) are doing it manually. This information is likely to be available somewhere else; you could perhaps send a list out to all department heads saying "are any of these guys yours? Please remove all the names you don't have in your team from this spreadsheet and send it back to me", then update the DB based on what you get back.
As this is a one time operation you don't need anything particularly whizz for it - you can just get your Excel sheet back and in an empty column put:
="UPDATE emp SET departmentID = 5 WHERE id = " & A1
And fill it down to generate a bunch of update statements, copy the text into your query tool and hit go. You don't need to get all fancy loading the sheet into a table, doing update joins etc. Just sling together something hacky in Excel that will write the SQL for you, then copy/paste/run. If HR have sent back the sheet with a list of department names, then put the department names and IDs somewhere else on the sheet and use VLOOKUP or XLOOKUP to turn the name into the department number, then compose your SQL based on that.
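If a spreadsheet formula feels awkward, the same one-off trick can be sketched in a few lines of Python. This is only an illustration: the function name build_updates, the sample rows and the department name to ID mapping are all made up for the example, and the table and column names follow the formula above.

```python
def build_updates(rows, department_ids):
    """Turn (employee_id, department_name) pairs into UPDATE statements.

    rows           -- e.g. the rows HR sent back, read from the sheet
    department_ids -- hypothetical lookup of department name -> Id,
                      playing the role of VLOOKUP/XLOOKUP
    """
    statements = []
    for employee_id, department_name in rows:
        department_id = department_ids[department_name]
        # int() guards against stray text ending up inside the SQL.
        statements.append(
            f"UPDATE emp SET departmentID = {int(department_id)} "
            f"WHERE id = {int(employee_id)};"
        )
    return statements


sample = build_updates([(1, "Sales"), (2, "IT")], {"Sales": 5, "IT": 7})
print("\n".join(sample))
```

The printed statements can then be pasted into the query tool, exactly as with the Excel approach.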
2) When I added the DepartmentId column I set it to allow NULL values (is that correct?), but as a foreign key, I don't want to allow NULL values. Can I change it, or how can I handle it?
Foreign keyed columns are allowed to have NULL values. It isn't the FK that imposes a "No Nulls" restriction, it's the nullability of the column (alter the column to departmentid INT NOT NULL to impose it). An FK references a primary (or unique) key, and the primary key may not be null (or in some DBs, at most one record can have a [partly] null PK), but you could just leave the department NULL on those employees. If you do alter the column to be NOT NULL, you'll need to correct the NULL values first or the change will fail.
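To make the NULL behaviour concrete, here is a small sketch using Python's sqlite3 module as a stand-in for SQL Server. The sample employees and the "Unassigned" department are assumptions for the example. It shows that a nullable FK column accepts NULL, still rejects an ID that does not exist in Department, and can be backfilled in one statement before the column is tightened to NOT NULL (in SQL Server, with ALTER TABLE ... ALTER COLUMN).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs on request
conn.executescript("""
CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Employees  (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
INSERT INTO Employees (Name) VALUES ('Ann'), ('Bob');
-- Existing rows get NULL in the new FK column; the FK does not object.
ALTER TABLE Employees ADD COLUMN DepartmentId INTEGER REFERENCES Department(Id);
""")

count_nulls = "SELECT COUNT(*) FROM Employees WHERE DepartmentId IS NULL"
nulls_before = conn.execute(count_nulls).fetchone()[0]

# A non-NULL value must exist in Department, otherwise the FK rejects it.
try:
    conn.execute("UPDATE Employees SET DepartmentId = 999 WHERE Name = 'Ann'")
    bad_id_rejected = False
except sqlite3.IntegrityError:
    bad_id_rejected = True

# Backfill every unassigned employee with a placeholder department.
unassigned_id = conn.execute(
    "INSERT INTO Department (Name) VALUES ('Unassigned')"
).lastrowid
conn.execute(
    "UPDATE Employees SET DepartmentId = ? WHERE DepartmentId IS NULL",
    (unassigned_id,),
)
nulls_after = conn.execute(count_nulls).fetchone()[0]
```

Once nulls_after is zero, making the column NOT NULL should no longer fail.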
I have a table in Google Cloud Spanner.
CREATE TABLE test_id (
Id STRING(MAX) NOT NULL,
KeyColumn STRING(MAX) NOT NULL,
parent_id INT64 NOT NULL,
Updated TIMESTAMP NOT NULL OPTIONS (allow_commit_timestamp=true),
) PRIMARY KEY (Id)
And, I am trying to perform transaction.insert_or_update through a python script.
For each row in a pandas dataframe, I am doing:
transaction.insert_or_update(
'test_id', columns=['Id','KeyColumn', 'parent_id', 'Updated'],
values=[(uuid.uuid4().hex, row["KeyColumn"], row["parent_id"], spanner.COMMIT_TIMESTAMP)],
)
What I want is that if the row["KeyColumn"] is already present in KeyColumn of the table, update its parent_id column, otherwise insert a new row in the Spanner table corresponding to that KeyColumn.
But since my primary key is Id, which is generated randomly by uuid.uuid4().hex, it inserts a new row every time.
If I understand you correctly, the following is the situation:
ID is the primary key of your table.
There is a unique index defined for the table on the column KeyColumn.
You want to insert_or_update a row using KeyColumn as the column that should be used to determine whether the row already exists.
That is unfortunately not possible. insert_or_update will always use the primary key of the table to determine whether the row exists. I can think of three possible solutions to this problem, but they all have their drawbacks:
You could change the table definition and make KeyColumn the primary key and set a unique index on the Id column. The problem with this is of course that any other code that depends on Id being the primary key also needs to change. It is also a rather cumbersome change, because Cloud Spanner does not allow you to change the primary key of a table, so you would have to create a copy of the test_id table and then drop the old table.
You could fetch the row from Cloud Spanner before updating it by reading it using the KeyColumn value that you have. The big problem with this is obviously performance. You will need to do a read for each row that you want to update.
You could use a DML statement (UPDATE test_id SET parent_id=@parent WHERE KeyColumn=@key) to execute the update and check whether it actually updated a row by checking the returned update count. If it did not update anything, you could then execute the insert. This will obviously also be slower than an insert_or_update mutation.
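The third option could look roughly like the sketch below. It is not a definitive implementation: the function name upsert_by_key_column is made up, and param_types and commit_timestamp are passed in so that the snippet does not need the client library just to be read. In real use they would be google.cloud.spanner_v1.param_types and spanner.COMMIT_TIMESTAMP. Setting Updated with PENDING_COMMIT_TIMESTAMP() in the UPDATE is an addition to the statement quoted above.

```python
import uuid


def upsert_by_key_column(transaction, key_column, parent_id,
                         param_types, commit_timestamp):
    """Update the row matching KeyColumn; insert a new row if none matched."""
    updated = transaction.execute_update(
        "UPDATE test_id "
        "SET parent_id = @parent, Updated = PENDING_COMMIT_TIMESTAMP() "
        "WHERE KeyColumn = @key",
        params={"parent": parent_id, "key": key_column},
        param_types={"parent": param_types.INT64, "key": param_types.STRING},
    )
    if updated == 0:
        # No existing row for this KeyColumn, so buffer an insert mutation.
        transaction.insert(
            "test_id",
            columns=["Id", "KeyColumn", "parent_id", "Updated"],
            values=[(uuid.uuid4().hex, key_column, parent_id, commit_timestamp)],
        )
    return updated


# Usage sketch (assumes an existing `database` object from the client library):
#   from google.cloud import spanner
#   from google.cloud.spanner_v1 import param_types
#   database.run_in_transaction(
#       upsert_by_key_column, "some-key", 42,
#       param_types, spanner.COMMIT_TIMESTAMP)
```

One caveat: insert mutations are only applied at commit, so a later UPDATE in the same transaction will not see a row inserted this way. If the dataframe can contain the same KeyColumn twice, deduplicate it first or use a separate transaction per row.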
There is a way to query Cloud Spanner with a specific index.
You should use something like this at the end of your query: FROM test_id@{FORCE_INDEX=KeyColumnIndex}.
Even though this is the way to execute queries on secondary indexes and the answer for the question in the title, I do not know how much it can be applied in your use case.
One Employee has N Addresses. I need to maintain the historical information of Employee and Address changes whenever any user changes data in these two tables.
Table Employee:
Employee(
EmpID BIGINT PRIMARY KEY IDENTITY(1,1),
Name varchar(200),
EmpNumber varchar(200),
Createddate Datetime2)
Address Table :
Address(
AddID BIGINT PRIMARY KEY IDENTITY(1,1),
AddressLine1 varchar(300),
AddressLine2 varchar(300),
EmpID BIGINT NULL,
AddressType varchar(100),
Createddate Datetime2)
Above, EmpID is a foreign key to the Employee table.
Scenarios I have to satisfy:
I should be able to track the changes of an individual address record (child table record) of any employee.
I should be able to track the changes of an Employee (parent table record) together with its child address records.
I thought following way:
Suppose, Initially it is in the state shown in image below
Solution 1:
Case : when child table gets updated
Now, I update the Add0001 address record, so I insert a new record in the Address table and make the previous record inactive:
Case : when Parent Table gets updated
Now, when the parent table gets updated, I have a history table for the parent table; I move the old data to the history table and update the current record in the parent table as shown:
Solution 2 :
Case : When child table gets updated
Same as in solution 1
Case : When Parent Table gets updated
We insert a new record in the parent table, making the previous record inactive. In this case we get a new ID, which we then set as the foreign key in the child tables as shown below:
Is this the best way of maintaining historical data of parent and child tables together?
Or is there another design that would let me track the changes of parent and child records together?
There are quite a few ways to go about this sort of thing and what you're proposing is a perfectly valid approach... At least you appear to be pointed in the right direction.
There are a couple of changes that I would suggest...
1) Get rid of the "status" flag and use "begin" and "end" dates. The specific names don't matter so long as you have them.
2) Both the begin and end date columns should be defined as NOT NULL, and begin should have a default constraint of GETDATE() or CURRENT_TIMESTAMP. The end date should default to '99991231'. Trust me and fight the urge to make the end date NULLable and give "active" rows NULL end dates. '99991231' is, for all practical purposes, the end of time, and can be used to easily identify the currently active rows.
3) I would suggest adding a trigger to do the following:
a) Prevent updates and/or deletes. Ideally this would be an insert-only table.
b) When new rows are inserted, update (yeah, I know what "a)" says) the "existing current" row's end date with the "new current" row's begin date. By doing this, you will have a continuous, gap-free history.
Hope this helps. :)
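The begin/end-date scheme from points 2) and 3b) can be sketched as follows, using Python's sqlite3 module as a stand-in for SQL Server. The table name AddressHistory, the VersionId column and the sample rows are assumptions for the example, the end-of-time value is written in ISO form ('9999-12-31') because SQLite stores dates as text, and the update/delete guard from 3a) is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AddressHistory (
    VersionId    INTEGER PRIMARY KEY,
    AddID        INTEGER NOT NULL,          -- identifies the logical address
    AddressLine1 TEXT    NOT NULL,
    BeginDate    TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    EndDate      TEXT    NOT NULL DEFAULT '9999-12-31'   -- "end of time"
);
-- When a new version arrives, close the previously current version
-- with the new version's begin date, giving a gap-free history.
CREATE TRIGGER trg_close_previous AFTER INSERT ON AddressHistory
BEGIN
    UPDATE AddressHistory
    SET EndDate = NEW.BeginDate
    WHERE AddID = NEW.AddID
      AND EndDate = '9999-12-31'
      AND VersionId <> NEW.VersionId;
END;
""")

insert = "INSERT INTO AddressHistory (AddID, AddressLine1, BeginDate) VALUES (?, ?, ?)"
conn.execute(insert, (1, "1 Old Street", "2020-01-01"))
conn.execute(insert, (1, "2 New Road", "2021-06-01"))

history = conn.execute(
    "SELECT AddressLine1, BeginDate, EndDate FROM AddressHistory ORDER BY VersionId"
).fetchall()
# The current row is simply the one whose EndDate is the end-of-time value.
current = conn.execute(
    "SELECT AddressLine1 FROM AddressHistory WHERE AddID = 1 AND EndDate = '9999-12-31'"
).fetchall()
```

Because the end date is never NULL, "what was the address on a given day" becomes a plain range comparison on BeginDate and EndDate.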
Are you able to use the temporal tables and history tables introduced with SQL Server 2016?
These enable data professionals to keep a history of the data in related tables, so you don't need to think about parent or child, etc.
If changes to the parent data are not that frequent, then you can also keep the parent's history records in the same table and update the foreign keys of the child tables.
Before Changes to Parent
Now if you change the name of the employee and add a new address, then update the employee ID in the child table (Address).
After Changes to Parent
You can always get the addresses the employee had before the name changed by using the valid time. This way, we need not create an additional history table. But it may be a little complex to fetch the history, with all the date comparisons involved.
Any suggestions are welcome.
I have a script for a Microsoft SQL Server database that has hundreds of tables, and the tables contain data as well. This is the database of a web application. What I want to do is delete the previous records and reset the primary key to 1 or 0.
I have tried
DBCC CHECKIDENT ('dbo.tbl',RESEED,0);
but it does not work for me, as in most of the tables the primary key is not an identity column.
I cannot truncate the table, as its primary key is used as an FK in many other tables.
I have also tried adding the identity specification to the primary key of the table, running the CHECKIDENT query and then changing it back to a non-identity column, but when I add records again the key continues from where it left off.
Making changes in the code is not an option for me.
Please help.
From your question I am not sure what the main objective is. If you need to truncate a lot of tables and change their structure to have an IDENTITY property, why can't you disable the FKs? In the past I have used a standard process to rebuild a table and migrate all the information. It consists of a group of steps; I will try to help, but you should follow the steps below.
Steps:
1) Disable the FKs so that you can alter the structure of your tables. You can find the solution for this task in the following link:
Temporarily disable all foreign key constraints
2) Alter the table to add the new IDENTITY property. This is a classic ALTER TABLE xxxxxx process.
3) Execute the statement that you previously posted:
DBCC CHECKIDENT ('dbo.tbl',RESEED,0);
Try to follow this path, and if you have any problems just ask us.
You cannot truncate a table that has relations. You should remove the relations first.
My understanding of this question:
You have a database with tables that you want to empty and next have them use primary key values starting at 0 or 1.
Some of these tables use an identity value and you already have a solution for those (you know you can find out which columns have an identity by using the sys.columns view? Look for the is_identity column).
Some tables do not use an identity but get their pk values from an unknown source, which we can't modify.
The only solution I see is creating (or modifying) an AFTER INSERT trigger on those tables that subtracts from the new pk value.
E.g.: your "hidden generator" will generate a next value 5254, but you want the next pk value to become one:
CREATE TRIGGER trg_sometable_ai
ON sometable
AFTER INSERT
AS
BEGIN
UPDATE st
SET st.pk_col = st.pk_col - 5253
FROM sometable AS st
INNER JOIN INSERTED AS i
ON i.pk_col = st.pk_col
END
You'll have to determine the next value and thus the "subtract value" for each table.
If the code also inserts child records into tables with a foreign key to this table, and uses the previously generated value, you have to modify those triggers as well...
This is a "last resort" solution and something I would recommend against in any scenario that has other options. Manipulating primary key values is generally not a good idea.
I have two tables, Table1 and Table2, where the primary key column of Table1 is referenced by a foreign key column in Table2.
Now I have to display a constraint violation error message whenever I delete records from Table2 that have a foreign key value pointing to Table1.
If I get it right, your column A (say) in table 1 references column B (say) in table 2.
What you can do is set ON DELETE to NO ACTION, which will prevent deletion of records from table 2 if any children of them still exist in table 1.
You can do this by:
ALTER TABLE TABLE1 ADD FOREIGN KEY (A) REFERENCES TABLE2 (B) ON DELETE NO ACTION;
You don't have a constraint violation if you delete records from the child table and not the parent. It is normal to delete child records. For instance if I have a user table and a photos tables that contains the userid from the users table, why would I want to stop that action and throw an error if I want to delete a photo? Deleting a child record doesn't also delete the parent.
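The difference between deleting a child and deleting a parent can be seen in a short sketch, again using Python's sqlite3 module as a stand-in for the real database. The users/photos tables mirror the example above and the sample rows are made up. With ON DELETE NO ACTION, deleting a photo succeeds, while deleting a user who still has photos raises a constraint error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs on request
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE photos (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE NO ACTION
);
INSERT INTO users  (id, name)    VALUES (1, 'alice');
INSERT INTO photos (id, user_id) VALUES (10, 1), (11, 1);
""")

# Deleting a child row never violates the foreign key.
conn.execute("DELETE FROM photos WHERE id = 10")

# Deleting the parent while a child still references it is rejected.
try:
    conn.execute("DELETE FROM users WHERE id = 1")
    parent_delete_blocked = False
except sqlite3.IntegrityError:
    parent_delete_blocked = True

remaining_photos = conn.execute("SELECT COUNT(*) FROM photos").fetchone()[0]
```

So the constraint error only appears for the parent delete; getting an error on the child delete would need a trigger or a permissions change, as described next.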
If you really want to do that, then you must do it through a trigger (make sure to handle multi-row deletes) or, if the FK is a required field, simply don't grant delete permissions on the table. Be aware that this may mean you can never delete any records, even when you genuinely need to. A simple method may be to not have a delete function available in the application.
I suspect what you really need is to get a better definition of what is needed in the requirements document. In over 30 years of dealing with hundreds of databases, I have never seen anyone need this functionality.