What is the best way to track changes in a database table?
Imagine you got an application in which users (in the context of the application not DB users ) are able to change data which are store in some database table. What's the best way to track a history of all changes, so that you can show which user at what time change which data how?
In general, if your application is structured into layers, have the data access tier call a stored procedure on your database server to write a log of the database changes.
In languages that support such a thing aspect-oriented programming can be a good technique to use for this kind of application. Auditing database table changes is the kind of operation that you'll typically want to log for all operations, so AOP can work very nicely.
Bear in mind that logging database changes will create lots of data and will slow the system down. It may be sensible to use a message-queue solution and a separate database to perform the audit log, depending on the size of the application.
It's also perfectly feasible to use stored procedures to handle this, although there may be a bit of work involved passing user credentials through to the database itself.
You've got a few issues here that don't relate well to each other.
At the basic database level you can track changes by having a separate table that gets an entry added to it via triggers on INSERT/UPDATE/DELETE statements. Thats the general way of tracking changes to a database table.
The other thing you want is to know which user made the change. Generally your triggers wouldn't know this. I'm assuming that if you want to know which user changed a piece of data then its possible that multiple users could change the same data.
There is no right way to do this, you'll probably want to have a separate table that your application code will insert a record into whenever a user updates some data in the other table, including user, timestamp and id of the changed record.
Make sure to use a transaction so you don't end up with cases where update gets done without the insert, or if you do the opposite order you don't end up with insert without the update.
One method I've seen quite often is to have audit tables. Then you can show just what's changed, what's changed and what it changed from, or whatever you heart desires :) Then you could write up a trigger to do the actual logging. Not too painful if done properly...
No matter how you do it, though, it kind of depends on how your users connect to the database. Are they using a single application user via a security context within the app, are they connecting using their own accounts on the domain, or does the app just have everyone connecting with a generic sql-account?
If you aren't able to get the user info from the database connection, it's a little more of a pain. And then you might look at doing the logging within the app, so if you have a process called "CreateOrder" or whatever, you can log to the Order_Audit table or whatever.
Doing it all within the app opens yourself up a little more to changes made from outside of the app, but if you have multiple apps all using the same data and you just wanted to see what changes were made by yours, maybe that's what you wanted... <shrug>
Good luck to you, though!
--Kevin
In researching this same question, I found a discussion here very useful. It suggests having a parallel table set for tracking changes, where each change-tracking table has the same columns as what it's tracking, plus columns for who changed it, when, and if it's been deleted. (It should be possible to generate the schema for this more-or-less automatically by using a regexed-up version of your pre-existing scripts.)
Suppose I have a Person Table with 10 columns which include PersonSid and UpdateDate. Now, I want to keep track of any updates in Person Table.
Here is the simple technique I used:
Create a person_log table
create table person_log(date datetime2, sid int);
Create a trigger on Person table that will insert a row into person_log table whenever Person table gets updated:
create trigger tr on dbo.Person
for update
as
insert into person_log(date, sid) select updatedDTTM, PersonSID from inserted
After any updates, query person_log table and you will be able to see personSid that got updated.
Same you can do for Insert, delete.
Above example is for SQL, let me know in case of any queries or use this link :
https://web.archive.org/web/20211020134839/https://www.4guysfromrolla.com/webtech/042507-1.shtml
A trace log in a separate table (with an ID column, possibly with timestamps)?
Are you going to want to undo the changes as well - perhaps pre-create the undo statement (a DELETE for every INSERT, an (un-) UPDATE for every normal UPDATE) and save that in the trace?
Let's try with this open source component:
https://tabledependency.codeplex.com/
TableDependency is a generic C# component used to receive notifications when the content of a specified database table change.
If all changes from php. You may use class to log evry INSERT/UPDATE/DELETE before query. It will be save action, table, column, newValue, oldValue, date, system(if need), ip, UserAgent, clumnReference, operatorReference, valueReference. All tables/columns/actions that need to log are configurable.
Related
I have two SQL Server databases.
One is being used as the back-end for a Ruby-On-Rails system that we are transitioning from but is still in use because of the Ruby apps we are rewriting in ASP.NET MVC.
The databases have similar tables, but not identical, for the Users, Roles and Roles-Users tables.
I want to create some type of trigger to update the user and roles-users tables on each database when a modification is made on the other database of the same table.
I can't just use the users table on the original database because Ruby has a different hash function for the passwords, but I want to ensure that changes on one system are reflected on the other instanter.
I also want to avoid the obvious problem that an update on the one database triggers an update on the other which triggers an update on the first and the process repeats itself until the server crashes or something similarly undesirable happens or a deadlock occurs.
I do not want to use database replication.
Is there a somewhat simple way to do this on a transaction per transaction basis?
EDIT
The trigger would be conceptually something like this:
USE Original;
GO
CREATE TRIGGER dbo.user_update
ON dbo.user WITH EXECUTE AS [cross table user identity]
AFTER UPDATE
AS
BEGIN
UPDATE Another.dbo.users SET column1=value1, etc., WHERE inserted.ID = Another.dbo.users.ID;
END
The problem I am trying to avoid is a recursive call.
Another.dbo.users will have a similar trigger in place on it because the two databases have different types of applications, Ruby-On-Rails on the one and ASP.NET MVC on the other that may be working on data that should be the same on the two databases.
I would add a field to both tables if possible. When adding or updating a table the 'check' field would be set to 0. The trigger would look at this field and if it is 0, having been generated by an application event, then the trigger fires the insert/update into the second table but the check field would have a 1 instead of 0.
So when the trigger fires on the second table it will skip the insert back into table one.
This will solve the recursive problem.
If for some reason you can not add the check field, you can use a separate table with the primary key to the table and the check field. This need more coding but would work also.
I have a database with multiple tables
and the user can change the data in the table.
my problems is that I wont that nothing changes in the database until the user click the button "save", and even when he do - it submit only the table he decide to save
but in the meantime it is necessary that the user can see all the changes that he did. and every "select" must give him the modified data ,and not the base data.
how I can on the one hand not submit the data in the database, and On the other hand show the data modified to the user?
I thought to do a transaction and don't submit, (and use read uncommitted) but for that I must don't close the connection (if I close without submit - all the changes are canceled) and I don't wont leave several of connection open.
I also thought to build a list of all the change, and whenever the user make a select - first searching from the list. but it is very complicated , and I prefer a simple solution
Thank you
This is going to be very tricky to handle as you've insisted that you cannot use transactions.
Best I can suggest is to add columns to each table to represent the state - but even then that's going to be tricky on how you'd ensure userA see's the pre-change and userB the post but not yet committed.
Perhaps you could look at using two tables and have a view selecting the pertinent data from both depending on the requirements.
Either way it's a nasty way to go about it and not very performant.
The moment you insisted you couldn't use a transaction is the moment you took away any chance of a simple answer.
A temporary table won't help here (as suggested above) as it's tied to the connection which you state will be closed. The only alternative temp table solution is a global temporary table but that also leads to issues (who creates it, what if you're the last connection to use it, check to see if it exists etc.)
You can use temporary tables to store a temporary data and then move them when it will need.
Can I find out when the last INSERT, UPDATE or DELETE statement was performed on a table in an Oracle database and if so, how?
A little background: The Oracle version is 10g. I have a batch application that runs regularly, reads data from a single Oracle table and writes it into a file. I would like to skip this if the data hasn't changed since the last time the job ran.
The application is written in C++ and communicates with Oracle via OCI. It logs into Oracle with a "normal" user, so I can't use any special admin stuff.
Edit: Okay, "Special Admin Stuff" wasn't exactly a good description. What I mean is: I can't do anything besides SELECTing from tables and calling stored procedures. Changing anything about the database itself (like adding triggers), is sadly not an option if want to get it done before 2010.
I'm really late to this party but here's how I did it:
SELECT SCN_TO_TIMESTAMP(MAX(ora_rowscn)) from myTable;
It's close enough for my purposes.
Since you are on 10g, you could potentially use the ORA_ROWSCN pseudocolumn. That gives you an upper bound of the last SCN (system change number) that caused a change in the row. Since this is an increasing sequence, you could store off the maximum ORA_ROWSCN that you've seen and then look only for data with an SCN greater than that.
By default, ORA_ROWSCN is actually maintained at the block level, so a change to any row in a block will change the ORA_ROWSCN for all rows in the block. This is probably quite sufficient if the intention is to minimize the number of rows you process multiple times with no changes if we're talking about "normal" data access patterns. You can rebuild the table with ROWDEPENDENCIES which will cause the ORA_ROWSCN to be tracked at the row level, which gives you more granular information but requires a one-time effort to rebuild the table.
Another option would be to configure something like Change Data Capture (CDC) and to make your OCI application a subscriber to changes to the table, but that also requires a one-time effort to configure CDC.
Ask your DBA about auditing. He can start an audit with a simple command like :
AUDIT INSERT ON user.table
Then you can query the table USER_AUDIT_OBJECT to determine if there has been an insert on your table since the last export.
google for Oracle auditing for more info...
SELECT * FROM all_tab_modifications;
Could you run a checksum of some sort on the result and store that locally? Then when your application queries the database, you can compare its checksum and determine if you should import it?
It looks like you may be able to use the ORA_HASH function to accomplish this.
Update: Another good resource: 10g’s ORA_HASH function to determine if two Oracle tables’ data are equal
Oracle can watch tables for changes and when a change occurs can execute a callback function in PL/SQL or OCI. The callback gets an object that's a collection of tables which changed, and that has a collection of rowid which changed, and the type of action, Ins, upd, del.
So you don't even go to the table, you sit and wait to be called. You'll only go if there are changes to write.
It's called Database Change Notification. It's much simpler than CDC as Justin mentioned, but both require some fancy admin stuff. The good part is that neither of these require changes to the APPLICATION.
The caveat is that CDC is fine for high volume tables, DCN is not.
If the auditing is enabled on the server, just simply use
SELECT *
FROM ALL_TAB_MODIFICATIONS
WHERE TABLE_NAME IN ()
You would need to add a trigger on insert, update, delete that sets a value in another table to sysdate.
When you run application, it would read the value and save it somewhere so that the next time it is run it has a reference to compare.
Would you consider that "Special Admin Stuff"?
It would be better to describe what you're actually doing so you get clearer answers.
How long does the batch process take to write the file? It may be easiest to let it go ahead and then compare the file against a copy of the file from the previous run to see if they are identical.
If any one is still looking for an answer they can use Oracle Database Change Notification feature coming with Oracle 10g. It requires CHANGE NOTIFICATION system privilege. You can register listeners when to trigger a notification back to the application.
Please use the below statement
select * from all_objects ao where ao.OBJECT_TYPE = 'TABLE' and ao.OWNER = 'YOUR_SCHEMA_NAME'
Is there an automatic way of knowing which rows are the latest to have been added to an OpenEdge table? I am working with a client and have access to their database, but they are not saving ids nor timestamps for the data.
I was wondering if, hopefully, OpenEdge is somehow doing this out of the box. (I doubt it is but it won't hurt to check)
Edit: My Goal
My goal from this is to be able to only import the new data, i.e. the delta, of a specific table. Without having which rows are new, I am forced to import everything because I have no clue what was aded.
1) Short answer is No - there's no "in the box" way for you to tell which records were added, or the order they were added.
The only way to tell the order of creation is by applying a sequence or by time-stamping the record. Since your application does neither, you're out of luck.
2) If you're looking for changes w/out applying schema changes, you can capture changes using session or db triggers to capture updates to the db, and saving that activity log somewhere.
3) If you're just looking for a "delta" - you can take a periodic backup of the database, and then use queries to compare the current db with the backup db and get the differences that way.
4) Maintain a db on the customer site with the contents of the last table dump. The next time you want to get deltas from the customer, compare that table's contents with the current table, dump the differences, then update the db table to match the current db's table.
5) Personally. I'd talk to the customer and see if (a) they actually require this functionality, (b) find out what they think about adding some fields and a bit of code to the system to get an activity log. Adding a few fields and some code to update them shouldn't be that big of a deal.
You could use database triggers to meet this need. In order to do so you will need to be able to write and deploy trigger procedures. And you need to keep in mind that the 4GL and SQL-92 engines do not recognize each other's triggers. So if updates are possible via SQL, 4GL triggers will be blind to those updates. And vice-versa. (If you do not use SQL none of this matters.)
You would probably want to use WRITE triggers to catch both insertions and updates to data. Do you care about deletes?
Simple-minded 4gl WRITE trigger:
TRIGGER PROCEDURE FOR WRITE OF Customer. /* OLD BUFFER oldCustomer. */ /* OLD BUFFER is optional and not needed in this use case ... */
output to "customer.dat" append.
export customer.
output close.
return.
end.
We want to know what rows in a certain table is used frequently, and which are never used. We could add an extra column for this, but then we'd get an UPDATE for every SELECT, which sounds expensive? (The table contains 80k+ rows, some of which are used very often.)
Is there a better and perhaps faster way to do this? We're using some old version of Microsoft's SQL Server.
This kind of logging/tracking is the classical application server's task. If you want to realize your own architecture (there tracking architecture) do it on your own layer.
And in any case you will need application server there. You are not going to update tracking field it in the same transaction with select, isn't it? what about rollbacks? so you have some manager who first run select than write track information. And what is the point to save tracking information together with entity info sending it back to DB? Save it into application server file.
You could either update the column in the table as you suggested, but if it was me I'd log the event to another table, i.e. id of the record, datetime, userid (maybe ip address etc, browser version etc), just about anything else I could capture and that was even possibly relevant. (For example, 6 months from now your manager decides not only does s/he want to know which records were used the most, s/he wants to know which users are using the most records, or what time of day that usage pattern is etc).
This type of information can be useful for things you've never even thought of down the road, and if it starts to grow large you can always roll-up and prune the table to a smaller one if performance becomes an issue. When possible, I log everything I can. You may never use some of this information, but you'll never wish you didn't have it available down the road and will be impossible to re-create historically.
In terms of making sure the application doesn't slow down, you may want to 'select' the data from within a stored procedure, that also issues the logging command, so that the client is not doing two roundtrips (one for the select, one for the update/insert).
Alternatively, if this is a web application, you could use an async ajax call to issue the logging action which wouldn't slow down the users experience at all.
Adding new column to track SELECT is not a practice, because it may affect database performance, and the database performance is one of major critical issue as per Database Server Administration.
So here you can use one very good feature of database called Auditing, this is very easy and put less stress on Database.
Find more info: Here or From Here
Or Search for Database Auditing For Select Statement
Use another table as a key/value pair with two columns(e.g. id_selected, times) for storing the ids of the records you select in your standard table, and increment the times value by 1 every time the records are selected.
To do this you'd have to do a mass insert/update of the selected ids from your select query in the counting table. E.g. as a quick example:
SELECT id, stuff1, stuff2 FROM myTable WHERE stuff1='somevalue';
INSERT INTO countTable(id_selected, times)
SELECT id, 1 FROM myTable mt WHERE mt.stuff1='somevalue' # or just build a list of ids as values from your last result
ON DUPLICATE KEY
UPDATE times=times+1
The ON DUPLICATE KEY is right from the top of my head in MySQL. For conditionally inserting or updating in MSSQL you would need to use MERGE instead