What is the best approach to synchronizing a DataSet with data in a database? Here are the parameters:
We can't simply reload the data because it's bound to a UI control which a user may have configured (it's a tree grid that they may expand/collapse)
We can't use a changeflag (like a UpdatedTimeStamp) in the database because changes don't always flow through the application (e.g. a DBA could update a field with a SQL statement)
We cannot use an update trigger in the database because it's a multi-user system
We are using ADO.NET DataSets
Multiple fields can change of a given row
I've looked at the DataSet's Merge capability, but this doesn't seem to keep the notion of an "ID" column. I've looked at DiffGram capability but the issue here is those seem to be generated from changes within the same DataSet rather than changes that occured on some external data source.
I've been running from this solution for a while but the approach I know would work (with a lot of ineffeciency) is to build a separate DataSet and then iterate all rows applying changes, field by field, to the DataSet on which it is bound.
Has anyone had a similar scenario? What did you do to solve the problem? Even if you haven't run into a similar problem, any recommendation for a solution is appreciated.
Thanks
DataSet.Merge works well for this if you have a primary key defined for each DataTable; the DataSet will raise changed events to any databound GUI controls
if your table is small you can just re-read all of the rows and merge periodically, otherwise limiting the set to be read with a timestamp is a good idea - just tell the DBAs to follow the rules and update the timestamp ;-)
another option - which is a bit of work - is to keep a changed-row queue (timestamp, row ID) using a trigger or stored procedure, and base the refresh queries off of the timestamp in the queue; this will be more efficient if the base table has a lot of rows in it, allowing you (via an inner join on the queue record) to pull only the changed rows since the last poll time.
I think it would be easier to store a list of the nodes that the user has expanded (assuming you can uniquely identify each one), then re-load the data and re-bind it to the tree view, and then expand all the nodes previously expanded.
Related
Is there an automatic way of knowing which rows are the latest to have been added to an OpenEdge table? I am working with a client and have access to their database, but they are not saving ids nor timestamps for the data.
I was wondering if, hopefully, OpenEdge is somehow doing this out of the box. (I doubt it is but it won't hurt to check)
Edit: My Goal
My goal from this is to be able to only import the new data, i.e. the delta, of a specific table. Without having which rows are new, I am forced to import everything because I have no clue what was aded.
1) Short answer is No - there's no "in the box" way for you to tell which records were added, or the order they were added.
The only way to tell the order of creation is by applying a sequence or by time-stamping the record. Since your application does neither, you're out of luck.
2) If you're looking for changes w/out applying schema changes, you can capture changes using session or db triggers to capture updates to the db, and saving that activity log somewhere.
3) If you're just looking for a "delta" - you can take a periodic backup of the database, and then use queries to compare the current db with the backup db and get the differences that way.
4) Maintain a db on the customer site with the contents of the last table dump. The next time you want to get deltas from the customer, compare that table's contents with the current table, dump the differences, then update the db table to match the current db's table.
5) Personally. I'd talk to the customer and see if (a) they actually require this functionality, (b) find out what they think about adding some fields and a bit of code to the system to get an activity log. Adding a few fields and some code to update them shouldn't be that big of a deal.
You could use database triggers to meet this need. In order to do so you will need to be able to write and deploy trigger procedures. And you need to keep in mind that the 4GL and SQL-92 engines do not recognize each other's triggers. So if updates are possible via SQL, 4GL triggers will be blind to those updates. And vice-versa. (If you do not use SQL none of this matters.)
You would probably want to use WRITE triggers to catch both insertions and updates to data. Do you care about deletes?
Simple-minded 4gl WRITE trigger:
TRIGGER PROCEDURE FOR WRITE OF Customer. /* OLD BUFFER oldCustomer. */ /* OLD BUFFER is optional and not needed in this use case ... */
output to "customer.dat" append.
export customer.
output close.
return.
end.
We want to know what rows in a certain table is used frequently, and which are never used. We could add an extra column for this, but then we'd get an UPDATE for every SELECT, which sounds expensive? (The table contains 80k+ rows, some of which are used very often.)
Is there a better and perhaps faster way to do this? We're using some old version of Microsoft's SQL Server.
This kind of logging/tracking is the classical application server's task. If you want to realize your own architecture (there tracking architecture) do it on your own layer.
And in any case you will need application server there. You are not going to update tracking field it in the same transaction with select, isn't it? what about rollbacks? so you have some manager who first run select than write track information. And what is the point to save tracking information together with entity info sending it back to DB? Save it into application server file.
You could either update the column in the table as you suggested, but if it was me I'd log the event to another table, i.e. id of the record, datetime, userid (maybe ip address etc, browser version etc), just about anything else I could capture and that was even possibly relevant. (For example, 6 months from now your manager decides not only does s/he want to know which records were used the most, s/he wants to know which users are using the most records, or what time of day that usage pattern is etc).
This type of information can be useful for things you've never even thought of down the road, and if it starts to grow large you can always roll-up and prune the table to a smaller one if performance becomes an issue. When possible, I log everything I can. You may never use some of this information, but you'll never wish you didn't have it available down the road and will be impossible to re-create historically.
In terms of making sure the application doesn't slow down, you may want to 'select' the data from within a stored procedure, that also issues the logging command, so that the client is not doing two roundtrips (one for the select, one for the update/insert).
Alternatively, if this is a web application, you could use an async ajax call to issue the logging action which wouldn't slow down the users experience at all.
Adding new column to track SELECT is not a practice, because it may affect database performance, and the database performance is one of major critical issue as per Database Server Administration.
So here you can use one very good feature of database called Auditing, this is very easy and put less stress on Database.
Find more info: Here or From Here
Or Search for Database Auditing For Select Statement
Use another table as a key/value pair with two columns(e.g. id_selected, times) for storing the ids of the records you select in your standard table, and increment the times value by 1 every time the records are selected.
To do this you'd have to do a mass insert/update of the selected ids from your select query in the counting table. E.g. as a quick example:
SELECT id, stuff1, stuff2 FROM myTable WHERE stuff1='somevalue';
INSERT INTO countTable(id_selected, times)
SELECT id, 1 FROM myTable mt WHERE mt.stuff1='somevalue' # or just build a list of ids as values from your last result
ON DUPLICATE KEY
UPDATE times=times+1
The ON DUPLICATE KEY is right from the top of my head in MySQL. For conditionally inserting or updating in MSSQL you would need to use MERGE instead
I'm building a quite complex database and in order to simplify some queries I made a view that involves a lot of joined tables. I need to receive a notification whenever a row is added / modified / deleted in the view. Using postgresql 9.2 I first though about the new feature: triggers on view, introduced in pg 9.1. However this doesn't do what I need, since this feature only offers a trigger that is fired when an insert/update/delete is performed directly on the view.
Long story short: I need something (triggers or else) that, looking directly on the view, will notify me when my view is updated (I mean indirectly: when one of the tables which composes the view is modified). Is there something easy to use or I have to manually set a trigger for each table that is involved in the creation of the view?
Thanks in advance!
Feature what you want doesn't exist. You need to write your own a set of triggers on interesting tables.
I have high volume of data normalized into more than 100 tables. There are multiple applications which change underlying data in those tables and I want to raise events on those changes. Possible options that I know of are:
Change Data Capture
Change Tracking
Using Triggers on each table (bad option but possible)
Can someone share the best way of doing this if someone has already done this before?
What I really want in the end is if there is one transaction that affected 12 tables off 100 I should be able to bubble one event up instead of 12. Assume there are concurrent users change these tables.
Two options I can think of:
Triggers ARE the right way to capture change events in the DB layer
Codewise, I make sure in my app that each table is changed through only one place in the code, regardless what the change is (I call it a hub for that table, as it channels many different pathways into one place), it becomes very easy to catch change events that way in the code layer
One possibility is SQL Server Query Notifications: Using Query Notifications
As long as you want to 'batch' multiple changes, I think you should follow the route of Change Data Capture or Change Tracking (depending on whether you just want to know that something changed or what changes happened).
They should be used by a 'polling' procedure, where you poll for changes every few minutes (seconds, miliseconds???) and raise events. The nice thing about this is that as long as you store the last rowversion of the previous poll -for each table- you can check whenever you like for changes since the last poll. You don't rely on a real time triggers approach, that if halted you would loose all events forever. The procedure could be easily created inside a procedure that checks each table and you would need only 1 more table to store last rowversion per table.
Also, the overhead of this approach would be controlled by you and by how frequently the polling happens.
What is the best way to track changes in a database table?
Imagine you got an application in which users (in the context of the application not DB users ) are able to change data which are store in some database table. What's the best way to track a history of all changes, so that you can show which user at what time change which data how?
In general, if your application is structured into layers, have the data access tier call a stored procedure on your database server to write a log of the database changes.
In languages that support such a thing aspect-oriented programming can be a good technique to use for this kind of application. Auditing database table changes is the kind of operation that you'll typically want to log for all operations, so AOP can work very nicely.
Bear in mind that logging database changes will create lots of data and will slow the system down. It may be sensible to use a message-queue solution and a separate database to perform the audit log, depending on the size of the application.
It's also perfectly feasible to use stored procedures to handle this, although there may be a bit of work involved passing user credentials through to the database itself.
You've got a few issues here that don't relate well to each other.
At the basic database level you can track changes by having a separate table that gets an entry added to it via triggers on INSERT/UPDATE/DELETE statements. Thats the general way of tracking changes to a database table.
The other thing you want is to know which user made the change. Generally your triggers wouldn't know this. I'm assuming that if you want to know which user changed a piece of data then its possible that multiple users could change the same data.
There is no right way to do this, you'll probably want to have a separate table that your application code will insert a record into whenever a user updates some data in the other table, including user, timestamp and id of the changed record.
Make sure to use a transaction so you don't end up with cases where update gets done without the insert, or if you do the opposite order you don't end up with insert without the update.
One method I've seen quite often is to have audit tables. Then you can show just what's changed, what's changed and what it changed from, or whatever you heart desires :) Then you could write up a trigger to do the actual logging. Not too painful if done properly...
No matter how you do it, though, it kind of depends on how your users connect to the database. Are they using a single application user via a security context within the app, are they connecting using their own accounts on the domain, or does the app just have everyone connecting with a generic sql-account?
If you aren't able to get the user info from the database connection, it's a little more of a pain. And then you might look at doing the logging within the app, so if you have a process called "CreateOrder" or whatever, you can log to the Order_Audit table or whatever.
Doing it all within the app opens yourself up a little more to changes made from outside of the app, but if you have multiple apps all using the same data and you just wanted to see what changes were made by yours, maybe that's what you wanted... <shrug>
Good luck to you, though!
--Kevin
In researching this same question, I found a discussion here very useful. It suggests having a parallel table set for tracking changes, where each change-tracking table has the same columns as what it's tracking, plus columns for who changed it, when, and if it's been deleted. (It should be possible to generate the schema for this more-or-less automatically by using a regexed-up version of your pre-existing scripts.)
Suppose I have a Person Table with 10 columns which include PersonSid and UpdateDate. Now, I want to keep track of any updates in Person Table.
Here is the simple technique I used:
Create a person_log table
create table person_log(date datetime2, sid int);
Create a trigger on Person table that will insert a row into person_log table whenever Person table gets updated:
create trigger tr on dbo.Person
for update
as
insert into person_log(date, sid) select updatedDTTM, PersonSID from inserted
After any updates, query person_log table and you will be able to see personSid that got updated.
Same you can do for Insert, delete.
Above example is for SQL, let me know in case of any queries or use this link :
https://web.archive.org/web/20211020134839/https://www.4guysfromrolla.com/webtech/042507-1.shtml
A trace log in a separate table (with an ID column, possibly with timestamps)?
Are you going to want to undo the changes as well - perhaps pre-create the undo statement (a DELETE for every INSERT, an (un-) UPDATE for every normal UPDATE) and save that in the trace?
Let's try with this open source component:
https://tabledependency.codeplex.com/
TableDependency is a generic C# component used to receive notifications when the content of a specified database table change.
If all changes from php. You may use class to log evry INSERT/UPDATE/DELETE before query. It will be save action, table, column, newValue, oldValue, date, system(if need), ip, UserAgent, clumnReference, operatorReference, valueReference. All tables/columns/actions that need to log are configurable.