I have an ERP system (Navision) where product data and stock numbers are frequently updated. Every time a product attribute is updated, I want the change pushed to another SQL Server instance using Service Broker. I was considering using triggers for the detection, but I am unsure whether that is the best way and whether it scales. I expect updates to happen approximately once per second, but this number might double or triple.
Any feedback would be appreciated.
Add a LastModifiedDate column to each record and update it via a trigger each time the record is updated. Then run a scheduled job at a specific time each day (off-business hours preferred) so that all records updated since the last scheduled run are processed.
The following items need to be done:
1. Add a new column LastModifiedDate to the table with the DATETIME data type.
2. Create a trigger to update LastModifiedDate each time a record is updated.
3. Create a new table to store the date and time of each scheduled run.
4. Create a scheduled job on the database that will run at a specified time every day.
5. This job will pick up all the records whose LastModifiedDate is greater than the date stored in the table created in step 3.
Since only one column is being updated by the trigger, the overhead on the table is minimal. And since the update job runs only once a day, database traffic is also reduced.
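Purely as an illustration, the pieces could look something like this (dbo.Products, ProductID, and dbo.SyncRunLog are assumed names, not from the original post):

-- 1./2. Track the last modification time on each row and stamp it via a trigger.
ALTER TABLE dbo.Products ADD LastModifiedDate DATETIME NULL;
GO
CREATE TRIGGER trg_Products_SetModified ON dbo.Products
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE p
    SET LastModifiedDate = GETDATE()
    FROM dbo.Products AS p
    INNER JOIN inserted AS i ON i.ProductID = p.ProductID;  -- ProductID assumed to be the key
END;
GO
-- 3. Remember when the scheduled job last ran.
CREATE TABLE dbo.SyncRunLog (LastRunDate DATETIME NOT NULL);
GO
-- 4./5. The daily job step picks up everything modified since the last run.
DECLARE @lastRun DATETIME =
    ISNULL((SELECT MAX(LastRunDate) FROM dbo.SyncRunLog), '19000101');  -- first run picks up everything
SELECT * FROM dbo.Products WHERE LastModifiedDate > @lastRun;
INSERT INTO dbo.SyncRunLog (LastRunDate) VALUES (GETDATE());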
I am working on a Spring Batch service that pulls data from a db on a schedule (e.g. every day at 12pm).
I am using JdbcPagingItemReader to read the data and a scheduler (@Scheduled, provided by Spring) to launch the job. The problem I have now is that every time the job runs, it pulls all the data from the beginning instead of from the "last read" row.
The data in the db changes every day (old rows are deleted and new ones added), and all I have is a timestamp column to track them.
Is there a way to "remember" the last row read from the last execution of the job and read data only later than that row?
Since you need to pull data on a daily basis and your records have a timestamp, you can design your job instances to be based on a given date (i.e., using the date as an identifying job parameter). With this approach, you do not need to "remember" the last processed record. All you need to do is process records for a given date by using the correct SQL query. For example:
| Job instance ID | Date       | Job parameter   | SQL                                              |
|-----------------|------------|-----------------|--------------------------------------------------|
| 1               | 2021-03-22 | date=2021-03-22 | Select c1, c2 from table where date = 2021-03-22 |
| 2               | 2021-03-23 | date=2021-03-23 | Select c1, c2 from table where date = 2021-03-23 |
| ...             | ...        | ...             | ...                                              |
With that in place, you can use any cursor-based or paging-based reader to process the records of a given date. If a job instance fails, you can restart it without the risk of interfering with other job instances. The restart could even happen several days after the failure, since the job instance will always process the same data set. Moreover, in case of failure and restart, Spring Batch will reprocess records only from the last checkpoint of the previous (failed) run.
Just want to post an update to this question.
So in the end I created two more steps to achieve what I wanted to do initially.
Since I don't have the privilege to modify the table I read the data from, I couldn't use the "process indicator pattern", which involves having a column that marks whether a record has been processed. Instead, I created another table to store the last-read record's timestamp and use it to update the SQL query.
step 0: a tasklet that reads the bookmark from a table and passes it into the job context
step 1: a chunk step that gets the bookmark from the context and uses JdbcPagingItemReader to read the data
step 2: a tasklet that updates the bookmark
But doing this, I have to be very cautious with the bookmark table: if I lose it, I lose everything.
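For reference, a minimal T-SQL sketch of such a bookmark table (all names here are illustrative assumptions, not from the actual service):

-- The bookmark table holds one row per job: the timestamp of the last record read.
CREATE TABLE dbo.JobBookmark (
    JobName VARCHAR(100) PRIMARY KEY,
    LastReadTimestamp DATETIME NOT NULL
);
GO
-- Step 0: read the bookmark.
DECLARE @bookmark DATETIME =
    (SELECT LastReadTimestamp FROM dbo.JobBookmark WHERE JobName = 'dailyPull');

-- Step 1: the reader's query fetches only rows newer than the bookmark.
SELECT * FROM dbo.SourceTable
WHERE RecordTimestamp > @bookmark
ORDER BY RecordTimestamp;

-- Step 2: advance the bookmark after a successful run.
UPDATE dbo.JobBookmark
SET LastReadTimestamp = (SELECT MAX(RecordTimestamp) FROM dbo.SourceTable)
WHERE JobName = 'dailyPull';

As the author notes, this table becomes a single point of failure, so it is worth backing it up alongside the source data.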
Is there a way to set up SQL Server to automatically delete some rows based on certain conditions?
For example, I have a table TblNote with a column createDate storing the date the row was created, and a column deleteDate storing the date on which the row should be deleted (i.e. when deleteDate matches the current date).
How can I set up the server to do that?
You could use a SQL job that runs on a daily basis at a certain time, picks the records whose deleteDate is less than or equal to the current date, and performs a delete operation on those records.
You can see this link to learn how to schedule SQL jobs.
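As a rough sketch, the job step could run something like this (TblNote and deleteDate come from the question; the rest is assumed):

-- Daily job step: purge rows whose deletion date has arrived or passed.
-- CAST to DATE ignores any time-of-day portion in GETDATE().
DELETE FROM dbo.TblNote
WHERE deleteDate <= CAST(GETDATE() AS DATE);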
Yes, there is:
Add a trigger for insert or update. However, this will only work when a DML operation actually takes place on the record.
or
Create a procedure that performs the check, and schedule it via SQL Server Agent (if you have a licensed SQL Server edition) or the Windows Task Scheduler.
I'm designing a library management database schema; let's say there is a table "Borrow":
Borrow
id
user_id
book_id
borrow_date
due_date
isExpired
expired_day (number of days after the book is expired)
fine
Can a SQL trigger implement the following circumstances?
1. Compare due_date with today's date; if they are the same, send an email and mark isExpired as true.
2. If isExpired is true, compute the difference between today and due_date, update expired_day, and then update fine (expired_day * 5).
A trigger only fires when something happens on the table or row; it won't fire continuously (or daily). If nothing happens to the table, your trigger will never fire, so your checks can't be done.
So the trigger you describe would work when you first insert a record into the table, but there's no automatic way for a trigger to fire after the due date has passed to check for the expiry and fine.
You would most likely need to set up a stored procedure containing your code and find a way to run it on a scheduled basis.
The following link goes over how to set that up:
Scheduled run of stored procedure on SQL server
Since you want to check all the records of the library daily and have them updated accordingly, it is better to create a daily job, scheduled via SQL Server Agent at a particular time, so that it executes automatically every day.
Note: choose a time of day when your application is likely to be least used.
Creation of Agent : http://msdn.microsoft.com/en-us/library/ms181153(v=sql.105).aspx
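A minimal sketch of what such a daily job might execute, assuming the Borrow table from the question and its fine rate of 5 per day (the procedure name is made up):

CREATE PROCEDURE dbo.ProcessOverdueBorrows
AS
BEGIN
    SET NOCOUNT ON;

    -- Circumstance 1: flag borrows whose due date has arrived.
    UPDATE dbo.Borrow
    SET isExpired = 1
    WHERE isExpired = 0
      AND due_date <= CAST(GETDATE() AS DATE);
    -- (An email per newly flagged row could be sent here via Database Mail,
    -- e.g. msdb.dbo.sp_send_dbmail.)

    -- Circumstance 2: recalculate days overdue and the fine.
    UPDATE dbo.Borrow
    SET expired_day = DATEDIFF(DAY, due_date, GETDATE()),
        fine = DATEDIFF(DAY, due_date, GETDATE()) * 5
    WHERE isExpired = 1;
END;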
I am currently in the process of revamping my company's management system to run a little leaner in terms of network traffic. Right now I'm trying to figure out an effective way to query only the records that have been modified (by any user) since the last time I asked.
When the application starts it loads the job information and caches it locally like the following: SELECT * FROM jobs.
I am writing out the date/time a record was modified, à la UPDATE jobs SET Widgets=@Widgets, LastModified=GetDate() WHERE JobID=@JobID.
When any user requests the list of jobs, I query all records that have been modified since the last time I requested the list, like the following: SELECT * FROM jobs WHERE LastModified>=@LastRequested, and store the date/time of the request to pass in as @LastRequested when the user asks again. In theory this returns only the records modified since the last request.
The issue I'm running into is when the user's date/time is not quite in sync with the server's, and also the server load from querying an un-indexed date/time column. Is there a more effective system than querying by date/time?
I don't know that I would rely on date/time, since the client clocks are external to SQL Server.
If you have an Identity column, I would use it together with a tracking table: UserId, LastQueryDateTime, LastIdRetrieved.
Every time you query the base table, insert a new row for the user (or update it if one exists) with the max id retrieved. The query should also read this table to get the user's LastIdRetrieved and use it in the WHERE clause.
All this could be eliminated if all of your code inserted GetDate() from SQL Server instead of from the client machines, but that change is pretty labor intensive.
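A rough sketch of that tracking table, assuming JobID is the Identity column (all other names are illustrative):

-- One row per user, recording how far that user has read.
CREATE TABLE dbo.UserQueryLog (
    UserId INT PRIMARY KEY,
    LastQueryDateTime DATETIME NOT NULL,
    LastIdRetrieved INT NOT NULL
);
GO
DECLARE @UserId INT = 42;  -- the calling user (illustrative)

-- Fetch only rows beyond this user's last retrieved id (0 on the first call)...
DECLARE @lastId INT = ISNULL(
    (SELECT LastIdRetrieved FROM dbo.UserQueryLog WHERE UserId = @UserId), 0);
SELECT * FROM dbo.Jobs WHERE JobID > @lastId;

-- ...then record the new high-water mark (insert for first-time users).
IF EXISTS (SELECT 1 FROM dbo.UserQueryLog WHERE UserId = @UserId)
    UPDATE dbo.UserQueryLog
    SET LastIdRetrieved = (SELECT MAX(JobID) FROM dbo.Jobs),
        LastQueryDateTime = GETDATE()
    WHERE UserId = @UserId;
ELSE
    INSERT INTO dbo.UserQueryLog (UserId, LastQueryDateTime, LastIdRetrieved)
    VALUES (@UserId, GETDATE(), (SELECT MAX(JobID) FROM dbo.Jobs));

One caveat: an Identity value is assigned only on insert, so on its own this catches new rows but not updates to existing ones.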
The easiest solution seems to be settling on one clock as the leading one.
One way would be to settle on the server's time: after updating the row, store the value returned by SELECT LastModified FROM jobs WHERE JobID = @JobID on the client side. That way, the client can effectively query using only the server time as reference.
Use an update sequence number (USN), much like Active Directory and DNS do to keep track of the objects that have changed since their last replication. Pick a number to start with, and each time a record in the Jobs table is inserted or modified, stamp it with the next USN. Keep track of the USN at which the last Select query was executed, and you will always know which records were altered since then. For example...
Set LastQryUSN = 100
Update Jobs Set USN=101, ...
Update Jobs Set USN=102, ...
Insert Jobs (USN, ...) Values (103, ...)
Select * From Jobs Where USN > LastQryUSN
Set LastQryUSN = 103
Update Jobs Set USN=104
Insert Jobs (USN, ...) Values (105, ...)
Select * From Jobs Where USN > LastQryUSN
Set LastQryUSN = 105
... and so on
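One concrete way to implement this (a sketch assuming SQL Server 2012 or later and made-up object names) is to draw USNs from a SEQUENCE, which guarantees monotonically increasing values:

-- A SEQUENCE hands out ever-increasing USNs.
CREATE SEQUENCE dbo.JobsUSN AS BIGINT START WITH 100 INCREMENT BY 1;
GO
-- Stamp every insert/update with the next USN (@Widgets/@JobID as in the question).
UPDATE dbo.Jobs
SET Widgets = @Widgets,
    USN = NEXT VALUE FOR dbo.JobsUSN
WHERE JobID = @JobID;

-- The client remembers the highest USN it has seen and asks only for newer rows.
SELECT * FROM dbo.Jobs WHERE USN > @LastQryUSN;
SELECT @LastQryUSN = MAX(USN) FROM dbo.Jobs;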
When you get the Jobs, get the server time too:
DECLARE @now DATETIME = GETUTCDATE();
SELECT @now AS [ServerTime], * FROM Jobs WHERE Modified >= @LastModified;
The first time, pass in a minimum date as @LastModified. On each subsequent call, pass in the ServerTime returned by the previous call. This way the client's clock is taken out of the equation.
The answer to the server load is, I hope, obvious: add an index on the Modified column.
And one more piece of advice: never use local time, not even on the server. Always use UTC, and store UTC times in Modified. As it stands, your program breaks twice a year, when daylight saving time starts and when it ends.
Current versions of SQL Server have change tracking, which you can use for exactly this. Just enable change tracking on the tables you want to track.
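A minimal sketch of enabling it and pulling changes (database, table, and key names assumed; the tracked table needs a primary key):

-- Enable change tracking at the database level...
ALTER DATABASE MyDatabase
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- ...and on each table to track.
ALTER TABLE dbo.Jobs ENABLE CHANGE_TRACKING;
GO
-- The client remembers the version from its last sync (0 on the very first pull)...
DECLARE @lastSyncVersion BIGINT = 0;

-- ...and asks only for rows changed since then.
SELECT j.*
FROM CHANGETABLE(CHANGES dbo.Jobs, @lastSyncVersion) AS ct
JOIN dbo.Jobs AS j ON j.JobID = ct.JobID;  -- JobID assumed to be the primary key

-- Store this value as @lastSyncVersion for the next pull.
SELECT CHANGE_TRACKING_CURRENT_VERSION();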
I have a table, tblClient, which stores a client's date of birth in a field of type datetime, DOB.
The goal here is that, when a client reaches 65 years old (to be calculated from DOB), I need to insert a new record into another table.
But since a client's age does not change due to a database transaction (INSERT, UPDATE, DELETE), a trigger is out of the question.
What would be a good way to monitor such changes?
Create a SQL Agent job that runs daily or hourly, does this calculation in T-SQL, and performs the insert for anyone who has reached 65.
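A sketch of what that job step might run (tblClient and DOB are from the question; tblSeniorClient and its columns are assumed for illustration):

-- Insert clients who have reached 65 and are not in the target table yet;
-- the NOT EXISTS guard makes the job safe to re-run after a missed day.
INSERT INTO dbo.tblSeniorClient (ClientID, ReachedDate)
SELECT c.ClientID, CAST(GETDATE() AS DATE)
FROM dbo.tblClient AS c
WHERE DATEADD(YEAR, 65, c.DOB) <= GETDATE()
  AND NOT EXISTS (SELECT 1 FROM dbo.tblSeniorClient AS s
                  WHERE s.ClientID = c.ClientID);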
Keep it as self-contained to SQL Server as possible - a SQL Server Agent job that periodically executes a stored procedure should do nicely.
A scheduled task or SQL Server maintenance plan that runs a stored procedure as often as required, updating the required rows.
What about a nightly job, using SSIS with a stored procedure, that checks and enters a new row in the table for anyone who has turned 65?
You can create a SQL Server Agent job within the database using SQL Server Management Studio for this:
http://www.databasedesign-resource.com/sql-server-jobs.html
Set up a daily job to EXEC BirthdayProcessingProcedure, or whatever you want to name it.
As long as the database is up and running, the job will run according to the schedule you set up (from within the database).
I'm going to propose another approach: run something every time a DOB is added or updated that calculates the period from now until the first person reaches 65, then (re-)schedule a job to run at that time.
Also, I can't believe you need to insert that row the second someone reaches 65, so a once-a-day procedure that picks up today's new 65-year-olds would seem good enough.
How about a new field that stores the age-65 date? Calculate it once on record insert; then you can query this field to your heart's content. You would need to do this in a trigger (and account for updates; they are rare for DOB fields but possible when the date is mistyped). Now that I think about it, a computed column would probably work instead of a trigger.
Then run a daily job to catch anyone who turned 65 since the last time the job ran successfully. Make sure to handle this so that if the job fails one day, the people from that day are picked up on the next run.
The reason I suggest this is that calculating the age of every person in your database every day is a waste of resources for a calculation that really only needs to be done once. It's not a big deal when you have 100 people, but it's a big problem when you have a million. Doing this kind of calculation on a million records to identify the three you need is painful; doing it once on data entry is not so bad.
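A sketch of that idea (column and parameter names are assumed; DATEADD is deterministic, so the computed column can be persisted and indexed):

-- Compute the 65th-birthday date once, instead of recomputing every row daily.
ALTER TABLE dbo.tblClient
ADD Age65Date AS DATEADD(YEAR, 65, DOB) PERSISTED;

CREATE INDEX IX_tblClient_Age65Date ON dbo.tblClient (Age65Date);
GO
-- The daily job then scans only the narrow range since the last successful run.
DECLARE @LastSuccessfulRun DATETIME = '2024-01-01';  -- read from a job-log table in practice
SELECT ClientID
FROM dbo.tblClient
WHERE Age65Date > @LastSuccessfulRun
  AND Age65Date <= GETDATE();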