Merge Replication - Triggers firing on Both Publisher and Subscriber

Server Version: SQL Server 2008 R2
Client Version: SQL Server 2008 R2 Express
I have been encountering what appear to be locking issues when I run my merge replication process, specifically when a change is made on the subscriber and synced with the publisher. I am positive it is coming from the triggers, because they appear to be firing on the publisher again and probably trying to send data back down to the subscribers. I have added "NOT FOR REPLICATION" to the triggers, but that doesn't seem to help. I also researched and tried adding the clause below.
DECLARE @is_mergeagent BIT
SELECT @is_mergeagent = CONVERT(BIT, SESSIONPROPERTY('replication_agent'))
IF @is_mergeagent = 0 -- IF NOT FROM REPLICATION
That didn't seem to help either. How do you handle merge replication with insert/update triggers? Can I prevent them from "double" firing?
Always appreciate the info.
--S

Not sure about the triggers firing, but SESSIONPROPERTY will return NULL here, so the subsequent test always fails. Per the documentation, any string other than the supported options returns NULL = Input is not valid, and 'replication_agent' is not a supported option.
You probably mean APP_NAME.
This should at least assist troubleshooting...
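For illustration, a minimal sketch of an APP_NAME()-based guard. The exact application name a Merge Agent session reports varies by version and configuration, so verify it with SELECT APP_NAME() from an agent session before relying on any pattern; the '%Merge%' filter below is an assumption.
DECLARE @from_merge_agent BIT = 0;

IF APP_NAME() LIKE '%Merge%'  -- placeholder pattern; confirm against your agents
    SET @from_merge_agent = 1;

IF @from_merge_agent = 0  -- only run the trigger logic for ordinary sessions
BEGIN
    -- trigger body here
    PRINT 'Not a replication agent session';
END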

I'd add a bit field to the table that's causing the issue and call it "processed" or something like that. Have it default to false and set it to true when the trigger updates the record, and have the trigger check for a false value before it does anything; otherwise have it do nothing. A sketch of this pattern follows.
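A minimal sketch of that pattern, assuming a hypothetical table dbo.MyTable with key column id and a bit column processed defaulting to 0 (all names are placeholders):
CREATE TRIGGER trg_MyTable_Update ON dbo.MyTable
AFTER UPDATE
NOT FOR REPLICATION
AS
BEGIN
    SET NOCOUNT ON;

    -- Only touch rows not yet marked as processed, then flag them
    -- so a re-fired trigger finds nothing to do.
    UPDATE t
    SET t.processed = 1
    FROM dbo.MyTable t
    JOIN inserted i ON i.id = t.id
    WHERE t.processed = 0;

    -- ... remaining trigger work on the same filtered rows ...
END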


SQLWatch - notifications not being sent

I’m wondering if someone with knowledge/experience of SQLWatch could help me out with something.
We have SQLWatch set up on 2 DEV servers and 1 central monitoring server. It's working fine: the data from the 2 DEV servers is coming over to the central server, and I can see alerts being recorded in the table [dbo].[sqlwatch_logger_check].
However, our issue is that we are not being notified by any means (email, PowerShell script running).
What’s interesting is that if we drop a row into the table [dbo].[sqlwatch_meta_action_queue], then alert notification does happen.
So our issue seems to be that, for some reason, alerts are being raised but the record is not being inserted into the queue table. I suspect some sort of mapping issue, but as it stands it all looks OK. I use the following to check:
SELECT C.check_id, check_name, check_description, check_enabled,
       A.action_description, A.action_exec_type, A.action_exec
FROM [dbo].[sqlwatch_config_check] C
LEFT JOIN [dbo].[sqlwatch_config_check_action] CA ON C.check_id = CA.check_id
LEFT JOIN [dbo].[sqlwatch_config_action] A ON CA.action_id = A.action_id
WHERE C.check_id = -1
And it shows the failed job is set to run our PowerShell script, which it does when the row is manually inserted.
Any ideas on what the cause may be here?
Thanks,
Nic
I am the creator of SQLWATCH.
Firstly, just to clarify: the default notifications that come with SQLWATCH only work in a local scope, i.e. they happen on each monitored instance where @@SERVERNAME = sql_instance. If you are expecting the default notifications to fire from the central server for a remote instance, this will not happen. The default notifications on the central server will only fire for the central server itself, not for data imported from the remote instances. This is done to avoid a situation where the pull into the central repository is infrequent and notifications could therefore be badly delayed.
However, there is nothing stopping you from creating Check Rules or Reports to fire on the back of the imported data.
Secondly, the checks are not alerts per se. Checks are just... well, checks... that run periodically and make sure everything is in order. Checks can trigger an action to send an email. For this, as you have worked out, there is an association table that links together checks and actions.
As for your problem: is the actual action enabled? All actions not associated with a report are disabled by default, as they need to be configured first.
Add a column to your query to bring in the action_enabled column:
SELECT C.check_id, check_name, check_description, check_enabled,
       A.action_description, A.action_exec_type, A.action_exec, [action_enabled]
FROM [dbo].[sqlwatch_config_check] C
LEFT JOIN [dbo].[sqlwatch_config_check_action] CA ON C.check_id = CA.check_id
LEFT JOIN [dbo].[sqlwatch_config_action] A ON CA.action_id = A.action_id
WHERE C.check_id = -1
Or, there is already a view that should provide you with the complete mapping:
SELECT *
FROM [dbo].[vw_sqlwatch_report_config_check_action]
WHERE check_id = -1
The application log table [dbo].[sqlwatch_app_log] should also contain valuable information. Did you look in there for anything out of the ordinary?
Summarising
To enable alerts in a brand new install of SQLWATCH, all that's needed is to set action_exec to your email details and action_enabled to 1. If you have made other changes, it may be easier to reinstall and go back to the defaults.
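For example, a sketch of that change (the WHERE clause and the exact action_exec contents are assumptions; adjust them to the action identified by the queries above and to your mail setup):
UPDATE [dbo].[sqlwatch_config_action]
SET action_exec = '<your email command here>',  -- depends on your SQLWATCH version
    action_enabled = 1
WHERE action_id IN (
    SELECT CA.action_id
    FROM [dbo].[sqlwatch_config_check_action] CA
    WHERE CA.check_id = -1  -- the failed-job check from the question
);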

SQL 2008 All records in column in table updated to NULL

About 5 times a year one of our most critical tables has a specific column where all the values are replaced with NULL. We have run log explorers against this and we cannot see any login/hostname populated with the update, we can just see that the records were changed. We have searched all of our sprocs, functions, etc. for any update statement that touches this table on all databases on our server. The table does have a foreign key constraint on this column. It is an integer value that is established during an update, but the update is identity key specific. There is also an index on this field. Any suggestions on what could be causing this outside of a t-sql update statement?
I would start by denying any client-side dynamic SQL if at all possible. It is much easier to audit stored procedures to make sure they execute the correct SQL, including a proper WHERE clause. Unless your SQL Server is terribly broken, the only way data is updated is because of the SQL you are running against it.
All stored procs, scripts, etc. should be audited before being allowed to run.
If you don't have the mojo to enforce no dynamic client SQL, add application logging that captures each client SQL statement before it is executed. Personally, I would have the logging routine throw an exception (after logging it) when a WHERE clause is missing, but at a minimum you should be able to figure out where data gets blown out next time by reviewing the log. Make sure your log captures enough information that you can trace each statement back to its exact source: assign a unique "name" to each possible dynamic SQL statement, e.g., give each program a 3-character code and number each possible call 1..nn within that program, so you can tell that call "abc123" blew up your data, as well as see the exact SQL that was defective.
ADDED COMMENT
Thought of this later. You might be able to add/modify the update trigger on the table to look at the number of rows updated and prevent the update if that number exceeds a threshold that makes sense for you. I did a little searching and found someone who already wrote an article on this, as in this snippet:
CREATE TRIGGER [Purchasing].[uPreventWholeUpdate]
ON [Purchasing].[VendorContact]
FOR UPDATE AS
BEGIN
    DECLARE @Count int
    SET @Count = @@ROWCOUNT;
    IF @Count >= (SELECT SUM(row_count)
                  FROM sys.dm_db_partition_stats
                  WHERE object_id = OBJECT_ID('Purchasing.VendorContact')
                    AND index_id = 1)
    BEGIN
        RAISERROR('Cannot update all rows', 16, 1)
        ROLLBACK TRANSACTION
        RETURN;
    END
END
Though this is not really the right fix, if you log this appropriately, I bet you can figure out what tried to screw up your data and fix it.
Best of luck
A transaction log explorer should be able to see who executed the command, when, and what exactly the command looked like.
Which log explorer do you use? If you are using ApexSQL Log, you need to enable the connection monitor feature in order to capture additional login details.
This might be like using a sledgehammer to drive in a thumb tack, but have you considered using SQL Server Auditing (provided you are using SQL Server Enterprise 2008 or greater)?
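If you do go that route, a minimal sketch looks something like the following (the audit name, file path, database, and table names are all placeholders; database audit specifications require Enterprise edition on 2008):
USE master;
CREATE SERVER AUDIT [NullColumnAudit]
    TO FILE (FILEPATH = 'C:\SQLAudit\');
ALTER SERVER AUDIT [NullColumnAudit] WITH (STATE = ON);

USE [YourDatabase];
CREATE DATABASE AUDIT SPECIFICATION [NullColumnAuditSpec]
    FOR SERVER AUDIT [NullColumnAudit]
    ADD (UPDATE ON [dbo].[YourCriticalTable] BY [public])
    WITH (STATE = ON);

-- Read the captured events, including the login and the statement text:
SELECT event_time, server_principal_name, statement
FROM sys.fn_get_audit_file('C:\SQLAudit\*', DEFAULT, DEFAULT);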

Databaselink & Instead of Trigger

I have the following task to manage.
We have a database link between server 'A' and server 'B'.
I created tables on Server 'A' and Views on server 'B' pointing to these tables.
For example:
a table customers on server 'A'
and a view customers on server 'B' pointing to the table on server 'A'.
To provide update capability on the view I created an Instead of Update trigger on the view:
CREATE OR REPLACE TRIGGER tudb_customers
    INSTEAD OF UPDATE OR DELETE ON customers
    REFERENCING NEW AS NEW OLD AS OLD
    FOR EACH ROW
DECLARE
    proc_typ_old char;
    proc_typ     char;
BEGIN
    IF updating THEN
        proc_typ := 'U';
    ELSE
        proc_typ := 'D';
    END IF;
    IF proc_typ = 'U' THEN
        UPDATE customers@db_link SET customersname = :new.customersname
        WHERE customersid = :old.customersid;
    ELSE
        DELETE FROM customers@db_link WHERE customersid = :old.customersid;
    END IF;
END tudb_customers;
/
If I try to update the view on server 'B' (update customers set customersname = 'Henry' where customersid = 1), :old.customersid is always null, so the update fails.
The Oracle version is 10.2.0.1.0.
Can anyone help me in this matter? Any ideas?
Greetings,
Chris
This may be a bug, since it seems to work OK in 10.2.0.5. Bug 4386090 ('OLD VALUE RETURN NULL IN "INSTEAD OF" TRIGGER BASED ON DBLINK') sounds, from the diagnostic analysis, like :old values are null within the trigger if it involves a DB link; that seems to have been closed as a duplicate of 4771052 ('INSTEAD-OF trigger does not update tables correctly over dblink'; I can't see more details), which is listed in the 10.2.0.3 patch set notes.
You will need to raise an SR with Oracle to confirm this is the same issue, though if it is, I suspect they won't do more than advise you to patch up, since 10g has been out of support for a while. No workarounds are listed, unfortunately.
If the view is of a single table, which seems to be the case from your description, I'm not sure you even need the trigger; updates and deletes work directly against such a view. Does your view require an INSTEAD OF trigger?
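To illustrate (a sketch; the column list is assumed from the update in the question), a single-table view over a database link is updatable without any trigger:
CREATE OR REPLACE VIEW customers AS
    SELECT customersid, customersname
    FROM customers@db_link;

-- Updates and deletes then go straight through to the remote table:
UPDATE customers SET customersname = 'Henry' WHERE customersid = 1;
DELETE FROM customers WHERE customersid = 2;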
I agree with @AlexPoole that this may well be a bug, and you will probably be advised to apply a patch upon contacting Oracle.
Good point also that updating via the view may not be necessary in your case.
However, at this point, if I were you, I would contemplate whether this is a good way to establish the connection between the clients and the database. I mean connecting an Oracle instance (server 'B') via dblink to the real DB instance (server 'A') and letting clients connect to the real server indirectly via server 'B'. I think it is the kind of hack that at a certain moment seems like an easy way to solve what is probably a networking issue, but later causes further problems, like this one.

Is the SQL Server IsShutdown property useful to determine whether a database is in a good state?

My company has a tool that monitors statuses on servers, services, databases, etc. We monitor a number of on-site servers for our customers. One particular simple check performed is to determine whether a SQL Server database is in a 'good' state by querying for the value of certain database properties. The four database properties we monitor are:
IsSuspect
IsOffline
IsEmergencyMode
IsShutdown
This is the query we use:
SELECT name AS [SuspectDB],
DATABASEPROPERTY(name, N'IsSuspect') AS [Suspect],
DATABASEPROPERTY(name, N'IsOffline') AS [Offline],
ISNULL(DATABASEPROPERTY(name, N'IsShutdown'), 1) AS [Shutdown],
DATABASEPROPERTY(name, N'IsEmergencyMode') AS [Emergency]
FROM sysdatabases
WHERE (DATABASEPROPERTY(name, N'IsSuspect') = 1)
OR (DATABASEPROPERTY(name, N'IsOffline') = 1)
OR (ISNULL(DATABASEPROPERTY(name, N'IsShutdown'), 1) = 1)
OR (DATABASEPROPERTY(name, N'IsEmergencyMode') = 1)
In testing an upgrade to SQL Server 2008, it seems that quite a few of our databases are returning a 1 (true) value for the IsShutdown property. This was never the case previously with SQL Server 2005. The MSDN documentation for the property simply states "Database encountered a problem at startup".
As far as I can tell, the databases are perfectly fine. They are up, can be queried, etc. No issues.
Does the IsShutdown property really matter for my monitoring purposes, i.e., does it indicate that the database is in a bad state? Or should I just remove it from my query?
NOTE: In talking to one of our resident DBAs, they found that on some of our new SQL Server 2008 databases, having the IsAutoClose property enabled might be the reason these databases report IsShutdown as true. Disabling IsAutoClose seems to "fix" IsShutdown being true.
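A quick way to test that theory (a sketch; the database name is a placeholder):
-- See which databases have AUTO_CLOSE enabled:
SELECT name, is_auto_close_on FROM sys.databases;

-- Disable it for a database that reports IsShutdown = 1:
ALTER DATABASE [YourDatabase] SET AUTO_CLOSE OFF;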
Ok, after much investigation into this, this is my conclusion:
Short story: The IsShutdown property is not important for monitoring the status of my databases. Even when set to True, the database is still in a good state.
Long story:
The MSDN definition for the IsShutdown property is incorrect. It reads:
Database encountered a problem at startup
That definition does not make much sense given the name of the property. In practice, the IsShutdown property appears to be directly related to the IsAutoClose property: if Auto Close has cleanly shut down the database once no connections are active, IsShutdown is set to true. Once the database spins back up (so to speak), IsShutdown is set back to false.
This theory is backed up by the fact that there is an is_cleanly_shutdown column in sys.databases whose value always matches IsShutdown.
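A quick comparison query (a sketch) bears this out:
SELECT name,
       is_cleanly_shutdown,
       DATABASEPROPERTY(name, N'IsShutdown') AS [IsShutdown]
FROM sys.databases;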
The bad definition of the property is likely what caused the developer I inherited this query from to include it in his database status check. I have now removed the check of that property from the query.
IsShutDown is set if SQL Server is unable to open a database's files during startup.
So this would be a good issue to know about if you rebooted your server and somebody had moved your database files, or they didn't open because of a disk I/O problem.
I would say that ONLINE and MULTI_USER is the only good state you'd want your database to be in.
select state_desc,user_access_desc from sys.databases
Did you read the footnote for NULL for that property, linked from the MSDN page you referenced?
Returned value is also NULL if the database has never been started or has been autoclosed.
So yes, if you change the AutoClose property, you'll get different results. Coalescing NULL to 1 on this property seems like a bad decision; I'd remove your ISNULL() and ignore NULL values.
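Applied to the original query, that advice looks something like this (a sketch; a NULL IsShutdown now simply fails the comparison and is ignored):
SELECT name AS [SuspectDB],
    DATABASEPROPERTY(name, N'IsSuspect') AS [Suspect],
    DATABASEPROPERTY(name, N'IsOffline') AS [Offline],
    DATABASEPROPERTY(name, N'IsShutdown') AS [Shutdown],
    DATABASEPROPERTY(name, N'IsEmergencyMode') AS [Emergency]
FROM sysdatabases
WHERE DATABASEPROPERTY(name, N'IsSuspect') = 1
    OR DATABASEPROPERTY(name, N'IsOffline') = 1
    OR DATABASEPROPERTY(name, N'IsShutdown') = 1
    OR DATABASEPROPERTY(name, N'IsEmergencyMode') = 1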

SQL Replication "Row Not Found" Error

I have transactional replication running between two databases. I fear they have fallen slightly out of sync, but I don't know which records are affected. If I knew, I could fix it manually on the subscriber side.
SQL Server is giving me this message:
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
I've looked around to try to find out what table, or even better what record is causing the issue, but I can't find that information anywhere.
The most detailed data I've found so far is:
Transaction sequence number: 0x0003BB0E000001DF000600000000, Command ID: 1
But how do I find the table and row from that? Any ideas?
This gives you the table the error is against:
use distribution
go
select * from dbo.MSarticles
where article_id in (
    select article_id from MSrepl_commands
    where xact_seqno = 0x0003BB0E000001DF000600000000)
And this will give you the command (and the primary key, i.e. the row, the command was executing against):
exec sp_browsereplcmds
    @xact_seqno_start = '0x0003BB0E000001DF000600000000',
    @xact_seqno_end = '0x0003BB0E000001DF000600000000'
I'll answer my own question with a workaround I ended up using.
Unfortunately, I could not figure out which table was causing the issue through the SQL Server replication interface (or the Event Log for that matter). It just didn't say.
So the next thing I thought of was, "What if I could get replication to continue even though there is an error?" And lo and behold, there is a way. In fact, it's easy. There is a special Distribution Agent profile called "Continue on data consistency errors." If you enable that, then these types of errors will just be logged and passed on by. Once it is through applying the transactions and potentially logging the errors (I only encountered two), then you can go back and use RedGate SQL Data Compare (or some other tool) to compare your two databases, make any corrections to the subscriber and then start replication running again.
Keep in mind, for this to work, your publication database will need to be "quiet" during the part of the process where you diff and fix the subscriber database. Luckily, I had that luxury in this case.
If your database is not prohibitively large, I would stop replication, re-snapshot and then re-start replication. This technet article describes the steps.
If it got out of sync due to a user accidentally changing data on the replica, I would set the necessary permissions to prevent this.
This replication article is worth reading.
Use this query to find out the article that is out of sync:
USE [distribution]
SELECT * FROM dbo.MSarticles
WHERE article_id IN (
    SELECT article_id FROM MSrepl_commands
    WHERE xact_seqno = 0x0003BB0E000001DF000600000000)
Of course, if you check the error when the replication fails, it also tells you which record is at fault; you could extract that data from the core system and just insert it on the subscriber.
This is better than skipping errors, as SQL Data Compare will lock the table for the comparison, and if you have millions of rows this can take a long time to run.
Tris
Changing the profile to "Continue on data consistency errors" won't always work. It suppresses the error, but the rows that caused it are skipped, so you won't end up with complete, accurate data.
The following checks resolved my problem:
Check that all the replication SQL Agent jobs are working, and start them if they are not. In my case, a job was stopped because a DBA had killed its session a few hours earlier due to a blocking issue. After a very short time, all data in the subscription was updated and there were no further errors in Replication Monitor. (In my case, all of the above queries returned nothing.)
This error usually occurs when a particular record does not exist on the subscriber, but an update or delete command for that same record was executed on the primary server and replicated to the subscriber.
As the record does not exist on the subscriber, replication throws a "Row Not Found" error.
To resolve the error and bring replication back to its normal running state:
We can check with the following queries whether the request at the publisher was an update or a delete statement:
USE [distribution]
SELECT *
FROM msrepl_commands
WHERE publisher_database_id = 1
AND command_id = 1
AND xact_seqno = 0x00099979000038D6000100000000
We can get the article id from the query above and pass it to the proc below:
EXEC sp_browsereplcmds
    @article_id = 813,
    @command_id = 1,
    @xact_seqno_start = '0x00099979000038D60001',
    @xact_seqno_end = '0x00099979000038D60001',
    @publisher_database_id = 1
Its output will show whether the command was an update or a delete statement.
In case of a delete statement:
The record can be deleted directly from msrepl_commands so that replication won't keep retrying it:
DELETE FROM msrepl_commands
WHERE publisher_database_id = 1
AND command_id = 1
AND xact_seqno = 0x00099979000038D6000100000000
In case of an update statement:
You need to insert that record manually from the publisher DB into the subscriber DB:
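A hypothetical sketch of that manual fix, assuming a linked server named PUBLISHER and placeholder table/column names (use the key value reported by sp_browsereplcmds above):
INSERT INTO dbo.MyTable (MyTableID, SomeColumn)
SELECT MyTableID, SomeColumn
FROM PUBLISHER.MyDatabase.dbo.MyTable
WHERE MyTableID = 12345;  -- the row that was "not found" on the subscriber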
