I encountered a situation today from a client who had a replication failure in their MSSQL database. Short version, they had a record deleted manually from the Subscriber, then realized it was the wrong location and deleted it from the Publisher. And if you know anything about replication, you know exactly what happened next…
I was able to direct them to a link to help troubleshoot the issues, over on SQL Server Replication Troubleshooting, but as I was waiting to hear if they resolved the issue, I thought “I should just script this out to pull the most recent error…” So, I threw together this little snippet for them: [Note: the issue with this script mentioned at the end of the post has been fixed – you can access the updated version of the script here.]
USE distribution GO DECLARE @comandID INT DECLARE @publisherID INT DECLARE @articleID INT DECLARE @error_seqno VARBINARY(16) DECLARE @command_seqno VARBINARY(16) --Select top 10 * from msrepl_errors --order by id desc SELECT TOP 1 @error_seqno = xact_seqno , @comandID = command_id FROM msrepl_errors WHERE error_code > '' ORDER BY id DESC --select * --from msrepl_commands (nolock) --where command_id = @comandID and xact_seqno = @error_seqno SELECT TOP 1 @publisherID = publisher_database_id , @articleID = article_id , @command_seqno = xact_seqno FROM msrepl_commands (NOLOCK) WHERE command_id = @comandID AND xact_seqno = @error_seqno --EXEC sp_browsereplcmds @article_id = @articleID -- , @command_id = @comandID -- , @xact_seqno_start = @command_seqno -- , @xact_seqno_end = @command_seqno -- , @publisher_database_id = @publisherID SELECT 'Insert this value into the @xact_seqno_start and @xact_seqno_end properties below, then review the command field output: ' , @command_seqno EXEC sp_browsereplcmds @article_id = @articleID , @command_id = @comandID , @xact_seqno_start = '0x0002BC6D000125120035' , @xact_seqno_end = '0x0002BC6D000125120035' , @publisher_database_id = @publisherID GO
Now, admittedly, that last bit is a little ugly, but when I used the varbinary(16) variable, the command returned a blank. I suspect it is probably a type conversion problem in the sp_browsereplcmds call, but I didn’t want to spend a lot of time debugging this particular snippet, since they were under a little time pressure to fix the problem.
Using this script, they were able to identify the IDs of the missing records, and manuall insert some temporary records to allow the replication job to execute the failing delete steps.