There
can be many reasons for the size of Transaction Log growing large for a
sql server database such as waiting on checkpoint, backups in Full
recovery model, active long running transaction.
The
simplest and the smartest way to determine the cause of the Transaction
Log in sql server 2005 is with the use of columns log_reuse_wait,
log_reuse_wait_desc in sys.databases view in Sql Server 2005 and 2008.
So we fire the following query to find the cause of the Transaction Log growth and observe the following
select log_reuse_wait_desc,name from sys.databases
log_reuse_wait_desc name
------------------------------------------------------------ -------------
REPLICATION test
And to
our surprise the log_reuse_wait_desc column reflects REPLICATION for a
particular database which has snapshot replication configured on it.
So we
conclude the Log growth is caused due to Snapshot Replication which is
kind of ironical since Snapshot Replication does not use Transaction
logs to replicate data to the remote site.
So to
confirm this we fire the DBCC OPENTRAN which gives the undistributed LSN
for the Transaction log and we get an undistributed LSN value as shown
below
DBCC OPENTRAN(13)
Transaction information for database 'test’.
Replicated Transaction Information:
Oldest distributed LSN : (0:0:0)
Oldest non-distributed LSN : (43831:51:1)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
So this
confirms that above undistributed transaction is causing the Log
Growth. But in Snapshot Replication the changes are replicated using
Snapshots and not using Transaction Logs.
We have
a Known issue with Snapshot Replication in Sql Server 2005 which causes
DDL changes on the tables which have SCHEMA CHANGES marked for
replication where in the DDL statements are marked for Replication in
Transaction log of the database but they are not unmarked when the
changes are actually propagated.
In order to confirm whether we are hitting the issue mentioned above we need to check whether that undistributed LSN ((43831:51:1) in above case)
provided in DBCC OPENTRAN output is pointing to DDL operations of type ALTER TABLE.
The
issue is yet to be fixed and hence the workaround to avoid the issue is
to either remove the REPLICATION OF SCHEMA CHANGES for the tables
involved in the replication
Or
Use the following commands after running Snapshot Agent and applying the Snapshot on the subscriber.
sp_repldone null, null, 0,0,1
Before
running the above command we need to ensure that SCHEMA changes are
actually being replicated in order to avoid any inconsistency
Note:
This
post should not be treated as the Microsoft’s Recommendation or
Resolution to the problem, it is only a workaround which worked in our
environment and so we would like to share it.