Monday, August 6, 2012

Ghost Buster


Scenario:

A few months ago a client asked me to have a look at some of their scheduled tasks because they were running for more than 24 hours, and they should be running for less than 1 minute. We stop and restarted these tasks but they just kept on running. The SP_WHO2 procedure revealed that the tasks we BLOCKED by another task but this was a system task, “TASK MANAGER” that was accessing the “MSDB” system database. We’ve tried everything, from rebooting the server, trying to KILL the task etc. but were unsuccessful. The CPU usage was 100% all the time (even after restarting the server).


Research:

Then I came across a blog that explained about the “GHOST CLEANUP” system task. Basically what happens is as follows:

 When a record gets deleted in an index on a table, it doesn’t get deleted physically but only get marked as a ghost record. Even a NOLOCK or READ UNCOMMITTED scan won't return these records. In this process, the page where this ghost record is on gets marked in one of the page allocation maps (PFS =Page Free Space) as well as the database state to indicate that there is ghost records to be cleaned up. The GHOST CLEANUP system task executes every 5 seconds and look for databases with ghost records to be cleaned up. There is a limit of 10 pages that it will cleanup per database at a time to prevent the swamping of the system.  

 Code:

The following code will display the GHOST CLEANUP command:


SELECT * INTO myexecrequests FROM sys.dm_exec_requests WHERE 1 = 0;

GO

SET NOCOUNT ON;

GO

DECLARE @a INT

SELECT @a = 0;

WHILE (@a < 1)

   BEGIN

   INSERT INTO myexecrequests

      SELECT * FROM sys.dm_exec_requests

         WHERE command LIKE '%ghost%'

   SELECT @a = COUNT (*) FROM myexecrequests

   END;

GO

SELECT * FROM myexecrequests;



Solution:

Now, to get back to the problem. This GHOST CLEANUP was swamping the system (that it is not supposed to happen) preventing the SQL Agent tasks to complete.

How do I stop this system task” Remember you can’t KILL a system command and the GHOST CLEANUP task restart as soon as you reboot the server.

Running a DBCC CHECKDB against the MSDB system database revealed that there was index errors – this was the cause the GHOST CLEANUP to not finish and subsequently blocking all the SQL Agent jobs (there info is stored in MSDB).


A DBCC CHECKDB (msdb, REPAIR_ALLOW_DATA_LOSS) command is required to fix the problem BUT the database (MSDB) must be in single user mode:


So:


Alter Database msdb
     Set Single_user 
           With Rollback Immediate

GO

DBCC CHECKDB (msdb, REPAIR_ALLOW_DATA_LOSS)

GO

Alter Database msdb
    Set Multi_user 
        With Rollback Immediate

GO


Now the GHOST CLEANUP could finish and all the SQL Agent task finished.



Remember :   Life is not a destination…..it’s  journey!!”

 Refferance :

       www.SQLSkills.com

                  In recover

                        by Paul S, Randal






1 comment:

  1. Thanks Buks, always great to see some lessons from real life cases. Great job!

    ReplyDelete