SQL Server 2005 introduced Service Broker, a message queue platform for decoupling database services from one another. It's pretty much just standard message-based Service Oriented Architecture type fare except for one thing: it's persistent.
Not only are the messages persistent but the conversations are too. A conversation contains metadata describing a communication between services. By default, they are persisted in the database with an indefinite lifetime. Because they are persistent, you must be very careful to close conversations so they can be cleaned up. Somehow that slipped past us... It also happened that tempdb was filling up at a prodigious rate with internal objects associated with Service Broker. Uh oh.
Moreover, there's a non-trivial cost associated with maintaining these conversations. Particularly if they pile up. Some people have recommended recycling or pooling conversations to improve performance. (Somehow having to resort to such hackery just seems broken to me for a critical piece of infrastructure like Service Broker.)
So we managed to pile up over 20 million conversations. Also due to some strange regression involving the loss of the Database's master key (no idea how that happened) we had over 12 million undeliverable messages in the transmission queue. You know it's bad when running
SELECT COUNT(*) FROM sys.transmission_queue WITH (NOLOCK) or
SELECT COUNT(*) FROM sys.conversation_endpoints WITH (NOLOCK) takes up to half an hour to run and returns an 8 digit number.
As we were running SQL Service 2005 Service Pack 1, there were only two viable options: write a loop to call
END CONVERSATION ... WITH CLEANUP for each stale conversation or backup and restore the DB without its queues. As you can expect, the loop was very slow, even when running batches of 100 handles at a time and running multiple parallel instances side by side. You can't just TRUNCATE your queues or anything. It would have taken a day or two to clean up.
It turns out that Service Pack 2 includes a very noteworthy enhancement (though I couldn't find mention of it in the public release announcements). To completely blow away all Service Broker conversation state in 1 second do this:
ALTER DATABASE ... SET NEW_BROKER WITH ROLLBACK IMMEDIATE. It was orders of magnitude faster to do the upgrade to SP2, wipe the state with SET NEW_BROKER and then carry on! (Remark: You may need to do some extra work to repair Service Broker routes after issuing this command since it will produce a new broker instance id.)
Ahhh... the benefits of constant-time recovery strategies!
I made a little checklist of things to watch for with Service Broker deployments: