Walkthrough of diagnosing a SQL Server Error – Part 3
- by Scott Whigham on June 10, 2009 2:46 PMSo here’s where we are: In Part 1, I explained the initial error. In Part 2, I began the main troubleshooting process. In this entry, I continue troubleshooting based off of what I learned in Parts 1 & 2.
After I found out there were potential disk errors, I realized that I likely needed a minimum of 30 minutes downtime up to a likely two hours. Sure, it could go beyond that but I did not tell Windows 2003 to repair all errors therefore the time to fix was hopefully shortened. Since I needed downtime and it was the middle of the week, I had two things to consider:
- The SQL Server is up and running. The websites are up and running. The websites are online and able to write to the database.
- If I could wait until Friday night or Saturday morning for the downtime, it would have the least amount of impact on users.
So that’s great news – that the server is up and running and the websites are online. The only negative, right now, is that I’m running naked – no backups since yesterday. Was I comfortable running naked for another 48-72 hours?
Yes.
Fourth action: Plan for disaster
The worst thing that could happen right now is that I made a change that caused the server to crash. I’d already tried stopping a job (which failed) so I knew there were serious problems. I didn’t want to do anything that destroyed the server state right now. The first thing I needed to do was to come up with a plan on how to get the data out of the “live” production database. Enter SSIS!
Since I could still connect to the server using Windows authentication (BTW – even Dedicated Administration Connections would fail saying that a DAC already exists – wasn’t me!), I fired up SSIS and hooked into a separate instance. For the LearnItFirst database (you know – the company that has awesome SQL Server training videos), there are only 20 or so tables that have changed in a meaningful way since yesterday (Orders, Members, etc). So I created an SSIS package to export that data out and I took a backup.
- Step 1: Create a new database on a separate instance
- Step 2: Load the database with the data from the 20 or so tables that have changed since yesterday
Yay! Now I have a “backup”". While it isn’t a traditional SQL Server backup like I would like to get, if worse comes to worst I can be back online in minutes now.
Fifth action: Install another instance of SQL Server 2008
I tried creating a new database on the “affected” server but was unable to – the connection immediately entered a SUSPENDED state. So, in the event that the current state of affairs deteriorated, I wanted to be able to “flick a switch” and turn on a “new” database. SSIS did the backup for me – now I needed somewhere to store it. I installed SQL Server 2008 along with SP1 (check out LearnItFirst’s SQL Server 2008 tutorials for help doing this on your own) and created a new database to store the migrated data. You might be wondering, “Wait – he said he thought his drive was failing. Why not install on a separate machine?” Good question – the reason is that this particular server was running on a dedicated server from a webhost. I only had one server. If I wanted another server, I’d have to wait the 12-24 hours for them to build it. I needed this done NOW. I took a backup of the backup (that is, I took a SQL Server backup of the scripted-out-tables backup I did using SSIS), zipped it, and downloaded it to a separate server in case the drive failed.
Despite having a backup, I’m not fully prepared for the worst-case scenario yet. You see, my quick-and-dirty SSIS backup does three things:
- Scripts out the 20 or so tables I needed up-to-date data from
- Creates the tables in a new database
- Loads the tables with complete data from the original database
What it doesn’t do/have:
- Only copy data that has changed since last backup
- New database does not have all other tables
Sixth action: Restore yesterday’s backup onto new instance
Now that I have the new instance installed (let’s call it “LIF2” for short), I need a database to start with. I restored last night’s full database backup and now it’s time to synch the database with the live copy.
TIME OUT – Question time!
There are likely to be a few questions – let me try to answer them:
- Did you try logging in using SQL Server’s Dedicated Administration Connection?
- Yes – would not allow it. I tried from the New Query button and from sqlcmd. The error was that there was already a DAC connection. sp_who showed that my connection and the app connections were the only “live” connections though.
- Are you ever going to tell us how you fixed it?
- Eventually… hopefully… It’s still on-going as I post this!
- Why are posting here instead of fixing it?
- It’s one of those things where I’ve done everything I can during peak time – now I’m waiting for a non-peak time to do more!
'I’ll post more tomorrow on this…




Leave a comment