This afternoon, the program that adds records to my database started timing out on every transaction. Although I could connect to the database and execute queries, all updates would time out. When I logged in to the SQL Server machine, I noticed that it was responding very slowly. It took a few minutes of poking around before I finally hit on the idea of checking the event log. There I found this message, which was repeated every two minutes:
Autogrow of file ‘Media_log’ in database ‘Media’ was cancelled by user or timed out after 120032 milliseconds. Use ALTER DATABASE to set a smaller FILEGROWTH value for this file or to explicitly set a new file size.
The log file was set to grow automatically by 10% whenever it gets full. What I didn’t realize is that growing the log file involves more than just allocating another bunch of sectors from the disk. Whatever SQL Server does to grow the log file takes longer than two minutes when the log file is 100 gigabytes in size. To make matters worse, my update program had over 40,000 updates pending and no way for me to store them on disk for later. An oversight on my part, I’ll admit. When you’re blowing and going trying to get something up and working, you tend to let silly things like disaster recovery fall by the wayside.
I tried a number of different things before I took a stab in the dark and decided to create a new log file, hoping that SQL Server would give up on the old overflowed file and write to the new one. Fortunately that worked and my data is now safely ensconced in SQL Server and backed up.
I obviously have a bit to learn about administering a SQL Server installation. For now, I’ve backed everything up and am in the process of shrinking the database before bringing my application up again. Tomorrow I’ll be reading up on the different recovery models and implementing a regular backup schedule so I don’t have to go through this again.