Not so common problems
There is a subset of problems that happen very, very rarely, and that require a set of coincidences to line up before they can occur. The chances of these events actually happening are so small as to seem negligible. In modern computing, they happen anyway.
A few days ago a customer complained that their Outlook Web App would not load; IIS was giving “file not found” errors all over the page, along with other nasty pieces of info. I verified the report and immediately looked in the event log of the offending server. I was confronted with a whole lot of messages indicating that the c:\windows\system32\inetsrv\config\ApplicationHost.config file had been truncated.
In that situation, the first thing to do was to check that the file had indeed been truncated (it had) and fix it. Here, though, I want to look first at why it had been truncated.
From Googling around, it seems that this happens now and again with various AV products, including, in our case, Microsoft’s own built-in tools. The logic appears to be that IIS rewrites the file periodically, and if the AV software locks the file during a rewrite, the end of the file fails to write. IIS does keep a history of these config files (as we’ll see in a minute), but it would seem to make more sense to copy the file to the history than to move the file and then copy it back, as seems to happen.
The config file itself lives at c:\windows\system32\inetsrv\config\ApplicationHost.config and is an XML file, so it is very easy to spot the unclosed tags that indicate that the file has been truncated. Where exactly the file gets cut off depends entirely on the specific case, and it could be anywhere in the file. This makes Google searching a touch more complex, as finding a report where the truncation is in exactly the same place in a 200KB file is not really all that likely.
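Because the file is XML, the "is it truncated?" check can be done by a parser rather than by eye. What follows is only a minimal sketch in Python, assuming Python is available on (or the file can be copied to) a machine where you can run it. The function name is_well_formed is my own invention and not part of any IIS tooling. It only tells you whether the file parses as XML, not whether the configuration inside it is correct.

```python
import xml.etree.ElementTree as ET


def is_well_formed(path):
    """Return (True, None) if the file parses as XML, else (False, parser message).

    Hypothetical helper: a truncated file normally ends inside an open
    element, which the parser reports as an error with a line and column.
    """
    try:
        ET.parse(path)
    except ET.ParseError as exc:
        # The message includes where the parser gave up, which is usually
        # at or near the point where the file was cut off.
        return False, str(exc)
    return True, None
```

Calling is_well_formed(r"c:\windows\system32\inetsrv\config\ApplicationHost.config") on a damaged file should come back with False and a message pointing at roughly where the content stops.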
So, what are the possible fixes?
The quickest fix would be to replace the file with a known-good earlier copy. This could come from Shadow Copies on the volume, from a storage snapshot, or from a backup. Restoring from a backup would take longest; a copy from Shadow Copies is just a few clicks, so it is probably the easiest way to recover.
Unfortunately for me, this was a brand new server, with volume shadow copy turned off, so I had no backup to grab the file from.
A copy from another server performing the same function looked promising, and gave the OWA login page back, but actual logins failed from that point, so I went back to the drawing board.
My next step was to have a look around IIS. There is a folder at c:\windows\system32\inetsrv\history which looked promising, but which doesn’t keep copies of the applicationHost.config file. Some deeper digging gave me C:\inetpub\history\CFGHISTORY_*, folders which appear to be copies of the IIS config and which *do* contain the file I needed. The day was saved.
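If there are a lot of CFGHISTORY_* folders to wade through, the hunt for a usable copy can be scripted. This is a rough sketch, not an official procedure: the function name newest_good_config is hypothetical, it assumes the history lives under C:\inetpub\history as it did on my server, and it assumes that file modification time is a fair guide to which snapshot is the most recent. It only reports a path; it does not copy anything back.

```python
import glob
import os
import xml.etree.ElementTree as ET


def newest_good_config(history_root=r"C:\inetpub\history",
                       name="applicationHost.config"):
    """Return the most recently modified history copy that parses as XML.

    Hypothetical helper. Returns None if no snapshot contains a
    well-formed copy of the file.
    """
    pattern = os.path.join(history_root, "CFGHISTORY_*", name)
    # Newest first, judged by modification time (an assumption, see above).
    candidates = sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
    for path in candidates:
        try:
            ET.parse(path)
        except ET.ParseError:
            # This snapshot is damaged too, so try the next oldest one.
            continue
        return path
    return None
```

Skipping copies that fail to parse matters here, since a snapshot taken after the damage occurred could be just as truncated as the live file.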
To quote Phantom’s CARLOTTA: “Si! These things do happen!” even though they shouldn’t. The best solution is to have reliable, tested backups. In this case, having volume shadow copies available would have been a nice quick solution. But the ultimate answer is knowing your software. That IIS backs up files itself is a very good thing; that it isn’t too well documented is not great; that the backups are in two different places is just one of those things. Once you’ve been bitten by it, you will remember it well.