Windows Server Domain Controller Crashes on Boot
We had a legacy Active Directory domain that had only one domain controller. Yeah, yeah, I know that is a terrible idea and we should have fixed it. The plan was to move everyone off this legacy AD and onto our new domain (which has plenty of geographically distributed domain controllers).
Before this company and its AD were acquired, their previous small-business IT consultant decided to stick 2 TB laptop SATA SSDs in their nice Dell R7xx server to save money and make the server faster. What that actually ensured was lots of problems, because those drives were never designed to run in a RAID or to tell the server when they were going to fail.
Long story short, one of the laptop drives that held this domain controller's VM died. We don't know when, but it died. Then the second drive in the mirror set also started to die, and the VM started randomly turning off, because obviously the hypervisor can't run the VM if the VHD just disappears.
We were able to pull the dead laptop SSD, leave the failing laptop SSD in place, import the RAID config, and get the VHD to at least appear. When we went to boot the VM, Windows Server kept crashing on boot. It would get partway through the Windows boot process and fail. No blue screen, just a crash, and then it would power cycle.
We tried Windows recovery, safe mode, and safe mode with networking, and none of them worked.
Thank goodness one of my engineers searched the internet and found this post:
Here is a straight-up copy from that link in case the site or content disappears at some point. I don't want the world to lose this fix:
There are several reasons you may get this error. The most common being a corrupt Active Directory database (NTDS.DIT). I know this sounds detrimental, but it's actually easy to fix this blue screen.
*** This is the Active Directory Database we're talking about here, so make sure you have a good backup of the server, in case this doesn't work ***
This Stop code is only seen on a system with Active Directory on it. You notice it when the server is booting. You'll get a blue screen and an error code, like the following:
STOP: c00002e2 Directory Services could not start because of the following error:
A device attached to the system is not functioning.
Error Status: 0xc0000001.
Please shutdown this system and reboot into Directory Services Restore Mode, check the event log for more detailed information.
To begin, do as the message says, and boot into Directory Services Restore Mode. When the server powers on, press F8 before the OS begins to load. On the selection screen that appears, choose Directory Services Restore Mode.
Once in Directory Services Restore Mode, you can check if there is a problem with the database by starting ntdsutil from a command prompt and entering the following at its prompt:
activate instance ntds
If there is a problem with it, you'll see something like this returned:
Could not initialize the Jet engine: Jet Error -501. Failed to open DIT for AD DS/LDS instance NTDS. Error -2147418113
To fix, just rename all of the .log files located in C:\windows\ntds\ to .log.old, or anything else, so they can be recreated.
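If there are a lot of log files, the rename step above can be scripted. This is only a minimal sketch in Python, and it assumes Python is available on the server, which the original post never mentions. The function name rename_log_files and the ".old" suffix default are made up for illustration. It refuses to overwrite a file that has already been renamed.

```python
from pathlib import Path


def rename_log_files(ntds_dir, suffix=".old"):
    """Rename every *.log file in ntds_dir to *.log<suffix>; return the new paths.

    Hypothetical helper, not part of any Windows tooling.
    """
    renamed = []
    # sorted() builds the full listing before any file is renamed.
    for log_file in sorted(Path(ntds_dir).glob("*.log")):
        target = log_file.with_name(log_file.name + suffix)
        if target.exists():
            # Never overwrite an earlier rename; stop so a human can look.
            raise FileExistsError(f"{target} already exists")
        log_file.rename(target)
        renamed.append(target)
    return renamed
```

From Directory Services Restore Mode you would call it as rename_log_files(r"C:\windows\ntds"), using the path from the step above. Files that do not end in .log, such as ntds.dit itself, are left alone.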
Now reboot the server. For most people, this fixed the database, and the server booted up. For others, it still blue screened after this. If you continue to get a blue screen, run the following command in Directory Services Restore Mode, and then reboot:
esentutl /p "c:\windows\ntds\ntds.dit"
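That command works directly on ntds.dit, and a repair of this kind may discard data it cannot recover, so it is worth copying the file somewhere safe first, in line with the backup warning earlier. Here is a minimal sketch of that copy in Python, again assuming Python is available on the server. The function name backup_database and the destination folder are hypothetical, not something the original post specifies.

```python
import shutil
from pathlib import Path


def backup_database(dit_path, backup_dir):
    """Copy the database file into backup_dir and return the path of the copy.

    Hypothetical helper; run it before attempting the repair.
    """
    source = Path(dit_path)
    destination_dir = Path(backup_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    destination = destination_dir / source.name
    if destination.exists():
        # Keep any earlier backup intact rather than overwriting it.
        raise FileExistsError(f"{destination} already exists")
    # copy2 copies the contents and preserves timestamps where it can.
    shutil.copy2(source, destination)
    if destination.stat().st_size != source.stat().st_size:
        raise OSError("backup size does not match the original")
    return destination
```

For example, backup_database(r"c:\windows\ntds\ntds.dit", r"D:\ntds-backup") would copy the database to a made-up D:\ntds-backup folder. Pick a destination on a disk you actually trust, which in our case was definitely not the failing laptop SSD.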