Replication error 1127 While accessing the hard disk, a disk operation failed even after retries
Updated: September 17, 2012
Applies To: Windows Server 2000, Windows Server 2003, Windows Server 2008, Windows Server 2008 R2
This article describes symptoms, cause and resolution steps for cases where Active Directory operations fail with Win32 error 1127: "While accessing the hard disk, a disk operation failed even after retries."
Symptoms
Causes
Resolutions
Symptoms
The DCPROMO promotion of a new domain controller fails with error 1127: While accessing the hard disk, a disk operation failed even after retries.
The on-screen error displayed in DCPROMO is:
Dialog title text: Active Directory Installation Wizard
Message text:
The operation failed because:
Active Directory could not replicate the directory partition <DN path of failing partition> from the remote domain controller <fully qualified computer name of helper DC>.
"While accessing the hard disk, a disk operation failed even after retries."
DCPROMO.LOG contains the following text:
[INFO] Replicating the <partition name> directory partition
[INFO] Error - Active Directory could not replicate the directory partition <partition DN> from the remote domain controller <helper DC>. (1127)
[INFO] NtdsInstall for <DNS domain> returned 1127
[INFO] DsRolepInstallDs returned 1127
[ERROR] Failed to install to Directory Service (1127)
DCDIAG reports that the Active Directory Replications test has failed with error status (1127): While accessing the hard disk, a disk operation failed even after retries. Sample error text from DCDIAG is:
Testing server: <site>\<DC name>
Starting test: Replications
* Replications Check
[Replications Check,<DC name>] A recent replication attempt failed:
From <source DC> to <destination DC>
Naming Context: DC=<DN path>
The replication generated an error (1127):
While accessing the hard disk, a disk operation failed even after retries.
The failure occurred at <date> <time>.
The last success occurred at (never)| <date>.
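When several DCs need to be checked, the fields in the sample above can be pulled out of saved DCDIAG output with a small script. The following is a minimal Python sketch, not a DCDIAG feature. It assumes the saved output puts each field on its own line and uses the wording of the sample text in this article; the function name `find_replication_failures` and the regular expressions are illustrative assumptions, and real output may differ between Windows versions.

```python
import re

# Patterns modeled on the sample DCDIAG text in this article (assumption:
# each field appears on its own line with this wording).
FROM_TO = re.compile(r"^\s*From\s+(?P<source>\S+)\s+to\s+(?P<dest>\S+)")
NAMING_CONTEXT = re.compile(r"Naming Context:\s*(?P<nc>.+?)\s*$")
ERROR_CODE = re.compile(r"The replication generated an error \((?P<code>-?\d+)\)")


def find_replication_failures(dcdiag_text, status=1127):
    """Return one dict per replication failure that reports `status`."""
    failures = []
    current = {}
    for line in dcdiag_text.splitlines():
        match = FROM_TO.search(line)
        if match:
            # A new "From ... to ..." line starts a new failure record.
            current = {"source": match.group("source"), "dest": match.group("dest")}
            continue
        match = NAMING_CONTEXT.search(line)
        if match:
            current["naming_context"] = match.group("nc")
            continue
        match = ERROR_CODE.search(line)
        if match and int(match.group("code")) == status:
            failures.append(dict(current, status=status))
    return failures
```

Each returned dict names the source DC, destination DC and naming context of a failed replication attempt, which is the same information needed later to find the matching NTDS Replication events.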
REPADMIN.EXE reports that the last replication attempt has failed with status 1127.
REPADMIN commands that commonly cite the 1127 status include but are not limited to:
- REPADMIN /REPLSUM
- REPADMIN /SHOWREPL
- REPADMIN /SHOWREPS
- REPADMIN /SYNCALL
The Replicate now command in Active Directory Sites and Services returns "While accessing the hard disk, a disk operation failed even after retries."
Right-clicking the connection object from a source DC and choosing Replicate now fails with "While accessing the hard disk, a disk operation failed even after retries." The on-screen error message is shown below:
Dialog title text:
Replicate Now
Dialog message text:
The following error occurred during the attempt to synchronize naming context <DNS name of directory partition> from domain controller <source DC> to domain controller <destination DC>: While accessing the hard disk, a disk operation failed even after retries. This operation will not continue.
NTDS KCC and NTDS Replication events with the 1127 status are logged in the Directory Services log in Event Viewer.
Active Directory events that commonly cite the 1127 status include but are not limited to:
Event ID | Event Source | Event String
1926 | NTDS KCC | The attempt to establish a replication link to a read-only directory partition with the following parameters failed.
1084 | NTDS Replication | Internal event: Active Directory could not update the following object with changes received from the following source domain controller. This is because an error occurred during the application of the changes to Active Directory on the domain controller.
1699 | NTDS Replication | The local domain controller failed to retrieve the changes requested for the following directory partition. As a result, it was unable to send the change requests to the domain controller at the following network address.
2108 | NTDS Replication | This event contains REPAIR PROCEDURES for the 1084 event which has previously been logged. This message indicates a specific issue with the consistency of the Active Directory database on this replication destination. A database error occurred while applying replicated changes to the following object. The database had unexpected contents, preventing the change from being made.
NTDS Replication event 2108 may be logged in the Directory Services Event log citing the object, source DC and jet error that is triggering the logging of the 1127 status in on-screen errors, logged events and diagnostic tool output.
Jet errors known to appear in NTDS Replication event 2108 with status 1127 include but are not limited to:
Jet Error (decimal) | Symbolic Error | Error string
-510 | JET_errLogWriteFail | Failure writing to log file
-1018 | JET_errReadVerifyFailure | Checksum error on a database page
-1019 | JET_errPageNotInitialized | Blank database page
-1021 | JET_errDiskReadVerificationFailure | The OS returned ERROR_CRC from file IO
-1022 | JET_errDiskIO | Disk IO error
-1605 | JET_errKeyDuplicate | Illegal duplicate key
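When reviewing many 2108 events, it can help to keep the Jet errors listed above in a small lookup structure. The following Python sketch simply encodes that list; the names `JET_ERRORS` and `describe_jet_error` are illustrative, and the list is not exhaustive, so an unknown code returns `None` rather than a guess.

```python
# Jet errors listed in this article: code -> (symbolic name, error string).
JET_ERRORS = {
    -510: ("JET_errLogWriteFail", "Failure writing to log file"),
    -1018: ("JET_errReadVerifyFailure", "Checksum error on a database page"),
    -1019: ("JET_errPageNotInitialized", "Blank database page"),
    -1021: ("JET_errDiskReadVerificationFailure", "The OS returned ERROR_CRC from file IO"),
    -1022: ("JET_errDiskIO", "Disk IO error"),
    -1605: ("JET_errKeyDuplicate", "Illegal duplicate key"),
}


def describe_jet_error(code):
    """Return 'code: symbol (text)' for a listed Jet error, else None."""
    number = int(code)  # accepts -1018 or "-1018"
    entry = JET_ERRORS.get(number)
    if entry is None:
        return None
    symbol, text = entry
    return f"{number}: {symbol} ({text})"
```

Note that the sign matters: `describe_jet_error(1018)` returns `None`, because Jet errors are negative numbers.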
NTDS ISAM events may be logged in the Directory Services event log indicating the existence of jet errors related to the 1127 status appearing in other on-screen errors, logged events and diagnostic tool output.
Event Source and Event ID | Event ID Text
NTDS ISAM 474 | The database page read from the file <drive:\path\ntds.dit> at offset <decimal offset> (<hex offset>) for <decimal page size> (<hex page size>) bytes failed verification due to a page checksum mismatch…. The read operation will fail with error <decimal jet error> (<hex jet error>). If this condition persists then please restore the database from a previous backup. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.
NTDS ISAM 475 | The database page read from the file <drive:\path\ntds.dit> at offset <decimal offset> (<hex offset>) for <decimal page size> (<hex page size>) bytes failed verification due to a page number mismatch…. The read operation will fail with error <decimal jet error> (<hex jet error>). If this condition persists then please restore the database from a previous backup. This problem is likely due to faulty hardware. Please contact your hardware vendor for further assistance diagnosing the problem.
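The NTDS ISAM events above report the Jet error in both decimal and hexadecimal form. Because Jet errors are negative numbers, the hexadecimal value is normally the 32-bit two's complement of the decimal value (for example, -1018 appears as 0xfffffc06). The following Python sketch converts between the two forms so that a value seen in one place can be matched against the other; the function names are illustrative, and the 32-bit width is an assumption about how the event formats the number.

```python
def jet_error_to_hex(code):
    """Format a signed Jet error as 32-bit two's-complement hex."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def hex_to_jet_error(text):
    """Convert a 32-bit hex string such as '0xfffffc06' to a signed integer."""
    value = int(text, 16)
    # Values with the top bit set represent negative numbers.
    return value - 0x100000000 if value >= 0x80000000 else value
```

For example, `jet_error_to_hex(-1018)` gives `0xfffffc06`, and `hex_to_jet_error("0xfffffc06")` gives `-1018`.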
Causes
Active Directory is unable to write to the Active Directory database or log files. Root causes include:
- Software on the local machine is interfering with the ability of Active Directory to write changes to the Active Directory database or log files.
- A defect exists in the disk subsystem, including the motherboard, drive controller, firmware, drivers, or physical drives.
Resolutions
1. Locate NTDS Replication 1084 events in the Directory Services event log
For DCs logging the 1127 status, open the Directory Service Event log and focus on NTDS Replication event 1084.
NTDS Replication Event 1084 indicates that Active Directory could not write updates to an object in its local copy of Active Directory.
Metadata in Event 1084 identifies:
- The DN path (and thus the object's host partition) that could not be updated
- The objectGUID for the object in question
- The fully qualified CNAME record of the source DC that is sending the update
2. Locate the NTDS Replication Event 2108 logged immediately following each NTDS Replication 1084 event and identify the Jet error logged in the 2108 event
NTDS Replication event 2108 is the "User Action" for the NTDS Replication 1084 event.
For every NTDS Replication 1084 event logged, there should be a corresponding NTDS Replication 2108 event in the Directory Services event log. The 2108 event cites (1) the same object DN path, (2) the same objectGUID and (3) the same source DC as the preceding 1084 event, plus a Jet error that scopes the cause and your recovery plan.
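The pairing rule above can be sketched in a few lines of Python. This is an illustration only: it assumes the events have already been exported and parsed into dictionaries, and the keys `id`, `dn`, `object_guid`, `source_dc` and `jet_error` are hypothetical names chosen for this example, not fields defined by Windows.

```python
def pair_1084_with_2108(events):
    """Pair each 1084 event with the next 2108 event for the same object.

    `events` is a chronological list of dicts using the hypothetical keys
    'id', 'dn', 'object_guid', 'source_dc' and, for 2108, 'jet_error'.
    Returns a list of (event_1084, event_2108_or_None) tuples.
    """
    def key(event):
        # Compare case-insensitively on DN, objectGUID and source DC.
        return (
            event["dn"].lower(),
            event["object_guid"].lower(),
            event["source_dc"].lower(),
        )

    pairs = []
    for index, event in enumerate(events):
        if event["id"] != 1084:
            continue
        match = None
        for later in events[index + 1:]:
            if later["id"] == 2108 and key(later) == key(event):
                match = later
                break
        pairs.append((event, match))
    return pairs
```

A 1084 event that comes back paired with `None` has no matching 2108 event in the exported data, so the Jet error for that object has to be found some other way.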
3. Execute the action plan for the Jet error logged in your NTDS Replication Event 2108
If the Jet error logged in your NTDS Replication events is listed in the table below, execute the user action; otherwise, skip to step 4.
Jet Error (decimal) | Symbolic Error + Error String | User Action
-510 | JET_errLogWriteFail / Failure writing to log file
A log write failure occurred on the destination DC.
Check disk, partition and file system health on the destination DC.
Check the destination DC for software, such as antivirus software, that may be creating locks on Active Directory log files.
See whether the problem persists following a reboot, or try a clean boot:
Method 1: Stop services that create locks on files in the file system, focusing specifically on antivirus software.
Method 2: Press F8 during OS boot and choose "Safe Mode with Networking".
Method 3: Disable non-boot-related third-party services, then reboot:
Windows key + R -> MSCONFIG -> Services tab -> Hide all Microsoft Services -> clear the check boxes for third-party services
Windows key + R -> MSCONFIG -> Startup tab -> click "Disable all"
-1018 | JET_errReadVerifyFailure / Checksum error on a database page
The database is corrupt.
The error is caused by a hardware failure.
Evaluate the disk stack, including motherboard / controller, firmware, connecting cables and physical drives, and contact the relevant vendors for known issues. Compare the current configuration against the vendor's reference configuration.
Evaluate whether the problem can be resolved by the latest firmware updates or was triggered by a recent firmware update.
If some DCs are logging -1018 errors while other DCs in the same environment are not, look for differences in hardware configuration.
Databases logging this error cannot be recovered or repaired by integrity checks or semantic database analysis in NTDSUTIL or ESENTUTL.
An offline defrag may resolve the problem in the unlikely case that it is caused by an index consistency problem.
Try an offline defrag. Otherwise, restore a system state backup that predates the corruption, OR force demote, perform a full metadata cleanup, and re-promote. If the -1018 error reappears, repeat until the hardware root cause is resolved.
One customer reported Jet error -1018 on virtualized DCs running on the same virtual host, only on computers using an on-board RAID controller. Current thinking is that the UPS lacked sufficient power for on-board RAID controllers to commit changes to disk following loss of electrical power. The workaround was to configure UPS software to shut down virtualized guests on loss of electrical power. Servers with dedicated (not on-board) RAID controllers with their own battery backups did not experience the -1018 Jet error.
-1019 | JET_errPageNotInitialized / Blank database page
Similar to the -1018 error, but caused by a lost page flush.
A lost flush can represent a critical USN change. Failure to apply that change to the local DC or its transitive replication partners could be harmful where only a single replication path exists.
Deploy the OS on server-class hardware and disk subsystem components.
Install a UPS on the host computer.
Install a disk controller with battery backup.
Disable write-back cache on the drive controller.
Avoid placing NTDS.DIT and LOG files on IDE drives.
Databases logging this error cannot be recovered or repaired by integrity checks or semantic database analysis in NTDSUTIL or ESENTUTL.
An offline defrag may resolve the problem in the unlikely case that it is caused by an index consistency problem.
Try an offline defrag. Otherwise, restore a system state backup that predates the corruption, OR force demote, perform a full metadata cleanup, and re-promote. Repeat until the hardware root cause is resolved.
-1021 | JET_errDiskReadVerificationFailure / The OS returned ERROR_CRC from file IO
Jet error -1021 is new to Windows Server 2008 R2.
Pre-Windows Server 2008 R2 operating systems return -1022 for this case.
-1021 identifies that a -1018 error occurred at the disk level. Restated, -1021 indicates that a disk drive returned a bad checksum error and is the specific source of the problem in the disk stack.
The problem may be caused by bad blocks on the hard drive, which the hard drive may keep track of.
Demoting and re-promoting the domain controller may trigger the storage of data on healthy blocks.
-1022 | JET_errDiskIO / Disk IO error
Generic disk error.
Disk IO errors mean that the OS encountered a non-specific error accessing the disk. This error may be logged when controllers return generic errors like "device not working". Some disks and versions of Jet return this error for CRC problems.
Check the whole driver stack.
-1605 | JET_errKeyDuplicate / Illegal duplicate key
Sporadic error.
Demote and re-promote. May be caused by index corruption.
Run NTDSUTIL semantic database analysis. If still unresolved, perform an offline defrag.
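The decision in step 3 (follow the table if the Jet error is listed, otherwise go to step 4) can be written down as a small triage helper. The Python sketch below is illustrative only: the one-line summaries are condensed from the table above and are not a substitute for it, and the names `USER_ACTIONS` and `next_step` are assumptions made for this example.

```python
# One-line summaries condensed from the user-action table in this article.
USER_ACTIONS = {
    -510: "Check disk, partition and file system health; look for software locking the log files.",
    -1018: "Evaluate the disk stack for a hardware fault; try an offline defrag, else restore or re-promote.",
    -1019: "Handle like -1018 (lost page flush); review UPS, controller battery backup and write-back cache.",
    -1021: "Suspect bad blocks on the drive; demoting and re-promoting may help.",
    -1022: "Generic disk IO error; check the whole driver stack.",
    -1605: "Demote and re-promote; run semantic database analysis, then an offline defrag.",
}


def next_step(jet_error):
    """Return (step number, summary) for the Jet error from event 2108."""
    code = int(jet_error)  # accepts -1018 or "-1018"
    if code in USER_ACTIONS:
        return 3, USER_ACTIONS[code]
    return 4, "Validate the vertical Jet database stack from the bottom up."
```

Any Jet error not in the table, whatever its value, falls through to step 4.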
4. If the Jet error in the NTDS Replication event is NOT in the table above, validate the vertical Jet database stack
If the 2108 event logs a Jet error NOT cited in the table, use the Microsoft Exchange Server Error Code Look-up utility to resolve the Jet error to its symbolic name and friendly error string using the syntax "err <jet error>". It is critical that you add the leading "-" prefix character when resolving Jet errors using ERR.EXE (for example, "c:\>err -1018").
The event message text in NTDS Replication event 2108 contains a partial user action for the NTDS Replication 1084 event.
The NTDS Replication 2108 user action is documented in the linked KB article 837932. If the user action for your event is not cited in the table above, execute a modified version of the action plan in article 837932 by validating the vertical jet database stack from the bottom up (proceeding up to the next layer only when the underlying layer checks out "good"), just like you do with TCP.
Layer | NTDSUTIL command | ESENTUTL
Physical consistency | no equivalent | ESENTUTL /K
ESE logical consistency | NTDSUTIL FILES INTEGRITY | ESENTUTL /G
Application logical consistency | NTDSUTIL -> Semantic database analysis + NTDSUTIL -> Offline Defrag | no equivalent for SDA + ESENTUTL /D
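The bottom-up rule (move up to the next layer only when the layer beneath it checks out) can be expressed as a short loop. The Python sketch below is an illustration of that ordering, not a tool that runs the checks: the tool names are copied from the table above without their arguments, and `check_layer` is a hypothetical caller-supplied function that performs or reviews each check and reports whether the layer is good.

```python
# Layers from the table above, bottom first. Tool names come from the table;
# required arguments (such as the database path) are intentionally omitted.
LAYERS = [
    ("Physical consistency", "ESENTUTL /K"),
    ("ESE logical consistency", "NTDSUTIL FILES INTEGRITY or ESENTUTL /G"),
    ("Application logical consistency",
     "NTDSUTIL semantic database analysis, then offline defrag"),
]


def validate_bottom_up(check_layer):
    """Check each layer in order and stop at the first one that fails.

    `check_layer(name, tool)` is a hypothetical callable that returns True
    when the layer is good. Returns the name of the first failing layer,
    or None if every layer passes.
    """
    for name, tool in LAYERS:
        if not check_layer(name, tool):
            return name
    return None
```

Stopping at the first failing layer matters because results from the higher layers are not meaningful while a lower layer is still reporting errors.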