This happened to me after a power failure damaged a server in our VMware farm. The resolution is presented below:
ESX 4.0 host fails to boot after power operation with the error: fsck.ext3: Unable to resolve UUID
Symptoms
- After power-cycling or rebooting an ESX 4.x server, the following error message is produced during boot:
fsck.ext3: Unable to resolve 'UUID=34d192db-17eb-442e-9613-c5c24c6fa9fa'
*** An error occurred during the file system check.
*** Dropping you to a shell; the system will reboot
*** when you leave the shell.
- After encountering this error, you are unable to boot into ESX or Troubleshooting mode.
- The unresolvable EXT file systems or partitions most commonly turn out to be those with mount points such as /var, /opt, and /tmp.
Resolution
This issue occurs when the boot-time file system check utility (FSCK) for EXT-3 file systems cannot resolve a file system (by UUID) defined in /etc/fstab.
Conditions that can cause this include:
- The default roll-back option is left enabled when a subsequent upgrade is being performed.
- The device is not present during system boot.
- The unresolvable EXT file systems appear to reside on disks/devices that are initialized later during system boot (e.g. the last LUN).
Note: If you are experiencing an outage with virtual machines down, consider restoring service quickly by reinstalling VMware ESX. Troubleshooting may take longer than a reinstallation, which takes approximately 20 minutes.
Otherwise, refer to the instructions below for submitting information to VMware Technical Support for technical analysis.
Further troubleshooting is available in the shell:
- Confirm the UUIDs which were not resolvable, and remain so, by running fsck again without additional arguments. Information similar to the following is displayed:
# fsck
fsck 1.39 (29-May-2006)
e2fsck 1.39 (29-May-2006)
esx-root: clean, 32953/641280 files, 414801/1281175 blocks
e2fsck 1.39 (29-May-2006)
/dev/sdt1: clean, 35/140832 files, 25323/281596 blocks
fsck.ext3: Unable to resolve 'UUID=34d192db-17eb-442e-9613-c5c24c6fa9fa'
e2fsck 1.39 (29-May-2006)
/dev/sdt6: clean, 31/250368 files, 27851/500220 blocks
e2fsck 1.39 (29-May-2006)
/dev/sdt7: clean, 22/250368 files, 16815/500220 blocks
- Record the UUID or UUIDs which failed to resolve. You may take a screen shot of your System Management Interface, take a picture, or write the values down.
- Confirm these same values in the /etc/fstab file.
# cat /etc/fstab
UUID=79815890-f11c-4907-80fe-d1cd6bf061f8 / ext3 defaults 1 1
UUID=45460133-027b-40b6-8b4d-e52aaf4c417f /boot ext3 defaults 1 2
None /dev/pts devpts defaults 0 0
/dev/cdrom /mnt/cdrom udf,iso9660 noauto,owner,kudzu,ro 0 0
/dev/fd0 /mnt/floppy auto noauto,owner,kudzu 0 0
None /proc proc defaults 0 0
None /sys sysfs defaults 0 0
UUID=34d192db-17eb-442e-9613-c5c24c6fa9fa /var/log ext3 defaults,errors=panic 1 2
UUID=e32ec5f4-d795-414a-8d73-a2bb3ea86342 swap swap defaults 0 0
Note: In this example, the unresolvable UUID is 34d192db-17eb-442e-9613-c5c24c6fa9fa and its mount point is /var/log (the original article highlights them in red and blue, respectively).
- Verify what UUIDs the system is currently aware of by running the following command:
# ls -l /dev/disk/by-uuid
total 0
lrwxrwxrwx 1 root root 10 Nov 9 14:36 45460133-027b-40b6-8b4d-e52aaf4c417f -> ../../sdm1
lrwxrwxrwx 1 root root 10 Nov 9 14:36 e32ec5f4-d795-414a-8d73-a2bb3ea86342 -> ../../sdr1
lrwxrwxrwx 1 root root 10 Nov 9 14:36 34d192db-17eb-442e-9613-c5c24c6fa9fa -> ../../sdr2
lrwxrwxrwx 1 root root 10 Nov 9 14:36 79815890-f11c-4907-80fe-d1cd6bf061f8 -> ../../sdr5
Notes:
- This output reveals the UUID-to-partition relationship for all discovered EXT partitions in the system. Affected mount points or content can be associated using the previous step.
- It is possible in some environments that none of the known partitions reported by listing /dev/disk/by-uuid match the unresolved UUID. This is correctable; for additional instructions, proceed to the following sections and correct the content of the /etc/fstab file.
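The comparison between /etc/fstab and /dev/disk/by-uuid described in the steps above can also be scripted. The following is a minimal sketch, not part of the VMware article: the function names fstab_uuids and unresolved_uuids are hypothetical, and it assumes awk is available in the shell you were dropped into.

```shell
#!/bin/sh
# Hypothetical helpers (not from the VMware KB article).

# Print "UUID mountpoint" for every UUID= entry in an fstab-style file.
# $1: path to the fstab file (on the host this would be /etc/fstab)
fstab_uuids() {
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1, $2 }' "$1"
}

# Print the fstab entries whose UUID has no matching name in a directory.
# $1: fstab file, $2: directory of UUID-named links (e.g. /dev/disk/by-uuid)
unresolved_uuids() {
    fstab_uuids "$1" | while read -r uuid mountpoint; do
        # -L also accepts a dangling symlink, which -e alone would reject
        [ -e "$2/$uuid" ] || [ -L "$2/$uuid" ] || echo "$uuid $mountpoint"
    done
}

# Example invocation on the affected host (uncomment to run):
# unresolved_uuids /etc/fstab /dev/disk/by-uuid
```

Any line this prints is a file system that fstab expects but that the system has not discovered under that UUID. Both functions only read their inputs, so they can be run while the root partition is still mounted read-only.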
Solution
VMware is currently investigating to determine the full root cause and a solution. Workarounds are available below.
If you are able to reproduce this issue while maintaining production via alternate servers, contact VMware Technical Support after completing the following:
- Log into the terminal of the affected ESX server.
- Remount the root partition in read-write mode:
# mount / -o remount,rw
- Configure Serial Line Logging per the section Configuring the Service Console for VMware ESX 3.x and 4.x in KB article: Enabling serial-line logging for an ESX and ESXi host (1003900).
- Reboot the ESX server and log the results via your listening serial terminal.
- Contact VMware Technical Support and file a Support Request. For additional information, see Filing a Support Request (1021619).
Workarounds
Both recommended workarounds involve the modification of the /etc/fstab file. You may either:
- Generate a new UUID for the affected file system(s) and update /etc/fstab to match the new value(s).
- Update /etc/fstab to incorporate the correct UUID from the file system.
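For the second option, the UUID currently stored on the file system can be read from the output of tune2fs -l, which prints a "Filesystem UUID:" line. A minimal sketch follows; the helper name uuid_from_tune2fs is hypothetical, and it assumes awk is available on the service console.

```shell
#!/bin/sh
# Hypothetical helper (not from the VMware KB article).
# Reads `tune2fs -l` output on stdin and prints only the UUID value.
uuid_from_tune2fs() {
    awk '/^Filesystem UUID:/ { print $3 }'
}

# Example invocation on the affected host (uncomment to run):
# tune2fs -l /dev/sdr2 | uuid_from_tune2fs
```

The printed value is what the corresponding UUID= entry in /etc/fstab would need to contain for that partition.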
Applying a new UUID
Apply a new UUID to the EXT-3 file systems which fail to resolve and update the /etc/fstab file.
- Run tune2fs against each Linux partition on the suspected disk device. For example:
# tune2fs -l /dev/sdr2 | grep UUID
Filesystem UUID: 34d192db-17eb-442e-9613-c5c24c6fa9fa
# tune2fs -U random /dev/sdr2
tune2fs 1.39 (29-May-2006)
# tune2fs -l /dev/sdr2 | grep UUID
Filesystem UUID: 25a18c70-ffcb-4b15-9d2d-1cfab1754d86
- Update /etc/fstab with the new UUID. From the earlier steps, the /dev/sdr2 partition was determined to correspond to the /var/log mount point:
- Remount the root partition in read-write mode:
# mount / -o remount,rw
- Open the /etc/fstab file for editing. For more information, see Editing configuration files in VMware ESX (1017022).
- Find the original UUID and replace it with the newly generated UUID from the earlier steps.
- Save the file and remount the root partition in read-only mode:
# mount / -o remount,ro
- Reboot the server using shutdown -r now.
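The edit to /etc/fstab can also be made non-interactively once the root partition is mounted read-write. The sketch below is an illustration rather than the article's method: the function name replace_fstab_uuid and the .bak backup suffix are assumptions, and it relies only on grep, cp, and sed.

```shell
#!/bin/sh
# Hypothetical helper (not from the VMware KB article).
# Replace one UUID with another in an fstab-style file, keeping a backup.
# $1: old UUID, $2: new UUID, $3: path to the fstab file
replace_fstab_uuid() {
    old=$1
    new=$2
    file=$3
    # Refuse to touch the file if the old UUID is not present.
    grep -q "UUID=$old" "$file" || {
        echo "UUID $old not found in $file" >&2
        return 1
    }
    # Keep the original contents next to the file before rewriting it.
    cp -p "$file" "$file.bak" || return 1
    # Rewrite from the backup so the original file is replaced in place.
    sed "s/UUID=$old/UUID=$new/" "$file.bak" > "$file"
}

# Example with the values from this article (uncomment to run):
# replace_fstab_uuid 34d192db-17eb-442e-9613-c5c24c6fa9fa \
#     25a18c70-ffcb-4b15-9d2d-1cfab1754d86 /etc/fstab
```

Keeping the backup copy means the previous fstab can be restored from the same shell if the host still fails to boot.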
The full document is available at the link below (check the "mount" syntax there):
http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1017162&sliceId=1&docTypeID=DT_KB_1_1&dialogID=127160699&stateId=0%200%20138435051