P2V: How To Make a Physical Linux Box Into a Virtual Machine

Over the last four days, I’ve been exploring how to convert physical
Linux boxes into virtual machines. VMWare has a tool
for doing P2V conversions, as they’re called, but as far as I can
tell it only works for Windows physical machines and for converting
various flavors of virtual machines into others.

I’ve had a Linux machine that I’ve used in my CS462 (Large Distributed
Systems) class for years. The Linux distro has been updated over
the years, but the box is an old 266MHz Pentium with 512Mb of RAM.
Overall, it’s done surprisingly well—a testament to the small
footprint of Linux. Still, I decided it was time for an upgrade.

Why Go Virtual

In an effort to simplify my life, I’m trying to cut down on the
number of physical boxes I administer, so I decided I wanted the new
version of my class server to be running on a virtual machine. This offers several
advantages:

Fewer physical boxes to manage

Easier to move to faster hardware when needed

Less noise and heat

I could have just rebuilt the whole machine from scratch on a new
virtual machine, but that takes a lot of time and the old build isn’t
that out of date (one year) and works fine. So, I set out to
discover how to transfer a physical machine to a virtual machine.
The instructions below give a few details specific to VMWare and OS
X, but if you happen to use Parallels (or Windows), the vast majority
of what I did is applicable and where it’s not, figuring it out isn’t
hard. I’ve tried to leave clues and I’m open to questions.

Note: I’ve used
this same process to transfer a VMWare virtual image to run on
Parallels. There are probably easier ways, but this technique works
fine for that purpose as well—it doesn’t matter if the source
machine is physical or virtual.

The Process

The first step is to make an image of the source machine. I
recommend g4l, Ghost for Linux. There are some detailed
instructions on g4l available, but the basics are:

Download the g4l bootable ISO and put it on a CD.

Boot it on the source machine.

Select the latest version from the resulting menu and start it up
(you have to type g4l at the prompt).

Select raw transfer over the network and configure the IP address
and the username/password for the FTP server you want the image
transferred to.

Give the new image a name.

Select “backup” and sit back and watch it work.

Note that if you have more than one hard drive on the
source machine, you’ll have to do each separately. I found that
separately imaging
each partition on each drive worked best. One tip: there
are three compression options. Lzop works, in this application,
nearly as well as GZip or BZip but with much less CPU load. Compression
helps not only with storing the images, but also with transferring
them around on the ‘Net, so you’ll probably want some kind of
compression.
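To make the idea of a raw, compressed image concrete, here is a minimal sketch that does the same kind of thing by hand with dd piped through a compressor. It is not a description of how g4l works internally. The function name image_partition, the example paths, and the choice of gzip as the default compressor are all assumptions made for illustration; pass lzop as the third argument if it is installed.

```shell
#!/bin/sh
# Sketch only: image_partition is a hypothetical helper, not part of g4l.
# Reads SRC (a partition such as /dev/hda1, or any file) block by block
# and writes a compressed raw image to DEST.
image_partition() {
    src=$1
    dest=$2
    comp=${3:-gzip}    # assumed default; lzop also accepts -c
    # bs=1048576 is 1 MiB, written out so both GNU and BSD dd accept it
    dd if="$src" bs=1048576 2>/dev/null | "$comp" -c > "$dest"
}

# Example (reading a real device needs root; paths are placeholders):
#   image_partition /dev/hda1 /mnt/backup/hda1.img.gz
#   image_partition /dev/hda1 /mnt/backup/hda1.img.lzo lzop
```

Because the copy is raw, the image covers every block of the partition, used or not, which is why compression matters so much here.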

The next step is to create a virtual machine and put the images on
its drive(s). Create a virtual machine in VMWare as you normally
would, selecting the right options for the source OS. When you get
to the screen that asks “Startup virtual machine and load OS” (or
something like that), uncheck the box and you should be able to
change the machine options.

The first thing you need to do with the new VM is create the right
number and size of hard drives—and partitions on those drives—to
match the partition images you’re going to restore.
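Before shutting down the source machine, it helps to write down exactly how big each drive and partition is. The sketch below reads /proc/partitions on a Linux source and prints each device with its size in MiB, rounded up. The helper name list_partition_sizes is hypothetical; the file format it parses (a header, a blank line, then major, minor, 1 KiB block count, and name) is the standard Linux one.

```shell
#!/bin/sh
# Sketch only: list_partition_sizes is a hypothetical helper.
# /proc/partitions has a header line, a blank line, then rows of
#   major minor #blocks name
# where #blocks counts 1 KiB blocks. Prints "name size_in_MiB", rounded up.
list_partition_sizes() {
    awk 'NR > 2 && NF == 4 { printf "%s %d\n", $4, int(($3 + 1023) / 1024) }' \
        "${1:-/proc/partitions}"
}

# Example, run on the source machine:
#   list_partition_sizes
```

Rounding up is deliberate: a target that is slightly too large is harmless, while one that is slightly too small loses data.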

For transferring single image machines to VMWare, just using the
default drive, appropriately sized, worked fine. For more than one
drive image, however, I found that making the drive type (SCSI/IDE)
match the type on the source was the easiest thing to do. Note that
VMWare won’t let you make the main drive an IDE drive by default.
You can always delete it and create a new drive that’s an IDE drive
if you need to.

The second thing you need to do with the new VM is set the machine to
boot from the CD ROM since we’ve got to start up g4l on the
target machine.

On VMWare, you can enter the BIOS by pressing F2 while the virtual
machine is loading. This isn’t as easy as it sounds since the VM starts
quickly. Once you’re there, however, it’s a pretty standard BIOS setup
and changing the boot order is straightforward. On Parallels this
is easier since the boot order is an option you can change in the
VM’s settings.

If you’re creating partitions on the drives, you’ll need to boot from
an ISO image for the appropriate Linux distro and create the
partitions using the partition wizard, parted, or some other
tool—whatever you’d normally do.

Next boot the VM from the g4l ISO image on your computer or
the physical CD you made. If you have trouble, be sure the virtual
CDROM is connected and powered on when the virtual machine is
started. Start g4l and configure it the same way you did
before, but this time, you’ll select “restore” from the options.
g4l should start putting the images from the source machine
onto the target. If you have more than one hard drive or partition
image, you’ll have to restore each to a separate drive or
partition—as appropriate—on the virtual machine.
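The restore direction can be sketched the same way: decompress the image and write it raw onto the target partition. Again, this only illustrates the idea and is not g4l's implementation. The name restore_partition, the example paths, and the gzip default are assumptions. Writing to the wrong device destroys its contents, so double-check the target before trying anything like this.

```shell
#!/bin/sh
# Sketch only: restore_partition is a hypothetical helper, not part of g4l.
# Decompresses IMG and writes it raw onto TARGET (a partition or a file).
restore_partition() {
    img=$1
    target=$2
    comp=${3:-gzip}    # must match the compressor used to make the image
    "$comp" -dc "$img" | dd of="$target" bs=1048576 2>/dev/null
}

# Example (needs root; device and path are placeholders):
#   restore_partition /mnt/backup/hda1.img.gz /dev/sda1
```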

When doing a raw transfer, you need to make the drives the
same size as those on the machine you’re moving the image from (I’ve found
that larger works OK, but smaller doesn’t). If the drives aren’t big
enough to support the entire image, you’ll get “short reads” and not
everything will be transferred. Note that you won’t get much
complaint from g4l.
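Since the tool stays quiet about a target that is too small, a quick size comparison before restoring is cheap insurance. The sketch below compares two byte counts. The helper name check_fit is hypothetical. On Linux, blockdev --getsize64 reports a device's size in bytes, so you can note the number on the source machine and compare it with the target inside the VM.

```shell
#!/bin/sh
# Sketch only: check_fit is a hypothetical helper.
# Succeeds when the target is at least as large as the source.
check_fit() {
    src_bytes=$1
    dst_bytes=$2
    if [ "$dst_bytes" -lt "$src_bytes" ]; then
        echo "target too small by $((src_bytes - dst_bytes)) bytes"
        return 1
    fi
    echo "ok"
}

# Example (device names are placeholders; blockdev is Linux-only):
#   on the source:  blockdev --getsize64 /dev/hda1    # note the number
#   on the target:  check_fit 6448586752 "$(blockdev --getsize64 /dev/sda1)"
```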

The virtual drives should theoretically only take as much space as
they need, but it turns out that since you’re doing a raw transfer,
you’ll fill them up with “space.” This is one of those instances
where copying a sparse data structure results in one that isn’t.
This results in awfully large disks—make sure you’ve got plenty of
scratch disk space for this operation. More on large disks later.
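The sparse-file effect is easy to see on an ordinary file. The sketch below prints a file's apparent size next to the space actually allocated for it. The helper name show_sizes and the file names are made up for illustration, and the exact numbers depend on the filesystem, so treat the comments as what typically happens rather than a guarantee.

```shell
#!/bin/sh
# Sketch only: show_sizes is a hypothetical helper.
# Prints "apparent_KiB allocated_KiB" for FILE.
show_sizes() {
    apparent=$(( ($(wc -c < "$1") + 1023) / 1024 ))
    allocated=$(du -k "$1" | awk '{ print $1 }')
    echo "$apparent $allocated"
}

# Example:
#   dd if=/dev/zero of=sparse.img bs=1 count=0 seek=10485760  # 10 MiB hole
#   show_sizes sparse.img     # allocated is typically far below apparent
#   cat sparse.img > full.img # a byte-for-byte copy writes the zeros out
#   show_sizes full.img       # allocated is typically close to apparent
```

A raw restore behaves like the cat step: every block gets written, so the virtual disk file grows to its full size.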

Repairing and Booting the New Machine

Linux panics if the init RAM disk is not updated

Once the images are copied, you have to make them usable. If you
just try to boot from them, you’ll likely see something like the
screenshot shown on the right: a short message followed by a kernel
panic. Before you can use the new machine, you have to do a little
repair work on the old images.

Get an emergency boot CD ISO for your flavor of Linux and boot
the new virtual machine from it. Often you can just boot from the
installation image and then enter a rescue mode. For example for
Redhat, you can type “linux rescue” at the boot prompt and get into
recovery mode.

It will search for Linux partitions and should find any you’ve
restored to the machine. You’ll have the option to mount these. Do
so.

Now, use the chroot command to change the root of the
file system to the root partition. Mount any of the other partitions
that you need (e.g. /boot).

Run kudzu to find any new devices and get rid of old
ones.

Use mkinitrd to
create a new init RAM disk. This command should work:
```
/sbin/mkinitrd -v -f /boot/initrd-2.2.12-20.img 2.2.12-20
```
Of course, you’ll have to substitute the right initrd name
(look in /boot) and use the right version (look in
/lib/modules).

If you get an error message about not being able to find the right
modules, be sure that the last argument to mkinitrd matches
what you see in /lib/modules exactly.

Now, you should be able to boot the machine. With any luck, it
should work.
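Getting the version string exactly right is the fiddly part of the mkinitrd step, so here is a small sketch that builds the command from whatever is in /lib/modules. The helper name print_mkinitrd_cmds is hypothetical, and the initrd-VERSION.img naming is an assumption based on the example above; check /boot for the names your distro actually uses. The function only prints commands so you can review them before running one.

```shell
#!/bin/sh
# Sketch only: print_mkinitrd_cmds is a hypothetical helper.
# For every kernel version directory under ROOT/lib/modules, print the
# matching mkinitrd command. ROOT defaults to empty, i.e. the current root,
# which is what you want after chroot.
print_mkinitrd_cmds() {
    root=${1:-}
    for dir in "$root"/lib/modules/*/; do
        [ -d "$dir" ] || continue
        ver=$(basename "$dir")
        # initrd-$ver.img is an assumed naming convention; compare with /boot
        echo "/sbin/mkinitrd -v -f /boot/initrd-$ver.img $ver"
    done
}

# Example, inside the chroot:
#   print_mkinitrd_cmds
```

If more than one kernel is installed, you get one line per kernel; run the one for the kernel your boot loader actually starts.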

Disk Size Issues

When you restore the image, your new sparse disk will grow to the
size of the image, even if the image is only partially full of real
data. For example, my Linux box had a 6Gb drive (I told you it was
ancient) that contained the root partition and a 100 Gb drive that
I’d partitioned into two pieces: one 40Gb partition mounted as
/home and a 60Gb partition mounted as /web. After
restoring the images for these three partitions, I ended up with a 6Gb file and
a 107Gb file representing the virtual disks. This despite the fact
that only 8Gb of the 107Gb actually contained any data.

Clearly, you don’t want 107Gb files hanging around if they can be
smaller. One option is to do a file copy rather than an image. This
would work fine for the /home and /web partitions
in my case, but wouldn’t have worked for the root partition—I wanted
an image for that. If you’ve just got one big partition, then you
can’t use the file transfer option and still have exactly the
same machine.
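For data partitions where a file copy is good enough, a tar pipe is one common way to do it while keeping permissions and directory structure. This is a sketch of that general technique, not something taken from the steps above. The name copy_tree and the host name in the comment are made up for illustration.

```shell
#!/bin/sh
# Sketch only: copy_tree is a hypothetical helper.
# Copies everything under SRC into DST, preserving permissions (-p).
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" &&
    tar -C "$src" -cpf - . | tar -C "$dst" -xpf -
}

# Local example:
#   copy_tree /home /mnt/newdisk/home
# The same pipe works across machines (vmhost is a placeholder):
#   tar -C /home -cpf - . | ssh root@vmhost 'tar -C /home -xpf -'
```

Only real files are copied, so the target virtual disk grows by the size of the data rather than the size of the partition.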

Fortunately there’s a relatively painless way of reducing the size of
the disk to just what’s needed (thanks to Christian
Mohn for the technique).

The first step is to zero out all the free space on each partition of
the drive you want to shrink. This, in effect, marks the free
space. You can do that easily with this command:


cat /dev/zero > zero.fill;sync;sleep 1;sync;rm -f zero.fill

After this runs, you’ll get an error that says
“cat: write error: No space left on device”. That’s
normal—you just filled the drive with one BIG file full of zeros,
made sure it was flushed to the disk, and then deleted it.
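The same zero fill can be wrapped in a small function that takes the mount point as an argument, which makes it harder to forget a partition. This is a sketch under assumptions: the name zero_fill is hypothetical, and the optional size cap is an addition for trial runs, not part of the original technique. Without the cap it behaves like the one-liner above and fills the partition completely before deleting the file.

```shell
#!/bin/sh
# Sketch only: zero_fill is a hypothetical helper.
# Writes zeros into DIR/zero.fill, flushes to disk, then deletes the file.
# With MAX_MB set, stops after that many MiB (handy for a trial run).
zero_fill() {
    dir=$1
    max_mb=${2:-}
    fill="$dir/zero.fill"
    if [ -n "$max_mb" ]; then
        dd if=/dev/zero of="$fill" bs=1048576 count="$max_mb" 2>/dev/null
    else
        # dd stops with "No space left on device" here; that is expected
        dd if=/dev/zero of="$fill" bs=1048576 2>/dev/null
    fi
    sync; sleep 1; sync
    rm -f "$fill"
}

# Example for the layout described above (run as root inside the VM):
#   for mp in / /home /web; do zero_fill "$mp"; done
```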

Next you can use the VMWare supplied disk management tool to do the
actual shrinking. For VMWare Workstation Manager, you use
vmware-vdiskmanager, but the version of this program that
ships with Fusion doesn’t support the shrink option. Note that this,
and other support programs, are in


/Library/Application Support/VMware Fusion/

on OS X.

Fortunately, in OS X at least, there’s another
program, called diskTool in


/Applications/VMware Fusion.app/Contents/MacOS/

that does support the shrink option (-k 1). Running
this command


diskTool -k 1 Luwak-IDE_0-1.vmdk

on my large disk reduced it from 107Gb to 8Gb!

A few notes: Apparently you have to perform the shrink operation on the
disks for a machine before any snapshots have been taken.
Also, be sure to run the zero fill operation in each partition on the
disk. The shrink operation takes a little time, but it’s well worth
it. I haven’t tried this in Parallels, but I suspect the disk
compaction option would work. If someone tries it, let me know.
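If a machine has several virtual disks, a small wrapper can shrink each one and report how much space was recovered. This is a sketch only. The function name shrink_disks is hypothetical; the diskTool path and the -k 1 option are the ones given above, and nothing else about the tool is assumed. Shut the virtual machine down first, and pass each disk file explicitly rather than using a wildcard.

```shell
#!/bin/sh
# Sketch only: shrink_disks is a hypothetical wrapper around diskTool.
# Override DISKTOOL if the tool lives somewhere else on your system.
DISKTOOL=${DISKTOOL:-"/Applications/VMware Fusion.app/Contents/MacOS/diskTool"}

shrink_disks() {
    for disk in "$@"; do
        before=$(du -k "$disk" | awk '{ print $1 }')
        "$DISKTOOL" -k 1 "$disk" || return 1
        after=$(du -k "$disk" | awk '{ print $1 }')
        echo "$disk: $before KiB -> $after KiB"
    done
}

# Example, using the disk name from above:
#   shrink_disks Luwak-IDE_0-1.vmdk
```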

Conclusion

So, after a lot of experimentation, some playing around, and a lot of
long operations on large files, I have a virtual machine that’s a
fairly accurate reproduction of the physical machine that it came
from. I’ll be testing it over the next few days to make sure it’s
usable.

On reflection, I needn’t have been so faithful to the structure on
the physical machine. I could have created the right number of
partitions on one drive rather than creating multiple drives. After
all, the new drive can be as big as I like. Maybe I’ll do that next
and see how things go…

Posted by windley on August 20, 2007 7:38 AM

Thursday, 15 January 2009