I am writing this blog to share my thoughts on virtualization, specifically the various server virtualization technologies in use nowadays. It is important for an IT administrator, or anyone who works in IT infrastructure management, to have an idea about virtualization. Before exploring the various virtualization technologies and the differences between them, we need an awareness of things like:
1) What is meant by virtualization?
2) Why do we go for virtualization?
And here we go,
1) What is meant by virtualization?
Virtualization is a term that refers to the abstraction of computer resources, in simple words, the mechanism to run multiple instances/copies of various operating systems inside a base operating system, mainly to utilize under-used resources on the physical host.
2) Why do we go for virtualization ?
Virtualization helps you save the time spent maintaining the physical IT infrastructure, thereby letting you focus more on the quality of service you offer.
1. Reduce cost and save power
Running multiple VMs on a server reduces the need for separate physical servers to run applications, thereby reducing the cost of buying and managing physical servers. It also reduces power consumption.
2. Backup and recovery
The backup and recovery process is made easier.
3. Reduce loss of service
Servers can be migrated and resources upgraded with no loss of service, or with minimal downtime.
There are several virtualization platforms in use nowadays. The most widely used ones are VMware, Xen, KVM, QEMU, VirtualBox, OpenVZ, and Parallels Virtuozzo.
Server virtualization technology can be categorized into two types, based on how the virtualization is achieved:
1) Hypervisor based virtualization.
2) Container based virtualization.
Most of the virtualization software in use nowadays falls under one of these two categories.
Before explaining the difference between the hypervisor-based and container-based virtualization mechanisms, we need to familiarize ourselves with some common virtualization terminology.
Hypervisor: Software that enables the running of multiple Virtual Machines on a single physical computer and that manages the sharing of the computing resources between them.
Type 1 Hypervisor: A bare-metal hypervisor that runs directly on top of the hardware.
Type 2 Hypervisor: Operates as an application on top of an existing operating system.
Emulation: Emulation is where software is used to simulate hardware for a guest operating system to run in.
Full virtualization: Provides total abstraction of the underlying physical system and creates a complete virtual system in which the guest operating systems can work. No modification of the guest OS or application is required; the guest OS or application will not be aware that it is being virtualized.
Paravirtualization: Paravirtualization techniques require modifications to the guest operating systems that are running on the VMs. As a result, the guest operating systems are aware that they are executing on a VM.
DOM-0: Related to Xen virtualization. A virtual machine having privileged access to the hypervisor; it manages the hypervisor and the other VMs. Also referred to as the management domain.
DOM-U: A VM created by DOM-0, simply known as a guest domain.
Intel VT and AMD-V: CPUs with virtualization extensions, from the Intel and AMD families respectively. These extensions improve the performance of virtual machines by removing the need for emulation through the hypervisor or virtualization software.
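On Linux you can tell whether a host CPU offers these extensions by looking for the `vmx` (Intel VT) or `svm` (AMD-V) flag in /proc/cpuinfo. Here is a minimal Python sketch; the helper name is my own, only the flag names come from the kernel:

```python
def virt_extension(cpuinfo_text):
    """Report which hardware virtualization extension the CPU flags advertise."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None

# On a real Linux host you would feed it the actual file:
# with open("/proc/cpuinfo") as f:
#     print(virt_extension(f.read()))

print(virt_extension("flags\t: fpu vmx sse sse2"))  # Intel VT-x
```

If neither flag appears, the host can still run virtual machines, but only through software techniques such as binary translation.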
The Privilege Rings Architecture
Figure-1: Ring architecture
The x86 architecture offers a range of protection levels, also known as rings, in which code can execute and which operating systems and applications use to manage access to the computer hardware. Ring 0 has the highest privilege level, and it is in this ring that the operating system kernel normally runs. Code executing in ring 0 is said to be running in system space, kernel mode, or supervisor mode. All other code, such as applications running on the operating system, operates in less privileged rings, typically ring 3. Rings 1 and 2 are normally unused; some virtualization technologies make use of ring 1 to run the guest OS.
Now let us see the difference between hypervisor-based and container-based virtualization.
| Key Factors | Hypervisor-Based Virtualization | Container-Based Virtualization |
|---|---|---|
| Working principle | A full operating system runs in a virtual machine on top of the hypervisor or host. The hypervisor interacts directly with the hardware and allocates the resources the VMs need. Hardware components are emulated or virtualized, so the virtual machines running on the host see virtual hardware. | A single kernel is installed and runs on the hardware node, with different isolated guest instances installed on top of it. The host kernel is shared between the guest instances, and hardware components are not virtualized. Several guest instances run isolated from each other on top of the shared host kernel, each with its own processes, libraries, root, and users. Each isolated guest instance is called a container. |
| Operating system | A wide range of guest operating systems can be installed on a host. | Since the host kernel is shared with all the guest instances, no operating system other than the native OS is allowed. E.g., we cannot install a Windows guest instance on top of a Linux host. |
| Resource utilization | The hypervisor consumes a significant amount of resources, which has some performance impact on the VMs running on top of it. Performance is slightly affected because hardware virtualization/emulation is needed. | Isolating the various guest instances consumes some resources, but much less than in the hypervisor case. Overall performance is better than hypervisor-based virtualization. |
| Manageability | Since a wide array of operating systems is supported, management is done via the usual methods of a regular dedicated server. | Easily manageable, since the shared-kernel method is used. |
| OS updates & upgrades | Each individual virtual machine requires a separate update or upgrade. | A single update or upgrade applies across all virtual environments. |
| Virtualization software | VMware, Xen, KVM, VirtualBox, etc. | Parallels Virtuozzo and OpenVZ. |
Now, let's look at some widely used virtualization software (QEMU, VMware, KVM, Xen, VirtualBox, OpenVZ, etc.) and see how virtualization is achieved using each of them.
QEMU
QEMU (Quick Emulator) is an open-source virtualization and emulation tool which performs hardware virtualization/emulation. Its cross-platform support lets you run it on several hosts (x86, PowerPC, ARM, SPARC, Alpha, and MIPS), and it can emulate several CPUs, such as x86, PowerPC, ARM, and SPARC.
QEMU works via a special recompiler which emulates processors through dynamic binary translation (binary translation is the emulation of one instruction set by another through translation of code), enabling it to run a variety of unmodified guest operating systems.
QEMU has two operating modes:
User mode emulation (only on Linux hosts): In this mode, a process compiled for one target CPU can be run on another CPU. For example, a PowerPC binary can be run on x86 or x86_64 hosts. This approach is mainly used by developers to check a program's compatibility with different CPUs.
System mode emulation: In this mode QEMU emulates a full computer system, including a processor and various peripherals such as NICs and graphics cards. It can be used to run several guest VMs on a single host. QEMU can boot many guest operating systems, including Linux, Solaris, Microsoft Windows, DOS, and BSD.
QEMU is not a good choice for a production environment, since QEMU performs the entire emulation in software, which makes it really slow when running multiple guest OSes on a single host.
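To make the system-emulation mode concrete, here is a hedged sketch that composes a typical `qemu-system-x86_64` invocation. The disk image path is a placeholder; `-m`, `-smp`, and `-hda` are standard QEMU options:

```python
def qemu_system_cmd(disk_image, memory_mb=512, smp=1):
    """Compose a qemu-system-x86_64 command line for full system emulation.

    Without hardware acceleration QEMU relies purely on dynamic binary
    translation, which is why it is slow for production workloads."""
    return [
        "qemu-system-x86_64",
        "-m", str(memory_mb),  # guest RAM in MiB
        "-smp", str(smp),      # number of virtual CPUs
        "-hda", disk_image,    # primary disk image
    ]

print(" ".join(qemu_system_cmd("guest.img", memory_mb=1024, smp=2)))
```

Running the printed command on a host with QEMU installed boots the guest image under full emulation.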
VMware
VMware launched its first product with the intention of virtualizing the x86 architecture. Since its inception in 1999, it has launched a string of virtualization products such as VMware Workstation, VMware Player, ESX, ESXi, and vSphere.
Figure 2: VMware Workstation architecture.
VMware Workstation is widely used for the virtualization of desktop machines and is not used in production virtualization clusters.
Since VMware Workstation is installed on top of a host OS, it falls under the type-2 hypervisor group. Here the layer which acts as the hypervisor is called the Virtual Machine Monitor (VMM). So VMware Workstation is, generally speaking, a hypervisor that depends on an underlying operating system such as Windows or Linux. You can create multiple virtual machines side by side using VMware Workstation, but the limitation is that access to the resources is mediated by the underlying host operating system.
VMware Workstation provides full virtualization through instruction emulation, so it can also be described as software-assisted full virtualization. VMware incorporated two techniques, "trap and emulate" and "binary translation", to ensure full virtualization on x86 hosts; the VMM switches between them dynamically based on the state of the virtual CPU. In the "trap and emulate" technique, guest VMs are made to run in less privileged rings (ring 1 or 3) and all guest instructions are executed directly on the host CPU, except certain sensitive privileged instructions. The instructions from the guest OS are analysed by the VMM/hypervisor (ring 0), which allows safe instructions to run unmodified, while sensitive instructions are trapped and emulated or replaced with a set of safe instructions before executing. But some privileged instructions from guest machines cannot be trapped by the VMM. To overcome this drawback, VMware incorporated the "binary translation" technique to translate the sensitive instructions which cannot be trapped. The converted code is also cached in memory, to speed up future executions of those sensitive instructions.
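The idea can be pictured with a toy sketch. This operates on mnemonic strings rather than real x86 machine code, and `vmm_emulate` is an invented placeholder for whatever safe sequence the VMM substitutes:

```python
# Instructions the VMM cannot safely let the guest run directly (illustrative subset).
PRIVILEGED = {"cli", "sti", "popf"}

def binary_translate(block):
    """Toy sketch of binary translation: scan a guest code block and replace
    privileged instructions with safe, VMM-mediated equivalents, while safe
    instructions pass through to run directly on the host CPU."""
    translated = []
    for insn in block:
        if insn in PRIVILEGED:
            translated.append("vmm_emulate(" + insn + ")")  # trapped and emulated
        else:
            translated.append(insn)  # safe: executes natively
    return translated

print(binary_translate(["mov", "cli", "add", "popf"]))
```

A real VMM does this at the level of basic blocks of machine code and caches the translated blocks, as described above, so each sensitive block is only translated once.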
VMware ESX / ESXi
ESX/ESXi is mainly used in production clusters and datacenter environments for running multiple VMs on a host. Unlike VMware Workstation, it is installed on bare-metal hardware; in short, ESX/ESXi is itself a hypervisor OS. Since it runs directly on top of the hardware, it can be considered a bare-metal or type-1 hypervisor. VMware ESX is the older variant and ESXi is the newer one. First I will explain the architecture of ESX, followed by what makes ESXi different from ESX and why VMware stopped developing ESX and turned towards ESXi.
ESX has some Linux dependencies. The VMware FAQ states: "ESX Server also incorporates a service console based on a Linux 2.4 kernel that is used to boot the ESX Server virtualization layer or VMkernel". On starting an ESX host, a Linux kernel is started, which loads virtualization components such as VMware's vmkernel. The VMkernel is responsible for allocating memory, scheduling CPUs, and providing other hardware abstraction and operating system (OS) services.
Figure 3 : ESX Architecture
ESX has a Linux-kernel-based service console (COS) which acts as a management interface for the host. Various VMware management agents are deployed in the console OS, along with other infrastructure service agents (e.g. name service, time service, logging, etc.). We can also deploy third-party tools for hardware monitoring and system management, and admins can log in for configuration and troubleshooting purposes. The last ESX version released was VMware ESX 4.1; from vSphere 5 onwards, only ESXi is available.
In ESXi, the console OS has been removed and replaced with a POSIX shell which looks and works like Linux, but with limited functionality. All VMware management agents run directly on the vmkernel. Infrastructure services are provided natively through modules included with the vmkernel, and other authorized third-party modules, such as hardware drivers and hardware monitoring components, can run in the vmkernel as well, if and only if they are digitally signed by VMware. Together these features create a tightly locked-down architecture that increases the security of the system.
Figure 4: ESXi architecture
With the removal of the service console and Linux kernel, the disk footprint of ESXi is very small (32 MB), compared to that of ESX (2 GB). Since ESXi is small, installation and boot times decrease considerably. ESXi is also sold by hardware vendors as a built-in hypervisor: if you buy, say, a Dell server, ESXi can be embedded in the server on a flash chip on the motherboard, which removes the need to install it on the hard disk.
VMware is in the process of removing its dependencies on other vendors. It has reduced its dependency on Linux by removing the Linux-kernel-based COS in ESXi.
VMware ESX/ESXi server achieves virtualization through two methods.
1) Software based full virtualization.
2) Hardware assisted full virtualization.
Software based full virtualization in ESX/ESXi is similar to that of VMware workstation. It makes use of binary translation method which I have explained above.
To implement hardware-assisted full virtualization, the host machine's CPUs should have virtualization extensions (AMD-V or Intel VT). This technology adds a "root" VMM layer underneath ring 0 of the x86 ring architecture, so ring 0 is left available for the guest operating systems. Here the host CPU itself takes care of trapping and emulating privileged instructions from the guest OS, thereby eliminating the need for binary translation and VMM/hypervisor-based traps. This allows guest OSes to be virtualized without any modifications.
VMware started using hardware-assisted full virtualization from the VMware ESX 3.5 release, and is using it widely in the latest ESXi 5.1 releases, after the release of second-generation AMD-V processors with memory virtualization capability. In the future, hardware-assist technology may eliminate the need for software-based full virtualization and paravirtualization.
Kernel Virtual Machine (KVM)
KVM (Kernel-based Virtual Machine) is a fast-growing open-source full virtualization solution for Linux on x86 hardware with virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module that provides the core virtualization infrastructure and a processor-specific module. Using the KVM hypervisor, one can run multiple virtual machines with unmodified Linux or Windows images. Each virtual machine has private virtualized hardware: a network card, disk, graphics adapter, etc. The kernel component of the KVM hypervisor is included in mainline Linux. Since the hypervisor interacts directly with the hardware, it can be considered a type-1 hypervisor.
Figure 5: KVM architecture
In this approach, a Linux kernel is converted into a hypervisor. Once we install KVM on a Linux box, a device node /dev/kvm is created which acts as the interface between the hypervisor and the processes that use it. A normal Linux process has two modes of execution, user mode and kernel mode. User mode is the default mode used by applications; when a process needs some involvement from the kernel, such as writing data to the hard disk, kernel mode comes into action. KVM employs a third mode, called "guest mode", which is provided through /dev/kvm; all the processes of the guest OS run in guest mode. Devices in the device tree (/dev) are common to all user-space processes, but each process that opens /dev/kvm sees a different map, which provides isolation between the VMs.
KVM makes use of the hardware virtualization features for CPU virtualization, and uses the standard Linux scheduler and memory management. A modified version of QEMU is used for I/O virtualization.
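The presence of /dev/kvm is what separates a hardware-accelerated KVM guest from plain QEMU emulation. Here is a small sketch of that check; the fallback logic is my own, but `-enable-kvm` is the QEMU flag that switches the acceleration on:

```python
import os

def qemu_accel_args(dev_kvm_exists):
    """Return the extra QEMU argument enabling KVM acceleration when the
    /dev/kvm device node is present, or nothing for pure software emulation."""
    return ["-enable-kvm"] if dev_kvm_exists else []

# On a real host, test for the device node and build the command accordingly:
args = ["qemu-system-x86_64", "-m", "512"] + qemu_accel_args(os.path.exists("/dev/kvm"))
print(args)
```

With `-enable-kvm`, CPU virtualization is handed to the kvm kernel module, while QEMU keeps doing the I/O emulation described above.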
Xen
Xen is an open-source type-1 or bare-metal hypervisor, which makes it possible to run many instances of an operating system, or indeed different operating systems, in parallel on a single machine (or host).
FIGURE 6: Xen Architecture
The Xen Hypervisor: A thin software layer that runs directly on the hardware and is responsible for managing CPU, memory, and interrupts. It is the first program running after the bootloader. The hypervisor itself has no knowledge of I/O functions such as networking and storage.
Guest Domains/Virtual Machines (DOM-U): Virtualized environments, each running its own operating system and applications. Xen supports two different virtualization modes: paravirtualization (PV) and hardware-assisted or full virtualization (HVM). Both guest types can be used at the same time on a single Xen system. It is also possible to use techniques from paravirtualization in an HVM guest, essentially combining the best features of the two; this approach is called PV on HVM. Xen guests are totally isolated from the hardware: in other words, they have no privilege to access hardware or I/O functionality. Thus, they are also called unprivileged domains (DomU).
The Control Domain (or Domain 0): A specialized virtual machine with special privileges, such as the capability to access the hardware directly. It handles all access to the system's I/O functions and interacts with the other virtual machines. It also exposes a control interface to the outside world, through which the system is controlled. The Xen hypervisor is not usable without Domain 0, which is the first VM started by the system.
Toolstack and Console: Domain 0 contains a control stack (also called Toolstack) that allows a user to manage virtual machine creation, destruction, and configuration. The toolstack exposes an interface that is either driven by a command line console, by a graphical interface or by a cloud orchestration stack such as OpenStack or CloudStack.
In previous releases of Xen, the default toolstack included the Xend daemon (the Xen system manager) and xm (the CLI management tool); these are now deprecated and replaced with the "xl" toolstack in the Xen 4.2 release.
Xen has two main approaches to virtualization:
1) Xen PV (Xen paravirtualization)
2) Xen HVM (Full Virtualization)
On hosts with x86 CPUs that lack virtualization extensions, it is not possible to trap some privileged instructions from the guest OS. So full virtualization was achieved through emulation or binary translation of privileged instructions (software-based full virtualization), which caused performance problems for the guest VMs. Disk and network devices needed to be emulated, as did interrupts and timers, the motherboard and PCI buses, and so on. Guests needed to start in 16-bit mode and run a BIOS which loaded the guest kernel, which again ran in 16-bit mode, then bootstrapped its way up to 32-bit mode, and possibly then to 64-bit mode. Xen found a solution to this by implementing another approach, called paravirtualization.
Xen paravirtualization explained:
Under paravirtualization, the kernel of the guest operating system is modified specifically to run on the hypervisor. This typically involves replacing any privileged instruction from the guest OS, which would only run in ring 0 of the CPU, with calls to the hypervisor (known as hypercalls). Since the hypervisor runs in ring 0, it performs the task on behalf of the guest kernel, and it also provides hypercall interfaces for other critical kernel operations such as memory management and interrupt handling. The hypervisor, however, does not service network or disk requests from PV guest VMs.
Paravirtualization offers better performance by eliminating the overhead caused by software-based full virtualization techniques such as binary translation and trap-and-emulate. The VMs running on the host are aware that they are virtualized and that the resources they use are shared with other VMs. In a paravirtualized VM, guests run with fully paravirtualized disk and network interfaces; interrupts and timers are paravirtualized; there is no emulated motherboard or device bus; and guests boot directly into the kernel in the mode the kernel wishes to run in (32-bit or 64-bit), without needing to start in 16-bit mode or go through a BIOS.
So the host CPU does not need virtualization capabilities (AMD-V, Intel VT) to run PV guests.
Figure 7: Xen ParaVirtualization Architecture
I stated earlier that the hypervisor does not service network and disk requests from Xen PV guest VMs. A guest must communicate with Domain 0 via the Xen hypervisor to accomplish a network or disk request. Each DomU PV guest contains two drivers for network and disk access: the PV frontend network driver and the PV frontend block driver. Similarly, Domain 0 has the corresponding backend network driver and backend block driver. An event channel exists between the frontend and backend drivers. When an event occurs, say a guest wants to write some data to its disk, the guest writes the data into a shared memory region accessible to both the DomU guest and Dom0. The hypervisor then sends an asynchronous interrupt to Dom0, which accesses the data in shared memory and writes it to the actual hard disk.
Since the guest OS needs to be modified, paravirtualization is only supported by open-source OSes such as Linux, Solaris, and the BSDs. In short, we cannot use operating systems like Windows, since they cannot be modified.
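A minimal xl-style guest configuration makes the "boot directly into the kernel" point concrete: a PV guest names a kernel to boot rather than a BIOS/bootloader path. The sketch below renders such a config; every name and path in it is a placeholder:

```python
def xl_pv_config(name, kernel, disk_image, memory_mb=512):
    """Render a minimal xl config for a paravirtualized (PV) Xen guest.

    PV guests boot a kernel directly, with no BIOS or emulated motherboard,
    which is why the config points at a kernel image rather than a bootloader."""
    return "\n".join([
        'name = "%s"' % name,
        'kernel = "%s"' % kernel,
        "memory = %d" % memory_mb,
        "disk = ['file:%s,xvda,w']" % disk_image,
        "vif = ['bridge=xenbr0']",
    ])

print(xl_pv_config("pv-guest", "/boot/vmlinuz-xen", "/var/lib/xen/pv-guest.img"))
```

The resulting file would be started with `xl create <config-file>` and inspected with `xl list` on a Xen 4.2+ host.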
Xen-Hardware Assisted Full Virtualization (Xen-HVM)
The host CPU should have virtualization capability to run HVM guests. Here the host CPU itself takes care of trapping and emulating privileged instructions from the guest OS, so there is no need for binary translation by the hypervisor. But to run a fully virtualized guest, many other components still need to be virtualized. To accomplish this, the Xen project integrated QEMU (qemu-dm or QEMU domain, now replaced by stub domains in the latest Xen versions) to emulate disk, network, motherboard, and PCI devices. We can run unmodified guests using HVM.
Figure 8:Xen-HVM Architecture
Since HVM uses QEMU for network and I/O, performance is lower compared to Xen PV. Xen has now implemented a new approach called PV on HVM (PVHVM) to improve HVM performance by using paravirtualized network and I/O drivers in HVM guests.
OpenVZ
OpenVZ is a container-based virtualization solution for Linux. OpenVZ creates multiple isolated and secure Linux servers, known as containers or virtual private servers (VPS), on a single physical machine. Each container performs and executes instructions exactly like a stand-alone server. A VPS has root access, users, processes, memory, IP addresses, files, system libraries and configuration files, applications, ports, and routing rules. OpenVZ is an open-source product available under the GNU GPL. Unlike VMware and paravirtualization technologies such as Xen, it requires both the host and guest OS to be Linux (although the Linux distributions can differ between containers); the guests and the host share the same kernel. Each container is a full Linux system, capable of running services, handling logins, and so on.
Figure 9 : OpenVZ architecture
OpenVZ offers better performance than hypervisor-based virtualization solutions. But if you want to run different guest OSes (for example, both Windows and Linux guests) on the same host, you will need to choose a hypervisor-based solution such as Xen or KVM.
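Container life cycles in OpenVZ are driven by the `vzctl` tool. Here is a hedged sketch of the usual create/configure/start sequence; the container ID, template name, and IP address are all placeholders:

```python
def openvz_provision_cmds(ctid, ostemplate, ip):
    """Compose the usual vzctl sequence to create, configure, and start an
    OpenVZ container. All values here are illustrative placeholders."""
    return [
        "vzctl create %d --ostemplate %s" % (ctid, ostemplate),  # unpack the OS template
        "vzctl set %d --ipadd %s --save" % (ctid, ip),           # assign an IP, persist it
        "vzctl start %d" % ctid,                                  # boot the container
    ]

for cmd in openvz_provision_cmds(101, "centos-6-x86_64", "192.168.1.101"):
    print(cmd)
```

Because no kernel or hardware emulation is involved, the container is typically up within a second or two of `vzctl start`.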
Which virtualization should we choose, hypervisor based or container based?
If you want to run lots of different operating systems in your virtual machines, you should choose hypervisor-based virtualization; but if you have lots of machines with the same operating system, container-based virtualization gives a much higher consolidation ratio (more instances on a single host) and more efficient use of computing resources.