Virtualization technologies are the founding concepts behind the successful adoption of the cloud computing paradigm. Virtualization not only improves the relationship between hardware and software, but also helps utilize their full capabilities.
On-demand, shared virtual resources delivered through a service architecture are the primary reason behind the flexible architecture of the cloud.
This tutorial introduces the basic concepts of virtualization. We will introduce the theory of hypervisors and also discuss a few widely used hypervisors.
This tutorial covers the following topics:
- Understanding virtualization
- Adopting virtualization
- Techniques of virtualization
- How virtualization works
- Types of virtualization
- Virtualization in cloud
With the expansion of computing, the Internet and mobile technology, the meaning of the word virtual has changed radically in the last few years.
Applications allow us to shop online in virtual stores, virtual guides create vacation tours on the basis of budget and time, and we store downloaded movies and songs in our personalized virtual video libraries.
Virtualization is a technology that enables us to create a logical (virtual) version of an actual physical object. It involves distributing the capabilities of a physical machine among many users or environments emulated using software.
Any resource, such as an operating system, storage, or a computer network, can be virtualized. The appeal of the technology is that applications running on top of a virtual machine behave as if they have their own dedicated operating system and libraries, while the underlying resources are actually shared among various applications.
Virtualization technology traces its foundations to the late 1960s, but it became widely accepted only after more recent developments and technological paradigm shifts such as cloud and grid computing.
Earlier generations of operating systems, such as batch processing systems, also influenced the adoption of virtualization and led to the development of hypervisors, which gave multiple users simultaneous access to computers.
The real change happened in the 1990s as the IT sector evolved. The growth of the Internet and high-level networking protocols left most existing (legacy) applications obsolete or incompatible with the existing resource architecture.
Enterprises had an immediate need to upgrade their IT infrastructure. Adopting virtualization in that scenario allowed the IT industry to run multiple operating system types and versions over multiple partitions of their servers. This resulted in optimized use of resources and ultimately reduced associated costs such as license purchases, infrastructure setup, and cooling.
Virtualizing an object can increase the efficiency of the resource in terms of its utility; examples include the ability to run multiple versions of system and application software on a single computer system simultaneously, and software-defined controls for Storage Area Networks (SANs) that provide availability and load balancing.
Various organizations are taking advantage by adopting virtualization due to the following factors:
- Cost reduction by limiting hardware resources, which lowers operational and maintenance costs and ultimately reduces electricity, air conditioning, and cooling costs.
- Increased availability and decreased downtime through isolated instances of resources, which improves efficiency, uptime and business continuity.
- Easier backup and redeployment through simplified backup procedures, not only for virtualized servers but also for virtual machines, which can move from one server to another quickly.
- Protecting the environment by saving energy and generating less e-waste.
We have discussed the benefits and adoption of virtualization in the business community in detail, but there are still a few issues that need to be addressed:
- Proper VM management and capacity management are required; otherwise VMs can sprawl out of control and create a virtual bottleneck that severely degrades the performance of the virtualized system.
- Virtual backups are among the most prominent IT challenges for the business community; they can be tricky and can threaten the performance of the whole system.
- Additional and unanticipated costs often result when specialized hardware and licenses are required to keep high availability and performance. Burden of VM management, requirement of storage area networks, increased headcount, and so on are a few examples of unanticipated cost.
- Depleted resources occur due to VM and network card saturation and can result in serious performance issues like reduced bandwidth, bogging of I/O intensive operations, and so on.
Techniques of Virtualization
The focus of virtualization is to raise the utilization of the underlying hardware resources to their maximum capacity. This decreases hardware costs by running multiple virtualized instances on one physical machine and minimizes power consumption.
However, a running application demands exclusive access to the processor, and it is the task of the operating system to implement abstraction and make sure that there is no interference between applications.
This protection is usually implemented as a set of concentric rings as shown in the figure below.
Also known as privilege levels or protection rings, they provide security and fault tolerance by restricting usage of resources to specific privilege levels:
Level 0 runs the OS kernel and is therefore the most privileged level, whereas user programs run in Level 3, the least privileged level.
Device drivers and other operating system services execute in Level 1 and Level 2. The operating system switches between these levels as required by the type of program in execution; for example, when the OS is booted, the CPU is usually in Level 0.
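The ring model described above can be illustrated with a small sketch. The names `Ring`, `PrivilegeFault` and `execute` are hypothetical and only model the rule that lower-numbered rings are more privileged; real processors enforce this check in hardware.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Privilege levels; a lower number means more privilege."""
    KERNEL = 0      # OS kernel
    SERVICES_1 = 1  # device drivers and OS services
    SERVICES_2 = 2  # device drivers and OS services
    USER = 3        # user programs


class PrivilegeFault(Exception):
    """Raised when code requests an operation above its privilege level."""


def execute(operation, current, required):
    # An operation is allowed only if the caller's ring number is less
    # than or equal to the ring number the operation requires.
    if current > required:
        raise PrivilegeFault(
            f"{operation} needs ring {int(required)}, "
            f"caller is in ring {int(current)}"
        )
    return f"{operation} executed in ring {int(current)}"
```

In this model, a user program (ring 3) that attempts a kernel-only operation triggers a fault, which is the behaviour the operating system relies on to keep applications from interfering with each other.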
Techniques of virtualization can be categorized into the following three categories:
- Full Virtualization
- Para Virtualization
- Hardware Assisted Virtualization
In full virtualization, the underlying hardware is replicated and made available to the guest operating system. The guest OS is not aware that it is being virtualized and requires no modification, while user-level code (executing in Ring 3) is executed directly on the processor for high performance.
Since this technique translates kernel code to replace non-virtualizable instructions with new sequences of instructions that have the intended effect on the virtual hardware, the VMM must execute in Ring 0.
A fully virtualized architecture provides virtualized memory, virtual devices and a virtual BIOS. The technique that allows the guest operating system to run without any modification is known as Binary Translation.
VMware ESXi and Microsoft Virtual Server are examples of full virtualization.
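The idea behind binary translation can be sketched in a few lines. This is a toy model under stated assumptions: instructions are plain strings, `popf` and `sgdt` are used as commonly cited examples of sensitive x86 instructions, and the replacement names (`vmm_emulate_...`) and the `translate_block` function are invented for illustration.

```python
# Hypothetical mapping from sensitive guest instructions to the safe
# replacement sequences a VMM might emit. The replacement names are invented.
TRANSLATIONS = {
    "popf": ["call vmm_emulate_popf"],
    "sgdt": ["call vmm_emulate_sgdt"],
}


def translate_block(instructions, ring):
    """Return the instruction sequence that actually runs on the CPU."""
    if ring == 3:
        # User-level code is executed directly, without translation.
        return list(instructions)
    translated = []
    for instruction in instructions:
        # Kernel code: replace sensitive instructions, keep the rest as-is.
        translated.extend(TRANSLATIONS.get(instruction, [instruction]))
    return translated
```

The guest kernel is left unmodified on disk; only the stream of instructions handed to the processor changes, which is why the guest does not need to know that it is virtualized.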
Para Virtualization is an extension of virtualization in which the guest operating system (Guest OS) is modified and recompiled before it is installed inside a virtual machine.
It is sometimes also known as OS assisted Virtualization. Para virtualization improves the communication between the guest operating system and the hypervisor by replacing non-virtualizable instructions with hypercalls that communicate directly with the virtualization layer (the hypervisor). For this reason, the OS must run its privileged instructions in Ring 0. Ring 3 is for less privileged user applications.
Because para virtualization modifies operating systems before executing them, its compatibility and portability are poor. The open source Xen project is an example of para virtualization: it virtualizes the processor and memory using a modified Linux kernel and virtualizes I/O using custom guest OS device drivers.
Hardware Assisted Virtualization
As the name suggests, it is a virtualization approach that provides full virtualization using hardware capabilities.
The technology was introduced by Intel and AMD to improve processor utilization and to overcome other challenges such as memory and address resolution and instruction translation.
Hardware virtualization effectively embeds VM support into the hardware components of a server. The idea is to consolidate small physical servers into one large physical server and use the processor effectively.
Since the OS needs direct access to hardware and memory modules in this approach, it must execute its privileged instructions in Ring 0, while user-level applications typically run in Ring 3. Privileged instructions are executed in a CPU execution mode that allows the VMM to run in a new root mode below Ring 0.
How virtualization works
The goal of virtualization technology is to create an independent environment for different applications on a single hardware machine.
This is done by creating virtual instances of operating systems, applications, and other software that were designed to run directly on hardware.
The technique extends the capabilities of your machine and allows you to run multiple applications (especially operating systems) at the same time over a single hardware configuration.
A Virtual Machine (VM) is a software-created instance that behaves like a physical computer. End users have the same look and feel on a VM as they would have on the actual physical hardware.
These VMs are portable, platform independent and sandboxed from the host system. A host can even run multiple VMs simultaneously on a single hardware configuration. But to create a VM we need substantial processing power, physical memory and network bandwidth.
The figure below shows the logical architecture of a VM in which multiple guest operating systems run simultaneously on a single host machine.
To properly manage the working of VMs and maintain the integrity of virtual environments, a Virtual Machine Monitor (VMM), or hypervisor, is used.
It can be a software agent, firmware or a hardware device. The physical machine on which the VMM runs is called the host machine and each VM is termed a guest machine. The following sections discuss some popular VMMs/hypervisors.
Xen is a Type 1, native or bare-metal hypervisor. It directly runs on the host’s hardware and therefore multiple instances of operating systems (either similar or different) can be installed on a single hardware machine.
We can consider it as a software virtualization layer that operates over the hardware and manages CPU, memory and interrupts.
It is the first program to run after the bootloader. Xen is an open source project that is used in the AWS Cloud, server virtualization, Infrastructure-as-a-Service (IaaS), desktop virtualization, security virtualization, and so on.
The figure below shows the Xen hypervisor architecture.
You can see that the hypervisor layer is directly installed on Physical Hardware. There is no need to pre-install any host operating system to run virtual machines.
Features of Xen
The features of Xen are as follows:
Xen provides an advanced memory management technique in which the hypervisor can reclaim unused memory from one guest machine and share it with the other guest machines on a host. It therefore allows the amount of RAM allotted to guest VMs to exceed the actual amount of physical RAM available on the host.
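The reclamation described above can be sketched as simple bookkeeping. The `Guest` class, the `reclaim_unused` function and the `keep_free_mb` safety margin are hypothetical names for illustration; this is not Xen's actual interface.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    name: str
    allocated_mb: int  # memory currently assigned to the guest
    used_mb: int       # memory the guest is actively using


def reclaim_unused(guests, keep_free_mb=0):
    """Shrink each guest's allocation to what it uses plus a margin.

    Returns the total number of megabytes handed back to the hypervisor.
    """
    reclaimed = 0
    for guest in guests:
        spare = guest.allocated_mb - guest.used_mb - keep_free_mb
        if spare > 0:
            guest.allocated_mb -= spare
            reclaimed += spare
    return reclaimed
```

The memory returned by `reclaim_unused` is what the hypervisor could then hand to other guests, which is how the sum of configured guest memory can exceed the physical RAM of the host.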
CPU pooling is a resource management feature of Xen version 4.2. This technique divides the physical cores on the machine into different pools, each with its own customizable CPU scheduler.
At runtime, a VM is assigned to one of the pools, but it can later migrate to any other pool during its execution. Since the scheduler is customizable, different scheduling parameters can be requested for different VMs.
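A minimal sketch of the pool model described above follows. The `CpuPool` and `Host` classes are hypothetical, the scheduler strings are plain labels, and no actual scheduling is simulated; the sketch only shows that cores are partitioned between pools and that a VM can move from one pool to another.

```python
class CpuPool:
    def __init__(self, name, cores, scheduler):
        self.name = name
        self.cores = set(cores)
        self.scheduler = scheduler  # label only; nothing is scheduled here
        self.vms = set()


class Host:
    def __init__(self, pools):
        self.pools = {}
        used = set()
        for pool in pools:
            # The physical cores are divided between pools, so no overlap.
            if used & pool.cores:
                raise ValueError("a physical core can belong to only one pool")
            used |= pool.cores
            self.pools[pool.name] = pool

    def pool_of(self, vm):
        for pool in self.pools.values():
            if vm in pool.vms:
                return pool
        return None

    def assign(self, vm, pool_name):
        """Place a VM in a pool, moving it out of its current pool if any."""
        current = self.pool_of(vm)
        if current is not None:
            current.vms.discard(vm)
        self.pools[pool_name].vms.add(vm)
```

Calling `assign` a second time with a different pool name models the runtime migration between pools, after which the VM is governed by the new pool's scheduler.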
Remus fault tolerance
It is responsible for high availability in Xen. This is done by recurrently creating live backups of running VMs to the backup server which automatically activates in case of failure.
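The checkpoint-and-failover idea can be sketched as follows. This is a toy model: `ReplicatedVM` is a hypothetical class, the VM state is a plain dictionary, and the backup lives in the same process rather than on a separate backup server.

```python
import copy


class ReplicatedVM:
    """Keeps a backup copy of a VM's state, refreshed at each checkpoint."""

    def __init__(self, state):
        self.primary = state
        self.backup = copy.deepcopy(state)

    def checkpoint(self):
        # Copy the running state to the backup.
        self.backup = copy.deepcopy(self.primary)

    def fail_over(self):
        # On failure, execution resumes from the last checkpoint.
        self.primary = copy.deepcopy(self.backup)
        return self.primary
```

Any change made after the most recent checkpoint is lost on failover in this model, which is why checkpoints need to be taken frequently.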
Virtual machine introspection
It is a security technique in Xen which audits the sensitive memory areas of guest machines using specialized hardware support with minimal overhead.
Kernel-based virtual machine (KVM)
KVM is an open source virtualization layer merged into the mainline Linux kernel. It converts a Linux machine into a Type 1, bare-metal hypervisor that can run multiple isolated virtual environments.
Since KVM is integrated into the Linux kernel, modules like memory management, CPU scheduling, input/output (I/O), device management, and so on are already built in.
Each VM is treated as a standard Linux process, which is scheduled by the standard Linux scheduler and given virtualized hardware.
The figure below shows a typical KVM architecture.
The KVM module is embedded with the Linux operating system, over which virtual machines run simultaneously with standalone Linux applications.
QEMU is an open source emulator for hardware virtualization. It stands for Quick Emulator and acts as a virtual machine monitor when executed using the KVM kernel module in Linux.
Features of KVM
Mandatory Access Control (MAC) security
Mechanisms are implemented for the security of guest machines. VM security and VM isolation are provided using enhanced Linux features, namely Security-Enhanced Linux (SELinux) and secure virtualization (sVirt).
Hardware and storage support
KVM is compatible with a wide variety of Linux certified hardware platforms, local disks and network-attached storage (NAS).
Live VM migration
It is a feature of KVM through which running VMs can be migrated between hosts without service interruption.
VMware (acquired by Broadcom in 2023, and previously a subsidiary of Dell Technologies) is a virtualization software provider based in Palo Alto, California. The company has established itself among the key virtualization providers in the industry.
VMware classifies its products into the following two categories:
- Desktop applications
- Server applications
Desktop applications are compatible with almost all operating systems and comprise three major products:
- VMware workstation : It is a virtualization software package with which multiple instances of operating systems (either similar or different) can be installed on a single hardware machine.
- VMware fusion : It is a specialized product for Apple’s Mac OS X with additional compatibility.
- VMware player : VMware player is the free counterpart to VMware workstation.
Server applications comprise the following products:
- VMware server : It is a freeware server software used to introduce virtualization over pre-installed operating systems.
- VMware ESX server : It is an enterprise-level server that provides improved functionality with less system overhead than the VMware server.
- VMware ESXi server : It is the same as the ESX Server except that the service console is replaced with a BusyBox installation. It also requires much less disk space than ESX.
The figure below shows the architecture of VMware.
As mentioned, it requires a console OS to be installed over the hardware. It creates a software-based virtualization layer over which multiple instances of operating systems can be hosted simultaneously. All the instances of the operating systems can be similar or different but they share the single hardware configuration.
Features of VMware
- Fault tolerance : This feature provides high availability and fault tolerance by creating a copy of a primary virtual machine. The copy becomes active immediately in case of VM failure.
- Distributed Switch (VDS) : It is a virtual switch that can span multiple ESXi hosts. This feature enables a significant reduction in ongoing network maintenance activities and an increase in network capacity.
- Host profiles : This feature saves the record of valid and authenticated hosts. Later, the hosts are auto-deployed using this stored configuration.
VirtualBox is a free, open-source hypervisor, available as pre-built binaries, developed by Oracle Corporation for x86 and AMD64/Intel64-based machines.
It is a Type 2 hypervisor that requires a pre-installed operating system over which it runs.
Being a cross-platform virtualization software product, VirtualBox can run on Windows, Linux, Mac OS and Solaris hosts, as shown in the figure below.
It is a powerful tool that scales from desktop machines to datacenter and cloud environments.
Features of VirtualBox
- No specialized hardware : VirtualBox does not depend on processor features such as Intel VT-x or AMD-V; it has no backward compatibility issues and does not require any additional hardware resources to run.
- Hardware support : In spite of being a Type 2 hypervisor, VirtualBox provides a number of hardware compatibility features like Guest Multiprocessing, USB device support, full Advanced Configuration and Power Interface (ACPI) support, Multiscreen Resolution, PXE Network Boot, and so on.
- Remote Display Protocol (RDP) : It is a distinctive feature of VirtualBox and is generally used for security purposes. Through RDP, a remote desktop client is given remote access to a running virtual machine. Clients need to authenticate themselves using the RDP authentication mechanism before connecting to a server. Winlogon on Windows and Pluggable Authentication Modules (PAM) on Linux are examples of RDP authentication services.
Citrix is a virtualization solution provider for application, desktop and server virtualization. Its server virtualization platform, XenServer, is built over the Xen virtual machine hypervisor. Citrix is well known for its integration with cloud technologies like Software-as-a-Service (SaaS) and Desktop-as-a-Service (DaaS).
Citrix allows remote devices to access applications and resources through a centrally located server.
Because the platform is open source and platform independent, resources can be accessed from anywhere, at any time and from any device.
The figure below shows a Citrix XenServer architecture.
The architecture is similar to that of the Xen hypervisor, which is at the heart of Citrix systems.
Areas where Citrix is used
- Desktop and application virtualization : Citrix XenApp provides application virtualization whereas Citrix XenDesktop, Citrix VDI-in-a-Box are tools for desktop virtualization.
- Desktop-as-a-Service (DaaS) : Some useful DaaS and business applications include Worx mobile apps for secure email, browser, and document sharing and Citrix workspace suite for mobile workspaces.
- Software-as-a-service (SaaS) : Podio, a cloud-based collaboration service, and OpenVoice for audio conferencing are SaaS offerings by Citrix.
Features of Citrix
- Any device, any time : Users have simple and secure access to resources regardless of location or device.
- Single instance management : Application and server images are stored, maintained and updated once in the datacenter and delivered on-demand.
- High end security features : Features such as encrypted delivery, multi-factor authentication, built-in password management and activity auditing provide a secure cloud infrastructure for delivering resources.
- Scalability : XenApp has proven able to support more than 70,000 users, scale beyond 1,000 servers in a single implementation and ensure 99.999 per cent application availability. It also provides intelligent load and capacity management.
Types of virtualization
In this tutorial we will discuss the different types of virtualization.
Availability of the right data at the right time is one of the main objectives of virtualization. Data virtualization is analogous to data agility: an application is allowed to access data irrespective of its technical details, formatting style and physical location.
As shown in the figure below, data virtualization aligns disparate sources into a single virtual data layer that provides unified access and integration data service.
Tools like Red Hat’s JBoss, Denodo, and so on fetch data from multiple heterogeneous sources, integrate it and transform it as per the user’s need. This ultimately results in faster access of data, less replication and high data agility.
Features of data virtualization
- Modern data integration : The integration and transformation of data is similar to the traditional Extract-Transform-Load Model (ETL Model) but leverages modern features like delivery of real-time data, data federation and data agility.
- Logical abstraction : This feature allows heterogeneous data from varied sources, middleware and applications to interact easily.
- Data blending : It is a feature of Business Intelligence (BI) in which heterogeneous data from multiple sources is combined and fed to the BI tool for analytical queries.
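The idea of a single virtual data layer over disparate sources can be sketched with two toy sources in different formats. The function name, the field names (`customer_id`, `name`, `balance`) and the sample sources are hypothetical; real data virtualization tools work against live databases and services rather than in-memory values.

```python
import csv
import io


def unified_customer_view(crm_records, billing_csv):
    """Join two differently formatted sources on customer_id at query time."""
    # Source 2 is CSV text; index its balances by customer id.
    balances = {
        row["customer_id"]: float(row["balance"])
        for row in csv.DictReader(io.StringIO(billing_csv))
    }
    view = []
    for record in crm_records:
        # Source 1 is a list of dictionaries; no data is copied or stored,
        # the combined rows are built only when the view is requested.
        key = str(record["customer_id"])
        view.append({**record, "balance": balances.get(key)})
    return view
```

The caller sees one uniform list of rows and does not need to know that one source was a list of records and the other a CSV document.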
Desktop virtualization is a technique in which the virtualization layer runs on top of a hypervisor and provides desktop provisioning. It is also called virtual desktop infrastructure (VDI).
We can relate this type of virtualization to the traditional client-server model, where a client requests a service from a centralized and remotely located server.
In desktop virtualization, VDI is responsible for hosting of the desktop environment in a VM that runs on a centralized or remote server. This is the reason why we also sometimes refer to desktop virtualization as client virtualization.
The figure below shows a Virtual Desktop, which is a two-facet centralized server, with the User Experience (Thick Client, Desktop, Laptop) on one side and applications (OS, Provisioning & Update Data & Personalization, Application Virtualization) on the other side.
The remote server is responsible for disaster recovery, security, availability, and backup of data.
Features of Desktop Virtualization
- User’s computer dies; the user’s desktop does not die : If hardware damage occurs, the failed hardware can be quickly replaced and simply reconnected to the virtual desktop.
- No installations or updates : Since the operating system and other application software are centrally managed, there is no overhead of installation and regular version updates.
- Inter device compatibility : The virtual desktop can be accessed by either a desktop machine or by a tablet or mobile device. Also, multiple users can share a common virtual desktop environment, that is, the desktop or application needs to be installed only once and it will be available to multiple users.
CPU virtualization allows a single processor to act as multiple individual CPUs, that is, as if two or more separate systems were running on a single machine. The objective of CPU virtualization is to enable users to run different operating systems simultaneously.
As represented in the figure below, a hypervisor layer is installed over the physical hardware.
Multiple virtual machine monitors are installed over the hypervisor to allow the execution of multiple instances of operating systems.
Currently, all prominent CPU vendors, including Intel (Intel Virtualization Technology or Intel VT) and AMD (AMD-V), support CPU virtualization.
Since virtualization sometimes requires kernel-level and control-sensitive instructions, which tend to change memory mapping and resource configuration, CPU virtualization is generally disabled by default in the BIOS and needs to be enabled manually once using the CPU configuration settings.
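One way to see whether a Linux machine reports these processor extensions is to look for the `vmx` (Intel VT) or `svm` (AMD-V) entries on the `flags` lines of `/proc/cpuinfo`. The helper below is a minimal sketch; the function name `virtualization_flags` is hypothetical, and it takes the file contents as text so that it can be tried on any platform.

```python
def virtualization_flags(cpuinfo_text):
    """Return the subset of {'vmx', 'svm'} listed on any 'flags' line."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # Each line has the form "key : value"; only "flags" is of interest.
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found
```

On a Linux host the input could come from `open("/proc/cpuinfo").read()`. An empty result may mean either that the processor lacks the extension or that it is not being reported, so the firmware settings are still worth checking.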
Features of CPU Virtualization
- Virtual processor IDs (VPID) : These are unique IDs given to each VM that is currently in the running state. They prevent the CPU from flushing the translation lookaside buffers (TLBs) at the time of context switching between VMs. VPIDs therefore help provide flexibility and quality of service in terms of live VM migration.
- Descriptor table exiting : This feature prevents the relocation of key system data structures and thereby protects the guest OS from internal vulnerabilities.
- Pause-loop exiting : This feature detects spin locks in the guest software and avoids lock-holder preemption, which reduces overhead and improves performance.
- Extended page table (EPT) : This technique gives the guest OS the capability to handle its own page faults and modify its own page tables.
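The effect of tagging TLB entries with a per-VM ID can be sketched with a dictionary keyed by the ID and the virtual page. `TaggedTLB` is a hypothetical toy class; a real TLB is a fixed-size hardware cache with eviction, which this sketch leaves out.

```python
class TaggedTLB:
    """Toy TLB whose entries are tagged with the ID of the owning VM."""

    def __init__(self):
        self.entries = {}  # (vpid, virtual_page) -> physical_frame

    def insert(self, vpid, virtual_page, physical_frame):
        self.entries[(vpid, virtual_page)] = physical_frame

    def lookup(self, vpid, virtual_page):
        # Returns None on a miss. An entry cached for another VM can never
        # match, because the tag is part of the key.
        return self.entries.get((vpid, virtual_page))
```

Because two VMs can cache translations for the same virtual page side by side, switching from one VM to the other does not require discarding the cached entries.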
In network virtualization, hardware and software resources and their functionalities are encapsulated into a software-based administrative entity.
The ultimate result is a virtual network that is highly efficient in terms of utilization, with less time overhead.
The figure below illustrates that network virtualization creates the virtualized combination of available resources by splitting up the bandwidth into channels so that each device or user can have shared access to all the resources on the network.
The advantages of multiple instances of virtual networks are as follows:
- Each has a separate control and data plane.
- They coexist together over a single physical network.
Classification of Network Virtualization
Network virtualization can be classified into the following two classes:
- External network virtualization : In this type of virtualization, a single virtual network is created by either a combination or division of multiple local area networks, administered by the software system. VLAN and network switch are the components of this type of virtualization.
- Internal network virtualization : This type of virtualization uses a single system to act as a hypervisor (Xen/KVM) to control virtualized network interface cards (VNICs). Each host can have one or more NICs and each NIC can be a base for multiple VNICs.
Features of Network Virtualization
The features of network virtualization are as follows:
- Partitioning : This feature allows you to create logical network partitions with a programmable control plane so that users can define protocols, network topologies, and functions as per their requirements.
- Isolation : Isolation is maintained between the logical network partitions to avoid any kind of interference and performance degradation.
- Abstraction : This feature hides the underlying complexities and characteristics of network elements from applications and users.
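Partitioning and isolation can be illustrated with a toy switch that places each port in a VLAN and delivers broadcasts only within that VLAN. `VirtualSwitch` and its method names are hypothetical and do not correspond to any particular product.

```python
class VirtualSwitch:
    """Toy switch: every attached port belongs to exactly one VLAN."""

    def __init__(self):
        self.port_vlan = {}  # port name -> VLAN ID

    def attach(self, port, vlan_id):
        self.port_vlan[port] = vlan_id

    def broadcast_targets(self, source_port):
        # A broadcast reaches only the other ports in the sender's VLAN,
        # so traffic in one partition is invisible to the others.
        vlan = self.port_vlan[source_port]
        return sorted(
            port
            for port, port_vlan in self.port_vlan.items()
            if port_vlan == vlan and port != source_port
        )
```

All ports share one physical switch in this model, yet each VLAN behaves like a separate network, which is the coexistence of several virtual networks over a single physical one.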
Storage virtualization is the assembling of physical storage from heterogeneous storage devices to form a large pool of storage which is managed centrally, as shown in the figure below.
By allowing the storage to participate in storage area networks (SANs), we increase the efficiency, flexibility and load balancing of storage devices.
This should not be confused with network attached storage (NAS): a SAN is a network of storage devices, while NAS is either a single device or a server.
In 2001, the Storage Networking Industry Association (SNIA) made an effort to describe the important characteristics of storage virtualization. The group defined storage virtualization as follows:
- The act of abstracting, hiding, or isolating the internal functions of a storage system or service from applications, host computers, or general network resources for the purpose of enabling application and network-independent management of storage or data.
- The application of virtualization to storage services or devices for the purpose of aggregating functions or devices, hiding complexity, or adding new capabilities to lower level storage resources.
Features of Storage Virtualization
The features of storage virtualization are as follows:
- Non-disruptive data migration : This feature allows data to migrate without disturbing the concurrent I/O access.
- Improved utilization : Utilization is increased by pooling and migrating.
- Thin provisioning : This technology dynamically allocates storage capacity to a volume as per the usage requirement, that is, it allows you to tell the application that it has sufficient storage without actually assigning all of that storage to it up front.
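Thin provisioning can be sketched as a pool that records the size each volume advertises separately from the capacity actually consumed. `ThinPool` and its methods are hypothetical, and sizes are tracked in whole gigabytes for simplicity.

```python
class ThinPool:
    """Toy storage pool that allocates physical capacity only on write."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.volumes = {}  # name -> [advertised_gb, used_gb]

    def create_volume(self, name, advertised_gb):
        # Creating a volume consumes no physical capacity.
        self.volumes[name] = [advertised_gb, 0]

    def provisioned_gb(self):
        return sum(advertised for advertised, _ in self.volumes.values())

    def used_gb(self):
        return sum(used for _, used in self.volumes.values())

    def write(self, name, gb):
        advertised, used = self.volumes[name]
        if used + gb > advertised:
            raise ValueError("volume is full")
        if self.used_gb() + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical capacity")
        self.volumes[name][1] = used + gb
```

The sum of advertised sizes may exceed the physical capacity of the pool, so the physical usage has to be monitored; the `RuntimeError` branch models what happens when it is not.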
In the traditional client-server architecture, the server machine runs only one instance of resources. These resources can be processors, operating systems, application software, memory and so on.
The idea of server virtualization is to divide a physical server machine into a number of logically isolated virtual machines and thereby create a number of instances of resources as shown in the figure below.
Adopting this approach has the advantage that, rather than deploying a number of servers that may not be fully utilized, numerous virtual machines can run on the same physical platform.
For example, in a company payroll system, rather than using separate servers for employee database, email server and document maintenance, all these applications can be virtualized onto a single server machine.
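A rough consolidation estimate can be worked out with a few lines of arithmetic. This is a sketch under simplifying assumptions: only CPU load is considered, every physical host is identical, and both the function name `hosts_needed` and the utilization figures used with it are hypothetical.

```python
def hosts_needed(cpu_percent_per_workload, target_percent=80):
    """Estimate how many identical hosts can carry the given workloads.

    Each workload is given as the average CPU percentage it uses on its
    current dedicated server; target_percent is the load each virtualized
    host is allowed to reach.
    """
    total = sum(cpu_percent_per_workload)
    # Integer ceiling division, with at least one host.
    return max(1, -(-total // target_percent))
```

For instance, three lightly loaded servers using 15, 10 and 20 per cent of their CPUs add up to 45 per cent, which fits on a single host under an 80 per cent target. Memory, storage and I/O would need the same kind of check in practice.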
A server can be virtualized by any of the following techniques:
- A hypervisor or VMM is a software layer that exists between the hardware machine and the operating system and is used to handle kernel-level instructions, queue and process client requests, and so on.
- Paravirtualization is a hypervisor-based virtualization in which the performance of a virtual machine is enhanced by modifying the guest OS for virtualization before installing it on the virtual machine. The idea behind paravirtualization is to prepare the machine for virtualization and abstract the underlying hardware resources from the software that uses those resources. Xen and User Mode Linux (UML) are examples of server virtualization through paravirtualization.
- Hardware virtualization is similar to paravirtualization except that some hardware assistance is provided. AMD-V Pacifica and Intel VT Vanderpool are examples of hardware supported virtualization.
- OS virtualization provides multiple but logically isolated virtual machines that run on the operating system kernel. The technique is also called the shared kernel approach because all the virtual machines share the same kernel of the host operating system. FreeVPS and Linux-VServer are examples of OS level virtualization.
Features of Server Virtualization
The features of server virtualization are as follows:
- Partitioning : This feature allows multiple virtual servers to run on one physical server at the same time.
- Isolation : The virtual servers running on the physical server are completely isolated and don’t affect the execution of each other.
- Encapsulation : All the information on virtual servers, including boot disks, is saved in files.
- Hardware independence : A virtual server runs as it is after migration to different hardware platforms.
- Improved business continuity : Live migration of virtual servers to another physical server allows servers to be maintained without shutting down and hence improves availability and business continuity.
Virtualization in Cloud
The ultimate aim of cloud technology is to be ubiquitous, pervasive, and agile while supporting collaboration and sharing of resources.
Virtualization plays a vital role in achieving these objectives and empowering the cloud. To turn ideas into a streamlined business, an enterprise needs business applications, which are very expensive and come with a complicated software stack.
Whenever a new version is released, the update can cause incompatibilities within the stack and bring the whole system down.
To overcome this problem, virtualization is used.
Virtualization not only provides sharing of data but also sharing of infrastructure. Through virtualization, the resources become massively scalable and can be integrated with IT-related capabilities.
The capabilities of virtualization and sharing of infrastructure in the cloud computing paradigm are delivered through three major service models, namely SaaS, PaaS, and IaaS.
Google Apps and Cisco WebEx for SaaS, Microsoft Azure and Google App Engine for PaaS, and Rackspace and Amazon AWS for IaaS are some third-party providers that host applications, platforms and infrastructure over the cloud.