In a virtualized environment, storage operations traditionally have been expensive from a resource perspective.
Functions such as cloning and snapshots can be performed more efficiently by the storage device than by
the host.
VMware vSphere® Storage APIs – Array Integration (VAAI), also referred to as hardware acceleration or
hardware offload APIs, are a set of APIs that enable communication between VMware vSphere ESXi™ hosts and
storage devices. The APIs define a set of “storage primitives” that enable the ESXi host to offload certain storage
operations to the array. Offloading reduces resource overhead on the ESXi hosts and can significantly improve
performance for storage-intensive operations such as cloning and zeroing. The goal of VAAI is to
help storage vendors provide hardware assistance to speed up VMware® I/O operations that are more efficiently
accomplished in the storage hardware.
Without the use of VAAI, cloning or migration of virtual machines by the vSphere VMkernel Data Mover involves
software data movement. The Data Mover issues I/O to read and write blocks to and from the source and
destination datastores. With VAAI, the Data Mover can use the API primitives to offload operations to the array if
possible. For example, if the desired operation were to copy a virtual machine disk (VMDK) file from one
datastore to another inside the same array, the array would be directed to make the copy completely inside the
array. Whenever a data movement operation is invoked and the corresponding hardware offload operation is
enabled, the Data Mover will first attempt to use the hardware offload. If the hardware offload operation fails,
the Data Mover reverts to the traditional software method of data movement.
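The try-offload-then-fall-back behavior described above can be sketched in a few lines. This is only an illustration of the control flow, not VMkernel code: the names move_data, OffloadError, array_copy_ok, array_copy_unsupported, and host_copy are all hypothetical, and the "array" is just an in-memory function.

```python
from typing import Callable, Tuple


class OffloadError(Exception):
    """Hypothetical error: the array rejected or failed the offload request."""


def move_data(src: bytes,
              hardware_copy: Callable[[bytes], bytes],
              software_copy: Callable[[bytes], bytes],
              offload_enabled: bool = True) -> Tuple[bytes, str]:
    """Copy src, preferring the hardware offload when it is enabled.

    Returns the copied data and the name of the path that was used.
    """
    if offload_enabled:
        try:
            return hardware_copy(src), "hardware"
        except OffloadError:
            pass  # offload failed: revert to software data movement
    return software_copy(src), "software"


def array_copy_ok(data: bytes) -> bytes:
    # Stand-in for an array that performs the copy internally.
    return bytes(data)


def array_copy_unsupported(data: bytes) -> bytes:
    # Stand-in for an array that cannot service the offload.
    raise OffloadError("offload not supported for this request")


def host_copy(data: bytes, block_size: int = 4) -> bytes:
    # Software data movement: the host reads and writes every block itself.
    out = bytearray()
    for offset in range(0, len(data), block_size):
        out += data[offset:offset + block_size]
    return bytes(out)
```

Calling move_data(b"abcdefgh", array_copy_unsupported, host_copy) returns the copied bytes together with the label "software", which mirrors the fallback described in the text.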
In nearly all cases, hardware data movement will perform significantly better than software data movement. It
will consume fewer CPU cycles and less bandwidth on the storage fabric. Improvements in performance can be
observed by timing operations that use the VAAI primitives and using esxtop to track values such as CMDS/s,
READS/s, WRITES/s, MBREAD/s, and MBWRTN/s of storage adapters during the operation.
In the initial VMware vSphere 4.1 implementation, three VAAI primitives were released. These primitives applied
only to block (Fibre Channel, iSCSI, FCoE) storage. There were no VAAI primitives for NAS storage in this
initial release.
In vSphere 5.0, VAAI primitives for NAS storage and VMware vSphere Thin Provisioning were introduced.
VAAI Block Primitives
In VMware vSphere VMFS, many operations must establish a lock on the volume when updating a resource.
Because VMFS is a clustered file system, many ESXi hosts can share the volume. When one host must make an
update to the VMFS metadata, a locking mechanism is required to maintain file system integrity and prevent
another host from coming in and updating the same metadata. The following operations require this lock:
1. Acquire on-disk locks
2. Upgrade an optimistic lock to an exclusive/physical lock
3. Unlock a read-only/multiwriter lock
4. Acquire a heartbeat
5. Clear a heartbeat
6. Replay a heartbeat
7. Reclaim a heartbeat
8. Acquire on-disk lock with dead owner
It is not essential to understand all of these operations in the context of this whitepaper. It is sufficient to
understand that various VMFS metadata operations require a lock.
Atomic Test & Set (ATS) is an enhanced locking mechanism designed to replace the use of SCSI reservations on
VMFS volumes when doing metadata updates. A SCSI reservation locks a whole LUN and prevents other hosts from
doing metadata updates of a VMFS volume when one host sharing the volume has a lock. This can lead to various
contention issues when many virtual machines are using the same datastore. It is a limiting factor for scaling to
very large VMFS volumes. ATS is a lock mechanism that needs to modify only a single disk sector on the VMFS volume.
When successful, it enables an ESXi host to perform a metadata update on the volume. This includes allocating
space to a VMDK during provisioning, because certain characteristics must be updated in the metadata to
reflect the new size of the file. The introduction of ATS addresses the contention issues with SCSI reservations
and enables VMFS volumes to scale to much larger sizes.
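The difference in lock granularity can be illustrated with a toy model. The sketch below is an assumption-laden simulation, not the VMFS on-disk format: the Lun class, the FREE marker, and the ats_lock and ats_unlock helpers are hypothetical names, and a Python mutex stands in for the atomicity that the array provides.

```python
import threading

FREE = b"\x00"  # hypothetical marker for an unowned lock sector


class Lun:
    """Toy LUN: a list of sectors plus a whole-LUN reservation flag."""

    def __init__(self, num_sectors: int) -> None:
        self._sectors = [FREE] * num_sectors
        self._mutex = threading.Lock()  # stands in for array-side atomicity
        self.reserved_by = None

    def compare_and_write(self, sector: int, expected: bytes, new: bytes) -> bool:
        """Atomically write `new` only if the sector still holds `expected`."""
        with self._mutex:
            if self._sectors[sector] != expected:
                return False
            self._sectors[sector] = new
            return True

    def reserve(self, host: str) -> bool:
        """SCSI-reservation style lock: one holder for the entire LUN."""
        with self._mutex:
            if self.reserved_by not in (None, host):
                return False
            self.reserved_by = host
            return True

    def release(self, host: str) -> None:
        with self._mutex:
            if self.reserved_by == host:
                self.reserved_by = None


def ats_lock(lun: Lun, sector: int, host_id: bytes) -> bool:
    # Take the lock only if the sector is currently free.
    return lun.compare_and_write(sector, FREE, host_id)


def ats_unlock(lun: Lun, sector: int, host_id: bytes) -> bool:
    # Release only if this host still owns the lock sector.
    return lun.compare_and_write(sector, host_id, FREE)
```

In this model two hosts can hold ATS locks on different sectors of the same LUN at the same time, whereas a second reserve call from another host fails until the first host releases the whole LUN.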
In vSphere 4.0, VMFS3 used SCSI reservations for establishing the lock, because there was no VAAI support in
that release. In vSphere 4.1, on a VAAI-enabled array, VMFS3 used ATS only for operations 1 and 2 in the
preceding list, and only when there was no contention for disk lock acquisitions. VMFS3 reverted to using SCSI
reservations if there was a multihost collision when acquiring an on-disk lock using ATS.
In the initial VAAI release, the ATS primitives had to be implemented differently on each storage array, requiring
a different ATS opcode depending on the vendor. ATS is now a standard T10 SCSI command and uses opcode
0x89 (COMPARE AND WRITE).
For VMFS5 datastores formatted on a VAAI-enabled array, all the critical section functionality in operations 1
through 8 is done using ATS. There should no longer be any SCSI reservations on VAAI-enabled VMFS5. ATS
continues to be used even if there is contention. On non-VAAI arrays, SCSI reservations continue to be used for
establishing critical sections in VMFS5.
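The rules above can be condensed into a small decision function. This is a sketch under stated assumptions: lock_mechanism is a hypothetical helper, operation numbers refer to the list of eight lock operations given earlier, the VMFS3 branch models vSphere 4.1 behavior, and the VMFS5 branch assumes the datastore was formatted on the VAAI-enabled array.

```python
def lock_mechanism(vmfs_version: int, vaai_array: bool,
                   operation: int, contention: bool = False) -> str:
    """Return the locking method implied by the rules in the text.

    vmfs_version: 3 or 5
    vaai_array:   True if the array supports VAAI (ATS)
    operation:    1..8, numbered as in the list of lock operations
    contention:   True if hosts collided while acquiring the lock
    """
    if vmfs_version not in (3, 5):
        raise ValueError("only VMFS3 and VMFS5 are modeled here")
    if not 1 <= operation <= 8:
        raise ValueError("operation must be between 1 and 8")

    if not vaai_array:
        # Without VAAI, both VMFS3 and VMFS5 use SCSI reservations.
        return "SCSI reservation"
    if vmfs_version == 5:
        # VMFS5 on a VAAI array: ATS for every operation, even under contention.
        return "ATS"
    # VMFS3 on a VAAI array (vSphere 4.1): ATS for operations 1 and 2 only,
    # and only while there is no contention.
    if operation in (1, 2) and not contention:
        return "ATS"
    return "SCSI reservation"
```

For example, lock_mechanism(3, True, 1, contention=True) yields "SCSI reservation", while lock_mechanism(5, True, 7, contention=True) yields "ATS".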