Storage compression is one of those efficiency features that isn't talked about much. There are many ways to implement compression: file system compression, storage array based compression, backup data compression, and so on. Coupled with thin provisioning on CLARiiON/VNX systems, compression gives thin provisioned pools the much-needed space reclamation capability that allows white space to be reclaimed. Remember, one of the downsides of implementing thin provisioning for server virtualization environments was the lack of thin space reclamation, and storage compression brings exactly that capability to the EMC CLARiiON/VNX arrays.
Data written to LUNs is compressed once set thresholds/parameters are met, and compression generally runs as a background task. When a host reads compressed data, it is decompressed in memory and made available to the host. A LUN enabled for compression becomes thin provisioned as well, and freed thin-LUN block space is made available for allocation at the storage pool level. Compression is best suited for data LUNs that contain white space or recurring patterns.
In EMC arrays, when we enable compression on a LUN, an initial pass over the complete LUN is carried out, after which data is processed in fixed 64 KB increments, encoding any recurring strings within the increment currently being processed before moving on to the next 64 KB increment. If the resulting savings are less than 8 KB, the data is written in uncompressed format. If the system can free up 1 GB worth of 8 KB blocks on a LUN, that space is returned to the storage pool as free space. NetApp, for comparison, uses 32 KB compression groups for scanning.
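As a rough illustration of this chunked approach (this is not EMC's actual encoder; `zlib` stands in for the array's algorithm, and only the 64 KB increment and 8 KB threshold come from the description above):

```python
import zlib

CHUNK = 64 * 1024        # data is processed in fixed 64 KB increments
MIN_SAVINGS = 8 * 1024   # keep the compressed form only if >= 8 KB is saved

def compress_lun(data: bytes) -> list[bytes]:
    """Walk the data in 64 KB chunks; store a chunk compressed only
    when doing so saves at least 8 KB, otherwise store it as-is."""
    stored = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        packed = zlib.compress(chunk)
        if len(chunk) - len(packed) >= MIN_SAVINGS:
            stored.append(packed)   # worthwhile savings: keep compressed
        else:
            stored.append(chunk)    # savings under 8 KB: keep uncompressed
    return stored

# A chunk of recurring patterns compresses well; a small tail chunk
# can never save 8 KB, so it stays uncompressed.
stored = compress_lun(b"A" * CHUNK + bytes(range(256)) * 4)
```

This also shows why compression pays off mainly for LUNs with white space or recurring patterns: incompressible chunks simply stay as they are.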
Most data stored on disk today has at least some redundancy, so it compacts easily and a lot of space can be reclaimed. Based on my experience, at least 10% of space can be reclaimed, and depending on the data type the figure can go even higher.
I have talked about EMC array compression because I have worked on those arrays and experienced the ease of use and the benefits first-hand, but it should be much the same with other storage arrays that support the feature. Please go through the vendor documentation for in-depth details, and start compressing your data now!
In a traditional storage allocation model, the storage requirement for most environments is satisfied by allocating storage based on the capacity requirement rather than performance, either because the performance requirement of the application/environment is not known during its initial deployment phase or due to a lack of understanding of the importance of RAID and the associated disk penalties.
Consider this: if there is a requirement to host 500 VMs of 18 IOPS and 10 GB capacity each, we are looking at a storage requirement of roughly 6 TB (500 * 10 GB plus a 25% buffer) and 9000 IOPS. Considering a read:write ratio of 70:30 on RAID 5, we are looking at a back-end IOPS requirement of 17100, which can be satisfied by 95 15K drives. With those 95 drives in a RAID 5 (4+1) configuration to satisfy the IOPS requirement, we would be left with roughly 38 TB of usable space. But our capacity requirement was only 6 TB. How do we satisfy the IOPS and still right-size for capacity, or how do we size for capacity and still match the IOPS requirement? Are we not spending additional $$$ to accommodate performance requirements when most of the virtual workloads would not be active all the time?
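The arithmetic above can be checked quickly. The 180 IOPS per 15K drive and the RAID 5 write penalty of 4 are the usual rule-of-thumb figures I am assuming here:

```python
import math

frontend_iops = 500 * 18   # 500 VMs x 18 IOPS each = 9000 front-end IOPS
read, write = 0.7, 0.3     # 70:30 read:write ratio
raid5_penalty = 4          # each host write costs 4 back-end I/Os on RAID 5
drive_iops = 180           # rule-of-thumb figure for a 15K RPM drive

# Reads pass through 1:1; every write is multiplied by the RAID penalty.
backend_iops = frontend_iops * read + frontend_iops * write * raid5_penalty
drives = math.ceil(backend_iops / drive_iops)

print(backend_iops)  # 17100.0
print(drives)        # 95
```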
Storage tiering is recommended when you want maximum efficiency at lower cost. Simply put, it automates data migration/placement across different tiers of storage according to workload. If we take the same example referred to above, instead of 95 600 GB drives, can we look at an alternate approach of using multiple drive types in a single storage pool, say 5 SSDs along with 27 900 GB 10K SAS drives plus 16 2 TB NL-SAS drives? The SSD and SAS drives would serve the performance requirements, whereas the NL-SAS drives would serve the capacity requirement at lower cost. The active data set within the pool would be automatically migrated to the SSD/SAS drives whenever it gets active, and inactive data would settle on the NL-SAS drives. There are a lot of advantages in this model, since you can keep expanding your storage pool online with additional drives as and when required. Data placement is handled automatically by the storage array, and with the powerful SSD-based caches available these days, storage tiering is the way forward for high-capacity environments.
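The placement logic in a nutshell: rank extents by how hot they are and fill the fastest tier first. A toy sketch (the extent names, access counts, and tier capacities are all made up for illustration; real arrays work on fixed-size slices with far more sophisticated policies):

```python
# Tiers listed fastest first, with capacity expressed in extents.
tiers = [("SSD", 2), ("10K SAS", 1), ("NL-SAS", 10)]

def place_extents(access_counts: dict[str, int]) -> dict[str, str]:
    """Sort extents hottest-first and fill the tiers top-down."""
    placement = {}
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    i = 0
    for name, capacity in tiers:
        for extent in ranked[i:i + capacity]:
            placement[extent] = name
        i += capacity
    return placement

counts = {"db-log": 900, "db-data": 700, "vm-os": 300, "archive": 5, "iso": 1}
placement = place_extents(counts)
# The hot database extents land on SSD; cold data settles on NL-SAS.
```

Re-running this periodically as the access counts change is, in essence, what the array's background relocation task does.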
For VMware environments, you have the additional choice of enabling Storage DRS (SDRS) in cases where the storage array doesn't have the capability, or the licenses, for storage tiering. SDRS can do a limited level of automatic placement of VMs between datastores, and where the datastores come from different tiers of storage systems, it can also serve as a form of virtual storage tiering.
Benefits of Storage Tiering
- Maximize Storage Efficiency at Lower Cost
- Simplified administration
- Decrease Drive Counts on the Array
- Cost Reduction
- Improve performance with wide striping on larger Storage Pools
Remember to understand your storage array's capabilities and limitations before you implement.
Traditionally, we allocate the storage required for an environment upfront, which is also referred to as thick provisioning or fat provisioning. For example, when there is a request for 100 GB of virtual disk space, we go ahead with the creation and allocation of the full 100 GB. But have you analyzed how long it would take for that 100 GB of space to be consumed?
- If this is the case for a single VM, imagine the space being allocated to the many virtual servers in the environment and how much of it goes underutilized. Even where the space is optimally utilized, how long before it is consumed completely? Can we use that space in the meantime for other use cases or for provisioning more VMs?
- Another use case is optimizing the free space in datastores. Depending on the environment, a certain amount of buffer space is reserved in every datastore. For example, if we consider 20% as buffer space, in a 2 TB datastore we would be left with 400 GB of free space per datastore. If your environment has 25 datastores of 2 TB each, you would be leaving approximately 10 TB (400 GB * 25) of unused space. Can we optimize this 10 TB of space?
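The buffer arithmetic above, spelled out (using 1 TB = 1000 GB, as the round numbers in the example imply):

```python
datastore_gb = 2000   # each datastore is 2 TB
buffer_pct = 0.20     # 20% reserved as a free-space buffer
datastores = 25

free_per_datastore = datastore_gb * buffer_pct     # 400 GB per datastore
total_unused_gb = free_per_datastore * datastores  # ~10 TB across the estate

print(free_per_datastore, total_unused_gb)
```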
Thin provisioning helps increase storage utilization by enabling dynamic allocation and intelligent provisioning of physical storage capacity. It is an on-demand allocation model where storage is allocated based on requests from the server rather than allocating everything upfront.
Thin provisioning can be implemented in a virtualized environment in either of two ways, or a combination of both.
- Thin Provisioning at Hypervisor layer
- Thin Provisioning at Storage Array
Thin provisioning at the hypervisor layer means that you present a thick LUN from the storage array, or create a datastore from your local HDDs, and then thin provision the virtual hard disks on that datastore. This helps you provision more VMs per datastore, since the space is saved at the datastore level. Look at hypervisor-based thin provisioning when you use locally attached HDDs or when your storage array doesn't support the feature.
Thin provisioning at the storage array can be implemented when your storage array supports the feature and you have the appropriate license, if one is needed. Typically you create a storage pool, present thin LUNs back to your hypervisor, and create VMs that are thick provisioned (lazy zeroed). A thin LUN presented from the storage array grows only when the data in the provisioned virtual HDDs grows. One would typically create multiple thin LUNs from the storage pool and present them back to the hypervisor for allocation. The advantage of doing thin provisioning at the storage array is that the space saved can either be used again for the same hypervisor/cluster or for a different use case.
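The thin model in a nutshell: the LUN advertises its full size to the host but draws physical capacity from the shared pool only as data is written. A toy sketch (the `ThinLUN` class and the 1 GB allocation slice are my own invention for illustration, not any array's actual implementation):

```python
class ThinLUN:
    """Toy model of a thin LUN: advertise a large size up front but
    allocate physical capacity from the pool only on first write."""

    def __init__(self, advertised_gb: int, pool: dict):
        self.advertised_gb = advertised_gb
        self.pool = pool        # shared pool, e.g. {"free_gb": 10}
        self.allocated = {}     # 1 GB slice offset -> data

    def write(self, offset_gb: int, data: str) -> None:
        if offset_gb not in self.allocated:
            if self.pool["free_gb"] < 1:
                # This is exactly why thin pools need careful monitoring.
                raise RuntimeError("storage pool exhausted")
            self.pool["free_gb"] -= 1   # allocate the slice on first touch
        self.allocated[offset_gb] = data

    @property
    def consumed_gb(self) -> int:
        return len(self.allocated)

pool = {"free_gb": 10}
lun = ThinLUN(advertised_gb=100, pool=pool)  # looks like 100 GB to the host
lun.write(0, "vm-data")
lun.write(1, "vm-data")
print(lun.consumed_gb, pool["free_gb"])      # 2 8
```

Note that several thin LUNs sharing one pool can together advertise far more than the pool physically holds, which is the over-subscription the next paragraph warns you to monitor.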
Whichever option you choose, it is imperative that you monitor any thin provisioned environment carefully for growth and do proper capacity planning. Also, please be mindful of the impact on the disk performance of the workload; I would not recommend thin provisioning if your workload is a critical, disk-intensive one. There are lots of non-critical workloads that can be deployed on thin provisioned LUNs, saving your organization lots and lots of space, which in turn means lots of $$$ savings!!
Virtualization – to unleash the full benefits and capabilities of any hypervisor in the industry today, one of the key requirements is storage. Storage plays a critical role in virtualization, and while a lot of focus is on Big Data, we must not overlook the benefits and challenges of storage for virtualization. While shared storage is considered de facto in many environments, there are still environments that do not use it for their entire virtualization layout, more because of cost per GB than for technical reasons.
Also, in environments where there is widespread adoption of storage for virtualization, there are a lot of opportunities for increasing the efficiency of the storage being used. VM sprawl and private cloud set-ups lead to large numbers of VMs being consumed in organizations, and while hypervisor techniques can help handle over-subscribed compute and memory, storage efficiency features at the array level help you optimally use the back-end storage.
There are four critical opportunities available for storage efficiency. In a series of posts, I will share my thoughts as an administrator on these key features, their benefits, and the pitfalls to watch out for.
The storage efficiency features supported, along with the implementation methodology, may vary by storage vendor. But the opportunities for increasing the efficiency of back-end storage used for server virtualization are immense, and I would recommend you all try them out (in case you haven't yet). Please do refer to your vendor's documentation to understand the features supported and their implementation methodology. As with every technology, there are going to be some pitfalls, so be sure to watch out for them and set up mechanisms to avoid them.
The results of the annual voting conducted by Eric Siebert to choose the top VMware/virtualization blogs are out, and the complete survey results can be found here.
I wanted to take this moment to congratulate all 187 bloggers on their amazing effort, day in and day out, taking time away from their work to keep the VMware virtualization community alive and buzzing.
King Duncan continues to lead the survey as the undisputed No. 1.
Personally, it was humbling to have my blog as part of the survey this year, and more humbling still to see it voted up from uncategorized to No. 52. Thanks, everyone, for taking the time to vote in the survey!
The annual voting to choose the top VMware/virtualization blogs of 2012 is now open. For people who aren't aware, this is an annual poll conducted by vExpert / virtualization blogger Eric Siebert to identify the top VMware virtualization bloggers.
I have been an active participant in the voting process the past four times, and I am very excited to have my blog as part of the nominations this year alongside the likes of Duncan Epping, Frank Denneman, Chad Sakac, and others.
Five minutes is all it takes to complete the voting process, and I would request each and every one of you to vote for your top 10 bloggers. This gives them the much-needed intangible motivation to go further. Additionally, if you like my blog posts on virtualization and storage, written more from an administrator's perspective, please do VOTE.
I am eagerly awaiting the top 25 list this year. Aren't you?
Cast your Vote Now @ http://www.surveygizmo.com/s3/786135/Top-VMware-virtualization-blogs-2012
Remember , Content is KING !!
Everyone knows by now that an ESX 3.x to ESXi 5.x upgrade is not supported; administrators have to upgrade their existing ESX 3.x hosts to ESX 4.x first and then upgrade those hosts to ESXi 5.x. Please note that vCenter 5.0 can still manage ESX 3.x hosts, provided a License Server is configured to manage the ESX 3.x host licenses.
However, the ESX 3.x to ESX 4.x upgrade cannot be carried out using vSphere Update Manager 5, since vSphere Update Manager 5 does not support ESX 4.x images. Without ESX 4.x images, we cannot do an in-place upgrade of ESX 3.x hosts to ESX 4.x using vSphere Update Manager.
You would receive the error message shown below:
"The uploaded upgrade package cannot be used with VMware vSphere Update Manager"
This can be confirmed from the vSphere Update Manager release notes, which can be found here. Below is a snapshot of the Interoperability and Software Requirements section, which clearly indicates that VUM 5 supports host upgrades only from ESX/ESXi 4.x to ESXi 5.0.
People who have already upgraded to vCenter 5.0 and vSphere Update Manager 5 can use the good old vSphere Host Update Utility to upgrade their existing ESX 3.x hosts, while customers who have not yet upgraded to vCenter 5.0 can upgrade all hosts to ESX 4.x first and then upgrade their existing vCenter 4.x to vCenter 5.x.