OpenStack and solid state drives
If you are a service provider or an enterprise considering deploying private clouds using OpenStack (an open source alternative to VMware vCloud), then you are in the company of other OpenStack adopters like PayPal and eBay. This article considers the value of SSDs to cloud deployments using OpenStack (not Citrix CloudStack or Eucalyptus).
Block storage & OpenStack: If your public or private cloud supports a virtualized environment where you want up to a terabyte of disk storage to be accessible from within a virtual machine (VM), such that it can be partitioned/formatted/mounted and stays persistent until the user deletes it, then your option for block storage is any storage for which OpenStack Cinder (the OpenStack project for managing storage volumes) supports a block storage driver; a minimal provisioning sketch follows the lists below. Open source block storage options include:
- LVM (block storage exposed as logical volumes to the OS)
- Ceph (open source object storage mounted as a thinly provisioned block device)
- Zettabyte File System – ZFS (a file system and volume manager originally designed by Sun, now also available as OpenZFS)
Proprietary alternatives for OpenStack block storage include products from IBM, NetApp, Nexenta and SolidFire.
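To make the block storage model concrete, here is a minimal sketch using the openstacksdk Python library. The cloud name, server name and volume size are placeholders for illustration, exact SDK signatures can vary across releases, and any backend with a supported Cinder driver (LVM, Ceph, ZFS or a proprietary array) could be serving the volume behind the same API.

```python
# Minimal sketch: provisioning and attaching a Cinder volume via openstacksdk.
# "mycloud" and "app-server-01" are placeholders for illustration only.
import openstack

conn = openstack.connect(cloud="mycloud")   # credentials come from clouds.yaml

# Create a 1 TB volume; it persists until the user explicitly deletes it.
volume = conn.block_storage.create_volume(name="vm-data", size=1000)
conn.block_storage.wait_for_status(volume, status="available")

# Attach the volume to a running VM; inside the guest it shows up as a raw
# block device (e.g. /dev/vdb) that can be partitioned, formatted and mounted.
server = conn.compute.find_server("app-server-01")
conn.compute.create_volume_attachment(server, volume_id=volume.id)
```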
Object storage & OpenStack: On the other hand, if your goal is to access multiple terabytes of storage, you are willing to access it over a REST API (a short example follows the list below), and you want the storage to stay persistent until the user deletes it, then your open source options for object storage include:
- Swift – A good choice if you plan to distribute your storage cluster across many data centers. Here objects and files are stored on disk drives spread across numerous servers in the data center. It is the OpenStack software that ensures data integrity & replication of this dispersed data
- Ceph – A good choice if you plan to have a single solution to support both block and object level access and want support for thin-provisioning
- Gluster – A good choice if you want a single solution to support both block and file level access
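Since object storage is accessed over a REST API, the sketch below shows what that access looks like against a Swift-style endpoint using Python's requests library. It is purely illustrative: the storage URL, token, container and object names are placeholders, and in a real deployment the token would come from Keystone and the storage URL from the service catalog.

```python
# Illustrative object access over the Swift REST API (PUT/GET/DELETE).
import requests

STORAGE_URL = "https://swift.example.com/v1/AUTH_demo"   # hypothetical endpoint
TOKEN = "gAAAA..."                                        # placeholder Keystone token
headers = {"X-Auth-Token": TOKEN}

# Upload (PUT) an object; it stays persistent until explicitly deleted.
with open("backup-2024.tar.gz", "rb") as f:
    requests.put(f"{STORAGE_URL}/backups/backup-2024.tar.gz",
                 headers=headers, data=f)

# Download (GET) and removal (DELETE) use the same URL scheme:
# /v1/<account>/<container>/<object>
requests.get(f"{STORAGE_URL}/backups/backup-2024.tar.gz", headers=headers)
requests.delete(f"{STORAGE_URL}/backups/backup-2024.tar.gz", headers=headers)
```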
Solid state drives (SSD) or spinning disk?
An OpenStack Swift cluster with high write requirements would benefit from using SSDs to store metadata. Zmanda (a provider of open source backup software) has run benchmarks showing that SSD-based Swift containers outperform HDD-based Swift containers, especially when the predominant operations are PUT and DELETE. If you are a service provider looking to deploy a cloud-based backup/recovery service on OpenStack Swift, and each of your customers is to have a unique container assigned to them, then you stand to benefit from using SSDs over spinning disks.
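Below is a hedged sketch of that one-container-per-customer layout using the python-swiftclient library; the auth URL, credentials, customer IDs and object names are invented for illustration. The point is simply that every backup run becomes PUT and DELETE traffic against each customer's own container, which is exactly the metadata-heavy pattern that benefits from SSD-backed account and container servers.

```python
# Sketch of a per-customer container layout with python-swiftclient.
# Credentials and IDs are hypothetical; real code would pull them from config.
from swiftclient import client as swift

conn = swift.Connection(authurl="https://keystone.example.com/v3",
                        user="backup-svc", key="secret", auth_version="3")

for customer_id in ["cust-001", "cust-002"]:
    container = f"backups-{customer_id}"
    conn.put_container(container)                      # one container per customer
    conn.put_object(container, "2024-06-01.tar.gz",    # a new backup run (PUT)
                    contents=b"...backup payload...")
    conn.delete_object(container, "2024-05-01.tar.gz") # expire an old run (DELETE),
                                                       # assuming that object exists
```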
Turnkey options?
As a service provider, if you are looking for an OpenStack cloud-in-a-box to compete with Amazon S3, consider vendors like MorphLabs. They offer turnkey solutions on Dell servers, with storage nodes running NexentaStor (a commercial implementation of OpenSolaris and ZFS), the KVM hypervisor, and VMs running Windows or Linux as the guest OS, all on a combination of SSDs and HDDs. The use of SSDs allows MorphLabs to claim lower power consumption and price per CPU compared to “disk heavy” (their term, not mine) vBlock (from Cisco & EMC) and FlexPod (from NetApp) systems.
In conclusion, if you are planning to deploy clouds based on OpenStack, SSDs offer you some great alternatives to spinning rust (oops, disk).
Price per GB for SSD – why it's not always the best yardstick
Price per GB, performance and endurance are the yardsticks used to decide which solid state drive (SSD) to buy for a corporate data center, whether for use in a server or in a flash-based storage array.
Are endurance numbers really comparable? Especially when you consider that one vendor might use consumer grade cMLC NAND while another might use enterprise grade eMLC NAND with vastly different program/erase (P/E) cycles? What write amplification factor (WAF) – the ratio of the data the SSD controller actually writes to NAND versus the data the host writes – was used for the calculation? One vendor might quote endurance in TB or PB written while another might use Drive Writes Per Day (DWPD).
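To see why these figures are hard to compare at face value, here is a small worked example with made-up but plausible numbers, converting a DWPD rating into total terabytes written over the warranty period and then applying a WAF:

```python
# Worked example with invented but plausible numbers: converting a DWPD
# endurance rating into TB written, then accounting for write amplification.
capacity_gb = 800        # drive capacity in GB
warranty_years = 5
dwpd = 3                 # drive writes per day, as quoted on a datasheet
waf = 1.5                # write amplification factor assumed by the vendor

# DWPD -> total host terabytes written over the warranty period
tbw_host = dwpd * capacity_gb * 365 * warranty_years / 1000
print(f"Host writes over warranty: {tbw_host:,.0f} TB")   # 4,380 TB

# The NAND absorbs WAF times more writes than the host actually sends
tbw_nand = tbw_host * waf
print(f"NAND writes over warranty: {tbw_nand:,.0f} TB")   # 6,570 TB
```

The same drive can therefore be advertised as "3 DWPD" or as roughly "4.4 PB written", and neither number means much without knowing the capacity, warranty period and assumed WAF behind it.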
One vendor might state fresh-out-of-the-box (FOB) performance numbers in IOPS on their datasheet while another might display steady state numbers. One might use a synthetic benchmark tool like IOMETER, which focuses on queue depth (the number of outstanding I/Os), block size and transfer rates, while another might use an application based benchmark like SysMark, which ignores those criteria and instead tests how a real-world application drives the SSD. Even with IOMETER, results vary depending on whether the 2006, 2008 or 2010 version is used. To add further complexity, performance numbers vary widely depending on whether they were measured at a queue depth of 3, 32, 64, 128 or 256. To compound matters, one vendor might be looking at compressible data (Word docs, spreadsheets) while another might be quoting numbers for incompressible data (.zip or .jpeg files); some might be using SandForce (now LSI) controllers, which compress data before writing it to NAND, while others might not.

So what is an SSD buyer to do? Get a drive from a vendor you trust, run your own benchmarks – synthetic or application based – and draw your own conclusions.
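If you do run your own tests, even a toy sketch like the one below (single threaded, queue depth of 1, random 4 KB reads against a placeholder test file, nowhere near what IOMETER or a real workload will tell you) illustrates how block size, access pattern and run length all change the number you get.

```python
# Toy random-read measurement, queue depth 1. The file path and sizes are
# placeholders; this is an illustration of the knobs involved, not a benchmark.
import os, random, time

PATH = "/data/testfile"   # a pre-created large file on the SSD under test
BLOCK_SIZE = 4096         # 4 KB random reads
DURATION = 10             # seconds to run

# Caveat: unless the file is much larger than RAM (or O_DIRECT is used),
# the OS page cache will inflate the result.
blocks = os.path.getsize(PATH) // BLOCK_SIZE
fd = os.open(PATH, os.O_RDONLY)

ops = 0
deadline = time.time() + DURATION
while time.time() < deadline:
    offset = random.randrange(blocks) * BLOCK_SIZE
    os.pread(fd, BLOCK_SIZE, offset)   # one outstanding I/O at a time (QD = 1)
    ops += 1
os.close(fd)

print(f"{ops / DURATION:.0f} random {BLOCK_SIZE}-byte reads per second")
```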
Now why do I find $ per GB amusing as a yardstick? Consider this analogy – could we convince a Japanese consumer that the cantaloupe we buy from a local store for $2.99 here in California is equivalent to a musk melon purchased in Japan for 16,000 yen? From a price-per-melon point of view the difference is difficult for us to fathom, but to the buyer of the 16,000 yen melon it is apparently a premium worth paying for.