Reducing Storage Overhead with Small Write Bottleneck Avoiding in Cloud RAID System

Abstract

Cloud storage systems commonly replicate stored data sets to ensure high reliability and availability. However, the high storage overhead of replication becomes increasingly unacceptable as the volume of data stored in the cloud grows explosively. Some cloud storage systems have attempted to replace replication with erasure coding to reduce storage overhead; this is the idea behind Cloud RAID. A well-designed Cloud RAID mechanism should achieve the right tradeoffs among storage efficiency, performance, and reliability. As there are no widely accepted methods for Cloud RAID, we present a workload-based Cloud RAID scheme, Selective Cloud RAID (SCR for short). SCR applies different RAID methods to primary storage and backup storage: the former at the level of directories, and the latter at the level of individual files. SCR has three distinct advantages over previous attempts at Cloud RAID: (1) it significantly reduces storage overhead compared with three-way replication; (2) it avoids most cases of the “small write bottleneck” and simplifies system maintenance; and (3) its implementation is modular, so it is easy to configure different erasure codes for different workloads. Additionally, we have implemented an SCR prototype with the RDP code, which shows significant benefits over Blaum-Roth codes in degraded read performance. To verify the effectiveness of SCR, we perform theoretical analysis and detailed benchmark tests to evaluate the performance of the SCR prototype.
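
The storage-overhead comparison in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from the paper: the function names and the stripe sizes are hypothetical, and it relies only on the standard layout of an RDP stripe, which for a prime p holds p - 1 data blocks and 2 parity blocks. Overhead is expressed as raw bytes stored per byte of user data.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data under n-way replication."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return Fraction(copies)


def erasure_overhead(data_blocks: int, parity_blocks: int) -> Fraction:
    """Raw bytes stored per byte of user data for a systematic erasure code."""
    if data_blocks < 1 or parity_blocks < 0:
        raise ValueError("need at least one data block and no negative parity")
    return Fraction(data_blocks + parity_blocks, data_blocks)


def rdp_overhead(p: int) -> Fraction:
    """Overhead of an RDP stripe: p - 1 data blocks plus 2 parity blocks.

    RDP requires p to be prime; this sketch does not check primality.
    """
    return erasure_overhead(p - 1, 2)


def savings_vs_replication(p: int, copies: int = 3) -> Fraction:
    """Fraction of raw capacity saved by RDP relative to n-way replication."""
    return 1 - rdp_overhead(p) / replication_overhead(copies)


if __name__ == "__main__":
    # Illustrative stripe sizes only; the paper's configuration may differ.
    for p in (5, 7, 17):
        print(p, rdp_overhead(p), savings_vs_replication(p))
```

Under these assumptions, a stripe with p = 5 stores 1.5 bytes per byte of user data, half of what three-way replication needs, while still tolerating two lost blocks per stripe.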

Type
Publication
13th ACM/IEEE International Conference on Grid Computing