Defense Date

2016

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Engineering

First Advisor

Xubin He

Second Advisor

Robert Klenke

Third Advisor

Weijun Xiao

Fourth Advisor

Preetam Ghosh

Fifth Advisor

Wei Cheng

Abstract

With the unprecedented growth of data and the use of low commodity drives in local disk-based storage systems and remote cloud-based servers has increased the risk of data loss and an overall increase in the user perceived system latency. To guarantee high reliability, replication has been the most popular choice for decades, because of simplicity in data management. With the high volume of data being generated every day, the storage cost of replication is very high and is no longer a viable approach.

Erasure coding is another approach of adding redundancy in storage systems, which provides high reliability at a fraction of the cost of replication. However, the choice of erasure codes being used affects the storage efficiency, reliability, and overall system performance. At the same time, the performance and interoperability are adversely affected by the slower device components and complex central management systems and operations.

To address the problems encountered in various layers of the erasure coded storage system, in this dissertation, we explore the different aspects of storage and design several techniques to improve the reliability, performance, and interoperability. These techniques range from the comprehensive evaluation of erasure codes, application of erasure codes for highly reliable and high-performance SSD system, to the design of new erasure coding and caching schemes for Hadoop Distributed File System, which is one of the central management systems for distributed storage. Detailed evaluation and results are also provided in this dissertation.

Rights

© The Author

Is Part Of

VCU University Archives

Is Part Of

VCU Theses and Dissertations

Date of Submission

8-5-2016

Share

COinS