DOI
https://doi.org/10.25772/Z6PM-SF44
Defense Date
2016
Document Type
Dissertation
Degree Name
Doctor of Philosophy
Department
Engineering
First Advisor
Xubin He
Second Advisor
Robert Klenke
Third Advisor
Weijun Xiao
Fourth Advisor
Preetam Ghosh
Fifth Advisor
Wei Cheng
Abstract
With the unprecedented growth of data and the use of low commodity drives in local disk-based storage systems and remote cloud-based servers has increased the risk of data loss and an overall increase in the user perceived system latency. To guarantee high reliability, replication has been the most popular choice for decades, because of simplicity in data management. With the high volume of data being generated every day, the storage cost of replication is very high and is no longer a viable approach.
Erasure coding is another approach of adding redundancy in storage systems, which provides high reliability at a fraction of the cost of replication. However, the choice of erasure codes being used affects the storage efficiency, reliability, and overall system performance. At the same time, the performance and interoperability are adversely affected by the slower device components and complex central management systems and operations.
To address the problems encountered in various layers of the erasure coded storage system, in this dissertation, we explore the different aspects of storage and design several techniques to improve the reliability, performance, and interoperability. These techniques range from the comprehensive evaluation of erasure codes, application of erasure codes for highly reliable and high-performance SSD system, to the design of new erasure coding and caching schemes for Hadoop Distributed File System, which is one of the central management systems for distributed storage. Detailed evaluation and results are also provided in this dissertation.
Rights
© The Author
Is Part Of
VCU University Archives
Is Part Of
VCU Theses and Dissertations
Date of Submission
8-5-2016