A Robust Fault-Tolerant and Scalable Cluster-wide Deduplication for Shared-Nothing Distributed Storage Systems

Abstract: Deduplication has mainly been employed in distributed storage systems to improve space efficiency. Traditional deduplication research ignores the design requirements of shared-nothing distributed storage systems, such as the absence of a central metadata bottleneck, scalability, and storage rebalancing. Further, deduplication introduces transactional changes that threaten the system’s data reliability, recovery, and consistency in the event of system failures. In this talk, I will present my work on a robust, fault-tolerant, and scalable cluster-wide inline deduplication design that eliminates duplicate copies across the cluster while maintaining consistency and an effective garbage collection mechanism, without violating the design properties of shared-nothing storage systems. We decouple the deduplication metadata from the read I/O path and replace it with an RMO object to further speed up reads. Finally, we show experimentally that our approach achieves high storage space efficiency without jeopardizing performance when compared against state-of-the-art content-addressable deduplication.

Please use this link to attend the virtual seminar:

https://bluejeans.com/978322106/6132

Meeting ID: 978322106 / Participant passcode: 6132

Argonne Leadership Computing Facility

03/10/2021, 10am CT