Dynamic De-duplication Decision in a Hadoop Distributed File System
Ruay-Shiung Chang, Senior Member, IEEE, Chih-Shan Liao, Kuo-Zheng Fan, and Chia-Ming Wu
In this paper, we propose a dynamic de-duplication decision scheme that improves the storage utilization of a datacenter using HDFS as its file system. The proposed system formulates a de-duplication strategy suited to the available capacity, so that storage space is used efficiently when storage devices are limited. The strategy deletes useless duplicates to free storage space.
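As a rough illustration of what a capacity-driven decision of this kind might look like, the following minimal sketch plans which duplicates to drop once utilization passes a threshold. It is not the method of this paper: the abstract does not specify the decision rule, so the interpretation of "duplicates" as extra copies of a file, the `FileInfo` record, the `plan_replica_reduction` function, the 80% watermark, and the largest-file-first ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical record for one stored file (not from the paper)."""
    path: str
    size_bytes: int  # size of a single copy
    replicas: int    # number of copies currently stored


def plan_replica_reduction(files, capacity_bytes,
                           high_watermark=0.8, min_replicas=1):
    """Return {path: new_copy_count} for files whose copies should be reduced.

    Assumed rule: do nothing while usage is at or below the watermark;
    otherwise drop copies, largest files first, until usage falls to the
    watermark or every file is at min_replicas.
    """
    used = sum(f.size_bytes * f.replicas for f in files)
    target = high_watermark * capacity_bytes
    plan = {}
    if used <= target:
        return plan  # enough free space: keep all duplicates

    counts = {f.path: f.replicas for f in files}
    # Largest files first, so fewer deletions are needed (illustrative heuristic).
    for f in sorted(files, key=lambda item: item.size_bytes, reverse=True):
        while used > target and counts[f.path] > min_replicas:
            counts[f.path] -= 1
            used -= f.size_bytes
            plan[f.path] = counts[f.path]
        if used <= target:
            break
    return plan


if __name__ == "__main__":
    files = [FileInfo("/data/a", 100, 3), FileInfo("/data/b", 50, 3)]
    print(plan_replica_reduction(files, capacity_bytes=400))
```

The decision is dynamic in the sense that the same file set yields an empty plan when capacity is plentiful and a non-empty plan when it is scarce. In an HDFS deployment, a plan like this could in principle be applied with the `hdfs dfs -setrep` command, although whether the authors' system works that way is not stated here.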