Efficient Ways to Improve the Performance of HDFS for Small Files
Parth Gohil Bakul Panchal
It is one of the core component of Hadoop and it does not perform well for small files as huge numbers of small files pose a heavy burden on NameNode of HDFS and decreasing the performance of HDFS. HDFS is a distributed file system which can process large amounts of data. It is designed to handle large files and suffers performance penalty while dealing with large number of small files. This paper introduces about HDFS, small file problems and ways to deal with it.