Project: Merge small files on HDFS for Hive table

Introduction

GitHub: https://github.com/sskaje/hive_merge

This is a solution for the small files problem on HDFS, but for Hive tables only. Here is why I wrote this project: Solving Small Files Problem on CDH4.

This script simply INSERTs the requested table/partition into a new table, lets the data be merged by …
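As a rough illustration of the "INSERT into a new table" step, here is a minimal Python sketch that only builds the HiveQL statements as strings. It is not taken from the hive_merge project: the function name `build_merge_statements`, the `_merged` suffix, and the example table and column names are all hypothetical, and the excerpt is cut off before it says what actually performs the merge, so no merge-related settings are guessed here.

```python
def quote_value(value):
    """Render a partition value as a HiveQL string literal."""
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def build_merge_statements(table, columns, partition=None, suffix="_merged"):
    """Build HiveQL that copies a table (or one partition) into a new table.

    table     -- source table name (hypothetical example input)
    columns   -- non-partition columns to copy, in table order
    partition -- optional list of (partition_column, value) pairs
    suffix    -- assumed naming scheme for the new table
    """
    target = table + suffix
    select_list = ", ".join(columns)

    # The new table reuses the schema and partitioning of the source table.
    statements = [
        "CREATE TABLE IF NOT EXISTS {0} LIKE {1}".format(target, table)
    ]

    if partition:
        pairs = ["{0}={1}".format(name, quote_value(value))
                 for name, value in partition]
        # With a static PARTITION clause, the SELECT list must not
        # include the partition columns themselves.
        insert = (
            "INSERT OVERWRITE TABLE {0} PARTITION ({1}) "
            "SELECT {2} FROM {3} WHERE {4}"
        ).format(target, ", ".join(pairs), select_list, table,
                 " AND ".join(pairs))
    else:
        insert = "INSERT OVERWRITE TABLE {0} SELECT {1} FROM {2}".format(
            target, select_list, table)

    statements.append(insert)
    return statements


if __name__ == "__main__":
    # Hypothetical table "logs" partitioned by "dt".
    for stmt in build_merge_statements(
            "logs", ["host", "msg"], [("dt", "2013-01-01")]):
        print(stmt + ";")
```

The generated statements could then be passed to the Hive CLI or any other Hive client. Building them as plain strings keeps the sketch independent of a particular Hive client library.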