Project: Merge small files on HDFS for Hive table

Introduction

Github: https://github.com/sskaje/hive_merge

This is a solution to the small-files problem on HDFS. It works for Hive tables only.

Here is why I wrote this project: Solving Small Files Problem on CDH4.

The script simply INSERTs the requested table or partition into a new table, lets Hive merge the data itself, then INSERTs the data back with compression enabled.
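The round trip described above can be sketched as a small generator of HiveQL statements. This is only an illustration under stated assumptions, not the project's actual code: the helper `build_merge_statements`, the `_merge_tmp` staging-table suffix, and the use of CREATE TABLE ... AS SELECT for the first step are hypothetical, and the merge thresholds shown are example values. The `hive.merge.*` and `hive.exec.compress.output` names are standard Hive configuration properties.

```python
# Hive settings that ask Hive to merge small output files at the end of a job.
# The size values are example thresholds in bytes, not values taken from the project.
MERGE_SETTINGS = {
    "hive.merge.mapfiles": "true",
    "hive.merge.mapredfiles": "true",
    "hive.merge.size.per.task": "256000000",
    "hive.merge.smallfiles.avgsize": "16000000",
}


def build_merge_statements(table, columns, partition=None):
    """Build HiveQL for: copy out (merged), copy back (compressed), clean up.

    table     -- name of the Hive table to compact
    columns   -- non-partition columns to copy
    partition -- optional dict of partition column -> value
    """
    tmp_table = table + "_merge_tmp"  # hypothetical staging table name
    cols = ", ".join(columns)
    pairs = ["{}='{}'".format(k, v) for k, v in (partition or {}).items()]
    where = " WHERE " + " AND ".join(pairs) if pairs else ""
    part_spec = " PARTITION (" + ", ".join(pairs) + ")" if pairs else ""

    stmts = ["SET {}={}".format(k, v) for k, v in MERGE_SETTINGS.items()]
    # Step 1: write the requested data to a new table; Hive merges the output files.
    stmts.append(
        "CREATE TABLE {} AS SELECT {} FROM {}{}".format(tmp_table, cols, table, where)
    )
    # Step 2: write the merged data back with output compression enabled.
    stmts.append("SET hive.exec.compress.output=true")
    stmts.append(
        "INSERT OVERWRITE TABLE {}{} SELECT {} FROM {}".format(
            table, part_spec, cols, tmp_table
        )
    )
    # Step 3: drop the staging table.
    stmts.append("DROP TABLE {}".format(tmp_table))
    return stmts


if __name__ == "__main__":
    # Example: compact one daily partition of a hypothetical "logs" table.
    for stmt in build_merge_statements("logs", ["id", "msg"], {"dt": "2013-12-01"}):
        print(stmt + ";")
```

The statements are returned as a list so they can be reviewed before being passed to the Hive CLI. Keeping the copy-out and copy-back steps separate means the original data is only overwritten after the merged copy exists.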

Configuration Properties in Hive

Usage

Examples

Project: Merge small files on HDFS for Hive table by @sskaje: https://sskaje.me/2013/12/project-merge-small-files-hdfs-hive-table/
