Grokbase Groups Hive user April 2011
You can add these lines to hive-site.xml. With them, the job creates only one file at the end.
Hope it helps.

<description>Merge small files at the end of a map-reduce job</description>

<description>The default input format, if it is not specified, the system
assigns it. It is set to HiveInputFormat for hadoop versions 17, 18 and 19,
whereas it is set to CombineHiveInputFormat for hadoop 20. The user can always
overwrite it - if there is a bug in CombineHiveInputFormat, it can always be
manually set to HiveInputFormat. </description>
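
The property names and values that surrounded these descriptions did not survive in the archive. The two descriptions match the entries for hive.merge.mapredfiles and hive.input.format in hive-default.xml, so the original lines were probably close to the sketch below. The values shown are assumptions, not text recovered from the message.

```xml
<!-- Place inside the <configuration> element of hive-site.xml. -->
<!-- Values are assumed; the archived message lost them. -->

<!-- Merges small output files when a map-reduce job finishes. -->
<property>
  <name>hive.merge.mapredfiles</name>
  <value>true</value>
</property>

<!-- Combines many small input files into fewer splits, so fewer mappers start. -->
<property>
  <name>hive.input.format</name>
  <value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
</property>
```

The first property affects the number of output files, while the second addresses the question about too many mappers on the input side. As the description above notes, hive.input.format can be set back to org.apache.hadoop.hive.ql.io.HiveInputFormat if CombineHiveInputFormat misbehaves.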

From: Michael Jiang <>
Sent: Fri, April 8, 2011 11:34:58 AM
Subject: How to configure Hive to use CombineFileInputFormat in case of too many
small files

I could not find instructions on how to do this, to avoid the performance issues that arise
when a separate mapper has to be created for every small file. Thanks!

Posted Apr 8, '11 at 6:35p; active Apr 8, '11 at 9:38p

2 users in discussion: Michael Jiang (3 posts), V.Senthil Kumar (1 post)


