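You can add these properties to hive-site.xml. The first tells Hive to merge the small output files at the end of a map-reduce job, so you end up with far fewer files (often just one). The second makes Hive combine small input files into larger splits, so it does not have to launch a mapper for every small file.
Hope it helps.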

<property>
<name>hive.merge.mapredfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-reduce job</description>
</property>

<property>
<name>hive.input.format</name>
<value>org.apache.hadoop.hive.ql.io.CombineHiveInputFormat</value>
<description>The default input format, if it is not specified, the system
assigns it. It is set to HiveInputFormat for hadoop versions 17, 18 and 19,
whereas it is set to CombineHiveInputFormat for hadoop 20. The user can always
overwrite it - if there is a bug in CombineHiveInputFormat, it can always be
manually set to HiveInputFormat. </description>
</property>
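
If the merged files or combined splits still come out too small or too large, the sizes can be tuned as well. The fragment below is only a sketch: the property names are, as far as I know, the standard Hive and Hadoop 0.20 ones, but every value is an illustrative assumption and should be adjusted to your block size and data volume.

```xml
<!-- Sketch only: all values below are illustrative assumptions, not recommendations. -->

<property>
<name>hive.merge.mapfiles</name>
<value>true</value>
<description>Merge small files at the end of a map-only job</description>
</property>

<property>
<name>hive.merge.size.per.task</name>
<!-- assumed target size, in bytes, of each merged output file -->
<value>256000000</value>
<description>Size of merged files at the end of the job</description>
</property>

<property>
<name>hive.merge.smallfiles.avgsize</name>
<!-- assumed threshold, in bytes: the merge step runs only when the
     average output file size is below this value -->
<value>16000000</value>
<description>Average output file size below which the merge is triggered</description>
</property>

<property>
<name>mapred.max.split.size</name>
<!-- assumed upper bound, in bytes, on a combined input split; a larger
     value means fewer mappers when reading many small files -->
<value>256000000</value>
<description>Maximum size of a combined input split</description>
</property>
```

The same properties can also be tried for a single session from the Hive CLI before committing them to hive-site.xml, for example `set hive.merge.mapredfiles=true;`.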






________________________________
From: Michael Jiang <it.mjjiang@gmail.com>
To: user@hive.apache.org
Sent: Fri, April 8, 2011 11:34:58 AM
Subject: How to configure Hive to use CombineFileInputFormat in case of too many
small files

I could not find instructions on how to configure this. I want to avoid the
performance problems that come from creating a separate mapper for every small file. Thanks!

Discussion Overview
group: user @
categories: hive, hadoop
posted: Apr 8, '11 at 6:35p
active: Apr 8, '11 at 9:38p
posts: 4
users: 2
website: hive.apache.org

2 users in discussion: Michael Jiang (3 posts), V.Senthil Kumar (1 post)
