I have about 5K input files, so running a Hive job creates just as many
(small) output files. Small-file merging seems to be enabled by default
(hive.merge.mapfiles=true), but it doesn't seem to work unless output
compression is disabled (hive.exec.compress.output=false). With compression
disabled, I get only 30 (uncompressed) output files, which is much more
manageable.

Is there a way to enable both compression and small-file merge?

If not, I am thinking about saving into an uncompressed temp table first,
then enabling compression and copying from the temp table into the output
table. Is there an easier way?
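
A minimal sketch of the temp-table workaround described above. The table names (src, tmp_uncompressed, final_output) are hypothetical, and both target tables are assumed to exist already with the same schema as src. Only the two settings named in the post are used. How many files the second step produces is not confirmed here; it will likely depend on how many map tasks read the merged temp files.

```sql
-- Hypothetical table names: src, tmp_uncompressed, final_output.

-- Step 1: write uncompressed output so the small-file merge takes effect.
SET hive.merge.mapfiles=true;
SET hive.exec.compress.output=false;

INSERT OVERWRITE TABLE tmp_uncompressed
SELECT * FROM src;

-- Step 2: read the (fewer, larger) merged files back and write them
-- out again with output compression enabled.
SET hive.exec.compress.output=true;

INSERT OVERWRITE TABLE final_output
SELECT * FROM tmp_uncompressed;
```

The cost of this approach is an extra full pass over the data plus temporary storage for the uncompressed copy.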


categories: hive, hadoop
posted: Apr 21, '11 at 5:56p
active: Apr 21, '11 at 5:56p


Igor Tatarinov: 1 post


