Grokbase Groups Hive user July 2010
You can create a table with a custom OutputFormat that writes the rows in whatever format your other job wants. See, for example, the table hb_range_keys in this doc:

I added a HiveNullValueSequenceFileOutputFormat to get it to write out in the format needed downstream by the TotalOrderPartitioner (which wanted everything in the key and null in the value). You can load in your own extension classes to do similar things.
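
As a rough sketch of what such a table can look like, the DDL below follows the hb_range_keys example as I recall it from the Hive HBase bulk load documentation. The column name, the location, and the source table and column in the INSERT are placeholders and do not come from this thread.

```sql
-- Sketch only: column name and location are placeholders.
-- The serialized row goes into the SequenceFile key; the value is written as null.
CREATE EXTERNAL TABLE hb_range_keys (range_start STRING)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveNullValueSequenceFileOutputFormat'
LOCATION '/tmp/hb_range_keys';

-- Hypothetical source table and column, for illustration only.
INSERT OVERWRITE TABLE hb_range_keys
SELECT my_key_column FROM my_source_table;
```

Because the whole serialized row ends up in the key, the downstream job would need to decode the key bytes according to whichever SerDe the table declares.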

On Jul 9, 2010, at 9:38 AM, Matt Pestritto wrote:


Something I noticed is that when I run an insert overwrite table... for sequence files, the key is empty.
This works as expected for further Hive queries because, as I understand it, Hive only reads the value for Hive-based queries.

I have another MR job outside of hive that needs a key specified and want to consume this same data.

My question is: can I run an insert overwrite table statement and specify a particular column to use as the key, instead of an empty int writable, in the output sequence file?

Thanks in advance.

Posted Jul 9, '10 at 4:38p
Active Jul 9, '10 at 4:44p

2 users in discussion: Matt Pestritto (1 post), John Sichi (1 post)


