You can create a table with a custom OutputFormat that writes the rows in whatever format your other job expects. For an example, see the table hb_range_keys in this doc:

http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad#Prepare_Range_Partitioning

I added a HiveNullValueSequenceFileOutputFormat to get it to write out in the format needed downstream by the TotalOrderPartitioner (which wanted everything in the key and null in the value). You can load in your own extension classes to do similar things.
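
As a rough sketch of the approach described above (not the exact statements from the wiki page), a table that writes its single column into the sequence file key might be declared as below. The table name keyed_export, the column export_key, the source table and column, the location, and the jar path are all hypothetical; the SerDe and OutputFormat class names are the ones shipped with Hive as far as I know, so check them against your version. The INPUTFORMAT is assumed to be a placeholder, since this table is only written by Hive and read by the external job.

```sql
-- Hypothetical jar path; only needed if you supply your own OutputFormat class
-- instead of the one bundled with Hive.
-- ADD JAR /path/to/my-hive-extensions.jar;

-- One column only: the whole serialized row becomes the sequence file key,
-- and the value is written as null.
CREATE EXTERNAL TABLE keyed_export (export_key STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveNullValueSequenceFileOutputFormat'
LOCATION '/tmp/keyed_export';

-- source_table and some_column are placeholders for your own data.
INSERT OVERWRITE TABLE keyed_export
SELECT some_column FROM source_table;
```

Note that the key bytes are whatever the table's SerDe serializes for the row, so the downstream MR job has to decode them accordingly. If that job needs both a key and a non-null value, a custom OutputFormat of your own would be required instead.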

JVS
On Jul 9, 2010, at 9:38 AM, Matt Pestritto wrote:

Hi.

Something I noticed is that when I run an insert overwrite table... for sequence files, the key is empty.
This works as expected for further Hive queries because, as I understand it, Hive only reads the value.

I have another MR job outside of hive that needs a key specified and want to consume this same data.

My question is: can I run an insert overwrite table statement and specify a particular column to use as the key, instead of an empty int writable, in the output sequence file?

Thanks in advance.
-Matt

Discussion Overview
group: user @
categories: hive, hadoop
posted: Jul 9, '10 at 4:38p
active: Jul 9, '10 at 4:44p
posts: 2
users: 2
website: hive.apache.org

2 users in discussion

Matt Pestritto: 1 post
John Sichi: 1 post
