Grokbase Groups Hive user July 2010
Hi.

I noticed that when I run an insert overwrite table... into a table stored
as sequence files, the key of each record is empty.
This is fine for further Hive queries because, as I understand it, Hive
only reads the value.

I have another MR job outside of Hive that needs a key, and I want it to
consume this same data.

My question is: can I run an insert overwrite table statement and choose a
column to use as the key, instead of the empty key Hive writes to the
output sequence file?

Thanks in advance.
-Matt


  • John Sichi at Jul 9, 2010 at 4:44 pm
    You can create a table with a custom output format that writes the rows in whatever format your other job wants. For an example, see the table hb_range_keys in this doc:

    http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad#Prepare_Range_Partitioning

    I added HiveNullValueSequenceFileOutputFormat so that Hive writes the format needed downstream by the TotalOrderPartitioner (which wants everything in the key and null in the value). You can load your own extension classes to do similar things.

    JVS
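
As a rough illustration of the "own extension class" idea above, the sketch below shows only the per-row step such a class might perform: picking one column of the serialized row as the key and keeping the whole row as the value. It assumes rows arrive as text delimited by Ctrl-A (Hive's default field delimiter); the class name KeyedRowSplitter and its split method are hypothetical and not part of Hive. In a real extension this logic would presumably sit inside the record writer returned by a class implementing Hive's HiveOutputFormat interface, and the pair would be appended to a SequenceFile; the Hadoop and Hive dependencies are left out here so the example stays self-contained.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: it shows only the key/value split
// that a custom output format's record writer could apply to each row.
final class KeyedRowSplitter {
    // Assumption: rows use Hive's default field delimiter, Ctrl-A (code point 1).
    static final char FIELD_DELIMITER = '\u0001';

    private static final Pattern DELIMITER_PATTERN =
            Pattern.compile(Pattern.quote(String.valueOf(FIELD_DELIMITER)));

    private KeyedRowSplitter() {}

    // Returns (key, value): the column at keyIndex becomes the key and the
    // unmodified row becomes the value.
    static Map.Entry<String, String> split(String row, int keyIndex) {
        // Limit -1 keeps trailing empty columns instead of dropping them.
        String[] columns = DELIMITER_PATTERN.split(row, -1);
        if (keyIndex < 0 || keyIndex >= columns.length) {
            throw new IllegalArgumentException(
                    "keyIndex " + keyIndex + " out of range for "
                            + columns.length + " columns");
        }
        return new SimpleImmutableEntry<>(columns[keyIndex], row);
    }

    public static void main(String[] args) {
        // Made-up sample row with three columns.
        String row = "42" + FIELD_DELIMITER + "matt" + FIELD_DELIMITER + "2010-07-09";
        Map.Entry<String, String> kv = split(row, 0);
        System.out.println("key=" + kv.getKey());
        System.out.println("columns in value=" + DELIMITER_PATTERN.split(kv.getValue(), -1).length);
    }
}
```

Keeping the full row as the value means Hive-side readers that ignore the key would still see every column, while the external MR job gets a usable key. Whether that layout suits the downstream job is an assumption to check against what it expects.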
