On Jul 9, 2010 at 4:44 pm
You can create a table with a custom output format that puts the rows into whatever format your other job wants. See, for example, the table hb_range_keys in this doc: http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad#Prepare_Range_Partitioning
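For reference, the hb_range_keys table on that wiki page is created along these lines (the column name is just the wiki's example; adapt it to your own key column):

```sql
-- A table whose output format writes each row as the sequence-file key
-- with a null value -- the shape TotalOrderPartitioner expects.
CREATE TABLE hb_range_keys(transaction_id_range_start STRING)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveNullValueSequenceFileOutputFormat';
```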
I added a HiveNullValueSequenceFileOutputFormat to get it to write out in the format needed downstream by the TotalOrderPartitioner (which wants everything in the key and a null value). You can plug in your own extension classes to do similar things.
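Populating such a table is then just a normal insert; whatever you select ends up in the sequence-file key, and the output format supplies the null value (table and column names below are illustrative, not from the wiki):

```sql
-- Hedged sketch: the selected column becomes the key in the output
-- sequence file; the value is written as null by the output format.
INSERT OVERWRITE TABLE hb_range_keys
SELECT transaction_id FROM my_source_table;
```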
On Jul 9, 2010, at 9:38 AM, Matt Pestritto wrote:
Something I noticed is that when I run an insert overwrite table ... into a sequence-file table, the key is empty.
This works as expected for further Hive queries because, as I understand it, Hive only reads the value for Hive-based queries.
I have another MR job outside of Hive that needs the key populated, and I want to consume this same data.
My question is: can I run an insert overwrite table statement and specify a particular column to use as the key, instead of an empty int writable, in the output sequence file?
Thanks in advance.