Jun 5, 2011 at 5:42 am
Hive tables are just a metadata overlay on top of HDFS directories that
hold the table data. So I think Flume's HDFS sink should suffice, as long as
it writes into the table's directory in a format the table definition expects.
Please correct me if I am wrong.
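
To make the "metadata overlay" point concrete, here is a minimal sketch that builds the HiveQL for an external table over a directory the HDFS sink writes to. The table name, columns, path, and tab delimiter are all made up for illustration; the helper only builds the statement text and does not talk to Hive.

```python
def external_table_ddl(table, columns, location, delimiter="\\t"):
    """Build HiveQL that maps an existing HDFS directory to an external table.

    All arguments are hypothetical examples; adjust to the real data layout.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ({cols}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        f"STORED AS TEXTFILE "
        f"LOCATION '{location}'"
    )


# Hypothetical directory that the Flume HDFS sink is assumed to write into.
ddl = external_table_ddl(
    "weblogs",
    [("ts", "STRING"), ("host", "STRING"), ("line", "STRING")],
    "hdfs://namenode/flume/weblogs",
)
print(ddl)
```

Once such a table exists, files that land in that directory should be visible to queries without a separate load step, provided they match the declared row format.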
On Sun, Jun 5, 2011 at 1:52 AM, Prashanth R wrote:
Just throwing this out to get some good ideas. Is anyone aware of a Flume
sink that would write/load data directly into Hive tables? If not, one
solution I can think of is to dump the data to HDFS or S3 and have a
periodic MapReduce job load it into Hive.
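
As a rough sketch of the periodic-load idea, a scheduled job could issue one Hive LOAD DATA statement per day of staged data. LOAD DATA INPATH moves the files into the table's directory rather than running a MapReduce job, so this is a simpler variant of the approach described above. The staging path, table name, and the `dt` partition column are assumptions for illustration, and the helper only builds the statement text.

```python
from datetime import date


def load_statement(staging_dir, table, day):
    """Build HiveQL that moves one day's staged files into a dated partition.

    Assumes a hypothetical layout of <staging_dir>/<YYYY-MM-DD> and a table
    partitioned by a string column named dt.
    """
    part = day.isoformat()
    return (
        f"LOAD DATA INPATH '{staging_dir}/{part}' "
        f"INTO TABLE {table} PARTITION (dt='{part}')"
    )


stmt = load_statement("/flume/staging/weblogs", "weblogs", date(2011, 6, 5))
print(stmt)
```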