Shiva
at Jan 22, 2010 at 8:10 pm
I can try that. Here is what I am trying to do.
Load some fact data from a file (say, weblogs moved to HDFS after some
cleanup and transformation) and then summarize it at a daily or weekly
level. In that case, I would like to create one fact table that gets
loaded with daily data, and bring the dimensional data over from MySQL to
perform the summarization.
I would appreciate any input on this technique, its performance, and how I
can get dimensional data into Hive (from MySQL -> to file -> HDFS -> Hive).
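A minimal HiveQL sketch of that pattern, assuming the dimension data has already been exported from MySQL to a tab-delimited file and copied to HDFS (for example via MySQL's SELECT ... INTO OUTFILE followed by hadoop fs -put); all table and column names here are hypothetical:

```sql
-- Daily fact data, assumed already cleaned and sitting on HDFS (hypothetical schema)
CREATE TABLE fact_weblog (
  user_id INT,
  url STRING,
  hits INT,
  log_date STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- Dimension data exported from MySQL as a tab-delimited file
CREATE TABLE dim_user (
  user_id INT,
  country STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- Note: LOAD DATA INPATH moves the file into the warehouse directory
LOAD DATA INPATH '/user/vmplanet/dim_user.tsv' INTO TABLE dim_user;

-- Target table for the daily rollup
CREATE TABLE daily_summary (
  log_date STRING,
  country STRING,
  total_hits BIGINT
);

-- Daily summarization joining the fact table with the dimension table
INSERT OVERWRITE TABLE daily_summary
SELECT f.log_date, d.country, SUM(f.hits)
FROM fact_weblog f
JOIN dim_user d ON (f.user_id = d.user_id)
GROUP BY f.log_date, d.country;
```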
Thanks,
Shiva
On Fri, Jan 22, 2010 at 11:05 AM, Zheng Shao wrote:
If you want the files to stay there, you can try "CREATE EXTERNAL
TABLE" with a location (instead of create table + load)
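For instance, a sketch of that approach (the path and schema are illustrative):

```sql
-- The table definition points at the existing HDFS directory;
-- the files stay in place instead of being moved into the warehouse
CREATE EXTERNAL TABLE t_word_count_ext (
  word STRING,
  count INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/vmplanet/output';
```

Dropping an external table removes only the metadata; the underlying files remain in HDFS.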
Zheng
On Fri, Jan 22, 2010 at 10:51 AM, Bill Graham wrote:
Hive doesn't delete the files upon load; it moves them to a location under
the Hive warehouse directory. Try looking under
/user/hive/warehouse/t_word_count.
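To confirm where Hive put the data, the table's storage location can also be inspected from the Hive shell (the exact output format varies by Hive version):

```sql
-- Prints table metadata, including the warehouse location of the data files
DESCRIBE EXTENDED t_word_count;
```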
On Fri, Jan 22, 2010 at 10:44 AM, Shiva wrote:
Hi,
For the first time I used Hive to load a couple of word-count data input
files into tables, both with and without OVERWRITE. Both times, the input
file in HDFS got deleted. Is that expected behavior? I couldn't find any
definitive answer on the Hive wiki.
hive> LOAD DATA INPATH '/user/vmplanet/output/part-00000' OVERWRITE
INTO TABLE t_word_count;
Env.: Hadoop 0.20.1 and the latest Hive on Ubuntu 9.10, running in
VMware.
Thanks,
Shiva
--
Yours,
Zheng