Local tables are like Hive tables in every other respect, except that they are on the local disk rather than HDFS. The only other difference I know of is that when you call "drop table" on a local table, only the table's metadata gets deleted. For tables on HDFS, the table data gets deleted along with the metadata.
Ajo,
I guess there is some confusion here. There is no concept of local tables in Hive, AFAIK. The behavior you mention is that of EXTERNAL tables. The data for external tables can be on the local file system or on HDFS, depending on configuration. The other tables are referred to as MANAGED tables, for which Hive creates a directory under the warehouse dir.
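To make the EXTERNAL vs. MANAGED distinction above concrete, here is a minimal HiveQL sketch. The table names, columns, and the LOCATION path are made up for illustration, and the drop behavior noted in the comments assumes a default setup.

```sql
-- Managed table: Hive creates and owns a directory under the warehouse dir.
CREATE TABLE managed_demo (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- External table: Hive records only the metadata; the files stay at LOCATION.
-- The path below is hypothetical.
CREATE EXTERNAL TABLE external_demo (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/example/external_demo';

-- Dropping the managed table removes the metadata and the data directory.
DROP TABLE managed_demo;

-- Dropping the external table removes the metadata only;
-- the files under LOCATION are left in place.
DROP TABLE external_demo;
```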
-Ajo.
On Tue, Feb 1, 2011 at 8:41 AM, Amlan Mandal wrote:
Thanks Ajo.
Please confirm if my understanding is correct.
That means when I do "LOAD DATA *LOCAL* INPATH 'filepath' [OVERWRITE] INTO TABLE tablename", the data is in the local file system. If I need to run Hive queries (which in turn would be converted to MapReduce jobs), I need to pull the data into some other table whose data is in HDFS by means of
INSERT OVERWRITE TABLE tablename_new SELECT * FROM tablename ... (kind of)
So those LOCAL tables are kind of temporary.
See -
http://wiki.apache.org/hadoop/Hive/LanguageManual/DML That should clarify load local.
Amlan
On Tue, Feb 1, 2011 at 6:51 PM, Ajo Fod wrote:
Look up "local" in:
http://wiki.apache.org/hadoop/Hive/GettingStarted
-Ajo.
On Tue, Feb 1, 2011 at 3:15 AM, Amlan Mandal wrote:
LOAD DATA *LOCAL* INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
When I use the LOCAL keyword, does Hive create an HDFS file for it?
Yes. Hive creates a file for it on HDFS.
As Ping Zhu mentioned, do a 'describe formatted <tablename>' or 'describe extended <tablename>' after loading data. Check that location on HDFS.
You can also check the logs (they are usually at /tmp/<username>/hive.log). You can see the local file getting copied to the HDFS scratch directory and then being moved to a directory under the warehouse. If you find anything strange, can you please post it here?
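The check described above could look roughly like this from the Hive CLI. This is only a sketch: the table name load_demo and the local file /tmp/sample.txt are hypothetical, and the warehouse path assumes the default hive.metastore.warehouse.dir of /user/hive/warehouse, so use whatever location DESCRIBE FORMATTED actually reports.

```sql
-- Hypothetical single-column table to load into.
CREATE TABLE load_demo (line STRING);

-- LOCAL means the source path is on the client's local file system;
-- the file is copied into the table's directory.
LOAD DATA LOCAL INPATH '/tmp/sample.txt' OVERWRITE INTO TABLE load_demo;

-- The "Location" row in the output shows where the table data lives.
DESCRIBE FORMATTED load_demo;

-- List that location from within the Hive CLI (path assumes the default warehouse dir).
dfs -ls /user/hive/warehouse/load_demo;
```

If the listing shows the loaded file under the table's directory, the table can be queried directly and no second copy via INSERT OVERWRITE is needed.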