Aniket Mokashi
at Jul 5, 2011 at 9:23 pm
Hi,
I would like Hive to detect partitions automatically as the directory
gets updated with new data (by an MR job). Is it possible to do away with the
"alter table tablename add partition (insertdate='2008-01-01') LOCATION
's3n://' or 'hdfs://<path>/abc/xyz/'" command every time I get a new
partition?
Can I have:
CREATE EXTERNAL TABLE IF NOT EXISTS tablename (.......) PARTITIONED BY
(insertdate string) LOCATION '/abc/xyz';
and have Hive scan through all available partitions
(sub-directories inside /abc/xyz)?
Thanks,
Aniket
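A hedged note, depending on the Hive version in use: the metastore can be asked to pick up partition directories that already exist under the table's location. The sketch below assumes the directories follow Hive's key=value naming convention; plain date-named directories like /abc/xyz/2008-01-01 are not recognized by this command.

```sql
-- Sketch, assuming partition directories are named in Hive's key=value
-- style, e.g. /abc/xyz/insertdate=2008-01-01:
MSCK REPAIR TABLE tablename;

-- On Amazon EMR's Hive, the equivalent extension is:
-- ALTER TABLE tablename RECOVER PARTITIONS;
```

Either way, directories that do not match the partition naming scheme still have to be registered with explicit ADD PARTITION statements.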
On Fri, Jul 1, 2011 at 4:39 PM, Aniket Mokashi wrote:
Thanks Prashanth,
select Count(*) from segmentation_data where (dt='2011-07-01');
java.io.IOException: Not a file:
hdfs://hadoop01:9000/data_feed/sophia/segmentation_data/1970-01-01
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:206)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:261)
I am not sure why it looks for the year 1970!
Also, I am assuming I have to add all the partitions manually, but that
seems reasonable.
Thanks,
Aniket
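A possible explanation for the error above: 1970-01-01 is an actual subdirectory under the table location, and because it is not registered as a partition, Hive's input format tries to read the table location directly and trips over the subdirectory ("Not a file"). A minimal sketch of the likely fix, reusing the table and path from the stack trace, is to register each dated directory explicitly:

```sql
-- Sketch: register each dated directory as a partition so Hive reads only
-- the directories named in the metastore, not the table location itself.
ALTER TABLE segmentation_data
  ADD PARTITION (dt='2011-07-01')
  LOCATION 'hdfs://hadoop01:9000/data_feed/sophia/segmentation_data/2011-07-01';
```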
On Fri, Jul 1, 2011 at 4:11 PM, Prashanth R wrote:
Pasting an example here:
CREATE EXTERNAL TABLE IF NOT EXISTS tablename (.......) partitioned by
(insertdate string) ROW FORMAT SERDE
'org.apache.hadoop.hive.contrib.serde2.JsonSerde';
alter table tablename add partition (insertdate='2008-01-01') LOCATION
's3n://' or 'hdfs://<path>/abc/xyz/'
- Prashanth
On Fri, Jul 1, 2011 at 3:57 PM, Aniket Mokashi wrote:
Hi,
I have data on HDFS that is already stored in directories by
date, for example /abc/xyz/yyyy-mm-d1 and /abc/xyz/yyyy-mm-d2. How do I create
an external table with date as the partition key pointing to the data in these
directories?
Please advise.
Thanks,
Aniket
--
- Prash
--
"...:::Aniket:::... Quetzalco@tl"