Grokbase Groups Hive user May 2011
Hi,

I have a partitioned external table on Hive 0.7. New subfolders are
regularly added to the table's base HDFS folder.
At the moment I have to scan that folder myself and let an external tool
create the new partitions by generating and firing ALTER TABLE ADD PARTITION
commands.
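
The workaround described above (scan the base folder, then generate ALTER TABLE ADD PARTITION commands) could look roughly like the Python sketch below. It is only an illustration, not the tool actually used here: the function names, the `key=value` directory layout, and the idea of passing in the already registered partitions are assumptions, and producing the directory listing itself (for example from an HDFS listing) is left to the caller. Partition values are assumed not to contain quote characters.

```python
def parse_partition_path(rel_path):
    """Turn 'dt=2011-05-19/country=nl' into [('dt', '2011-05-19'), ('country', 'nl')]."""
    spec = []
    for part in rel_path.strip("/").split("/"):
        key, sep, value = part.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"not a key=value partition directory: {part!r}")
        spec.append((key, value))
    return spec


def add_partition_statements(table, base_location, found_dirs, known_specs):
    """Build one ALTER TABLE ... ADD PARTITION statement per unregistered directory.

    found_dirs  -- partition directories relative to base_location (hypothetical input)
    known_specs -- set of tuples of (key, value) pairs already registered in Hive
    """
    statements = []
    for rel_path in sorted(found_dirs):
        spec = parse_partition_path(rel_path)
        if tuple(spec) in known_specs:
            continue  # already registered, nothing to do
        spec_sql = ", ".join(f"{key}='{value}'" for key, value in spec)
        location = f"{base_location.rstrip('/')}/{rel_path.strip('/')}"
        statements.append(
            f"ALTER TABLE {table} ADD PARTITION ({spec_sql}) LOCATION '{location}';"
        )
    return statements
```

Calling `add_partition_statements("logs", "/data/logs", dirs, known)` with a hypothetical `logs` table returns a list of statements that can then be written to a script and run through the Hive CLI.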

Is there an easier way to have Hive scan the base table folder to see if
there are any new partitions around? Something like REBUILD PARTITIONS,
perhaps?

I couldn't find anything similar on the Hive/LanguageManual/DDL page.

--
Kind regards,
Jasper Knulst


  • Ashish Thusoo at May 19, 2011 at 8:55 pm
    AFAIK there is nothing like that currently. Could you file a feature request for this on JIRA?

    Ashish
  • Tim Spence at May 19, 2011 at 9:01 pm
    Is this functionality handled by ALTER TABLE [name] RECOVER PARTITIONS?
    Take a look at this presentation for context:
    http://www.slideshare.net/AmazonWebServices/aws-office-hours-amazon-elastic-mapreduce

    Best of luck,
    Tim
  • Igor Tatarinov at May 19, 2011 at 10:24 pm
    That's Amazon's extension to Hive and it's really handy.
  • Roberto Congiu at May 19, 2011 at 10:53 pm
    I agree it's useful, especially for external tables, which may be loaded by
    an external process that 'forgets' to issue an ADD PARTITION.
    A 'sync partitions' feature that syncs the metadata with the directories
    would be really handy.

    --
    Roberto Congiu - Data Engineer - OpenX
    20 E Del Mar Blvd, Pasadena, CA
  • Ashutosh Chauhan at May 19, 2011 at 11:08 pm
    Indeed a useful feature. I created a JIRA for it:
    https://issues.apache.org/jira/browse/HIVE-2173

    Ashutosh
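
The 'sync partitions' idea raised in the thread, bringing the metastore in line with the directories that exist, comes down to a two-way set difference. The Python sketch below shows only that planning step under stated assumptions: `plan_partition_sync` is a hypothetical helper, partitions are identified by their relative directory names, and both listings are supplied by the caller. It does not describe how RECOVER PARTITIONS or the feature requested in HIVE-2173 is implemented.

```python
def plan_partition_sync(on_disk, in_metastore):
    """Compare partition directories with registered partitions.

    on_disk      -- partition directory names found under the table folder
    in_metastore -- partition names currently registered for the table
    Returns (to_add, to_drop): directories with no partition yet, and
    partitions whose directory no longer exists.
    """
    on_disk = set(on_disk)
    in_metastore = set(in_metastore)
    to_add = sorted(on_disk - in_metastore)    # present on disk, unknown to Hive
    to_drop = sorted(in_metastore - on_disk)   # known to Hive, missing on disk
    return to_add, to_drop
```

Whether stale entries in `to_drop` should really be dropped is a policy decision. For external tables it may be safer to only report them.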

Discussion Overview
group: user @
categories: hive, hadoop
posted: May 19, '11 at 9:26a
active: May 19, '11 at 11:08p
posts: 6
users: 6
website: hive.apache.org
