Grokbase Groups Hive user June 2010
Hi all,

Yesterday I committed Arvind's patch for HIVE-1176, which includes an upgrade from datanucleus 1.x to 2.x.

The patch works fine against a clean checkout, but Paul Yang and I just noticed a couple of problems caused by a change in how datanucleus generates column names when the JDO mapping doesn't specify one (which is the case for some of our fields, such as "isCompressed"). This is a heads-up for anyone pulling from the latest trunk.
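
To make the failure mode concrete, here is a minimal sketch of how two naming conventions can derive different column names from the same unmapped field. Both functions are hypothetical illustrations; they are not the actual datanucleus 1.x or 2.x algorithms, which this thread does not spell out.

```python
import re


def underscore_style(field: str) -> str:
    """Hypothetical convention: split camelCase words with '_' and upper-case."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", field).upper()


def plain_style(field: str) -> str:
    """Hypothetical convention: upper-case the field name unchanged."""
    return field.upper()


# Compare the two conventions on a camelCase field and a single-word field.
for field in ("isCompressed", "location"):
    print(field, underscore_style(field), plain_style(field))
```

Single-word field names come out the same under both conventions, so only fields whose generated name differs would be affected when the convention changes underneath an existing schema.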

The problems only occur when running against an existing metastore: for example, if you run trunk/build/dist/bin/hive from a new build in an existing sandbox (where an embedded Derby metastore had previously been created), or if you deploy against an existing production metastore DB.

In a developer sandbox, the default configuration tries to auto-update the schema to add the new column names, and hits an error due to the way the Derby ALTER TABLE statement is generated. If you hit this, a workaround is to delete your trunk/metastore_db directory so that a fresh schema will be recreated instead. Or just move to a fresh checkout.
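
A minimal sketch of that workaround, assuming the sandbox is a local directory that contains metastore_db. It renames the directory instead of deleting it, which is a more cautious variant of the step described above; the function name and the backup suffix are hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def set_aside_metastore(sandbox: Path, name: str = "metastore_db") -> Optional[Path]:
    """Rename <sandbox>/<name> so a fresh schema is created on the next run.

    Returns the backup path, or None if there was nothing to move.
    """
    db_dir = sandbox / name
    if not db_dir.is_dir():
        return None
    # Timestamped suffix so repeated runs do not collide (hypothetical naming).
    backup = sandbox / f"{name}.bak.{int(time.time())}"
    shutil.move(str(db_dir), str(backup))
    return backup
```

For example, `set_aside_metastore(Path("trunk"))` would move trunk/metastore_db aside if it exists. Only do this in a developer sandbox, never against a production metastore.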

Paul is taking a look at the column name generation to see if we can get it to match the datanucleus 1.x behavior.

JVS


  • Arvind Prabhakar at Jun 25, 2010 at 2:35 am
    John,

    Can you describe the problem in more detail and perhaps give us an example
    that can be reproduced?

    Arvind
  • Paul Yang at Jun 25, 2010 at 3:29 am
    I reproduced this by connecting to a database (with schemas created prior to the upgrade) with these properties set:

    <property>
      <name>datanucleus.autoCreateSchema</name>
      <value>false</value>
    </property>

    <property>
      <name>datanucleus.fixedDatastore</name>
      <value>true</value>
    </property>

    The upgraded datanucleus will then try to insert into non-existent columns, resulting in exceptions. The problem doesn't show up with a fresh DB, or when auto-create is enabled. See

    https://issues.apache.org/jira/browse/HIVE-1435
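
The symptom described above can be imitated in a few lines. This is only a sketch: it uses an in-memory SQLite database as a stand-in for the metastore DB, and the table and column names are illustrative assumptions, not values quoted in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema as it would exist before the upgrade (illustrative names).
conn.execute("CREATE TABLE SDS (IS_COMPRESSED INTEGER)")


def insert_row(column: str) -> str:
    """Try an INSERT using the given column name and report the outcome."""
    try:
        conn.execute(f"INSERT INTO SDS ({column}) VALUES (0)")
        return "ok"
    except sqlite3.OperationalError as exc:
        # With schema changes disabled, nothing creates the missing column.
        return f"failed: {exc}"


print(insert_row("IS_COMPRESSED"))  # name the existing schema knows
print(insert_row("ISCOMPRESSED"))   # differently generated name
```

The first insert succeeds and the second fails because the existing schema has no column under the newly generated name, which mirrors the exceptions seen when schema auto-creation is turned off.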

  • John Sichi at Jun 25, 2010 at 3:30 am
    Paul explained one symptom in HIVE-1435.

    For the other, the way to repro it is to start from a clean checkout
    from before HIVE-1176 was committed, build there, and run
    build/dist/bin/hive to init the metastore. Then svn update to the tip
    of trunk, build again, and run the hive CLI again; this time you will
    hit a startup error when it tries to modify the existing Derby database.

    I think Paul's HIVE-1435 patch will resolve both.

    JVS

  • John Sichi at Jun 25, 2010 at 4:18 pm
    I committed Paul's HIVE-1435 patch last night, so trunk should be OK now.

    JVS
