FAQ
What's the current strategy for when you have a production system and you
realize you need to add another column to the table or do some other thing?
Seems like you'd have to make a new table, run a script to transform and
load all your old data to the new table, and then remove the old table. Is
this what is currently being done?
Josh F.


  • Ashish Thusoo at Jan 26, 2009 at 11:31 pm
    If you are adding a column at the end of the table, the old data can stay as it is, provided the table was created with MetadataTypedColumnSetSerDe (I am not sure what happens with DynamicSerDe). MetadataTypedColumnSetSerDe interprets columns missing at the end of a row as NULLs in the old data. Note that this only works when adding columns at the end, without changing names...

    Ashish

    ________________________________
    From: Josh Ferguson
    Sent: Monday, January 26, 2009 3:06 PM
    To: hive-user@hadoop.apache.org
    Subject: Migration Strategy

  • Zheng Shao at Jan 26, 2009 at 11:46 pm
    Hi Josh,

    DynamicSerDe with TCTLSeparatedProtocol will also treat columns missing from
    the data as NULL.

    Basically, if you create the table without specifying the SerDe or Protocol,
    then it should be OK to add a new column to the schema; for old data, that
    new column will be NULL.


    Zheng
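
To make the behavior described in the two replies concrete, here is a minimal Python sketch of positional deserialization. It is not Hive's SerDe code, only a simulation of the idea: fields in a stored row are matched to the current column list by position, and any column past the end of the row reads as NULL (None here). The function `read_row`, the column names, and the sample rows are hypothetical; the Ctrl-A delimiter is assumed because it is Hive's default field separator for text tables.

```python
from typing import Dict, List, Optional

def read_row(line: str, columns: List[str],
             delimiter: str = "\x01") -> Dict[str, Optional[str]]:
    """Map a delimited text row onto the current column list by position."""
    fields = line.split(delimiter)
    row: Dict[str, Optional[str]] = {}
    for i, name in enumerate(columns):
        # Columns beyond the end of the stored row come back as None (NULL).
        row[name] = fields[i] if i < len(fields) else None
    return row

# Hypothetical data: one row written before the schema change, one after.
old_line = "\x01".join(["42", "josh"])
new_line = "\x01".join(["43", "zheng", "2009-01-26"])

old_schema = ["id", "name"]
new_schema = old_schema + ["created"]   # new column appended at the end

print(read_row(old_line, new_schema))   # old row: 'created' is None
print(read_row(new_line, new_schema))   # new row: all three columns filled

# Adding the column anywhere but the end shifts positions, so old data
# is read into the wrong columns.
shifted = ["id", "created", "name"]
print(read_row(old_line, shifted))      # 'josh' lands in 'created'
```

The last call shows why the approach is limited to appending columns: the mapping is purely positional, so the old files never need rewriting only as long as existing columns keep their positions.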

Discussion Overview
group: user @
categories: hive, hadoop
posted: Jan 26, '09 at 11:06p
active: Jan 26, '09 at 11:46p
posts: 3
users: 3
website: hive.apache.org
