FAQ
Hello.

PostgreSQL is very fast when we perform (even on a huge table)

ALTER TABLE ... ADD COLUMN ... NULL;

(nullable without a default value). This is because of NULL bitmap in
tuples. And it's greatest feature for a developer!


But another very common-case query like

ALTER TABLE ... ADD COLUMN ... BOOLEAN NOT NULL DEFAULT false;
or
ALTER TABLE ... ADD COLUMN ... INT NOT NULL DEFAULT 0;

for a huge table is performed very slow - this is because PostgreSQL have to
re-create all tuples assigning the default value to them. If I have a table
with 1 billion rows (for example), I have no chance to perform this query at
all - too slow.

(In most cases NOT NULL DEFAULT xxx fields are BOOLEAN, flags: it is not
handy to have 3-way flags.)


So, are there plans to optimize such kind of queries? This could be done by
many ways:

1. Store the DEFAULT flag directly in NULL BITMAP (add a bit to NULL bitmap
not only for NULLable fields, but also for NOT NULL DEFAULT ... fields).
2. Add another bitmap for each tuple (DEFAULT bitmap). Bit value 0 means
that there is a real value in a cell, 1 - that the value is default.
3. The same as (1), but always force default value to be 0 (or false or any
other values with meaning "zero") and optimize only these cases.

Search Discussions

  • Peter Eisentraut at May 21, 2009 at 9:51 am

    On Thursday 21 May 2009 11:06:29 Dmitry Koterov wrote:
    1. Store the DEFAULT flag directly in NULL BITMAP (add a bit to NULL bitmap
    not only for NULLable fields, but also for NOT NULL DEFAULT ... fields).
    2. Add another bitmap for each tuple (DEFAULT bitmap). Bit value 0 means
    that there is a real value in a cell, 1 - that the value is default.
    3. The same as (1), but always force default value to be 0 (or false or any
    other values with meaning "zero") and optimize only these cases.
    It seems to me that these solutions would require everyone in the world to pay
    the overhead of a few to many bits in every tuple to optimize what appears to
    be a relatively rare circumstance after all. Doesn't seem worth it to me.
  • Sam Mason at May 21, 2009 at 10:10 am

    On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:
    ALTER TABLE ... ADD COLUMN ... NULL;

    (nullable without a default value). This is because of NULL bitmap in
    tuples. And it's greatest feature for a developer!
    I don't think this is because of the "NULL bitmap". PG just never needs
    to flush the changes to every tuple because it knows that all "old"
    tuples (i.e. ones that were created before this column was added) are
    supposed to be NULL.
    But another very common-case query like

    ALTER TABLE ... ADD COLUMN ... BOOLEAN NOT NULL DEFAULT false;
    or
    ALTER TABLE ... ADD COLUMN ... INT NOT NULL DEFAULT 0;
    So, are there plans to optimize such kind of queries? This could be done by
    many ways:
    I think this hasn't been done before because it's been considered too
    difficult to keep track of everything, but I've just tried to come up
    with an example of why it's difficult and failed. If I'm interpreting
    things correctly it's not nearly as difficult as I thought it should
    be. All that needs to be tracked is the "first" default value (this is
    currently assumed to be NULL). All subsequent INSERTs will have this
    value in the tuple and things should just work out.

    CREATE TABLE t ( i INTEGER PRIMARY KEY );
    INSERT INTO t (i) VALUES (1);
    ALTER TABLE t ADD COLUMN j INTEGER DEFAULT 1;
    INSERT INTO t (i) VALUES (2);
    ALTER TABLE t ALTER j SET DEFAULT 2;
    INSERT INTO t (i) VALUES (3);
    ALTER TABLE t ALTER j DROP DEFAULT;
    INSERT INTO t (i) VALUES (4);

    After this we will have the following tuples:

    (1)
    (2,1)
    (3,2)
    (4,NULL)

    All that needs to be done is to fill in the "default" for i=1 to the
    first default (i.e. the integer 1) and everything is done.

    Who wants to tell me what I've missed?
  • Tom Lane at May 21, 2009 at 2:45 pm

    Sam Mason writes:
    On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:
    ALTER TABLE ... ADD COLUMN ... NULL;

    (nullable without a default value). This is because of NULL bitmap in
    tuples. And it's greatest feature for a developer!
    I don't think this is because of the "NULL bitmap".
    No, it isn't. It's because each tuple includes the actual count of
    fields it contains (t_natts or HeapTupleHeaderGetNatts), and the value
    extraction routines are coded to assume that references to fields
    beyond that number should yield NULL. So the ALTER can just leave
    the existing rows alone --- only when you update a row will it change
    to include the newly added field(s).

    AFAICS there's no good way to scale that solution up to handling
    non-null values.
    All that needs to be tracked is the "first" default value (this is
    currently assumed to be NULL).
    You're being a bit vague, but in any case I don't think it can work
    for non-constant defaults (consider DEFAULT NOW()). And what about
    ALTER COLUMN DEFAULT?

    (BTW, I'm quite sure schemes like this have been discussed before.
    Check the archives...)

    regards, tom lane
  • Robert Haas at May 21, 2009 at 3:07 pm

    On Thu, May 21, 2009 at 10:45 AM, Tom Lane wrote:
    Sam Mason <sam@samason.me.uk> writes:
    On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:
    ALTER TABLE ... ADD COLUMN ... NULL;

    (nullable without a default value). This is because of NULL bitmap in
    tuples. And it's greatest feature for a developer!
    I don't think this is because of the "NULL bitmap".
    No, it isn't.  It's because each tuple includes the actual count of
    fields it contains (t_natts or HeapTupleHeaderGetNatts), and the value
    extraction routines are coded to assume that references to fields
    beyond that number should yield NULL.  So the ALTER can just leave
    the existing rows alone --- only when you update a row will it change
    to include the newly added field(s).

    AFAICS there's no good way to scale that solution up to handling
    non-null values.
    All that needs to be tracked is the "first" default value (this is
    currently assumed to be NULL).
    You're being a bit vague, but in any case I don't think it can work
    for non-constant defaults (consider DEFAULT NOW()).  And what about
    ALTER COLUMN DEFAULT?
    I think what Sam is proposing is that, in addition to storing the
    default value for a column, you could store the value at the time the
    column was added. These might be the same if the default is a
    constant and has never been modified, or they might be different.
    This would even work for now() because it's stable, but it couldn't be
    used for a volatile function like random() or timeofday(), of course.

    ...Robert
  • Greg Stark at May 21, 2009 at 3:13 pm

    On Thu, May 21, 2009 at 3:45 PM, Tom Lane wrote:
    All that needs to be tracked is the "first" default value (this is
    currently assumed to be NULL).
    You're being a bit vague, but in any case I don't think it can work
    for non-constant defaults (consider DEFAULT NOW()).  And what about
    ALTER COLUMN DEFAULT?

    (BTW, I'm quite sure schemes like this have been discussed before.
    Check the archives...)
    Schemes like this have been discussed before but I don't think we
    considered applying the limitation that only the "first" default value
    would be covered. We always wanted to be able to handle new defaults
    or making a non-null column nullable later. I think the main
    motivation in the past was space savings for default values rather
    than making adding new non-null columns cheap.

    AFAICS as long as we only want to handle newly created non-nullable
    columns with initial defaults that are applied when the column is
    added then we could actually handle it by storing the initial default
    value in the catalog and keeping it in the tupledesc at all times.

    If you later change the default then it would only affect newly
    inserted rows which will have the new default value explicitly
    included anyways. And if likewise if you later make the row nullable
    the column will be explicitly listed in natts and the null bitmap.

    One gotcha I can think of is if the default expression depends on
    something like a type. If the default is later changed then people
    might be surprised to find there's still an invisible dependency on
    the type. But if it's limited to simple constants that should minimize
    that issue quite a bit. At least for types the column itself will have
    the same dependency anyways.

    --
    greg
  • Tom Lane at May 21, 2009 at 3:22 pm

    Greg Stark writes:
    Schemes like this have been discussed before but I don't think we
    considered applying the limitation that only the "first" default value
    would be covered. We always wanted to be able to handle new defaults
    or making a non-null column nullable later.
    Yeah ... I don't see exactly what it would buy to restrict it to just
    the first such value.

    regards, tom lane
  • Greg Stark at May 21, 2009 at 4:19 pm

    On Thu, May 21, 2009 at 4:22 PM, Tom Lane wrote:
    Greg Stark <stark@enterprisedb.com> writes:
    Schemes like this have been discussed before but I don't think we
    considered applying the limitation that only the "first" default value
    would be covered. We always wanted to be able to handle new defaults
    or making a non-null column nullable later.
    Yeah ... I don't see exactly what it would buy to restrict it to just
    the first such value.
    Well it wouldn't buy you steady-state space savings or performance improvements.

    What it would buy you is a much narrowed set of circumstances where
    ALTER TABLE ADD COLUMN goes from a fast O(1) catalog change to a
    complete table rewrite. The use cases covered such as "boolean DEFAULT
    false" or "integer DEFAULT 0" are extremely common.

    I think users today often avoid the full table rewrite either make
    their application treat null as implicitly the default value or do a
    piecemeal rewrite using updates.

    I think Robert Haas is right that we could handle any stable
    expression by evaluating the expression once and storing only the
    final resulting value as a constant. That would avoid the problems
    with dependencies and later changes to functions.

    Another gotcha is that the default value might be very large.... It
    can't be very common but I suppose we would have to take some care
    around that.

    --
    greg
  • Tom Lane at May 21, 2009 at 4:26 pm

    Greg Stark writes:
    On Thu, May 21, 2009 at 4:22 PM, Tom Lane wrote:
    Yeah ... I don't see exactly what it would buy to restrict it to just
    the first such value.
    Well it wouldn't buy you steady-state space savings or performance improvements.
    What it would buy you is a much narrowed set of circumstances where
    ALTER TABLE ADD COLUMN goes from a fast O(1) catalog change to a
    complete table rewrite. The use cases covered such as "boolean DEFAULT
    false" or "integer DEFAULT 0" are extremely common.
    No, you missed my point --- what's the value of having an implementation
    of this that only works for one column? If we do it, I'd envision it
    as an extra column in pg_attribute, and it would work for any column(s).
    There's nothing to be gained by restricting it.
    I think Robert Haas is right that we could handle any stable
    expression by evaluating the expression once and storing only the
    final resulting value as a constant. That would avoid the problems
    with dependencies and later changes to functions.
    Right, that's *necessary* to avoid changing semantics compared to
    the non-optimized behavior.
    Another gotcha is that the default value might be very large.... It
    can't be very common but I suppose we would have to take some care
    around that.
    Yeah, that occurred to me too. We'd probably not be able to toast
    the pg_attribute column (depending on exactly how it's
    declared/represented) so we'd have to put a limit on the width of
    data value that we'd be willing to handle this way. Seems doable
    though.

    regards, tom lane
  • Greg Stark at May 21, 2009 at 6:06 pm
    --
    Greg

    On 21 May 2009, at 12:26, Tom Lane wrote:

    Greg Stark <stark@enterprisedb.com> writes:
    On Thu, May 21, 2009 at 4:22 PM, Tom Lane wrote:
    Yeah ... I don't see exactly what it would buy to restrict it to
    just
    the first such value.
    Well it wouldn't buy you steady-state space savings or performance
    improvements.
    What it would buy you is a much narrowed set of circumstances where
    ALTER TABLE ADD COLUMN goes from a fast O(1) catalog change to a
    complete table rewrite. The use cases covered such as "boolean
    DEFAULT
    false" or "integer DEFAULT 0" are extremely common.
    No, you missed my point --- what's the value of having an
    implementation
    of this that only works for one column? If we do it, I'd envision it
    as an extra column in pg_attribute, and it would work for any
    column(s).
    There's nothing to be gained by restricting it.
    Oh, I never meant to restrict it to one column.

    It might be nice to have vacuum notice the minimum natts in the table
    and trim the old unneeded ones. But I can't think of a very compelling
    reason to. Perhaps to save memory used in tuplestores?




    I think Robert Haas is right that we could handle any stable
    expression by evaluating the expression once and storing only the
    final resulting value as a constant. That would avoid the problems
    with dependencies and later changes to functions.
    Right, that's *necessary* to avoid changing semantics compared to
    the non-optimized behavior.
    I'm coming at it from the other direction. I was assuming we could
    only handle simple constants and am trying to see how wide we can
    expand it. Doing all stable expressions would seem pretty convincingly
    wide to me.

    Another gotcha is that the default value might be very large.... It
    can't be very common but I suppose we would have to take some care
    around that.
    Yeah, that occurred to me too. We'd probably not be able to toast
    the pg_attribute column (depending on exactly how it's
    declared/represented) so we'd have to put a limit on the width of
    data value that we'd be willing to handle this way. Seems doable
    though.

    regards, tom lane
  • Sam Mason at May 22, 2009 at 9:34 am

    On Thu, May 21, 2009 at 12:26:08PM -0400, Tom Lane wrote:
    I'd envision it
    as an extra column in pg_attribute, and it would work for any column(s).
    There's nothing to be gained by restricting it.
    Yes, when I said "first" I meant the only thing that needs to be stored
    is the first default value for the column. The code currently assumes
    that this first default value is NULL and is hence able to optimize the
    case where you adding a NULLable column. Adding this first default
    value to pg_attribute would allow you to break this assumption and
    allow the user to have non-NULL default values without complete table
    rewrites.

    I think the discussion covered this, but I don't think I was
    particularly clear in my first message and thought I should attempt to
    explain what I was saying. Other comments about only allowing STABLE
    expressions and non-toastable values make sense and were the sorts of
    things I thought I would miss.
  • Robert Haas at May 21, 2009 at 3:38 pm

    On Thu, May 21, 2009 at 11:13 AM, Greg Stark wrote:
    On Thu, May 21, 2009 at 3:45 PM, Tom Lane wrote:

    All that needs to be tracked is the "first" default value (this is
    currently assumed to be NULL).
    You're being a bit vague, but in any case I don't think it can work
    for non-constant defaults (consider DEFAULT NOW()).  And what about
    ALTER COLUMN DEFAULT?

    (BTW, I'm quite sure schemes like this have been discussed before.
    Check the archives...)
    Schemes like this have been discussed before but I don't think we
    considered applying the limitation that only the "first" default value
    would be covered. We always wanted to be able to handle new defaults
    or making a non-null column nullable later. I think the main
    How would this scheme prevent you from doing that? The tuples that
    are added subsequent to the initial ADD COLUMN will have the value for
    that column in it, whether NULL or otherwise, default or not. The
    only thing you need to store is the value to be used (in place of
    NULL) when reading an old, short tuple.
    One gotcha I can think of is if the default expression depends on
    something like a type. If the default is later changed then people
    might be surprised to find there's still an invisible dependency on
    the type. But if it's limited to simple constants that should minimize
    that issue quite a bit. At least for types the column itself will have
    the same dependency anyways.
    I was also wondering whether there might be some gotchas in this area.

    ...Robert
  • Dmitry Koterov at May 22, 2009 at 8:30 pm

    On Thu, May 21, 2009 at 6:45 PM, Tom Lane wrote:

    Sam Mason <sam@samason.me.uk> writes:
    On Thu, May 21, 2009 at 12:06:29PM +0400, Dmitry Koterov wrote:
    ALTER TABLE ... ADD COLUMN ... NULL;

    (nullable without a default value). This is because of NULL bitmap in
    tuples. And it's greatest feature for a developer!
    I don't think this is because of the "NULL bitmap".
    No, it isn't. It's because each tuple includes the actual count of
    fields it contains (t_natts or HeapTupleHeaderGetNatts), and the value
    extraction routines are coded to assume that references to fields
    beyond that number should yield NULL. So the ALTER can just leave
    the existing rows alone --- only when you update a row will it change
    to include the newly added field(s).
    No, I meant that in case of the row (1, NULL, NULL, 2, 3, NULL):
    - the corresponding NULL bitmap is (100110...)
    - the corresponding tuple is (1, 2, 3)
    - t_natts=3 (if I am not wrong here)

    And in case of the row (5, 6, NULL, 7, 8, 9):
    - the corresponding NULL bitmap is (110111...)
    - the corresponding tuple is (5, 6, 7, 9)
    - t_natts=4

    So, without a NULL bitmap, we cannot handle this kind of rows at all by
    t_natts only. And the NULL bitmap plays very important role in tuple saving,
    and I meant exactly that point.
  • Tom Lane at May 24, 2009 at 6:48 pm

    Dmitry Koterov writes:
    No, I meant that in case of the row (1, NULL, NULL, 2, 3, NULL):
    - the corresponding NULL bitmap is (100110...)
    - the corresponding tuple is (1, 2, 3)
    - t_natts=3 (if I am not wrong here)
    You are wrong --- t_natts would be six here. In general the length of
    the null bitmap in a tuple (if it has one at all) is always exactly
    equal to its t_natts value.

    regards, tom lane
  • Dmitry Koterov at May 28, 2009 at 4:17 pm

    Dmitry Koterov <dmitry@koterov.ru> writes:
    No, I meant that in case of the row (1, NULL, NULL, 2, 3, NULL):
    - the corresponding NULL bitmap is (100110...)
    - the corresponding tuple is (1, 2, 3)
    - t_natts=3 (if I am not wrong here)
    You are wrong --- t_natts would be six here. In general the length of
    the null bitmap in a tuple (if it has one at all) is always exactly
    equal to its t_natts value.
    And so, the real number of values in the tuple - (1, 2, 3) above - is equal
    to the number of 1-bits in NULL bitmap. And the size of NULL bitmap is held
    in t_natts. I meant that when I said "thanks to NULL bitmap, adding a new
    nullable column is cheap". :-) And, of course, thanks to t_natts
    (HeapTupleHeaderGetNatts macro) - too.
  • Robert Haas at May 21, 2009 at 11:44 am

    (In most cases NOT NULL DEFAULT xxx fields are BOOLEAN, flags: it is not
    handy to have 3-way flags.)
    This is certainly not true for me. I have both nullable booleans and
    not-nullable, defaulted columns of other types.
    So, are there plans to optimize such kind of queries? This could be done by
    many ways:

    1. Store the DEFAULT flag directly in NULL BITMAP (add a bit to NULL bitmap
    not only for NULLable fields, but also for NOT NULL DEFAULT ... fields).
    2. Add another bitmap for each tuple (DEFAULT bitmap). Bit value 0 means
    that there is a real value in a cell, 1 - that the value is default.
    Both of these options would expand storage usage by a not-inconsiderable amount.
    3. The same as (1), but always force default value to be 0 (or false or any
    other values with meaning "zero") and optimize only these cases.
    I'm not sure there's any way to know this for an arbitrary data type.

    ...Robert

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedMay 21, '09 at 8:06a
activeMay 28, '09 at 4:17p
posts16
users7
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase