Fetching data from just indexes has been discussed on this list several
times before, and it has been told that this can't be done with postgres
thanks to MVCC.

But this is true only when data is changing. In a data-warehousing
scenario what it is often needed is a possibility for fast querying of
static historical data.

If table has been VACUUM'ed or REINDEX'ed after last change
(update/delete/insert), then there is 1-1 correspondence between table
and indexes, and thus no need to fetch the tuple from heap for extra
visibility checks.

What I am proposing is

1) possibility to explicitly change table status to READ-ONLY .

2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
that are READ-ONLY

3) changing postgres planner/executor to make use of this flag, by not
going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true.

--
Hannu Krosing <hannu@skype.net>

Search Discussions

  • Hannu Krosing at Apr 22, 2005 at 3:34 pm
    Fetching data from just indexes has been discussed on this list several
    times before, and it has been told that this can't be done with postgres
    thanks to MVCC.

    But this is true only when data is changing. In a data-warehousing
    scenario what it is often needed is a possibility for fast querying of
    static historical data.

    If table has been VACUUM'ed or REINDEX'ed after last change
    (update/delete/insert), then there is 1-1 correspondence between table
    and indexes, and thus no need to fetch the tuple from heap for extra
    visibility checks.

    What I am proposing is

    1) possibility to explicitly change table status to READ-ONLY .

    2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
    that are READ-ONLY

    3) changing postgres planner/executor to make use of this flag, by not
    going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true.

    --
    Hannu Krosing <hannu@skype.net>
  • Bruce Momjian at Apr 22, 2005 at 3:40 pm
    See this TODO:

    * Allow data to be pulled directly from indexes

    Currently indexes do not have enough tuple visibility information
    to allow data to be pulled from the index without also accessing
    the heap. One way to allow this is to set a bit to index tuples
    to indicate if a tuple is currently visible to all transactions
    when the first valid heap lookup happens. This bit would have to
    be cleared when a heap tuple is expired.

    I think this is the direction we should be heading because it has more
    general usefulness.

    ---------------------------------------------------------------------------

    Hannu Krosing wrote:
    Fetching data from just indexes has been discussed on this list several
    times before, and it has been told that this can't be done with postgres
    thanks to MVCC.

    But this is true only when data is changing. In a data-warehousing
    scenario what it is often needed is a possibility for fast querying of
    static historical data.

    If table has been VACUUM'ed or REINDEX'ed after last change
    (update/delete/insert), then there is 1-1 correspondence between table
    and indexes, and thus no need to fetch the tuple from heap for extra
    visibility checks.

    What I am proposing is

    1) possibility to explicitly change table status to READ-ONLY .

    2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
    that are READ-ONLY

    3) changing postgres planner/executor to make use of this flag, by not
    going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true.

    --
    Hannu Krosing <hannu@skype.net>


    ---------------------------(end of broadcast)---------------------------
    TIP 5: Have you checked our extensive FAQ?

    http://www.postgresql.org/docs/faq
    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Hannu Krosing at Apr 22, 2005 at 4:25 pm

    On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote:
    See this TODO:

    * Allow data to be pulled directly from indexes

    Currently indexes do not have enough tuple visibility information
    to allow data to be pulled from the index without also accessing
    the heap. One way to allow this is to set a bit to index tuples
    to indicate if a tuple is currently visible to all transactions
    when the first valid heap lookup happens.
    I don't think this is implementable in any reasonably cheap way (i.e.
    this will be a general performance loss).

    This has all the downsides of storing full visibility in index tuples,
    except the size.
    This bit would have to be cleared when a heap tuple is expired.
    Does "expired" here mean marked for deletion ?

    This can be prohibitively pricey for big tables with lots of indexes, as
    marking the tuple means alsn finding and possibly writing to all index
    enytries pointing to this tuple.
    I think this is the direction we should be heading because it has more
    general usefulness.
    OTOH this will probably never be implemented for the same reason that
    full visibility in index tuples will not, whereas my proposition can be
    implemented quite easily (it's just a SMOP).
    ---------------------------------------------------------------------------

    Hannu Krosing wrote:
    Fetching data from just indexes has been discussed on this list several
    times before, and it has been told that this can't be done with postgres
    thanks to MVCC.

    But this is true only when data is changing. In a data-warehousing
    scenario what it is often needed is a possibility for fast querying of
    static historical data.

    If table has been VACUUM'ed or REINDEX'ed after last change
    (update/delete/insert), then there is 1-1 correspondence between table
    and indexes, and thus no need to fetch the tuple from heap for extra
    visibility checks.

    What I am proposing is

    1) possibility to explicitly change table status to READ-ONLY .

    2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
    that are READ-ONLY

    3) changing postgres planner/executor to make use of this flag, by not
    going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true.

    --
    Hannu Krosing <hannu@skype.net>


    ---------------------------(end of broadcast)---------------------------
    TIP 5: Have you checked our extensive FAQ?

    http://www.postgresql.org/docs/faq
    --
    Hannu Krosing <hannu@skype.net>
  • Bruce Momjian at Apr 22, 2005 at 5:14 pm

    Hannu Krosing wrote:
    On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote:
    See this TODO:

    * Allow data to be pulled directly from indexes

    Currently indexes do not have enough tuple visibility information
    to allow data to be pulled from the index without also accessing
    the heap. One way to allow this is to set a bit to index tuples
    to indicate if a tuple is currently visible to all transactions
    when the first valid heap lookup happens.
    I don't think this is implementable in any reasonably cheap way (i.e.
    this will be a general performance loss).

    This has all the downsides of storing full visibility in index tuples,
    except the size.
    Yea, I suppose.
    This bit would have to be cleared when a heap tuple is expired.
    Does "expired" here mean marked for deletion ? Yes.
    This can be prohibitively pricey for big tables with lots of indexes, as
    marking the tuple means alsn finding and possibly writing to all index
    enytries pointing to this tuple.
    Yep, it could be very ugly, but it would help with our COUNT(*) problem
    too. Isn't there a solution? If there isn't, I can remove the TODO
    item.
    I think this is the direction we should be heading because it has more
    general usefulness.
    OTOH this will probably never be implemented for the same reason that
    full visibility in index tuples will not, whereas my proposition can be
    implemented quite easily (it's just a SMOP).
    I would like to find something that has more general usefulness.

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Russell Smith at Apr 23, 2005 at 3:44 am

    On Sat, 23 Apr 2005 03:14 am, Bruce Momjian wrote:
    Hannu Krosing wrote:
    On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote:
    See this TODO:

    * Allow data to be pulled directly from indexes

    Currently indexes do not have enough tuple visibility information
    to allow data to be pulled from the index without also accessing
    the heap. One way to allow this is to set a bit to index tuples
    to indicate if a tuple is currently visible to all transactions
    when the first valid heap lookup happens.
    Storing visibility information in the index has always been put down as a cause of
    performance problems. Would it be plausible to have an index type that included the
    information and one that didn't. You could choose which way you wanted to go.

    I know especially for some tables, I would choose this index with visibility as it would
    increase performance by not looking at the table at all for that information (count being a good example). However
    for general purpose indexes I would use the normal index with no visibility information.

    The possibly of the bit method or full tuples is probably a decision for others, but having
    the flexibility to choose in this would be a great thing.

    Regards

    Russell Smith
  • Bruce Momjian at Apr 23, 2005 at 4:03 am

    Russell Smith wrote:
    On Sat, 23 Apr 2005 03:14 am, Bruce Momjian wrote:
    Hannu Krosing wrote:
    On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote:
    See this TODO:

    * Allow data to be pulled directly from indexes

    Currently indexes do not have enough tuple visibility information
    to allow data to be pulled from the index without also accessing
    the heap. One way to allow this is to set a bit to index tuples
    to indicate if a tuple is currently visible to all transactions
    when the first valid heap lookup happens.
    Storing visibility information in the index has always been put down
    as a cause of performance problems. Would it be plausible to have an
    index type that included the information and one that didn't. You
    could choose which way you wanted to go.

    I know especially for some tables, I would choose this index with
    visibility as it would increase performance by not looking at the table
    at all for that information (count being a good example). However for
    general purpose indexes I would use the normal index with no visibility
    information.

    The possibly of the bit method or full tuples is probably a decision
    for others, but having the flexibility to choose in this would be a
    great thing.
    I thought a little bit about the complexity. I think we should use a
    heap bit to say if the index "all visible" bit is set. The idea is that
    you only set the index "all visible" bit after the tuple has been
    visible for a while, perhaps 1000 transactions, and a GUC could countrol
    that number. I assume that will reduce the overhead of clearing the
    index "all visible" bit on an UPDATE or DELETE to an acceptable load.

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Hannu Krosing at Apr 25, 2005 at 10:49 am

    On L, 2005-04-23 at 00:02 -0400, Bruce Momjian wrote:
    Russell Smith wrote:
    ...
    I know especially for some tables, I would choose this index with
    visibility as it would increase performance by not looking at the table
    at all for that information (count being a good example). However for
    general purpose indexes I would use the normal index with no visibility
    information.

    The possibly of the bit method or full tuples is probably a decision
    for others, but having the flexibility to choose in this would be a
    great thing.
    I thought a little bit about the complexity. I think we should use a
    heap bit to say if the index "all visible" bit is set. The idea is that
    you only set the index "all visible" bit after the tuple has been
    visible for a while, perhaps 1000 transactions, and a GUC could countrol
    that number. I assume that will reduce the overhead of clearing the
    index "all visible" bit on an UPDATE or DELETE to an acceptable load.
    Having the full index either CHECK-FROM-HEAP or INDEX-ONLY makes
    planning decisions much simpler. If the index has some leafs that
    require heap access and others that don't, we need to also start
    keeping some statistics about what proportion of index lookups result in
    heap accesses, possibly correlated with certain index regions.

    I think that starting from something simple (like read-only tables :)
    gives most gain for effort.

    --
    Hannu Krosing <hannu@tm.ee>
  • Hannu Krosing at Apr 25, 2005 at 10:49 am

    On R, 2005-04-22 at 13:14 -0400, Bruce Momjian wrote:
    This can be prohibitively pricey for big tables with lots of indexes, as
    marking the tuple means alsn finding and possibly writing to all index
    enytries pointing to this tuple.
    Yep, it could be very ugly, but it would help with our COUNT(*) problem
    too. Isn't there a solution? If there isn't, I can remove the TODO
    item.
    I think this is the direction we should be heading because it has more
    general usefulness.
    OTOH this will probably never be implemented for the same reason that
    full visibility in index tuples will not, whereas my proposition can be
    implemented quite easily (it's just a SMOP).
    I would like to find something that has more general usefulness.
    So would I :)

    But I assure you that this would be generally usefull in DataWarehousing
    applications, where you have to play partitioning tricks anyway and
    making some sub-table RO and running REINDEX on it would add little
    complexity..

    This would ease my current problem of running queries over tables with
    more than >100 M rows and getting the results in reasonable time.

    My setup is a big logical table, which consists of many inherited
    tables, filled one after another from OLAP database. After each 5 to 10
    M rows old table is frozen and new table started.

    So when I run a query that uses an index, which does not correlate well
    with primary_key and timestamp, postgres finds the needed rows from
    index quickly and spends most of the time waiting for seeks from heap-
    tuple accesses for visibility checks, which is pure wasting of resources
    as they all succeed.

    I guess avoiding heap tuple lookups could make some of the queries run
    10's maybe 100's of times faster, as index tuples are naturally
    clustered.


    --
    Hannu Krosing <hannu@tm.ee>
  • Jim C. Nasby at Apr 22, 2005 at 6:18 pm
    You should read the archives of this list; there was a pretty long
    thread about this a few months ago. IIRC the consensus after much debate
    was that this feature would add benefit in many instances, especially on
    large tables where only a small amount of data changes.

    Also, I think there is value to supporting read only tables, in addition
    to the index visibility info. I also like the idea of being able to have
    heap tuples that don't have visibility information, though I'm not sure
    how hard that would be to do. But it would certainly be useful to pull
    the version bytes out of a 200M row table.
    On Fri, Apr 22, 2005 at 07:25:19PM +0300, Hannu Krosing wrote:
    On R, 2005-04-22 at 11:40 -0400, Bruce Momjian wrote:
    See this TODO:

    * Allow data to be pulled directly from indexes

    Currently indexes do not have enough tuple visibility information
    to allow data to be pulled from the index without also accessing
    the heap. One way to allow this is to set a bit to index tuples
    to indicate if a tuple is currently visible to all transactions
    when the first valid heap lookup happens.
    I don't think this is implementable in any reasonably cheap way (i.e.
    this will be a general performance loss).

    This has all the downsides of storing full visibility in index tuples,
    except the size.
    This bit would have to be cleared when a heap tuple is expired.
    Does "expired" here mean marked for deletion ?

    This can be prohibitively pricey for big tables with lots of indexes, as
    marking the tuple means alsn finding and possibly writing to all index
    enytries pointing to this tuple.
    I think this is the direction we should be heading because it has more
    general usefulness.
    OTOH this will probably never be implemented for the same reason that
    full visibility in index tuples will not, whereas my proposition can be
    implemented quite easily (it's just a SMOP).
    ---------------------------------------------------------------------------

    Hannu Krosing wrote:
    Fetching data from just indexes has been discussed on this list several
    times before, and it has been told that this can't be done with postgres
    thanks to MVCC.

    But this is true only when data is changing. In a data-warehousing
    scenario what it is often needed is a possibility for fast querying of
    static historical data.

    If table has been VACUUM'ed or REINDEX'ed after last change
    (update/delete/insert), then there is 1-1 correspondence between table
    and indexes, and thus no need to fetch the tuple from heap for extra
    visibility checks.

    What I am proposing is

    1) possibility to explicitly change table status to READ-ONLY .

    2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
    that are READ-ONLY

    3) changing postgres planner/executor to make use of this flag, by not
    going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true.

    --
    Hannu Krosing <hannu@skype.net>


    ---------------------------(end of broadcast)---------------------------
    TIP 5: Have you checked our extensive FAQ?

    http://www.postgresql.org/docs/faq
    --
    Hannu Krosing <hannu@skype.net>

    ---------------------------(end of broadcast)---------------------------
    TIP 9: the planner will ignore your desire to choose an index scan if your
    joining column's datatypes do not match
    --
    Jim C. Nasby, Database Consultant decibel@decibel.org
    Give your computer some brain candy! www.distributed.net Team #1828

    Windows: "Where do you want to go today?"
    Linux: "Where do you want to go tomorrow?"
    FreeBSD: "Are you guys coming, or what?"
  • Ron Mayer at Apr 23, 2005 at 10:00 pm

    Bruce Momjian wrote:
    See this TODO: * Allow data to be pulled directly from indexes
    I think this is the direction we should be heading because it has more
    general usefulness.
    I think read-only tables would have a few different types
    of general usefulness in addition to enabling index scans.

    Is this a fair summary of the potential benefits of READ-ONLY
    tables? (from both this thread and the archives):

    1. Index-only scans are made possible fairly easily because
    you wouldn't need to check the heap for visibility.

    2. Simple tables can be much smaller since you don't need
    most of the HeapTupleHeaderData. This reduction in
    space translates to a reduction in I/O through better
    use of the shared memory and OS caches.

    3. A Read-Only Clustered table could be assumed to be
    sorted, so you could avoid some Sort steps
    for things like GroupAggregates and Merge Joins.

    Any others?

    The biggest/slowest tables in my database happen to be
    entirely read only (road network and other GIS features
    for the country; and data warehouse tables representing
    previous quarter's data).
  • Tom Lane at Apr 23, 2005 at 10:27 pm

    Ron Mayer writes:
    Is this a fair summary of the potential benefits of READ-ONLY
    tables? (from both this thread and the archives):
    1. Index-only scans are made possible fairly easily because
    you wouldn't need to check the heap for visibility.
    2. Simple tables can be much smaller since you don't need
    most of the HeapTupleHeaderData.
    What you are talking about is not a "read only" table, it is a
    "non-MVCC" table. This is a much greater assault on the fundamental
    semantics of Postgres than it's being painted to be in this thread.
    In particular, how is such a table going to come into being? You'd
    not be able to just flip the READ-ONLY flag on and off.

    (The notion of having tuples in the system that don't have the standard
    HeapTupleHeader is not as easy to implement as you might think, either,
    because that data structure is *everywhere*.)

    While I don't say it's impossible to do, I do think that the work and
    semantic ugliness involved would outweigh the possible benefits. In
    particular, there are other, more transparent ways of doing #1.

    regards, tom lane
  • Alvaro Herrera at Apr 23, 2005 at 11:25 pm

    On Sat, Apr 23, 2005 at 06:27:38PM -0400, Tom Lane wrote:

    While I don't say it's impossible to do, I do think that the work and
    semantic ugliness involved would outweigh the possible benefits. In
    particular, there are other, more transparent ways of doing #1.
    One idea that may be closer to reality might be implementing heaps that
    behave as indexes (or indexes that behave as heaps), or clustered
    indexes as some other database call them; the main idea being that at
    the leaf nodes of the index, the whole heap tuple resides instead of
    only a CTID.

    One problem I see with that approach is the maximum tuple size ... our
    current btree index code can't handle tuples bigger than (BLCKSZ/3)
    IIRC, some 2 kB.

    --
    Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
    "Vivir y dejar de vivir son soluciones imaginarias.
    La existencia está en otra parte" (Andre Breton)
  • Tom Lane at Apr 23, 2005 at 11:33 pm

    Alvaro Herrera writes:
    One idea that may be closer to reality might be implementing heaps that
    behave as indexes (or indexes that behave as heaps), or clustered
    indexes as some other database call them; the main idea being that at
    the leaf nodes of the index, the whole heap tuple resides instead of
    only a CTID.
    The main problem with this is that you'd not have a stable TID for a
    given tuple, since it might be forced to move by operations such as
    index page splits. It might be impractical to support any secondary
    indexes on such a table --- but I can definitely see that there are
    applications that wouldn't care.
    One problem I see with that approach is the maximum tuple size ... our
    current btree index code can't handle tuples bigger than (BLCKSZ/3)
    IIRC, some 2 kB.
    So you toast 'em ... I don't see this as a fatal drawback.

    regards, tom lane
  • Paul Tillotson at Apr 24, 2005 at 1:56 am

    2. Simple tables can be much smaller since you don't need
    most of the HeapTupleHeaderData.
    What you are talking about is not a "read only" table, it is a
    "non-MVCC" table. This is a much greater assault on the fundamental
    semantics of Postgres than it's being painted to be in this thread.
    In particular, how is such a table going to come into being? You'd
    not be able to just flip the READ-ONLY flag on and off.

    I can see it now....

    Nov 21, 2005: Earlier today, the PostgreSQL Global Development group
    issued a press release announcing the availability of PostMySQL 10g.
    PostMySQL 10g continues PostgreSQL's venerable legacy of doing things
    the Right Way, but brings great performance enhancements to read-only or
    light-update loads through the introduction of PostMyISAM tables.
    PostMyISAM tables are eagerly awaited by the data warehousing community
    because of their more compact on-disk representation and because their
    lack of tuple visibility information allows queries to be answered
    directly from the index.

    The move is widely seen as an effort to gain market share at the expense
    of MySQL. MySQL AB CEO Marten Mickos was quick to point out that the
    new features made PostgreSQL much more like MySQL, but that their
    implementation had serious flaws.

    "Their [PostMySQL's] implementation is nearly useless in an enterprise
    OLAP environment. ... In a double-blind test of the two systems, users
    preferred the mature MySQL implementation that does the Best Thing Under
    the Circumstances (tm) rather than PostMySQL's fixation with data
    integrity. While attempting to load the sample data before running
    benchmarks the user found in 'unbelievably' slow. After consulting with
    PostgreSQL experts, the user was instructed to tune various default
    configuration parameters and wrap the inserts in a transaction. After
    doing so, the loading proceeded at a reasonable speed.

    "However, it was discovered that one of the records in the 120 gigabyte
    import data had February 29, 2003 stored in a date field. The PostMySQL
    database engine refused to store this date, and rolled the entire
    inserting transaction back. Apparently this is a non-trivial action for
    the PostMyISAM tables, as after 15 minutes the user thought that the
    database had locked up and terminated the postmaster process, causing
    more trouble when the database would not start. [Editor's comment: the
    corruption was blamed on a bug introduced during the somewhat hasty
    rewrite of the entire PostgreSQL code base in C++.] After doing initdb
    again and editing the import data by hand, the user managed to load the
    data.

    "In conclusion, ... the new PostMyISAM tables negate all of PostMySQL's
    claimed ACIDity and concurrency benefits, while we [MySQLAB] offer a
    mature, stable implementation that requires much less work to configure,
    is more flexible and forgiving of the surprises found in real-world
    databases, and enjoys a much larger user community and comprehensive
    suite of 3rd party tools."
  • Hannu Krosing at Apr 25, 2005 at 10:47 am

    On L, 2005-04-23 at 18:27 -0400, Tom Lane wrote:
    Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
    Is this a fair summary of the potential benefits of READ-ONLY
    tables? (from both this thread and the archives):
    1. Index-only scans are made possible fairly easily because
    you wouldn't need to check the heap for visibility.
    2. Simple tables can be much smaller since you don't need
    most of the HeapTupleHeaderData.
    What you are talking about is not a "read only" table, it is a
    "non-MVCC" table. This is a much greater assault on the fundamental
    semantics of Postgres than it's being painted to be in this thread.
    That's why I proposed a much less ambitious, and much more "low-hanging-
    fruit-first" compliant thing when I satrted the thread - a simple two-
    flag system to mark a relation as safe to use for index-only queries.

    Main use of that would be in partiotioned table setups in data
    warehouses, where older partitions can be switched to read-only state.

    and such all-or-nothing scheme would also make it much easier to
    estimate cost of index(only)scan.
    In particular, how is such a table going to come into being? You'd
    not be able to just flip the READ-ONLY flag on and off.
    original idea (with HeapTupleHeader intact) was to first mark the table
    as READ-ONLY, and then either run VACUUM, or preferrably VACUUM-FULL +
    REINDEX TABLE on it, so that all index tuples point to valid and visible
    tuples.
    A check must be made to make sure, that all transactions started before
    setting the READ-ONLY flag have finished before starting VACUUM or
    REINDEX.
    (The notion of having tuples in the system that don't have the standard
    HeapTupleHeader is not as easy to implement as you might think, either,
    because that data structure is *everywhere*.)
    My impression was, that HeapTupleHeader is usually not carried with
    fields after doing the initial visibility checks ?

    But if it is needed, then it should be added when generating tuples from
    index scan, preferrably in such a way, that non-SELECT queries get these
    tuples with Xids set in a way which prevent them from being modified.
    While I don't say it's impossible to do, I do think that the work and
    semantic ugliness involved would outweigh the possible benefits. In
    particular, there are other, more transparent ways of doing #1.
    If HeapTupleHeader is essential for pg, than #1 should generate fake
    HeapTupleHeader with some constant values (xmin=FrozenTransactionId,
    xmax=MAXINT).


    BTW, do we really store tableoid column in heap tuples or is it added
    somewhere on the way from heap ?

    --
    Hannu Krosing <hannu@tm.ee>
  • Hannu Krosing at Apr 25, 2005 at 10:54 am
    I send it now the 3rd time because I'm not sure my mail works, as it has not appeared on pgsql-hackers

    On L, 2005-04-23 at 18:27 -0400, Tom Lane wrote:
    Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
    Is this a fair summary of the potential benefits of READ-ONLY
    tables? (from both this thread and the archives):
    1. Index-only scans are made possible fairly easily because
    you wouldn't need to check the heap for visibility.
    2. Simple tables can be much smaller since you don't need
    most of the HeapTupleHeaderData.
    What you are talking about is not a "read only" table, it is a
    "non-MVCC" table. This is a much greater assault on the fundamental
    semantics of Postgres than it's being painted to be in this thread.
    That's why I proposed a much less ambitious, and much more "low-hanging-
    fruit-first" compliant thing when I satrted the thread - a simple two-
    flag system to mark a relation as safe to use for index-only queries.

    Main use of that would be in partiotioned table setups in data
    warehouses, where older partitions can be switched to read-only state.

    and such all-or-nothing scheme would also make it much easier to
    estimate cost of index(only)scan.
    In particular, how is such a table going to come into being? You'd
    not be able to just flip the READ-ONLY flag on and off.
    original idea (with HeapTupleHeader intact) was to first mark the table
    as READ-ONLY, and then either run VACUUM, or preferrably VACUUM-FULL +
    REINDEX TABLE on it, so that all index tuples point to valid and visible
    tuples.
    A check must be made to make sure, that all transactions started before
    setting the READ-ONLY flag have finished before starting VACUUM or
    REINDEX.
    (The notion of having tuples in the system that don't have the standard
    HeapTupleHeader is not as easy to implement as you might think, either,
    because that data structure is *everywhere*.)
    My impression was, that HeapTupleHeader is usually not carried with
    fields after doing the initial visibility checks ?

    But if it is needed, then it should be added when generating tuples from
    index scan, preferrably in such a way, that non-SELECT queries get these
    tuples with Xids set in a way which prevent them from being modified.
    While I don't say it's impossible to do, I do think that the work and
    semantic ugliness involved would outweigh the possible benefits. In
    particular, there are other, more transparent ways of doing #1.
    If HeapTupleHeader is essential for pg, than #1 should generate fake
    HeapTupleHeader with some constant values (xmin=FrozenTransactionId,
    xmax=MAXINT).


    BTW, do we really store tableoid column in heap tuples or is it added
    somewhere on the way from heap ?

    --
    Hannu Krosing <hannu@skype.net>
  • Jochem van Dieten at Apr 25, 2005 at 11:43 am

    On 4/24/05, Tom Lane wrote:

    What you are talking about is not a "read only" table, it is a
    "non-MVCC" table. This is a much greater assault on the fundamental
    semantics of Postgres than it's being painted to be in this thread.
    In particular, how is such a table going to come into being?
    You need an operation that rewrites the entire table and all indexes
    at the same time. That pretty much means the only sensible way would
    be a special form of CLUSTER (including al the locking uglyness).
    If you need to make the table READ-WRITE again you need to run CLUSTER
    against it so the tuples get visibility information again and the fill
    factor of the indexes is reset to 0.7.


    But I have to agree with Ron that this probably makes much more sense
    when you have partitioned tables and do this on partitions instead of
    a full table. Archived might be a better word for it then read-only.

    Jochem
  • Simon Riggs at Apr 27, 2005 at 1:17 pm

    On Sat, 2005-04-23 at 15:04 -0700, Ron Mayer wrote:
    Bruce Momjian wrote:
    See this TODO: * Allow data to be pulled directly from indexes
    I think this is the direction we should be heading because it has more
    general usefulness.
    I think read-only tables would have a few different types
    of general usefulness in addition to enabling index scans.

    Is this a fair summary of the potential benefits of READ-ONLY
    tables? (from both this thread and the archives):

    1. Index-only scans are made possible fairly easily because
    you wouldn't need to check the heap for visibility.

    2. Simple tables can be much smaller since you don't need
    most of the HeapTupleHeaderData. This reduction in
    space translates to a reduction in I/O through better
    use of the shared memory and OS caches.

    3. A Read-Only Clustered table could be assumed to be
    sorted, so you could avoid some Sort steps
    for things like GroupAggregates and Merge Joins.
    4. No need to VACUUM tables as part of a database-wide VACUUM, which is
    particularly important for larger databases.

    Best Regards, Simon Riggs
  • Jochem van Dieten at Apr 22, 2005 at 3:53 pm

    On 4/22/05, Hannu Krosing wrote:
    Fetching data from just indexes has been discussed on this list several
    times before, and it has been told that this can't be done with postgres
    thanks to MVCC.

    But this is true only when data is changing. In a data-warehousing
    scenario what it is often needed is a possibility for fast querying of
    static historical data.

    If table has been VACUUM'ed or REINDEX'ed after last change
    (update/delete/insert), then there is 1-1 correspondence between table
    and indexes, and thus no need to fetch the tuple from heap for extra
    visibility checks.

    What I am proposing is

    1) possibility to explicitly change table status to READ-ONLY .

    2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
    that are READ-ONLY
    Why stop at indexes? If you switch to readonly status with a CLUSTER
    or ALTER TABLE command you can even remove the visibility information
    from the heap tuples. Visibility would be exclusively controlled by
    the visibility of the table in the catalog, i.e. all or nothing.

    Jochem
  • Ron Mayer at Apr 22, 2005 at 7:16 pm

    Jochem van Dieten wrote:
    On 4/22/05, Hannu Krosing wrote:
    ...But this is true only when data is changing. In a data-warehousing
    scenario what it is often needed is a possibility for fast querying of
    static historical data.
    And when we get partitioning, I think many data warehouses will have
    the bulk of their data in tables like that (previous quarters marked
    read-only, current quarter growing).
    What I am proposing is

    1) possibility to explicitly change table status to READ-ONLY .
    2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
    that are READ-ONLY
    Why stop at indexes? If you switch to readonly status with a CLUSTER
    or ALTER TABLE command you can even remove the visibility information
    from the heap tuples. Visibility would be exclusively controlled by
    the visibility of the table in the catalog, i.e. all or nothing.
    Seems this could reduce the size of some data warehouses considerably
    too. My biggest tables have nothing but columns of IDs; and the
    30-some bytes of the row header is a significant fraction of the
    entire size. I think the diskspace === I/O bandwidth savings on
    the heap would be almost as big a benefit as the indexes.
  • Simon Riggs at Apr 22, 2005 at 9:46 pm

    On Fri, 2005-04-22 at 18:34 +0300, Hannu Krosing wrote:
    Fetching data from just indexes has been discussed on this list several
    times before, and it has been told that this can't be done with postgres
    thanks to MVCC.

    But this is true only when data is changing. In a data-warehousing
    scenario what it is often needed is a possibility for fast querying of
    static historical data.

    If table has been VACUUM'ed or REINDEX'ed after last change
    (update/delete/insert), then there is 1-1 correspondence between table
    and indexes, and thus no need to fetch the tuple from heap for extra
    visibility checks.

    What I am proposing is

    1) possibility to explicitly change table status to READ-ONLY .

    2) setting a flag CAN_OMIT_HEAP_CHECK after REINDEX TABLE for tables
    that are READ-ONLY

    3) changing postgres planner/executor to make use of this flag, by not
    going to heap for tuples on tables where CAN_OMIT_HEAP_CHECK is true.
    Hannu,

    I'm with you on this, for all sorts of reasons. I've done a lot of
    detailed thought on this.

    As Ron says, until we have partitioning, it isn't as useful for DW.

    Best Regards, Simon Riggs
  • Ron Mayer at Apr 23, 2005 at 10:08 pm

    Simon Riggs wrote:

    As Ron says, until we have partitioning, it isn't as useful for DW.

    Well, it's a bit stronger than what I said. The last big DW I dealt
    with did have previous quarter's data archived into different tables,
    and those could be marked read-only. Also, quite a few of our
    ID<->Value lookup tables have a fixed or slowly changing set of
    values (products, states, etc); and those would benefit from index scans.

    But yes, it'd be an even nicer feature with partitioning, since then
    you could hypothetically keep inserting into a mostly-read-only table
    and mark partitions read-only as needed.


    I think I was trying to say it'd be even more useful for DW with
    partitioning - and I hope that when partitioning is being designed
    it considers the possibility of taking advantage of read-only
    partitions if we happen to implement read-only tables.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedApr 22, '05 at 3:17p
activeApr 27, '05 at 1:17p
posts23
users11
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2021 Grokbase