Fact tables are supposed to be nothing but foreign keys to dimension tables
plus "facts" - numeric measurements.
From the 10g Concepts manual, on the benefits of IOT -
"Index-organized tables provide faster access to table rows by the primary
key or any key that is a valid prefix of the primary key. Presence of nonkey
columns of a row in the B-tree leaf block itself avoids an additional block
access. Also, because rows are stored in primary key order, range access by
the primary key (or a valid prefix) involves minimum block accesses.
In order to allow even faster access to frequently accessed columns, you can
use a row overflow segment (as described later) to push out infrequently
accessed nonkey columns from the B-tree leaf block to an optional
(heap-organized) overflow segment. This allows limiting the size and content
of the portion of a row that is actually stored in the B-tree leaf block,
which may lead to a higher number of rows in each leaf block and a smaller
B-tree.
Unlike a configuration of heap-organized table with a primary key index
where primary key columns are stored both in the table and in the index,
there is no such duplication here because primary key column values are
stored only in the B-tree index.
Because rows are stored in primary key order, a significant amount of
additional storage space savings can be obtained through the use of key
compression.
Use of primary-key based logical rowids, as opposed to physical rowids, in
secondary indexes on index-organized tables allows high availability. This
is because, due to the logical nature of the rowids, secondary indexes do
not become unusable even after a table reorganization operation that causes
movement of the base table rows. At the same time, through the use of
physical guess in the logical rowid, it is possible to get secondary index
based index-organized table access performance that is comparable to
performance for secondary index based access to an ordinary table."
Since the fact tables will be the start point for the majority of queries I
want the access to the result set to be as fast as possible. A current
complaint is the length of time that it takes to get a result. I plan to
move most of the measurements to the overflow area and make the IOT portion
just the foreign keys to the dimension tables (the fact table's primary
key). That's what led me to the idea of a fact table as an IOT. The fact
table consists of it's primary key plus some measurements. Since I use an
index to enforce that primary key I have a situation where most of the data
that I want from the fact table can be satisfied from the index alone. I
decided that I might as well save the space of the table (since the table
and the index will basically duplicate each other's data) and make the table
index organized. I'm partitioning it by date since most queries are limited
to a single fiscal year.
But I need a mechanism to speed queries that are not via the primary key or
a valid prefix thereof. That's where the secondary indexes come in.
I'm a little dissapointed that IOT tables cannot be composite partitioned.
My dimension tables will be range partitioned by date (because of the date
limited nature of the queries) and hash subpartitioned to spread the I/O
over as many spindles as possilbe (each subpartition goes to its own
tablespace and each tablespace will be on a separate spindle).
However, that's just my thinking at the planning stage. If you have
experience with the flaws of using IOT fact tables I certainly want to learn
from that experience and not repeat a known mistake.
What are the problems with IOT fact tables?