If you are using type 2 dimensions then run each query as of last night and
you don't have to worry about the newly loaded data.
On Behalf Of Tony Aponte
Sent: Wednesday, March 26, 2008 6:15 PM
To: debaditya.chatterjee_at_gmail.com; oracle-l_at_freelists.org
Subject: RE: Flashback query to ensure data consistency
The first thing I would say to you is to try the flashback/trigger strategy
you've worked out and see how it affects the user queries. I have flashed
back up to 10 days and still met query requirements.
Another approach is the Kimball real-time data warehouse strategy that
Dennis Williams already suggested.
A third approach I would pursue is to use Analytic Workspaces. With that
you can redirect the burden to the ETL workload instead and relieve the
queries from having to indo the transactions. At the end of the ETL
processing you would use the Workspace features to merge the changes that
have been made to the base tables.
Another approach could be to use Materialized Views and Query Rewrite
against the base tables. This would give you control over when to publish
the ETL updates by simply refreshing the MVs after the 8-hour jobs are
complete, yet still provide a consistent view of the previous data.
On Behalf Of Debaditya Chatterjee
Sent: Wednesday, March 26, 2008 4:42 PM
Subject: Flashback query to ensure data consistency
In a 24*7 data warehouse (accessed by users globally from different
timezones), incremental ETL jobs are executed to load data from different
source systems. Due to excessive volume of source data the ETL window has
grown to over 8 hours and users from a particular timezone (the most
affected ones) have started complaining about performance of the system and
also have become very vocal about data inconsistency. e.g. figures of a
sales report can change as the ETL tool commits every 10K records.
In order to give the users a consistent view of data we are thinking of
using the Flashback query feature of the database. i.e. When the ETL starts,
flashback query will be enabled and the user queries (through a logon
trigger) will run against the flashback data and once the ETL completes the
queries will run against the tables.
Although the solution seems feasible, but having flashback to work for more
than 8 hours seems to be our biggest challenge. I wanted to verify if you
all agree with this approach or would you suggest using a simpler (but
costly) approach like routing queries to a read only standby database while
ETL jobs execute on the primary database.
Your suggestions are welcome. The oracle database version is 10.2.0.3, the
ETL tool used is Informatica and the reporting tool used is Siebel