Fact table with different update schedules

I have two datasets with the same grain, for example the invoice number. Most of the required data is updated daily, as we recognize revenue from previous invoices. However, some of this data comes from a separate costing system once a month and is then sent to the data warehouse with additional information. Should I create one fact table that contains both datasets and update it once a month when the costing data is imported, or should I create two fact tables because of the different update schedules? The datasets are related to each other, and many queries (~35%) need information from both (when available). The system imports 30,000 rows per day into the fact table, which contains about 38,000,000 rows. A monthly update will affect 660,000 rows.

1 answer

Provided that the measures already loaded will not be changed by the second (monthly) stage, you can treat the fact table as an “accumulating snapshot”. This kind of table describes processes with a definite beginning and a definite end, such as workflows: each row is inserted once and then updated as later stages complete. Look it up in Kimball's The Data Warehouse Toolkit, or just google “Kimball Snapshot Fact Table”.
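To make the idea concrete, here is a minimal sketch of how a single accumulating snapshot table could absorb both loads. It uses Python's built-in sqlite3 module only as a stand-in for the warehouse, and the table name `fact_invoice`, its columns, and the two loader functions are hypothetical names chosen for illustration, not anything from the question.

```python
import sqlite3


def create_schema(conn):
    # One row per invoice (the shared grain). The cost columns are
    # nullable: they stay NULL until the monthly costing load arrives.
    # All names here are hypothetical.
    conn.execute(
        """
        CREATE TABLE fact_invoice (
            invoice_number   INTEGER PRIMARY KEY,
            invoice_date     TEXT NOT NULL,
            revenue          REAL NOT NULL,
            cost             REAL,
            cost_loaded_date TEXT
        )
        """
    )


def load_daily(conn, rows):
    # Daily stage: insert new invoices with the daily measures only.
    # rows: iterable of (invoice_number, invoice_date, revenue)
    conn.executemany(
        "INSERT INTO fact_invoice (invoice_number, invoice_date, revenue) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def load_monthly_costs(conn, rows, load_date):
    # Monthly stage: fill in the cost columns of existing rows.
    # The daily measures (revenue) are never touched here.
    # rows: iterable of (invoice_number, cost)
    conn.executemany(
        "UPDATE fact_invoice SET cost = ?, cost_loaded_date = ? "
        "WHERE invoice_number = ?",
        [(cost, load_date, invoice) for invoice, cost in rows],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    load_daily(conn, [(1, "2024-01-02", 100.0), (2, "2024-01-03", 250.0)])
    load_monthly_costs(conn, [(1, 60.0)], "2024-02-01")
    for row in conn.execute("SELECT * FROM fact_invoice ORDER BY invoice_number"):
        print(row)
```

With this layout, the queries that need both datasets read one table without a join, and a NULL `cost` simply means the costing data has not arrived yet for that invoice.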
