![]() ![]() In this case, rather than attempting to store a null value in the Store FK, you build a special record in the Store dimension representing “No Store.” Now the No Store condition has a perfectly normal FK-PK representation in the fact table. ![]() Perhaps in some situations a product can be sold to a customer in a transaction without a store defined. Occasionally a perfectly legitimate measurement will involve a missing dimension. It’s a major mistake to build data warehouse keys out of the natural keys that come from the underlying data sources. In the real world, there are many compelling reasons to build the FK-PK pairs as surrogate keys that are just sequentially assigned integers. This situation is perfectly consistent with referential integrity and proper dimensional modeling. Therefore, a product dimension table might be paired with a sales fact table in which some of the products are never sold. Note that this design allows the dimension table to possess primary keys that aren’t found in the fact table. In other words, every foreign key in the fact table has a match to a unique primary key in the respective dimension. This predicament gives rise to the technique of slowly changing dimensions, the subject of the next column in this series.įoreign keys in the fact table obey referential integrity with respect to the primary keys in their respective dimensions. In effect, as keepers of the data warehouse, we have taken a pledge to faithfully represent these changes. But you have to account for the slow, episodic change of these dimensions in the way you handle them. The assumption of dimension independence would mean that all the dimensions, such as Product, Store, and Customer, are independent of Time. And the record of when a given customer shopped in a given store would be expressed more naturally in a fact table that also showed the Time dimension. ![]() A single, combined dimension would likely be unwieldy with tens of millions of rows. But it’s usually the right decision to model Customer and Store as separate dimensions. In the grocery store example, Customer and Store clearly will show a statistical correlation. In truth, dimensions rarely are completely independent in a strong statistical sense. We call these logical clumps dimensions and assume informally that these dimensions are independent. When you record facts - dollar sales of a grocery store purchase of an individual product.įor example - you naturally divide the context into clumps named: It’s not uncommon for the designer to add context to a set of facts partway through the implementation.Īlthough you could lump all context into a wide, logical record associated with each measured fact, you’ll usually find it convenient and intuitive to divide the context into independent logical clumps. ![]() By contrast, the context surrounding the facts is open-ended and verbose. Facts are very specific, well-defined numeric attributes. Facts are always surrounded by mostly textual context that’s true at the moment the fact is recorded. Measurements are usually numeric and taken repeatedly. Dimensional modeling begins by dividing the world into measurements and context. ![]()
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |