Dimension modeling and Data normalization

PDF

A significant difference of dimension modeling when compared to transaction data modeling is the normalization process. While OLTP stresses on need for normalization, OLAP database emphasises on retrieval performance and hence allows data redundancy.

This article helps to understand what is normalization and what is dimension modeling.

Data Normalization

This is a technique used in data modeling that emphasizes on avoiding storing the same data element at multiple places. We follow the 3 rules of normalization called the First Normal Form, Second Normal Form, and Third Normal Form to achieve a normalized data model.

A normalized data model may result in many tables/entities having multiple levels of relationships, example table1 related to table2, table2 further related to table3, table3 related to table 4 and so on.

  1. First Normal Form – The attributes of the entity must be atomic and must depend on the Key.
  2. Second Normal Form – This rule demands that every aspect of each and every attribute depends on Key.
  3. Third Normal Form (3NF) – This rule demands that every aspect of each and every attributes depends on nothing but the key.

Theoretically We have further rules called the Boyce-Codd Normal Form, Fourth Normal Form and the Fifth Normal form. In practice we don’t use the rules beyond 3NF.

Dimensional Modeling

This is the form of data modeling used in data warehousing where we store business related attributes in tables called Dimension Tables and the business related metrics/measures in tables called Fact Tables. The business metrics are associated with the business attributes by relating the dimension tables with the fact table. The dimension tables are not related to one another.

A fact table could be related to many dimension table and hence form the center of the structure when graphically represented. This structure is also called the star schema.

Please note that in dimension modeling we do not emphasize on normalizing the data instead have lesser levels of relationships.

 
Home Fundamentals Dimension modeling and Data normalization