Published on Wednesday October 26 2011 by Luc Durand
Whether you are doing transactional data modeling, dimensional data modeling, Data Vault modeling or object-oriented modeling, all these techniques have one essential thing in common: they include complex entities (facts, dimensions, hubs, classes). One reason for this complexity is that instances of these entities go through a life cycle that makes them pass through different statuses (states) as conditions are met or events occur.
Describing the life cycle of these entities is therefore an excellent way to understand them better. There is a simple technique for this: the state (life cycle) diagram. This diagram was popularized by object-oriented analysis and design approaches (it is also part of the UML object-oriented modeling standard), but it is rarely used in the BI world even though it deserves a place among the major deliverables.
Figure 1 shows an example of a state diagram for equipment instances.
Figure 1 Equipment's life cycle diagram
The rounded rectangles represent the possible statuses of a piece of equipment, and the links represent the conditions or events needed to move from one status to another:
At the beginning of its life cycle, equipment is received and stored.
The equipment is then usually activated and put in service.
The equipment can then be moved while still being in service.
At some point, the equipment will be removed and become deactivated.
In some cases, the equipment will be reactivated and put back in service.
At the end of its life cycle, the equipment will be removed permanently.
Nesting statuses makes it possible to represent a parent status and its sub-statuses, which are particular cases of the parent status. In our example, new and deactivated equipment are two special cases of equipment in storage, which can be moved within the warehouse.
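The life cycle described above can be sketched as a small transition table. This is only an illustration: the status and event names are assumptions read off the text rather than taken from Figure 1, as is the choice to allow permanent removal only from the deactivated status. The "move in warehouse" event is attached to the assumed parent status "in storage" so that it applies to both of its sub-statuses.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical status names derived from the prose description.
    NEW = "new"
    IN_SERVICE = "in service"
    DEACTIVATED = "deactivated"
    REMOVED = "removed permanently"


# Sub-status -> parent status (nested statuses). Names are assumptions.
PARENT = {Status.NEW: "in storage", Status.DEACTIVATED: "in storage"}

# (current status, event) -> next status
TRANSITIONS = {
    (Status.NEW, "activate"): Status.IN_SERVICE,
    (Status.IN_SERVICE, "move"): Status.IN_SERVICE,
    (Status.IN_SERVICE, "remove"): Status.DEACTIVATED,
    (Status.DEACTIVATED, "reactivate"): Status.IN_SERVICE,
    # Assumption: permanent removal starts from the deactivated status.
    (Status.DEACTIVATED, "remove permanently"): Status.REMOVED,
}


def next_status(current, event):
    """Return the status reached from `current` when `event` occurs."""
    # An event attached to the parent status applies to all its sub-statuses.
    if event == "move in warehouse" and PARENT.get(current) == "in storage":
        return current
    if (current, event) not in TRANSITIONS:
        raise ValueError(
            f"event {event!r} is not allowed in status {current.value!r}"
        )
    return TRANSITIONS[(current, event)]


def replay(events, start=Status.NEW):
    """Apply a sequence of events, starting from a newly received equipment."""
    status = start
    for event in events:
        status = next_status(status, event)
    return status
```

Replaying the events "activate", "move", "remove", "reactivate" leaves the equipment in service, while any event that the table does not list for the current status is rejected. Validating such a table with business users amounts to validating the diagram itself.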
This diagram has several advantages in the BI world:
It provides a deeper understanding of the most important entities of a model: the important dimensions (often conformed) and the facts for a dimensional model; the hubs and links for a Data Vault model.
Because they are the most important entities, they are at the heart of the business context being modeled. Understanding them better therefore means understanding the overall business context better.
Being very easy to understand, the diagram is easy to validate, which helps refine the understanding of the entities and communicate that understanding to the development team.
It makes it possible to derive precisely the accumulating snapshot facts, which are one of the three types of facts in a dimensional model (together with transaction facts and periodic snapshot facts). Accumulating snapshots make it easy to measure the effectiveness and efficiency of a process that involves several steps (statuses), for instance an ordering process.
It specifies precisely the values of one of the most important mini-dimensions of a dimensional model: the status dimension.
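To illustrate the last two points, here is a minimal sketch of how the statuses of a life cycle diagram could drive both a status dimension and an accumulating snapshot row. The status names, column names, dates and the `days_to_service` measure are all hypothetical; they are not taken from an actual model.

```python
from datetime import date

# Hypothetical mapping: status of the life cycle -> milestone date column.
MILESTONES = {
    "new": "received_date",
    "in service": "activated_date",
    "deactivated": "deactivated_date",
    "removed permanently": "removed_date",
}

# The status dimension simply enumerates the statuses of the diagram.
status_dimension = [
    {"status_key": key, "status": status}
    for key, status in enumerate(MILESTONES, start=1)
]


def accumulating_snapshot(history):
    """Build one snapshot row from a list of (status, date entered) pairs."""
    row = {column: None for column in MILESTONES.values()}
    for status, entered in history:
        column = MILESTONES[status]
        if row[column] is None:  # keep the first time a status is reached
            row[column] = entered
    received, activated = row["received_date"], row["activated_date"]
    # Lag between two milestones, a typical accumulating snapshot measure.
    row["days_to_service"] = (
        (activated - received).days if received and activated else None
    )
    return row


# Made-up status history for one piece of equipment.
history = [
    ("new", date(2011, 1, 10)),
    ("in service", date(2011, 2, 1)),
    ("deactivated", date(2011, 9, 15)),
    ("in service", date(2011, 10, 1)),
]
row = accumulating_snapshot(history)
```

Each status contributes one date column that stays empty until the status is reached, and the lags between columns measure how long each step of the process takes. In this sketch a reactivation does not overwrite the first activation date; a real design would have to decide how repeated statuses are handled.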
The life cycle diagram therefore has a much broader scope than the object-oriented world alone. It provides an understanding of the entities at the heart of the business environment being modeled, and it adds a temporal dynamic to the structural data model. It is therefore a deliverable worth adding to the BI modeller’s toolbox.