Introduction to Data Modeling
Posted on April 7th, 2004 by Gabriel HarperData modeling is the practice of representing data structures and their relationships through conceptual modeling. The benefits of implementing this process during the design phase of your database project include improved data integrity, a more efficient development cycle, a clearer vision of the entities that are being represented in your database, and a powerful visual aid for collaboration and database design.
Methodologies and the Entity-Relationship Model
You may be familiar with the term Entity-Relationship Model (ERM). ER is technically a methodology for data modeling. A methodology is just a defined set of processes and methods which are followed during the design and development of a software project. Other methodologies of data modeling exist, such as Object Model and Oracle CASE. ER is the most common and simplest method of modeling that can be used almost universally for application design.
Proposed in 1976 by Peter P. Chen, the Entity-Relationship model provides us with one of the most common and useful components of a data modeling methodology . the Entity-Relationship diagram. Simply put, ER diagrams are visual diagrams that show database information represented as entities and relationships. If you’re familiar with the Relationships tool found in Microsoft Access, you already know what an ER diagram is.
ER diagrams do a great job of representing relational data models because entities and relationships between those entities map well to primary data components in your data model and their relationships between one another. For example, your company many need a small database to store data about customers, orders, products, and representatives. Each of these four primary entities is directly related to a table in the database (see Figure 1.1).
Figure 1.1
Relationships, Connectivity, and Cardinality
Looks great, but we still need to represent our relationships between these entities so we can define our indexes and foreign keys. Relationships have a plethora of meta data that can be associated with them, such as direction, existence, type, cardinality, connectivity, and direction. We will focus on the connectivity and cardinality of the relationships, which define the mapping between entities in our diagram. There exist three primary relationships:
1. One-to-One (1:1)
2. One-to-Many (1:N)
3. Many-to-Many (M:N)
A one-to-one relationship describes two entities with a direct mapping, such as .Each employee has one parking spot, and each parking spot is assigned to one employee..
A one-to-many relationship describes one entity related to more than one entity, such as .Each employee will have one assigned break period, but more than one employee can take a break during the same period..
A many-to-many relationship describes two entities sharing multiple relationships with one another, such as .Each employee can work on multiple advertising campaigns at any given time, and each campaign can have multiple employees participating in it at any given time..
Getting back to the example from Figure 1.1, let.s define the precise rules we need to define the relationships in our database.
1. Each customer must have one representative
2. Each representative can have many customers
3. Each customer can have many orders
4. Each order can have many products
5. Each product can belong to many orders
In order to present these relationships properly, we need to understand the basic components of an ER diagram. Let.s examine the diagram in Figure 1.2 and then apply it to our company model.
Figure 1.2
As you can see, it.s pretty simple to describe the most common relationships that occur in a relational database. Applying our relationship rules from the company model, we can tie everything together with our wonderful new relationship components (see Figure 1.3)
Figure 1.3
Now using the diagram from Figure 1.3, anybody with a basic understanding of an ER diagram can in just a few moments understand deep relationship requirements and identify independent and dependent entities in your data model. Most developers, project managers, and customers tend to agree this is a little easier than reading SQL statements!
Data Modeling Tools
As was previously mentioned, the Relationship tool include with Microsoft Access is simple ER diagram maker in itself. However, when working with the Access tool you are creating relationships on entities that directly correlate to tables in all cases, and you will need to explicitly define your associative entities (see Figure 1.4).
Figure 1.4
Many other modeling tools exist for a variety of environments, such as the SQL Server Diagrammer and Microsoft Visio. An easy and cheap place to start though is with a pencil and a pad of paper.
Conclusion
One of the most valuable attributes of data modeling is that it can be used as a tool for the entire chain of project stakeholders, developers, investors, and managers. It is a powerful visual aid that anyone can understand the basics of, but is still of significant technical usefulness to the application developers.
No seasoned developer will tell you that planning ahead is a bad idea. While data models are incredibly helpful when drafting project requirements and collaborating on data system design, they are still part of a much larger systems analysis that should be completed for any project of significant proportions.
March 25th, 2009 at 11:43 am
A nicely written article full of very helpful database design guidelines.
I have found that you get much more benefit by doing data modeling at the business level, prior to the database design.
In such a data model the boxes on the diagram would represent data entities as opposed to tables and the lines would represent relationships, as opposed to foreign keys. This model will be the data model for the business and not just for a single application.
In this model you can show complex structures such as subtypes and exclusive relationships. This model would also be fully normalized.
How these structures are implemented in any application is a design decision. One application could implement them in one way and another application in another. These design decisions would not be arbitrary but would depend on the scope and size of each application, taking into account performance and denormalization requirements.
I have written an eBook on business data modeling which will help all developers to arrive at far more elegant, yet powerful database design. It is part of the Integrated Modeling Method.