• Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

DB Normlization

 
Elizabeth King
Ranch Hand
Posts: 191
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I'm not sure this is the right place to ask
the following:
Can anyone elabrate the meaning of datebase
normlization?
M
 
Joe Ess
Bartender
Posts: 9337
10
Linux Mac OS X Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Normalization is the process of balancing the information stored in database tables so that information that does not have to be repeated isn't. For example, if you have a database an inventory and you have 100 items and 10,000 instances of the items you wouldn't make one table with 10,000 records, one for each item instance. You'd have 100 records representing the items and the count of each item either stored in the item table or another related table. It is a common statement in database circles that a fully normalized database is practically unusable so it's more of an art than a science (like most of programming). Google for normalization. I got 147,000 hits.
 
Elizabeth King
Ranch Hand
Posts: 191
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Joe, thanks.
I have a table with many many columns, but each row
fills only a few columns, and most of them are null.
Is this type of DB unnormlized? Should I divide
the table into smaller tables, and use keys to refer each
other?
 
Bhushan Jawle
Ranch Hand
Posts: 252
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Mary,
This should get you started
Thanks,
Bhushan
 
Joe Ess
Bartender
Posts: 9337
10
Linux Mac OS X Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Mary King:

Is this type of DB unnormlized?

Standard Engineering Response: That depends on the data
Bhushan's link gives a good introduction. It should get you started.
 
Glen Cai
Ranch Hand
Posts: 121
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Joe Ess:
[QB]
...
It is a common statement in database circles that a fully normalized database is practically unusable ...
[QB]

Why?
 
Loren Rosen
Ranch Hand
Posts: 156
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The problem is that normalization scatters the data into multiple tables, and at some point you have to gather it up again. Sometimes this can cause performance problems.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I gave that article a 60 second scan and it seems to be going all the right places. One of my favorite Google words is "tutorial". Try "database normalization tutorial" if you want to dig deeper.
The article mentions normal forms. Relational theory goes up to 5th normal form, but the usual practice is to get yourself to 3rd normal and then see if there are any MAJOR efficiencies to be gained by denormalizing a bit. I'd wait until the 3rd normal form was proven to be a problem before going back.
Relational modeling is a vast and deep field. Hey, "relational modeling tutorial" would be another good Google. If you have just the right brain wiring, you may find it to be great fun.
 
Jason Steele
Ranch Hand
Posts: 100
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Normalization is an excellent way of sorting data that is not directly related to other data.
For instance, take a student taking some courses.
you don't want to add a column for each class the student takes:
StudentID Name Address CourseID1 CourseName1 CourseID2 CourseName2 etc.
CourseID and CourseName are in no way directly related to what a student is so why have it in the same table?
instead we create three tables
Student Table
-------------
StudentID | Name | Address
Course Load Table
-----------------
StudentID | CourseID
Courses Table
-------------
CourseID | CourseName | Description
Now in a normalized manner we can see how data relates to others
Any data that is not directly related to the specific object should be pulled out and put into another table. This is the sole basis of Relational Databases. And though in larger tables it could hinder performance, it is better for readability and logic flow.
 
Scott Ambler
author
Ranch Hand
Posts: 608
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You might find the Normalization section of my Data modeling 101 essay to be of help. The URL is http://www.agiledata.org/essays/dataModeling101.html#Normalize. This section is the rough draft of Chapter 4 in Agile Database Techniques (www.ambysoft.com/agileDatabaseTechniques.html). IMHO data normalization is a skill which every developer should have.
-Scott
 
Dana Hanna
Ranch Hand
Posts: 227
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Glen Cai:

Why?

We have a developer at work who created a beautifully normalized customer data mart, but queries against it take forever due to the level of complexity in them. We've started creating triggers and de-normalized tables to store data in a more speed efficient manner.
Trust me, he's right.
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic