The 3 Little Pigs Do Data Warehousing

Hi Again,

It’s Sanjay. I was talking to my 4-year-old daughter about the 3 Little Pigs and the Big Bad Wolf, trying to teach her the value of doing things right the first time and for the long term. Being the geek that I am, I began to wonder: what would happen if the “3 Little Pigs” did Data Warehousing?

Now, just in case you don’t know the story, they built houses to keep the big bad wolf from eating them. The first one made a house of straw, which was easily blown away. The second one used sticks, which were marginally better, so it took a little more doing to destroy his house. The third one built a house of bricks and kept the wolf at bay. None of the wolf’s tricks worked, and the 3rd pig basically saved all 3 little pigs.

Fast forward a few years …

These chaps have grown up and now have technical skills. Each of them was put in charge of a Data Warehousing project and had some key decisions to make.

The first one (who built a straw house) likes things fast and easy, and went with the easiest and quickest delivery. He chose a star-schema based Data Warehouse.

Their nemesis, the big bad wolf, came back this time as complexity and said, “I’ll just throw more sources into your Data Warehouse and break your project cycles by complicating your Slowly Changing Dimensions.”

The little piggy said, “I know what I’m doing. This is the best way to build a DW and it’s fast and easy. Do your worst, you complexity beast.”

And the wolf threw just a few new business rules into the mix and added two new sources to an already complex SCD-II, which inflated the budget by 100% just for maintenance. The project got eaten because restarting from scratch was actually less expensive.

The second little piggy was a little better, as he had shown with his choice of building material for his house. So he said, “I’ll build my Data Warehouse as a 3NF DW. It’s a proven construct. I can then have a mart layer on top as and when I need it.”

The big bad wolf of complexity was laughing.

He knew it would take a bit more doing, but he had many weapons up his sleeve. He started by throwing in a few business rules and followed up with some cascaded updates and deletes. Then he said, “Your DW will now fail audit and compliance,” and showed the little piggy all the rules from different compliance acronyms like Basel, SOX, PCI, HIPAA, etc.

The 2nd little pig was very sad, because he had to spend extra money to ensure that the Data Warehouse was in fact compliant. The extra work had already blown his budget and he hadn’t anticipated this in the project plan.

But the wolf wasn’t done yet because he was determined.

The wolf then threw in a new system to integrate, which completely changed the dynamic of the data. Together, the compliance rework and the new system inflated the budget of an already overspent project so much that it was cheaper to just build a new one. The project was toast.

The wolf said, “I enjoyed eating that project. Extra work it was, but fun.”

The 3rd little pig thought, “Huh, these 2 did what everyone recommends and got eaten. Let’s see if I can do better”. He decided to use the Data Vault Architecture (Model and Methodology) to fight the big bad wolf (who they found out later, had many other friends).

Now, we all know that he was the one who likes to do it right and for the long term (after all, he’d used bricks for his house).

The big bad wolf had an old score to settle with the 3rd little pig, and he knew this chap was smart, so he needed to test him and see what he could throw. He started with new source applications. The Data Vault just integrated all the new data into hubs, links and satellites without changing anything that already existed, adding new pieces only as and when required.
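For the geeks reading along, here is a rough sketch of what “adding without changing” could look like. It is only an illustration, not Dan’s reference implementation: the table names (hub_customer, sat_customer_crm, hub_invoice and so on), the columns, the MD5 hash key and the use of SQLite are all assumptions made up for this example. The point is simply that the new billing source arrives as new tables, and the tables that were already there keep exactly the same definition.

```python
import hashlib
import sqlite3


def hash_key(business_key: str) -> str:
    """Derive a deterministic key from a business key (an illustrative choice)."""
    return hashlib.md5(business_key.strip().upper().encode("utf-8")).hexdigest()


def schema_snapshot(conn: sqlite3.Connection) -> dict:
    """Return {table_name: CREATE statement} for every table in the database."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")

# Initial model: one hub (business keys) and one satellite (CRM attributes).
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,
    customer_bk   TEXT NOT NULL UNIQUE,
    load_ts       TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer_crm (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_ts       TEXT NOT NULL,
    record_source TEXT NOT NULL,
    name          TEXT,
    PRIMARY KEY (customer_hk, load_ts)
);
""")
cust_hk = hash_key("C-1001")
conn.execute("INSERT INTO hub_customer VALUES (?, ?, ?, ?)",
             (cust_hk, "C-1001", "2024-01-01", "crm"))
conn.execute("INSERT INTO sat_customer_crm VALUES (?, ?, ?, ?)",
             (cust_hk, "2024-01-01", "crm", "Ada"))

before = schema_snapshot(conn)

# The wolf throws in a billing system: a new hub, a link and a satellite are
# added next to the existing tables. No ALTER TABLE on anything already loaded.
conn.executescript("""
CREATE TABLE hub_invoice (
    invoice_hk    TEXT PRIMARY KEY,
    invoice_bk    TEXT NOT NULL UNIQUE,
    load_ts       TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE link_customer_invoice (
    link_hk       TEXT PRIMARY KEY,
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    invoice_hk    TEXT NOT NULL REFERENCES hub_invoice(invoice_hk),
    load_ts       TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer_billing (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_ts       TEXT NOT NULL,
    record_source TEXT NOT NULL,
    credit_limit  REAL,
    PRIMARY KEY (customer_hk, load_ts)
);
""")
inv_hk = hash_key("INV-7")
conn.execute("INSERT INTO hub_invoice VALUES (?, ?, ?, ?)",
             (inv_hk, "INV-7", "2024-02-01", "billing"))
conn.execute("INSERT INTO link_customer_invoice VALUES (?, ?, ?, ?, ?)",
             (hash_key("C-1001|INV-7"), cust_hk, inv_hk, "2024-02-01", "billing"))
conn.execute("INSERT INTO sat_customer_billing VALUES (?, ?, ?, ?)",
             (cust_hk, "2024-02-01", "billing", 5000.0))

after = schema_snapshot(conn)
existing_unchanged = all(after[name] == sql for name, sql in before.items())
added_tables = sorted(set(after) - set(before))
print(existing_unchanged, added_tables)
```

Comparing the two snapshots is just a cheap way to check that the original hub and satellite were left alone while the model grew around them.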

The wolf was surprised.

After all, both the others had at least some trouble with it. The big bad wolf of complexity threw more new systems at the data warehouse and the 3rd little pig smiled and said, “Well, it’s already designed for this, what else have you got?”

This ticked off the wolf.

Then the wolf threw the audit and compliance weapon at the Data Vault and the 3rd little pig laughed again and said, “Don’t you know, it’s already built into the actual architecture? The methodology takes care of this.”

The wolf said that he didn’t and decided to call his friends to fight the 3rd little pig. One by one his friends started attacking the Data Vault Data Warehouse with different weapons including:

New business rules – They are applied on the way out of the data warehouse, so the pig just added them on the way out to the user-facing layers and re-generated the user-facing data sets.

Real-Time data integration – This is already possible in the Data Vault without model changes, and the Data Vault is designed to permit late binding.

Very large data volumes – The 3rd little pig easily scaled the system to MPP hardware, because the Data Vault borrows the shared-nothing concept from MPP architectures anyway.
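The first weapon in that list, new business rules, is the easiest one to picture in code. What follows is a minimal sketch under made-up assumptions: RAW_ORDERS, build_mart, rule_v1 and rule_v2 are hypothetical names, and the “large order” rule is invented for the example. It only shows the idea that the raw data stays as loaded, the rule is applied on the way out, and a changed rule just means regenerating the user-facing data set.

```python
# Raw rows as loaded into the warehouse; no business rule has touched them.
RAW_ORDERS = (
    {"order_id": 1, "amount": 120.0, "country": "DE"},
    {"order_id": 2, "amount": 40.0, "country": "US"},
)


def rule_v1(row: dict) -> dict:
    """Original (hypothetical) rule: orders of 100 or more are 'large'."""
    return {"tier": "large" if row["amount"] >= 100 else "small"}


def rule_v2(row: dict) -> dict:
    """New (hypothetical) rule from the wolf: the threshold drops to 30."""
    return {"tier": "large" if row["amount"] >= 30 else "small"}


def build_mart(raw_rows, rule):
    """Build a user-facing data set by applying a rule to copies of the raw rows."""
    return [{**row, **rule(row)} for row in raw_rows]


mart_v1 = build_mart(RAW_ORDERS, rule_v1)
# The rule changes: regenerate the mart, leave the raw rows alone.
mart_v2 = build_mart(RAW_ORDERS, rule_v2)

print([row["tier"] for row in mart_v1])
print([row["tier"] for row in mart_v2])
```

Because the rule lives in the step that builds the mart, swapping rule_v1 for rule_v2 never requires reloading or rewriting what is stored underneath.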

The big bad wolf and his friends were getting tired.

The big bad wolf and his friends kept trying to eat the DW project, but didn’t succeed, because the 3rd little pig had a solid Data Warehouse backbone with the Data Vault and it was much easier to build and manage than the alternatives. It also stood the test of time just like it has for close to 20 years now (That’s how old the Data Vault is). Also, unlike the alternatives, subsequent projects were less expensive because of the solid foundation already built.

The 2 other pigs realized their follies and asked the 3rd little pig to teach them how to implement a Data Warehouse as a Data Vault and he said, “Hey, why learn from me, when you can learn from the inventor of the architecture – Dan Linstedt himself?”

Get a FREE introduction to the Data Vault Architecture

Introduction to the Data Vault (FREE) Training

One of the students who has already been through the course said, “Thank you for keeping this online. I don’t have any time right now, but I’ll start watching the videos from next month. I’d rather learn directly from you and it’s really very valuable”. The format helped him learn at his own pace and in his own time.

And talking about format, another student said, “This is the best Data Vault training I’ve ever attended AND I LOVE the delivery format”. Words like these inspire Dan to do more, and he’s actually prepping some more courses with a focus on Agile BI using the Data Vault.