This is a story from Tom Breur’s fantastic presentation at WWDVC (circa 2014). By the way, if you haven’t already booked your tickets to this year’s event, there are seats still available here -> Registration
He was praising star-schemas as being the closest to the business (which is true for many use cases). Then he went on to explain how a typical BUS architecture consultant worked to consolidate their conformed dimensions, figuring out all the common elements and dimensions needed to build their solution.
They worked really, really hard.
Talked to the business users.
Mapped it to the various data objects and figured out the conformed dimension map in the grand scheme of things.
It was ready!
It took a good three months.
It was presented to the business.
It was a spreadsheet. Yes, a spreadsheet of the dimension bus with estimates of what it would take to build each one, the complexities involved, the sources they would need, etc.
They hadn’t even encountered ANY real data yet.
They hadn’t even encountered ANY of the complications the real data sets would bring.
By the way … so far, guess how much the company had already spent?
A whopping $300,000 USD
For … a … frickin … spreadsheet!
Unfortunately, this is more the norm than the exception, especially with any architecture that requires ALL your assumptions to be made up front. Can you imagine what would happen once they start profiling the data and discover they can’t match the grain from two sources in one of the shared dimensions?
Oh, worse things have happened in reality.
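To make that grain problem concrete, here is a minimal sketch of the kind of profiling check that would surface it on day one instead of month four. Everything in it is an assumption for illustration: the two sources ("crm" and "billing"), the customer_id key, the sample rows and the helper names are all made up.

```python
from collections import Counter

def key_counts(rows, key_columns):
    """Count how many rows share each candidate key value."""
    return Counter(tuple(row[c] for c in key_columns) for row in rows)

def grain_conflicts(rows, key_columns):
    """Return the key values that occur more than once.

    Any entry here means the source sits at a finer grain than the
    candidate key, so it cannot feed a dimension keyed on it as-is.
    """
    return {key: n for key, n in key_counts(rows, key_columns).items() if n > 1}

# Hypothetical sample rows from two sources that are supposed to share
# one conformed customer dimension.
crm_customers = [
    {"customer_id": "C1", "name": "Acme"},
    {"customer_id": "C2", "name": "Globex"},
]
billing_customers = [
    {"customer_id": "C1", "site": "North"},
    {"customer_id": "C1", "site": "South"},
    {"customer_id": "C2", "site": "HQ"},
]

crm_conflicts = grain_conflicts(crm_customers, ["customer_id"])
billing_conflicts = grain_conflicts(billing_customers, ["customer_id"])

print(crm_conflicts)      # {}
print(billing_conflicts)  # {('C1',): 2}
```

In this made-up case the CRM source is one row per customer while billing is one row per customer site, which is exactly the kind of mismatch a spreadsheet of planned dimensions will never show you.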
Now, in our world where we use the Data Vault 2.0 model as the foundation EVEN IF the front-end marts are star-schemas (sometimes even virtualized) …
First thing is … DO NOT ASSUME!
Then we try to build a hub and immediately expose it to a technical business user. And we work on vertical scoping (something you should be doing with any architecture … we’ll talk about that another time though).
We expose the raw data.
We throw what they have in their face first and say …
Hey, this is what you have. What do you want to do with it? (Not the exact words … I’m being a bit playful here. Please use proper tact and diplomacy as well as professionalism or you’ll risk getting booted).
It helps us understand what they have.
It helps them see what they really have.
And it helps not just with expectation setting, but also with getting them to start thinking about what they want to do with this data, its limitations, its uses and potential, etc. Heck, you can even do raw star-schemas projected from the sets so they can play with them, even during development.
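For the curious, here is a rough sketch of what "build a hub, then project a raw dimension from it" could look like. It is not the official Data Vault 2.0 loading pattern, just an illustration under stated assumptions: the hash function (SHA-256), the trim and upper-case normalization rule, the column names and the sample rows are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone

def hash_key(business_key):
    """Hash a normalized business key (trim + upper-case is an assumed rule)."""
    normalized = business_key.strip().upper()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def load_hub(hub, raw_rows, key_column, record_source, load_date):
    """Insert each business key once; keys already in the hub are left untouched."""
    for row in raw_rows:
        hk = hash_key(row[key_column])
        hub.setdefault(hk, {
            "business_key": row[key_column].strip(),
            "load_date": load_date,
            "record_source": record_source,
        })
    return hub

def raw_dimension(hub, raw_rows, key_column):
    """Project an uncleansed dimension: one row per source row, keyed by the
    hub hash key, with the source attributes passed through as they are."""
    dim = []
    for row in raw_rows:
        hk = hash_key(row[key_column])
        attributes = {c: v for c, v in row.items() if c != key_column}
        dim.append({"hub_hk": hk, "business_key": hub[hk]["business_key"], **attributes})
    return dim

# Hypothetical raw rows, warts and all: the same customer appears twice
# with a differently formatted key and a different name.
raw_customers = [
    {"customer_id": "c1 ", "name": "Acme"},
    {"customer_id": "C1", "name": "ACME Corp"},
    {"customer_id": "C2", "name": "Globex"},
]

hub_customer = load_hub({}, raw_customers, "customer_id", "crm",
                        datetime.now(timezone.utc))
dim_customer_raw = raw_dimension(hub_customer, raw_customers, "customer_id")

print(len(hub_customer))      # 2
print(len(dim_customer_raw))  # 3
```

Two hub rows but three raw dimension rows: the business user gets to see, with their own eyes, that "Acme" and "ACME Corp" are the same key with different descriptions, and that conversation happens before anyone has committed to a conformed dimension.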
On the plus side, they end up becoming fantastic clients, because they end up being extremely happy with you.
Now, exposing real data can certainly cause some “rifts” with other people, such as those responsible for creating the source systems, DBAs, project managers, etc. The key, though, is to take them along on the journey and to encourage and help them move the organization forward instead of playing a “blame game”.
You’re the “solution” guy or gal, remember.
Dan Linstedt & Sanjay Pande
PS: This year we have 3 vendors, all involved in DV projects for different things, doing hands-on pre-conference sessions. The sessions are definitely not worth missing, as there are going to be real case studies, experimental architectures, data mining, MPP, Hadoop and Big Data … Lots to think about. So before they’re all gone, book your seats here -> WWDVC Registration