How to convince your boss to use Data Vault


If you ask any experienced Data Vault (DV) architect what the hardest part of implementing a DV solution is, there is a good chance the answer will be convincing the stakeholders to use the “new” Data Vault concept.

Basically, the data warehouse (DW) world has a very poor ratio of successful projects, so nobody is willing to “try” new things. The DV methodology has real benefits, but one drawback is that the community is small and few experts are available on the market, so it is normal for people to have some fears about it.

So I would like to share how I managed to convince one of my recent clients to implement this kind of solution.


1) Problem statement


  • Long initial delivery phase per project
  • Fully manual process
  • Difficult to maintain and fix
  • Reports are not reusable across different businesses
  • Not possible to adjust quickly to business requirements
  • Unable to do automated data lineage
  • For new projects we need to reverse engineer from scratch to find the EBOs/meaningful data


2) Current situation


  • Reverse engineering existing data marts can take a very long time because of unclear data lineage and a lack of documentation
  • Any new report requires an analysis of the databases and tables to find the required EBOs
  • Manual creation and maintenance of ETL is a mandatory step
  • A long initial delivery phase can lead to refactoring and delay the project considerably
  • Customers don’t have an easy way to create their own jobs and extract the data they need


3) To be (hypothesis)


  • Short initial delivery phase
  • Automation can be applied (ELT and DDL)
  • Easy to maintain and troubleshoot
  • Easy to reuse (80% reusable vs. 20% non-reusable)
  • Faster deliveries (quicker iteration between delivery team and business) will help to shape the end result as the business expects
  • Generic reports can be reused for all businesses
  • Fully automated data lineage
  • Every time the EDW project covers one more EBO, all the departments can use it in a very simplified and easy way, so they don’t need to reverse engineer
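
To make the automation point a bit more concrete, here is a minimal Python sketch of DDL generation driven by metadata. It is only an illustration under my own assumptions: the `HubSpec` class, the `hub_ddl` function, the `hub_` prefix and the column names (`_hk`, `load_date`, `record_source`) are hypothetical and follow common Data Vault conventions, not the framework used on the client project. The generated statements are run against an in-memory SQLite database just to check that they are valid.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class HubSpec:
    """Hypothetical metadata describing one hub (one business object)."""
    name: str          # business object name, e.g. "customer"
    business_key: str  # column that holds the business key


def hub_ddl(spec: HubSpec) -> str:
    """Render a CREATE TABLE statement for a hub from its metadata."""
    return (
        f"CREATE TABLE hub_{spec.name} (\n"
        f"    {spec.name}_hk TEXT PRIMARY KEY,\n"      # hash key
        f"    {spec.business_key} TEXT NOT NULL,\n"    # business key
        f"    load_date TEXT NOT NULL,\n"
        f"    record_source TEXT NOT NULL\n"
        f");"
    )


# Adding a new business object means adding one line of metadata,
# not writing a new table definition by hand.
specs = [HubSpec("customer", "customer_number"), HubSpec("product", "sku")]
ddl_statements = [hub_ddl(s) for s in specs]

# Sanity check: execute the generated DDL in an in-memory database.
conn = sqlite3.connect(":memory:")
for stmt in ddl_statements:
    conn.execute(stmt)

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The same idea can be extended to links, satellites and loading code; the point is that the repetitive structure of the model is what makes it possible to generate rather than hand-write it.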


4) Assertions


  • Applying automation will reduce human error and increase speed
  • Having a single, well-structured repository for all our data will help other teams create their marts easily
  • Reusing the same generic report across multiple services will save resources
  • Creating or modifying virtual marts will be faster than physical marts
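
The last assertion, about virtual marts, can be illustrated with a small Python and SQLite sketch. All table, view and column names here (`hub_customer`, `sat_customer`, `mart_customer`, and the `create_mart` helper) are made up for the example, and the model is deliberately tiny. It shows a mart defined as a view over a hub and a satellite, then redefined with an extra column without touching the stored data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw vault: one hub and one satellite with history.
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk TEXT PRIMARY KEY,
    customer_number TEXT NOT NULL
);
CREATE TABLE sat_customer (
    customer_hk TEXT NOT NULL,
    load_date TEXT NOT NULL,
    name TEXT,
    country TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
INSERT INTO hub_customer VALUES ('h1', 'C-001'), ('h2', 'C-002');
INSERT INTO sat_customer VALUES
    ('h1', '2020-01-01', 'Alice', 'ES'),
    ('h1', '2020-02-01', 'Alice', 'FR'),
    ('h2', '2020-01-15', 'Bob', 'DE');
""")


def create_mart(conn, columns):
    """(Re)create the virtual mart: latest satellite row per customer."""
    cols = ", ".join(f"s.{c}" for c in columns)
    conn.executescript(f"""
    DROP VIEW IF EXISTS mart_customer;
    CREATE VIEW mart_customer AS
    SELECT h.customer_number, {cols}
    FROM hub_customer h
    JOIN sat_customer s ON s.customer_hk = h.customer_hk
    WHERE s.load_date = (
        SELECT MAX(x.load_date) FROM sat_customer x
        WHERE x.customer_hk = s.customer_hk
    );
    """)


# First version of the mart exposes only the name.
create_mart(conn, ["name"])
v1 = conn.execute("SELECT * FROM mart_customer ORDER BY customer_number").fetchall()

# The business asks for the country: only the view definition changes.
create_mart(conn, ["name", "country"])
v2 = conn.execute("SELECT * FROM mart_customer ORDER BY customer_number").fetchall()

# The underlying satellite rows are untouched by the mart change.
sat_rows = conn.execute("SELECT COUNT(*) FROM sat_customer").fetchone()[0]
```

With a physical mart, the same change would normally mean altering a table and reloading it; here the change is a new view definition, and no data is moved.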


5) Criteria of Success


  • Ability to deliver value on a biweekly basis
  • Enterprise business objects can be reused
  • Ability to run the same report for different businesses
  • A new data consumer should not have to reverse engineer the EBO for each business, because a useful, easy-to-use EBO is already built in the EDW project
  • Reduce the average time to production for one business to 2/3 of the current time


6) Definition of done


  • The framework will automate the ingestion of data; we only provide one view to feed our EDW
  • A minimum number of tables and links will provide enough data to start reporting
  • Generic reports will show the same generic insights across every business already modelled in the EDW


7) Comparison





| Criterion | Current situation | To be (Data Vault) |
|---|---|---|
| Initial delivery phase | Long initial phase | Short initial phase |
| Creation of ETL/tables | Manual process + lifecycle | Automated |
| Maintenance | Difficult: modify physical tables and check impact | Very easy: modify virtual tables (views), no impact on current production |
| Reusability | 20% reusable / 80% non-reusable | 80% reusable / 20% non-reusable |
| Data lineage | Not available | Fully integrated |
| Initial analysis | Requires checking all databases/tables to find the EBO every time | We need to find the EBO only once, then shape it for easy consumption |
| Agile | Not possible to deliver quickly | Deliver value every 2 weeks |



EBO = Enterprise Business Object
DDL = Data Definition Language
ELT = Extract Load Transform
ETL = Extract Transform Load
EDW = Enterprise Data Warehouse

