Data Set Encapsulation for Data Lake (Data Mesh)
Dave Does Demos

Published on Feb 5, 2021

In this video I talk about how you should consider structuring your data lake. Data lake structure is often described as layers (Bronze, Silver, Gold, or Raw, Standardised, Modelled, Curated), but I find it more useful to treat data sets as encapsulated things that stand alone. Just because you have two raw data sets, there is no reason they would have the same requirements, so don't bind yourself to a structure that limits choice without good reason. This encapsulation is particularly useful when doing DataOps, Agile and CI/CD with your data lake platform. The ideas discussed here align well with the data mesh concept explained at https://martinfowler.com/articles/dat...

0:00 - Introduction
1:29 - Why encapsulation?
6:16 - Datasets
6:37 - Data Contracts
8:13 - Performance
8:46 - Availability
9:27 - Compliance
10:29 - Lifecycle Management
11:12 - Data Transformations
12:52 - Access Control
14:12 - Transitional Datasets and layers
15:15 - Recap on why we encapsulate
16:47 - Wrap-up
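
As a rough illustration of the chapters above, here is a minimal Python sketch of what an encapsulated data set description might look like. It is not taken from the video: the class name DatasetSpec, its fields, and the two example data sets are all hypothetical, chosen only to show that each data set can carry its own contract, performance, availability, compliance, lifecycle and access requirements instead of inheriting them from a layer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetSpec:
    """Hypothetical descriptor for one encapsulated data set."""
    name: str
    schema: dict[str, str]        # data contract: column name -> type
    max_latency_minutes: int      # performance: how fresh the data must be
    availability_target: float    # availability, e.g. 0.99
    classification: str           # compliance, e.g. "public" or "pii"
    retention_days: int           # lifecycle management
    readers: frozenset[str] = field(default_factory=frozenset)  # access control

    def can_read(self, principal: str) -> bool:
        """Access is decided per data set, not per layer."""
        return principal in self.readers

    def contract_violations(self, record: dict) -> list[str]:
        """Return the contract columns that are missing from a record."""
        return sorted(col for col in self.schema if col not in record)


# Two raw data sets with different requirements, described independently.
clickstream = DatasetSpec(
    name="raw_clickstream",
    schema={"user_id": "string", "url": "string"},
    max_latency_minutes=5,
    availability_target=0.999,
    classification="pii",
    retention_days=30,
    readers=frozenset({"web-analytics"}),
)
weather = DatasetSpec(
    name="raw_weather",
    schema={"station": "string", "temp_c": "double"},
    max_latency_minutes=1440,
    availability_target=0.95,
    classification="public",
    retention_days=3650,
    readers=frozenset({"forecasting"}),
)
```

Both data sets are "raw", yet nothing forces them to share a retention period, an availability target, or a reader list. A descriptor like this could also live in source control next to the data set's pipeline code, which is one way the approach might fit a CI/CD workflow.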

For all of my other demos, go to https://davedoesdemos.com or go straight to the GitHub page at https://github.com/davedoesdemos/Demo.... Also please subscribe to the channel to make sure the latest demos show up in your playlist!
