Staging Data Transformation with Notebooks
Follow this guide to create the staging notebooks required for your model.
Data Acquisition
This guide assumes that you have already completed the Data Acquisition process. See the Data Acquisition guide for more details if necessary.
What are Staging Notebooks?
Staging notebooks produce the staging data needed for modeling, sourced from the silver layer of the Medallion architecture.
Each entity in the model, whether a dimension or a fact, needs its own dedicated notebook, because dimensions and facts have distinct data processing requirements. These source query notebooks provide a structured framework for working with the individual data entities in the model.
Navigate to the folder /Workspace/DmBuild in the Databricks instance. Check whether source query notebooks exist for your dimensions and facts under /Workspace/DmBuild/Dimensions and /Workspace/DmBuild/Facts respectively.
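The check described above can be sketched in Python. This is a minimal sketch, assuming you already have a list of notebook paths from the workspace (for example, copied from the workspace browser). The function name find_missing_notebooks, the entity names, and the convention that each notebook is named after its entity are hypothetical and used only for illustration.

```python
from posixpath import join as path_join

# Folder paths taken from this guide.
DIMENSIONS_DIR = "/Workspace/DmBuild/Dimensions"
FACTS_DIR = "/Workspace/DmBuild/Facts"


def find_missing_notebooks(entities, folder, existing_paths):
    """Return the entities that have no notebook under `folder`.

    Assumption (hypothetical): each notebook is named after its entity.
    """
    existing = set(existing_paths)
    return [e for e in entities if path_join(folder, e) not in existing]


# Hypothetical workspace listing; replace with the paths you actually see.
existing_paths = [
    "/Workspace/DmBuild/Dimensions/SourceQueryTemplate",
    "/Workspace/DmBuild/Dimensions/Customer",
    "/Workspace/DmBuild/Facts/SourceQueryTemplate",
]

# Hypothetical entity names for a small model.
missing_dimensions = find_missing_notebooks(
    ["Customer", "Product"], DIMENSIONS_DIR, existing_paths
)
missing_facts = find_missing_notebooks(["Sales"], FACTS_DIR, existing_paths)

print(missing_dimensions)  # ['Product']
print(missing_facts)       # ['Sales']
```

Any entity reported as missing needs a notebook created for it before staging data can be produced.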
If notebooks are not available, create a separate notebook for each entity in the model. Use the source query template named SourceQueryTemplate, located within the Dimensions and Facts folders, as a starting point for each staging notebook.
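Before cloning the template, it can help to write down which copies are needed. The following is a minimal sketch, assuming each new notebook is named after its entity and is created next to the template in the same folder. The function name plan_template_copies and the entity names are hypothetical. The sketch only builds a list of source and target paths and does not touch the workspace.

```python
from posixpath import join as path_join

# Template name taken from this guide.
TEMPLATE_NAME = "SourceQueryTemplate"


def plan_template_copies(missing_by_folder):
    """Map each missing entity to a (template_path, target_path) pair.

    Assumption (hypothetical): the target notebook is named after the entity
    and lives in the same folder as the template it is cloned from.
    """
    plan = []
    for folder, entities in missing_by_folder.items():
        template_path = path_join(folder, TEMPLATE_NAME)
        for entity in entities:
            plan.append((template_path, path_join(folder, entity)))
    return plan


# Hypothetical missing entities per folder.
copy_plan = plan_template_copies(
    {
        "/Workspace/DmBuild/Dimensions": ["Product"],
        "/Workspace/DmBuild/Facts": ["Sales"],
    }
)

for source, target in copy_plan:
    print(f"clone {source} -> {target}")
```

Each pair in the plan corresponds to one clone of SourceQueryTemplate that you then edit for the specific dimension or fact.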