Unity Catalog in Empower

What is Unity Catalog?

Unity Catalog is a unified security and governance feature set built by Databricks. It provides several critical features that make Empower a more robust data estate solution.

UC enables Delta Sharing, which lets you securely share read-only copies of your data both internally and with your partners and associates, without requiring them to be on the Databricks platform. UC also lets you create row filters and column masks in SQL for different users and groups, giving you fine-grained control over data security and access.
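As a rough illustration of the row filter and column mask feature described above, the sketch below builds the kind of SQL statements involved. It only produces strings and does not connect to Databricks. The table, column, function, and group names are hypothetical, and the statements follow the Databricks SQL syntax as we understand it (a SQL function attached with SET ROW FILTER or SET MASK); check them against the Databricks documentation before use.

```python
def row_filter_statements(table, column, function_name, group, allowed_value):
    """Build SQL that defines a row filter function and attaches it to a table.

    All names passed in are hypothetical examples and are assumed to be
    trusted identifiers (no escaping is performed here).
    """
    create_fn = (
        f"CREATE OR REPLACE FUNCTION {function_name}({column} STRING) "
        f"RETURN IF(is_account_group_member('{group}'), true, "
        f"{column} = '{allowed_value}')"
    )
    attach = f"ALTER TABLE {table} SET ROW FILTER {function_name} ON ({column})"
    return [create_fn, attach]


def column_mask_statements(table, column, function_name, group, masked_value="***"):
    """Build SQL that defines a column mask function and attaches it to a column."""
    create_fn = (
        f"CREATE OR REPLACE FUNCTION {function_name}({column} STRING) "
        f"RETURN CASE WHEN is_account_group_member('{group}') "
        f"THEN {column} ELSE '{masked_value}' END"
    )
    attach = f"ALTER TABLE {table} ALTER COLUMN {column} SET MASK {function_name}"
    return [create_fn, attach]


if __name__ == "__main__":
    # Members of the hypothetical "admins" group see every row; others see only US rows.
    for statement in row_filter_statements("sales", "region", "region_filter", "admins", "US"):
        print(statement)
    # Members of the hypothetical "hr" group see the real value; others see "***".
    for statement in column_mask_statements("customers", "ssn", "ssn_mask", "hr"):
        print(statement)
```

In a Databricks notebook, statements like these would typically be run one at a time, for example through spark.sql(...), by a user with sufficient privileges on the table.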

Unity Catalog ships with the Empower 1.14 release for all customers, so a few steps are required to prepare the foundation for the 1.14 deployment.

Initial Configuration Values

Steps

  1. Provide the Empower-Service service principal with Databricks Account Admin access

    1. Note: You will need an account with either Databricks Account Admin access or Azure Global Access Administrator permissions.
    2. Log in to the account console at https://accounts.azuredatabricks.net.
    3. Perform this action through the Databricks Account Console:
      1. In a Databricks workspace, navigate to the top right drop-down menu and select "Manage Account."
      2. Under the "User Management" tab, click on the "Service Principals" tab and click the blue button on the right-hand side of the screen to add a new service principal.
      3. Name the service principal "Empower-Service" and enter the service principal's Object ID in the UUID field. (The Object ID is different for each directory; please look it up in AAD.)
      4. Once the service principal is added, click on its name and navigate to the "Roles" tab. Then enable "Account Admin."
  2. Create a resource group for the Databricks Metastore

Note: This step is optional; you may ask Hitachi Solutions to create the resource group on your behalf.

  • This step must be repeated for every Azure region where your company has an Empower solution deployment.
  1. Keep in mind the following considerations:
    1. The metastore will be global for all Databricks workspaces in a given Azure region. Therefore, the resource group name need not be Empower-specific, for example: [company name]-[region]-unitycatalog.
    2. The name will be immutable. Once deployed, the Unity Catalog cannot be moved.
    3. The name should conform to your company's resource group naming conventions.
  2. Please grant a Hitachi team member at least the Reader role on this resource group.
  3. Please grant Empower-Service the Owner role on this resource group.
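The naming considerations above can be sketched as a small helper that builds one resource group name per Azure region using the [company name]-[region]-unitycatalog pattern. This is only an illustration: the function names, the company name, and the regions are hypothetical, and the character check is a conservative subset of Azure's resource group naming rules (up to 90 characters; letters, digits, underscores, hyphens, periods, and parentheses; no trailing period). Your company's own naming conventions take precedence.

```python
import re

# Conservative subset of Azure's resource group naming rules (an assumption
# for this sketch): 1-90 characters from the set below, not ending in a period.
_RESOURCE_GROUP_PATTERN = re.compile(r"[A-Za-z0-9_.()-]{1,90}")


def metastore_resource_group_name(company, region):
    """Build a name following the [company name]-[region]-unitycatalog pattern."""
    name = f"{company}-{region}-unitycatalog"
    if not _RESOURCE_GROUP_PATTERN.fullmatch(name) or name.endswith("."):
        raise ValueError(f"Invalid resource group name: {name!r}")
    return name


def names_for_regions(company, regions):
    """One resource group is needed per Azure region with an Empower deployment."""
    return {region: metastore_resource_group_name(company, region) for region in regions}


if __name__ == "__main__":
    # "contoso" and the regions are placeholder values.
    for region, name in names_for_regions("contoso", ["eastus", "westeurope"]).items():
        print(region, "->", name)
```

Because the name is immutable once deployed, it is worth checking it before the resource group is created.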

Required Values for Deployment

Please provide the following values to your Hitachi Empower contact for each metastore deployment:

  1. Resource group name created in Step 2
  2. Subscription ID that holds the resource group
  3. Metastore naming: Provide three names for the metastore resources, which our team will deploy.
    1. Metastore name (lowercase alphanumeric characters and hyphens). Suggested name: {location}-metastore
    2. Metastore storage account name (lowercase alphanumeric characters only). Suggested name: {location}metastore
    3. Metastore storage account container name (lowercase alphanumeric characters and hyphens). Suggested name: {location}-metastore
  4. A /24 block of IP addresses for the Unity Catalog VNet. The VNet itself does not need to be deployed; we only require an unused block of IP addresses for our automated deployment.
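Before sending these values, it can help to check them locally. The sketch below generates the suggested names for a location and validates them against the character rules listed above, and checks that an address range is a well-formed IPv4 /24 block. The function names and sample values are hypothetical. The length limits (3-24 characters for storage accounts, 3-63 for containers, no leading, trailing, or consecutive hyphens in container names) come from Azure's general naming rules rather than from this guide. The check cannot tell whether the address block is actually unused in your network.

```python
import ipaddress
import re

# Character rules from the list above; length limits are Azure's general rules.
METASTORE_NAME = re.compile(r"[a-z0-9-]+")
STORAGE_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")
CONTAINER_NAME = re.compile(r"(?=.{3,63}$)[a-z0-9]+(?:-[a-z0-9]+)*")


def suggested_names(location):
    """Return the suggested names for a given location, e.g. 'eastus'."""
    return {
        "metastore": f"{location}-metastore",
        "storage_account": f"{location}metastore",
        "container": f"{location}-metastore",
    }


def validate_names(names):
    """Return a list of problems; an empty list means every name passed."""
    rules = {
        "metastore": METASTORE_NAME,
        "storage_account": STORAGE_ACCOUNT_NAME,
        "container": CONTAINER_NAME,
    }
    problems = []
    for key, pattern in rules.items():
        value = names.get(key, "")
        if not pattern.fullmatch(value):
            problems.append(f"{key}: {value!r} does not match the naming rules")
    return problems


def is_slash_24(cidr):
    """True if cidr is a valid IPv4 network with a /24 prefix and no host bits set."""
    try:
        network = ipaddress.ip_network(cidr, strict=True)
    except ValueError:
        return False
    return network.version == 4 and network.prefixlen == 24


if __name__ == "__main__":
    names = suggested_names("eastus")  # placeholder location
    print(names, validate_names(names))
    print(is_slash_24("10.20.30.0/24"))  # placeholder address block
```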

During Deployment

There are a few steps that our delivery team will need to perform to prepare your environment for Empower v1.14. Note that some of these steps may have already been performed by our delivery team for your environments. Please consult your delivery manager.

These steps are listed below.

Hotswap the Databricks Workspaces

Each Databricks workspace used by Empower in your deployment must be swapped to a Unity Catalog compliant workspace. All your notebooks, workflows, and tables will be maintained in the new workspace.

We expect a downtime window of 1-2 hours, and our delivery team will coordinate a suitable time for it with your team.

Upgrade to Empower's Model Builder 2.0

Empower's Model Builder 1.0 does not support Unity Catalog, so we plan to sunset it in the new release and upgrade every customer to Empower's Model Builder 2.0, which also brings significant performance improvements.

The conversion is mostly silent and only takes minutes to perform. Additional verification and testing could take more time, but we do not anticipate any measurable downtime during this part of the deployment.

Clone and Prepare Dimensions and Facts Building Notebooks

This phase of the deployment requires members of our delivery team to clone and modify any existing dimension and fact notebooks to handle the new UC conventions.

No downtime is required for this phase, and it can be completed in advance. How long it takes depends on the number and complexity of your notebooks.

Identify other Customizations

This phase is highly environment-specific: it depends on which customizations have been made to the Empower platform and whether the UC rollout affects them. If these customizations mean your environment will need extra time to bring online, we will communicate that to you if we have not done so already.

One example is unmanaged notebooks. These notebooks must be converted to be Unity Catalog compliant before they can be used in your new environment.