Databricks


Connector Details

| Connector Attributes | Details |
| --- | --- |
| Name | Databricks |
| Description | The Databricks platform is a unified analytics solution designed to accelerate data-driven innovation by enabling seamless collaboration between data engineers, data scientists, and business analysts. It provides a powerful environment for processing, analyzing, and visualizing large-scale data with built-in support for machine learning and AI workloads. Leveraging Apache Spark at its core, Databricks simplifies the development of complex data pipelines, while ensuring scalability and performance. With integrated data lakes, real-time streaming, and collaborative notebooks, Databricks empowers organizations to extract actionable insights, optimize workflows, and drive transformative outcomes across various industries. |
| Connector Type | Class B |

Features

| Feature Name | Feature Details |
| --- | --- |
| Load Strategies | Full Load |
| Metadata Extraction | Supported |
| Data Acquisition | Supported |
| Data Publishing | Not Supported |
| Automated Schema Drift Handling | Not Supported |

📘

Delta Sharing vs Data Copy

Databricks Unity Catalog natively supports Delta Sharing. Our Databricks connector does not use this technology, because it is already available within the Databricks Workspace experience (see documentation here). If you want to use Delta Sharing instead of a data copy, simply follow the steps in the Delta Sharing documentation referenced above.

Our connector copies data from one catalog to another. Beyond giving you a modifiable copy of the data rather than a read-only Delta Share, the Empower Databricks connector also lets you take advantage of the Type 2 history stored in your bronze layer tables without having to rely solely on Time Travel.
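
For illustration, the hedged sketch below contrasts a Delta Time Travel query with a query over Type 2 history columns. The table and column names (`bronze.customers`, `effective_from_ts`, `effective_to_ts`) are placeholders, not Empower's actual bronze-layer schema; substitute the names used in your own tables.

```python
# Illustration only: point-in-time reads via Delta Time Travel vs. Type 2 history columns.
# Table and column names below are hypothetical placeholders; with Unity Catalog you may
# need the full catalog.schema.table name.
from databricks import sql  # pip install databricks-sql-connector

with sql.connect(
    server_hostname="<your-server-hostname>",
    http_path="<your-http-path>",
    access_token="<your-databricks-token>",
) as conn:
    with conn.cursor() as cur:
        # Time Travel: rebuilds the table as of a point in time, but only within the
        # Delta log's retention window.
        cur.execute(
            "SELECT * FROM bronze.customers TIMESTAMP AS OF '2024-01-01 00:00:00'"
        )
        print(cur.fetchall())

        # Type 2 history: the same point-in-time view, answered from history rows that
        # the bronze layer keeps as regular data, with no retention-window dependency.
        cur.execute(
            """
            SELECT * FROM bronze.customers
            WHERE effective_from_ts <= '2024-01-01 00:00:00'
              AND (effective_to_ts > '2024-01-01 00:00:00' OR effective_to_ts IS NULL)
            """
        )
        print(cur.fetchall())
```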

Source Connection Attributes

| Connection Parameters | Data Type | Example |
| --- | --- | --- |
| Connection Name | String | DATABRICKS |
| Server hostname | String | Server Hostname |
| Token | String | <your-databricks-token-here: dapi*> |
| HTTP Path | String | HTTP Path |
| Catalog | String | Catalog name |
| Silver Schema (Optional) | String | |
| Bronze Schema (Optional) | String | |
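
As a quick sanity check, the sketch below shows how these attributes map onto a connection made with the open-source databricks-sql-connector Python package. This is not how the Empower connector is implemented; it only confirms that the values you plan to enter are valid. All hostname, path, token, and catalog values are placeholders.

```python
# Connectivity check for the attributes above (pip install databricks-sql-connector).
from databricks import sql

SERVER_HOSTNAME = "<your-server-hostname>"            # Server hostname
HTTP_PATH = "<your-http-path>"                        # HTTP Path (cluster Advanced options)
ACCESS_TOKEN = "<your-databricks-token-here: dapi*>"  # Token
CATALOG = "<catalog-name>"                            # Catalog

with sql.connect(
    server_hostname=SERVER_HOSTNAME,
    http_path=HTTP_PATH,
    access_token=ACCESS_TOKEN,
) as conn:
    with conn.cursor() as cur:
        cur.execute(f"USE CATALOG {CATALOG}")
        cur.execute("SHOW SCHEMAS")  # the Bronze/Silver schemas you plan to use should appear here
        for row in cur.fetchall():
            print(row[0])
```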

Connector Specific Configuration Details

  1. The Databricks connector has optional parameters, such as Bronze Schema and Silver Schema.

  2. If you are using Delta Sharing, ensure that the PROVIDER has assigned the correct permissions on the source catalogs, tables, etc., that the RECIPIENT will access. At a minimum, read access is required. You can verify this by running a simple SELECT query (see the permission-check sketch after this list).

  3. The cluster you will use with Databricks should be set up with Unity Catalog.

  4. Generate an access token for the user:

    1. Log in to Databricks: go to your Databricks workspace URL and log in.
    2. Open User Settings: once logged in, click your user profile icon in the upper-right corner of the screen and, from the dropdown menu, click User Settings.
    3. Generate a new token: under the Access Tokens tab, click the Generate New Token button. In the dialog box, provide a comment or description for the token (optional but recommended) and, optionally, an expiration date. If no expiration date is set, the token lasts for the default period, which varies by workspace configuration. Click Generate.
    4. Copy and save the token: copy the token immediately after generating it, as it is shown only once, and store it securely (e.g., in a password manager or another secure location).
    5. Use the token: you can use it in various APIs, SDKs, or CLI commands to authenticate with Databricks (see the token-check sketch after this list).
  5. Open the list of available clusters, choose yours, and copy the following values from the cluster settings:

    1. HTTP Path, from Advanced options.
    2. Server hostname, from Advanced options.
  6. Get the catalog name from Catalog Explorer, and make sure to check the permissions: you need sufficient permissions to read both the catalog and the schema (see the permission-check sketch after this list).

  7. More details around this connector
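
The following permission-check sketch pulls together the checks from steps 2, 3, and 6 above: confirming the compute resource is attached to a Unity Catalog metastore, confirming the catalog and its schemas are visible, and confirming that a simple SELECT succeeds against a table the RECIPIENT should be able to read. All catalog, schema, and table names are placeholders.

```python
# Permission-check sketch (pip install databricks-sql-connector). Placeholders throughout.
from databricks import sql

with sql.connect(
    server_hostname="<your-server-hostname>",
    http_path="<your-http-path>",
    access_token="<your-databricks-token>",
) as conn:
    with conn.cursor() as cur:
        # Step 3: a Unity Catalog-enabled cluster or warehouse can resolve its metastore.
        cur.execute("SELECT current_metastore()")
        print(cur.fetchone())

        # Step 6: the catalog and its schemas must be visible to this principal.
        cur.execute("SHOW SCHEMAS IN <catalog-name>")
        print(cur.fetchall())

        # Step 2: at a minimum, read access is required -- a simple SELECT should succeed.
        cur.execute("SELECT * FROM <catalog-name>.<schema-name>.<table-name> LIMIT 10")
        print(cur.fetchall())
```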
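
If you want to confirm that the personal access token generated in step 4 works before entering it in the connection, one option is the token-check sketch below, which calls an authenticated Databricks REST endpoint (the Clusters API is used here, but any authenticated endpoint would do). Hostname and token are placeholders.

```python
# Token-check sketch: verify the personal access token against the workspace REST API.
import requests

SERVER_HOSTNAME = "<your-server-hostname>"
ACCESS_TOKEN = "<your-databricks-token-here: dapi*>"

resp = requests.get(
    f"https://{SERVER_HOSTNAME}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()  # HTTP 200 means the token authenticated successfully
print([c.get("cluster_name") for c in resp.json().get("clusters", [])])
```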

Screenshot To Use Connector