Data Publishing

Pushing data out of the Lakehouse into other tools

Data Publishing Flows define the outbound movement of data to external systems such as Power BI, SFTP servers, SQL Server, or Dynamics 365.

Each Publishing flow has an associated Data Source Target (Data Target, for short), which specifies where the flow writes its data.

📘

Quick Links

This page only covers Publishing flow creation and modification.

Overview

The Data Publishing module is accessible from the left navigation menu.

Click Data Publishing to navigate to that module.

From here, you can view all Data Publishing flows within your selected environment.

There are a few restrictions on Data Publishing flows:

  1. A Publishing flow can only ever be affiliated with one Data Source Target at a time.
    1. This means you can't push data out to multiple data targets under the same publishing flow! If you have this need, create a separate publishing flow for each of your data targets, and schedule them to run at the same time.
  2. Archiving a Data Source will also force the archiving of all affiliated Publishing flows.
    1. You will not be able to edit, schedule, or trigger these flows until you restore the underlying Data Source.
  3. Deleting a Data Source will permanently archive all affiliated Publishing flows.
    1. You will never again be able to edit, schedule, or trigger these flows.

Flows

From the default view for the Data Publishing module, you can see all the Publishing flows within your current environment.

A list of all Publishing flows in Neo - Augustus - Development.

You can search these flows by name, view configurations or historical run logs, trigger a flow to run on demand, view/set/activate scheduling, and create new flows.

Creation

To create a Data Publishing flow, simply click "+" at the top of the screen on the Data Publishing module's home page.

Create a new Data Publishing flow by clicking "+".


Fill out the name of the flow, and select a Data Source Target.

More configuration fields may appear depending on the Data Source Target type.

Once you fill out the required fields (Name and Data Source Target), click "Create" to complete the creation process. Depending on the data target selected (SFTP in the example below), you may have additional fields to configure. See the Supported Data Publishing Targets section below for more details.

You can now see your newly created flow at the top of the page.

Congratulations, you have successfully created a Data Publishing flow.

📘

INFO: 1 to Many - Data Targets and Data Publishing Flows

Data Targets and Publishing flows have a 1:many relationship.

This means that a single Publishing flow can only ever be associated with one Data Target at any moment in time. However, a specific Data Target may be associated with many different Publishing flows, all with different schedules and entity configurations.
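To make the cardinality concrete, here is a minimal relational sketch of the 1:many rule. These are illustrative tables only, not the platform's actual internals; the names data_target and publishing_flow are assumptions made for this example.

```sql
-- Illustrative only: not the platform's real internal tables.
-- Each Publishing flow references exactly one Data Target (a single,
-- NOT NULL column), while nothing prevents many flows from
-- referencing the same Data Target.
CREATE TABLE data_target (
  id VARCHAR(64) PRIMARY KEY
);

CREATE TABLE publishing_flow (
  id        VARCHAR(64)  PRIMARY KEY,
  name      VARCHAR(255) NOT NULL,
  target_id VARCHAR(64)  NOT NULL REFERENCES data_target (id)  -- the "one" side
);
```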

Editing

You can modify any existing Publishing flow by clicking "..." near the name of the flow. Doing so brings up a menu that includes an option to Edit the flow.

Click "..." then Edit to modify an existing flow.

Click "..." then Edit to modify an existing flow.

You may edit the Name, the Data Source Target, and (if applicable) additional fields depending on the Data Source Target type.

Deletion

You can delete any existing Data Publishing flow by clicking "..." near the name of the flow. Doing so brings up a menu that includes an option to Delete the flow.

Click "..." then Edit to modify an existing flow.

Click "..." then Delete to delete an existing flow.

A confirmation modal will pop up. You must confirm you wish to delete the flow in order to complete the deletion process.

Confirm you wish to delete the flow and all of its historical logs.

❗️

DANGER: Deletion is Permanent

Flow deletion is a permanent action. Deleting a flow will also remove the entire historical log of that flow. You will not be able to reverse a flow's deletion, so make sure you actually want to perform this action!

Scheduling and Triggering Flows

To read about how to schedule and trigger Publishing flows or any other flow type, visit Scheduling Data Flows.

Configuration

Click "View Configuration" on any existing Publishing flow to visit its configuration page.

From the configuration page, you can view the Entities that will be published when this flow is executed. Each Entity may have additional configurable settings (Options), described in the aptly named subsection below.

When you make edits on this page, the change is automatically saved to the flow's associated Model configuration.

Entities

📘

What are Entities in Publishing flows?

Unlike Model Entities, Publish Entities are the objects within your lakehouse that are set to be published when this publishing flow is executed. You can set any number of publish entities within a publishing flow. Entities can be configured using data from any bronze, silver, or gold table in your data estate.

The Entity table defines the list of publishable objects for the Publishing flow. For each Entity, you can view its target name, its source schema and name in your lakehouse, any SQL filter to be run before publishing, the source catalog (set to your environment's catalog by default - AUTO), and any additional options for this Entity. Options are typically defined by the data target you are publishing data out to. See the Supported Data Publishing Targets section below for more details on each supported target.

You can activate and deactivate any entity on demand. Deactivated entities will not be published during this flow's execution.

Below is a table describing each of the Entity columns and example values.

| Column Name | Description | Example Value |
| --- | --- | --- |
| ID | The global ID for this Entity (autogenerated). | 123456-abccda-1233-124abcd |
| Target Entity Name | The name of the entity, or the desired path (month/date/Target_entity_name) as it will be written to an SFTP server (if applicable). | dim_account_revenue |
| Source Schema | The source table's schema in your lakehouse, i.e. "where is this data coming from?" | sales_gold |
| Source Entity Name | The source table's name in your lakehouse, i.e. "where is this data coming from?" | dim_account_revenue |
| Source Entity Filter | An optional WHERE condition, executed before publishing this data as a temporary view. The filter must be valid SQL (anything that could be part of a SQL WHERE clause, without the leading WHERE keyword). For proper behavior, only use fields that exist within the source entity you are specifying. See the sketch after this table. | account_type="Debit" |
| Source Catalog | The source table's catalog in your lakehouse. Select AUTO to use your environment's default catalog. | AUTO |
| Active | A toggle that activates/deactivates this entity for publishing. Only active entities will be published. | Active/Inactive |
| Options | Opens a key/value dictionary modal for defining data-target-specific configurations on a per-entity basis. See Supported Data Publishing Targets below for more details. | N/A |
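To illustrate how the Source Entity Filter behaves, here is a minimal sketch of the kind of statement conceptually run when an entity is staged for publishing, using the example values from the table above. The view name publish_dim_account_revenue and the exact SQL shape are assumptions for illustration only; the schema, table, and filter values come from the example column, and the double-quoted string literal assumes a Spark-style lakehouse SQL dialect.

```sql
-- Illustrative sketch only; the platform's actual generated SQL is not
-- documented here. Example values are taken from the Entity table above.
-- The Source Entity Filter is appended as a WHERE condition when the source
-- entity is staged as a temporary view prior to publishing.
CREATE OR REPLACE TEMPORARY VIEW publish_dim_account_revenue AS  -- hypothetical view name
SELECT *
FROM sales_gold.dim_account_revenue  -- Source Schema + Source Entity Name (Source Catalog AUTO = default catalog)
WHERE account_type = "Debit";        -- Source Entity Filter, entered without the leading WHERE
```

Any fragment that is valid after a WHERE keyword can serve as a filter, so a compound condition such as account_type = "Debit" AND fiscal_year >= 2023 (fiscal_year being a hypothetical column) works just as well.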

Supported Data Publishing Targets