SQL Server Publishing

Empower supports publishing datasets to SQL servers automatically when the underlying data is updated.


Prerequisites

  1. A SQL server and its credentials.
  2. Data in your Empower delta lake to publish.
  3. Advanced Options is toggled on for your account.

Steps

Add the SQL server as a Data Source in your Empower deployment

  1. Navigate to your Empower deployment's Data Sources tab in the UI.
  2. Click + to add a new data source, then navigate to SQL Server or use the search bar to find it.
  3. Fill in the required fields:
    1. Server name
    2. Database name (schema)
    3. Username
    4. Password
    5. Always Encrypted (toggle)
    6. Trust Server Certificate (toggle)
    7. Encrypt (toggle)
    8. Persist Security Info (toggle)
    9. Connection Timeout (optional, in seconds)
  4. Click Save and Connect to save the new SQL Server data source.
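
The toggles above correspond to standard SQL Server connection-string keywords. As a reference point only, here is a minimal sketch of an equivalent direct connection using pyodbc; the driver name, server, and credentials are placeholders, and this is not how Empower connects internally:

  import pyodbc

  conn_str = (
      "DRIVER={ODBC Driver 18 for SQL Server};"
      "SERVER=myserver.example.com;"   # Server name (placeholder)
      "DATABASE=my_database;"          # Database name (placeholder)
      "UID=empower_publisher;"         # Username (placeholder)
      "PWD=<password>;"                # Password (placeholder)
      "ColumnEncryption=Enabled;"      # Always Encrypted toggle
      "TrustServerCertificate=yes;"    # Trust Server Certificate toggle
      "Encrypt=yes;"                   # Encrypt toggle
      "Persist Security Info=no;"      # Persist Security Info toggle
  )

  # timeout sets the login timeout in seconds (the optional Connection Timeout field).
  conn = pyodbc.connect(conn_str, timeout=30)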

Configure DatabaseToStepCommand

  1. Navigate to the Advanced Options tab on the left side of the screen.
  2. Add a new record to the DatabaseToStepCommand table.
  3. Set the Database ID field to the SQL server data source added above.
  4. Configure this record to run in the PUBLISH phase.
  5. Select a new load group number if you want this command to run independently, or add it to an existing load group to have it run with the rest of that group.
  6. Set the Execution Order field to 1 if this is the only command to be executed; otherwise, set it to the position you want in the execution sequence.
  7. You can leave the Source Schema Suffix and Target Schema Suffix fields at their default values.
  8. You can leave the Item Name to Execute field set to None.
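
Putting these steps together, a new DatabaseToStepCommand record for a standalone publish might look like the following sketch. The field names mirror the UI labels above and are assumptions, not the exact column names in your deployment:

  # Hypothetical DatabaseToStepCommand record (field names mirror the UI labels).
  database_to_step_command = {
      "DatabaseID": 3,             # ID of the SQL server data source added above
      "Phase": "PUBLISH",          # run during the PUBLISH phase
      "LoadGroup": 10,             # a new load group, so it runs independently
      "ExecutionOrder": 1,         # the only command in this load group
      "SourceSchemaSuffix": None,  # left at the default
      "TargetSchemaSuffix": None,  # left at the default
      "ItemNameToExecute": None,   # left set to None
  }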

Configure Publish Entity Table

  1. Identify the entities (tables) you want to publish to the target SQL server.
  2. Create a new record for each table you wish to publish in the Publish Entity Table.
  3. Ensure each new Publish Entity entry includes at least the following columns:
    1. Load Group: matches DatabaseToStepCommand entry’s loadgroup.
    2. Target Entity: the name of the entity or a desired path (month/date/Target_entity_name), since the table will be written to this path on the SQL server.
    3. Source Catalog: the source table’s Unity Catalog.
    4. Source Schema: the source table’s schema.
    5. Source Entity Name: the source table’s name.
    6. Source Entity Filter: an optional WHERE condition applied when reading the source table (e.g., account_type='Debit'); see the example after this list.
      1. Keep in mind that this filter must be valid SQL: anything that could appear in a SQL WHERE clause, without the leading WHERE keyword.
      2. For proper behavior, reference only fields that exist in the table you are specifying.
    7. Target ID: The DatabaseListID of the target SQL server (this should be the same as the DatabaseToStepCommand’s DatabaseID).
    8. Is Active: only active entities are published; this must be set to true for the entity to be enabled.
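
As a concrete illustration, a Publish Entity record and the read it implies might look like the sketch below. All values and column names are hypothetical; check your deployment for the exact schema:

  # Hypothetical Publish Entity record for one table.
  publish_entity = {
      "LoadGroup": 10,                          # matches the DatabaseToStepCommand load group
      "TargetEntity": "finance/2024/accounts",  # path-style name written to the SQL server
      "SourceCatalog": "main",                  # the source table's Unity Catalog
      "SourceSchema": "gold",
      "SourceEntityName": "accounts",
      "SourceEntityFilter": "account_type = 'Debit'",  # valid SQL, no WHERE keyword
      "TargetID": 3,                            # DatabaseListID of the target SQL server
      "IsActive": True,                         # only active entities are published
  }

  # Conceptually, the filter narrows the source read to something like:
  query = (
      f"SELECT * FROM {publish_entity['SourceCatalog']}"
      f".{publish_entity['SourceSchema']}.{publish_entity['SourceEntityName']}"
      f" WHERE {publish_entity['SourceEntityFilter']}"
  )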

Configure Entity Options Table

  1. For each of the entities specified in the Publish Entity Table, create an entry in the Entity Options table (using the same entity ID) to define the following options:
    1. sql_table_mode: defines the write mode for the target table. Select between two choices:
      1. overwrite: drops the old table if it exists and completely overwrites it with the new data. Note that by default this does not preserve the old indexes.
      2. append: adds the new data to the end of the table's existing data.
    2. sql_table_truncate: a boolean value (true or false) that defines index behavior when overwrite is selected in the previous option.
      1. if true: the existing indexes are retained
      2. if false: the indexes are dropped with the table in overwrite mode
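
The SQL each combination implies can be sketched as follows; this is illustrative only, not Empower's actual implementation:

  def write_statements(sql_table_mode: str, sql_table_truncate: bool, table: str) -> list:
      """Hypothetical helper showing the semantics of the two options above."""
      if sql_table_mode == "append":
          # New rows are added after the table's existing data.
          return [f"INSERT INTO {table} SELECT ..."]
      if sql_table_mode == "overwrite":
          if sql_table_truncate:
              # true: empty the table in place, so the existing indexes are retained.
              return [f"TRUNCATE TABLE {table}", f"INSERT INTO {table} SELECT ..."]
          # false: drop and recreate the table, so the indexes are dropped with it.
          return [f"DROP TABLE IF EXISTS {table}",
                  f"CREATE TABLE {table} (...)",
                  f"INSERT INTO {table} SELECT ..."]
      raise ValueError(f"unknown sql_table_mode: {sql_table_mode}")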

Configure Retry and Batch Options

  1. Retry on Failure (retry_on_failure)

    • Defines the behavior when a batch operation fails.
    • Options:
      1. True:
        • Attempts to retry row-level publishing for failed batches.
        • Logs each failed row for detailed analysis in the row log table.
      2. False:
        • Logs only batch-level success/failures in the batch log table without attempting retries at the row level.
    • Significance: Ensures data consistency while isolating problematic rows.
  2. Batch Size (batch_size)

    • Defines the number of rows processed per batch.
    • Behavior:
      1. Large batch size: Improves throughput but requires more memory.
      2. Small batch size: Suitable for constrained environments or granular control.
    • Significance: Enhances flexibility and scalability in data processing workflows.
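
A minimal sketch of how these two options interact, assuming a DB-API cursor and hypothetical logging helpers (log_batch, log_row) standing in for the log tables described below:

  def publish(cursor, rows, insert_sql, batch_size=1000, retry_on_failure=True):
      """Publish rows in batches; optionally retry failed batches row by row."""
      for start in range(0, len(rows), batch_size):
          batch = rows[start:start + batch_size]
          try:
              cursor.executemany(insert_sql, batch)  # one operation per batch
              log_batch(rows_written=len(batch), publish_success=1)
          except Exception as batch_error:
              if not retry_on_failure:
                  # False: record the batch-level failure and move on, no row retries.
                  log_batch(rows_written=0, publish_success=0, error=str(batch_error))
                  continue
              # True: retry row by row to isolate and log the problematic rows.
              failed = 0
              for row in batch:
                  try:
                      cursor.execute(insert_sql, row)
                  except Exception as row_error:
                      failed += 1
                      log_row(row_status="FAILED", error_message=str(row_error))
              log_batch(rows_written=len(batch) - failed,
                        failed_record_count=failed,
                        publish_success=1 if failed == 0 else 0)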

Configure Logging Tables

  1. Batch Log Table (sql_batch_logs)

    • Logs metadata and outcomes of each batch operation.
    • Schema:
      • batch_log_id: Unique identifier for the batch.
      • timestamp: Timestamp indicating when the batch operation occurred.
      • target_table: The name of the target table in SQL Server.
      • batch_error_message: Error message if the batch fails.
      • sourceRowCount: Total rows in the source data.
      • rowsRead: Number of rows read during processing.
      • rowsSkipped: Number of rows skipped during processing.
      • rowsWritten: Number of rows successfully written to the target table.
      • FailedRecordCount: Count of rows that failed during processing.
      • publishSuccess: Indicates success (1) or failure (0) of the batch operation.
  2. Row Log Table (sql_row_level_logs)

    • Logs information about individual rows that fail during processing (when retry_on_failure=True).
    • Schema:
      • row_log_id: Unique identifier for the row log entry.
      • batch_log_id: Links the row entry to its parent batch in the batch log table.
      • timestamp: Timestamp indicating when the row operation occurred.
      • target_table: The name of the target table in SQL Server.
      • row_status: Status of the row operation (FAILED, SUCCESS).
      • error_message: Error message detailing the failure for the specific row.
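
Because each row-log entry carries its parent batch_log_id, you can join the two tables to investigate a failed run. A sketch of such a diagnostic query, assuming the log tables are reachable from your SQL client:

  # Hypothetical diagnostic query: failed rows with their parent batch context.
  diagnostic_sql = """
      SELECT b.batch_log_id, b.target_table, b.batch_error_message,
             r.row_log_id, r.row_status, r.error_message
      FROM sql_batch_logs AS b
      JOIN sql_row_level_logs AS r ON r.batch_log_id = b.batch_log_id
      WHERE r.row_status = 'FAILED'
      ORDER BY b.timestamp DESC
  """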