Introduction

A high-performance, fault-tolerant Python package for synchronizing data between Apache Spark and Dynamics 365 CRM. This package provides robust batch processing, intelligent retry mechanisms, comprehensive monitoring capabilities, and advanced tooling for efficient troubleshooting and performance tuning.

Hitachi Solutions' Empower Data platform offers capabilities for Dynamics 365 Customer Engagement, Finance and Operations, Customer Insights, and several other industry cloud options. At a high level, these capabilities are:

  • Near Realtime Ingestion of D365 Data to Delta Lake - This allows advanced analytics and transformations to be performed for BI, ML, and Integration Use.
  • Publishing of any Supported Data to D365 - This process allows customers to use Empower for a one-time migration into D365 ("DataDriven Upgrade"), and it also supports ongoing integrations for customers of the full platform.
    • High Performance - Empower automatically manages bulk batch uploading.
    • Virtual Entities are supported as a target via SQL Server, allowing any data from Empower to be accessible to PowerAutomate and other Power Platform components, even if it's not a physical entity.
    • Failed Rows Handling -- When data is rejected by the CRM or ERP, data is stored in a table in the lake which allows

The traditional use case for Dynamics integrations is as follows:

[Figure: three-stage flow. Data Acquisition (pulling data in from source systems) -> Analytics Engineering (mapping and transforming; SQL) -> Data Publishing (pushing data out; Dynamics Connector)]

  1. Data Acquisition: The data acquisition phase is largely UI-driven. See Data Acquisition for how to load data into the lakehouse from a variety of source systems.
  2. Transform: Using the interactive notebook experience in Azure Databricks and Microsoft Fabric, move data across the lakehouse and transform it into a desired format. Customers can orchestrate these notebooks as needed with Workflows. See the Transform user guides on how to configure models in Empower.
  3. Publish: Write data to Dynamics tables and entities using Empower's robust Dynamics publishing capabilities. Keep reading this guide for how to leverage this feature set.

Table of Contents

The documentation is organized into six major sections, each addressing a different aspect of working with the Empower for Dynamics library:

  1. Getting Started Guide
    Essential information for new users, including installation procedures, basic configuration, and introductory examples to help you begin using the library effectively. This section walks you through your first implementation and explains core concepts.

  2. Advanced Configuration Guide
    Detailed exploration of the library's configuration options, from performance tuning to debugging features. This section helps you optimize the library for your specific use case and understand the impact of each setting.

  3. Understanding the Processing Pipeline
    Learn how your data flows through the library's four-stage pipeline, from initial load through final processing. This section explains each stage's purpose and interaction, helping you understand how the library efficiently handles data transformations, parallel processing, and resource management to ensure reliable synchronization with Dynamics CRM.

  4. Understanding the Connection Management System
    Deep dive into how the library manages connections to Dynamics CRM, including connection pooling, service principal rotation, and parallel processing. This section explains how to scale your implementation effectively while maintaining reliable performance.

  5. Understanding Rate Limit Handling
    Comprehensive explanation of how the library handles Dynamics CRM's rate limits through its two-layer retry system. This section covers both immediate and deferred retry mechanisms, helping you understand how the library maintains throughput while respecting system constraints.

  6. Troubleshooting Guide
    Practical guidance for diagnosing and resolving common issues, including detailed explanations of error messages, logging capabilities, and performance optimization techniques. This section helps you maintain and optimize your implementation over time.

Key Vocabulary & Concepts

Understanding the fundamental concepts and terminology used throughout the Empower for Dynamics library is essential for effective implementation. This section covers the core concepts that form the foundation of the library's functionality.

Operation Modes

The library supports several operation modes, each designed for specific data synchronization scenarios:

Insert Mode

Insert mode is specifically designed for creating new records in the target system.

Key features include:

  • Optional optimized bulk loading with configurable batch sizes
  • Parallel processing for improved performance
  • Built-in data validation and error handling
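
The batching and parallelism described above can be illustrated with a minimal, generic sketch. It is not the library's API: `chunked`, `insert_in_batches`, and `fake_send_batch` are hypothetical names, and the default batch size and worker count are arbitrary values chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator


def chunked(records: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def insert_in_batches(
    records: Iterable[dict],
    send_batch: Callable[[list[dict]], int],
    batch_size: int = 100,
    max_workers: int = 4,
) -> int:
    """Send batches in parallel and return the total count reported by send_batch."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(send_batch, chunked(records, batch_size)))


def fake_send_batch(batch: list[dict]) -> int:
    # Hypothetical stand-in for a real bulk-create call; it accepts every row.
    return len(batch)


rows = [{"name": f"Account {i}"} for i in range(250)]
inserted = insert_in_batches(rows, fake_send_batch, batch_size=100)
```

In this sketch, larger batches mean fewer round trips, while more workers mean more concurrent requests. In a real deployment both values would need to be tuned against the target system's limits.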

Upsert Mode

Upsert mode combines insert and update operations into a single process. It determines whether each record should be created or updated by matching it against existing records in the target system. Both ID-based and business key-based matching strategies are supported, so you can choose the one that fits your data.

Key features include:

  • Smart record matching using configurable key fields
  • Optimized change detection to minimize unnecessary updates
  • Support for complex matching criteria
  • Efficient handling of large datasets through batching
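
To make key-based matching and change detection concrete, here is a minimal sketch of how an upsert could be planned in plain Python. It is an illustration under stated assumptions, not the library's implementation: `plan_upsert`, the `accountnumber` key field, and the in-memory `existing` lookup are all hypothetical.

```python
def plan_upsert(
    incoming: list[dict],
    existing: dict[tuple, dict],
    key_fields: list[str],
) -> tuple[list[dict], list[dict]]:
    """Split incoming records into creates and updates.

    existing maps a business-key tuple to the record currently in the target.
    Records whose fields all match the existing record are skipped.
    """
    creates, updates = [], []
    for record in incoming:
        key = tuple(record[field] for field in key_fields)
        current = existing.get(key)
        if current is None:
            # No match on the business key: this is a new record.
            creates.append(record)
        elif any(current.get(field) != value for field, value in record.items()):
            # Matched, and at least one field differs: this is an update.
            updates.append(record)
        # Matched with no differences: nothing to send.
    return creates, updates


# Hypothetical data keyed on a single business key field.
existing = {
    ("A-1",): {"accountnumber": "A-1", "name": "Contoso"},
    ("A-2",): {"accountnumber": "A-2", "name": "Fabrikam"},
}
incoming = [
    {"accountnumber": "A-1", "name": "Contoso"},       # unchanged
    {"accountnumber": "A-2", "name": "Fabrikam Inc"},  # changed
    {"accountnumber": "A-3", "name": "Northwind"},     # new
]
creates, updates = plan_upsert(incoming, existing, ["accountnumber"])
```

Skipping unchanged records, as in the last branch of the loop, is one simple way to avoid unnecessary updates. Composite keys work the same way by passing several names in `key_fields`.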