Troubleshooting Guide

When processing data with Dynamics V2, understanding how to effectively diagnose and resolve issues is crucial for maintaining reliable operations. This guide explains common issues, why they occur, and how to investigate them systematically.

Understanding Processing Metrics

The library maintains comprehensive metrics through its batch logging system. These metrics are your first line of defense in understanding processing behavior and identifying issues.

Core Success Metrics

SELECT 
    run_id,
    source_table,
    sum(partition_size) as total_processed,
    sum(success_inserts) as success_inserts,
    sum(success_updates) as success_updates,
    sum(skipped) as skipped,
    min(start_time) as start_time
FROM empower_ddu.batch_log
GROUP BY run_id, source_table
ORDER BY min(start_time) DESC

This query provides insight into the fundamental operation of your data processing:

  • How many records were processed
  • The distribution between inserts and updates
  • How many records were skipped (unchanged)
  • When processing occurred

Understanding these metrics helps establish a baseline for normal operation and makes it easier to spot anomalies.
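One way to use that baseline is to flag runs whose skip rate drifts far from the historical average. The sketch below is illustrative, not part of the library: the row shape mirrors the batch_log columns from the query above, and the tolerance value is an arbitrary example.

```python
# Sketch: flag runs whose skip rate deviates sharply from the historical mean.
# Row keys mirror the batch_log query above; the tolerance is illustrative.

def skip_rate(row):
    """Fraction of processed records that were skipped (unchanged)."""
    total = row["total_processed"]
    return row["skipped"] / total if total else 0.0

def flag_anomalies(rows, tolerance=0.3):
    """Return run_ids whose skip rate differs from the mean by more than tolerance."""
    if not rows:
        return []
    mean = sum(skip_rate(r) for r in rows) / len(rows)
    return [r["run_id"] for r in rows if abs(skip_rate(r) - mean) > tolerance]

runs = [
    {"run_id": "a", "total_processed": 1000, "skipped": 900},
    {"run_id": "b", "total_processed": 1000, "skipped": 880},
    {"run_id": "c", "total_processed": 1000, "skipped": 200},  # unusual run
]
print(flag_anomalies(runs))  # ['c']
```

The same idea applies to insert/update ratios: once you know what "normal" looks like, a simple threshold check surfaces anomalies early.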

Common Issues and Their Root Causes

1. Unexpected Updates

When records are updated more frequently than expected, the cause is often a data normalization issue: Dynamics V2 compares source and target values to determine whether an update is needed, so any cosmetic difference between the two registers as a change.

Why This Happens:

  • Text data often contains hidden differences (whitespace, case, special characters)
  • Date/time values may have different precision between systems
  • Numerical data might use different scales or precision
  • Character encodings can cause apparent differences
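The differences above can be neutralized by normalizing both sides before comparison. The sketch below is a minimal illustration of that idea, not the library's built-in behavior; the specific rules (trim, casefold, NFC, second-level timestamp precision, two-decimal scale) are assumptions you would adapt to your own data.

```python
# Sketch: normalize values before comparison so cosmetic differences do not
# register as updates. The rules here are illustrative, not the library's own.
import unicodedata
from datetime import datetime
from decimal import Decimal

def normalize(value):
    if isinstance(value, str):
        # Collapse hidden text differences: whitespace, case, encoding form
        return unicodedata.normalize("NFC", value.strip()).casefold()
    if isinstance(value, datetime):
        # Drop sub-second precision that often differs between systems
        return value.replace(microsecond=0)
    if isinstance(value, (float, Decimal)):
        # Compare numbers at a fixed scale
        return round(Decimal(str(value)), 2)
    return value

def needs_update(source, target):
    return normalize(source) != normalize(target)

print(needs_update("  Contoso Ltd ", "contoso ltd"))  # False: only cosmetic diffs
```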

Investigation Strategy:

SELECT target_values_changed, COUNT(*) 
FROM empower_ddu.contact_query_log 
WHERE run_id = '<run_id>' AND mode = 'update'
GROUP BY target_values_changed
ORDER BY count(*) DESC

This query reveals exactly which fields are changing and how often. Look for patterns like:

  • Fields with text values showing frequent changes
  • Date/time fields consistently updating
  • Fields that should rarely change showing frequent updates

2. Performance Patterns

Performance issues in Dynamics V2 typically stem from one of three areas:

  1. Resource utilization (workers and connections)
  2. Rate limit handling
  3. Data volume and batch sizing

Why Monitor Processing Time:

SELECT 
    run_id,
    cast(sum(total_processing_time)/60 as decimal(18,2)) as total_mins,
    cast(sum(deferred_processing_time)/60 as decimal(18,2)) as deferred_mins,
    source_table,
    min(start_time) as start_time
FROM empower_ddu.batch_log
GROUP BY run_id, source_table
ORDER BY min(start_time) DESC

This query helps you understand:

  • Overall processing efficiency
  • Impact of rate limits (through deferred processing time)
  • Processing patterns over time

A high ratio of deferred processing time to total time points to rate limit impacts, while steadily increasing processing times may indicate growing data volumes or system load.
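That ratio check can be automated over the query output. The sketch below is illustrative: the 0.3 threshold is an arbitrary example, not a documented limit of Dynamics V2.

```python
# Sketch: classify a run by its deferred-to-total processing time ratio.
# The 0.3 threshold is an illustrative cut-off, not a documented limit.

def rate_limit_impact(total_secs, deferred_secs, threshold=0.3):
    """Label how much of a run was spent waiting on rate limits."""
    if total_secs <= 0:
        return "no data"
    ratio = deferred_secs / total_secs
    return "rate-limited" if ratio > threshold else "healthy"

print(rate_limit_impact(3600, 1800))  # half the run deferred -> "rate-limited"
print(rate_limit_impact(3600, 120))   # "healthy"
```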

3. Data Quality Investigation

Data quality issues manifest in several ways:

  • Invalid records failing validation
  • Unexpected skipped records
  • Inconsistent processing results

Understanding Error Patterns:

SELECT __ddu_error, COUNT(*) 
FROM empower_ddu.contact 
WHERE __ddu_run_id = '<run_id>'
GROUP BY __ddu_error

This query helps identify:

  • Common validation failures
  • Pattern of errors that might indicate systemic issues
  • Data quality problems in source systems

Optimization Strategy

When optimizing Dynamics V2 operations, focus on three key areas:

1. Worker Configuration

The number of workers determines parallel processing capacity. The formula num_sps * 50 is used because:

  • Each service principal supports 52 concurrent connections
  • We reserve 2 connections per SP for system operations
  • This maximizes throughput while respecting CRM limits
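The sizing rule above reduces to simple arithmetic. A minimal sketch, using the constants stated in the bullets (52 connections per service principal, 2 reserved):

```python
# Sketch of the worker-sizing rule: 52 connections per service principal,
# minus 2 reserved for system operations, leaves 50 workers per SP.
CONNECTIONS_PER_SP = 52
RESERVED_PER_SP = 2

def max_workers(num_sps):
    """Parallel workers supported by num_sps service principals."""
    return num_sps * (CONNECTIONS_PER_SP - RESERVED_PER_SP)

print(max_workers(4))  # 200 workers across 4 service principals
```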

2. Batch Sizing

A batch size of 50,000 is recommended because it is:

  • Large enough for efficient processing
  • Small enough for manageable memory usage
  • Provides good granularity for error recovery
  • Balances throughput with resource usage
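The error-recovery benefit comes from batches being independently retryable units. A minimal sketch of such batching, assuming records arrive as any iterable (the function name is illustrative, not a library API):

```python
# Sketch: split a record stream into fixed-size batches so that each batch
# can be retried independently on failure. Names are illustrative.

def batched(records, batch_size=50_000):
    """Yield successive lists of at most batch_size records."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

sizes = [len(b) for b in batched(range(120_000), batch_size=50_000)]
print(sizes)  # [50000, 50000, 20000]
```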

3. Data Handling

Data handling configuration should focus on:

  • Normalizing data appropriately for your use case
  • Managing memory efficiently
  • Ensuring consistent comparison behavior

Proactive Monitoring

The best troubleshooting is preventive. Establish regular monitoring of:

  1. Success rates and error patterns
  2. Processing time trends
  3. Rate limit frequency
  4. Resource utilization

This allows you to:

  • Identify issues before they become critical
  • Optimize configuration proactively
  • Maintain stable processing performance

Remember that Dynamics V2's extensive logging and metrics exist to help you understand and optimize your data processing. Use them regularly, not just when troubleshooting active issues.