Troubleshooting Guide
When processing data with Dynamics V2, understanding how to effectively diagnose and resolve issues is crucial for maintaining reliable operations. This guide explains common issues, why they occur, and how to investigate them systematically.
Understanding Processing Metrics
The library maintains comprehensive metrics through its batch logging system. These metrics are your first line of defense in understanding processing behavior and identifying issues.
Core Success Metrics
SELECT
run_id,
source_table,
sum(partition_size) as total_processed,
sum(success_inserts) as success_inserts,
sum(success_updates) as success_updates,
sum(skipped) as skipped,
min(start_time) as start_time
FROM empower_ddu.batch_log
GROUP BY run_id, source_table
ORDER BY min(start_time) DESC
This query provides insight into the fundamental operation of your data processing:
- How many records were processed
- The distribution between inserts and updates
- How many records were skipped (unchanged)
- When processing occurred
Understanding these metrics helps establish a baseline for normal operation and makes it easier to spot anomalies.
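For example, a sudden drop in the skip rate often foreshadows the normalization problems described below. This sketch uses only the batch_log columns shown above; treating skip rate as an anomaly signal is a suggestion, not a documented feature:
SELECT
run_id,
source_table,
cast(sum(skipped) as decimal(18,2)) / nullif(sum(partition_size), 0) as skip_rate
FROM empower_ddu.batch_log
GROUP BY run_id, source_table
ORDER BY skip_rate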
Common Issues and Their Root Causes
1. Unexpected Updates
When records are updated more frequently than expected, the cause is usually a data normalization issue. Dynamics V2 compares source and target values to determine whether an update is needed, so any hidden difference between otherwise identical values triggers an update.
Why This Happens:
- Text data often contains hidden differences (whitespace, case, special characters)
- Date/time values may have different precision between systems
- Numerical data might use different scales or precision
- Character encodings can cause apparent differences
Investigation Strategy:
SELECT target_values_changed, COUNT(*)
FROM empower_ddu.contact_query_log
WHERE run_id = '<run_id>' AND mode = 'update'
GROUP BY target_values_changed
ORDER BY count(*) DESC
This query reveals exactly which fields are changing and how often. Look for patterns like:
- Fields with text values showing frequent changes
- Date/time fields consistently updating
- Fields that should rarely change showing frequent updates
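Once a suspect text field is identified, you can check whether its differences are purely cosmetic. The sketch below is illustrative only: source_contact, target_contact, contactid, and firstname are hypothetical names standing in for your own source extract, target snapshot, key, and field, and comparison semantics (case sensitivity, trailing-space handling) vary by engine and collation:
SELECT COUNT(*) as cosmetic_diffs
FROM source_contact s
JOIN target_contact t ON s.contactid = t.contactid
WHERE s.firstname <> t.firstname
AND lower(trim(s.firstname)) = lower(trim(t.firstname))
A non-zero count means the values differ only in case or surrounding whitespace, i.e. normalization noise rather than real change.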
2. Performance Patterns
Performance issues in Dynamics V2 typically stem from one of three areas:
- Resource utilization (workers and connections)
- Rate limit handling
- Data volume and batch sizing
Why Monitor Processing Time:
SELECT
run_id,
cast(sum(total_processing_time)/60 as decimal(18,2)) as total_mins,
cast(sum(deferred_processing_time)/60 as decimal(18,2)) as deferred_mins,
source_table,
min(start_time) as start_time
FROM empower_ddu.batch_log
GROUP BY run_id, source_table
This query helps understand:
- Overall processing efficiency
- Impact of rate limits (through deferred processing time)
- Processing patterns over time
A high ratio of deferred processing time to total time indicates rate limit impacts, while consistently increasing processing times might indicate growing data volumes or system load.
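To quantify that ratio directly, the same columns support a per-run deferred share (a sketch reusing only the batch_log fields already shown):
SELECT
run_id,
source_table,
cast(sum(deferred_processing_time) as decimal(18,2)) / nullif(sum(total_processing_time), 0) as deferred_ratio
FROM empower_ddu.batch_log
GROUP BY run_id, source_table
ORDER BY deferred_ratio DESC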
3. Data Quality Investigation
Data quality issues manifest in several ways:
- Invalid records failing validation
- Unexpected skipped records
- Inconsistent processing results
Understanding Error Patterns:
SELECT __ddu_error, COUNT(*)
FROM empower_ddu.contact
WHERE __ddu_run_id = '<run_id>'
GROUP BY __ddu_error
ORDER BY COUNT(*) DESC
This query helps identify:
- Common validation failures
- Pattern of errors that might indicate systemic issues
- Data quality problems in source systems
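To separate one-off failures from systemic ones, the same breakdown can be tracked across runs. This sketch reuses only the columns shown above and filters to failed rows, assuming __ddu_error is NULL for successful records:
SELECT __ddu_run_id, __ddu_error, COUNT(*) as error_count
FROM empower_ddu.contact
WHERE __ddu_error IS NOT NULL
GROUP BY __ddu_run_id, __ddu_error
ORDER BY __ddu_run_id, error_count DESC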
Optimization Strategy
When optimizing Dynamics V2 operations, focus on three key areas:
1. Worker Configuration
The number of workers determines parallel processing capacity. The formula num_sps * 50 is used because:
- Each service principal supports 52 concurrent connections
- We reserve 2 connections per SP for system operations
- This maximizes throughput while respecting CRM limits
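For example, with 4 service principals (an arbitrary illustration), the worker count would be 4 × (52 − 2) = 4 × 50 = 200 concurrent workers.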
2. Batch Sizing
A batch size of 50,000 is recommended because it is:
- Large enough for efficient processing
- Small enough for manageable memory usage
- Provides good granularity for error recovery
- Balances throughput with resource usage
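You can verify the batch sizes a run actually used from the batch log. This sketch uses the partition_size column shown earlier; that partition_size reflects the configured batch size is an assumption worth confirming in your environment:
SELECT
run_id,
source_table,
count(*) as batches,
max(partition_size) as max_batch_size,
avg(cast(partition_size as decimal(18,2))) as avg_batch_size
FROM empower_ddu.batch_log
GROUP BY run_id, source_table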
3. Data Handling
Data handling configuration should focus on:
- Normalizing data appropriately for your use case
- Managing memory efficiently
- Ensuring consistent comparison behavior
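One way to get consistent comparisons is to push normalization into a staging view so the same cleaned values feed every run. This is a hedged sketch, not a library feature: every object and column name is hypothetical, and the datetime2(0) truncation assumes a SQL Server dialect:
CREATE VIEW staged_contact AS
SELECT
contactid,
trim(firstname) as firstname,                  -- strip incidental whitespace
cast(modifiedon as datetime2(0)) as modifiedon -- truncate to whole seconds
FROM raw_contact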
Proactive Monitoring
The best troubleshooting is preventive. Establish regular monitoring of:
- Success rates and error patterns
- Processing time trends
- Rate limit frequency
- Resource utilization
This allows you to:
- Identify issues before they become critical
- Optimize configuration proactively
- Maintain stable processing performance
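A simple daily trend over the batch log covers several of these checks at once (a sketch built from the columns shown earlier; cast-to-date syntax may vary by SQL dialect):
SELECT
cast(start_time as date) as run_date,
source_table,
sum(partition_size) as total_processed,
sum(skipped) as skipped,
cast(sum(deferred_processing_time)/60 as decimal(18,2)) as deferred_mins
FROM empower_ddu.batch_log
GROUP BY cast(start_time as date), source_table
ORDER BY run_date DESC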
Remember that Dynamics V2's extensive logging and metrics exist to help you understand and optimize your data processing. Use them regularly, not just when troubleshooting active issues.