Database Synchronization System - Complete Manual

Table of Contents

  1. System Overview
  2. Core Synchronization Logic
  3. Sync Planning Algorithm
  4. Performance Optimizations
  5. Binary Search Engine
  6. Data Movement Pipeline
  7. Sleep and Scheduling Management
  8. Error Handling & Recovery
  9. Common Problems & Solutions
  10. Monitoring & Debugging
  11. Best Practices

System Overview

The database synchronization system (db-sync.lua) is a sophisticated, production-grade tool designed to synchronize data between multiple database systems with different schemas and architectures. It supports various database types including 4D, PostgreSQL, SQLite, and REST APIs.

Key Features

Architecture

flowchart TD
    A[Source Database<br/>SOURCE] --> C[Sync Engine<br/>db-sync.lua]
    C --> B[Target Database<br/>TARGET]
    C --> D[Configuration<br/>db-sync.json]
    subgraph "Database Types"
        A1[4D Database]
        A2[PostgreSQL]
        A3[SQLite]
        A4[REST API]
    end
    subgraph "Core Components"
        C1[Binary Search Engine]
        C2[Data Movement Pipeline]
        C3[Error Handling & Recovery]
        C4[Progress Monitoring]
    end
    A -.-> A1
    A -.-> A2
    A -.-> A3
    A -.-> A4
    C -.-> C1
    C -.-> C2
    C -.-> C3
    C -.-> C4
    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000
    style C fill:#fff3e0,stroke:#f57c00,stroke-width:4px,color:#000
    style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
    style A1 fill:#e1f5fe,color:#000
    style A2 fill:#e1f5fe,color:#000
    style A3 fill:#e1f5fe,color:#000
    style A4 fill:#e1f5fe,color:#000
    style C1 fill:#fff3e0,color:#000
    style C2 fill:#fff3e0,color:#000
    style C3 fill:#fff3e0,color:#000
    style C4 fill:#fff3e0,color:#000

Core Synchronization Logic

Main Execution Flow

flowchart TD A[Start] --> B[Load Configuration<br/>db-sync.json] B --> C[Establish Database<br/>Connections] C --> D[Validate Schema<br/>Compatibility] D --> E[Initialize Performance<br/>Counters] E --> F[Enumerate Tables<br/>to Synchronize] F --> G[Count Records in<br/>Source & Target] G --> H[Calculate Modification<br/>Timestamps] H --> I[Build Synchronization<br/>Plan] I --> J{Planning<br/>Complete?} J -->|No| F J -->|Yes| K[Analyze Count<br/>Differences] K --> L[Determine Sync Operations<br/>add/delete/modify] L --> M[Choose Optimization<br/>Strategies] M --> N[Generate Execution<br/>Plan] N --> O[Execute Sync Plan<br/>in Optimal Order] O --> P[Apply Binary Search<br/>Optimizations] P --> Q[Batch Process<br/>Data Movements] Q --> R[Track Progress &<br/>Performance] R --> S{More<br/>Tables?} S -->|Yes| O S -->|No| T[Validate Sync<br/>Results] T --> U[Check Record<br/>Counts] U --> V[Report Statistics<br/>& Errors] V --> W[End] style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000 style W fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000 style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style S fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style B fill:#f3e5f5,color:#000 style C fill:#f3e5f5,color:#000 style D fill:#f3e5f5,color:#000 style E fill:#f3e5f5,color:#000 style F fill:#e8f5e8,color:#000 style G fill:#e8f5e8,color:#000 style H fill:#e8f5e8,color:#000 style I fill:#e8f5e8,color:#000 style K fill:#fff8e1,color:#000 style L fill:#fff8e1,color:#000 style M fill:#fff8e1,color:#000 style N fill:#fff8e1,color:#000 style O fill:#e0f2f1,color:#000 style P fill:#e0f2f1,color:#000 style Q fill:#e0f2f1,color:#000 style R fill:#e0f2f1,color:#000 style T fill:#ffebee,color:#000 style U fill:#ffebee,color:#000 style V fill:#ffebee,color:#000
  1. Initialization Phase

  2. Discovery Phase

  3. Planning Phase

  4. Execution Phase

  5. Verification Phase

Table Processing Order

Tables are processed in dependency order to maintain referential integrity:

-- Example processing order
1. Reference tables (currency, terms_of_payment)
2. Master data (company, product)
3. Transactional data (orders, invoices)
4. Detail records (order_row, invoice_row)
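The dependency order above can be expressed as a plain Lua array that the engine walks front to back. The variable name is illustrative; the table names match the examples above.

```lua
-- Tables listed in dependency order: parents before children,
-- so referential integrity holds while records are inserted.
local syncOrder = {
    "currency", "terms_of_payment",   -- 1. Reference tables
    "company", "product",             -- 2. Master data
    "orders", "invoices",             -- 3. Transactional data
    "order_row", "invoice_row",       -- 4. Detail records
}
```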

Record Type Handling

The system supports multiple record types per table:

-- Example: product table with multiple record types
product-work          -- Work-related products
product-material      -- Material products
product-service       -- Service products
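A sync name like product-material can be split into its table and record-type parts with a simple pattern match. This helper is a sketch; the real configuration keys in db-sync.json may differ.

```lua
-- Split "table-recordtype" into its two parts; names without a
-- suffix (e.g. "company") return the name and nil.
local function splitSyncName(name)
    local tbl, recType = name:match("^(%w+)%-(%w+)$")
    return tbl or name, recType
end

local tbl, recType = splitSyncName("product-material")
-- tbl == "product", recType == "material"
```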

Sync Planning Algorithm

The sync planning algorithm intelligently determines what operations to perform based on record counts, timestamps, and configuration settings.

Decision Matrix

flowchart TD A[Compare Record Counts<br/>source_count vs target_count] --> B{target_count == 0<br/>AND source_count > 0?} B -->|Yes| C[ADD ALL<br/>Destination empty] B -->|No| D{from_count < to_count?} D -->|Yes| E[DELETE → ADD/MODIFY<br/>Destination has extras] D -->|No| F{from_count == to_count<br/>AND trusted_modify_id?} F -->|Yes| G[INCREMENTAL SYNC<br/>Trust timestamps] F -->|No| H{from_count == to_count<br/>AND !trusted_modify_id?} H -->|Yes| I[FULL COMPARE<br/>Verify all records] H -->|No| J{from_count > to_count<br/>AND changes > 0?} J -->|Yes| K[ADD → INCREMENTAL/MODIFY<br/>Source has more + changes] J -->|No| L[FULL COMPARE<br/>Complex scenario] style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000 style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style C fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000 style D fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style E fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000 style F fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style G fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000 style H fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style I fill:#fff3e0,stroke:#fbc02d,stroke-width:3px,color:#000 style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style K fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000 style L fill:#fce4ec,stroke:#7b1fa2,stroke-width:3px,color:#000
Condition                                             Action                      Reason
target_count == 0 && source_count > 0                 ADD ALL                     Target empty
source_count < target_count                           DELETE → ADD/MODIFY         Target has extras
source_count == target_count && trusted_modify_id     INCREMENTAL                 No count change, trust timestamps
source_count == target_count && !trusted_modify_id    FULL COMPARE                No count change, verify all
source_count > target_count && changes > 0            ADD → INCREMENTAL/MODIFY    Source has more + changes
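The decision matrix can be sketched as a single Lua function. Function and parameter names here are illustrative, not the actual db-sync.lua internals; the return values follow the plan arrays shown in the scenarios below.

```lua
-- Map count/timestamp conditions to a sync plan, mirroring the
-- decision matrix above.
local function planSync(sourceCount, targetCount, changes, trustModifyId)
    if targetCount == 0 and sourceCount > 0 then
        return {"add"}                      -- target empty => add all
    elseif sourceCount < targetCount then
        local plan = {"delete"}             -- target has extras
        if changes > 0 then
            plan[#plan + 1] = "add"
            plan[#plan + 1] = "changed"
        end
        return plan
    elseif sourceCount == targetCount then
        if trustModifyId then
            return {"incremental"}          -- trust timestamps
        end
        return {"add", "changed"}           -- verify all records
    elseif changes > 0 then
        local plan = {"add", "incremental"} -- source has more + changes
        if not trustModifyId then
            plan[#plan + 1] = "changed"     -- untrusted: also full compare
        end
        return plan
    end
    return {"add", "changed"}               -- complex scenario: full compare
end
```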

Trust Modify ID Logic

local trustPrev = syncPrf.trust_modify_id == true and hasPrevModifyId
if trustPrev then
    -- Fast incremental sync using modify timestamps
    plan = {"incremental"}
else
    -- Full comparison required
    plan = {"add", "changed"}
end

Sync Plan Examples

Scenario 1: Initial Sync (Empty Destination)

source=5000, target=0, changed=5000
Plan: ["add"]
Reason: "target empty => add all"

Scenario 2: Records Deleted from Source

source=4990, target=5000, changed=0
Plan: ["delete"]
Reason: "target has more rows => delete extras"

Scenario 3: Trusted Incremental Sync

source=5010, target=5000, changed=10, trust_modify_id=true
Plan: ["add", "incremental"]
Reason: "source has more rows & trusted modify id"

Scenario 4: Untrusted Full Sync

source=5010, target=5000, changed=10, trust_modify_id=false
Plan: ["add", "incremental", "changed"]
Reason: "source has more rows & untrusted modify id"

Performance Optimizations

1. Binary Search Engine

The binary search optimization dramatically reduces query complexity for large datasets:

Traditional approach: O(n) - read all IDs from both databases
Binary search: O(log n) - recursively narrow down differences

When Binary Search is Used:

Binary Search Process:

  1. Find midpoint record using OFFSET/LIMIT
  2. Count records in upper/lower halves in both databases
  3. If counts differ, recursively search that half
  4. Continue until range ≤ binary_search_read_batch (default: 500)
  5. Process final batch directly
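Step 1, finding the midpoint record with OFFSET/LIMIT, amounts to building a query like the one below. The SQL template and helper name are illustrative; the real code presumably goes through the connection layer.

```lua
-- Build the midpoint query for a range of rangeCount records,
-- ordered by the ID field so ranges split deterministically.
local function midpointSql(tableName, idField, rangeCount)
    local offset = math.floor(rangeCount / 2)
    return string.format(
        "SELECT %s FROM %s ORDER BY %s LIMIT 1 OFFSET %d",
        idField, tableName, idField, offset)
end

print(midpointSql("product", "record_id", 100000))
-- SELECT record_id FROM product ORDER BY record_id LIMIT 1 OFFSET 50000
```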

2. Batch Processing

All data operations use configurable batch sizes for optimal performance:

-- Batch size configuration (per database type)
batch_size: 5000           -- General operations
batch_size_4d: 1000        -- 4D database operations
delete_batch_size: 1000    -- Delete operations
delete_batch_size_4d: 500  -- 4D delete operations
id_array_batch_size: 25000 -- ID array operations

3. Recent-First Optimization

When using binary search, the system can prioritize recent records:

binary_search_recent_first: 5000  -- Search last 5000 records first

This is highly effective because most changes occur in recently created records.

4. Connection Pooling

Database connections are reused across operations to minimize connection overhead:

-- Connection reuse pattern
local conn = dconn.connection({organizationId = dbId})
-- Multiple operations using same connection
dconn.disconnectAll() -- Clean up at end

5. Incremental Synchronization

When trust_modify_id = true, only records modified since the last sync are processed:

-- Query with modify time filter
WHERE modify_time > last_sync_modify_time
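Building that filter is straightforward; the sketch below concatenates the timestamp only for illustration, whereas the real code presumably binds it as a query parameter.

```lua
-- Build the incremental WHERE clause from the stored last-sync time.
local function incrementalFilter(modifyField, lastSyncTime)
    return string.format("WHERE %s > '%s'", modifyField, lastSyncTime)
end
```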

6. Memory Management

-- Explicit garbage collection between tables
collectgarbage()
-- Large arrays are processed in chunks
for i = 1, #largeArray, batchSize do
    local batch = table.slice(largeArray, i, i + batchSize - 1)
    processBatch(batch)
end

Binary Search Engine

Algorithm Overview

flowchart TD A[Start Binary Search<br/>Full Range: 1 to N] --> B[Initialize Stack with<br/>Full Range] B --> C{Stack<br/>Empty?} C -->|Yes| X[Return Results] C -->|No| D[Pop Range from Stack] D --> E{Range Size <=<br/>batch_size?} E -->|Yes| F[Process Range Directly<br/>Read & Compare Records] E -->|No| G[Find Midpoint ID<br/>using OFFSET/LIMIT] F --> H[Find Differences<br/>Add to Results] H --> C G --> I[Count Records in<br/>Lower Half 1 to mid] I --> J[Count Records in<br/>Upper Half mid to N] J --> K{Lower Half<br/>has differences?} K -->|Yes| L[Push Lower Range<br/>to Stack] K -->|No| M{Upper Half<br/>has differences?} L --> M M -->|Yes| N[Push Upper Range<br/>to Stack] M -->|No| C N --> C style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000 style X fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000 style F fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000 style G fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000 style B fill:#f3e5f5,color:#000 style C fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style D fill:#e0f2f1,color:#000 style E fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style H fill:#fff3e0,color:#000 style I fill:#e8f5e8,color:#000 style J fill:#e8f5e8,color:#000 style K fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style L fill:#e8f5e8,color:#000 style M fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000 style N fill:#e8f5e8,color:#000 subgraph "Progress Display" P1[2.8%↑5000 - Upper Range] P2[97%↓174197 - Lower Range] end style P1 fill:#e8f5e8,color:#000 style P2 fill:#ffebee,color:#000

Binary Search Process Visualization

flowchart TD subgraph "Database Range: 1 to 100,000" A1[Full Range<br/>1 ↔ 100,000<br/>Check counts differ] A1 --> B1[Lower Half<br/>1 ↔ 50,000<br/>Counts match ✓] A1 --> B2[Upper Half<br/>50,001 ↔ 100,000<br/>Counts differ ✗] B2 --> C1[Range: 50,001 ↔ 75,000<br/>Counts match ✓] B2 --> C2[Range: 75,001 ↔ 100,000<br/>Counts differ ✗] C2 --> D1[Range: 75,001 ↔ 87,500<br/>Counts differ ✗] C2 --> D2[Range: 87,501 ↔ 100,000<br/>Counts match ✓] D1 --> E1[Range: 75,001 ↔ 81,250<br/>Size ≤ 500<br/>Process directly] D1 --> E2[Range: 81,251 ↔ 87,500<br/>Size ≤ 500<br/>Process directly] end style A1 fill:#e1f5fe,stroke:#1976d2,stroke-width:2px,color:#000 style B1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000 style C1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000 style D2 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000 style B2 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000 style C2 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000 style D1 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000 style E1 fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000 style E2 fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000

Algorithm Details

The binary search engine is the most sophisticated optimization in the system:

function binarySearch(syncRec, fromId, toId, operation)
    local stack = {}
    local resultArr = {}
    -- Initialize with full range
    stack[#stack + 1] = {
        startId = "",
        endId = "",
        toCount = totalToCount,
        fromCount = totalFromCount,
        depth = 0
    }
    while #stack > 0 and #resultArr < expectedResults do
        local range = table.remove(stack)
        if range.toCount <= batchSize then
            -- Small range: process directly
            local records = getRecordsInRange(range.startId, range.endId)
            local differences = findDifferences(records)
            table.append(resultArr, differences)
        else
            -- Large range: subdivide
            local midId = getMidpointId(range.startId, range.endId)
            local lowerCounts = countRecordsInRange(range.startId, midId)
            local upperCounts = countRecordsInRange(midId, range.endId)

            -- Add ranges with differences to stack
            if hasRecordDifferences(lowerCounts) then
                stack[#stack + 1] = createLowerRange(range, midId, lowerCounts)
            end
            if hasRecordDifferences(upperCounts) then
                stack[#stack + 1] = createUpperRange(range, midId, upperCounts)
            end
        end
    end
    return resultArr
end

Binary Search Advantages

  1. Logarithmic Complexity: O(log n) vs O(n) for full scans
  2. Network Efficiency: Fewer large data transfers
  3. Memory Efficiency: Process small chunks instead of entire datasets
  4. Interruptible: Can stop early when target count reached
  5. Progress Tracking: Real-time feedback on search progress

Binary Search Limitations

  1. Query Overhead: Many small queries vs few large ones
  2. Not Always Optimal: Small differences may be faster with full scan
  3. Database Compatibility: Requires OFFSET/LIMIT support
  4. ID Ordering: Assumes ID-based ordering for range splits
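These trade-offs imply a selection heuristic: use binary search only when the table is large and the difference is a small fraction of it. The sketch below is hedged, the 10% cut-off and function name are assumptions, not taken from the source.

```lua
-- Heuristic: binary search pays off for large tables with few
-- differences; small tables or large diffs favor a full scan.
local function shouldUseBinarySearch(cfg, fromCount, toCount)
    local diff = math.abs(fromCount - toCount)
    return fromCount >= (cfg.binary_search_min_table_size or 1000)
       and diff > 0
       and diff < fromCount / 10   -- assumed cut-off, not from the source
end
```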

Data Movement Pipeline

Pipeline Overview

flowchart LR subgraph "READ PHASE" A[Read Source Data<br/>in Batches] B[Read Target IDs<br/>for Comparison] A --> C[Source Data Array] B --> D[Target ID Index] end subgraph "TRANSFORM PHASE" C --> E[Compare Records] D --> E E --> F[Records to ADD] E --> G[Records to MODIFY] E --> H[Records to DELETE] end subgraph "WRITE PHASE" F --> I[Batch INSERT<br/>Operations] G --> J[Batch UPDATE<br/>Operations] H --> K[Batch DELETE<br/>Operations] end subgraph "SCHEMA TRANSLATION" L[Field Mapping] M[Type Conversion] N[Data Validation] L --> M --> N N --> I N --> J end style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000 style B fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000 style C fill:#e3f2fd,color:#000 style D fill:#e3f2fd,color:#000 style E fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#000 style F fill:#e8f5e8,color:#000 style G fill:#fff8e1,color:#000 style H fill:#ffebee,color:#000 style I fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000 style J fill:#fff8e1,stroke:#fbc02d,stroke-width:2px,color:#000 style K fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000 style L fill:#f3e5f5,color:#000 style M fill:#f3e5f5,color:#000 style N fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000

Data Flow Sequence

sequenceDiagram participant S as Source DB participant E as Sync Engine participant D as Destination DB Note over E: READ PHASE E->>S: Read batch 1 (records 1-5000) S-->>E: Return source records E->>S: Read batch 2 (records 5001-10000) S-->>E: Return source records E->>D: Read all IDs for comparison D-->>E: Return ID array Note over E: TRANSFORM PHASE E->>E: Build target ID index E->>E: Compare source vs target E->>E: Categorize: ADD/MODIFY/DELETE E->>E: Apply schema transformations Note over E: WRITE PHASE E->>D: Batch DELETE (if any) D-->>E: Confirm deletions E->>D: Batch INSERT (new records) D-->>E: Confirm insertions E->>D: Batch UPDATE (modified records) D-->>E: Confirm updates Note over E: VERIFICATION E->>D: Verify final record counts D-->>E: Return counts

Read Phase

-- 1. Read source data in batches
local sourceData = {}
for batchStart = 1, totalCount, batchSize do
    local batch = readBatch(fromId, batchStart, batchSize)
    table.append(sourceData, batch)
end
-- 2. Read destination IDs for comparison
local destIds = readIdArray(toId, idField)
local destIdIndex = invertTable(destIds)
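invertTable above turns the destination ID array into an O(1) lookup table. A minimal version might look like this:

```lua
-- Invert an array of IDs into a map from ID to array position,
-- so membership checks become constant time.
local function invertTable(arr)
    local index = {}
    for i, id in ipairs(arr) do
        index[id] = i
    end
    return index
end

local idx = invertTable({"a1", "b2", "c3"})
-- idx["b2"] == 2, idx["missing"] == nil
```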

Transform Phase

-- 3. Compare and categorize records
local toAdd = {}
local toModify = {}
local toDelete = {}
local sourceIdIndex = {}
for _, sourceRecord in ipairs(sourceData) do
    local id = sourceRecord[idField]
    sourceIdIndex[id] = true
    if destIdIndex[id] then
        -- Record exists on both sides: compare against the destination copy
        local destRecord = readDestRecord(toId, id) -- helper name illustrative
        if recordsAreModified(sourceRecord, destRecord) then
            toModify[#toModify + 1] = sourceRecord
        end
    else
        toAdd[#toAdd + 1] = sourceRecord
    end
end
-- Find records to delete (in destination but not source)
for destId in pairs(destIdIndex) do
    if not sourceIdIndex[destId] then
        toDelete[#toDelete + 1] = destId
    end
end

Write Phase

-- 4. Execute operations in optimal order
if #toDelete > 0 then
    executeBatchDeletes(toId, toDelete, deleteBatchSize)
end
if #toAdd > 0 then
    executeBatchInserts(toId, toAdd, insertBatchSize)
end
if #toModify > 0 then
    executeBatchUpdates(toId, toModify, updateBatchSize)
end

Schema Translation

-- 5. Handle schema differences during transform
local transformedRecord = {}
for fieldName, fieldValue in pairs(sourceRecord) do
    local destFieldName = fieldMapping[fieldName] or fieldName
    local destFieldType = destSchema[destFieldName]
    transformedRecord[destFieldName] = convertFieldValue(
        fieldValue,
        sourceSchema[fieldName],
        destFieldType
    )
end
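convertFieldValue is not shown in full above. A minimal sketch covering a few common conversions might look like this; the actual type system in db-sync.lua is certainly richer.

```lua
-- Convert a field value from the source type to the destination type.
-- Unknown conversions pass the value through unchanged.
local function convertFieldValue(value, srcType, destType)
    if value == nil or srcType == destType then
        return value
    end
    if destType == "text" then
        return tostring(value)
    elseif destType == "number" then
        return tonumber(value)
    elseif destType == "boolean" then
        return value == true or value == "true" or value == 1
    end
    return value
end
```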

Sleep and Scheduling Management

The database synchronization system includes sophisticated sleep and scheduling capabilities that enable automated, recurring synchronization with intelligent timing management.

Sleep System Architecture

flowchart TD
    A[Sync Complete] --> B{Scheduling<br/>Configuration?}
    B -->|sleep_seconds| C[Interval-based Sleep]
    B -->|run_at_time| D[Time-based Sleep]
    B -->|None| E[Exit]
    C --> F[smartSleep Function]
    D --> F
    F --> G[Calculate Sleep Duration]
    G --> H{Duration > max_chunk?}
    H -->|Yes| I[Sleep in Chunks]
    H -->|No| J[Single Sleep]
    I --> K[Sleep max_chunk seconds]
    K --> L{Target Time<br/>Reached?}
    L -->|No| M[Progress Update]
    M --> N{More Sleep<br/>Needed?}
    N -->|Yes| K
    N -->|No| O[Wake Up]
    L -->|Yes| O
    J --> O
    O --> P[Start Next Sync]

Configuration Options

Interval-based Scheduling

sleep_seconds: Sets automatic recurring synchronization intervals

{
    "sleep_seconds": 3600    // Run every hour
}

Use Cases:

Time-based Scheduling

run_at_time: Schedule sync at specific times

{
    "run_at_time": "02:00"   // Run at 2:00 AM daily
}

Use Cases:

Chunked Sleep Management

max_sleep_chunk_seconds: Maximum sleep duration per chunk (default: 3600 seconds)

{
    "max_sleep_chunk_seconds": 3600  // 1-hour maximum chunks
}

Benefits:

Smart Sleep Implementation

Core Algorithm

function smartSleep(totalSeconds, maxChunkSeconds)
    maxChunkSeconds = maxChunkSeconds or 3600  -- Default 1 hour
    while totalSeconds > 0 do
        local chunkSeconds = math.min(totalSeconds, maxChunkSeconds)
        -- Sleep for this chunk
        util.sleep(chunkSeconds * 1000)
        totalSeconds = totalSeconds - chunkSeconds
        -- Check if target time reached (for run_at_time)
        if shouldWakeUpEarly() then
            break
        end
        -- Progress update
        if totalSeconds > 0 then
            print(string.format("Sleep progress: %d seconds remaining", totalSeconds))
        end
    end
end

Early Wake-up Logic

For run_at_time scheduling, the system can wake up early when the target time is reached:

-- Example: Scheduled for 02:00, but it's now 01:59
-- System wakes up 1 minute early instead of sleeping full hour
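Computing the sleep duration for run_at_time reduces to "seconds until the next HH:MM", rolling over to tomorrow if the time has already passed. This sketch uses only standard Lua os.date/os.time; the function name is an assumption.

```lua
-- Seconds until the next occurrence of a "HH:MM" local time.
local function secondsUntil(targetHHMM, now)
    now = now or os.time()
    local h, m = targetHHMM:match("^(%d+):(%d+)$")
    local t = os.date("*t", now)
    t.hour, t.min, t.sec = tonumber(h), tonumber(m), 0
    local target = os.time(t)
    if target <= now then
        target = target + 24 * 3600   -- already passed today: schedule tomorrow
    end
    return target - now
end
```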

Scheduling Scenarios

Scenario 1: Hourly Sync

{
    "sleep_seconds": 3600,
    "max_sleep_chunk_seconds": 1800  // 30-minute chunks
}

Behavior:

Scenario 2: Daily 2 AM Sync

{
    "run_at_time": "02:00",
    "max_sleep_chunk_seconds": 3600  // 1-hour chunks
}

Behavior:

Scenario 3: Long Sleep with Monitoring

{
    "sleep_seconds": 86400,           // 24 hours
    "max_sleep_chunk_seconds": 3600   // 1-hour chunks
}

Behavior:

Progress Monitoring

Sleep Progress Display

Sync complete. Sleeping 86400 seconds until next run...
Sleep progress: 82800 seconds remaining (23.0 hours)
Sleep progress: 79200 seconds remaining (22.0 hours)
Sleep progress: 75600 seconds remaining (21.0 hours)
...
Sleep progress: 3600 seconds remaining (1.0 hours)
Sleep complete. Starting next sync cycle...

Time-based Progress

Sync complete. Waiting until 02:00 for next run...
Current time: 22:30, target: 02:00 (3.5 hours remaining)
Sleep progress: 10800 seconds remaining (3.0 hours)
Sleep progress: 7200 seconds remaining (2.0 hours)
Sleep progress: 3600 seconds remaining (1.0 hours)
Target time reached. Starting sync at 02:00...

Configuration Best Practices

Sleep Configuration Guidelines

  1. Choose appropriate chunk sizes:

  2. Monitor resource usage:

  3. Error handling integration:

Performance Considerations

Troubleshooting

Common Issues

Problem: Sleep seems to hang indefinitely
Solution: Check max_sleep_chunk_seconds configuration

Problem: Sync doesn't start at expected time
Solution: Verify run_at_time format and timezone settings

Problem: Too many progress updates
Solution: Increase max_sleep_chunk_seconds value

Problem: System doesn't wake up early
Solution: Ensure early wake-up logic is properly configured for run_at_time


Error Handling & Recovery

Error Recovery Flow

flowchart TD
    A[Operation Starts] --> B{Error<br/>Occurred?}
    B -->|No| C[Operation Successful]
    B -->|Yes| D[Categorize Error Type]
    D --> E{Connection<br/>Error?}
    D --> F{Schema<br/>Error?}
    D --> G{Data<br/>Error?}
    D --> H{Logic<br/>Error?}
    E -->|Yes| E1[Disconnect All<br/>Connections]
    E1 --> E2[Wait 30 seconds]
    E2 --> E3[Retry Connection]
    E3 --> I{Retry<br/>Successful?}
    F -->|Yes| F1[Check Schema<br/>Compatibility]
    F1 --> F2[Apply Schema<br/>Transformations]
    F2 --> J{Schema<br/>Fixed?}
    G -->|Yes| G1[Isolate Failed<br/>Record]
    G1 --> G2[Continue with<br/>Next Record]
    G2 --> K{Error Count <br/>max_error_count?}
    H -->|Yes| H1[Check Binary Search<br/>Settings]
    H1 --> H2{Fallback<br/>Enabled?}
    H2 -->|Yes| H3[Fallback to<br/>Full Compare]
    H2 -->|No| L[Fail Operation]
    I -->|Yes| M[Resume Operation]
    I -->|No| L
    J -->|Yes| M
    J -->|No| L
    K -->|Yes| M
    K -->|No| L
    H3 --> M
    M --> B
    C --> N[End Success]
    L --> O[End Failure]
    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style C fill:#c8e6c9,color:#000
    style D fill:#fff8e1,color:#000
    style E fill:#ffebee,color:#000
    style F fill:#e8f5e8,color:#000
    style G fill:#f3e5f5,color:#000
    style H fill:#fff3e0,color:#000
    style E1 fill:#ffebee,color:#000
    style E2 fill:#ffebee,color:#000
    style E3 fill:#ffebee,color:#000
    style F1 fill:#e8f5e8,color:#000
    style F2 fill:#e8f5e8,color:#000
    style G1 fill:#f3e5f5,color:#000
    style G2 fill:#f3e5f5,color:#000
    style H1 fill:#fff3e0,color:#000
    style H2 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style H3 fill:#fff3e0,color:#000
    style I fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style K fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style L fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000
    style M fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style N fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style O fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000

Error Categories

  1. Connection Errors: Database unavailable, network issues
  2. Schema Errors: Missing tables, incompatible field types
  3. Data Errors: Constraint violations, invalid values
  4. Logic Errors: Unexpected conditions, algorithm failures

Error Recovery Mechanisms

1. Automatic Fallback

-- Binary search fallback to full compare
if binarySearchFailed and syncPrf.fallback_full_compare_on_mismatch then
    util.printWarning("Binary search failed, falling back to full compare")
    return syncCompare(syncRec, tbl, schema, fromId, toId, fldArr, stat, operation)
end

2. Batch Error Isolation

-- Continue processing other records when individual records fail
saveParam.parameter.continue_on_error = syncPrf.max_error_count
-- Process remaining batches even if one batch fails
-- (Lua has no `continue` statement, so the loop skips the failed batch)
if batchError then
    stat.errorCount = stat.errorCount + 1
    if stat.errorCount >= syncPrf.max_error_count then
        break -- error threshold reached: stop processing this table
    end
    -- otherwise fall through and process the next batch
end

3. Connection Recovery

-- Automatic reconnection on connection loss
if connectionLost then
    dconn.disconnectAll()
    defaultConn = nil
    util.sleep(30000) -- Wait 30 seconds
    -- Retry connection on next iteration
end

4. Transaction Rollback

-- Rollback incomplete operations
if operationFailed then
    if transactionActive then
        rollbackTransaction()
    end
    -- Restore previous state
    restoreFromBackup()
end

Error Thresholds

max_error_count: 25      -- Stop after 25 errors
max_error_loop: 5        -- Stop after 5 consecutive error loops
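The two thresholds above might interact as in the sketch below; the function and field names are illustrative.

```lua
-- Abort once either the total error count or the consecutive
-- error-loop count crosses its configured threshold.
local function shouldAbort(stat, cfg)
    return stat.errorCount >= (cfg.max_error_count or 25)
        or stat.errorLoopCount >= (cfg.max_error_loop or 5)
end
```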

Error Reporting

-- Comprehensive error tracking
local errorTbl = {}
errorTbl[#errorTbl + 1] = util.printError("Error description: %s", details)
-- Error summary at end
if #errorTbl > 0 then
    util.printRed("Sync completed with %d errors:", #errorTbl)
    for i, err in ipairs(errorTbl) do
        util.printRed("  %d. %s", i, err)
    end
end

Common Problems & Solutions

Problem 1: Slow Synchronization

Symptoms:

Diagnosis:

-- Check if binary search is being used
binary_search_for_add: true
binary_search_min_table_size: 1000  -- Lower for smaller tables
-- Monitor query counts in logs
-- Look for "using full compare" vs "using binary search"

Solutions:

  1. Enable binary search for large tables
  2. Increase batch sizes for better throughput
  3. Use incremental sync with trust_modify_id: true
  4. Optimize database indexes on ID and modify_time fields
  5. Reduce table scope with sync_table configuration

Problem 2: Memory Issues

Symptoms:

Solutions:

-- Reduce batch sizes
batch_size: 1000         -- Down from 5000
id_array_batch_size: 5000 -- Down from 25000
-- Force garbage collection
collectgarbage() -- Called between tables automatically

Problem 3: Data Inconsistencies

Symptoms:

Diagnosis:

-- Check count mismatches in logs
"counts mismatch after binary delete"
"fallback full compare on mismatch"
-- Verify ID field configuration
primary_key_field: "record_id"  -- Ensure correct field

Solutions:

  1. Verify ID field mapping between databases
  2. Check for ID conflicts (duplicate IDs)
  3. Enable fallback mechanism: fallback_full_compare_on_mismatch: true
  4. Use full compare mode temporarily: trust_modify_id: false

Problem 4: Schema Conflicts

Symptoms:

Solutions:

-- Map incompatible fields
"field_mapping": {
    "source_field": "dest_field",
    "varchar_field": "text_field"
}
-- Handle schema differences
set_default_value: true  -- Set defaults for missing fields
copy_all_json_keys: true -- Handle JSON field differences

Problem 5: Connection Timeouts

Symptoms:

Solutions:

-- Increase timeout settings (in connection config)
"timeout": 300,          -- 5 minutes
"retry_count": 3,        -- Retry failed operations
"retry_delay": 5000      -- Wait 5 seconds between retries

Monitoring & Debugging

Performance Metrics

The system provides comprehensive performance tracking:

-- Timing metrics
readTimeAll              -- Total time reading data
writeTimeAll             -- Total time writing data
deleteTimeAll            -- Total time deleting data
dataCompareTimeAll       -- Total time comparing data
binarySearchTimeAll      -- Total time in binary search
-- Operation counts
queryCountAll            -- Total database queries
recordCountAllTables     -- Total records processed
recordCountAllAdded      -- Total records added
recordCountAllModified   -- Total records modified
recordCountAllDeleted    -- Total records deleted
recordCountAllSkipped    -- Total records skipped
-- Binary search metrics
binarySearchCountAll     -- Number of binary searches
binarySearchIterationCount -- Total search iterations
binarySearchQueryCount   -- Queries used in binary search
binarySearchMaxDepthAll  -- Maximum search depth reached

Debug Configuration

-- Enable detailed logging
show_sql: true           -- Show SQL queries
debug_sql: true          -- Show SQL execution details
show_save_sql: true      -- Show save operation SQL
debug_connection_change: true  -- Show connection changes
-- Performance analysis
only_sync_plan: true     -- Plan-only mode (no actual sync)
check_all_tables: true   -- Check all tables regardless of changes

Progress Monitoring

Real-time progress display shows:

delete binary search 179196 records, search 5000 recent first, need to find 2:
2.8%↑5000 97%↓174197 49%↑87097 24%↑43547 12%↓21775 6.1%↑10886

Log Analysis

Key log patterns to monitor:

# Successful operations
"records to sync: 100 / 5000, records synced: 4900 / 5000"
"using binary search for adds (from=5100, to=5000, diff=100)"
# Performance warnings
"using full compare for adds (reason=large difference)"
"fallback full compare on mismatch"
# Error conditions
"Binary search: Invalid toCount=0"
"sync connection failed"
"maximum error count 25 was reached"

Best Practices

1. Configuration Optimization

Start with Conservative Settings:

{
    "batch_size": 1000,
    "binary_search_min_table_size": 5000,
    "trust_modify_id": false,
    "fallback_full_compare_on_mismatch": true
}

Scale Up Gradually:

{
    "batch_size": 5000,        // Increase as performance allows
    "trust_modify_id": true,   // Enable after initial sync
    "binary_search_recent_first": 5000  // Focus on recent changes
}

2. Database Optimization

Essential Indexes:

-- Primary key index (usually automatic)
CREATE INDEX idx_table_record_id ON table_name (record_id);

-- Modify time index for incremental sync
CREATE INDEX idx_table_modify_time ON table_name (modify_time);

-- Composite index for range queries
CREATE INDEX idx_table_id_modify ON table_name (record_id, modify_time);

Statistics Updates:

-- Keep database statistics current
ANALYZE TABLE table_name;
UPDATE STATISTICS table_name;

3. Incremental Sync Strategy

Phase 1: Initial Sync

{
    "trust_modify_id": false,      // Full compare for accuracy
    "binary_search_for_add": true, // Use optimization
    "sync_all": true               // Sync everything
}

Phase 2: Ongoing Sync

{
    "trust_modify_id": true,      // Fast incremental
    "sleep_seconds": 3600,        // Hourly sync
    "min_rows_to_sync": 1         // Skip if no changes
}

4. Error Prevention

Validate Configuration:

-- Test with single table first
"sync_table": ["test_table"],
"only_sync_plan": true  -- Dry run mode
-- Enable all safety features
"fallback_full_compare_on_mismatch": true,
"max_error_count": 5,    -- Low threshold for testing
"continue_on_error": false -- Stop on first error

Monitor Resource Usage:

# Monitor memory usage
top -p $(pgrep lj)
# Monitor disk I/O
iostat -x 1
# Monitor network traffic
netstat -i

5. Maintenance Schedule

Daily:

Weekly:

Monthly:


Conclusion

The database synchronization system is a robust, production-ready solution for keeping multiple databases in sync. Its sophisticated algorithms, comprehensive error handling, and extensive optimization features make it suitable for demanding enterprise environments.

Key success factors:

  1. Proper configuration for your specific environment
  2. Adequate database indexing for performance
  3. Regular monitoring and maintenance
  4. Gradual optimization based on actual performance
  5. Comprehensive testing before production deployment

For additional support and configuration examples, refer to the db-sync.json configuration manual and the binary search progress display manual.