# Database Synchronization System - Complete Manual

## Table of Contents

1. [System Overview](#system-overview)
2. [Core Synchronization Logic](#core-synchronization-logic)
3. [Sync Planning Algorithm](#sync-planning-algorithm)
4. [Performance Optimizations](#performance-optimizations)
5. [Binary Search Engine](#binary-search-engine)
6. [Data Movement Pipeline](#data-movement-pipeline)
7. [Sleep and Scheduling Management](#sleep-and-scheduling-management)
8. [Error Handling & Recovery](#error-handling--recovery)
9. [Common Problems & Solutions](#common-problems--solutions)
10. [Monitoring & Debugging](#monitoring--debugging)
11. [Best Practices](#best-practices)

---

## System Overview

The database synchronization system (`db-sync.lua`) synchronizes data between database systems that differ in schema and architecture. Supported endpoints include 4D, PostgreSQL, SQLite, and REST APIs.

### Key Features

- **Multi-database support**: 4D, PostgreSQL, SQLite, REST APIs
- **Schema-aware synchronization**: Handles different field types and constraints
- **Binary search optimization**: Efficient algorithms for large datasets
- **Incremental synchronization**: Uses modify timestamps for partial syncs
- **Batch processing**: Configurable batch sizes for optimal performance
- **Error recovery**: Automatic fallback mechanisms and retry logic
- **Real-time monitoring**: Progress tracking and performance metrics

### Architecture

```mermaid
flowchart TD
    A[Source Database<br/>SOURCE] --> C[Sync Engine<br/>db-sync.lua]
    C --> B[Target Database<br/>TARGET]
    C --> D[Configuration<br/>db-sync.json]

    subgraph "Database Types"
        A1[4D Database]
        A2[PostgreSQL]
        A3[SQLite]
        A4[REST API]
    end

    subgraph "Core Components"
        C1[Binary Search Engine]
        C2[Data Movement Pipeline]
        C3[Error Handling & Recovery]
        C4[Progress Monitoring]
    end

    A -.-> A1
    A -.-> A2
    A -.-> A3
    A -.-> A4

    C -.-> C1
    C -.-> C2
    C -.-> C3
    C -.-> C4

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000
    style C fill:#fff3e0,stroke:#f57c00,stroke-width:4px,color:#000
    style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
    style A1 fill:#e1f5fe,color:#000
    style A2 fill:#e1f5fe,color:#000
    style A3 fill:#e1f5fe,color:#000
    style A4 fill:#e1f5fe,color:#000
    style C1 fill:#fff3e0,color:#000
    style C2 fill:#fff3e0,color:#000
    style C3 fill:#fff3e0,color:#000
    style C4 fill:#fff3e0,color:#000
```

---

## Core Synchronization Logic

### Main Execution Flow

```mermaid
flowchart TD
    A[Start] --> B[Load Configuration<br/>db-sync.json]
    B --> C[Establish Database<br/>Connections]
    C --> D[Validate Schema<br/>Compatibility]
    D --> E[Initialize Performance<br/>Counters]

    E --> F[Enumerate Tables<br/>to Synchronize]
    F --> G[Count Records in<br/>Source & Target]
    G --> H[Calculate Modification<br/>Timestamps]
    H --> I[Build Synchronization<br/>Plan]

    I --> J{Planning<br/>Complete?}
    J -->|No| F
    J -->|Yes| K[Analyze Count<br/>Differences]

    K --> L[Determine Sync Operations<br/>add/delete/modify]
    L --> M[Choose Optimization<br/>Strategies]
    M --> N[Generate Execution<br/>Plan]

    N --> O[Execute Sync Plan<br/>in Optimal Order]
    O --> P[Apply Binary Search<br/>Optimizations]
    P --> Q[Batch Process<br/>Data Movements]
    Q --> R[Track Progress &<br/>Performance]

    R --> S{More<br/>Tables?}
    S -->|Yes| O
    S -->|No| T[Validate Sync<br/>Results]

    T --> U[Check Record<br/>Counts]
    U --> V[Report Statistics<br/>& Errors]
    V --> W[End]

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style W fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style S fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style B fill:#f3e5f5,color:#000
    style C fill:#f3e5f5,color:#000
    style D fill:#f3e5f5,color:#000
    style E fill:#f3e5f5,color:#000
    style F fill:#e8f5e8,color:#000
    style G fill:#e8f5e8,color:#000
    style H fill:#e8f5e8,color:#000
    style I fill:#e8f5e8,color:#000
    style K fill:#fff8e1,color:#000
    style L fill:#fff8e1,color:#000
    style M fill:#fff8e1,color:#000
    style N fill:#fff8e1,color:#000
    style O fill:#e0f2f1,color:#000
    style P fill:#e0f2f1,color:#000
    style Q fill:#e0f2f1,color:#000
    style R fill:#e0f2f1,color:#000
    style T fill:#ffebee,color:#000
    style U fill:#ffebee,color:#000
    style V fill:#ffebee,color:#000
```

1. **Initialization Phase**
   - Load configuration from `db-sync.json`
   - Establish database connections
   - Validate schema compatibility
   - Initialize performance counters

2. **Discovery Phase**
   - Enumerate tables to synchronize
   - Count records in source and destination
   - Calculate modification timestamps
   - Build synchronization plan

3. **Planning Phase**
   - Analyze count differences
   - Determine sync operations needed (add/delete/modify)
   - Choose optimization strategies
   - Generate execution plan

4. **Execution Phase**
   - Execute sync plan in optimal order
   - Apply binary search optimizations where beneficial
   - Batch process data movements
   - Track progress and performance

5. **Verification Phase**
   - Validate sync results
   - Check record counts
   - Report statistics and errors

### Table Processing Order

Tables are processed in dependency order to maintain referential integrity:

```text
Example processing order:
1. Reference tables (currency, terms_of_payment)
2. Master data (company, product)
3. Transactional data (orders, invoices)
4. Detail records (order_row, invoice_row)
```
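One way to derive such an order is a topological sort over table references. The following is a minimal sketch, not the logic in `db-sync.lua`: the `deps` map and the function name `processingOrder` are hypothetical, and the real system may simply use a fixed, configured order.

```lua
-- Hypothetical dependency map: table name -> tables it references.
local deps = {
    currency = {},
    company = { "currency" },
    product = {},
    orders = { "company" },
    order_row = { "orders", "product" },
}

-- Depth-first topological sort: referenced tables are emitted before
-- the tables that reference them.
local function processingOrder(depMap)
    local order, state, names = {}, {}, {}
    for name in pairs(depMap) do names[#names + 1] = name end
    table.sort(names) -- make the result independent of pairs() order

    local function visit(name)
        if state[name] == "done" then return end
        if state[name] == "visiting" then
            error("circular dependency at table " .. name)
        end
        state[name] = "visiting"
        for _, ref in ipairs(depMap[name] or {}) do visit(ref) end
        state[name] = "done"
        order[#order + 1] = name
    end

    for _, name in ipairs(names) do visit(name) end
    return order
end

local order = processingOrder(deps)
print(table.concat(order, " -> "))
```

A cycle raises an error instead of producing a partial order, since no processing order can satisfy circular references.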

### Record Type Handling

The system supports multiple record types per table:

```text
Example: product table with multiple record types
product-work          -- Work-related products
product-material      -- Material products
product-service       -- Service products
```

---

## Sync Planning Algorithm

The sync planning algorithm chooses the operations to perform for each table based on record counts, modification timestamps, and configuration settings.

### Decision Matrix

```mermaid
flowchart TD
    A[Compare Record Counts<br/>source_count vs target_count] --> B{target_count == 0<br/>AND source_count > 0?}

    B -->|Yes| C[ADD ALL<br/>Target empty]
    B -->|No| D{source_count < target_count?}

    D -->|Yes| E[DELETE → ADD/MODIFY<br/>Target has extras]
    D -->|No| F{source_count == target_count<br/>AND trusted_modify_id?}

    F -->|Yes| G[INCREMENTAL SYNC<br/>Trust timestamps]
    F -->|No| H{source_count == target_count<br/>AND !trusted_modify_id?}

    H -->|Yes| I[FULL COMPARE<br/>Verify all records]
    H -->|No| J{source_count > target_count<br/>AND changes > 0?}

    J -->|Yes| K[ADD → INCREMENTAL/MODIFY<br/>Source has more + changes]
    J -->|No| L[FULL COMPARE<br/>Complex scenario]

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style C fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style D fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style E fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000
    style F fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style G fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style H fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style I fill:#fff3e0,stroke:#fbc02d,stroke-width:3px,color:#000
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style K fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000
    style L fill:#fce4ec,stroke:#7b1fa2,stroke-width:3px,color:#000
```

| Condition | Action | Reason |
|-----------|--------|--------|
| `target_count == 0 && source_count > 0` | **ADD ALL** | Target empty |
| `source_count < target_count` | **DELETE** → **ADD/MODIFY** | Target has extras |
| `source_count == target_count && trusted_modify_id` | **INCREMENTAL** | No count change, trust timestamps |
| `source_count == target_count && !trusted_modify_id` | **FULL COMPARE** | No count change, verify all |
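| `source_count > target_count && changes > 0` | **ADD** → **INCREMENTAL/MODIFY** | Source has more + changes |
| Any other case (e.g. `source_count > target_count && changes == 0`) | **FULL COMPARE** | Complex scenario |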

### Trust Modify ID Logic

```lua
local trustPrev = syncPrf.trust_modify_id == true and hasPrevModifyId
if trustPrev then
    -- Fast incremental sync using modify timestamps
    plan = {"incremental"}
else
    -- Full comparison required
    plan = {"add", "changed"}
end
```
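Taken together, the decision matrix and the trust check could be expressed as one function. This is a minimal sketch rather than the code in `db-sync.lua`: the name `chooseActions` and the action labels are illustrative, and reading `INCREMENTAL/MODIFY` as "incremental when trusted, otherwise modify" is an assumption.

```lua
-- Returns the list of actions for one table, following the decision matrix.
-- `trusted` stands for trusted_modify_id; `changes` is the number of
-- changed source rows. All names here are illustrative.
local function chooseActions(sourceCount, targetCount, changes, trusted)
    if targetCount == 0 and sourceCount > 0 then
        return { "add_all" }
    elseif sourceCount < targetCount then
        return { "delete", "add_or_modify" }
    elseif sourceCount == targetCount then
        return { trusted and "incremental" or "full_compare" }
    elseif changes > 0 then -- at this point sourceCount > targetCount
        return { "add", trusted and "incremental" or "modify" }
    end
    return { "full_compare" } -- remaining cases fall back to a full compare
end

print(table.concat(chooseActions(5000, 0, 5000, false), ", "))
print(table.concat(chooseActions(4990, 5000, 0, true), ", "))
print(table.concat(chooseActions(5010, 5000, 10, true), ", "))
```

Evaluating the conditions top to bottom mirrors the flowchart, so the first matching row wins.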

### Sync Plan Examples

#### Scenario 1: Initial Sync (Empty Destination)

```text
source=5000, target=0, changed=5000
Plan: ["add"]
Reason: "target empty => add all"
```

#### Scenario 2: Records Deleted from Source

```text
source=4990, target=5000, changed=0
Plan: ["delete"]
Reason: "target has more rows => delete extras"
```

#### Scenario 3: Trusted Incremental Sync

```text
source=5010, target=5000, changed=10, trust_modify_id=true
Plan: ["add", "incremental"]
Reason: "source has more rows & trusted modify id"
```

#### Scenario 4: Untrusted Full Sync

```text
source=5010, target=5000, changed=10, trust_modify_id=false
Plan: ["add", "incremental", "changed"]
Reason: "source has more rows & untrusted modify id"
```

---

## Performance Optimizations

### 1. Binary Search Engine

The binary search optimization reduces the amount of data read when comparing large tables:

- **Traditional approach**: O(n). Read all IDs from both databases.
- **Binary search**: roughly O(log n) count queries per differing region. Recursively narrow down differences.

**When Binary Search is Used**:

- Table size > `binary_search_min_table_size` (default: 1000)
- Count difference < `binary_search_max_diff_percent` (default: 50%)
- Not using incremental mode (`trust_modify_id != true`)
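The three conditions above can be read as a single predicate. The sketch below is an assumption-laden illustration: the function name `shouldUseBinarySearch` is hypothetical, and measuring the count difference as a percentage of the larger table is a guess at how `binary_search_max_diff_percent` is interpreted.

```lua
-- Defaults as documented above.
local defaults = {
    binary_search_min_table_size = 1000,
    binary_search_max_diff_percent = 50,
}

-- Hypothetical helper: decides whether binary search applies to a table.
local function shouldUseBinarySearch(sourceCount, targetCount, prf)
    prf = prf or {}
    local minSize = prf.binary_search_min_table_size
        or defaults.binary_search_min_table_size
    local maxDiff = prf.binary_search_max_diff_percent
        or defaults.binary_search_max_diff_percent

    if prf.trust_modify_id == true then return false end -- incremental mode
    local larger = math.max(sourceCount, targetCount)
    if larger <= minSize then return false end -- table too small
    -- Assumption: difference is measured relative to the larger table.
    local diffPercent = math.abs(sourceCount - targetCount) / larger * 100
    return diffPercent < maxDiff
end

print(shouldUseBinarySearch(100000, 99990))
print(shouldUseBinarySearch(500, 490))
```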

**Binary Search Process**:

1. Find midpoint record using OFFSET/LIMIT
2. Count records in upper/lower halves in both databases
3. If counts differ, recursively search that half
4. Continue until range ≤ `binary_search_read_batch` (default: 500)
5. Process final batch directly

### 2. Batch Processing

All data operations use configurable batch sizes for optimal performance:

```lua
-- Batch size configuration (per database type)
batch_size: 5000           -- General operations
batch_size_4d: 1000        -- 4D database operations
delete_batch_size: 1000    -- Delete operations
delete_batch_size_4d: 500  -- 4D delete operations
id_array_batch_size: 25000 -- ID array operations
```

### 3. Recent-First Optimization

When using binary search, the system can prioritize recent records:

```lua
binary_search_recent_first: 5000  -- Search last 5000 records first
```

This pays off when most changes occur in recently created records.

### 4. Connection Pooling

Database connections are reused across operations to minimize connection overhead:

```lua
-- Connection reuse pattern
local conn = dconn.connection({organizationId = dbId})
-- Multiple operations using same connection
dconn.disconnectAll() -- Clean up at end
```

### 5. Incremental Synchronization

When `trust_modify_id = true`, only records modified since the last sync are processed:

```sql
-- Query with modify time filter
WHERE modify_time > last_sync_modify_time
```

### 6. Memory Management

```lua
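-- Run a full garbage-collection cycle between tables
collectgarbage("collect")
-- Process large arrays in chunks. Standard Lua has no table.slice,
-- so each batch is copied with a plain loop.
for i = 1, #largeArray, batchSize do
    local batch = {}
    for j = i, math.min(i + batchSize - 1, #largeArray) do
        batch[#batch + 1] = largeArray[j]
    end
    processBatch(batch)
end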
```

---

## Binary Search Engine

### Algorithm Overview

```mermaid
flowchart TD
    A[Start Binary Search<br/>Full Range: 1 to N] --> B[Initialize Stack with<br/>Full Range]

    B --> C{Stack<br/>Empty?}
    C -->|Yes| X[Return Results]
    C -->|No| D[Pop Range from Stack]

    D --> E{Range Size <=<br/>batch_size?}
    E -->|Yes| F[Process Range Directly<br/>Read & Compare Records]
    E -->|No| G[Find Midpoint ID<br/>using OFFSET/LIMIT]

    F --> H[Find Differences<br/>Add to Results]
    H --> C

    G --> I[Count Records in<br/>Lower Half 1 to mid]
    I --> J[Count Records in<br/>Upper Half mid to N]

    J --> K{Lower Half<br/>has differences?}
    K -->|Yes| L[Push Lower Range<br/>to Stack]
    K -->|No| M{Upper Half<br/>has differences?}

    L --> M
    M -->|Yes| N[Push Upper Range<br/>to Stack]
    M -->|No| C
    N --> C

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style X fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style F fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style G fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
    style B fill:#f3e5f5,color:#000
    style C fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style D fill:#e0f2f1,color:#000
    style E fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style H fill:#fff3e0,color:#000
    style I fill:#e8f5e8,color:#000
    style J fill:#e8f5e8,color:#000
    style K fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style L fill:#e8f5e8,color:#000
    style M fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style N fill:#e8f5e8,color:#000

    subgraph "Progress Display"
        P1[2.8%↑5000 - Upper Range]
        P2[97%↓174197 - Lower Range]
    end

    style P1 fill:#e8f5e8,color:#000
    style P2 fill:#ffebee,color:#000
```

### Binary Search Process Visualization

```mermaid
flowchart TD
    subgraph "Database Range: 1 to 100,000"
        A1[Full Range<br/>1 ↔ 100,000<br/>Check counts differ]

        A1 --> B1[Lower Half<br/>1 ↔ 50,000<br/>Counts match ✓]
        A1 --> B2[Upper Half<br/>50,001 ↔ 100,000<br/>Counts differ ✗]

        B2 --> C1[Range: 50,001 ↔ 75,000<br/>Counts match ✓]
        B2 --> C2[Range: 75,001 ↔ 100,000<br/>Counts differ ✗]

        C2 --> D1[Range: 75,001 ↔ 87,500<br/>Counts differ ✗]
        C2 --> D2[Range: 87,501 ↔ 100,000<br/>Counts match ✓]

        D1 --> E1[Range: 75,001 ↔ 81,250<br/>Keep splitting until size ≤ 500<br/>then process directly]
        D1 --> E2[Range: 81,251 ↔ 87,500<br/>Keep splitting until size ≤ 500<br/>then process directly]
    end

    style A1 fill:#e1f5fe,stroke:#1976d2,stroke-width:2px,color:#000
    style B1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style D2 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style B2 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000
    style C2 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000
    style D1 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000
    style E1 fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style E2 fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
```

### Algorithm Details

The binary search engine is the most sophisticated optimization in the system:

```lua
function binarySearch(syncRec, fromId, toId, operation)
    local stack = {}
    local resultArr = {}
    -- Initialize with full range
    stack[#stack + 1] = {
        startId = "",
        endId = "",
        toCount = totalToCount,
        fromCount = totalFromCount,
        depth = 0
    }
    while #stack > 0 and #resultArr < expectedResults do
        local range = table.remove(stack)
        if range.toCount <= batchSize then
            -- Small range: process directly
            local records = getRecordsInRange(range.startId, range.endId)
            local differences = findDifferences(records)
            table.append(resultArr, differences)
        else
            -- Large range: subdivide
            local midId = getMidpointId(range.startId, range.endId)
            local lowerCounts = countRecordsInRange(range.startId, midId)
            local upperCounts = countRecordsInRange(midId, range.endId)

            -- Add ranges with differences to stack
            if hasRecordDifferences(lowerCounts) then
                stack[#stack + 1] = createLowerRange(range, midId, lowerCounts)
            end
            if hasRecordDifferences(upperCounts) then
                stack[#stack + 1] = createUpperRange(range, midId, upperCounts)
            end
        end
    end
    return resultArr
end
```
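To see the range-narrowing idea run end to end, the following self-contained sketch simulates it in memory. It is not the engine above: sorted Lua arrays stand in for the two databases, `countInRange` stands in for a `COUNT` query, IDs are assumed to be integers, and the target is assumed to contain only IDs that also exist in the source (so equal counts imply identical ranges).

```lua
-- Number of values v in sorted array `arr` with lowId <= v <= highId.
local function countInRange(arr, lowId, highId)
    local function firstIndexAtLeast(x) -- classic lower-bound search
        local lo, hi = 1, #arr + 1
        while lo < hi do
            local mid = math.floor((lo + hi) / 2)
            if arr[mid] < x then lo = mid + 1 else hi = mid end
        end
        return lo
    end
    return firstIndexAtLeast(highId + 1) - firstIndexAtLeast(lowId)
end

-- Finds source IDs missing from target; also returns the number of
-- simulated count queries issued on ranges.
local function findMissing(source, target, batchSize)
    local missing, queries = {}, 0
    local stack = { { lo = 1, hi = #source } }
    while #stack > 0 do
        local range = table.remove(stack)
        if range.lo <= range.hi then
            local expected = range.hi - range.lo + 1
            queries = queries + 1
            local found = countInRange(target, source[range.lo], source[range.hi])
            if found ~= expected then
                if expected <= batchSize then
                    -- Small range: compare IDs one by one.
                    for i = range.lo, range.hi do
                        if countInRange(target, source[i], source[i]) == 0 then
                            missing[#missing + 1] = source[i]
                        end
                    end
                else
                    -- Large range: split and examine both halves.
                    local mid = math.floor((range.lo + range.hi) / 2)
                    stack[#stack + 1] = { lo = mid + 1, hi = range.hi }
                    stack[#stack + 1] = { lo = range.lo, hi = mid }
                end
            end
        end
    end
    table.sort(missing)
    return missing, queries
end

local source, target = {}, {}
for id = 1, 100000 do
    source[#source + 1] = id
    if id ~= 420 and id ~= 77777 then target[#target + 1] = id end
end
local missing, queries = findMissing(source, target, 500)
print(#missing, missing[1], missing[2], queries)
```

With two missing IDs in 100,000 rows, only a few dozen range counts are needed before the small ranges are compared directly, which is the effect the engine relies on.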

### Binary Search Advantages

1. **Logarithmic Complexity**: O(log n) count queries per differing region vs O(n) for full scans
2. **Network Efficiency**: Fewer large data transfers
3. **Memory Efficiency**: Process small chunks instead of entire datasets
4. **Interruptible**: Can stop early when target count reached
5. **Progress Tracking**: Real-time feedback on search progress

### Binary Search Limitations

1. **Query Overhead**: Many small queries vs few large ones
2. **Not Always Optimal**: Small differences may be faster with full scan
3. **Database Compatibility**: Requires OFFSET/LIMIT support
4. **ID Ordering**: Assumes ID-based ordering for range splits

---

## Data Movement Pipeline

### Pipeline Overview

```mermaid
flowchart LR
    subgraph "READ PHASE"
        A[Read Source Data<br/>in Batches]
        B[Read Target IDs<br/>for Comparison]
        A --> C[Source Data Array]
        B --> D[Target ID Index]
    end

    subgraph "TRANSFORM PHASE"
        C --> E[Compare Records]
        D --> E
        E --> F[Records to ADD]
        E --> G[Records to MODIFY]
        E --> H[Records to DELETE]
    end

    subgraph "WRITE PHASE"
        F --> I[Batch INSERT<br/>Operations]
        G --> J[Batch UPDATE<br/>Operations]
        H --> K[Batch DELETE<br/>Operations]
    end

    subgraph "SCHEMA TRANSLATION"
        L[Field Mapping]
        M[Type Conversion]
        N[Data Validation]
        L --> M --> N
        N --> I
        N --> J
    end

    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
    style B fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
    style C fill:#e3f2fd,color:#000
    style D fill:#e3f2fd,color:#000
    style E fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#000
    style F fill:#e8f5e8,color:#000
    style G fill:#fff8e1,color:#000
    style H fill:#ffebee,color:#000
    style I fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
    style J fill:#fff8e1,stroke:#fbc02d,stroke-width:2px,color:#000
    style K fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000
    style L fill:#f3e5f5,color:#000
    style M fill:#f3e5f5,color:#000
    style N fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
```

### Data Flow Sequence

```mermaid
sequenceDiagram
    participant S as Source DB
    participant E as Sync Engine
    participant D as Destination DB

    Note over E: READ PHASE
    E->>S: Read batch 1 (records 1-5000)
    S-->>E: Return source records
    E->>S: Read batch 2 (records 5001-10000)
    S-->>E: Return source records
    E->>D: Read all IDs for comparison
    D-->>E: Return ID array

    Note over E: TRANSFORM PHASE
    E->>E: Build target ID index
    E->>E: Compare source vs target
    E->>E: Categorize: ADD/MODIFY/DELETE
    E->>E: Apply schema transformations

    Note over E: WRITE PHASE
    E->>D: Batch DELETE (if any)
    D-->>E: Confirm deletions
    E->>D: Batch INSERT (new records)
    D-->>E: Confirm insertions
    E->>D: Batch UPDATE (modified records)
    D-->>E: Confirm updates

    Note over E: VERIFICATION
    E->>D: Verify final record counts
    D-->>E: Return counts
```

### Read Phase

```lua
-- 1. Read source data in batches
local sourceData = {}
for batchStart = 1, totalCount, batchSize do
    local batch = readBatch(fromId, batchStart, batchSize)
    table.append(sourceData, batch)
end
-- 2. Read destination IDs for comparison
local destIds = readIdArray(toId, idField)
local destIdIndex = invertTable(destIds)
```
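The `invertTable` helper is not shown in this manual. A plausible minimal version, assuming it only needs to support membership tests, maps each ID to `true`:

```lua
-- Assumed behaviour: turn { "a", "b" } into { a = true, b = true }
-- so that membership checks are a single table lookup.
local function invertTable(ids)
    local index = {}
    for _, id in ipairs(ids) do
        index[id] = true
    end
    return index
end

local destIdIndex = invertTable({ "A-1", "A-2", "B-7" })
print(destIdIndex["A-2"], destIdIndex["Z-9"])
```

Building the index once keeps the later comparison loop linear in the number of source records.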

### Transform Phase

```lua
-- 3. Compare and categorize records
local toAdd = {}
local toModify = {}
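local toDelete = {}
local sourceIdIndex = {} -- source IDs seen; used below to find deletions
for _, sourceRecord in ipairs(sourceData) do
    local id = sourceRecord[idField]
    sourceIdIndex[id] = true
    if destIdIndex[id] then
        -- readDestRecord is a hypothetical helper that fetches the target row by ID
        local destRecord = readDestRecord(toId, id)
        if recordsAreModified(sourceRecord, destRecord) then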
            toModify[#toModify + 1] = sourceRecord
        end
    else
        toAdd[#toAdd + 1] = sourceRecord
    end
end
-- Find records to delete (in destination but not source)
for destId in pairs(destIdIndex) do
    if not sourceIdIndex[destId] then
        toDelete[#toDelete + 1] = destId
    end
end
```

### Write Phase

```lua
-- 4. Execute operations in optimal order
if #toDelete > 0 then
    executeBatchDeletes(toId, toDelete, deleteBatchSize)
end
if #toAdd > 0 then
    executeBatchInserts(toId, toAdd, insertBatchSize)
end
if #toModify > 0 then
    executeBatchUpdates(toId, toModify, updateBatchSize)
end
```
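The `executeBatch*` functions are not shown here. The batching they imply can be sketched with a small generic helper; `forEachBatch` is a hypothetical name, and the callback stands in for the actual database write.

```lua
-- Calls fn(batch, first, last) for consecutive slices of `items`
-- containing at most `batchSize` elements. Returns the number of batches.
local function forEachBatch(items, batchSize, fn)
    assert(batchSize > 0, "batchSize must be positive")
    local batches = 0
    for first = 1, #items, batchSize do
        local last = math.min(first + batchSize - 1, #items)
        local batch = {}
        for i = first, last do batch[#batch + 1] = items[i] end
        fn(batch, first, last)
        batches = batches + 1
    end
    return batches
end

-- Usage: 2500 IDs deleted in batches of 1000.
local ids, sizes = {}, {}
for i = 1, 2500 do ids[i] = i end
local batchCount = forEachBatch(ids, 1000, function(batch)
    sizes[#sizes + 1] = #batch -- a real callback would issue the DELETE here
end)
print(batchCount, table.concat(sizes, ","))
```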

### Schema Translation

```lua
-- 5. Handle schema differences during transform
local transformedRecord = {}
for fieldName, fieldValue in pairs(sourceRecord) do
    local destFieldName = fieldMapping[fieldName] or fieldName
    local destFieldType = destSchema[destFieldName]
    transformedRecord[destFieldName] = convertFieldValue(
        fieldValue,
        sourceSchema[fieldName],
        destFieldType
    )
end
```
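`convertFieldValue` is referenced but not defined in this manual. As an illustration only, a converter for three made-up type names (`"text"`, `"integer"`, `"boolean"`) might look like this; the real type names and conversion rules depend on the schemas involved.

```lua
-- Illustrative converter; the type names are assumptions, not the
-- schema vocabulary used by db-sync.lua.
local function convertFieldValue(value, sourceType, destType)
    if value == nil or sourceType == destType then
        return value -- nothing to convert
    end
    if destType == "text" then
        return tostring(value)
    elseif destType == "integer" then
        local n = tonumber(value)
        if n == nil then
            error("cannot convert '" .. tostring(value) .. "' to integer")
        end
        return math.floor(n)
    elseif destType == "boolean" then
        if sourceType == "integer" then return value ~= 0 end
        return value == "true" or value == "1"
    end
    error("unsupported conversion: " .. tostring(sourceType)
        .. " -> " .. tostring(destType))
end

print(convertFieldValue("17", "text", "integer"))
print(convertFieldValue(0, "integer", "boolean"))
```

Raising an error on an impossible conversion lets the caller count it against the error threshold instead of silently writing bad data.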

---

## Sleep and Scheduling Management

The system includes sleep and scheduling capabilities for automated, recurring synchronization.

### Sleep System Architecture

```mermaid
flowchart TD
    A[Sync Complete] --> B{Scheduling<br/>Configuration?}
    B -->|sleep_seconds| C[Interval-based Sleep]
    B -->|run_at_time| D[Time-based Sleep]
    B -->|None| E[Exit]

    C --> F[smartSleep Function]
    D --> F

    F --> G[Calculate Sleep Duration]
    G --> H{Duration > max_chunk?}
    H -->|Yes| I[Sleep in Chunks]
    H -->|No| J[Single Sleep]

    I --> K[Sleep max_chunk seconds]
    K --> L{Target Time<br/>Reached?}
    L -->|No| M[Progress Update]
    M --> N{More Sleep<br/>Needed?}
    N -->|Yes| K
    N -->|No| O[Wake Up]
    L -->|Yes| O
    J --> O

    O --> P[Start Next Sync]
```

### Configuration Options

#### Interval-based Scheduling

**sleep_seconds**: Sets automatic recurring synchronization intervals

```json
{
    "sleep_seconds": 3600    // Run every hour
}
```

**Use Cases**:

- Regular data synchronization (hourly, daily)
- Continuous monitoring scenarios
- High-frequency updates

#### Time-based Scheduling

**run_at_time**: Schedule sync at specific times

```json
{
    "run_at_time": "02:00"   // Run at 2:00 AM daily
}
```

**Use Cases**:

- Off-peak hour synchronization
- Daily maintenance windows
- Scheduled batch processing

#### Chunked Sleep Management

**max_sleep_chunk_seconds**: Maximum sleep duration per chunk (default: 3600 seconds)

```json
{
    "max_sleep_chunk_seconds": 3600  // 1-hour maximum chunks
}
```

**Benefits**:

- Prevents excessively long sleep periods
- Enables progress monitoring during long waits
- Allows early wake-up when conditions change
- Provides better responsiveness

### Smart Sleep Implementation

#### Core Algorithm

```lua
function smartSleep(totalSeconds, maxChunkSeconds)
    maxChunkSeconds = maxChunkSeconds or 3600  -- Default 1 hour
    while totalSeconds > 0 do
        local chunkSeconds = math.min(totalSeconds, maxChunkSeconds)
        -- Sleep for this chunk
        util.sleep(chunkSeconds * 1000)
        totalSeconds = totalSeconds - chunkSeconds
        -- Check if target time reached (for run_at_time)
        if shouldWakeUpEarly() then
            break
        end
        -- Progress update
        if totalSeconds > 0 then
            print(string.format("Sleep progress: %d seconds remaining (%.1f hours)", totalSeconds, totalSeconds / 3600))
        end
    end
end
```

#### Early Wake-up Logic

For `run_at_time` scheduling, the system can wake up early when the target time is reached:

```lua
-- Example: Scheduled for 02:00, but it's now 01:59
-- System wakes up 1 minute early instead of sleeping full hour
```
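For `run_at_time`, the sleep duration has to be derived from the current clock time. A minimal sketch of that calculation follows; the function name `secondsUntil` is hypothetical, the clock is passed in explicitly so the function is easy to test, and a target equal to the current time is treated as "tomorrow", which is an assumption.

```lua
-- Seconds from the given clock time until the next occurrence of
-- runAtTime ("HH:MM", 24-hour clock, local time).
local function secondsUntil(runAtTime, nowHour, nowMin, nowSec)
    local hh, mm = tostring(runAtTime):match("^(%d%d):(%d%d)$")
    local h, m = tonumber(hh or ""), tonumber(mm or "")
    if not h or not m or h > 23 or m > 59 then
        error("run_at_time must be HH:MM, got " .. tostring(runAtTime))
    end
    local target = h * 3600 + m * 60
    local now = nowHour * 3600 + nowMin * 60 + (nowSec or 0)
    local diff = target - now
    if diff <= 0 then diff = diff + 86400 end -- target is on the next day
    return diff
end

-- Usage with the real clock:
local now = os.date("*t")
print(secondsUntil("02:00", now.hour, now.min, now.sec))
```

The result can be handed to `smartSleep` as `totalSeconds`; timezone handling is left to the host system's local time.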

### Scheduling Scenarios

#### Scenario 1: Hourly Sync

```json
{
    "sleep_seconds": 3600,
    "max_sleep_chunk_seconds": 1800  // 30-minute chunks
}
```

**Behavior**:

- Sync completes at 10:00
- Sleep in two 30-minute chunks
- Progress update at 10:30
- Next sync starts at 11:00

#### Scenario 2: Daily 2 AM Sync

```json
{
    "run_at_time": "02:00",
    "max_sleep_chunk_seconds": 3600  // 1-hour chunks
}
```

**Behavior**:

- Sync completes at 10:00
- Sleep in hourly chunks until 02:00
- Progress updates every hour
- Early wake-up if system time reaches 02:00

#### Scenario 3: Long Sleep with Monitoring

```json
{
    "sleep_seconds": 86400,           // 24 hours
    "max_sleep_chunk_seconds": 3600   // 1-hour chunks
}
```

**Behavior**:

- Sleep divided into 24 one-hour chunks
- Progress update every hour
- Complete visibility into long sleep periods
- Ability to interrupt or modify scheduling

### Progress Monitoring

#### Sleep Progress Display

```text
Sync complete. Sleeping 86400 seconds until next run...
Sleep progress: 82800 seconds remaining (23.0 hours)
Sleep progress: 79200 seconds remaining (22.0 hours)
Sleep progress: 75600 seconds remaining (21.0 hours)
...
Sleep progress: 3600 seconds remaining (1.0 hours)
Sleep complete. Starting next sync cycle...
```

#### Time-based Progress

```text
Sync complete. Waiting until 02:00 for next run...
Current time: 22:30, target: 02:00 (3.5 hours remaining)
Sleep progress: 10800 seconds remaining (3.0 hours)
Sleep progress: 7200 seconds remaining (2.0 hours)
Sleep progress: 3600 seconds remaining (1.0 hours)
Target time reached. Starting sync at 02:00...
```

### Configuration Best Practices

#### Sleep Configuration Guidelines

1. **Choose appropriate chunk sizes**:
   - Short intervals (< 1 hour): Use default chunks
   - Medium intervals (1-6 hours): 30-60 minute chunks
   - Long intervals (> 6 hours): 1-hour chunks

2. **Monitor resource usage**:
   - Chunked sleeping reduces memory footprint
   - Progress updates help track system status
   - Early wake-up prevents unnecessary delays

3. **Error handling integration**:
   - Sleep system respects error conditions
   - Failed syncs don't affect scheduling
   - Robust recovery from sleep interruptions

#### Performance Considerations

- **Chunk overhead**: Each chunk has minimal overhead (~1ms)
- **Progress logging**: Updates are lightweight
- **Memory usage**: Constant regardless of sleep duration
- **System responsiveness**: Improved with chunked approach

### Troubleshooting

#### Common Issues

| Problem | Solution |
|---------|----------|
| Sleep seems to hang indefinitely | Check `max_sleep_chunk_seconds` configuration |
| Sync doesn't start at expected time | Verify `run_at_time` format and timezone settings |
| Too many progress updates | Increase `max_sleep_chunk_seconds` value |
| System doesn't wake up early | Ensure early wake-up logic is properly configured for `run_at_time` |

---

## Error Handling & Recovery

### Error Recovery Flow

```mermaid
flowchart TD
    A[Operation Starts] --> B{Error<br/>Occurred?}
    B -->|No| C[Operation Successful]
    B -->|Yes| D[Categorize Error Type]

    D --> E{Connection<br/>Error?}
    D --> F{Schema<br/>Error?}
    D --> G{Data<br/>Error?}
    D --> H{Logic<br/>Error?}

    E -->|Yes| E1[Disconnect All<br/>Connections]
    E1 --> E2[Wait 30 seconds]
    E2 --> E3[Retry Connection]
    E3 --> I{Retry<br/>Successful?}

    F -->|Yes| F1[Check Schema<br/>Compatibility]
    F1 --> F2[Apply Schema<br/>Transformations]
    F2 --> J{Schema<br/>Fixed?}

    G -->|Yes| G1[Isolate Failed<br/>Record]
    G1 --> G2[Continue with<br/>Next Record]
    G2 --> K{Error Count below<br/>max_error_count?}

    H -->|Yes| H1[Check Binary Search<br/>Settings]
    H1 --> H2{Fallback<br/>Enabled?}
    H2 -->|Yes| H3[Fallback to<br/>Full Compare]
    H2 -->|No| L[Fail Operation]

    I -->|Yes| M[Resume Operation]
    I -->|No| L
    J -->|Yes| M
    J -->|No| L
    K -->|Yes| M
    K -->|No| L
    H3 --> M

    M --> B
    C --> N[End Success]
    L --> O[End Failure]

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style C fill:#c8e6c9,color:#000
    style D fill:#fff8e1,color:#000
    style E fill:#ffebee,color:#000
    style F fill:#e8f5e8,color:#000
    style G fill:#f3e5f5,color:#000
    style H fill:#fff3e0,color:#000
    style E1 fill:#ffebee,color:#000
    style E2 fill:#ffebee,color:#000
    style E3 fill:#ffebee,color:#000
    style F1 fill:#e8f5e8,color:#000
    style F2 fill:#e8f5e8,color:#000
    style G1 fill:#f3e5f5,color:#000
    style G2 fill:#f3e5f5,color:#000
    style H1 fill:#fff3e0,color:#000
    style H2 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style H3 fill:#fff3e0,color:#000
    style I fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style K fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style L fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000
    style M fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style N fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style O fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000
```

### Error Categories

1. **Connection Errors**: Database unavailable, network issues
2. **Schema Errors**: Missing tables, incompatible field types
3. **Data Errors**: Constraint violations, invalid values
4. **Logic Errors**: Unexpected conditions, algorithm failures

### Error Recovery Mechanisms

#### 1. Automatic Fallback

```lua
-- Binary search fallback to full compare
if binarySearchFailed and syncPrf.fallback_full_compare_on_mismatch then
    util.printWarning("Binary search failed, falling back to full compare")
    return syncCompare(syncRec, tbl, schema, fromId, toId, fldArr, stat, operation)
end
```

#### 2. Batch Error Isolation

```lua
-- Continue processing other records when individual records fail
saveParam.parameter.continue_on_error = syncPrf.max_error_count
-- Process remaining batches even if one batch fails
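if batchError and stat.errorCount < syncPrf.max_error_count then
    stat.errorCount = stat.errorCount + 1
    -- Lua has no `continue`; jump to a `::continue::` label placed at the
    -- end of the enclosing loop body (requires Lua 5.2+ or LuaJIT)
    goto continue
end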
```

#### 3. Connection Recovery

```lua
-- Automatic reconnection on connection loss
if connectionLost then
    dconn.disconnectAll()
    defaultConn = nil
    util.sleep(30000) -- Wait 30 seconds
    -- Retry connection on next iteration
end
```
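The wait-and-retry step can be wrapped in a small helper. This is a sketch under stated assumptions: `withReconnect` is a hypothetical name, `connect` is any function that returns a connection or raises an error, and `sleep` is injected (in the real system it would presumably be `util.sleep`) so the example runs without waiting.

```lua
-- Tries connect() up to maxAttempts times, calling sleep(waitMs) between
-- failed attempts. Returns (connection, attemptsUsed) or (nil, lastError).
local function withReconnect(connect, maxAttempts, waitMs, sleep)
    local lastErr
    for attempt = 1, maxAttempts do
        local ok, connOrErr = pcall(connect)
        if ok and connOrErr then
            return connOrErr, attempt
        end
        lastErr = ok and "connect returned nil" or connOrErr
        if attempt < maxAttempts then sleep(waitMs) end
    end
    return nil, lastErr
end

-- Usage with a fake connection that fails twice before succeeding.
local attempts, slept = 0, 0
local function flakyConnect()
    attempts = attempts + 1
    if attempts < 3 then error("connection refused") end
    return { id = "conn" }
end
local conn, usedAttempts = withReconnect(flakyConnect, 5, 30000, function(ms)
    slept = slept + ms -- stand-in for a real sleep call
end)
print(conn.id, usedAttempts, slept)
```

Capping the number of attempts keeps a permanently unavailable database from blocking the sync loop forever.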

#### 4. Transaction Rollback

```lua
-- Rollback incomplete operations
if operationFailed then
    if transactionActive then
        rollbackTransaction()
    end
    -- Restore previous state
    restoreFromBackup()
end
```

### Error Thresholds

```lua
max_error_count: 25      -- Stop after 25 errors
max_error_loop: 5        -- Stop after 5 consecutive error loops
```
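How the two thresholds interact is not spelled out above. One plausible reading, sketched below with hypothetical names, is that `max_error_count` limits errors within a run while `max_error_loop` limits consecutive sync loops that ended with errors.

```lua
-- Hypothetical tracker for the two thresholds. Both methods return
-- false once the corresponding limit has been reached.
local function newErrorBudget(maxErrorCount, maxErrorLoop)
    local budget = { errorCount = 0, errorLoops = 0 }

    -- Record one failed record or batch.
    function budget.recordError()
        budget.errorCount = budget.errorCount + 1
        return budget.errorCount < maxErrorCount
    end

    -- Call at the end of each sync loop; a clean loop resets the streak.
    function budget.endLoop(hadError)
        if hadError then
            budget.errorLoops = budget.errorLoops + 1
        else
            budget.errorLoops = 0
        end
        return budget.errorLoops < maxErrorLoop
    end

    return budget
end

local budget = newErrorBudget(25, 5)
for _ = 1, 24 do budget.recordError() end
print(budget.recordError()) -- the 25th error exhausts the budget
```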

### Error Reporting

```lua
-- Comprehensive error tracking
local errorTbl = {}
errorTbl[#errorTbl + 1] = util.printError("Error description: %s", details)
-- Error summary at end
if #errorTbl > 0 then
    util.printRed("Sync completed with %d errors:", #errorTbl)
    for i, err in ipairs(errorTbl) do -- `err` avoids shadowing the built-in error()
        util.printRed("  %d. %s", i, err)
    end
end
```

---

## Common Problems & Solutions

### Problem 1: Slow Synchronization

**Symptoms**:

- Sync takes hours to complete
- High CPU/memory usage
- Many database queries

**Diagnosis**:

```lua
-- Check if binary search is being used
binary_search_for_add: true
binary_search_min_table_size: 1000  -- Lower for smaller tables
-- Monitor query counts in logs
-- Look for "using full compare" vs "using binary search"
```

**Solutions**:

1. **Enable binary search** for large tables
2. **Increase batch sizes** for better throughput
3. **Use incremental sync** with `trust_modify_id: true`
4. **Optimize database indexes** on ID and modify_time fields
5. **Reduce table scope** with `sync_table` configuration

### Problem 2: Memory Issues

**Symptoms**:

- Out of memory errors
- System becomes unresponsive
- Garbage collection warnings

**Solutions**:

```lua
-- Reduce batch sizes
batch_size: 1000         -- Down from 5000
id_array_batch_size: 5000 -- Down from 25000
-- Force garbage collection
collectgarbage() -- Called between tables automatically
```
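
To illustrate why a smaller `id_array_batch_size` lowers peak memory, here is a minimal sketch that processes an ID array in fixed-size chunks and collects garbage between them. `chunkIds` is a hypothetical helper, not a function of the sync engine.

```lua
-- Hypothetical helper: split an ID array into chunks of at most chunkSize.
local function chunkIds(ids, chunkSize)
    assert(chunkSize >= 1, "chunkSize must be at least 1")
    local chunks = {}
    for start = 1, #ids, chunkSize do
        local chunk = {}
        for i = start, math.min(start + chunkSize - 1, #ids) do
            chunk[#chunk + 1] = ids[i]
        end
        chunks[#chunks + 1] = chunk
    end
    return chunks
end

-- Usage: 12 IDs in chunks of 5 gives chunks of 5, 5 and 2 IDs.
local ids = {}
for i = 1, 12 do ids[i] = i end
local chunks = chunkIds(ids, 5)
for _, chunk in ipairs(chunks) do
    -- process the chunk here, then release memory before the next one
    collectgarbage()
end
print(#chunks, #chunks[1], #chunks[3])
```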

### Problem 3: Data Inconsistencies

**Symptoms**:

- Record counts don't match after sync
- Missing or duplicate records
- "fallback full compare" messages

**Diagnosis**:

```lua
-- Check count mismatches in logs:
--   "counts mismatch after binary delete"
--   "fallback full compare on mismatch"
-- Verify ID field configuration
primary_key_field: "record_id"  -- Ensure correct field
```

**Solutions**:

1. **Verify ID field mapping** between databases
2. **Check for ID conflicts** (duplicate IDs)
3. **Enable fallback mechanism**: `fallback_full_compare_on_mismatch: true`
4. **Use full compare mode** temporarily: `trust_modify_id: false`
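
Checking for ID conflicts (solution 2) can be done on an exported ID list. The following is a minimal sketch, assuming the IDs have already been read into a Lua array; `findDuplicateIds` is a hypothetical helper name.

```lua
-- Hypothetical helper: return the IDs that occur more than once,
-- in the order in which their first repeat is seen.
local function findDuplicateIds(ids)
    local seen, reported, duplicates = {}, {}, {}
    for _, id in ipairs(ids) do
        if seen[id] and not reported[id] then
            reported[id] = true
            duplicates[#duplicates + 1] = id
        end
        seen[id] = true
    end
    return duplicates
end

-- Usage: 101 and 102 each appear more than once.
local dups = findDuplicateIds({ 101, 102, 101, 103, 102, 101 })
print(table.concat(dups, ", ")) -- 101, 102
```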

### Problem 4: Schema Conflicts

**Symptoms**:

- Field type conversion errors
- Missing field errors
- Data truncation warnings

**Solutions**:

```lua
-- Map incompatible fields
"field_mapping": {
    "source_field": "dest_field",
    "varchar_field": "text_field"
}
-- Handle schema differences
set_default_value: true  -- Set defaults for missing fields
copy_all_json_keys: true -- Handle JSON field differences
```
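
As an illustration of what a `field_mapping` entry means, the sketch below renames the keys of one record. It assumes the mapping is a plain source-name to destination-name table; `mapRecord` is a hypothetical helper, and the sync engine itself may handle type conversion and defaults in ways not shown here.

```lua
-- Hypothetical helper: rename record keys according to fieldMapping.
-- Fields without a mapping entry keep their original name.
local function mapRecord(record, fieldMapping)
    local mapped = {}
    for field, value in pairs(record) do
        mapped[fieldMapping[field] or field] = value
    end
    return mapped
end

-- Usage with the mapping from the configuration above.
local mapping = { source_field = "dest_field", varchar_field = "text_field" }
local rec = mapRecord(
    { source_field = 1, varchar_field = "abc", record_id = 7 },
    mapping
)
print(rec.dest_field, rec.text_field, rec.record_id)
```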

### Problem 5: Connection Timeouts

**Symptoms**:

- "connection failed" errors
- Random disconnections
- Network timeout messages

**Solutions**:

```lua
-- Increase timeout settings (in connection config)
"timeout": 300,          -- 5 minutes (value in seconds)
"retry_count": 3,        -- Retry failed operations
"retry_delay": 5000      -- Wait 5 seconds between retries (value in milliseconds)
```
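
The retry settings can be read as the following loop. This is a sketch under assumptions: `retry_count` is taken to mean retries after the first attempt, and `withRetry` is a hypothetical wrapper. The `sleep` function is passed in so the example runs without the system's `util` module.

```lua
-- Hypothetical retry wrapper (not the db-sync.lua API).
-- Returns ok, resultOrError, attemptsUsed.
local function withRetry(operation, retryCount, retryDelayMs, sleep)
    local attempts = retryCount + 1 -- first try plus retryCount retries
    local lastErr
    for attempt = 1, attempts do
        local ok, result = pcall(operation)
        if ok then
            return true, result, attempt
        end
        lastErr = result
        if attempt < attempts then
            sleep(retryDelayMs) -- wait before the next attempt
        end
    end
    return false, lastErr, attempts
end

-- Usage: an operation that fails twice, then succeeds on the third try.
local calls, waited = 0, 0
local ok, result, attempt = withRetry(function()
    calls = calls + 1
    if calls < 3 then error("connection failed") end
    return "connected"
end, 3, 5000, function(ms) waited = waited + ms end)
print(ok, result, attempt, waited)
```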

---

## Monitoring & Debugging

### Performance Metrics

The system tracks timing, operation counts, and binary search statistics for the whole run:

```lua
-- Timing metrics
readTimeAll              -- Total time reading data
writeTimeAll             -- Total time writing data
deleteTimeAll            -- Total time deleting data
dataCompareTimeAll       -- Total time comparing data
binarySearchTimeAll      -- Total time in binary search
-- Operation counts
queryCountAll            -- Total database queries
recordCountAllTables     -- Total records processed
recordCountAllAdded      -- Total records added
recordCountAllModified   -- Total records modified
recordCountAllDeleted    -- Total records deleted
recordCountAllSkipped    -- Total records skipped
-- Binary search metrics
binarySearchCountAll     -- Number of binary searches
binarySearchIterationCount -- Total search iterations
binarySearchQueryCount   -- Queries used in binary search
binarySearchMaxDepthAll  -- Maximum search depth reached
```
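
Raw counters are easier to compare between runs when reduced to ratios. The sketch below derives a few from the counters listed above. `summarizeMetrics` is a hypothetical helper, the input table is assumed to use the counter names as keys, and the numbers in the usage example are made up.

```lua
-- Hypothetical helper: derive per-search and per-record ratios from the
-- counters listed above (not part of db-sync.lua).
local function summarizeMetrics(m)
    -- Guard against division by zero when nothing was counted.
    local function ratio(a, b)
        if b == 0 then return 0 end
        return a / b
    end
    local changed = m.recordCountAllAdded + m.recordCountAllModified
        + m.recordCountAllDeleted
    return {
        iterationsPerSearch = ratio(m.binarySearchIterationCount, m.binarySearchCountAll),
        queriesPerSearch = ratio(m.binarySearchQueryCount, m.binarySearchCountAll),
        changedShare = ratio(changed, m.recordCountAllTables),
    }
end

-- Usage with made-up numbers.
local summary = summarizeMetrics({
    binarySearchCountAll = 4,
    binarySearchIterationCount = 48,
    binarySearchQueryCount = 96,
    recordCountAllTables = 10000,
    recordCountAllAdded = 1500,
    recordCountAllModified = 700,
    recordCountAllDeleted = 300,
})
print(summary.iterationsPerSearch, summary.queriesPerSearch, summary.changedShare)
```

A rising `iterationsPerSearch` or `queriesPerSearch` between runs suggests the binary search is doing more work per lookup, which is a cue to review the binary search settings.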

### Debug Configuration

```lua
-- Enable detailed logging
show_sql: true           -- Show SQL queries
debug_sql: true          -- Show SQL execution details
show_save_sql: true      -- Show save operation SQL
debug_connection_change: true  -- Show connection changes
-- Performance analysis
only_sync_plan: true     -- Plan-only mode (no actual sync)
check_all_tables: true   -- Check all tables regardless of changes
```

### Progress Monitoring

Real-time progress display shows:

```text
delete binary search 179196 records, search 5000 recent first, need to find 2:
2.8%↑5000 97%↓174197 49%↑87097 24%↑43547 12%↓21775 6.1%↑10886
```

- **Percentage**: Portion of total dataset
- **Arrow**: Direction (↑ recent, ↓ older)
- **Count**: Records in this range
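
The percentages in the sample line are consistent with dividing each range count by the total of 179196 records. The sketch below reproduces them, assuming one decimal place below 10% and whole numbers otherwise. `formatShare` is a hypothetical helper, and the actual display code may round differently.

```lua
-- Hypothetical formatter: share of the total covered by one range.
-- One decimal below 10%, whole numbers otherwise (assumed rounding rule).
local function formatShare(rangeCount, totalCount)
    local pct = rangeCount / totalCount * 100
    if pct < 10 then
        return string.format("%.1f%%", pct)
    end
    return string.format("%.0f%%", pct)
end

-- Usage: the range counts from the sample progress line.
local total = 179196
local parts = {}
for _, count in ipairs({ 5000, 174197, 87097, 43547, 21775, 10886 }) do
    parts[#parts + 1] = formatShare(count, total)
end
print(table.concat(parts, " ")) -- 2.8% 97% 49% 24% 12% 6.1%
```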

### Log Analysis

Key log patterns to monitor:

```bash
# Successful operations
"records to sync: 100 / 5000, records synced: 4900 / 5000"
"using binary search for adds (from=5100, to=5000, diff=100)"
# Performance warnings
"using full compare for adds (reason=large difference)"
"fallback full compare on mismatch"
# Error conditions
"Binary search: Invalid toCount=0"
"sync connection failed"
"maximum error count 25 was reached"
```
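
These patterns can be checked automatically. The following sketch classifies a log line by plain-text substring matching against the messages listed above. The rule table and `classifyLogLine` are hypothetical, and the severity levels simply mirror the three groups in the list.

```lua
-- Hypothetical classifier built from the patterns listed above.
-- Rules are checked in order; errors come before warnings.
local rules = {
    { text = "maximum error count", level = "error" },
    { text = "sync connection failed", level = "error" },
    { text = "Binary search: Invalid", level = "error" },
    { text = "using full compare", level = "warning" },
    { text = "fallback full compare", level = "warning" },
}

local function classifyLogLine(line)
    for _, rule in ipairs(rules) do
        -- The 4th argument (true) turns off Lua pattern matching, so
        -- characters such as "(" or "-" are matched literally.
        if line:find(rule.text, 1, true) then
            return rule.level
        end
    end
    return "info"
end

-- Usage
print(classifyLogLine("using full compare for adds (reason=large difference)"))
print(classifyLogLine("Binary search: Invalid toCount=0"))
print(classifyLogLine("records to sync: 100 / 5000, records synced: 4900 / 5000"))
```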

---

## Best Practices

### 1. Configuration Optimization

**Start with Conservative Settings**:

```lua
{
    "batch_size": 1000,
    "binary_search_min_table_size": 5000,
    "trust_modify_id": false,
    "fallback_full_compare_on_mismatch": true
}
```

**Scale Up Gradually**:

```lua
{
    "batch_size": 5000,        -- Increase as performance allows
    "trust_modify_id": true,   -- Enable after initial sync
    "binary_search_recent_first": 5000  -- Focus on recent changes
}
```

### 2. Database Optimization

**Essential Indexes**:

```sql
-- Primary key index (usually automatic)
CREATE INDEX idx_table_record_id ON table_name (record_id);

-- Modify time index for incremental sync
CREATE INDEX idx_table_modify_time ON table_name (modify_time);

-- Composite index for range queries
CREATE INDEX idx_table_id_modify ON table_name (record_id, modify_time);
```

**Statistics Updates**:

```sql
-- Keep database statistics current (PostgreSQL and SQLite syntax)
ANALYZE table_name;
```

### 3. Incremental Sync Strategy

#### Phase 1: Initial Sync

```lua
{
    "trust_modify_id": false,     -- Full compare for accuracy
    "binary_search_for_add": true, -- Use optimization
    "sync_all": true              -- Sync everything
}
```

#### Phase 2: Ongoing Sync

```lua
{
    "trust_modify_id": true,      -- Fast incremental
    "sleep_seconds": 3600,        -- Hourly sync
    "min_rows_to_sync": 1         -- Skip if no changes
}
```

### 4. Error Prevention

**Validate Configuration**:

```lua
-- Test with single table first
"sync_table": ["test_table"],
"only_sync_plan": true,  -- Dry run mode
-- Enable all safety features
"fallback_full_compare_on_mismatch": true,
"max_error_count": 5,    -- Low threshold for testing
"continue_on_error": false -- Stop on first error
```

**Monitor Resource Usage**:

```bash
# Monitor memory usage (-d, joins multiple PIDs with commas, as top -p expects)
top -p "$(pgrep -d, lj)"
# Monitor disk I/O
iostat -x 1
# Monitor network traffic
netstat -i
```

### 5. Maintenance Schedule

**Daily**:

- Monitor error logs
- Check sync completion status
- Verify record counts

**Weekly**:

- Analyze performance metrics
- Review configuration changes
- Update database statistics

**Monthly**:

- Full sync validation
- Configuration optimization review
- Capacity planning assessment

---

## Conclusion

The database synchronization system keeps multiple databases in sync by combining binary search, incremental synchronization, batch processing, and automatic error recovery, which makes it suitable for demanding production environments.

Key success factors:

1. **Proper configuration** for your specific environment
2. **Adequate database indexing** for performance
3. **Regular monitoring** and maintenance
4. **Gradual optimization** based on actual performance
5. **Comprehensive testing** before production deployment

For additional support and configuration examples, refer to the `db-sync.json` configuration manual and the binary search progress display manual.
