# Database Synchronization Configuration Manual

## Table of Contents

1. [Configuration Overview](#configuration-overview)
2. [Core Settings](#core-settings)
3. [Database Connection Settings](#database-connection-settings)
4. [Table Selection Settings](#table-selection-settings)
5. [Performance Settings](#performance-settings)
6. [Binary Search Settings](#binary-search-settings)
7. [Batch Processing Settings](#batch-processing-settings)
8. [Error Handling Settings](#error-handling-settings)
9. [Debug and Monitoring Settings](#debug-and-monitoring-settings)
10. [Scheduling Settings](#scheduling-settings)
11. [Advanced Options](#advanced-options)
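12. [Configuration Examples](#configuration-examples)
13. [Configuration Best Practices](#configuration-best-practices)
14. [Troubleshooting Configuration](#troubleshooting-configuration)
15. [Summary](#summary)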

---

## Configuration Overview

The `db-sync.json` file controls all aspects of the database synchronization process. Settings are grouped into the categories below, and any setting can be disabled without deleting it by appending a `-` suffix to its key.

### Configuration Structure

```mermaid
flowchart TD
    A[db-sync.json<br/>Configuration File] --> B[Core Settings]
    A --> C[Connection Settings]
    A --> D[Sync Control]
    A --> E[Performance & Error Handling]

    B --> B1[✓ sync_all<br/>❌ check_all_tables<br/>🔄 trust_modify_id<br/>📋 only_sync_plan]

    C --> C1[📍 Source Database<br/>db_source_id]
    C --> C2[🎯 Target Database<br/>db_target_id]

    D --> D1[📊 Table Selection<br/>sync_table<br/>sync_table_query<br/>no_sync_table]
    D --> D2[🔍 Table Methods<br/>sync_table_method]

    E --> E1[⚡ Performance<br/>Batch Sizes<br/>Binary Search<br/>Optimization]
    E --> E2[🛡️ Error Handling<br/>max_error_count<br/>fallback_options]
    E --> E3[⏰ Scheduling<br/>sleep_seconds<br/>run_at_time]

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:4px,color:#000,font-size:16px
    style B fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000
    style C fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#000
    style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
    style E fill:#fff8e1,stroke:#fbc02d,stroke-width:3px,color:#000

    style B1 fill:#e8f5e8,color:#000,font-size:12px
    style C1 fill:#fff3e0,color:#000,font-size:12px
    style C2 fill:#fff3e0,color:#000,font-size:12px
    style D1 fill:#f3e5f5,color:#000,font-size:12px
    style D2 fill:#f3e5f5,color:#000,font-size:12px
    style E1 fill:#fff8e1,color:#000,font-size:12px
    style E2 fill:#ffebee,color:#000,font-size:12px
    style E3 fill:#e0f2f1,color:#000,font-size:12px
```

#### Configuration Categories

The configuration is organized into logical groups for easier management:

| Category | Icon | Purpose | Key Settings |
|----------|------|---------|--------------|
| **Core Settings** | ✓ | Basic sync behavior | `sync_all`, `trust_modify_id`, `only_sync_plan` |
| **Connection Settings** | 🔗 | Database connections | `db_source_id`, `db_target_id` |
| **Sync Control** | 📊 | What to synchronize | `sync_table`, `sync_table_query`, `no_sync_table` |
| **Performance** | ⚡ | Speed optimizations | Batch sizes, binary search, caching |
| **Error Handling** | 🛡️ | Error management | `max_error_count`, fallback options |
| **Scheduling** | ⏰ | When to run sync | `sleep_seconds`, `run_at_time` |

### Setting Naming Convention

```mermaid
flowchart LR
    A[Configuration Setting] --> B{Active or<br/>Disabled?}

    B -->|Active| C["#quot;setting_name#quot;: value"]
    B -->|Disabled| D["#quot;setting_name-#quot;: value"]
    B -->|Alternative| E["#quot;setting_name_suffix#quot;: value"]

    subgraph "Examples"
        F["#quot;batch_size#quot;: 5000"]
        G["#quot;batch_size-#quot;: 1000"]
        H["#quot;batch_size_test#quot;: 500"]
    end
    end

    C -.-> F
    D -.-> G
    E -.-> H

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:2px,color:#000
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style C fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style D fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000
    style E fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style F fill:#c8e6c9,color:#000
    style G fill:#ffcdd2,color:#000
    style H fill:#fff3e0,color:#000
```

- **Active setting**: `"setting_name": value`
- **Disabled setting**: `"setting_name-": value` (note the `-` suffix)
- **Alternative values**: Multiple versions with different suffixes for easy switching

Example:

```json
{
    "batch_size": 5000,        // Currently active
    "batch_size-": 1000,       // Disabled alternative
    "batch_size_test": 500     // Test configuration
}
```
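The convention can be illustrated with a minimal Python sketch. It assumes the file has already been parsed into a dictionary (the `//` comments shown in this manual are not valid strict JSON, so Python's `json` module would reject them as written). The helper name `active_settings` is hypothetical and is not part of the tool.

```python
def active_settings(config: dict) -> dict:
    """Return only the keys that do not carry the '-' disable marker."""
    return {key: value for key, value in config.items() if not key.endswith("-")}


config = {
    "batch_size": 5000,       # active
    "batch_size-": 1000,      # disabled alternative
    "batch_size_test": 500,   # alternative under a different key name
}

active = active_settings(config)
print(active["batch_size"])      # 5000
print("batch_size-" in active)   # False
```

Keys with other suffixes, such as `batch_size_test`, remain in the dictionary under their own names; this sketch assumes they are simply never looked up as `batch_size`.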

---

## Core Settings

### sync_all

Controls whether to sync all available tables or use selective sync.

```json
"sync_all": false
```

**Values**:

- `true`: Sync all tables found in the database (ignores `sync_table` settings)
- `false`: Use `sync_table` and `sync_table_query` for selective sync

**Impact**: When `true`, disables `sync_table_method`, `sync_table_query` and `sync_table` filters.

### check_all_tables

Forces checking of all tables regardless of count changes.

```json
"check_all_tables": false
```

**Values**:

- `true`: Check every table even if counts suggest no changes
- `false`: Skip tables that appear unchanged (performance optimization)

**Usage**: Enable for debugging or when data integrity is in doubt.

### trust_modify_id

Enables incremental synchronization using modify timestamps.

```json
"trust_modify_id": true
```

**Values**:

- `true`: Trust modify timestamps for incremental sync (faster)
- `false`: Always perform full comparison (safer, slower)

**Critical**: Only enable after verifying modify timestamps are reliable in your environment.

### only_sync_plan

Plan-only mode for testing synchronization strategy.

```json
"only_sync_plan": false
```

**Values**:

- `true`: Generate and display sync plan without executing
- `false`: Execute the actual synchronization

**Usage**: Enable for testing configuration changes or understanding sync strategy.

---

## Database Connection Settings

### Database Connection Arrays

Define source and target databases:

```json
{
    "db_source_id": ["demo-4d-0"],
    "db_target_id": ["demo-fi_ferrum-0"]
}
```

**Multiple Targets**: The system can sync to multiple targets:

```json
{
    "db_source_id": ["source-db-0"],
    "db_target_id": [
        "target-db1-0",
        "target-db2-0",
        "target-db3-0"
    ]
}
```

**Alternative Configurations**: Use suffixes for different environments:

```json
{
    "db_source_id": ["production-db-0"],           // Active
    "db_source_id1": ["staging-db-0"],             // Staging
    "db_target_id": ["production-target-0"],         // Active
    "db_target_id-": ["staging-target-0"]            // Disabled
}
```

### Connection ID Format

Connection IDs follow the pattern: `organization-environment_database-instance`

Examples:

- `demo-4d-0`: Demo organization, 4D database, instance 0
- `odoo-fi_odoo_demo-0`: Odoo organization, Finnish demo database, instance 0
- `capacic-fi_capacic4d-0`: Capacic organization, Finnish 4D database, instance 0
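
As an illustration of the pattern, the following Python sketch splits a connection ID into its parts. It is an interpretation of the examples above, not the tool's own parser: it assumes the organization ends at the first `-`, the instance follows the last `-`, and an optional environment prefix ends at the first `_` of the middle part. The names `ConnectionId` and `parse_connection_id` are hypothetical.

```python
from typing import NamedTuple, Optional


class ConnectionId(NamedTuple):
    organization: str
    environment: Optional[str]  # None when the middle part has no "_" prefix
    database: str
    instance: int


def parse_connection_id(connection_id: str) -> ConnectionId:
    # Organization is everything before the first "-", instance everything after the last "-".
    organization, sep1, rest = connection_id.partition("-")
    middle, sep2, instance = rest.rpartition("-")
    if not (sep1 and sep2 and organization and middle and instance.isdigit()):
        raise ValueError(f"unexpected connection ID format: {connection_id!r}")
    # The environment prefix is optional: "4d" has none, "fi_odoo_demo" has "fi".
    environment, underscore, database = middle.partition("_")
    if not underscore:
        environment, database = None, middle
    return ConnectionId(organization, environment, database, int(instance))


print(parse_connection_id("odoo-fi_odoo_demo-0"))
print(parse_connection_id("demo-4d-0"))
```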

---

## Table Selection Settings

### sync_table

Specifies which tables to synchronize when `sync_all` is false.

```json
"sync_table": ["order_row-sales"]
```

**Array of table names**: Each entry can be:

- Simple table name: `"product"`
- Table with record type: `"order_row-sales"`
- Multiple record types: `["product", "product-work", "product-material"]`

### sync_table_query

Maps tables to specific query configurations.

```json
"sync_table_query": {
    "product": ["form/nc/erp-sync/qlik/query/product-all.json"],
    "company-supplier": ["form/nc/erp-sync/qlik/query/company-supplier.json"],
    "order-sales": ["form/nc/erp-sync/qlik/query/order-sales.json"]
}
```

**Structure**: `"table_name": ["query_file_path"]`

**Multiple queries**: Some tables may use multiple query files for complex data.

### sync_table_method

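Defines the synchronization method per table. The example below shows the setting in its disabled form (note the `-` suffix on the key); remove the suffix to activate it.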

```json
"sync_table_method-": {
    "product": "new",
    "product_language": "new",
    "product_price-purchase": "new"
}
```

**Methods**:

- `"new"`: Always insert as new records
- `"update"`: Update existing records
- `"sync"`: Smart sync (default behavior)

### no_sync_table

Excludes specific tables from synchronization.

```json
"no_sync_table": [
    "audit.log",
    "currency_rate",
    "keyword",
    "preference-4dreport",
    "preference-4dmanual"
]
```

**Use cases**:

- System tables that shouldn't be synced
- Log tables with local-only data
- Configuration tables specific to each environment
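
How the selection settings combine can be sketched in Python as follows. This is a rough model rather than the tool's actual logic: it assumes `no_sync_table` exclusions apply in both modes (the manual only states that `sync_all` bypasses the `sync_table` filters), and the function name `tables_to_sync` and the `all_tables` argument are hypothetical.

```python
def tables_to_sync(config: dict, all_tables: list[str]) -> list[str]:
    """Pick candidate tables, then drop anything listed in no_sync_table."""
    if config.get("sync_all", False):
        candidates = all_tables                      # sync_table is ignored
    else:
        candidates = config.get("sync_table", [])    # selective sync
    excluded = set(config.get("no_sync_table", []))  # assumption: applies in both modes
    return [table for table in candidates if table not in excluded]


all_tables = ["product", "order_row-sales", "keyword", "currency_rate"]
config = {
    "sync_all": True,
    "sync_table": ["order_row-sales"],
    "no_sync_table": ["keyword", "currency_rate"],
}
print(tables_to_sync(config, all_tables))
```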

---

## Performance Settings

### min_rows_to_sync

Minimum number of changed rows required to trigger sync.

```json
"min_rows_to_sync": 1
```

**Values**:

- `0`: Sync even if no changes detected
- `1+`: Skip sync if fewer than N rows changed

**Optimization**: Set higher values to skip frequent syncs with minimal changes.

### full_compare

Forces full comparison mode.

```json
"full_compare": 0
```

**Values**:

- `0`: Use optimized sync strategies
- `1`: Always use full compare (overrides optimizations)
- `N`: Force full compare every N runs
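
One way to read these values is as a modulo check on a run counter. The Python sketch below assumes a counter that starts at 1 and increases with every sync run; how the tool actually tracks runs is not described here, and `should_full_compare` is a hypothetical name.

```python
def should_full_compare(full_compare: int, run_number: int) -> bool:
    """Decide whether this run should be a full compare.

    run_number is assumed to start at 1 and grow by one per sync run.
    """
    if full_compare <= 0:
        return False                       # 0: rely on optimized strategies
    return run_number % full_compare == 0  # 1: every run, N: every Nth run


for run in range(1, 7):
    print(run, should_full_compare(3, run))
```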

### fallback_full_compare_on_mismatch

Automatically falls back to full compare when binary search results don't match expectations.

```json
"fallback_full_compare_on_mismatch": true
```

**Values**:

- `true`: Automatic fallback for safety (recommended)
- `false`: Fail fast on mismatches (for debugging)

---

## Binary Search Settings

### binary_search_read_batch

Batch size for final binary search operations.

```json
"binary_search_read_batch": 500
```

**Range**: 10-2000

**Impact**:

- **Lower values**: More granular progress, more queries
- **Higher values**: Fewer queries, less granular progress

### binary_search_recent_first

Number of recent records to search first.

```json
"binary_search_recent_first": 5000
```

**Values**:

- `0`: Disable recent-first optimization
- `1000-10000`: Search N most recent records first

**Optimization**: Most changes occur in recent records, so searching them first often finds differences quickly.

### binary_search_for_add

Enable binary search for add operations.

```json
"binary_search_for_add": true
```

**Values**:

- `true`: Use binary search for finding records to add
- `false`: Use traditional full scan for adds

### binary_search_min_table_size

Minimum table size to trigger binary search.

```json
"binary_search_min_table_size": 1000
```

**Logic**: Tables smaller than this threshold use full scan instead of binary search.

### binary_search_max_diff_percent

Maximum difference percentage at which binary search is still used.

```json
"binary_search_max_diff_percent": 50
```

**Logic**: If the difference between the databases exceeds this percentage, use full scan instead of binary search.
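
The two thresholds can be combined into a single decision, sketched below in Python. Several details are assumptions: the difference percentage is modeled as the row-count gap relative to the larger table, the fallback values are the example values from this manual rather than confirmed defaults, and `use_binary_search` is a hypothetical helper.

```python
def use_binary_search(source_count: int, target_count: int, config: dict) -> bool:
    """Return True when binary search looks worthwhile, False for a full scan."""
    # Fallbacks are the example values from this manual, not confirmed defaults.
    min_size = config.get("binary_search_min_table_size", 1000)
    max_diff = config.get("binary_search_max_diff_percent", 50)

    table_size = max(source_count, target_count)
    if table_size == 0 or table_size < min_size:
        return False  # small tables: a full scan is assumed to be cheap enough

    # Assumption: "difference" is the row-count gap relative to the larger table.
    diff_percent = abs(source_count - target_count) / table_size * 100
    return diff_percent <= max_diff


settings = {"binary_search_min_table_size": 1000, "binary_search_max_diff_percent": 50}
print(use_binary_search(10_000, 9_900, settings))  # 1% difference
print(use_binary_search(10_000, 4_000, settings))  # 60% difference
```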

---

## Batch Processing Settings

### Batch Sizes by Operation Type

```json
{
    "batch_size": 5000,                    // General operations
    "batch_size_4d": 1000,                 // 4D database operations
    "delete_batch_size": 1000,             // Delete operations
    "delete_batch_size_4d": 500,           // 4D delete operations
    "id_array_batch_size": 25000,          // ID array operations
    "id_array_batch_size_4d": 10000        // 4D ID array operations
}
```

### Batch Size Guidelines

**4D Databases**: Generally require smaller batch sizes

- Standard: 1000-5000 records
- Deletes: 500-1000 records

**PostgreSQL/SQLite**: Can handle larger batches

- Standard: 5000-10000 records
- Deletes: 1000-5000 records

**Network Considerations**:

- **High latency**: Use larger batches to reduce round trips
- **Limited bandwidth**: Use smaller batches to avoid timeouts
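
To make the batch settings concrete, here is a small Python sketch that picks a batch size and splits a record list accordingly. It assumes the `_4d` variant takes precedence when the connection is a 4D database and falls back to the general key otherwise; `pick_batch_size` and `batches` are hypothetical helpers, not functions of the tool.

```python
from typing import Iterator, Sequence


def pick_batch_size(config: dict, key: str, is_4d: bool) -> int:
    """Prefer the `_4d` variant of a batch setting for 4D connections."""
    key_4d = f"{key}_4d"
    if is_4d and key_4d in config:
        return config[key_4d]
    return config[key]


def batches(records: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


config = {"batch_size": 5000, "batch_size_4d": 1000, "delete_batch_size": 1000}
ids = list(range(2500))
size = pick_batch_size(config, "batch_size", is_4d=True)
print([len(chunk) for chunk in batches(ids, size)])  # [1000, 1000, 500]
```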

### save_batch_size

Controls how many records are saved in a single transaction.

```json
"save_batch_size-": 5
```

**Note**: This setting is often disabled (`-` suffix) to use default behavior.

---

## Error Handling Settings

### max_error_count

Maximum number of errors before stopping synchronization.

```json
"max_error_count": 25
```

**Behavior**:

- Sync continues until this many errors accumulate
- Then stops to prevent infinite error loops
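
The behavior described above resembles a simple error budget. The following Python sketch models it with a list of callables standing in for sync operations; the function name `run_with_error_limit` and the task representation are assumptions made for illustration only.

```python
from typing import Callable, Iterable


def run_with_error_limit(tasks: Iterable[Callable[[], None]], max_error_count: int):
    """Run tasks in order, stopping once max_error_count errors have accumulated."""
    errors = []
    completed = 0
    for task in tasks:
        try:
            task()
            completed += 1
        except Exception as exc:  # collect the error and keep going
            errors.append(exc)
            if len(errors) >= max_error_count:
                break             # error budget exhausted: stop the sync
    return completed, errors


def ok() -> None:
    pass


def bad() -> None:
    raise RuntimeError("simulated sync failure")


completed, errors = run_with_error_limit([ok, bad, ok, bad, ok], max_error_count=2)
print(completed, len(errors))  # 2 2
```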

### max_error_loop

Maximum number of consecutive error loops.

```json
"max_error_loop": 5
```

**Behavior**:

- If sync fails completely N times in a row, stop
- Prevents endless retry loops

### missing_id_max_count

Maximum number of missing IDs to report before truncating.

```json
"missing_id_max_count": 500
```

**Purpose**: Prevents excessive log output when many records are missing.

### Data Integrity Settings

```json
{
    "set_default_value": true,              // Set default values for missing fields
    "disable_audit_log_new": true,          // Skip audit logging for new records
    "disable_audit_log_change": true,       // Skip audit logging for changes
    "copy_all_json_keys": true              // Copy all JSON field keys
}
```

---

## Debug and Monitoring Settings

### SQL Debugging

```json
{
    "show_sql": false,                      // Display SQL queries
    "debug_sql": false,                     // Detailed SQL debugging
    "show_save_sql": false,                 // Show save operation SQL
    "debug_connection_change": false        // Log connection changes
}
```

**Warning**: Enable SQL debugging only while troubleshooting, because it generates very large log output.

### Language Setting

```json
"language": "fi"
```

**Values**: `"en"`, `"fi"`, etc.

**Purpose**: Controls language for user messages and error reporting.

---

## Scheduling Settings

### Sleep-based Scheduling

```json
{
    "sleep_seconds": 0,                     // Currently disabled
    "sleep_seconds-": 3600,                 // Alternative: hourly sync
    "max_sleep_chunk_seconds": 3600         // Maximum sleep chunk (1 hour)
}
```

**Values**:

- `sleep_seconds`:
  - `0`: No sleep-based scheduling (the sync runs once unless `run_at_time` is set)
  - `N`: Sleep N seconds between sync cycles
- `max_sleep_chunk_seconds`: Maximum seconds to sleep in one chunk (default: 3600 = 1 hour)
  - Prevents excessively long sleep periods
  - Enables progress monitoring and early wake-up
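
Chunked sleeping can be sketched in Python as shown below. The `should_wake` callback is a hypothetical stand-in for whatever early wake-up signal the tool uses, and both function names are illustrative.

```python
import time
from typing import Callable, Iterator


def sleep_chunks(total_seconds: int, max_chunk_seconds: int) -> Iterator[int]:
    """Split a long sleep into chunks no longer than max_chunk_seconds."""
    if max_chunk_seconds <= 0:
        raise ValueError("max_chunk_seconds must be positive")
    remaining = total_seconds
    while remaining > 0:
        chunk = min(remaining, max_chunk_seconds)
        yield chunk
        remaining -= chunk


def chunked_sleep(total_seconds: int, max_chunk_seconds: int,
                  should_wake: Callable[[], bool] = lambda: False) -> bool:
    """Sleep chunk by chunk; return False if woken early, True otherwise."""
    for chunk in sleep_chunks(total_seconds, max_chunk_seconds):
        if should_wake():  # checked before each chunk, so wake-up latency is one chunk at most
            return False
        time.sleep(chunk)
    return True


print(list(sleep_chunks(7500, 3600)))  # [3600, 3600, 300]
```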

### Time-based Scheduling

```json
{
    "run_at_time": ["02:05:00"]             // Run daily at 2:05 AM
}
```

**Format**: Array of time strings in `"HH:MM:SS"` format

**Behavior**: Time-based scheduling is used when `sleep_seconds` is `0`.

**Multiple Times**:

```json
"run_at_time": ["02:05:00", "14:05:00"]    // 2:05 AM and 2:05 PM
```
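
Finding the next scheduled run from a `run_at_time` list can be sketched in Python as follows. The sketch uses naive local datetimes and ignores time zones and daylight saving changes, which the manual does not specify; `next_run` is a hypothetical helper and expects a non-empty list.

```python
from datetime import datetime, timedelta


def next_run(run_at_time: list[str], now: datetime) -> datetime:
    """Return the earliest upcoming datetime matching any "HH:MM:SS" entry."""
    candidates = []
    for text in run_at_time:
        at = datetime.strptime(text, "%H:%M:%S").time()
        candidate = datetime.combine(now.date(), at)
        if candidate <= now:
            candidate += timedelta(days=1)  # already passed today: use tomorrow
        candidates.append(candidate)
    return min(candidates)


now = datetime(2024, 1, 1, 3, 0, 0)
print(next_run(["02:05:00", "14:05:00"], now))  # 2024-01-01 14:05:00
```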

---

## Advanced Options

### Schema-specific Settings

```json
{
    "sort_id_schema": { "odoo": true },     // Sort IDs for Odoo databases
    "no_order_need_driver": { "rest4d": true }  // Skip ordering for REST APIs
}
```

### Query Parameters

```json
"query_parameter-": {
    "transfer_id": "%GNIEZNO%",
    "start_date": "2023-01-01"
}
```

**Purpose**: Pass parameters to query files

**Note**: Often disabled (`-` suffix) unless specific queries need parameters.

### Default Query

```json
"default_query": "plugin/db-sync/query/default-query.json"
```

**Purpose**: Fallback query configuration when table-specific queries aren't defined.
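
The fallback can be pictured as a lookup with a default, as in this Python sketch. It assumes a table-specific entry in `sync_table_query` fully replaces the default query rather than being merged with it; `query_files_for` is a hypothetical name.

```python
def query_files_for(table: str, config: dict) -> list[str]:
    """Return the table's query files, or the default query when none are defined."""
    table_queries = config.get("sync_table_query", {})
    if table in table_queries:
        return list(table_queries[table])
    default = config.get("default_query")
    return [default] if default else []


config = {
    "sync_table_query": {
        "product": ["form/nc/erp-sync/qlik/query/product-all.json"],
    },
    "default_query": "plugin/db-sync/query/default-query.json",
}
print(query_files_for("product", config))
print(query_files_for("keyword", config))
```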

---

## Configuration Examples

### Example 1: Initial Setup (Conservative)

```json
{
    "sync_all": false,
    "check_all_tables": true,
    "trust_modify_id": false,
    "only_sync_plan": true,

    "db_source_id": ["source-db-0"],
    "db_target_id": ["target-db-0"],

    "sync_table": ["product"],

    "batch_size": 1000,
    "batch_size_4d": 500,
    "binary_search_min_table_size": 5000,

    "max_error_count": 5,
    "fallback_full_compare_on_mismatch": true,

    "show_sql": true,
    "debug_sql": false
}
```

### Example 2: Production Optimized

```json
{
    "sync_all": true,
    "check_all_tables": false,
    "trust_modify_id": true,
    "only_sync_plan": false,

    "db_source_id": ["production-source-0"],
    "db_target_id": ["production-target-0"],

    "batch_size": 5000,
    "batch_size_4d": 1000,
    "binary_search_recent_first": 5000,
    "binary_search_for_add": true,

    "sleep_seconds": 3600,
    "min_rows_to_sync": 10,

    "max_error_count": 25,
    "fallback_full_compare_on_mismatch": true,

    "show_sql": false,
    "debug_sql": false
}
```

### Example 3: High-Performance Setup

```json
{
    "sync_all": true,
    "trust_modify_id": true,

    "batch_size": 10000,
    "batch_size_4d": 2000,
    "id_array_batch_size": 50000,
    "id_array_batch_size_4d": 20000,

    "binary_search_recent_first": 10000,
    "binary_search_min_table_size": 1000,
    "binary_search_max_diff_percent": 30,

    "sleep_seconds": 1800,
    "min_rows_to_sync": 50
}
```

### Example 4: Development/Testing

```json
{
    "sync_all": false,
    "only_sync_plan": true,
    "check_all_tables": true,

    "sync_table": ["test_table"],

    "batch_size": 100,
    "max_error_count": 1,

    "show_sql": true,
    "debug_sql": true,
    "debug_connection_change": true
}
```

---

## Configuration Best Practices

### Configuration Optimization Flow

```mermaid
flowchart TD
    A[Start Configuration] --> B[Choose Environment Type]

    B --> C{Development?}
    B --> D{Staging?}
    B --> E{Production?}

    C -->|Yes| C1[Small Batches<br/>batch_size: 100-1000]
    C1 --> C2[Verbose Logging<br/>show_sql: true]
    C2 --> C3[Plan Only Mode<br/>only_sync_plan: true]

    D -->|Yes| D1[Medium Batches<br/>batch_size: 1000-5000]
    D1 --> D2[Moderate Logging<br/>show_sql: false]
    D2 --> D3[Safety Enabled<br/>fallback_full_compare_on_mismatch: true]

    E -->|Yes| E1[Large Batches<br/>batch_size: 5000-10000]
    E1 --> E2[Minimal Logging<br/>debug_sql: false]
    E2 --> E3[Optimizations Enabled<br/>trust_modify_id: true]

    C3 --> F[Test Configuration]
    D3 --> F
    E3 --> F

    F --> G{Performance<br/>Acceptable?}
    G -->|No| H[Identify Bottleneck]
    G -->|Yes| I[Deploy Configuration]

    H --> H1{Sync Too Slow?}
    H --> H2{Memory Issues?}
    H --> H3{Too Many Errors?}

    H1 -->|Yes| J1[↑ Batch Sizes<br/>Enable Binary Search<br/>Set trust_modify_id: true]
    H2 -->|Yes| J2[↓ Batch Sizes<br/>↓ id_array_batch_size]
    H3 -->|Yes| J3[↑ max_error_count<br/>Enable fallbacks]

    J1 --> F
    J2 --> F
    J3 --> F

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style I fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style C1 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style D1 fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
    style E1 fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
    style B fill:#fff8e1,stroke:#fbc02d,stroke-width:2px,color:#000
    style C fill:#fff3e0,color:#000
    style D fill:#e8f5e8,color:#000
    style E fill:#f3e5f5,color:#000
    style F fill:#e0f2f1,stroke:#00796b,stroke-width:2px,color:#000
    style G fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style H fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000
    style C2 fill:#fff3e0,color:#000
    style C3 fill:#fff3e0,color:#000
    style D2 fill:#e8f5e8,color:#000
    style D3 fill:#e8f5e8,color:#000
    style E2 fill:#f3e5f5,color:#000
    style E3 fill:#f3e5f5,color:#000
    style H1 fill:#ffebee,color:#000
    style H2 fill:#ffebee,color:#000
    style H3 fill:#ffebee,color:#000
    style J1 fill:#e8f5e8,color:#000
    style J2 fill:#fff3e0,color:#000
    style J3 fill:#ffcdd2,color:#000
```

### 1. Start Conservative

- Begin with small batch sizes
- Enable all safety features
- Use plan-only mode for testing

### 2. Monitor and Adjust

- Watch performance metrics
- Adjust batch sizes based on results
- Enable optimizations gradually

### 3. Environment-Specific Tuning

- **Development**: Small batches, verbose logging
- **Staging**: Medium batches, moderate logging
- **Production**: Large batches, minimal logging

### 4. Safety First

- Always enable `fallback_full_compare_on_mismatch`
- Set reasonable `max_error_count`
- Test configuration changes in non-production first

### 5. Performance Optimization Order

```mermaid
flowchart LR
    A[Initial Sync] --> B[Enable trust_modify_id]
    B --> C[Increase Batch Sizes]
    C --> D[Enable Binary Search]
    D --> E[Fine-tune Parameters]
    E --> F[Optimize Scheduling]

    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style C fill:#fff8e1,stroke:#fbc02d,stroke-width:2px,color:#000
    style D fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
    style E fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
    style F fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
```

1. Enable `trust_modify_id` after initial sync
2. Increase batch sizes based on performance
3. Enable binary search optimizations
4. Fine-tune binary search parameters
5. Adjust scheduling for optimal resource usage

---

## Troubleshooting Configuration

### Common Issues

**Sync Too Slow**:

- Increase batch sizes
- Enable binary search
- Set `trust_modify_id: true`

**Memory Issues**:

- Decrease batch sizes
- Reduce `id_array_batch_size`

**Too Many Errors**:

- Increase `max_error_count`
- Enable `fallback_full_compare_on_mismatch`
- Check database connectivity

**Missing Tables**:

- Verify `sync_table` entries
- Check `no_sync_table` exclusions
- Ensure tables exist in source database

### Configuration Validation

Before deploying:

1. Test with `only_sync_plan: true`
2. Run with a single table first
3. Monitor resource usage
4. Verify results with a small dataset
5. Gradually scale up

---

## Summary

The `db-sync.json` configuration file provides fine-grained control over every aspect of database synchronization. Proper configuration is crucial for:

- **Performance**: Optimal batch sizes and algorithms
- **Reliability**: Error handling and fallback mechanisms
- **Safety**: Data integrity and validation
- **Maintenance**: Monitoring and debugging capabilities

Start with conservative settings and gradually optimize based on your specific environment and requirements.
