The database synchronization system (db-sync.lua) is a sophisticated, production-grade tool designed to synchronize data between multiple database systems with different schemas and architectures. It supports various database types including 4D, PostgreSQL, SQLite, and REST APIs.
flowchart TD
    A[Source Database<br/>SOURCE] --> C[Sync Engine<br/>db-sync.lua]
    C --> B[Target Database<br/>TARGET]
    C --> D[Configuration<br/>db-sync.json]
    subgraph "Database Types"
        A1[4D Database]
        A2[PostgreSQL]
        A3[SQLite]
        A4[REST API]
    end
    subgraph "Core Components"
        C1[Binary Search Engine]
        C2[Data Movement Pipeline]
        C3[Error Handling & Recovery]
        C4[Progress Monitoring]
    end
    A -.-> A1
    A -.-> A2
    A -.-> A3
    A -.-> A4
    C -.-> C1
    C -.-> C2
    C -.-> C3
    C -.-> C4
    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000
    style C fill:#fff3e0,stroke:#f57c00,stroke-width:4px,color:#000
    style D fill:#f3e5f5,stroke:#7b1fa2,stroke-width:3px,color:#000
    style A1 fill:#e1f5fe,color:#000
    style A2 fill:#e1f5fe,color:#000
    style A3 fill:#e1f5fe,color:#000
    style A4 fill:#e1f5fe,color:#000
    style C1 fill:#fff3e0,color:#000
    style C2 fill:#fff3e0,color:#000
    style C3 fill:#fff3e0,color:#000
    style C4 fill:#fff3e0,color:#000
flowchart TD
    A[Start] --> B[Load Configuration<br/>db-sync.json]
    B --> C[Establish Database<br/>Connections]
    C --> D[Validate Schema<br/>Compatibility]
    D --> E[Initialize Performance<br/>Counters]
    E --> F[Enumerate Tables<br/>to Synchronize]
    F --> G[Count Records in<br/>Source & Target]
    G --> H[Calculate Modification<br/>Timestamps]
    H --> I[Build Synchronization<br/>Plan]
    I --> J{Planning<br/>Complete?}
    J -->|No| F
    J -->|Yes| K[Analyze Count<br/>Differences]
    K --> L[Determine Sync Operations<br/>add/delete/modify]
    L --> M[Choose Optimization<br/>Strategies]
    M --> N[Generate Execution<br/>Plan]
    N --> O[Execute Sync Plan<br/>in Optimal Order]
    O --> P[Apply Binary Search<br/>Optimizations]
    P --> Q[Batch Process<br/>Data Movements]
    Q --> R[Track Progress &<br/>Performance]
    R --> S{More<br/>Tables?}
    S -->|Yes| O
    S -->|No| T[Validate Sync<br/>Results]
    T --> U[Check Record<br/>Counts]
    U --> V[Report Statistics<br/>& Errors]
    V --> W[End]
    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style W fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style S fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style B fill:#f3e5f5,color:#000
    style C fill:#f3e5f5,color:#000
    style D fill:#f3e5f5,color:#000
    style E fill:#f3e5f5,color:#000
    style F fill:#e8f5e8,color:#000
    style G fill:#e8f5e8,color:#000
    style H fill:#e8f5e8,color:#000
    style I fill:#e8f5e8,color:#000
    style K fill:#fff8e1,color:#000
    style L fill:#fff8e1,color:#000
    style M fill:#fff8e1,color:#000
    style N fill:#fff8e1,color:#000
    style O fill:#e0f2f1,color:#000
    style P fill:#e0f2f1,color:#000
    style Q fill:#e0f2f1,color:#000
    style R fill:#e0f2f1,color:#000
    style T fill:#ffebee,color:#000
    style U fill:#ffebee,color:#000
    style V fill:#ffebee,color:#000
Initialization Phase
Discovery Phase
Planning Phase
Execution Phase
Verification Phase
Tables are processed in dependency order to maintain referential integrity:
-- Example processing order
1. Reference tables (currency, terms_of_payment)
2. Master data (company, product)
3. Transactional data (orders, invoices)
4. Detail records (order_row, invoice_row)
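One way to make this order explicit is the sync_table array in db-sync.json (the key appears elsewhere in this manual; whether the array order is honored strictly sequentially is an assumption here, and the table names are just the examples above):

```json
{
  "sync_table": [
    "currency", "terms_of_payment",
    "company", "product",
    "orders", "invoices",
    "order_row", "invoice_row"
  ]
}
```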
The system supports multiple record types per table:
-- Example: product table with multiple record types
product-work -- Work-related products
product-material -- Material products
product-service -- Service products
The sync planning algorithm intelligently determines what operations to perform based on record counts, timestamps, and configuration settings.
flowchart TD
    A[Compare Record Counts<br/>source_count vs target_count] --> B{target_count == 0<br/>AND source_count > 0?}
    B -->|Yes| C[ADD ALL<br/>Destination empty]
    B -->|No| D{from_count < to_count?}
    D -->|Yes| E[DELETE → ADD/MODIFY<br/>Destination has extras]
    D -->|No| F{from_count == to_count<br/>AND trusted_modify_id?}
    F -->|Yes| G[INCREMENTAL SYNC<br/>Trust timestamps]
    F -->|No| H{from_count == to_count<br/>AND !trusted_modify_id?}
    H -->|Yes| I[FULL COMPARE<br/>Verify all records]
    H -->|No| J{from_count > to_count<br/>AND changes > 0?}
    J -->|Yes| K[ADD → INCREMENTAL/MODIFY<br/>Source has more + changes]
    J -->|No| L[FULL COMPARE<br/>Complex scenario]
    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style C fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style D fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style E fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000
    style F fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style G fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style H fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style I fill:#fff3e0,stroke:#fbc02d,stroke-width:3px,color:#000
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style K fill:#e8f5e8,stroke:#388e3c,stroke-width:3px,color:#000
    style L fill:#fce4ec,stroke:#7b1fa2,stroke-width:3px,color:#000
| Condition | Action | Reason |
|---|---|---|
| target_count == 0 && source_count > 0 | ADD ALL | Target empty |
| source_count < target_count | DELETE → ADD/MODIFY | Target has extras |
| source_count == target_count && trusted_modify_id | INCREMENTAL | No count change, trust timestamps |
| source_count == target_count && !trusted_modify_id | FULL COMPARE | No count change, verify all |
| source_count > target_count && changes > 0 | ADD → INCREMENTAL/MODIFY | Source has more + changes |
local trustPrev = syncPrf.trust_modify_id == true and hasPrevModifyId
if trustPrev then
    -- Fast incremental sync using modify timestamps
    plan = {"incremental"}
else
    -- Full comparison required
    plan = {"add", "changed"}
end
source=5000, target=0, changed=5000
Plan: ["add"]
Reason: "target empty => add all"
source=4990, target=5000, changed=0
Plan: ["delete"]
Reason: "target has more rows => delete extras"
source=5010, target=5000, changed=10, trust_modify_id=true
Plan: ["add", "incremental"]
Reason: "source has more rows & trusted modify id"
source=5010, target=5000, changed=10, trust_modify_id=false
Plan: ["add", "incremental", "changed"]
Reason: "source has more rows & untrusted modify id"
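The four scenarios above can be condensed into a single decision function. The sketch below is illustrative only (buildPlan is not the actual db-sync.lua internal), but it reproduces each example's plan:

```lua
-- Illustrative sketch of the sync planning rules; names are hypothetical.
local function buildPlan(src, tgt, changed, trustModifyId)
    local plan = {}
    if tgt == 0 and src > 0 then
        plan[#plan + 1] = "add"                  -- target empty => add all
    elseif src < tgt then
        plan[#plan + 1] = "delete"               -- target has extras
        if changed > 0 then
            plan[#plan + 1] = trustModifyId and "incremental" or "changed"
        end
    elseif src == tgt then
        -- no count change: incremental if timestamps are trusted, else full compare
        plan[#plan + 1] = trustModifyId and "incremental" or "changed"
    else -- src > tgt
        plan[#plan + 1] = "add"
        plan[#plan + 1] = "incremental"
        if not trustModifyId then
            plan[#plan + 1] = "changed"          -- untrusted: also verify existing rows
        end
    end
    return plan
end
```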
The binary search optimization dramatically reduces query complexity for large datasets:
- Traditional approach: O(n), read all IDs from both databases
- Binary search: O(log n), recursively narrow down the differences
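As a rough illustration of the O(log n) claim: the number of halving steps needed to isolate one differing batch is about log2(n / batch_size). This helper is not part of db-sync.lua, just back-of-the-envelope arithmetic:

```lua
-- Approximate halving depth to isolate one differing batch
local function searchDepth(n, batchSize)
    return math.ceil(math.log(n / batchSize) / math.log(2))
end

-- A 1,000,000-row table with a 500-row read batch needs about 11 halvings,
-- i.e. on the order of tens of count queries instead of reading a million IDs.
print(searchDepth(1000000, 500))
```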
When Binary Search is Used:
- The table is larger than binary_search_min_table_size (default: 1000)
- The count difference stays below binary_search_max_diff_percent (default: 50%)
- Modification timestamps are not trusted (trust_modify_id != true)

Binary Search Process:
- Ranges are halved until they are smaller than binary_search_read_batch (default: 500), then read and compared directly

All data operations use configurable batch sizes for optimal performance:
-- Batch size configuration (per database type)
batch_size: 5000 -- General operations
batch_size_4d: 1000 -- 4D database operations
delete_batch_size: 1000 -- Delete operations
delete_batch_size_4d: 500 -- 4D delete operations
id_array_batch_size: 25000 -- ID array operations
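In db-sync.json these would appear as ordinary configuration keys (the placement shown here, at the top level of a sync profile, is an assumption):

```json
{
  "batch_size": 5000,
  "batch_size_4d": 1000,
  "delete_batch_size": 1000,
  "delete_batch_size_4d": 500,
  "id_array_batch_size": 25000
}
```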
When using binary search, the system can prioritize recent records:
binary_search_recent_first: 5000 -- Search last 5000 records first
This is highly effective because most changes occur in recently created records.
Database connections are reused across operations to minimize connection overhead:
-- Connection reuse pattern
local conn = dconn.connection({organizationId = dbId})
-- Multiple operations using same connection
dconn.disconnectAll() -- Clean up at end
When trust_modify_id = true, only records modified since the last sync are processed:
-- Query with modify time filter
WHERE modify_time > last_sync_modify_time
-- Explicit garbage collection between tables
collectgarbage()
-- Large arrays are processed in chunks
for i = 1, #largeArray, batchSize do
    local batch = table.slice(largeArray, i, i + batchSize - 1)
    processBatch(batch)
end
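table.slice is not part of standard Lua; a minimal helper consistent with the loop above might look like this (a sketch, not the project's actual implementation):

```lua
-- Copy arr[i..j] into a new array, clamping the bounds to the array length
function table.slice(arr, i, j)
    local out = {}
    for k = math.max(i, 1), math.min(j, #arr) do
        out[#out + 1] = arr[k]
    end
    return out
end
```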
flowchart TD
    A[Start Binary Search<br/>Full Range: 1 to N] --> B[Initialize Stack with<br/>Full Range]
    B --> C{Stack<br/>Empty?}
    C -->|Yes| X[Return Results]
    C -->|No| D[Pop Range from Stack]
    D --> E{Range Size <=<br/>batch_size?}
    E -->|Yes| F[Process Range Directly<br/>Read & Compare Records]
    E -->|No| G[Find Midpoint ID<br/>using OFFSET/LIMIT]
    F --> H[Find Differences<br/>Add to Results]
    H --> C
    G --> I[Count Records in<br/>Lower Half 1 to mid]
    I --> J[Count Records in<br/>Upper Half mid to N]
    J --> K{Lower Half<br/>has differences?}
    K -->|Yes| L[Push Lower Range<br/>to Stack]
    K -->|No| M{Upper Half<br/>has differences?}
    L --> M
    M -->|Yes| N[Push Upper Range<br/>to Stack]
    M -->|No| C
    N --> C
    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style X fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style F fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style G fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
    style B fill:#f3e5f5,color:#000
    style C fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style D fill:#e0f2f1,color:#000
    style E fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style H fill:#fff3e0,color:#000
    style I fill:#e8f5e8,color:#000
    style J fill:#e8f5e8,color:#000
    style K fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style L fill:#e8f5e8,color:#000
    style M fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style N fill:#e8f5e8,color:#000
    subgraph "Progress Display"
        P1[2.8%↑5000 - Upper Range]
        P2[97%↓174197 - Lower Range]
    end
    style P1 fill:#e8f5e8,color:#000
    style P2 fill:#ffebee,color:#000
flowchart TD
    subgraph "Database Range: 1 to 100,000"
        A1[Full Range<br/>1 ↔ 100,000<br/>Check counts differ]
        A1 --> B1[Lower Half<br/>1 ↔ 50,000<br/>Counts match ✓]
        A1 --> B2[Upper Half<br/>50,001 ↔ 100,000<br/>Counts differ ✗]
        B2 --> C1[Range: 50,001 ↔ 75,000<br/>Counts match ✓]
        B2 --> C2[Range: 75,001 ↔ 100,000<br/>Counts differ ✗]
        C2 --> D1[Range: 75,001 ↔ 87,500<br/>Counts differ ✗]
        C2 --> D2[Range: 87,501 ↔ 100,000<br/>Counts match ✓]
        D1 --> E1[Range: 75,001 ↔ 81,250<br/>Size ≤ 500<br/>Process directly]
        D1 --> E2[Range: 81,251 ↔ 87,500<br/>Size ≤ 500<br/>Process directly]
    end
    style A1 fill:#e1f5fe,stroke:#1976d2,stroke-width:2px,color:#000
    style B1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style C1 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style D2 fill:#c8e6c9,stroke:#388e3c,stroke-width:2px,color:#000
    style B2 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000
    style C2 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000
    style D1 fill:#ffcdd2,stroke:#d32f2f,stroke-width:2px,color:#000
    style E1 fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style E2 fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
The binary search engine is the most sophisticated optimization in the system:
function binarySearch(syncRec, fromId, toId, operation)
    local stack = {}
    local resultArr = {}
    -- Initialize with the full (open-ended) range
    stack[#stack + 1] = {
        startId = "",
        endId = "",
        toCount = totalToCount,
        fromCount = totalFromCount,
        depth = 0
    }
    while #stack > 0 and #resultArr < expectedResults do
        local range = table.remove(stack)
        if range.toCount <= batchSize then
            -- Small range: process directly
            local records = getRecordsInRange(range.startId, range.endId)
            local differences = findDifferences(records)
            table.append(resultArr, differences)
        else
            -- Large range: subdivide at the midpoint
            local midId = getMidpointId(range.startId, range.endId)
            local lowerCounts = countRecordsInRange(range.startId, midId)
            local upperCounts = countRecordsInRange(midId, range.endId)
            -- Push only the halves whose counts differ
            if hasRecordDifferences(lowerCounts) then
                stack[#stack + 1] = createLowerRange(range, midId, lowerCounts)
            end
            if hasRecordDifferences(upperCounts) then
                stack[#stack + 1] = createUpperRange(range, midId, upperCounts)
            end
        end
    end
    return resultArr
end
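The flowchart above finds the midpoint ID "using OFFSET/LIMIT". A hypothetical sketch of the SQL that getMidpointId could issue follows; the table and field names are illustrative, not taken from db-sync.lua:

```lua
-- Build a query that returns the ID sitting halfway through a range,
-- without reading the whole range (illustrative sketch only).
local function midpointSql(tbl, idField, startId, endId, rangeCount)
    return string.format(
        "SELECT %s FROM %s WHERE %s > '%s' AND %s <= '%s' ORDER BY %s LIMIT 1 OFFSET %d",
        idField, tbl, idField, startId, idField, endId, idField,
        math.floor(rangeCount / 2))
end

print(midpointSql("product", "record_id", "A0001", "Z9999", 1000))
```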
flowchart LR
    subgraph "READ PHASE"
        A[Read Source Data<br/>in Batches]
        B[Read Target IDs<br/>for Comparison]
        A --> C[Source Data Array]
        B --> D[Target ID Index]
    end
    subgraph "TRANSFORM PHASE"
        C --> E[Compare Records]
        D --> E
        E --> F[Records to ADD]
        E --> G[Records to MODIFY]
        E --> H[Records to DELETE]
    end
    subgraph "WRITE PHASE"
        F --> I[Batch INSERT<br/>Operations]
        G --> J[Batch UPDATE<br/>Operations]
        H --> K[Batch DELETE<br/>Operations]
    end
    subgraph "SCHEMA TRANSLATION"
        L[Field Mapping]
        M[Type Conversion]
        N[Data Validation]
        L --> M --> N
        N --> I
        N --> J
    end
    style A fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
    style B fill:#e3f2fd,stroke:#1976d2,stroke-width:2px,color:#000
    style C fill:#e3f2fd,color:#000
    style D fill:#e3f2fd,color:#000
    style E fill:#fff3e0,stroke:#f57c00,stroke-width:3px,color:#000
    style F fill:#e8f5e8,color:#000
    style G fill:#fff8e1,color:#000
    style H fill:#ffebee,color:#000
    style I fill:#e8f5e8,stroke:#388e3c,stroke-width:2px,color:#000
    style J fill:#fff8e1,stroke:#fbc02d,stroke-width:2px,color:#000
    style K fill:#ffebee,stroke:#d32f2f,stroke-width:2px,color:#000
    style L fill:#f3e5f5,color:#000
    style M fill:#f3e5f5,color:#000
    style N fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px,color:#000
sequenceDiagram
    participant S as Source DB
    participant E as Sync Engine
    participant D as Destination DB
    Note over E: READ PHASE
    E->>S: Read batch 1 (records 1-5000)
    S-->>E: Return source records
    E->>S: Read batch 2 (records 5001-10000)
    S-->>E: Return source records
    E->>D: Read all IDs for comparison
    D-->>E: Return ID array
    Note over E: TRANSFORM PHASE
    E->>E: Build target ID index
    E->>E: Compare source vs target
    E->>E: Categorize: ADD/MODIFY/DELETE
    E->>E: Apply schema transformations
    Note over E: WRITE PHASE
    E->>D: Batch DELETE (if any)
    D-->>E: Confirm deletions
    E->>D: Batch INSERT (new records)
    D-->>E: Confirm insertions
    E->>D: Batch UPDATE (modified records)
    D-->>E: Confirm updates
    Note over E: VERIFICATION
    E->>D: Verify final record counts
    D-->>E: Return counts
-- 1. Read source data in batches
local sourceData = {}
for batchStart = 1, totalCount, batchSize do
    local batch = readBatch(fromId, batchStart, batchSize)
    table.append(sourceData, batch)
end

-- 2. Read destination IDs for comparison
local destIds = readIdArray(toId, idField)
local destIdIndex = invertTable(destIds)

-- 3. Compare and categorize records
local toAdd = {}
local toModify = {}
local toDelete = {}
local sourceIdIndex = {}
for _, sourceRecord in ipairs(sourceData) do
    local id = sourceRecord[idField]
    sourceIdIndex[id] = true
    if destIdIndex[id] then
        -- readRecord is a pseudocode helper fetching the matching destination record
        local destRecord = readRecord(toId, id)
        if recordsAreModified(sourceRecord, destRecord) then
            toModify[#toModify + 1] = sourceRecord
        end
    else
        toAdd[#toAdd + 1] = sourceRecord
    end
end

-- Find records to delete (in destination but not source)
for destId in pairs(destIdIndex) do
    if not sourceIdIndex[destId] then
        toDelete[#toDelete + 1] = destId
    end
end

-- 4. Execute operations in optimal order
if #toDelete > 0 then
    executeBatchDeletes(toId, toDelete, deleteBatchSize)
end
if #toAdd > 0 then
    executeBatchInserts(toId, toAdd, insertBatchSize)
end
if #toModify > 0 then
    executeBatchUpdates(toId, toModify, updateBatchSize)
end

-- 5. Handle schema differences during transform
local transformedRecord = {}
for fieldName, fieldValue in pairs(sourceRecord) do
    local destFieldName = fieldMapping[fieldName] or fieldName
    local destFieldType = destSchema[destFieldName]
    transformedRecord[destFieldName] = convertFieldValue(
        fieldValue,
        sourceSchema[fieldName],
        destFieldType
    )
end
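invertTable in the pipeline above is pseudocode; a minimal Lua version that maps each ID to its array position could look like this (a sketch, not the project's actual helper):

```lua
-- Build a lookup index: value -> position in the array
local function invertTable(arr)
    local index = {}
    for i, v in ipairs(arr) do
        index[v] = i
    end
    return index
end
```

With such an index, membership tests like destIdIndex[id] are O(1) instead of scanning the ID array for every source record.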
The database synchronization system includes sophisticated sleep and scheduling capabilities that enable automated, recurring synchronization with intelligent timing management.
flowchart TD
    A[Sync Complete] --> B{Scheduling<br/>Configuration?}
    B -->|sleep_seconds| C[Interval-based Sleep]
    B -->|run_at_time| D[Time-based Sleep]
    B -->|None| E[Exit]
    C --> F[smartSleep Function]
    D --> F
    F --> G[Calculate Sleep Duration]
    G --> H{Duration > max_chunk?}
    H -->|Yes| I[Sleep in Chunks]
    H -->|No| J[Single Sleep]
    I --> K[Sleep max_chunk seconds]
    K --> L{Target Time<br/>Reached?}
    L -->|No| M[Progress Update]
    M --> N{More Sleep<br/>Needed?}
    N -->|Yes| K
    N -->|No| O[Wake Up]
    L -->|Yes| O
    J --> O
    O --> P[Start Next Sync]
sleep_seconds: Sets automatic recurring synchronization intervals
{
"sleep_seconds": 3600 // Run every hour
}
Use Cases:
run_at_time: Schedule sync at specific times
{
"run_at_time": "02:00" // Run at 2:00 AM daily
}
Use Cases:
max_sleep_chunk_seconds: Maximum sleep duration per chunk (default: 3600 seconds)
{
"max_sleep_chunk_seconds": 3600 // 1-hour maximum chunks
}
Benefits:
function smartSleep(totalSeconds, maxChunkSeconds)
    maxChunkSeconds = maxChunkSeconds or 3600 -- Default 1 hour
    while totalSeconds > 0 do
        local chunkSeconds = math.min(totalSeconds, maxChunkSeconds)
        -- Sleep for this chunk
        util.sleep(chunkSeconds * 1000)
        totalSeconds = totalSeconds - chunkSeconds
        -- Check if target time reached (for run_at_time)
        if shouldWakeUpEarly() then
            break
        end
        -- Progress update
        if totalSeconds > 0 then
            print(string.format("Sleep progress: %d seconds remaining", totalSeconds))
        end
    end
end
For run_at_time scheduling, the system can wake up early when the target time is reached:
-- Example: Scheduled for 02:00, but it's now 01:59
-- System wakes up 1 minute early instead of sleeping full hour
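A sketch of how the remaining sleep time for run_at_time could be computed. This is illustrative, not the actual db-sync.lua code, and it deliberately ignores DST transitions (the naive +86400 rollover can be off by an hour across a DST change):

```lua
-- Seconds from `now` (a Unix timestamp) until the next occurrence of "HH:MM"
local function secondsUntil(targetHHMM, now)
    local h, m = targetHHMM:match("^(%d+):(%d+)$")
    local t = os.date("*t", now)
    local target = os.time({ year = t.year, month = t.month, day = t.day,
                             hour = tonumber(h), min = tonumber(m), sec = 0 })
    if target <= now then
        target = target + 86400 -- today's slot already passed: wait for tomorrow
    end
    return target - now
end
```

For example, at 22:30 with a 02:00 target this yields 12600 seconds (3.5 hours), matching the sample log output below.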
{
"sleep_seconds": 3600,
"max_sleep_chunk_seconds": 1800 // 30-minute chunks
}
Behavior:
{
"run_at_time": "02:00",
"max_sleep_chunk_seconds": 3600 // 1-hour chunks
}
Behavior:
{
"sleep_seconds": 86400, // 24 hours
"max_sleep_chunk_seconds": 3600 // 1-hour chunks
}
Behavior:
Sync complete. Sleeping 86400 seconds until next run...
Sleep progress: 82800 seconds remaining (23.0 hours)
Sleep progress: 79200 seconds remaining (22.0 hours)
Sleep progress: 75600 seconds remaining (21.0 hours)
...
Sleep progress: 3600 seconds remaining (1.0 hours)
Sleep complete. Starting next sync cycle...
Sync complete. Waiting until 02:00 for next run...
Current time: 22:30, target: 02:00 (3.5 hours remaining)
Sleep progress: 10800 seconds remaining (3.0 hours)
Sleep progress: 7200 seconds remaining (2.0 hours)
Sleep progress: 3600 seconds remaining (1.0 hours)
Target time reached. Starting sync at 02:00...
Choose appropriate chunk sizes:
Monitor resource usage:
Error handling integration:
Problem: Sleep seems to hang indefinitely
Solution: Check max_sleep_chunk_seconds configuration
Problem: Sync doesn't start at expected time
Solution: Verify run_at_time format and timezone settings
Problem: Too many progress updates
Solution: Increase max_sleep_chunk_seconds value
Problem: System doesn't wake up early
Solution: Ensure early wake-up logic is properly configured for run_at_time
flowchart TD
    A[Operation Starts] --> B{Error<br/>Occurred?}
    B -->|No| C[Operation Successful]
    B -->|Yes| D[Categorize Error Type]
    D --> E{Connection<br/>Error?}
    D --> F{Schema<br/>Error?}
    D --> G{Data<br/>Error?}
    D --> H{Logic<br/>Error?}
    E -->|Yes| E1[Disconnect All<br/>Connections]
    E1 --> E2[Wait 30 seconds]
    E2 --> E3[Retry Connection]
    E3 --> I{Retry<br/>Successful?}
    F -->|Yes| F1[Check Schema<br/>Compatibility]
    F1 --> F2[Apply Schema<br/>Transformations]
    F2 --> J{Schema<br/>Fixed?}
    G -->|Yes| G1[Isolate Failed<br/>Record]
    G1 --> G2[Continue with<br/>Next Record]
    G2 --> K{Error Count &lt;<br/>max_error_count?}
    H -->|Yes| H1[Check Binary Search<br/>Settings]
    H1 --> H2{Fallback<br/>Enabled?}
    H2 -->|Yes| H3[Fallback to<br/>Full Compare]
    H2 -->|No| L[Fail Operation]
    I -->|Yes| M[Resume Operation]
    I -->|No| L
    J -->|Yes| M
    J -->|No| L
    K -->|Yes| M
    K -->|No| L
    H3 --> M
    M --> B
    C --> N[End Success]
    L --> O[End Failure]
    style A fill:#e1f5fe,stroke:#1976d2,stroke-width:3px,color:#000
    style B fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style C fill:#c8e6c9,color:#000
    style D fill:#fff8e1,color:#000
    style E fill:#ffebee,color:#000
    style F fill:#e8f5e8,color:#000
    style G fill:#f3e5f5,color:#000
    style H fill:#fff3e0,color:#000
    style E1 fill:#ffebee,color:#000
    style E2 fill:#ffebee,color:#000
    style E3 fill:#ffebee,color:#000
    style F1 fill:#e8f5e8,color:#000
    style F2 fill:#e8f5e8,color:#000
    style G1 fill:#f3e5f5,color:#000
    style G2 fill:#f3e5f5,color:#000
    style H1 fill:#fff3e0,color:#000
    style H2 fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style H3 fill:#fff3e0,color:#000
    style I fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style J fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style K fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#000
    style L fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000
    style M fill:#fff3e0,stroke:#fbc02d,stroke-width:2px,color:#000
    style N fill:#c8e6c9,stroke:#388e3c,stroke-width:3px,color:#000
    style O fill:#ffcdd2,stroke:#d32f2f,stroke-width:3px,color:#000
-- Binary search fallback to full compare
if binarySearchFailed and syncPrf.fallback_full_compare_on_mismatch then
    util.printWarning("Binary search failed, falling back to full compare")
    return syncCompare(syncRec, tbl, schema, fromId, toId, fldArr, stat, operation)
end
-- Continue processing other records when individual records fail
saveParam.parameter.continue_on_error = syncPrf.max_error_count
-- Process remaining batches even if one batch fails
-- (Lua has no `continue` statement; a `goto` with a ::nextBatch::
--  label at the end of the loop body serves the same purpose)
if batchError and stat.errorCount < syncPrf.max_error_count then
    stat.errorCount = stat.errorCount + 1
    goto nextBatch -- Process next batch
end
-- Automatic reconnection on connection loss
if connectionLost then
    dconn.disconnectAll()
    defaultConn = nil
    util.sleep(30000) -- Wait 30 seconds
    -- Retry connection on next iteration
end
-- Rollback incomplete operations
if operationFailed then
    if transactionActive then
        rollbackTransaction()
    end
    -- Restore previous state
    restoreFromBackup()
end
max_error_count: 25 -- Stop after 25 errors
max_error_loop: 5 -- Stop after 5 consecutive error loops
-- Comprehensive error tracking
local errorTbl = {}
errorTbl[#errorTbl + 1] = util.printError("Error description: %s", details)

-- Error summary at end
if #errorTbl > 0 then
    util.printRed("Sync completed with %d errors:", #errorTbl)
    for i, errMsg in ipairs(errorTbl) do -- avoid shadowing the `error` builtin
        util.printRed("  %d. %s", i, errMsg)
    end
end
Symptoms:
Diagnosis:
-- Check if binary search is being used
binary_search_for_add: true
binary_search_min_table_size: 1000 -- Lower for smaller tables
-- Monitor query counts in logs
-- Look for "using full compare" vs "using binary search"
Solutions:
- Enable trust_modify_id: true so incremental sync can skip unchanged records
- Limit the scope of the run via the sync_table configuration

Symptoms:
Solutions:
-- Reduce batch sizes
batch_size: 1000 -- Down from 5000
id_array_batch_size: 5000 -- Down from 25000
-- Force garbage collection
collectgarbage() -- Called between tables automatically
Symptoms:
Diagnosis:
-- Check count mismatches in logs
"counts mismatch after binary delete"
"fallback full compare on mismatch"
-- Verify ID field configuration
primary_key_field: "record_id" -- Ensure correct field
Solutions:
- Enable fallback_full_compare_on_mismatch: true
- Set trust_modify_id: false

Symptoms:
Solutions:
-- Map incompatible fields
"field_mapping": {
"source_field": "dest_field",
"varchar_field": "text_field"
}
-- Handle schema differences
set_default_value: true -- Set defaults for missing fields
copy_all_json_keys: true -- Handle JSON field differences
Symptoms:
Solutions:
-- Increase timeout settings (in connection config)
"timeout": 300, -- 5 minutes
"retry_count": 3, -- Retry failed operations
"retry_delay": 5000 -- Wait 5 seconds between retries
The system provides comprehensive performance tracking:
-- Timing metrics
readTimeAll -- Total time reading data
writeTimeAll -- Total time writing data
deleteTimeAll -- Total time deleting data
dataCompareTimeAll -- Total time comparing data
binarySearchTimeAll -- Total time in binary search
-- Operation counts
queryCountAll -- Total database queries
recordCountAllTables -- Total records processed
recordCountAllAdded -- Total records added
recordCountAllModified -- Total records modified
recordCountAllDeleted -- Total records deleted
recordCountAllSkipped -- Total records skipped
-- Binary search metrics
binarySearchCountAll -- Number of binary searches
binarySearchIterationCount -- Total search iterations
binarySearchQueryCount -- Queries used in binary search
binarySearchMaxDepthAll -- Maximum search depth reached
-- Enable detailed logging
show_sql: true -- Show SQL queries
debug_sql: true -- Show SQL execution details
show_save_sql: true -- Show save operation SQL
debug_connection_change: true -- Show connection changes
-- Performance analysis
only_sync_plan: true -- Plan-only mode (no actual sync)
check_all_tables: true -- Check all tables regardless of changes
Real-time progress display shows:
delete binary search 179196 records, search 5000 recent first, need to find 2:
2.8%↑5000 97%↓174197 49%↑87097 24%↑43547 12%↓21775 6.1%↑10886
Key log patterns to monitor:
# Successful operations
"records to sync: 100 / 5000, records synced: 4900 / 5000"
"using binary search for adds (from=5100, to=5000, diff=100)"
# Performance warnings
"using full compare for adds (reason=large difference)"
"fallback full compare on mismatch"
# Error conditions
"Binary search: Invalid toCount=0"
"sync connection failed"
"maximum error count 25 was reached"
Start with Conservative Settings:
{
"batch_size": 1000,
"binary_search_min_table_size": 5000,
"trust_modify_id": false,
"fallback_full_compare_on_mismatch": true
}
Scale Up Gradually:
{
"batch_size": 5000, -- Increase as performance allows
"trust_modify_id": true, -- Enable after initial sync
"binary_search_recent_first": 5000 -- Focus on recent changes
}
Essential Indexes:
-- Primary key index (usually automatic)
CREATE INDEX idx_table_record_id ON table_name (record_id);
-- Modify time index for incremental sync
CREATE INDEX idx_table_modify_time ON table_name (modify_time);
-- Composite index for range queries
CREATE INDEX idx_table_id_modify ON table_name (record_id, modify_time);
Statistics Updates:
-- Keep database statistics current
ANALYZE TABLE table_name;
UPDATE STATISTICS table_name;
{
"trust_modify_id": false, -- Full compare for accuracy
"binary_search_for_add": true, -- Use optimization
"sync_all": true -- Sync everything
}
{
"trust_modify_id": true, -- Fast incremental
"sleep_seconds": 3600, -- Hourly sync
"min_rows_to_sync": 1 -- Skip if no changes
}
Validate Configuration:
-- Test with single table first
"sync_table": ["test_table"],
"only_sync_plan": true -- Dry run mode
-- Enable all safety features
"fallback_full_compare_on_mismatch": true,
"max_error_count": 5, -- Low threshold for testing
"continue_on_error": false -- Stop on first error
Monitor Resource Usage:
# Monitor memory usage
top -p $(pgrep lj)
# Monitor disk I/O
iostat -x 1
# Monitor network traffic
netstat -i
Daily:
Weekly:
Monthly:
The database synchronization system is a robust, production-ready solution for keeping multiple databases in sync. Its sophisticated algorithms, comprehensive error handling, and extensive optimization features make it suitable for demanding enterprise environments.
Key success factors:
For additional support and configuration examples, refer to the db-sync.json configuration manual and the binary search progress display manual.