# Cross-Process Locking Mechanisms (Windows vs. macOS/Linux)

Choosing the right cross-process locking mechanism is crucial for building robust, scalable, and maintainable applications. A poor choice can lead to performance bottlenecks, data corruption, deadlocks, or orphaned resources after a process crash. This document analyzes various locking primitives available on Windows, macOS, and Linux to guide the selection of the most appropriate strategy for our needs.

---

## 1. Guiding Principles

Our goal is to select primitives that satisfy the following criteria:

1. **Crash Safety:** The system must remain consistent, and resources must be recoverable if a process holding a lock crashes.
2. **Low Deadlock & Leak Risk:** The chosen mechanism should minimize the risk of algorithmic deadlocks and leaked kernel objects.
3. **Sufficient Performance:** It must be fast enough for its intended use case without becoming a bottleneck.
4. **Simplicity:** The implementation should be straightforward to use, understand, and maintain.
5. **Portability:** Where possible, we prefer solutions that work across platforms or have clear, reliable fallbacks.

---

## 2. Taxonomy of Locking Primitives

| Category | Primitive | Win | macOS | Linux | Speed* | Crash Safety | Deadlock / Leak Risk** | Portability | Notes |
|----------|----------|-----|-------|-------|--------|--------------|------------------------|-------------|-------|
| Kernel object | Named Mutex | ✅ | ❌ | ❌ | A | Abandoned detection | Normal (ordering still matters) | Low | Best Windows general choice |
| Kernel object | Named Semaphore (sem_open) | ❌ | ✅ | ✅ | A/B | No owner death info | Low/Normal | Medium | Good portable Unix choice |
| Kernel object | Robust POSIX Mutex (pthread_mutexattr_setrobust + shared) | ❌ | ❌ | ✅ | A | Detects owner death | Low | Medium | Linux only; macOS lacks robust mutex support |
| Kernel object | System V Semaphore set | ❌ | ✅ | ✅ | A | Object survives crashes (must remove); `SEM_UNDO` reverts a dead process's operations | Medium (leak risk) | Medium | Legacy but fast |
| Kernel futex | Futex (raw) | ❌ | ❌ | ✅ | A+ | Depends on protocol | High (DIY protocol) | Low | Lowest overhead; complex |
| File system | flock(2) (advisory) | ❌ | ✅ | ✅ | B/C | Auto release on close | Medium (ordering, NFS quirks) | High | Simple; whole-file; NFS unreliable |
| File system | fcntl record lock | ❌ | ✅ | ✅ | B/C | Auto release on close | Medium (gotchas, per‑process semantics) | High | Byte‑range; advisory |
| File system | Windows LockFile(Ex) | ✅ | ❌ | ❌ | B/C | Auto release on close | Medium | Low | Byte‑range; mandatory (enforced by the OS), unlike Unix advisory locks |
| Lock file pattern | create(O_EXCL) / atomic rename | ✅ | ✅ | ✅ | C | Crash leaves stale file | Medium/High (needs stale cleanup) | High | Simple mutual exclusion, not re‑entrant |
| Shared memory + atomics | Spin / backoff lock | ✅ | ✅ | ✅ | A+ (low contention) | None (no owner tracking) | High (livelock, priority inversion) | High | Only for very short critical sections |
| Higher level | SQLite transaction (BEGIN IMMEDIATE) | ✅ | ✅ | ✅ | C | WAL handles crash | Low | High | Heavy but very robust |

*\*Speed buckets (rough, uncontended, same machine): A+ ~ sub‑microsecond, A ~ 1–5µs, B ~ 5–25µs, C ~ 25–250µs. Numbers vary by kernel/version/CPU.*

*\*\*Deadlock / leak risk reflects how easily the primitive can be misused in a way that causes a permanent wait or a leaked kernel object. It does not mean the primitive prevents algorithmic deadlock: inconsistent lock ordering can deadlock any blocking primitive.*

---

## 3. Key Platform Differences & Crash Recovery

The most significant differentiator for robust locking is how a primitive behaves when a process crashes.

| Primitive | Signal of Owner Death | Required Action by New Acquirer |
|-----------|-----------------------|---------------------------------|
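| Win Named Mutex | `WAIT_ABANDONED` | The caller now owns the mutex: repair shared state, continue, and call `ReleaseMutex` when done. |
| Robust pthread mutex | `EOWNERDEAD` on lock | The caller now holds the lock: repair shared state, then call `pthread_mutex_consistent` before unlocking. Unlocking without it leaves the mutex permanently unusable (`ENOTRECOVERABLE`). |
| POSIX named semaphore | None | Needs an external watchdog or timeout to detect stale locks. |
| flock / fcntl | None (the OS releases the lock when the owner exits) | Nothing to do for the lock itself, but the protected data may be half-written and must be validated by other means. |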

Windows **Named Mutexes** and POSIX **Robust Mutexes** are superior for building resilient systems because they explicitly inform the next process that the previous owner died. This allows the application to run a data recovery/consistency check routine before proceeding. File locks are simpler because the OS handles cleanup, but they offer no direct signal to the application that a crash occurred.
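
For primitives with no owner-death signal, such as POSIX named semaphores, one common workaround is a lease: the holder records its PID and a heartbeat timestamp next to the protected data, and a waiter whose timed wait expires treats an old heartbeat as a stale lock. The sketch below shows only the staleness decision. The record layout, the `ttl_seconds` parameter, and all function names are assumptions made for illustration; where the lease is stored (shared memory, a file) is left open.

```lua
-- Hypothetical lease record kept alongside the protected data:
--   { owner_pid = <number>, heartbeat = <seconds since epoch> }

-- Build a fresh lease for the current holder.
function new_lease(owner_pid, now)
  return { owner_pid = owner_pid, heartbeat = now }
end

-- The holder calls this periodically while it works.
function renew_lease(lease, now)
  lease.heartbeat = now
end

-- A waiter calls this after its timed wait expires.
-- Returns true when the holder has not renewed within ttl_seconds.
function is_lease_stale(lease, now, ttl_seconds)
  if lease == nil then
    return false            -- nobody claims to hold the lock
  end
  return (now - lease.heartbeat) > ttl_seconds
end

-- Example: holder 4242 took the lock at t=100 and renewed at t=105.
local lease = new_lease(4242, 100)
renew_lease(lease, 105)
print(is_lease_stale(lease, 110, 10))  -- false: 5 s since last heartbeat
print(is_lease_stale(lease, 120, 10))  -- true: 15 s since last heartbeat
```

A timeout can misfire when a live holder is merely slow, so the TTL should comfortably exceed the longest expected critical section.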

---

## 4. Deadlock: Prevention vs. Recovery

It is critical to understand that **no locking primitive inherently prevents logical deadlocks**. A deadlock is an algorithmic problem, typically caused by multiple processes acquiring locks in an inconsistent order.

- **Prevention** is an architectural concern. You prevent deadlocks by enforcing a strict lock acquisition order, using `try_lock` patterns with backoffs, or designing lock-free data structures.
- **Recovery** is what the advanced primitives offer. They don't prevent the deadlock itself, but they can detect when a lock-holding process has *crashed*. This allows another process to acquire the "abandoned" lock and perform cleanup, preventing the resource from being permanently locked.
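
The two prevention techniques named above, a fixed acquisition order and `try_lock` with backoff, can be combined as in the following sketch. It assumes each lock object exposes `try_lock()` and `unlock()` methods and that the caller passes in a `sleep` function, since standard Lua has none; those names, and the choice of sorting by lock name as the global order, are illustrative assumptions.

```lua
-- Acquire several locks in one global order (here: sorted by name) so that
-- two processes can never hold them in opposite orders.
-- `locks` maps name -> lock object; each lock object is assumed to expose
-- try_lock() -> boolean and unlock(). Both method names are hypothetical.
function acquire_in_order(locks, names, max_attempts, sleep)
  local ordered = {}
  for i, name in ipairs(names) do ordered[i] = name end
  table.sort(ordered)

  local delay = 0.001                      -- initial backoff in seconds
  for _ = 1, max_attempts do
    local held = {}
    local got_all = true
    for _, name in ipairs(ordered) do
      if locks[name]:try_lock() then
        held[#held + 1] = name
      else
        got_all = false
        break
      end
    end
    if got_all then
      return ordered                       -- caller now holds every lock
    end
    -- Could not get them all: release in reverse order and back off.
    for i = #held, 1, -1 do
      locks[held[i]]:unlock()
    end
    sleep(delay)
    delay = math.min(delay * 2, 0.1)       -- exponential backoff, capped
  end
  return nil, "timeout"
end
```

Releasing everything before backing off means a process never sleeps while holding a partial set of locks, which is what removes the circular wait.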

---

## 5. Performance (Typical Orders of Magnitude)

Uncontended path (approximate):

| Primitive | Typical Cost | Notes |
|-----------|--------------|-------|
| Spin + atomic (short) | < 100 ns | Risk: burns CPU, bad under high contention. |
| Futex / robust mutex (fast path) | ~200–700 ns | User-space atomic on the fast path; enters the kernel only under contention. |
| Windows Named Mutex (uncontended) | ~0.8–2 µs | Kernel object: every wait and release is a system call, though a well-optimized one. |
| POSIX named semaphore | ~1–5 µs | macOS often toward the higher end. |
| flock / fcntl (cached inode, local FS) | ~5–40 µs | Filesystem overhead is significant. |
| File lock on network / cold cache | 100 µs – ms | Network latency dominates. |

Contention increases cost sharply for all primitives. Those that sleep in the kernel (mutex, futex, semaphore) scale better under load than pure spinlocks.
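
Because these figures vary by kernel, version, and CPU, it is worth measuring on the target machine. The harness below is a minimal sketch for the uncontended case only. `bench_lock` and its parameters are hypothetical names, and the caller supplies the real acquire and release functions.

```lua
-- Measure the average uncontended cost of one acquire/release pair.
-- `acquire` and `release` are caller-supplied functions (hypothetical names).
-- os.clock() reports CPU time, which is adequate for an uncontended loop
-- that never sleeps.
function bench_lock(acquire, release, iterations)
  -- Warm up so one-time setup costs are excluded from the measurement.
  for _ = 1, 1000 do
    acquire()
    release()
  end

  local start = os.clock()
  for _ = 1, iterations do
    acquire()
    release()
  end
  local elapsed = os.clock() - start
  return elapsed / iterations * 1e9        -- nanoseconds per pair
end

-- Example with a no-op "lock": this measures only loop and call overhead
-- and gives a baseline to subtract from real measurements.
local baseline_ns = bench_lock(function() end, function() end, 1000000)
print(string.format("baseline: %.1f ns per acquire/release", baseline_ns))
```

Contended behavior needs several processes and wall-clock timing, which this sketch does not attempt.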

---

## 6. Recommendations (Local Single Host)

| Use Case | Windows | macOS/Linux | Rationale |
|----------|---------|-------------|-----------|
| **General Mutual Exclusion** | **Named Mutex** | **Robust Shared `pthread_mutex`** on Linux (preferred); POSIX Named Semaphore or `flock` on macOS, which lacks robust mutexes | Best combination of crash visibility, performance, and simplicity. |
| High-Frequency Critical Section | SRW lock or Critical Section if all contenders are in one process; otherwise Named Mutex | Robust Shared Mutex or a futex-based library (both Linux) | Lower latency for performance-critical paths. |
| Simple, Portable Fallback | File Lock (`LockFileEx`) | `flock` (whole file) | Minimal dependencies, universally available. |
| Non-Blocking Optional Work | `WaitForSingleObject` on the mutex with a timeout of 0 | `pthread_mutex_trylock` / `sem_trywait` | Avoids stalling a thread when a resource is busy. |
| **Strict Crash Detection** | **Named Mutex** | **Robust Shared `pthread_mutex`** (Linux). macOS has no equivalent: `flock` releases on owner death but does not report it | The only primitives that provide an explicit "owner died" signal. |
| Avoid Filesystem Dependency | Named Mutex | Shared Memory + Robust Mutex (Linux); POSIX Named Semaphore (macOS) | Fewer I/O failure modes, not subject to filesystem quirks. |

---

## 7. Cross-Platform Abstraction Strategy

A tiered approach provides the best combination of features and portability:

```lua
-- Pseudocode for selecting the best available lock primitive
function create_lock(name)
  if is_windows() then
    return create_windows_named_mutex(name)      -- Tier 1: Best features on Windows
  elseif supports_robust_mutexes() then
    return create_robust_shared_mutex(name)      -- Tier 1: Best features on modern Linux/BSD
  elseif has_posix_named_semaphores() then
    return create_posix_named_semaphore(name)    -- Tier 2: Good fallback, but no crash detection
  else
    return create_flock_file_lock(name)          -- Tier 3: Safest, most portable last resort
  end
end
```

**Enhancements for a robust implementation:**

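1. **Uniform Naming:** Use a consistent prefix, e.g., `nc:lock:v1:job-queue`, to avoid collisions, and map the logical name to each platform's rules (POSIX shared-memory and semaphore names start with `/`; Windows names may take a `Global\` prefix).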
2. **Diagnostic Logging:** Log detailed information (lock name, waiter PID, owner PID) if a wait exceeds a threshold (e.g., >100ms).
3. **Metrics:** Instrument the code to count acquisitions, wait durations, timeouts, and abandoned/`EOWNERDEAD` events.
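
The enhancements above might look like the following sketch. It makes several assumptions: the sanitizing rule in `platform_lock_name` is one plausible choice rather than a requirement, the wrapped lock exposes `acquire()` returning `"ok"` or `"abandoned"` plus `release()`, and `clock` and `log` are injected by the caller. For brevity it logs a slow wait after the lock is obtained instead of while still waiting, and it does not report the owner PID.

```lua
-- Map a logical lock name to a platform-specific object name.
-- The sanitizing rule (anything outside [A-Za-z0-9._-] becomes "_") is an
-- assumption chosen to be safe for both POSIX and Windows object names.
function platform_lock_name(logical, platform)
  local safe = logical:gsub("[^%w%._%-]", "_")
  if platform == "windows" then
    return "Global\\" .. safe      -- machine-wide kernel object namespace
  end
  return "/" .. safe               -- POSIX names start with a single slash
end

-- Wrap any lock with counters and slow-wait logging.
-- `inner.acquire()` is assumed to block and return "ok" or "abandoned";
-- `clock` returns seconds and `log` takes a string (all hypothetical).
function instrument(name, inner, clock, log, slow_threshold)
  local stats = { acquisitions = 0, abandoned = 0, slow_waits = 0, total_wait = 0 }
  local wrapped = { stats = stats }

  function wrapped.acquire()
    local t0 = clock()
    local result = inner.acquire()
    local waited = clock() - t0
    stats.acquisitions = stats.acquisitions + 1
    stats.total_wait = stats.total_wait + waited
    if result == "abandoned" then
      stats.abandoned = stats.abandoned + 1
    end
    if waited > slow_threshold then
      stats.slow_waits = stats.slow_waits + 1
      log(string.format("slow lock wait: name=%s waited=%.3fs", name, waited))
    end
    return result
  end

  function wrapped.release()
    return inner.release()
  end

  return wrapped
end

print(platform_lock_name("nc:lock:v1:job-queue", "posix"))
-- prints /nc_lock_v1_job-queue
```

Some platforms impose short limits on object names, so a real implementation would also need to check the length of the mapped name.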

---

## 8. When File Locks Are Still a Good Choice

Despite being slower, file locking (`flock`, `fcntl`) is perfectly acceptable and often preferred for:

1. **Low-frequency tasks:** Migrations, cron jobs, or single-writer data generation where performance is not critical.
2. **Simplicity and Portability:** When you need a lock that "just works" everywhere with minimal code.
3. **External Visibility:** Admins can often inspect a lock file on the filesystem.

**Constraints:** Always use on a local filesystem. File locking behavior over network filesystems like NFS is notoriously unreliable.
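
For reference, a whole-file `flock` lock needs very little code. The sketch below assumes LuaJIT on Linux or macOS. The flag values are hard-coded for those platforms (Linux on x86/ARM in particular), and the function names `open_lock_file`, `try_flock`, and `unlock_flock` are hypothetical.

```lua
local ffi = require("ffi")
local bit = require("bit")

ffi.cdef[[
int open(const char *path, int flags, ...);
int close(int fd);
int flock(int fd, int operation);
]]

local C = ffi.C

-- Flag values are platform-specific. O_CREAT is 0x40 on Linux (x86/ARM)
-- and 0x200 on macOS and the BSDs; the LOCK_* values are the same on both.
local O_RDWR  = 2
local O_CREAT = (ffi.os == "Linux") and 0x0040 or 0x0200
local LOCK_EX, LOCK_NB, LOCK_UN = 2, 4, 8

-- Open (creating if needed) the lock file and return its descriptor.
function open_lock_file(path)
  -- open() is variadic, so the mode must be passed as an explicit C int.
  local mode = ffi.new("int", 420)         -- 420 decimal == 0644 octal
  local fd = C.open(path, bit.bor(O_RDWR, O_CREAT), mode)
  if fd < 0 then
    error("open failed, errno " .. ffi.errno())
  end
  return fd
end

-- Try to take an exclusive lock without blocking. Returns true on success.
function try_flock(fd)
  return C.flock(fd, bit.bor(LOCK_EX, LOCK_NB)) == 0
end

-- Release the lock explicitly; closing the descriptor also releases it.
function unlock_flock(fd)
  return C.flock(fd, LOCK_UN) == 0
end
```

A caller would typically open the lock file once, call `try_flock` before the task, and call `unlock_flock` afterwards. If the process dies in between, the kernel closes the descriptor and the lock is released without any cleanup code.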

---

## 9. Common Mistakes & How To Avoid Them

| Mistake | How to Fix |
|---------|------------|
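| Treating semaphores as mutexes | Semaphores lack ownership: any thread or process can post a semaphore it never waited on. Prefer a real mutex for mutual exclusion where one is available. |
| Re-locking a non-recursive mutex | A default (non-recursive) POSIX mutex deadlocks, or has undefined behavior, when its owner locks it again. Windows mutexes are recursive, but every successful wait needs a matching `ReleaseMutex`. Rely on recursion only when necessary and with caution. |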
| Assuming `flock` is reliable on NFS | It isn't. Avoid it or use specialized NFS-aware locking protocols. |
| **Ignoring `EOWNERDEAD` / `WAIT_ABANDONED`** | This is the most critical error. **Always** check for these return codes, repair data, and mark the state as consistent. |
| Spinning under high contention | Use a blocking mutex to yield the CPU instead of burning it in a spin-loop. |

---

## 10. Minimal Pseudocode Snippets (LuaJIT FFI)

These examples illustrate the core concepts using LuaJIT's FFI. The patterns are translatable to any language with C interop.

### Robust Shared Mutex (Unix)

```lua
-- For full example, see original document. This is a conceptual summary.
local ffi = require('ffi')
-- ... (cdefs for shm_open, mmap, pthread_mutex_*) ...
local C = ffi.C

-- Error number on Linux x86/ARM; other platforms and architectures differ.
local EOWNERDEAD = 130

function create_robust_mutex(name)
  -- 1. shm_open() to create/open a shared memory segment
  -- 2. ftruncate() to set its size
  -- 3. mmap() to map it into the process address space
  -- 4. Initialize a pthread_mutexattr_t with PTHREAD_PROCESS_SHARED and PTHREAD_MUTEX_ROBUST
  -- 5. pthread_mutex_init() the mutex in the shared memory with the attributes
  --    (exactly once, e.g. by the process that created the segment)
  -- Return the mapped memory address
end

function lock(mutex_ptr)
  -- pthread functions return the error number directly; they do not set errno.
  local r = C.pthread_mutex_lock(mutex_ptr)
  if r == EOWNERDEAD then
    print("Lock acquired, but previous owner died. Repairing state...")
    -- ... repair shared data structures ...
    -- Must be called before unlocking, or the mutex becomes unusable.
    local rc = C.pthread_mutex_consistent(mutex_ptr)
    if rc ~= 0 then
      error("Failed to mark mutex consistent: " .. rc)
    end
  elseif r ~= 0 then
    error("Failed to lock mutex: " .. r)
  end
end
```

### Windows Named Mutex

```lua
-- For full example, see original document. This is a conceptual summary.
local ffi = require('ffi')
-- ... (cdefs for CreateMutexW, WaitForSingleObject, ReleaseMutex) ...
local C = ffi.C

-- Values from the Windows SDK headers.
local INFINITE       = 0xFFFFFFFF
local WAIT_OBJECT_0  = 0x00000000
local WAIT_ABANDONED = 0x00000080

function create_named_mutex(name)
  -- 1. Create a UTF-16 version of the name, e.g., "Global\\my_app_lock"
  -- 2. Call CreateMutexW()
  -- Return the HANDLE
end

function acquire(mutex_handle)
  local r = C.WaitForSingleObject(mutex_handle, INFINITE)
  if r == WAIT_ABANDONED then
    print("Lock acquired, but previous owner died. Repairing state...")
    -- ... repair shared data structures ...
    -- NOTE: You still own the lock now. Release it when done.
    return "abandoned"
  elseif r == WAIT_OBJECT_0 then
    return "ok"
  else
    -- Typically WAIT_FAILED; GetLastError() gives the reason.
    error("Failed to acquire mutex (wait result " .. tostring(r) .. ")")
  end
end
```
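
Both snippets leave the release step to the caller, which is where leaks tend to creep in when the critical section raises an error. One way to close that gap is a small guard built on `pcall`, sketched below. It assumes an `acquire` that returns `"ok"` or `"abandoned"` as in the Windows snippet; `with_lock`, `release`, `repair`, and `fn` are hypothetical names supplied by the caller.

```lua
-- Run fn while holding the lock and release it on every exit path.
-- `acquire(handle)` is assumed to return "ok" or "abandoned";
-- `release`, `repair`, and `fn` are caller-supplied (hypothetical) functions.
function with_lock(handle, acquire, release, repair, fn)
  local status = acquire(handle)
  local ok, result = pcall(function()
    if status == "abandoned" then
      repair()                 -- previous owner died: fix shared state first
    end
    return fn()
  end)
  release(handle)              -- always runs, even if repair() or fn() raised
  if not ok then
    error(result, 0)           -- re-raise the original error unchanged
  end
  return result
end
```

This sketch returns only the first result of `fn`, and it covers Lua errors only: a hard crash of the process still relies on the abandoned-mutex or `EOWNERDEAD` path.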

---

## 11. Summary & Actionable Next Steps

**Summary:**

- **What are the best options?** For robust, high-performance locking, **Windows Named Mutexes** and **POSIX Robust Shared Mutexes** (Linux; not available on macOS) are the top choices. They are the only common primitives that directly report owner death.
- **Which are fastest?** For raw speed, futex-based user-space locks are fastest, but complex. For safe and practical speed, robust mutexes and Windows named mutexes are excellent. File locks are an order of magnitude slower but acceptable for many tasks.
- **Which have the least problems?** Primitives with crash detection (Named Mutex, robust mutex) have the least risk of creating permanently orphaned resources. File locks are also very safe in this regard, as the OS cleans them up automatically.

**Actionable Next Steps for This Project:**

1. **Implement a cross-platform locking library** using the tiered abstraction strategy outlined in section 7.
2. **Prioritize Robust Mutexes:** Implement the `shm_open` + robust `pthread_mutex` pattern on Linux and any other Unix-like system that supports it. macOS does not, so it starts at the fallback tiers.
3. **Provide Fallbacks:** Ensure the library falls back gracefully to POSIX semaphores and finally to `flock` if modern features are unavailable.
4. **Add a Test Harness:** Create tests that simulate a process crash (e.g., `kill -9` on Unix, forced process termination on Windows) while the lock is held, to verify that the `EOWNERDEAD` / `WAIT_ABANDONED` recovery path works correctly.
5. **Instrument Code:** Add metrics and logging to monitor lock contention, wait times, and crash recovery events in production.

---

## 12. References / Further Reading

For deeper research, consult the official man pages and documentation:

1. `pthread_mutexattr_setrobust(3)`
2. `sem_overview(7)`, `futex(7)`
3. Windows Docs: `CreateMutex`, `WaitForSingleObject`, Synchronization Objects
4. `flock(2)`, `fcntl(2)`

---

*Last updated: 2025-08-15.*
