Fix deadlock when canceling queries stuck in VDBE or busy handler by mjc · Pull Request #343 · elixir-sqlite/exqlite

mjc · 2026-02-26T03:40:27Z

Partially addresses elixir-sqlite/exqlite#192.

Summary

This PR fixes a deadlock where DBConnection.disconnect/2 could block forever while closing a connection whose dirty NIF thread was still inside SQLite. The dirty NIF execution model is unchanged; the fix is to make the blocked SQLite operation cancellable so it can release conn->mutex and allow close to finish.

The deadlock can happen in two important cases: a long-running statement inside SQLite VDBE execution, or a statement sleeping in SQLite's busy handler while waiting for a lock. In both cases the NIF call holds conn->mutex. If the pool times out and disconnect tries to close the connection, close also needs conn->mutex, so it waits behind the stuck operation.

What changed

This adds Exqlite.Sqlite3.cancel/1, which sets a connection cancellation flag and calls sqlite3_interrupt(). Exqlite.Connection.disconnect/2 now calls cancel/1 before close/1, so teardown can interrupt a running statement or cause a busy wait to stop retrying.

This replaces SQLite's default busy handler with an Exqlite busy handler. The handler preserves SQLite's default retry delay schedule, stores the timeout in conn->busy_timeout_ms, sleeps with sqlite3_sleep(), and checks the cancellation flag between sleeps. This avoids the non-cancellable behavior of SQLite's default busy handler while keeping the implementation portable.
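
As a rough illustration of the handler described above, here is a minimal sketch rather than the PR's actual code. The `busy_state_t` struct, its field names, and the injected `sleep_ms` callback are assumptions made so the example compiles without SQLite. A real NIF would call `sqlite3_sleep()`, register the function with `sqlite3_busy_handler()`, and read the flag under `interrupt_mutex`.

```c
#include <stddef.h>

/* Hypothetical per-connection state; field names are illustrative only. */
typedef struct {
    int cancelled;         /* set by the cancel path */
    int busy_timeout_ms;   /* total time the handler may spend retrying */
    int elapsed_ms;        /* time slept so far for the current operation */
    void (*sleep_ms)(int); /* injected sleep; a NIF would call sqlite3_sleep() */
} busy_state_t;

/* Retry delays in milliseconds, as listed in the PR description. */
static const int busy_delays_ms[] = {1, 2, 5, 10, 15, 20, 25, 25, 25, 50, 50};
#define BUSY_DELAY_COUNT (sizeof(busy_delays_ms) / sizeof(busy_delays_ms[0]))

/* Stand-in for sqlite3_sleep() that only records the requested time. */
static int total_requested_ms = 0;
static void fake_sleep(int ms) { total_requested_ms += ms; }

/* Same shape as a SQLite busy handler: non-zero means retry, 0 means give up. */
static int cancellable_busy_handler(void *arg, int count) {
    busy_state_t *st = (busy_state_t *)arg;

    if (st->cancelled) return 0;
    if (st->elapsed_ms >= st->busy_timeout_ms) return 0;

    /* Follow the ramp, then stay at 50 ms; never sleep past the timeout. */
    int delay = (count >= 0 && (size_t)count < BUSY_DELAY_COUNT)
                    ? busy_delays_ms[count]
                    : 50;
    int remaining = st->busy_timeout_ms - st->elapsed_ms;
    if (delay > remaining) delay = remaining;

    st->sleep_ms(delay);
    st->elapsed_ms += delay;

    /* Re-check after waking so a cancel issued during the sleep stops retries. */
    return st->cancelled ? 0 : 1;
}
```

Passing `fake_sleep` as `sleep_ms` lets the retry logic be exercised without actually sleeping. The sketch omits locking for brevity.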

This adds Exqlite.Sqlite3.set_busy_timeout/2 and changes connection setup to use it instead of PRAGMA busy_timeout. PRAGMA busy_timeout calls sqlite3_busy_timeout(), which installs SQLite's default busy handler and replaces any custom handler. The new API updates Exqlite's stored timeout without destroying the cancellable busy handler.

This adds a configurable progress handler interval through Exqlite.Sqlite3.set_progress_handler_steps/2 and the :progress_handler_steps connection option. The default is 1000 VDBE steps. Values less than 1 disable the progress handler for callers that want to avoid that overhead.
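
To make the step interval concrete, the following toy model may help. It is a sketch under stated assumptions, not the NIF code. `cancel_state_t`, `progress_cb`, and `run_with_progress` are hypothetical names, and the loop only imitates how SQLite invokes a progress callback roughly every N VDBE instructions. In a real build the callback would be registered with `sqlite3_progress_handler()`, where N less than 1 disables it.

```c
/* Hypothetical cancellation state shared with the cancel path. */
typedef struct {
    int cancelled;
} cancel_state_t;

/* Same shape as a SQLite progress callback: non-zero requests an interrupt. */
static int progress_cb(void *arg) {
    const cancel_state_t *st = (const cancel_state_t *)arg;
    return st->cancelled;
}

/* Toy stand-in for the VDBE loop. Runs up to total_ops "instructions" and
 * invokes the callback every `steps` instructions; steps < 1 disables it.
 * cancel_at_op simulates another thread cancelling once that many
 * instructions have run (use -1 for "never").
 * Returns the number of instructions executed before stopping. */
static int run_with_progress(cancel_state_t *st, int total_ops, int steps,
                             int cancel_at_op) {
    int executed = 0;
    while (executed < total_ops) {
        executed++;
        if (executed == cancel_at_op) st->cancelled = 1;
        if (steps >= 1 && executed % steps == 0 && progress_cb(st)) break;
    }
    return executed;
}
```

In this model a cancel that lands at instruction 2500 with a 1000-step interval is noticed at instruction 3000. A smaller interval reacts sooner at the cost of more callback invocations, and an interval below 1 never notices the flag.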

The shared cancellation flag is protected by interrupt_mutex. The busy handler, progress handler, cancel/1, and close/destructor paths read or write that flag under the same lock. The busy-timeout and progress-handler-step fields are also updated under interrupt_mutex.

User-facing behavior

High-level users who configure :busy_timeout on Exqlite.start_link/1 or an Ecto SQLite repo should not need to change anything. The connection setup path now applies that timeout through Exqlite.Sqlite3.set_busy_timeout/2.

Low-level users who execute PRAGMA busy_timeout = ... directly should migrate to Exqlite.Sqlite3.set_busy_timeout/2 if they want to keep cancellation of busy waits. The pragma still takes effect in SQLite, but it replaces Exqlite's custom busy handler with SQLite's default handler, so cancel/1 can no longer stop the busy-handler retry loop before SQLite's busy timeout expires.

Exqlite.Sqlite3.interrupt/1 remains available for SQLite interruption. Exqlite.Sqlite3.cancel/1 is the stronger teardown-oriented API because it sets Exqlite's cancellation flag for the busy/progress handlers and also calls sqlite3_interrupt().

Implementation notes

The busy handler uses SQLite's default delay schedule: 1, 2, 5, 10, 15, 20, 25, 25, 25, 50, 50 ms, then 50 ms for later retries. Cancellation is observed between sleep intervals, so a busy wait exits after the current sleep finishes, with later sleeps capped at 50 ms.
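
The arithmetic behind that schedule can be sketched as follows. The helper names are hypothetical, and the table simply copies the delays quoted in the paragraph above.

```c
/* Delay ramp from the paragraph above, in milliseconds. */
static const int ramp_ms[] = {1, 2, 5, 10, 15, 20, 25, 25, 25, 50, 50};
#define RAMP_LEN ((int)(sizeof(ramp_ms) / sizeof(ramp_ms[0])))

/* Delay for the retry with 0-based index `count`; later retries use 50 ms. */
static int ramp_delay_ms(int count) {
    if (count < 0) count = 0;
    return count < RAMP_LEN ? ramp_ms[count] : 50;
}

/* Total time slept after `retries` complete sleeps. */
static int ramp_elapsed_ms(int retries) {
    int total = 0;
    for (int i = 0; i < retries; i++) total += ramp_delay_ms(i);
    return total;
}
```

Under this model the whole ramp takes 228 ms, after which every retry sleeps 50 ms. A cancel request is therefore noticed within one `ramp_delay_ms(count)` interval, which is never more than 50 ms.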

The NIF stores the calling environment and pid while SQLite operations are running so the busy handler can detect when the caller process has died. If the caller is gone, the busy handler sets the cancellation flag and stops retrying.

cancel/1 and interrupt/1 avoid taking conn->mutex while a query may be running. A running SQLite call holds that mutex, so taking it in the cancellation path would block behind the operation we are trying to cancel. The timeout/progress configuration setters still take conn->mutex because they are not used to break a currently running SQLite call.
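
The locking split can be illustrated with plain pthreads. This is a simplified sketch, not the NIF code: `demo_conn_t` and the `demo_*` functions are hypothetical, a spinning worker stands in for a long-running SQLite call, and the real implementation uses erl_nif mutexes rather than pthreads.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical connection with the two locks described above. */
typedef struct {
    pthread_mutex_t op_mutex;        /* held for the whole "SQLite call" */
    pthread_mutex_t interrupt_mutex; /* guards only the cancelled flag */
    int cancelled;
    int closed;
} demo_conn_t;

static int demo_is_cancelled(demo_conn_t *c) {
    pthread_mutex_lock(&c->interrupt_mutex);
    int v = c->cancelled;
    pthread_mutex_unlock(&c->interrupt_mutex);
    return v;
}

/* Stand-in for a long-running statement: holds op_mutex until cancelled. */
static void *demo_query(void *arg) {
    demo_conn_t *c = (demo_conn_t *)arg;
    pthread_mutex_lock(&c->op_mutex);
    while (!demo_is_cancelled(c)) {
        sched_yield();
    }
    pthread_mutex_unlock(&c->op_mutex);
    return NULL;
}

/* Cancel takes only interrupt_mutex, so it never queues behind the query. */
static void demo_cancel(demo_conn_t *c) {
    pthread_mutex_lock(&c->interrupt_mutex);
    c->cancelled = 1;
    pthread_mutex_unlock(&c->interrupt_mutex);
}

/* Close needs op_mutex, so it can only finish once the query has let go. */
static void demo_close(demo_conn_t *c) {
    pthread_mutex_lock(&c->op_mutex);
    c->closed = 1;
    pthread_mutex_unlock(&c->op_mutex);
}

/* Runs the cancel-then-close sequence; returns 1 if close completed. */
static int demo_disconnect(void) {
    demo_conn_t c;
    pthread_t worker;

    pthread_mutex_init(&c.op_mutex, NULL);
    pthread_mutex_init(&c.interrupt_mutex, NULL);
    c.cancelled = 0;
    c.closed = 0;

    if (pthread_create(&worker, NULL, demo_query, &c) != 0) return 0;

    demo_cancel(&c); /* without this, demo_close could wait forever */
    demo_close(&c);
    pthread_join(worker, NULL);

    pthread_mutex_destroy(&c.op_mutex);
    pthread_mutex_destroy(&c.interrupt_mutex);
    return c.closed;
}
```

If `demo_cancel` tried to take `op_mutex` as well, it would block behind the same query it is meant to stop, which is the situation the paragraph above describes.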

The NIF resource destructor and close path coordinate through conn->mutex and interrupt_mutex so connection teardown does not race with cancellation state updates or a concurrent interrupt.

The Mix project now tracks the C source files as external resources, and the Makefile emits dependency files for C headers, so C/NIF changes are rebuilt more reliably.

Tests

This PR adds regression coverage for busy-timeout behavior, low-level set_busy_timeout/2, configurable progress-handler steps, cancel/1, connection reuse after cancellation, pool recovery after timeouts, cancellation of busy-handler waits, and close/cancel races.

It also adds sanitizer-tagged stress tests for the cancellation and busy-timeout update paths. Those tests are excluded from the default suite and can be run explicitly with --include sanitizer.

Copilot

Pull request overview

This PR addresses a deadlock that can occur when DBConnection.disconnect/2 closes a SQLite connection while a dirty NIF thread is blocked in VDBE execution or the busy handler, by adding a cancellable busy handler plus a new Sqlite3.cancel/1 path to wake/interrupt blocked operations before close/1.

Changes:

- Add a custom busy handler (condvar-based) + progress handler, and new NIF APIs set_busy_timeout/2 and cancel/1 to support fast cancellation.
- Update Exqlite.Connection.disconnect/2 to call Sqlite3.cancel/1 before close/1, and migrate busy-timeout setting away from PRAGMA busy_timeout.
- Expand tests around busy-timeout and cancellation behavior; add build dependency tracking for NIF compilation.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

File	Description
test/exqlite/sqlite3_test.exs	Adds baseline busy-timeout tests plus coverage for new `set_busy_timeout/2` and `cancel/1` behavior.
test/exqlite/integration_test.exs	Updates timeout test expectations given query interruption on disconnect.
test/exqlite/cancellation_test.exs	Adds a larger cancellation/deadlock regression suite (tagged `:slow_test`).
lib/exqlite/sqlite3_nif.ex	Adds NIF function specs for `set_busy_timeout/2` and `cancel/1`, plus external resource tracking.
lib/exqlite/sqlite3.ex	Exposes `set_busy_timeout/2` and `cancel/1` in the public Elixir API.
lib/exqlite/connection.ex	Calls `Sqlite3.cancel/1` during disconnect and switches busy-timeout setup to the new NIF.
c_src/sqlite3_nif.c	Implements cancellable busy handler, progress handler, and the new NIFs.
Makefile	Adds `-MMD -MP` and includes `.d` files for header dependency tracking.

warmwaffles · 2026-02-26T03:58:21Z

This does not fix that issue that is open. There's stuff in that issue that pertains to this, but this does not replace the Dirty NIF execution model.

I'll need to check this out and understand what is going on before we merge it.

mjc · 2026-02-26T04:03:05Z

Fair point on the execution model — this PR doesn't replace dirty NIFs and doesn't attempt to. The scope is narrower: fixing the specific deadlock where disconnect/2 blocks forever because the dirty NIF thread is stuck sleeping in SQLite's busy handler (with the mutex held), so close() can never acquire it.

The timed_wait_t abstraction is essentially a one-shot binary semaphore — condvar + mutex used purely for signaling. You're right that it has that shape. I chose condvar+mutex over a POSIX semaphore mainly because sem_timedwait has the same CLOCK_REALTIME portability constraint as condvars, and OTP already uses pthreads internally, so it's a closer match to what the runtime does. Happy to rename it cancel_sem_t or add a comment calling it out if that makes the intent clearer.

What this does fix:

- The deadlock: disconnect/2 → cancel() → signals the condvar → busy handler wakes, returns 0 → SQLite releases db->mutex → close() proceeds. Previously it would block until the full busy_timeout expired (up to 30+ seconds).
- VDBE stuck queries: progress handler + sqlite3_interrupt() together let the runtime interrupt long-running in-process queries.

What this does NOT do:

- Replace the dirty NIF execution model
- Fix any of the non-deadlock concerns from #192, "Replace DirtyNIF execution model with a different mechanism" (e.g., general throughput, non-dirty scheduling)

warmwaffles · 2026-02-26T04:29:25Z

I would like to stick to using the built in cond and mutex afforded by erl_nif. I don't know what that means for the pthread_cond_timedwait implementation and how that looks with replicating it, but if we stick to the primitives that we have, this will mean we don't have to worry about cross compilation issues as much.

mjc · 2026-02-26T04:42:11Z

The reason we use pthread_cond_timedwait / SleepConditionVariableCS directly is that OTP doesn't provide enif_cond_timedwait — it was deliberately omitted due to CLOCK_REALTIME portability concerns on POSIX. The busy handler needs "sleep up to N ms, or wake early if cancelled," and enif_cond_wait blocks indefinitely with no timeout parameter, so it can't do this alone.

Worth noting: the CLOCK_REALTIME concern that led OTP to omit enif_cond_timedwait doesn't really apply here. A clock jump during a 50ms busy handler wait just means one slightly longer iteration before we check the cancel flag. The total timeout is tracked independently, so it self-corrects on the next wake. But the polling approach sidesteps this entirely since sqlite3_sleep() is a relative sleep.

The #ifdef _WIN32 / POSIX split covers the same two threading backends OTP uses internally — enif_cond_wait itself is a thin wrapper around pthread_cond_wait on POSIX and SleepConditionVariableCS on Windows. Any target that can run BEAM already provides one of these two (there's no third backend — OTP fails to compile without one). So cross-compiling for e.g. Nerves/ARM works the same as native — the target toolchain provides pthread.h.

Two alternatives if you'd prefer to avoid raw pthreads:

- enif_cond_wait + watchdog thread: a per-connection enif_thread signals the cond after the timeout expires. Stays within OTP primitives but adds more moving parts.
- Polling: drop the condvar entirely. The busy handler calls sqlite3_sleep(10) in a loop and checks a volatile int cancelled flag after each sleep. No platform #ifdef, no threading primitives at all. Cancel latency goes from instant to worst-case 10ms, which is fine since disconnect timeouts are measured in seconds. This is the simplest approach by far.

Happy to refactor to either if you have a preference.

warmwaffles · 2026-02-26T04:53:54Z

I'll need to think on this a little more.

I peeked at a pthread_cond_timedwait implementation. Don't know if enif_monotonic_time would be a good choice for the clock portability issue.

mjc · 2026-02-26T05:05:24Z

If you're open to the polling approach, the busy handler simplifies to roughly:

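static int exqlite_busy_handler(void *arg, int count) {
    connection_t *conn = (connection_t *)arg;
    (void)count; /* fixed 10 ms interval, so the retry count is unused */

    /* Give up if cancelled or if the calling process has died. */
    if (conn->cancelled) return 0;
    if (!enif_is_process_alive(conn->callback_env, &conn->caller_pid)) return 0;

    /* Give up once the accumulated sleep time reaches the configured timeout. */
    conn->busy_elapsed_ms += 10;
    if (conn->busy_elapsed_ms >= conn->busy_timeout_ms) return 0;

    sqlite3_sleep(10);
    return !conn->cancelled; /* non-zero tells SQLite to retry */
}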

The entire timed_wait_t platform abstraction goes away — no condvar, no mutex, no #ifdef, no clock concerns. cancel just sets conn->cancelled = 1 and calls sqlite3_interrupt. Happy to refactor to this if it works for you.

warmwaffles · 2026-02-26T13:29:47Z

The polling solution is a busy wait and would eat unnecessary cpu.

mjc · 2026-02-26T15:38:55Z

It's not a busy wait — sqlite3_sleep(10) calls through to the SQLite VFS xSleep, which calls usleep()/nanosleep() on POSIX and Sleep() on Windows. The thread is fully descheduled by the OS and consumes zero CPU while sleeping.

In fact, this is exactly what SQLite's own built-in busy handler (sqliteDefaultBusyCallback) does on the released version of exqlite right now: it calls sqlite3OsSleep() in a loop with increasing delays (1, 2, 5, 10, 15, 20, 25, 50, 100 ms). Our polling handler would do the same thing with a fixed 10ms interval, plus a cancellation flag check. The only trade-off vs. a condvar is up to 10ms latency before responding to cancellation.

We could even replicate the same delay ramp that sqliteDefaultBusyCallback uses if desired — since that's what sqlite3_busy_timeout() already does internally, just without the ability to interrupt.

I just liked the condvar solution because it felt more elegant.

warmwaffles · 2026-02-26T16:17:46Z

Does this current implementation work on Apple silicon?

mjc · 2026-02-26T16:35:59Z

@warmwaffles yes; I built the condvar version on an M1 Mac originally and tested on a Ryzen 9 5950X on Linux. I have not tested Windows yet but I can if you wish; I just don't have Elixir on my Windows machine at the moment, so it'll take a minute.

The sqlite3_sleep(n) approach should work everywhere, since that's what exqlite was already doing. It's definitely the most portable and simplest choice.

warmwaffles · 2026-02-26T16:57:36Z

In my other failed game projects I had to use mach_timebase_info_data to get the absolute time.

mach_timebase_info_data_t clock_timebase;
mach_timebase_info(&clock_timebase);

uint64_t mach_absolute = mach_absolute_time();

uint64_t nanos = (double)(mach_absolute * (uint64_t)clock_timebase.numer) / (double)clock_timebase.denom;
return nanos / 1.0e9;

mjc · 2026-02-26T17:29:48Z

Yeah, mach_absolute_time() is the usual way to get monotonic time for this on macOS, since macOS doesn't support pthread_condattr_setclock() with CLOCK_MONOTONIC for condvars.

But we don't actually need it for either approach:

Condvar approach: We use CLOCK_REALTIME (the pthread_cond_init default) and it works fine — I built and tested it on an M1 Mac. A clock jump during a 50ms condvar wait just means one iteration sleeps slightly too long or too short; the total timeout is tracked by counting elapsed ms across iterations (same as sqliteDefaultBusyCallback), so it self-corrects. Worst case is one extra retry or one early timeout by a few tens of ms, which doesn't meaningfully affect busy handler behavior.

Polling approach: sqlite3_sleep() is a relative sleep with no clock dependency at all — SQLite handles the platform #ifdef internally. We'd replicate sqliteDefaultBusyCallback's backoff ramp (1→2→5→10→...→100ms) and check the cancel flag between each sleep.

Either way, mach_absolute_time isn't needed.

mjc · 2026-02-26T18:32:34Z

I did a deep dive into the alternative execution models from #192 to see if any of them change what's needed here:

- Yielding NIFs: Not possible, because sqlite3_step() is a single opaque C call that can't be paused/resumed, only aborted.
- Threaded NIFs: Worker thread is still inside sqlite3_step() when the caller times out, so you need the exact same cancel flag + sqlite3_interrupt() machinery, but now you've also added a worker thread per connection, a work queue with mutex/condvar, argument marshaling into process-independent envs on every call, and enif_send/receive overhead.
- ADBC's GenServer pattern: ADBC works because its NIFs are non-blocking (return a stream ref, read batches incrementally). sqlite3_step blocks for the entire write/busy-wait, so a GenServer would just deadlock itself.
- NIF resource destructors: Fire on GC, but the process can't be GC'd while blocked on a dirty scheduler, and that is the deadlock.

Every execution model ends up requiring the same SQLite-level cancellation: custom busy handler with a cancel flag + sqlite3_interrupt(). The dirty NIF model just determines where the caller blocks, not how you cancel. This matches where #192 landed — José said "I don't think you can replace dirty nifs, that's the way to do C integration."

mjc · 2026-03-18T21:13:37Z

sorry about all the force pushes; noticed somehow I didn't sign two of the commits. current version has the simpler sleep-based approach.

warmwaffles · 2026-03-19T15:52:09Z

I don't really mind signed or unsigned commits ¯\_(ツ)_/¯

I just haven't had time to look at this lately

mjc · 2026-03-19T20:42:38Z

No worries at all, this fix works for me locally so I have no need to rush it to get merged or anything.

Tests validate query cancellation and interrupt behavior from issue elixir-sqlite#192:
- Connection can be closed/interrupted during long-running queries
- cancel/interrupt work during VDBE execution
- cancel/interrupt work during busy handler waits
- Pool recovery after stuck queries (2 tests currently FAIL, demonstrating the bug)
- close() returns quickly even with in-flight queries
- Multiple simultaneous operations
- Busy timeout expiration behavior
- Transaction rollback during interruption
22 tests total, tagged :slow_test (take ~78s). At this commit, 20 pass and 2 fail:
- FAIL: 'pool recovers after long query interrupted on disconnect' (timeout)
- FAIL: 'pool recovers when query stuck in busy handler' (stuck waiting)
These failures demonstrate the deadlock described in elixir-sqlite#192. Subsequent commits implement the fix that makes all 22 tests pass.

When DBConnection times out a long-running query, it calls disconnect/2 on the connection process. Previously, disconnect only called Sqlite3.close(), which blocks on conn->mutex held by the still-running dirty NIF, deadlocking the pool permanently.
Now disconnect calls Sqlite3.interrupt(db) first, which sets a flag checked at every VDBE loop iteration (OP_Next, OP_Prev, etc.). The running query aborts with SQLITE_INTERRUPT, releases the mutex, and close() proceeds.
This is safe because PR elixir-sqlite#342 (v0.35.0) added interrupt_mutex to the NIF layer, which prevents a use-after-free race between interrupt() and close(). The interrupt_mutex ensures interrupt reads the db pointer atomically even if close() is freeing it concurrently.
Test results before this change: 149 tests, 20 pass, 2 fail (pool recovery test deadlocks at 30s timeout)
Test results after this change: 160 tests, 2 failures
- 21 of 22 cancellation tests now pass (busy handler test still fails as expected)
- The integration test "exceeding timeout" fails with a pattern match error (expects {:ok, _, _} but gets {:error, "interrupted"}; will be fixed later)
The passing cancellation tests demonstrate that interrupt successfully unblocks queries stuck in VDBE execution, fixing the primary deadlock issue.
Known limitation: does not fix the case where a query is stuck in SQLite's busy handler sleep loop (waiting for a lock held by another connection). That requires a custom busy handler (separate change).
Fixes the primary deadlock described in elixir-sqlite#192.

- Add timed_wait_t abstraction over pthread_cond_timedwait (POSIX) and SleepConditionVariableCS (Windows): ~30 lines covering 100% of BEAM platforms
- Extend connection_t with cancel_tw condvar, cancelled flag, busy_timeout_ms, and env+pid stashing for process-alive checks
- Replace sqliteDefaultBusyCallback with exqlite_busy_handler that waits on condvar (instantly cancellable) and checks enif_is_process_alive
- Add exqlite_progress_handler to interrupt long-running VDBE execution
- Add exqlite_set_busy_timeout NIF (stores timeout without destroying handler, unlike PRAGMA busy_timeout which silently replaces custom handlers)
- Add exqlite_cancel NIF (sets flag, signals condvar, calls sqlite3_interrupt)
- Make cancel and set_busy_timeout non-dirty NIFs (must run on normal scheduler to avoid queueing behind stuck dirty NIFs)
- Update destructor to signal cancel before close
- Install handlers in exqlite_open, stash env+pid at 5 busy-triggerable call sites
Fixes the busy handler plane of elixir-sqlite#192: disconnect can now wake sleeping handlers instantly instead of waiting for busy_timeout to expire.

- Add NIF stubs: set_busy_timeout/2, cancel/1
- Add public wrappers: Sqlite3.set_busy_timeout/2, Sqlite3.cancel/1
- cancel/1 includes nil guard (matches close/1 and interrupt/1 convention)
- Fix bind_parameter_count/1 typespec: integer -> non_neg_integer() | {:error, reason()}
- Fix bind_parameter_index/2 typespec: integer -> non_neg_integer() | {:error, reason()}
The typespec fixes address pre-existing dialyzer warnings: both NIFs can return {:error, :invalid_statement} when statement->statement is NULL.

- Replace Sqlite3.interrupt(db) with Sqlite3.cancel(db) in disconnect/2 (cancel is a superset that also wakes the busy handler condvar)
- Replace PRAGMA busy_timeout with Sqlite3.set_busy_timeout NIF call (PRAGMA internally calls sqlite3_busy_timeout which destroys custom handlers)
Together these changes complete the fix for elixir-sqlite#192: disconnect now properly cancels all three blocking planes (VDBE execution, busy handler sleep, and mutex contention as a side effect of the first two).

Tests cover:
- Default busy_timeout baseline (2000ms)
- set_busy_timeout changes timeout value
- set_busy_timeout to 0 disables retries
- Custom handler retries on contention
- Custom handler respects timeout limit
- cancel breaks through busy handler sleep instantly
- cancel returns ok on idle connection
- cancel(nil) returns :ok
- Cancelled connection can be reused (flag resets)
- cancel on closed connection returns error
- interrupt still works independently
Also fix unused default argument warning in setup_write_conflict/2 (all call sites provide opts explicitly). All 159 tests pass (11 doctests + 148 tests).

- Add @external_resource declarations for all C files so mix compile detects changes without needing 'mix clean' first
- Add -MMD -MP flags to Makefile for automatic header dependency tracking
- Clean .d files in 'make clean' target

Snapshot conn->cancelled under lock in both the busy handler (after waking from condvar) and the progress handler. Previously both paths read the field outside the lock, which is a data race with exqlite_cancel writing to it on another scheduler thread. Also run clang-format to satisfy the lint check.

Only :ok and interrupted are expected results when a query races with its connection timeout. Accepting OOM would mask real regressions.

Callers assume it always returns a non_neg_integer. The | {:error, reason()} union was incorrect and misleading.

C89-style: cancelled was declared twice in the same function scope after the data-race fix split the lock sections. Remove the second 'int' to reuse the variable declared at the top of the function.

With a 1ms checkout timeout, disconnect can race with an in-progress SQLite allocation and return SQLITE_NOMEM before the interrupt lands. This is an inherent race at that timeout granularity, not a regression.

Replace the timed_wait_t platform abstraction (pthread_cond_timedwait on POSIX, SleepConditionVariableCS on Windows) with polling using sqlite3_sleep(). This eliminates raw platform-specific threading primitives in favor of OTP and SQLite APIs.
The busy handler now sleeps for short intervals (1-50ms) and checks conn->cancelled between iterations. Cancel latency is at most ~10ms (one sleep interval), which is acceptable since disconnects are measured in seconds.
Changes:
- Delete timed_wait_t abstraction and all platform-specific code (~124 lines)
- Remove cancel_tw field from connection_t
- Rewrite exqlite_busy_handler to use sqlite3_sleep() in a polling loop
- Simplify exqlite_progress_handler, connection_stash_caller, exqlite_set_busy_timeout, exqlite_cancel to use volatile reads instead of locks
- All 159 tests pass; linting passes
- Net reduction: 166 lines (183 deleted, 17 added)

mjc · 2026-04-23T02:00:21Z

rebased to address a conflict. I could also clean up the history when you're ready to give it another look. but also no rush as I can just keep using my branch

warmwaffles · 2026-05-06T19:00:50Z

revert this

warmwaffles · 2026-05-06T19:01:11Z

I'll be getting to this soon. I've been extremely busy lately.

warmwaffles · 2026-05-20T13:37:50Z

    int authorizer_deny[AUTHORIZER_DENY_SIZE];
+
+    // Custom busy handler state
+    volatile int cancelled; // guarded by interrupt_mutex


I am probably going to merge this and work on it some in another branch.

What is the benefit of throwing volatile here versus just having it be a basic int? I know that volatile is a signal to ensure the compiler doesn't re-arrange order of execution, but what I want to know, is this actually a risk?

That was an oversight from when I backed out the initial proposed handler. No real benefit now that all reads/writes are guarded by interrupt_mutex. volatile does not make this thread-safe, and the mutex is what provides the synchronization here, so I removed it.

warmwaffles · 2026-05-20T17:47:49Z

@mjc can you update the PR description with the most accurate information you can? I'm going to proof it and then use it for the commit message.

mjc · 2026-05-20T18:16:33Z

Updated! Happy to help as much as I can, also.

I've been running the previous version (with the condvar) since Feb 1 in a bunch of environments with none of the previous issues and no segfaults.

I still get busy timeouts but that is largely due to not yet quite having exactly the right concurrency model in that specific distributed app - the timeouts make sense and are traced to obvious things I know I need to fix. down like 90% from before though

warmwaffles · 2026-05-20T18:40:08Z

> I still get busy timeouts but that is largely due to not yet quite having exactly the right concurrency model in that specific distributed

I still get this as well, it's been a minor irritant because it backs an MCP server I am running.

warmwaffles · 2026-05-20T18:56:41Z

If you figure it out, don't hesitate to open another PR.

Copilot AI review requested due to automatic review settings February 26, 2026 03:40

Copilot started reviewing on behalf of mjc February 26, 2026 03:40 View session

mjc commented Feb 26, 2026

View reviewed changes

Comment thread test/exqlite/cancellation_test.exs

Copilot AI reviewed Feb 26, 2026

View reviewed changes

Comment thread c_src/sqlite3_nif.c Outdated

Comment thread c_src/sqlite3_nif.c Outdated

Comment thread test/exqlite/integration_test.exs Outdated

Comment thread test/exqlite/cancellation_test.exs

Comment thread lib/exqlite/sqlite3_nif.ex Outdated

warmwaffles reviewed Feb 26, 2026

View reviewed changes

Comment thread c_src/sqlite3_nif.c Outdated

mjc commented Feb 26, 2026

View reviewed changes

Comment thread test/exqlite/integration_test.exs Outdated

mjc force-pushed the fix/deadlock-investigation branch 2 times, most recently from 8936b7b to 4630766 Compare March 18, 2026 21:12

mjc added 6 commits April 22, 2026 19:52

mjc added 11 commits April 22, 2026 19:54

Improve build tracking for C source changes

70c2124

- Add @external_resource declarations for all C files so mix compile detects changes without needing 'mix clean' first
- Add -MMD -MP flags to Makefile for automatic header dependency tracking
- Clean .d files in 'make clean' target

Trim verbose C comments

4f1df42

Remove out-of-memory from acceptable query outcomes

bdd1521

Only :ok and interrupted are expected results when a query races with its connection timeout. Accepting OOM would mask real regressions.

Fix bind_parameter_index/2 return type spec

35a1247

Callers assume it always returns a non_neg_integer. The | {:error, reason()} union was incorrect and misleading.

mix format: expand case arms in integration test

5a2979c

Fix duplicate local variable declaration in busy handler

53b49a7

C89-style: cancelled was declared twice in the same function scope after the data-race fix split the lock sections. Remove the second 'int' to reuse the variable declared at the top of the function.

Restore out-of-memory as valid outcome in timeout test

4daf69c

With a 1ms checkout timeout, disconnect can race with an in-progress SQLite allocation and return SQLITE_NOMEM before the interrupt lands. This is an inherent race at that timeout granularity, not a regression.

fix: clang-format alignment in exqlite_busy_handler

bc3beee

fix: remove stale condvar references in comments

2cfa8e6

mjc force-pushed the fix/deadlock-investigation branch from 4630766 to 2cfa8e6 Compare April 23, 2026 01:57

warmwaffles reviewed Apr 23, 2026

View reviewed changes

Comment thread c_src/sqlite3_nif.c Outdated

Comment thread c_src/sqlite3_nif.c Outdated

Comment thread c_src/sqlite3_nif.c

Comment thread c_src/sqlite3_nif.c Outdated

Comment thread c_src/sqlite3_nif.c

Comment thread c_src/sqlite3_nif.c Outdated

mjc added 4 commits April 24, 2026 19:42

fix: address PR review feedback

8d71ee2

style: format regression tests

5c0a69e

style: format sqlite3_nif for clang-format

bd26a9b

docs: clarify cancellation and connection options

4a883c2

warmwaffles reviewed May 6, 2026

View reviewed changes

Comment thread .github/workflows/ci.yml

revert this

mjc reacted with thumbs up emoji

Tighten timeout cancellation regression

1dc222a

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

mjc force-pushed the fix/deadlock-investigation branch from 04baa4b to 1dc222a Compare May 10, 2026 16:42

warmwaffles reviewed May 20, 2026

View reviewed changes

fix: remove unnecessary volatile from cancellation flag

c14d21c

warmwaffles merged commit 3e45199 into elixir-sqlite:main May 20, 2026
11 checks passed

mjc deleted the fix/deadlock-investigation branch May 21, 2026 03:37
