Optimize coroutine exception handling and fix gen_throw traceback by youknowone · Pull Request #7166 · RustPython/RustPython

youknowone · 2026-02-16T00:33:05Z

Replace ExceptionStack linked list with Vec for O(1) push/pop
Introduce FramePtr (NonNull<Py>) to avoid Arc clone in frame stack
Add resume_gen_frame for lightweight generator/coroutine resume
Replace Coro.exception PyMutex with PyAtomicRef for lock-free swap
Add with_frame_exc to support initial exception state for generators
Use scopeguard for panic-safe cleanup in with_frame_exc/resume_gen_frame
Add traceback entries in gen_throw delegate/close error paths
Treat EndAsyncFor and CleanupThrow as reraise in handle_exception
Update current_frame()/current_globals() to return owned values
Add ThreadSlot with atomic exception field for sys._current_exceptions()

Summary by CodeRabbit

Refactor
- Reworked frame and per-thread state handling to enable safer, more consistent generator/coroutine resume behavior and cross-thread state management.
Bug Fixes
- More accurate tracebacks and exception chaining across nested yield/await, async-for, and generator unwind paths; improved cross-thread exception reporting.
Performance
- Reduced unnecessary cloning and refined locking/atomic access to lower overhead for frame, globals, and thread operations.

coderabbitai · 2026-02-16T00:33:13Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Replaces per-thread frame storage with ThreadSlot (frames + atomic exception), introduces FramePtr, updates VM frame stack and resume APIs (with_frame_exc, resume_gen_frame), converts coroutine exception storage to atomic refs, and updates all frame-access sites (faulthandler, sys, thread, builtins, frame traceback) to the new patterns.

Changes

Cohort / File(s)	Summary
VM core — frame & exception model `crates/vm/src/vm/mod.rs`	Adds `FramePtr` (NonNull<Py>), switches `VirtualMachine.frames` to `FramePtr`, changes exception-stack layout, and adds `with_frame_exc`, `with_frame`, and `resume_gen_frame`.
Per-thread slot & threading utilities `crates/vm/src/vm/thread.rs`, `crates/vm/src/stdlib/thread.rs`	Introduces `ThreadSlot` ( `frames: Mutex<Vec<FramePtr>>` + atomic exception), updates CURRENT_THREAD_SLOT alias, and changes push/pop/fork/reinit/cleanup and cross-thread exception helpers to use `slot.frames.lock()` and atomic exception handling.
Faulthandler & signal paths `crates/stdlib/src/faulthandler.rs`	Replaces ThreadFrameSlot alias with `Arc<ThreadSlot>`, changes `dump_frame_from_ref` to take `&Py<Frame>`, and updates call sites to iterate `FramePtr` slices, use `slot.frames.lock()`, and call `unsafe { fp.as_ref() }`.
Coroutine / generator exception handling `crates/vm/src/coroutine.rs`	Changes `Coro.exception` from `PyMutex<Option<PyBaseExceptionRef>>` to `PyAtomicRef<Option<PyBaseException>>`, adapts traversal and `run_with_context` to atomic swaps and `vm.resume_gen_frame`, and updates closure/frame parameter types.
Frame traceback & control-flow `crates/vm/src/frame.rs`	Adds additional traceback entries at unwind/yield/await and reraise-like points, changes `prev_line` to `u32`, and expands reraise recognition to `EndAsyncFor`/`CleanupThrow`.
Frame lookup & access sites `crates/vm/src/builtins/frame.rs`, `crates/vm/src/stdlib/sys.rs`, `crates/vm/src/protocol/callable.rs`, `crates/vm/src/warn.rs`	Switches f_back lookup to pointer-identity with thread-registry fallback, uses `FramePtr`/`Py<Frame>` borrowing patterns (`unsafe as_ref()`), and simplifies current-frame/global borrow semantics.
Minor / simple changes `crates/vm/src/stdlib/builtins.rs`	Removes unnecessary cloning of globals dict, returning the original `PyDictRef` instead.

Sequence Diagram(s)

sequenceDiagram
    participant SignalHandler as SignalHandler
    participant Faulthandler as Faulthandler
    participant ThreadRegistry as ThreadRegistry
    participant ThreadSlot as ThreadSlot
    participant VM as VM

    SignalHandler->>Faulthandler: request dump_all_threads
    Faulthandler->>ThreadRegistry: iterate thread slots
    ThreadRegistry->>ThreadSlot: provide Arc<ThreadSlot>
    ThreadSlot->>ThreadSlot: lock frames (frames.lock())
    Faulthandler->>VM: for each FramePtr call dump_frame_from_ref(&Py<Frame>)
    VM->>VM: unsafe as_ref() to access frame fields
    VM-->>Faulthandler: frame data written to fd

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Faulthandler #6400: Faulthandler changes involving dump_frame_from_ref and migration to Py<Frame>/ThreadSlot — strong code overlap.
Upgrade threading to 3.13.11; sys.setprofile & impl more threading #6691: Per-thread slot replacement and call-site updates to slot.frames.lock() and thread-frame bookkeeping.
Update asyncio to 3.14.2 #6902: Overlapping frame/coroutine handling and generator lifecycle API changes.

Suggested reviewers

ShaharNaveh
fanninpm

"I'm a rabbit in the VM patch-lair,
I swapped the chains for pointers fair,
Threads tuck frames and exceptions tight,
Tracebacks hop clearer in morning light,
I nibble bugs and leave the code in air." 🐇✨

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Merge Conflict Detection	⚠️ Warning	❌ Merge conflicts detected (12 files): ⚔️ `Lib/test/test_unpack.py` (content) ⚔️ `crates/stdlib/src/faulthandler.rs` (content) ⚔️ `crates/vm/src/builtins/frame.rs` (content) ⚔️ `crates/vm/src/coroutine.rs` (content) ⚔️ `crates/vm/src/frame.rs` (content) ⚔️ `crates/vm/src/protocol/callable.rs` (content) ⚔️ `crates/vm/src/stdlib/builtins.rs` (content) ⚔️ `crates/vm/src/stdlib/sys.rs` (content) ⚔️ `crates/vm/src/stdlib/thread.rs` (content) ⚔️ `crates/vm/src/vm/mod.rs` (content) ⚔️ `crates/vm/src/vm/thread.rs` (content) ⚔️ `crates/vm/src/warn.rs` (content) These conflicts must be resolved before merging into `main`.	Resolve conflicts locally and push changes to this branch.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Optimize coroutine exception handling and fix gen_throw traceback' clearly summarizes the primary changes: improvements to exception handling in coroutines (switching to PyAtomicRef, introducing FramePtr) and traceback augmentation in gen_throw. It is specific, concise, and directly reflects the main objectives of the changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

⚔️ Resolve merge conflicts (beta)

Auto-commit resolved conflicts to branch coroutine-exception
Post resolved changes as copyable diffs in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-02-16T00:33:34Z

Code has been automatically formatted

The code in this PR has been formatted using:

cargo fmt --all
Please pull the latest changes before pushing again:

git pull origin coroutine-exception

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@crates/vm/src/stdlib/sys.rs`:
- Around line 974-981: The subtraction frames.len() - 1 can underflow when the
call stack is empty; change the guard to compare offset against frames.len()
(e.g., if offset >= frames.len() { return Err(vm.new_value_error("call stack is
not deep enough")); }) so you never perform len() - 1 on an empty Vec, then
compute idx = frames.len() - offset - 1 as before and continue to safely create
the Frame reference (py: &crate::Py<Frame>) — this avoids the usize underflow
while preserving the original logic.
- Around line 989-998: The subtraction `frames.len() - 1` can underflow when
`frames` is empty in `_getframemodulename`; change the guard to compute `let len
= frames.len(); if depth >= len { return Ok(vm.ctx.none()); }` and then compute
`let idx = len - depth - 1;` so the check prevents underflow before indexing the
`frames` Vec (affecting the block using `let frames = vm.frames.borrow()`,
`depth`, `idx`, the unsafe `frames[idx].as_ref()`, and
`frame.func_obj.clone()`).

🧹 Nitpick comments (4)

crates/vm/src/frame.rs (1)
543-557: Reraise classification looks right; please run required build/lint steps.

Treating EndAsyncFor/CleanupThrow as reraise preserves traceback semantics. Since bytecode-instruction handling changed here, please run the clean build plus rustfmt/clippy steps before merge:
rm -r target/debug/build/rustpython-* && find . | grep -E "\.pyc$" | xargs rm -r
cargo fmt
cargo clippy
As per coding guidelines: “When modifying bytecode instructions, perform a clean build by running: rm -r target/debug/build/rustpython-* && find . | grep -E ".pyc$" | xargs rm -r”, “Follow the default rustfmt code style by running cargo fmt to format Rust code”, and “Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints that are introduced by your changes.”
crates/vm/src/vm/mod.rs (3)
1088-1142: resume_gen_frame intentionally skips thread frame updates — verify cross-thread visibility.

This lightweight path skips push_thread_frame/pop_thread_frame and update_thread_exception, meaning generator/coroutine frames won't appear in sys._current_frames() or sys._current_exceptions() from other threads. These frames do appear in vm.frames (so sys._getframe() works correctly from the owning thread).

If this omission is intentional for performance, consider documenting the trade-off in the doc comment. If generators should be visible in _current_frames(), the thread frame push/pop needs to be added back.

Also, a minor note: Line 1113 pushes directly to self.exceptions.borrow_mut().stack rather than calling self.push_exception(exc), which also means the thread exception slot doesn't update. When the generator yields or returns, set_exception is typically called to save/restore the exception state, so this is likely fine — but it means a concurrent sys._current_exceptions() call won't see the generator's in-flight exception.

1121-1128: Misleading panic message in scopeguard cleanup.

Line 1125 panics with "pop_exception() without nested exc stack", but this is a direct Vec::pop(), not a call to pop_exception(). Consider updating the message for clarity.
✏️ Suggested wording
             self.exceptions.borrow_mut().stack
                 .pop()
-                .expect("pop_exception() without nested exc stack");
+                .expect("resume_gen_frame: exception stack underflow");
1454-1467: set_exception pushes when the stack is empty — is this intentional?

When excs.stack is empty (no frame has been entered), set_exception pushes a new entry rather than panicking. This is a defensive fallback that silently papers over a potential caller bug. If it's intentional (e.g., for generator throw before first next), a brief comment would clarify the rationale.

crates/vm/src/stdlib/sys.rs

coderabbitai

🧹 Nitpick comments (2)

crates/vm/src/vm/mod.rs (2)

1097-1129: Panic-safety gap between state mutations and scopeguard setup.

Lines 1101–1117 mutate recursion_depth, frames, exceptions, and frame.owner before the scopeguard::defer! on line 1121. If any operation in that window panics (e.g., a borrow_mut on an already-borrowed RefCell), the state won't be rolled back.

In practice these are simple operations unlikely to panic, but you could tighten this by incrementing recursion_depth inside the defer setup or using an early-established guard pattern. Low priority given the hot-path nature of this code.

1454-1467: Defensive fallback in set_exception when stack is empty.

Lines 1461–1463 push a new entry when the stack is empty. This avoids a panic, but it silently grows the exception stack without a corresponding frame push. A later pop_exception may then pop an entry that wasn't pushed by push_exception/with_frame_exc, potentially causing misalignment.

Consider adding a debug_assert!(!excs.stack.is_empty()) to catch unexpected empty-stack calls during development, while keeping the push fallback for release builds.

- Replace ExceptionStack linked list with Vec for O(1) push/pop - Introduce FramePtr (NonNull<Py<Frame>>) to avoid Arc clone in frame stack - Add resume_gen_frame for lightweight generator/coroutine resume - Replace Coro.exception PyMutex with PyAtomicRef for lock-free swap - Add with_frame_exc to support initial exception state for generators - Use scopeguard for panic-safe cleanup in with_frame_exc/resume_gen_frame - Add traceback entries in gen_throw delegate/close error paths - Treat EndAsyncFor and CleanupThrow as reraise in handle_exception - Update current_frame()/current_globals() to return owned values - Add ThreadSlot with atomic exception field for sys._current_exceptions()

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@crates/vm/src/builtins/frame.rs`:
- Around line 208-212: Update the SAFETY comment above the unsafe block to
remove the outdated "FrameRef" wording and instead state the actual invariant:
that vm.frames now stores FramePtr entries and the caller (or the vm)
maintains/owns those FramePtr lifetimes so that fp.as_ref() yields a valid
&crate::Py<Frame>; reference the symbols fp.as_ref(), FramePtr, and vm.frames in
the comment and briefly describe that the FramePtr ownership in vm.frames
guarantees the pointer remains alive while it's in the Vec.

In `@crates/vm/src/coroutine.rs`:
- Around line 39-50: The code sets the coroutine's running flag to true but
resets it with a plain store(false) that won't run if resume_gen_frame or its
inner closure panics; update the code around the running flag in the function
that calls resume_gen_frame (the send()/resume path) to install a panic-safe
guard (e.g. scopeguard::defer! or a RAII guard) immediately after setting
running.store(true) so that running.store(false) is executed on all exit paths;
reference the running atomic, the send()/resume function, and resume_gen_frame
to locate where to add the guard and ensure the current plain
running.store(false) is removed or made unreachable by the new guard.

In `@crates/vm/src/frame.rs`:
- Around line 543-557: The is_reraise classification correctly treats
Instruction::EndAsyncFor and Instruction::CleanupThrow as re-raise-like, but you
must add targeted tests that exercise the re-raise branch for each: one test
that runs EndAsyncFor with a non-StopAsyncIteration exception and asserts no new
traceback entry is added, and one test that runs CleanupThrow with a
non-StopIteration exception and asserts the same; add these unit/integration
tests around the VM instruction execution harness (target the is_reraise
behavior) and run a clean build before testing (rm -r
target/debug/build/rustpython-* && find . | grep -E "\.pyc$" | xargs rm -r) to
ensure fresh artifacts.

In `@crates/vm/src/vm/mod.rs`:
- Around line 1454-1471: The debug_assert in set_exception claims
exceptions.stack must be non-empty but the code still has an else that pushes
into the stack, hiding a logic bug in release builds; enforce the invariant by
replacing debug_assert! with assert! (or otherwise ensure failure in release
builds) and remove the unreachable else branch so set_exception only replaces
the top entry via last_mut() and never pushes a new entry; update the code paths
around exceptions.borrow_mut(), core::mem::replace(top, exc), and
thread::update_thread_exception(self.topmost_exception()) accordingly to
preserve existing behavior when the stack is non-empty.

🧹 Nitpick comments (7)

crates/vm/src/vm/thread.rs (2)
4-25: Consider encapsulating ThreadSlot fields to protect invariants (esp. exception swap/drop rules).

ThreadSlot is pub and both frames and exception are pub, but exception usage has a strict “owning-thread” drop contract (to avoid dropping a Py*Ref outside a VM context). Even though misuse would require unsafe for swap, narrowing the surface (e.g., pub(crate) fields + small accessor methods) would make it harder to accidentally violate the intended model.

Also: please ensure this lands cargo fmt/cargo clippy cleanly since it touches core VM threading primitives.

As per coding guidelines: “Follow the default rustfmt code style by running cargo fmt to format Rust code” and “Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints that are introduced by your changes”.

Also applies to: 31-33

120-142: get_all_current_exceptions() holds the registry lock longer than the docstring implies.

The comment says “acquires … briefly, then reads each slot’s exception atomically”, but the current implementation keeps the lock while reading each slot. This is probably fine today, but it’s easy to make it match the intent and reduce contention by snapshotting (tid, Arc<ThreadSlot>) first, dropping the lock, then reading atomics.
Proposed refactor (drop registry lock before per-slot reads)
 pub fn get_all_current_exceptions(vm: &VirtualMachine) -> Vec<(u64, Option<PyBaseExceptionRef>)> {
-    let registry = vm.state.thread_frames.lock();
-    registry
-        .iter()
-        .map(|(id, slot)| (*id, slot.exception.to_owned()))
-        .collect()
+    let slots: Vec<(u64, CurrentFrameSlot)> = {
+        let registry = vm.state.thread_frames.lock();
+        registry
+            .iter()
+            .map(|(&id, slot)| (id, Arc::clone(slot)))
+            .collect()
+    };
+    slots
+        .into_iter()
+        .map(|(id, slot)| (id, slot.exception.to_owned()))
+        .collect()
 }
crates/vm/src/frame.rs (1)

659-712: Traceback injection in gen_throw error paths is good; consider factoring the repeated “append traceback entry” block.

You’re doing the same (idx calc → bounds check → PyTraceback::new → set_traceback_typed) in multiple places now. A tiny helper (even a local closure inside gen_throw) would reduce duplication and make it less likely that one path drifts (e.g., missing bounds checks or using a slightly different idx policy).
crates/stdlib/src/faulthandler.rs (2)
328-345: dump_frame_from_ref should avoid relying on current_location() indexing (prefer a bounds-checked lineno derivation).

Even though this isn’t the signal-handler path, faulthandler should be best-effort and not panic if it ever sees an unexpected lasti/locations state. You already do a defensive bounds check in dump_frame_from_raw; it’d be good to mirror that here too.
Proposed defensive lineno computation
 fn dump_frame_from_ref(fd: i32, frame: &crate::vm::Py<Frame>) {
     let funcname = frame.code.obj_name.as_str();
     let filename = frame.code.source_path().as_str();
-    let lineno = if frame.lasti() == 0 {
-        frame.code.first_line_number.map(|n| n.get()).unwrap_or(1) as u32
-    } else {
-        frame.current_location().line.get() as u32
-    };
+    let lasti = frame.lasti();
+    let lineno = if lasti == 0 {
+        frame.code.first_line_number.map(|n| n.get()).unwrap_or(1) as u32
+    } else {
+        let idx = (lasti as usize).saturating_sub(1);
+        frame
+            .code
+            .locations
+            .get(idx)
+            .map(|(loc, _)| loc.line.get() as u32)
+            .unwrap_or_else(|| frame.code.first_line_number.map(|n| n.get()).unwrap_or(1) as u32)
+    };
401-428: Avoid holding vm.state.thread_frames lock while doing blocking I/O in dump_all_threads / watchdog dumps.

dump_all_threads() currently holds the registry lock across per-thread printing. If the fd blocks (pipe, slow terminal), that can unnecessarily stall thread cleanup/registration. You already snapshot slots for dump_traceback_later; doing the same here would reduce contention and matches the intended “debug dump shouldn’t freeze unrelated VM operations” behavior.
Sketch refactor: snapshot registry entries before writing
 #[cfg(feature = "threading")]
 {
     let current_tid = rustpython_vm::stdlib::thread::get_ident();
-    let registry = vm.state.thread_frames.lock();
+    let slots: Vec<(u64, ThreadFrameSlot)> = {
+        let registry = vm.state.thread_frames.lock();
+        registry.iter().map(|(&tid, slot)| (tid, Arc::clone(slot))).collect()
+    };

     // First dump non-current threads, then current thread last
-    for (&tid, slot) in registry.iter() {
+    for (tid, slot) in &slots {
         if tid == current_tid {
             continue;
         }
         let frames_guard = slot.frames.lock();
-        dump_traceback_thread_frames(fd, tid, false, &frames_guard);
+        dump_traceback_thread_frames(fd, *tid, false, &frames_guard);
         puts(fd, "\n");
     }
Also applies to: 871-876
crates/vm/src/coroutine.rs (1)
90-116: Make running flag cleanup panic-safe (scopeguard) and consider avoiding the raw pointer capture.

If func(f) (or anything inside resume_gen_frame) ever panics, self.running stays true, permanently bricking the coroutine (“already executing”). A small scopeguard::defer! would make this robust at near-zero cost.

Also, if borrowck allows, capturing &self.exception directly is clearer than a *const PyAtomicRef<_> (and keeps the unsafe surface smaller).
Possible adjustment
     fn run_with_context<F>(
         &self,
         jen: &PyObject,
         vm: &VirtualMachine,
         func: F,
     ) -> PyResult<ExecutionResult>
     where
         F: FnOnce(&Py<Frame>) -> PyResult<ExecutionResult>,
     {
         if self.running.compare_exchange(false, true).is_err() {
             return Err(vm.new_value_error(format!("{} already executing", gen_name(jen, vm))));
         }
+        scopeguard::defer! {
+            self.running.store(false);
+        }

         // SAFETY: running.compare_exchange guarantees exclusive access
         let gen_exc = unsafe { self.exception.swap(None) };
-        let exception_ptr = &self.exception as *const PyAtomicRef<Option<PyBaseException>>;
+        let exception_ref = &self.exception;

         let result = vm.resume_gen_frame(&self.frame, gen_exc, |f| {
             let result = func(f);
             // SAFETY: exclusive access guaranteed by running flag
-            let _old = unsafe { (*exception_ptr).swap(vm.current_exception()) };
+            let _old = unsafe { exception_ref.swap(vm.current_exception()) };
             result
         });

-        self.running.store(false);
         result
     }
crates/vm/src/vm/mod.rs (1)

1088-1142: Consider extracting shared frame-setup/teardown logic to reduce duplication.

resume_gen_frame and with_frame_exc share ~30 lines of nearly identical boilerplate (push FramePtr, set current frame, store previous, swap owner, scopeguard cleanup, trace events). Differences are: (1) thread-frame push/pop, (2) per-frame trace setup, (3) thread exception update, (4) with_recursion vs manual check.

A shared helper parameterized on those flags (or a builder/options struct) would consolidate the invariants in one place and reduce the risk of them drifting apart. Understandably this may be deferred if the perf-critical path needs to stay fully inlined.

As per coding guidelines: "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code."

crates/vm/src/builtins/frame.rs

coderabbitai · 2026-02-16T14:27:02Z

crates/vm/src/coroutine.rs

+    exception: PyAtomicRef<Option<PyBaseException>>, // exc_state
 }

 unsafe impl Traverse for Coro {
    fn traverse(&self, tracer_fn: &mut TraverseFn<'_>) {
        self.frame.traverse(tracer_fn);
        self.name.traverse(tracer_fn);
        self.qualname.traverse(tracer_fn);
-        self.exception.traverse(tracer_fn);
+        if let Some(exc) = self.exception.deref() {
+            exc.traverse(tracer_fn);
+        }
    }


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -name "*.rs" -type f | xargs grep -l "struct PyAtomicRef" | head -5

Repository: RustPython/RustPython

Length of output: 94

🏁 Script executed:

rg -n "struct PyAtomicRef|type PyAtomicRef" --type rs

Repository: RustPython/RustPython

Length of output: 91

🏁 Script executed:

# Get context around lines 39-50 and 64-74 in coroutine.rs cat -n crates/vm/src/coroutine.rs | head -80

Repository: RustPython/RustPython

Length of output: 3152

🏁 Script executed:

# Search for PyAtomicRef and Traverse definitions rg -n "trait Traverse|fn traverse" crates/vm/src --type rs -A 3

Repository: RustPython/RustPython

Length of output: 91

🏁 Script executed:

rg "struct PyAtomicRef" crates/vm/src/object/ -A 10

Repository: RustPython/RustPython

Length of output: 725

🏁 Script executed:

rg "impl.*PyAtomicRef" crates/vm/src/object/ -A 5

Repository: RustPython/RustPython

Length of output: 5675

🏁 Script executed:

rg "fn deref" crates/vm/src/object/ -B 2 -A 5

Repository: RustPython/RustPython

Length of output: 4162

🏁 Script executed:

rg "trait Traverse" crates/vm/src/object/ -A 5

Repository: RustPython/RustPython

Length of output: 678

🏁 Script executed:

rg -n "run_with_context\|exception.*swap\|exception.*deref" crates/vm/src/coroutine.rs -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

cat -n crates/vm/src/coroutine.rs | sed -n '85,130p'

Repository: RustPython/RustPython

Length of output: 1825

🏁 Script executed:

rg -n "\.exception\." crates/vm/src/coroutine.rs

Repository: RustPython/RustPython

Length of output: 183

🏁 Script executed:

# Check how Traverse is called (where traversal happens) rg -n "\.traverse\(" crates/vm/src/ -B 2 -A 2 | head -50

Repository: RustPython/RustPython

Length of output: 3083

🏁 Script executed:

rg -n "fn.*gc|GC" crates/vm/src/ --type rs | head -20

Repository: RustPython/RustPython

Length of output: 91

🏁 Script executed:

rg -n "\.traverse(" crates/vm/src/object/ -B 3 -A 1 | head -80

Repository: RustPython/RustPython

Length of output: 137

🏁 Script executed:

# Look for GC collection cycle and when it calls traverse rg -n "impl.*Traverse\|traverse\(" crates/vm/src/gc.rs 2>/dev/null | head -30

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

find crates -name "gc.rs" -type f

Repository: RustPython/RustPython

Length of output: 91

🏁 Script executed:

rg -n "traverse" crates/vm/src/stdlib/gc.rs -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Look at coroutine integration with GC cat crates/vm/src/stdlib/gc.rs | head -100

Repository: RustPython/RustPython

Length of output: 3099

🏁 Script executed:

# Search for when GC collect is called during execution rg -n "collect\|gc_collect" crates/vm/src/ --type-list

Repository: RustPython/RustPython

Length of output: 454

🏁 Script executed:

# Better search: look for Traverse implementations and understand execution model rg -n "unsafe impl Traverse" crates/vm/src/ | head -30

Repository: RustPython/RustPython

Length of output: 2313

🏁 Script executed:

# Let's look at the full context of run_with_context and exception handling cat -n crates/vm/src/coroutine.rs | sed -n '90,120p'

Repository: RustPython/RustPython

Length of output: 1298

🏁 Script executed:

# Check how resume_gen_frame works rg -n "fn resume_gen_frame" crates/vm/src/ -A 15

Repository: RustPython/RustPython

Length of output: 1122

🏁 Script executed:

# Look at how exception is used in the frame/vm context rg -n "current_exception\|set_exception" crates/vm/src/ -B 1 -A 2 | head -80

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Check if there are any comments or discussions about atomic ref safety rg -n "PyAtomicRef" crates/vm/src/ -B 2 -A 2 | head -60

Repository: RustPython/RustPython

Length of output: 3902

🏁 Script executed:

# Check if there's any GC triggering during frame execution rg -n "gc_collect\|collect_force" crates/vm/src/ -B 3 -A 3 | head -80

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Look at how the running flag's exclusive access guarantee applies to Traverse cat -n crates/vm/src/coroutine.rs | sed -n '42,52p'

Repository: RustPython/RustPython

Length of output: 460

🏁 Script executed:

# Check the PyAtomicRef safety documentation and contracts cat -n crates/vm/src/object/ext.rs | sed -n '244,320p'

Repository: RustPython/RustPython

Length of output: 3032

🏁 Script executed:

# Get more context on PyAtomicRef swap and deref for Option<T> cat -n crates/vm/src/object/ext.rs | sed -n '340,380p'

Repository: RustPython/RustPython

Length of output: 1714

🏁 Script executed:

# Check the full Traverse trait documentation cat -n crates/vm/src/object/traverse.rs | head -80

Repository: RustPython/RustPython

Length of output: 3580

🏁 Script executed:

# Verify: Can GC run while running flag is set? rg -n "running.store\|running.compare_exchange\|running.load" crates/vm/src/coroutine.rs

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Look at how GC traversal interacts with running coroutines cat -n crates/vm/src/gc_state.rs 2>/dev/null | head -50

Repository: RustPython/RustPython

Length of output: 2071

🏁 Script executed:

# Search for where PyAtomicRef deref safety is discussed or tested rg -n "deref\|swap" crates/vm/src/object/ext.rs | grep -A2 -B2 "unsafe"

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Verify: is there any GC trigger during frame execution that might call traverse? rg -n "gc_collect\|trigger.*gc\|AUTO_GC" crates/vm/src/ | head -20

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Check if there are existing tests or comments about concurrent traverse/swap safety rg -n "test.*traverse\|test.*atomic" crates/vm/src/ -B 2 -A 5 | head -40

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Check if exception.deref could be called while exception is being swapped # Look at the exact sequence in run_with_context more carefully cat -n crates/vm/src/coroutine.rs | sed -n '99,116p'

Repository: RustPython/RustPython

Length of output: 926

🏁 Script executed:

# Verify the AcqRel vs Relaxed ordering implications rg -n "Ordering::" crates/vm/src/object/ext.rs | grep -A2 -B2 "swap\|deref"

Repository: RustPython/RustPython

Length of output: 846

🏁 Script executed:

# Check if there's a panic-safety issue with the running flag cleanup cat -n crates/vm/src/coroutine.rs | sed -n '90,116p'

Repository: RustPython/RustPython

Length of output: 1213

🏁 Script executed:

# Check for panic-safety guard patterns in similar code rg -n "scopeguard\|guard\|Drop" crates/vm/src/coroutine.rs

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Look for how other code handles the running flag or similar state cleanup rg -n "compare_exchange.*running\|running.*store" crates/vm/src/ -B 3 -A 3 | head -50

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

# Check if resume_gen_frame could panic and what the consequences would be cat -n crates/vm/src/vm/mod.rs | sed -n '1091,1130p'

Repository: RustPython/RustPython

Length of output: 1973

Wrap running flag reset in a guard to ensure panic-safety.

If resume_gen_frame or the inner closure panics, line 114 is never executed, leaving running=true permanently. This causes subsequent send() calls to immediately fail with "already executing". Use scopeguard::defer! or similar to guarantee running.store(false) is called on all exit paths, matching the pattern already used in resume_gen_frame() itself (lines 1121-1129).

🤖 Prompt for AI Agents

In `@crates/vm/src/coroutine.rs` around lines 39 - 50, The code sets the coroutine's running flag to true but resets it with a plain store(false) that won't run if resume_gen_frame or its inner closure panics; update the code around the running flag in the function that calls resume_gen_frame (the send()/resume path) to install a panic-safe guard (e.g. scopeguard::defer! or a RAII guard) immediately after setting running.store(true) so that running.store(false) is executed on all exit paths; reference the running atomic, the send()/resume function, and resume_gen_frame to locate where to add the guard and ensure the current plain running.store(false) is removed or made unreachable by the new guard.

coderabbitai · 2026-02-16T14:27:02Z

crates/vm/src/frame.rs

                    // Check if this is a RERAISE instruction
                    // Both AnyInstruction::Raise { kind: Reraise/ReraiseFromStack } and
                    // AnyInstruction::Reraise are reraise operations that should not add
-                    // new traceback entries
+                    // new traceback entries.
+                    // EndAsyncFor and CleanupThrow also re-raise non-matching exceptions.
                    let is_reraise = match op {
                        Instruction::RaiseVarargs { kind } => matches!(
                            kind.get(arg),
                            bytecode::RaiseKind::BareRaise | bytecode::RaiseKind::ReraiseFromStack
                        ),
-                        Instruction::Reraise { .. } => true,
+                        Instruction::Reraise { .. }
+                        | Instruction::EndAsyncFor
+                        | Instruction::CleanupThrow => true,
                        _ => false,
                    };


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's understand the code changes and find where these instructions are used rg -n "EndAsyncFor|CleanupThrow" crates/vm/src/ -A 3 -B 3

Repository: RustPython/RustPython

Length of output: 2834

🏁 Script executed:

# Find test coverage for these instructions rg -n "EndAsyncFor|CleanupThrow|end_async_for|cleanup_throw" --type py tests/ crates/

Repository: RustPython/RustPython

Length of output: 111

🏁 Script executed:

# Look at the full context of frame.rs around the changed lines sed -n '530,570p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 2330

🏁 Script executed:

# Check if there are any comments or documentation about reraise behavior rg -n "reraise|RERAISE" crates/vm/src/frame.rs -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 3124

🏁 Script executed:

# Search for where RaiseVarargs and Reraise instructions are handled rg -n "Instruction::Reraise|Instruction::RaiseVarargs" crates/vm/src/frame.rs -B 2 -A 5

Repository: RustPython/RustPython

Length of output: 2386

🏁 Script executed:

# Find tests related to async for loops and exception handling find . -name "*.py" -type f | xargs grep -l "async for\|EndAsyncFor" 2>/dev/null | head -20

Repository: RustPython/RustPython

Length of output: 703

🏁 Script executed:

# Search for existing exception handling tests find . -name "*.py" -type f | xargs grep -l "StopAsyncIteration\|cleanup" 2>/dev/null | head -20

Repository: RustPython/RustPython

Length of output: 636

🏁 Script executed:

# Look for generator/async tests fd -e py | xargs grep -l "async def\|async for" | head -10

Repository: RustPython/RustPython

Length of output: 337

🏁 Script executed:

# Check bytecode instruction definitions rg -n "EndAsyncFor|CleanupThrow|CLEANUP_THROW|END_ASYNC_FOR" crates/ --type rust -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 9658

🏁 Script executed:

# Look at EndAsyncFor implementation details sed -n '1139,1165p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 1186

🏁 Script executed:

# Look at CleanupThrow implementation details sed -n '2257,2280p' crates/vm/src/frame.rs

Repository: RustPython/RustPython

Length of output: 1428

Verify targeted test coverage for reraise paths in EndAsyncFor and CleanupThrow.

The classification of these instructions as reraise-like is correct: both conditionally re-raise when their expected exception (StopAsyncIteration/StopIteration) is not matched. When they do re-raise, they should not add new traceback entries, matching CPython semantics.

However, ensure targeted coverage exists for the re-raise paths:

EndAsyncFor with non-StopAsyncIteration exceptions

CleanupThrow with non-StopIteration exceptions

Perform a clean build before testing: rm -r target/debug/build/rustpython-* && find . | grep -E "\.pyc$" | xargs rm -r

🤖 Prompt for AI Agents

In `@crates/vm/src/frame.rs` around lines 543 - 557, The is_reraise classification correctly treats Instruction::EndAsyncFor and Instruction::CleanupThrow as re-raise-like, but you must add targeted tests that exercise the re-raise branch for each: one test that runs EndAsyncFor with a non-StopAsyncIteration exception and asserts no new traceback entry is added, and one test that runs CleanupThrow with a non-StopIteration exception and asserts the same; add these unit/integration tests around the VM instruction execution harness (target the is_reraise behavior) and run a clean build before testing (rm -r target/debug/build/rustpython-* && find . | grep -E "\.pyc$" | xargs rm -r) to ensure fresh artifacts.

coderabbitai · 2026-02-16T14:27:02Z

crates/vm/src/vm/mod.rs

    pub(crate) fn set_exception(&self, exc: Option<PyBaseExceptionRef>) {
        // don't be holding the RefCell guard while __del__ is called
-        let prev = core::mem::replace(&mut self.exceptions.borrow_mut().exc, exc);
-        drop(prev);
+        let mut excs = self.exceptions.borrow_mut();
+        debug_assert!(
+            !excs.stack.is_empty(),
+            "set_exception called with empty exception stack"
+        );
+        if let Some(top) = excs.stack.last_mut() {
+            let prev = core::mem::replace(top, exc);
+            drop(excs);
+            drop(prev);
+        } else {
+            excs.stack.push(exc);
+            drop(excs);
+        }
+        #[cfg(feature = "threading")]
+        thread::update_thread_exception(self.topmost_exception());
    }


⚠️ Potential issue | 🟡 Minor

set_exception: dead else branch contradicts debug_assert.

The debug_assert! on line 1457 guarantees the stack is non-empty (in debug builds), making the else branch on line 1465 unreachable. In release builds, the assert is elided and the else silently pushes — which masks a logic bug rather than failing fast.

Pick one stance: if the stack must be non-empty, drop the else and keep the assert (or upgrade to assert!). If the empty case is expected, remove the debug_assert.

Proposed fix (assuming non-empty is the invariant)

pub(crate) fn set_exception(&self, exc: Option<PyBaseExceptionRef>) { // don't be holding the RefCell guard while __del__ is called let mut excs = self.exceptions.borrow_mut(); - debug_assert!( - !excs.stack.is_empty(), - "set_exception called with empty exception stack" - ); - if let Some(top) = excs.stack.last_mut() { - let prev = core::mem::replace(top, exc); - drop(excs); - drop(prev); - } else { - excs.stack.push(exc); - drop(excs); - } + let top = excs + .stack + .last_mut() + .expect("set_exception called with empty exception stack"); + let prev = core::mem::replace(top, exc); + drop(excs); + drop(prev); #[cfg(feature = "threading")] thread::update_thread_exception(self.topmost_exception()); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

pub(crate) fn set_exception(&self, exc: Option<PyBaseExceptionRef>) {

// don't be holding the RefCell guard while __del__ is called

let prev = core::mem::replace(&mut self.exceptions.borrow_mut().exc, exc);

drop(prev);

let mut excs = self.exceptions.borrow_mut();

debug_assert!(

!excs.stack.is_empty(),

"set_exception called with empty exception stack"

);

if let Some(top) = excs.stack.last_mut() {

let prev = core::mem::replace(top, exc);

drop(excs);

drop(prev);

} else {

excs.stack.push(exc);

drop(excs);

}

#[cfg(feature = "threading")]

thread::update_thread_exception(self.topmost_exception());

}

pub(crate) fn set_exception(&self, exc: Option<PyBaseExceptionRef>) {

// don't be holding the RefCell guard while __del__ is called

let mut excs = self.exceptions.borrow_mut();

let top = excs

.stack

.last_mut()

.expect("set_exception called with empty exception stack");

let prev = core::mem::replace(top, exc);

drop(excs);

drop(prev);

#[cfg(feature = "threading")]

thread::update_thread_exception(self.topmost_exception());

}

🤖 Prompt for AI Agents

In `@crates/vm/src/vm/mod.rs` around lines 1454 - 1471, The debug_assert in set_exception claims exceptions.stack must be non-empty but the code still has an else that pushes into the stack, hiding a logic bug in release builds; enforce the invariant by replacing debug_assert! with assert! (or otherwise ensure failure in release builds) and remove the unreachable else branch so set_exception only replaces the top entry via last_mut() and never pushes a new entry; update the code paths around exceptions.borrow_mut(), core::mem::replace(top, exc), and thread::update_thread_exception(self.topmost_exception()) accordingly to preserve existing behavior when the stack is non-empty.

coderabbitai

🧹 Nitpick comments (4)

crates/vm/src/vm/thread.rs (2)
17-25: ThreadSlot: consider restricting field visibility.

Both frames and exception are pub, exposing the raw Mutex<Vec<FramePtr>> and PyAtomicRef to any module. Currently, external reads go through accessor functions (get_all_current_exceptions, get_all_current_frames), but the public fields allow bypassing those. If the intent is to keep access centralized, pub(crate) would be tighter.

90-99: Silent no-op when CURRENT_THREAD_SLOT is None.

push_thread_frame and pop_thread_frame silently skip if the slot is uninitialized. While this is a safe defensive choice (the slot should always be Some after enter_vm), a debug_assert! would help catch unexpected call-order bugs during development.
Optional: add debug assertion
 pub fn push_thread_frame(fp: FramePtr) {
     CURRENT_THREAD_SLOT.with(|slot| {
-        if let Some(s) = slot.borrow().as_ref() {
-            s.frames.lock().push(fp);
-        }
+        let borrowed = slot.borrow();
+        debug_assert!(borrowed.is_some(), "push_thread_frame called before enter_vm");
+        if let Some(s) = borrowed.as_ref() {
+            s.frames.lock().push(fp);
+        }
     });
 }
crates/vm/src/vm/mod.rs (2)
103-118: FramePtr safety: the Send impl relies on a non-local invariant.

The unsafe impl Send for FramePtr comment states the Vec is "always empty when the VM moves between threads." This is critical — if violated, a FramePtr could dangle on the new thread. The new_thread() method (line 306) creates a fresh frames: RefCell::new(vec![]), which upholds the invariant for the threading path. Good.

However, there's no compile-time enforcement. If someone later pushes to frames and then moves the VM (e.g., via ThreadedVirtualMachine), this would be UB. Consider adding a debug_assert!(self.frames.borrow().is_empty()) in new_thread() or in the ThreadedVirtualMachine construction.

1087-1131: resume_gen_frame: intentionally skips update_thread_exception — document the trade-off.

Line 1113 pushes to the exception stack directly, bypassing thread::update_thread_exception. This means sys._current_exceptions() won't reflect the exception state during generator execution. If this is a deliberate optimization (avoiding TLS + atomic overhead on every generator resume), a brief comment explaining why would help future maintainers, beyond "inline exception push."

Also, the expect on line 1125 in the scopeguard is safe (it pops the entry pushed at line 1113), but if resume_gen_frame is ever called without the matching push being reached (e.g., early return added between 1100 and 1113), the cleanup would panic. The current code is correct, but it's fragile against refactoring.
Suggested comment improvement
-        // Inline exception push without thread exception update
-        self.exceptions.borrow_mut().stack.push(exc);
+        // Inline exception push: skip thread::update_thread_exception() to avoid
+        // TLS + atomic overhead on every generator resume.  sys._current_exceptions()
+        // will not reflect the generator's transient exception state — acceptable
+        // because the frame is short-lived and the snapshot is best-effort.
+        self.exceptions.borrow_mut().stack.push(exc);

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/vm/src/vm/mod.rs`:
- Around line 1087-1089: The doc comment on the "Lightweight frame execution for
generator/coroutine resume" routine is incorrect: the implementation pushes the
FramePtr into self.frames and into the per-thread ThreadSlot (see the push at
the block that touches FramePtr/self.frames/ThreadSlot and the corresponding
scopeguard pop), so update the doc comment to say that the thread frame stack
(sys._current_frames()) is updated (i.e., a thread frame is pushed and later
popped via the scopeguard) and that trace/profile call/return events are still
fired when tracing is active; reference FramePtr, self.frames, ThreadSlot and
the scopeguard in the comment to match the code behavior.

🧹 Nitpick comments (2)

crates/vm/src/vm/mod.rs (1)

1095-1100: Recursion guard logic is duplicated from with_recursion.

Lines 1096–1100 replicate the check_recursive_call + check_c_stack_overflow + depth-increment sequence from with_recursion (lines 1000–1011). Consider extracting the guard into a reusable helper (e.g., returning a depth guard) so both call sites share the same logic.
crates/vm/src/vm/thread.rs (1)
93-110: push_thread_frame / pop_thread_frame silently no-op when slot is None.

If CURRENT_THREAD_SLOT is None (i.e. enter_vm was never called), both functions silently skip the operation. This is safe defensively, but a debug_assert! on the None path would help catch initialization ordering bugs during development.
Optional: add debug assertion
 pub fn push_thread_frame(fp: FramePtr) {
     CURRENT_THREAD_SLOT.with(|slot| {
         if let Some(s) = slot.borrow().as_ref() {
             s.frames.lock().push(fp);
+        } else {
+            debug_assert!(false, "push_thread_frame called without initialized thread slot");
         }
     });
 }

coderabbitai · 2026-02-16T15:44:14Z

crates/vm/src/vm/mod.rs

+    /// Lightweight frame execution for generator/coroutine resume.
+    /// Skips the thread frame stack (sys._current_frames()) but still fires
+    /// trace/profile call/return events when tracing is active.


⚠️ Potential issue | 🟡 Minor

Doc comment contradicts implementation — thread frame IS pushed.

Lines 1104–1106 push the FramePtr to both self.frames and the per-thread ThreadSlot, and the scopeguard pops both. The doc comment "Skips the thread frame stack (sys._current_frames())" is wrong and will mislead future readers.

Proposed fix

- /// Lightweight frame execution for generator/coroutine resume. - /// Skips the thread frame stack (sys._current_frames()) but still fires - /// trace/profile call/return events when tracing is active. + /// Lightweight frame execution for generator/coroutine resume. + /// Pushes to the thread frame stack and fires trace/profile events, + /// but skips the thread exception update for performance.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/// Lightweight frame execution for generator/coroutine resume.

/// Skips the thread frame stack (sys._current_frames()) but still fires

/// trace/profile call/return events when tracing is active.

/// Lightweight frame execution for generator/coroutine resume.

/// Pushes to the thread frame stack and fires trace/profile events,

/// but skips the thread exception update for performance.

🤖 Prompt for AI Agents

In `@crates/vm/src/vm/mod.rs` around lines 1087 - 1089, The doc comment on the "Lightweight frame execution for generator/coroutine resume" routine is incorrect: the implementation pushes the FramePtr into self.frames and into the per-thread ThreadSlot (see the push at the block that touches FramePtr/self.frames/ThreadSlot and the corresponding scopeguard pop), so update the doc comment to say that the thread frame stack (sys._current_frames()) is updated (i.e., a thread frame is pushed and later popped via the scopeguard) and that trace/profile call/return events are still fired when tracing is active; reference FramePtr, self.frames, ThreadSlot and the scopeguard in the comment to match the code behavior.

youknowone marked this pull request as ready for review February 16, 2026 06:58