no_std for common, pylib, codegen #7056

Merged
youknowone merged 3 commits into RustPython:main from youknowone:no_std
Feb 9, 2026

@youknowone youknowone commented Feb 9, 2026

Summary by CodeRabbit

  • New Features

    • Adds broad no-std support so the libraries can be built for embedded or std-less environments.
  • Bug Fixes

    • Centralized refcount overflow handling for more consistent behavior across builds.
    • Reduced thread-local reliance in string distance code to lower runtime memory/tls use.
  • Other

    • Conditional behavior adjustments to improve compatibility and deterministic behavior depending on build features.

coderabbitai bot commented Feb 9, 2026

Walkthrough

Adds optional no_std support across multiple crates by introducing std feature flags, applying crate-level no_std gating, switching std→core imports, and providing cfg-based alternative implementations for panic handling, static storage, and thread-local behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Feature files: crates/codegen/Cargo.toml, crates/common/Cargo.toml | Add [features] with default = ["std"] and std feature to enable opt-in no_std builds. |
| Crate root attributes: crates/codegen/src/lib.rs, crates/common/src/lib.rs, crates/pylib/src/lib.rs | Add crate-level no_std gating (cfg_attr(not(feature = "std"), no_std) or #![no_std]) and gate std-only modules behind feature = "std". |
| Panic / unwind handling: crates/codegen/src/symboltable.rs, crates/common/src/refcount.rs | Provide cfg'ed implementations: symboltable::with_append uses catch_unwind on std, simpler push/pop otherwise; refcount overflow handling consolidated into a helper that aborts under std and panics otherwise. |
| Static storage / thread-local: crates/common/src/static_cell.rs | Refactor StaticCell into three cfg paths: threading (feature = "threading"), std non-threading (thread-local OnceCell), and no_std path with a SyncOnceCell wrapper + unsafe Sync impl. |
| Std→core migrations: crates/common/src/float_ops.rs, crates/common/src/str.rs | Replace std::f64 with core::f64; remove thread_local/RefCell scratch buffer from levenshtein_distance and use a local fixed-size array buffer. |
| Misc / ordering: crates/common/src/lib.rs, others | Adjust module-level cfg guards to require feature = "std" where appropriate and reorder gating to reflect feature dependency. |
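
For readers unfamiliar with the gating pattern summarized in the table, here is a minimal sketch of how a crate can stay buildable with and without `std`. It is not taken from the PR: the function names `circle_area` and `build_flavor` are hypothetical, and the `std` feature is assumed to be declared in the crate's Cargo.toml as described above.

```rust
// A crate root (lib.rs) would begin with the inner attribute below. It is
// shown as a comment so the snippet also compiles when pasted mid-file:
//
//     #![cfg_attr(not(feature = "std"), no_std)]

// `core` is available with or without `std`, so shared code imports from it.
use core::f64::consts::PI;

/// Works in both configurations because it only touches `core`.
pub fn circle_area(radius: f64) -> f64 {
    PI * radius * radius
}

/// Compiled only when the (assumed) `std` feature is enabled.
#[cfg(feature = "std")]
pub fn build_flavor() -> &'static str {
    "std"
}

/// Fallback compiled when the `std` feature is disabled.
#[cfg(not(feature = "std"))]
pub fn build_flavor() -> &'static str {
    "no_std"
}
```

Exactly one `build_flavor` definition survives conditional compilation, so callers never need their own `cfg` checks.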

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • no_std clippy #7043 — Similar cross-crate no_std work: feature gates, crate no_std attributes, and std→core/alloc adjustments.

Suggested reviewers

  • ShaharNaveh
  • coolreader18

Poem

🐰
I hopped through crates and toggled flags,
Swapped std for core in tiny snags.
Guards and cells now mind their thread,
Panic paths are gently led.
A carrot for builds, no_std ahead!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main objective of the changeset: adding no_std support across three crates (common, pylib, and codegen). |


@youknowone youknowone marked this pull request as ready for review February 9, 2026 04:57

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@crates/codegen/Cargo.toml`:
- Around line 11-13: The crate's Cargo.toml currently toggles a local "std"
feature but doesn't propagate it to dependencies, so dependencies like thiserror
and itertools will still enable std; either (A) update workspace dependency
entries for thiserror and itertools to set default-features = false (and add the
needed features like itertools' use_alloc) so they truly work in no_std, or (B)
add feature forwarding in this crate's Cargo.toml by making the std feature
enable the dependencies' std/use_std/use_alloc equivalents (e.g., map this
crate's "std" feature to "thiserror/std" and the appropriate itertools feature)
so disabling std here also disables std in those deps; apply this change to the
Cargo.toml features and/or workspace dependency declarations and ensure feature
names match the dependency crates (thiserror, itertools).

In `@crates/codegen/src/symboltable.rs`:
- Around line 325-341: The PopGuard using a raw *mut pointer is unsound because
calling f(self) re-borrows self and invalidates that pointer; replace the
raw-pointer Drop guard in with_append with a safe length/index-based approach:
record the previous length (or index) of self.v, push x.into(), call f(self) to
get the result, then after f returns pop the element by checking/restoring based
on the recorded length (or pop unconditionally if you know you always pushed
one). Remove the PopGuard type and raw pointer usage (references: with_append,
PopGuard, self.v, f(self)).
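
The length-based approach described in the item above could look roughly like the following. This is a sketch under stated assumptions, not the code in symboltable.rs: `Stack` is a hypothetical stand-in type, and only the names `with_append`, `v`, and `f` come from the review text.

```rust
/// Hypothetical stand-in for the structure that owns the `v` stack.
struct Stack<T> {
    v: Vec<T>,
}

impl<T> Stack<T> {
    /// Pushes `x`, runs `f` with the element in place, then restores the
    /// stack to its previous length. No raw pointers are involved.
    fn with_append<R>(&mut self, x: T, f: impl FnOnce(&mut Self) -> R) -> R {
        let prev_len = self.v.len();
        self.v.push(x);
        let result = f(self);
        // Truncating to the recorded length removes the pushed element even
        // if `f` left extra entries behind.
        self.v.truncate(prev_len);
        result
    }
}
```

If `f` panics, this version does not pop the element. That matches the "simpler push/pop" path the walkthrough describes for builds without `std`; the `std` path is described there as using catch_unwind.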

In `@crates/common/src/refcount.rs`:
- Around line 38-40: The refcount overflow check currently uses panic!, which
weakens safety because a panic can be unwound and caught, leaving the overflowed
count in use. Change the overflow handling in the increment paths (the branch
checking old_size & Self::MASK == Self::MASK in inc and inc_by) to call
std::process::abort() when the standard library is available, and use panic!
only as a fallback in no_std builds. Implement this with
#[cfg(feature = "std")] / #[cfg(not(feature = "std"))] guards so that overflow
always aborts in std builds, and apply the same change to the inc_by overflow
check.
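
A minimal sketch of the consolidated helper this item asks for, assuming a `std` Cargo feature as introduced by the PR. `Counter`, its `MASK` value, and `refcount_overflow` are hypothetical names used for illustration and do not reproduce the actual refcount.rs.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

/// With `std`, abort so the overflow can never be caught and ignored.
#[cfg(feature = "std")]
#[cold]
fn refcount_overflow() -> ! {
    std::process::abort()
}

/// Without `std` there is no process to abort, so fall back to a panic.
#[cfg(not(feature = "std"))]
#[cold]
fn refcount_overflow() -> ! {
    panic!("refcount overflow")
}

/// Hypothetical counter; the real type packs more state into the word.
struct Counter {
    state: AtomicUsize,
}

impl Counter {
    /// Assumed mask selecting the bits that hold the count.
    const MASK: usize = usize::MAX >> 1;

    fn new() -> Self {
        Self { state: AtomicUsize::new(1) }
    }

    fn inc(&self) {
        let old = self.state.fetch_add(1, Ordering::Relaxed);
        // Every increment path funnels into the same helper.
        if (old & Self::MASK) == Self::MASK {
            refcount_overflow();
        }
    }

    fn get(&self) -> usize {
        self.state.load(Ordering::Relaxed) & Self::MASK
    }
}
```

Keeping the two `cfg` variants in one helper means `inc` and any `inc_by` counterpart cannot drift apart in how they treat overflow.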

- Gate OS-dependent modules behind `#[cfg(feature = "std")]`
- Replace `std::f64` with `core::f64` in float_ops
- Replace `std::process::abort` with panic in refcount
- Remove `thread_local` from levenshtein (stack buffer)
- Split static_cell into threading/non_threading/no_std

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/common/src/str.rs`:
- Line 536: The left-edge initialization in the Levenshtein DP loop is off by
one: update the assignment that sets result for the current row (using variables
result, b_index, and constant MOVE_COST inside the DP routine) so that it uses
(b_index + 1) multiplied by MOVE_COST instead of b_index multiplied by
MOVE_COST, ensuring the row deletion cost corresponds to row b_index+1.
🧹 Nitpick comments (2)
crates/common/src/static_cell.rs (1)

123-126: Imprecise unsafe safety justification — no_std does not preclude threading.

The comment "without std there can be no threads" is not accurate. Many no_std environments (RTOS, multi-core bare-metal) support threading. The actual safety invariant here is that the threading feature is disabled, which is a user contract that concurrent access won't occur. Consider revising the comment to reflect the real guarantee:

Suggested comment fix
-    // unsync::OnceCell is !Sync, but without std there can be no threads.
-    struct SyncOnceCell<T>(OnceCell<T>);
-    // SAFETY: Without std, threading is impossible.
-    unsafe impl<T> Sync for SyncOnceCell<T> {}
+    // unsync::OnceCell is !Sync, but this path is only active when the
+    // `threading` feature is disabled, meaning the caller guarantees
+    // no concurrent access.
+    struct SyncOnceCell<T>(OnceCell<T>);
+    // SAFETY: The `threading` feature is off, so the user has opted out of
+    // multi-threaded use. No concurrent access to the inner cell can occur.
+    unsafe impl<T> Sync for SyncOnceCell<T> {}
crates/common/Cargo.toml (1)

12-13: The std feature doesn't forward to dependency feature gates, which prevents true no_std builds.

If you plan to support no_std, dependencies like once_cell and itertools need conditional feature forwarding. The current std = [] is inert—disabling it doesn't disable std in transitive dependencies.

Consider extending the feature once no_std is fully supported:

std = ["once_cell/std", "itertools/use_std"]

(Note: getrandom is already explicitly enabled with std in workspace.dependencies. malachite-bigint has a std feature but doesn't fully support no_std at the crate level, so forwarding it won't be sufficient for true no_std builds.)

If this is an incomplete incremental step, a tracking issue would help clarify the scope.


let mut result = 0usize;
for (b_index, b_code) in b_bytes[b_begin..(b_begin + b_end)].iter().enumerate() {
    result = b_index * MOVE_COST;
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the levenshtein_distance function around line 536
cat -n crates/common/src/str.rs | sed -n '500,560p'

Repository: RustPython/RustPython

Length of output: 2483


🏁 Script executed:

#!/bin/bash
# Search for test cases related to levenshtein_distance
rg -n 'levenshtein_distance|#\[test\]' crates/common/src/str.rs -A 5 -B 1

Repository: RustPython/RustPython

Length of output: 474


🏁 Script executed:

#!/bin/bash
# Look for any other occurrences of the pattern to understand the refactoring
git log --oneline -n 20 -- crates/common/src/str.rs

Repository: RustPython/RustPython

Length of output: 93


🏁 Script executed:

#!/bin/bash
# Search for any test cases that verify levenshtein distance behavior
find . -name "*.rs" -type f -exec grep -l "levenshtein" {} \;

Repository: RustPython/RustPython

Length of output: 121


🏁 Script executed:

#!/bin/bash
# Look for test cases in the file itself
grep -n "test" crates/common/src/str.rs | head -20

Repository: RustPython/RustPython

Length of output: 143


🏁 Script executed:

#!/bin/bash
# Try to find CPython's implementation online or in comments
rg -n "CPython|suggestions\.c|UTF8_Edit" crates/ -A 3 -B 1

Repository: RustPython/RustPython

Length of output: 50380


🏁 Script executed:

#!/bin/bash
# Check if there's a git history we can inspect for this function
cd crates/common && git log --oneline -n 30 -- src/str.rs 2>/dev/null || echo "Git history not available"

Repository: RustPython/RustPython

Length of output: 93


🏁 Script executed:

#!/bin/bash
# Check the suggestion.rs file which might have the levenshtein implementation
cat -n crates/vm/src/suggestion.rs | head -100

Repository: RustPython/RustPython

Length of output: 4334


🏁 Script executed:

#!/bin/bash
# Search for any tests that use levenshtein_distance or suggestion functionality
find . -name "*.rs" -type f | xargs grep -l "suggestion\|levenshtein" | head -10

Repository: RustPython/RustPython

Length of output: 324


🏁 Script executed:

#!/bin/bash
# Look at the suggestion.rs file to see if it uses the levenshtein_distance function
rg -n "levenshtein_distance" crates/vm/src/suggestion.rs -B 3 -A 3

Repository: RustPython/RustPython

Length of output: 542


🏁 Script executed:

#!/bin/bash
# Search for any tests that call levenshtein_distance or validate its output
find . -name "*.rs" -type f | xargs grep -l "test.*levenshtein\|levenshtein.*test" 2>/dev/null

Repository: RustPython/RustPython

Length of output: 49


🏁 Script executed:

#!/bin/bash
# Look for any Python tests that might indirectly test levenshtein via suggestions
find . -name "test*.py" -o -name "*test.py" | xargs grep -l "AttributeError\|NameError\|suggestion" 2>/dev/null | head -5

Repository: RustPython/RustPython

Length of output: 214


🏁 Script executed:

#!/bin/bash
# Search for CPython reference or comments about the levenshtein implementation
rg -n "CPython|edit.*cost|levenshtein" crates/common/src/str.rs -B 2 -A 2

Repository: RustPython/RustPython

Length of output: 329


🌐 Web query:

CPython _Py_UTF8_Edit_Cost implementation Levenshtein

💡 Result:

In CPython, _Py_UTF8_Edit_Cost() is implemented in Python/suggestions.c and is a bounded, weighted Levenshtein-distance routine over the strings’ UTF-8 byte sequences (it calls PyUnicode_AsUTF8AndSize() for both inputs). [1]

Key details of the implementation:

  • Operation costs

    • Insert/delete (“move”): MOVE_COST = 2 [1]
    • Substitution: 0 if equal; CASE_COST = 1 for ASCII case-only differences; otherwise MOVE_COST = 2 [1]
    • Case-only detection is optimized via LEAST_FIVE_BITS(c) == (c & 31) plus ASCII folding to confirm a case flip. [1]
  • Algorithm / optimizations

    • Trims common prefix and suffix before DP. [1]
    • Uses a single-row DP (in-place update) rather than a full matrix. [1]
    • Has a max_cost threshold: if the minimum value in a DP row exceeds max_cost, it bails out early and returns max_cost + 1. [1]
    • Hard limits: if either (trimmed) byte length exceeds MAX_STRING_SIZE (40), returns max_cost + 1 (treated as “too expensive”). [1]
  • max_cost == -1 behavior

    • Interpreted as “no practical bound”: it sets max_cost = MOVE_COST * max(len(a), len(b)) and then computes the distance. [1]

The function is declared as an internal API in Include/internal/pycore_pyerrors.h. [2]

Sources: Python/suggestions.c (Android git mirror of CPython) [1]; pycore_pyerrors.h declaration [2].
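
To make the algorithm in the summary above concrete, here is an independent sketch of a bounded, weighted, single-row Levenshtein routine. It is neither CPython's nor RustPython's code: the name `edit_cost` and the variables `diag`, `left`, and `up` are illustrative, the constants follow the values quoted above, and the `max_cost == -1` case is omitted. It tracks the diagonal and left neighbors in separate variables so each DP cell reads its three predecessors explicitly.

```rust
const MOVE_COST: usize = 2;
const CASE_COST: usize = 1;
const MAX_STRING_SIZE: usize = 40;

fn substitution_cost(a: u8, b: u8) -> usize {
    if a == b {
        0
    } else if a.eq_ignore_ascii_case(&b) {
        CASE_COST // ASCII case-only difference
    } else {
        MOVE_COST
    }
}

/// Weighted edit cost between two byte strings, or `max_cost + 1` once the
/// cost is known to exceed `max_cost`.
fn edit_cost(a: &[u8], b: &[u8], max_cost: usize) -> usize {
    // Trim the common prefix and suffix before running the DP.
    let prefix = a.iter().zip(b).take_while(|(x, y)| x == y).count();
    let (a, b) = (&a[prefix..], &b[prefix..]);
    let suffix = a.iter().rev().zip(b.iter().rev()).take_while(|(x, y)| x == y).count();
    let (a, b) = (&a[..a.len() - suffix], &b[..b.len() - suffix]);
    if a.is_empty() || b.is_empty() {
        return (a.len() + b.len()) * MOVE_COST;
    }
    if a.len() > MAX_STRING_SIZE || b.len() > MAX_STRING_SIZE {
        return max_cost + 1;
    }
    // Fixed-size stack buffer: row[i] holds cost(b[..rows_done], a[..=i]).
    let mut row = [0usize; MAX_STRING_SIZE];
    for (i, cell) in row[..a.len()].iter_mut().enumerate() {
        *cell = (i + 1) * MOVE_COST;
    }
    let mut result = 0;
    for (b_index, &b_code) in b.iter().enumerate() {
        let mut diag = b_index * MOVE_COST; // cost(b[..b_index], a[..0])
        let mut left = (b_index + 1) * MOVE_COST; // cost(b[..=b_index], a[..0])
        let mut minimum = usize::MAX;
        for (i, &a_code) in a.iter().enumerate() {
            let up = row[i]; // value from the previous row
            let substitute = diag + substitution_cost(b_code, a_code);
            let cost = substitute.min(left.min(up) + MOVE_COST);
            diag = up;
            left = cost;
            row[i] = cost;
            minimum = minimum.min(cost);
        }
        if minimum > max_cost {
            return max_cost + 1; // every cell in this row is already too expensive
        }
        result = left;
    }
    result
}
```

The fixed-size `row` array mirrors the stack-buffer idea this PR adopts in place of a thread-local scratch buffer, which is what lets the routine work without `std`.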


Line 536: Off-by-one bug in Levenshtein DP initialization.

The variable result represents the left-edge cost (deletion cost) for the current DP row. For row b_index+1, this should be (b_index + 1) * MOVE_COST, not b_index * MOVE_COST. The current code computes all edit distances with the wrong row-edge values, producing incorrect results. This matches CPython's _Py_UTF8_Edit_Cost implementation, which correctly uses (b_index + 1) * MOVE_COST.

Change line 536 from:

result = b_index * MOVE_COST;

to:

result = (b_index + 1) * MOVE_COST;
@youknowone youknowone merged commit 570d50c into RustPython:main Feb 9, 2026
23 of 24 checks passed
@youknowone youknowone deleted the no_std branch February 9, 2026 14:28