Update hashlib from v3.14.3 and align _hashlib to CPython #7120

Merged
youknowone merged 2 commits into RustPython:main from youknowone:hashlib on Feb 14, 2026

Conversation

@youknowone (Member) commented Feb 14, 2026

Summary by CodeRabbit

  • New Features

    • Added HMAC and PBKDF2 cryptographic functions.
    • Exposed hash algorithm enumeration and per-algorithm constructors.
    • Added FIPS mode detection attribute.
  • Bug Fixes

    • Improved error handling and propagation across hashing operations.
    • Fixed mangled-name consistency for private identifier emission.
  • Chores

    • Updated cryptographic dependencies and added hmac and pbkdf2.

@coderabbitai bot (Contributor) commented Feb 14, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

Replaces emitted parameter-name strings with mangled variants, updates many hash wrappers to propagate errors and accept the VM, refactors hashlib to centralize algorithm handling and add HMAC/PBKDF2, updates stdlib crypto dependencies, and adds an explicit Ok type annotation in VM stdio init.
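
For context, Python rewrites identifiers of the form `__name` inside a class body to `_ClassName__name`, so a keyword-only parameter written as `__x` has to appear under its mangled name in the emitted defaults as well. The sketch below illustrates that language rule in isolation. It is not the code from `crates/codegen/src/compile.rs`; the function name `mangle` and its signature are assumptions made for illustration.

```rust
/// Apply Python's private-name mangling rule to `name`.
/// `class_name` is the enclosing class, or `None` outside any class.
/// (Hypothetical helper; not the RustPython implementation.)
fn mangle(class_name: Option<&str>, name: &str) -> String {
    let class_name = match class_name {
        Some(c) => c,
        None => return name.to_owned(),
    };
    // Only `__x`-style names are mangled: dunder names such as `__init__`
    // and dotted names are left untouched.
    if !name.starts_with("__") || name.ends_with("__") || name.contains('.') {
        return name.to_owned();
    }
    // Leading underscores of the class name are dropped. A class name made
    // only of underscores disables mangling.
    let stripped = class_name.trim_start_matches('_');
    if stripped.is_empty() {
        return name.to_owned();
    }
    format!("_{}{}", stripped, name)
}
```

With this rule, a keyword-only parameter `__x` declared in a method of class `Foo` is stored as `_Foo__x`, which is the key the defaults mapping needs to use for lookups to stay consistent.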

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Code Generation | crates/codegen/src/compile.rs | Emit mangled identifier strings instead of raw parameter-name literals in keyword-only defaults and kw_defaults. |
| Dependencies | crates/stdlib/Cargo.toml | Bumped digest and added hmac = "0.12", pbkdf2 = { version = "0.12", features = ["hmac"] }. |
| Hashlib Core | crates/stdlib/src/hashlib.rs | Large refactor: centralized HASH_ALGORITHMS, unified hashlib_new with data resolution, added hmac_digest and pbkdf2_hmac, new helpers (resolve_digestmod, hash_digest_size, unsupported_hash), module attrs (openssl_md_meth_names, _constructors, _GIL_MINSIZE, get_fips_mode). |
| Hash Wrappers | crates/stdlib/src/blake2.rs, crates/stdlib/src/md5.rs, crates/stdlib/src/sha1.rs, crates/stdlib/src/sha256.rs, crates/stdlib/src/sha3.rs, crates/stdlib/src/sha512.rs | Wrapper functions updated to call local_* variants with vm and propagate Result using ?, preserving public signatures but changing internal error propagation. |
| VM init | crates/vm/src/vm/mod.rs | Added explicit type annotation on Ok in essential_init's make_stdio closure for error type inference. |
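
The two Rust idioms mentioned in the summary, forwarding a `Result` with `?` and annotating `Ok` inside a closure so the error type can be inferred, can be sketched in isolation. This is an illustration using only the standard library; `local_parse`, `parse_and_double`, and `sum_all` are hypothetical names and do not appear in the PR.

```rust
use std::num::ParseIntError;

/// Hypothetical fallible inner function, standing in for a `local_*` variant.
fn local_parse(input: &str) -> Result<u32, ParseIntError> {
    input.trim().parse::<u32>()
}

/// Wrapper that forwards the inner error to its caller with `?`
/// instead of unwrapping or discarding it.
fn parse_and_double(input: &str) -> Result<u32, ParseIntError> {
    let value = local_parse(input)?;
    Ok(value * 2)
}

/// Inside a closure, `?` may convert the error into any type that implements
/// `From<ParseIntError>`, so the compiler cannot pick one on its own.
/// The turbofish on `Ok` pins the error type down.
fn sum_all(inputs: &[&str]) -> Result<u32, ParseIntError> {
    let add = |acc: u32, s: &str| {
        let v = local_parse(s)?;
        Ok::<u32, ParseIntError>(acc + v)
    };
    let mut total = 0;
    for &s in inputs {
        total = add(total, s)?;
    }
    Ok(total)
}
```

Removing the `::<u32, ParseIntError>` annotation from the closure would typically produce a "type annotations needed" error, which is the kind of inference gap the VM init change addresses.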

Sequence Diagram(s)

sequenceDiagram
    participant PyCode as Python caller
    participant HashLib as _hashlib (Rust)
    participant VM as VirtualMachine
    participant Crypto as Crypto crates (digest/hmac/pbkdf2)
    PyCode->>HashLib: call hashlib_new / hmac_digest / pbkdf2_hmac (args)
    HashLib->>VM: resolve_data(args) / resolve_digestmod(args, vm)
    HashLib->>HashLib: select algorithm from HASH_ALGORITHMS
    alt Construct hasher
        HashLib->>HashLib: instantiate PyHasher / PyHasherXof
    else HMAC/PBKDF2
        HashLib->>Crypto: call hmac/pbkdf2 with resolved digest name
        Crypto-->>HashLib: bytes / error
    end
    HashLib->>VM: convert result to PyObject (into_pyobject(vm)) or return Err
    VM-->>PyCode: PyObject / PyError
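
The algorithm-selection step in the diagram can be pictured as a lookup in one central table that distinguishes fixed-output hashes from extendable-output (XOF) ones. The sketch below is an assumption-laden illustration: the contents and shape of the real HASH_ALGORITHMS are not shown on this page, and `Output`, `ALGORITHMS`, and `lookup` are hypothetical names. Only the digest sizes are standard facts about the listed algorithms.

```rust
/// Output shape of an algorithm: a fixed digest size in bytes, or an XOF
/// whose output length is chosen by the caller.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Output {
    Fixed(usize),
    Xof,
}

/// Illustrative table only; not the PR's HASH_ALGORITHMS.
const ALGORITHMS: &[(&str, Output)] = &[
    ("md5", Output::Fixed(16)),
    ("sha1", Output::Fixed(20)),
    ("sha224", Output::Fixed(28)),
    ("sha256", Output::Fixed(32)),
    ("sha384", Output::Fixed(48)),
    ("sha512", Output::Fixed(64)),
    ("blake2b", Output::Fixed(64)),
    ("blake2s", Output::Fixed(32)),
    ("shake_128", Output::Xof),
    ("shake_256", Output::Xof),
];

/// Resolve a name once, so every caller reports unknown names the same way.
fn lookup(name: &str) -> Result<Output, String> {
    ALGORITHMS
        .iter()
        .find(|(n, _)| *n == name)
        .map(|(_, out)| *out)
        .ok_or_else(|| format!("unsupported hash type {}", name))
}
```

Keeping the table in one place means the constructor path and the HMAC/PBKDF2 path can share the same name resolution and the same error for unknown algorithms.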

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh
  • fanninpm
  • coolreader18

Poem

🐰 I nibble names and mangle right,

I hop through hashes, passing VM light,
PBKDF2 and HMAC in my clutch,
Errors caught with a careful touch,
A tiny rabbit cheers the build tonight.

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 74.47%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Merge Conflict Detection | ⚠️ Warning | ❌ Merge conflicts detected (18 files), listed below. These conflicts must be resolved before merging into main. | Resolve conflicts locally and push changes to this branch. |

Files with merge conflicts:

⚔️ Cargo.lock (content)
⚔️ Lib/hashlib.py (content)
⚔️ Lib/test/test_hashlib.py (content)
⚔️ Lib/test/test_hmac.py (content)
⚔️ Lib/test/test_inspect/test_inspect.py (content)
⚔️ Lib/test/test_keywordonlyarg.py (content)
⚔️ Lib/test/test_positional_only_arg.py (content)
⚔️ Lib/test/test_smtplib.py (content)
⚔️ crates/codegen/src/compile.rs (content)
⚔️ crates/stdlib/Cargo.toml (content)
⚔️ crates/stdlib/src/blake2.rs (content)
⚔️ crates/stdlib/src/hashlib.rs (content)
⚔️ crates/stdlib/src/md5.rs (content)
⚔️ crates/stdlib/src/sha1.rs (content)
⚔️ crates/stdlib/src/sha256.rs (content)
⚔️ crates/stdlib/src/sha3.rs (content)
⚔️ crates/stdlib/src/sha512.rs (content)
⚔️ crates/vm/src/vm/mod.rs (content)
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main objective of the pull request, which is to update the hashlib implementation and align it with CPython standards. |


@youknowone youknowone marked this pull request as ready for review February 14, 2026 11:40
@coderabbitai bot (Contributor) left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/stdlib/src/hashlib.rs (1)

554-557: ⚠️ Potential issue | 🔴 Critical

compare_digest uses non-constant-time ==, defeating its purpose.

This function exists specifically to prevent timing side-channel attacks when comparing HMACs/digests. The current Vec<u8> equality check short-circuits on the first differing byte, leaking information about which prefix matches. A constant-time comparison is required.

Proposed fix
+    // Constant-time comparison to prevent timing side-channel attacks.
+    fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
+        if a.len() != b.len() {
+            return false;
+        }
+        a.iter()
+            .zip(b.iter())
+            .fold(0u8, |acc, (x, y)| acc | (x ^ y))
+            == 0
+    }
+
     let a_hash = a.borrow_bytes().to_vec();
     let b_hash = b.borrow_bytes().to_vec();

-    Ok((a_hash == b_hash).to_pyobject(vm))
+    Ok(constant_time_eq(&a_hash, &b_hash).to_pyobject(vm))
🤖 Fix all issues with AI agents
In `@crates/stdlib/Cargo.toml`:
- Around line 70-71: The Cargo.toml currently lists digest = "0.10.3" which
conflicts with pbkdf2 = { version = "0.12", features = ["hmac"] } because pbkdf2
(with hmac feature) requires digest ^0.10.7; update the digest dependency to at
least "0.10.7" (e.g., digest = "0.10.7" or a compatible caret range) so that
pbkdf2 0.12 and hmac 0.12 can resolve, then run cargo update / cargo build to
verify the dependency graph for no further conflicts.

In `@crates/stdlib/src/hashlib.rs`:
- Around line 625-630: The code casts a validated i64 `len` to `usize` for
`dklen` without an upper bound, which can trigger huge allocations; modify the
Some(obj) branch (where `let len: i64 = obj.try_into_value(vm)?;`) to enforce a
safe maximum before casting: compute a max_dklen = (2u64.pow(32) - 1) *
digest_size (i.e. (2^32-1) * digest_size) using the digest output size from the
relevant digest object, check that `len >= 1 && (len as u64) <= max_dklen`,
return vm.new_value_error(...) if out of range, and only then convert with a
checked conversion (e.g., usize::try_from(len) or len as usize after the bound
check) so dklen cannot request excessive memory.
🧹 Nitpick comments (3)
crates/stdlib/src/hashlib.rs (3)

376-421: Large repetitive match arms in hashlib_new — consider a helper to reduce duplication.

The same PyHasher::new(name, HashWrapper::new::<T>(data)).into_pyobject(vm) pattern is repeated for every fixed-output algorithm. Extracting a small generic helper would collapse ~30 lines into one call per arm.

This applies equally to hmac_digest (Lines 596-608) and pbkdf2_hmac (Lines 646-658).
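
The helper suggested above could look roughly like the following. This is a toy, self-contained sketch: `ToyDigest`, `XorSum`, `ByteSum`, `Hasher`, `make_hasher`, and `new_hasher` are all hypothetical stand-ins (the PR uses PyHasher, HashWrapper, and real digest types, whose exact APIs are not shown here), and the two "digests" are deliberately trivial and not cryptographic.

```rust
/// Stand-in for a digest trait (hypothetical, not the `digest` crate's API).
trait ToyDigest: Default {
    fn update(&mut self, data: &[u8]);
    fn finalize(self) -> Vec<u8>;
}

/// Toy digest: XOR of all input bytes. Not cryptographic.
#[derive(Default)]
struct XorSum(u8);

impl ToyDigest for XorSum {
    fn update(&mut self, data: &[u8]) {
        for b in data {
            self.0 ^= b;
        }
    }
    fn finalize(self) -> Vec<u8> {
        vec![self.0]
    }
}

/// Toy digest: wrapping sum of all input bytes. Not cryptographic.
#[derive(Default)]
struct ByteSum(u32);

impl ToyDigest for ByteSum {
    fn update(&mut self, data: &[u8]) {
        for b in data {
            self.0 = self.0.wrapping_add(u32::from(*b));
        }
    }
    fn finalize(self) -> Vec<u8> {
        self.0.to_be_bytes().to_vec()
    }
}

/// Stand-in for the object handed back to the caller.
struct Hasher {
    name: String,
    digest: Vec<u8>,
}

/// The single generic helper: every match arm below collapses to one call.
fn make_hasher<D: ToyDigest>(name: &str, data: Option<&[u8]>) -> Hasher {
    let mut d = D::default();
    if let Some(bytes) = data {
        d.update(bytes);
    }
    Hasher { name: name.to_owned(), digest: d.finalize() }
}

fn new_hasher(name: &str, data: Option<&[u8]>) -> Result<Hasher, String> {
    match name {
        "xorsum" => Ok(make_hasher::<XorSum>(name, data)),
        "bytesum" => Ok(make_hasher::<ByteSum>(name, data)),
        other => Err(format!("unsupported hash type {}", other)),
    }
}
```

The design point is that construction logic lives in `make_hasher` once, so each arm only names the algorithm type; the same shape would apply to the HMAC and PBKDF2 dispatch.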


580-609: HMAC doesn't support Blake2 or Shake, but HASH_ALGORITHMS includes them.

hmac_digest falls through to the _ arm for blake2b, blake2s, shake_128, and shake_256, returning unsupported hash type. If this is intentional (matching CPython behaviour), it's fine — but it may surprise callers who see those names in openssl_md_meth_names. Worth a brief comment documenting the intentional exclusion, or extending support if HMAC-Blake2 is desired.


571-578: hmac_new is a stub that always raises — consider documenting intent.

If this is a deliberate placeholder awaiting a full HMAC object implementation, a brief // TODO or code comment would help future contributors understand it's intentional rather than an oversight.

Comment on lines 625 to 630
Some(obj) => {
let len: i64 = obj.try_into_value(vm)?;
if len < 1 {
return Err(vm.new_value_error("key length must be greater than 0.".to_owned()));
}
len as usize

⚠️ Potential issue | 🟡 Minor

Unchecked i64 → usize cast for dklen can allocate excessive memory.

After verifying len >= 1, Line 630 casts directly with len as usize. While an i64 that fits in usize won't wrap, a very large value (e.g., 2^50) would attempt a petabyte-scale allocation. CPython caps dklen at (2^32 - 1) * digest_size per the PKCS#5 spec. Consider adding a reasonable upper bound.

Proposed fix
             Some(obj) => {
                 let len: i64 = obj.try_into_value(vm)?;
                 if len < 1 {
                     return Err(vm.new_value_error("key length must be greater than 0.".to_owned()));
                 }
-                len as usize
+                usize::try_from(len).map_err(|_| {
+                    vm.new_overflow_error("key length is too great.".to_owned())
+                })?
             }
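
Note that on 64-bit targets `usize::try_from` accepts every positive i64, so the conversion alone does not impose an upper bound. A sketch of the explicit bound follows, using the PBKDF2 limit from RFC 8018 (dkLen must not exceed (2^32 - 1) * hLen). The function name `checked_dklen`, its signature, and the use of `String` errors in place of VM exceptions are assumptions made to keep the example self-contained.

```rust
use std::convert::TryFrom;

/// Validate a requested derived-key length before converting it to `usize`.
/// `digest_size` is the hash output size in bytes (hLen in RFC 8018).
/// (Hypothetical helper; errors are plain strings rather than VM exceptions.)
fn checked_dklen(len: i64, digest_size: usize) -> Result<usize, String> {
    if len < 1 {
        return Err("key length must be greater than 0.".to_owned());
    }
    // RFC 8018, section 5.2: dkLen must not exceed (2^32 - 1) * hLen.
    let max = u64::from(u32::MAX).saturating_mul(digest_size as u64);
    let len_u64 = len as u64; // lossless: len >= 1 was checked above
    if len_u64 > max {
        return Err("key length is too great.".to_owned());
    }
    // Still needed on 32-bit targets, where the bound can exceed usize::MAX.
    usize::try_from(len_u64).map_err(|_| "key length is too great.".to_owned())
}
```

Even this limit permits very large requests (over 100 GB for a 32-byte digest), so an implementation may want a tighter cap or an allocation strategy that fails gracefully.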

@youknowone youknowone merged commit ec0f3b4 into RustPython:main Feb 14, 2026
12 of 14 checks passed
@youknowone youknowone deleted the hashlib branch February 14, 2026 12:40
@github-actions (Contributor)

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] lib: cpython/Lib/hashlib.py
[x] test: cpython/Lib/test/test_hashlib.py (TODO: 8)

dependencies:

  • hashlib

dependent tests: (15 tests)

  • hashlib: test_hashlib test_hmac test_smtplib test_tarfile test_unicodedata test_urllib2_localnet
    • urllib.request: test_http_cookiejar test_pathlib test_pydoc test_site test_ssl test_urllib test_urllib2 test_urllib2net test_urllibnet

[ ] lib: cpython/Lib/hmac.py
[ ] test: cpython/Lib/test/test_hmac.py (TODO: 26)

dependencies:

  • hmac

dependent tests: (4 tests)

  • hmac: test_hmac test_smtplib
    • secrets: test_secrets
    • smtplib: test_smtpnet

[x] lib: cpython/Lib/inspect.py
[ ] test: cpython/Lib/test/test_inspect (TODO: 48)

dependencies:

  • inspect

dependent tests: (44 tests)

  • inspect: test_abc test_argparse test_asyncgen test_buffer test_builtin test_code test_collections test_coroutines test_decimal test_enum test_functools test_generators test_grammar test_inspect test_ntpath test_operator test_patma test_posixpath test_pydoc test_signal test_traceback test_type_annotations test_types test_typing test_unittest test_yield_from test_zipimport
    • asyncio: test_asyncio test_contextlib_async test_logging test_os test_sys_settrace test_unittest
    • bdb: test_bdb
    • dataclasses: test__colorize test_genericalias test_pprint test_regrtest test_zoneinfo
    • importlib.metadata: test_importlib
    • pydoc:
      • xmlrpc.server: test_docxmlrpc test_xmlrpc
    • rlcompleter: test_rlcompleter
    • trace: test_trace

[x] test: cpython/Lib/test/test_keywordonlyarg.py

dependencies:

dependent tests: (no tests depend on keywordonlyarg)

[x] test: cpython/Lib/test/test_positional_only_arg.py (TODO: 4)

dependencies:

dependent tests: (no tests depend on positional_only_arg)

[ ] lib: cpython/Lib/smtplib.py
[x] test: cpython/Lib/test/test_smtplib.py
[x] test: cpython/Lib/test/test_smtpnet.py

dependencies:

  • smtplib (native: email.base64mime, email.generator, email.message, email.utils, sys)
    • datetime (native: _thread, math, sys, time)
    • hmac
    • base64, copy, io, re, socket

dependent tests: (2 tests)

  • smtplib: test_smtplib test_smtpnet

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated
