Replace custom opcodes with CPython standard sequences#6794
Replace custom opcodes with CPython standard sequences#6794youknowone merged 2 commits intoRustPython:mainfrom
Conversation
📝 WalkthroughWalkthroughReworks collection-building and call-argument emission to a two-step streaming pattern: build empty list/set, extend/update per-element (via Changes
Sequence Diagram(s)sequenceDiagram
participant Codegen
participant Bytecode
participant VM
participant Runtime
Codegen->>Bytecode: emit BuildList(size=0) / BuildSet(size=0)
Codegen->>Bytecode: emit ListExtend(i) / SetUpdate(i) / DictMerge(index) (per element)
Codegen->>Bytecode: emit ListToTuple (if target is tuple)
Bytecode->>VM: stream of instructions
VM->>Runtime: execute BuildList -> create empty sequence
VM->>Runtime: on ListExtend -> iterate source -> append items to list
VM->>Runtime: on SetUpdate -> iterate source -> add items to set
VM->>Runtime: on DictMerge -> validate mapping -> merge keys into target dict
alt conversion requested
VM->>Runtime: ListToTuple intrinsic -> produce tuple
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 542-568: gather_elements leaves tuple chunks on the stack before
the collection is created, so emitting Instruction::BuildList/BuildSet after
those chunks causes ListExtend/SetUpdate to index the wrong stack slot and
trigger unsafe casts; fix by ensuring the collection is created before tuple
chunks are left on the stack (emit Instruction::BuildList { size: 0 } /
Instruction::BuildSet { size: 0 } prior to the code that pushes tuple chunks in
gather_elements) or, if reordering is infeasible, adjust the ListExtend { i: ...
} / SetUpdate { i: ... } indices to account for the tuple-first layout
(increment the depth used) in the CollectionType::List and CollectionType::Set
handling (also apply same fix to the other occurrences noted).
🧹 Nitpick comments (2)
crates/vm/src/frame.rs (2)
941-958: Consider using lazy iteration over keys instead of collecting into Vec.The current implementation collects all keys into a
Vecbefore iterating, which allocates memory proportional to the source dict size. This could be inefficient for large mappings and doesn't match CPython's lazy iteration behavior.Consider using the iterator protocol directly:
♻️ Suggested refactor using lazy iteration
- // Get keys from source and check for duplicates - let keys_iter = vm.call_method(&source, "keys", ())?; - for key in keys_iter.try_to_value::<Vec<PyObjectRef>>(vm)? { + // Get keys from source and check for duplicates + let keys_obj = vm.call_method(&source, "keys", ())?; + let keys_iter = keys_obj.get_iter(vm)?; + while let PyIterReturn::Return(key) = keys_iter.next(vm)? { // Check if keyword argument is a string if key.downcast_ref::<PyStr>().is_none() { return Err(vm.new_type_error("keywords must be strings".to_owned())); } // Check for duplicate keyword arguments if dict.contains_key(&*key, vm) { let key_repr = key.repr(vm)?; return Err(vm.new_type_error(format!( "got multiple values for keyword argument {}", key_repr.as_str() ))); } let value = vm.call_method(&source, "__getitem__", (key.clone(),))?; dict.set_item(&*key, value, vm)?; }
1594-1607: Consider using PySet'supdate_internalmethod for consistency withListExtend.While
ListExtenddelegates tolist.extend(), this handler manually iterates and adds items.PySethas anupdate_internal()method that takes a single iterable and would provide the same consistency pattern: replace the manual iteration withset.update_internal(iterable, vm)?;.
ccb2a22 to
8c3b698
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/vm/src/frame.rs`:
- Around line 921-959: Instruction::DictMerge currently pops the source with
pop_value() then calls nth_value(index.get(arg)) without compensating for the
pop, so it reads the wrong stack slot; mirror the DictUpdate pattern by
subtracting 1 from the computed slot before calling nth_value (i.e., use
nth_value(index.get(arg) - 1) or otherwise adjust the index after pop_value()),
ensuring you reference the same index handling as in DictUpdate and keep the
existing checks (get_method, keys iteration, contains_key, set_item) intact.
♻️ Duplicate comments (3)
crates/codegen/src/compile.rs (3)
539-570: Fix LIST_EXTEND/SET_UPDATE stack target (collection is on TOS).
gather_elementsleaves tuple chunks on the stack, thenBUILD_LIST/BUILD_SETpushes the collection above them.LIST_EXTEND/SET_UPDATEexpects the collection below the iterable on the stack, so this order can target a tuple instead of the list/set and fail at runtime. Please reorder stack construction or adjust the index/depth to match opcode expectations.
4436-4448: Fix LIST_EXTEND stack target for starred bases.This sequence builds the list on TOS above tuple chunks, but
LIST_EXTENDexpects the list below the iterable. That can mis-target the tuple and trigger unsafe casts. Please reorder or adjust index depth as in the earlier unpacking path.
6994-7004: *Fix LIST_EXTEND stack target when building args tuple.Here the list is pushed on top of tuple chunks, but
LIST_EXTENDexpects the list below the iterable. This can mis-target a tuple at runtime. Please reorder the stack or adjust the depth/index so the list is below the iterable.
a35f3bb to
d8b46b1
Compare
|
Code has been automatically formatted The code in this PR has been formatted using:
|
Remove RustPython-specific opcodes (BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples) and replace their usage with CPython 3.14 standard opcode sequences: - BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop - BuildSetFromTuples → BUILD_SET + SET_UPDATE loop - BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple) - BuildMapForCall → DICT_MERGE loop Implement missing opcodes: - ListExtend: Extend list with iterable elements - SetUpdate: Add iterable elements to set - DictMerge: Merge dict with duplicate key checking
aebd863 to
6b5d0d3
Compare
* Replace custom opcodes with standard sequences Remove RustPython-specific opcodes (BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples) and replace their usage with CPython 3.14 standard opcode sequences: - BuildListFromTuples → BUILD_LIST + LIST_EXTEND loop - BuildSetFromTuples → BUILD_SET + SET_UPDATE loop - BuildTupleFromTuples → BUILD_LIST + LIST_EXTEND + CALL_INTRINSIC_1(ListToTuple) - BuildMapForCall → DICT_MERGE loop Implement missing opcodes: - ListExtend: Extend list with iterable elements - SetUpdate: Add iterable elements to set - DictMerge: Merge dict with duplicate key checking * Auto-generate: generate_opcode_metadata.py --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Remove RustPython-specific opcodes (BuildListFromTuples, BuildMapForCall, BuildSetFromTuples, BuildTupleFromTuples) and replace their usage with CPython 3.14 standard opcode sequences:
Implement missing opcodes:
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.