Update csv.py from 3.14.3#7181
Conversation
📝 WalkthroughWalkthroughCSV parsing error messages were standardized to emphasize Unicode-character requirements for delimiter, quotechar, escapechar, and quoting parameters; lineterminator and dialect argument validations were clarified. The import list in Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/stdlib/src/csv.rs`:
- Around line 171-177: The catch-all branch for validating "quotechar" currently
hardcodes the string "not int" in the error message; update it to mirror the
escapechar handler by capturing the incoming value (attr) and using attr.class()
in the vm.new_exception_msg call so the error reports the actual type (e.g.,
"not list") instead of always "not int"; locate the branch that constructs
Err(vm.new_exception_msg(super::_csv::error(vm), ...)) for quotechar and replace
the literal with a message built from attr.class().
- Line 20: The import statement brings in an unused symbol
sliceable::SliceableSequenceOp; remove SliceableSequenceOp from the use line
that currently reads use rustpython_vm::{match_class,
sliceable::SliceableSequenceOp}; so it only imports match_class (i.e. change to
import match_class and drop sliceable::SliceableSequenceOp) to eliminate the
unused-import compiler warning.
- Around line 234-237: The error message in the fallback arm for attr currently
says `"quoting" must be string or None` but the code only accepts PyInt (see the
PyInt handling above), so update the constructed message in that match arm
(where msg is built and Err(vm.new_type_error(...)) is returned) to say
`"quoting" must be an integer or None, not {}"`, keeping the existing use of
attr.class() to show the offending type; this ensures the TypeError text matches
the PyInt expectation and CPython style.
| use parking_lot::Mutex; | ||
| use rustpython_common::lock::LazyLock; | ||
| use rustpython_vm::match_class; | ||
| use rustpython_vm::{match_class, sliceable::SliceableSequenceOp}; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify that SliceableSequenceOp is not used in csv.rs
echo "=== Searching for SliceableSequenceOp usage ==="
rg 'SliceableSequenceOp' crates/stdlib/src/csv.rs
echo ""
echo "=== Also checking if it's used in any related imports/code ==="
wc -l crates/stdlib/src/csv.rsRepository: RustPython/RustPython
Length of output: 276
Remove unused import SliceableSequenceOp from line 20.
This import is not referenced anywhere in the file and will produce a compiler warning.
- use rustpython_vm::{match_class, sliceable::SliceableSequenceOp};
+ use rustpython_vm::match_class;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| use rustpython_vm::{match_class, sliceable::SliceableSequenceOp}; | |
| use rustpython_vm::match_class; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/stdlib/src/csv.rs` at line 20, The import statement brings in an
unused symbol sliceable::SliceableSequenceOp; remove SliceableSequenceOp from
the use line that currently reads use rustpython_vm::{match_class,
sliceable::SliceableSequenceOp}; so it only imports match_class (i.e. change to
import match_class and drop sliceable::SliceableSequenceOp) to eliminate the
unused-import compiler warning.
There was a problem hiding this comment.
This trait is what brings PyStr.len into scope
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for .len() calls in the csv.rs file to verify SliceableSequenceOp usage
echo "=== Searching for .len() calls in csv.rs ==="
rg -n '\.len\(\)' crates/stdlib/src/csv.rs
echo ""
echo "=== Looking for context around s.len() specifically ==="
rg -n -C2 's\.len\(\)' crates/stdlib/src/csv.rs
Length of output: 1719
@ShaharNaveh, you're absolutely right! Thank you for the clarification.
I can confirm that PyStr.len() is indeed used on lines 144, 165, and 189 in the error messages for delimiter, quotechar, and escapechar validation. In Rust, trait imports bring their methods into scope even when the trait name itself isn't directly referenced in the code. Without the SliceableSequenceOp import, those .len() calls would fail to compile.
My suggestion to remove this import was incorrect. The import is necessary and should remain.
🐰
((
(='.')
(")_(")
🧬 Code Graph Analysis Results
[crates/stdlib/src/csv.rs] parse_delimiter_from_obj (lines around its definition)
/// Parses the delimiter from a Python object and returns its ASCII value.
///
/// This function attempts to extract the 'delimiter' attribute from the given Python object and ensures that the attribute is a single-character string. If successful, it returns the ASCII value of the character. If the attribute is not a single-character string, an error is returned.
///
/// # Arguments
///
/// * `vm` - A reference to the VirtualMachine, used for executing Python code and manipulating Python objects.
/// * `obj` - The PyObjectRef from which the 'delimiter' attribute is to be parsed.
///
/// # Returns
///
/// If successful, returns a `PyResult<u8>` representing the ASCII value of the 'delimiter' attribute. If unsuccessful, returns a `PyResult` containing an error message.
///
/// # Errors
///
/// This function can return the following errors:
///
/// * If the 'delimiter' attribute is not a single-character string, a type error is returned.
/// * If the 'obj' is not of string type and does not have a 'delimiter' attribute, a type error is returned.
fn parse_delimiter_from_obj(vm: &VirtualMachine, obj: &PyObject) -> PyResult<u8> {
if let Ok(attr) = obj.get_attr("delimiter", vm) {
parse_delimiter_from_obj(vm, &attr)
} else {
match_class!(match obj.to_owned() {
s @ PyStr => {
Ok(s.as_str().bytes().exactly_one().map_err(|_| {
vm.new_type_error(format!(
r#""delimiter" must be a unicode character, not a string of length {}"#,
s.len()
))
})?)
}
attr => {
let msg = format!(
r#""delimiter" must be a unicode character, not {}"#,
attr.class()
);
Err(vm.new_type_error(msg))
}
})
}
}
[crates/stdlib/src/csv.rs] parse_quotechar_from_obj (lines around its definition)
fn parse_quotechar_from_obj(vm: &VirtualMachine, obj: &PyObject) -> PyResult<Option<u8>> {
match_class!(match obj.get_attr("quotechar", vm)? {
s @ PyStr => {
Ok(Some(s.as_str().bytes().exactly_one().map_err(|_| {
vm.new_exception_msg(
super::_csv::error(vm),
format!(r#""quotechar" must be a unicode character or None, not a string of length {}"#, s.len()),
)
})?))
}
_n @ PyNone => {
Ok(None)
}
attr => {
Err(vm.new_exception_msg(
super::_csv::error(vm),
format!(
r#""quotechar" must be a unicode character or None, not {}"#,
attr.class()
),
))
}
})
}
[crates/stdlib/src/csv.rs] prase_lineterminator_from_obj (lines around its definition)
fn prase_lineterminator_from_obj(vm: &VirtualMachine, obj: &PyObject) -> PyResult<Terminator> {
match_class!(match obj.get_attr("lineterminator", vm)? {
s @ PyStr => {
Ok(if s.as_bytes().eq(b"\r\n") {
csv_core::Terminator::CRLF
} else if let Some(t) = s.as_bytes().first() {
// Due to limitations in the current implementation within csv_core
// the support for multiple characters in lineterminator is not complete.
// only capture the first character
csv_core::Terminator::Any(*t)
} else {
return Err(vm.new_exception_msg(
super::_csv::error(vm),
r#""lineterminator" must be a string"#.to_owned(),
));
})
}
attr => {
Err(vm.new_type_error(format!(
r#""lineterminator" must be a string, not {}"#,
attr.class()
)))
}
})
}
[crates/stdlib/src/csv.rs] prase_quoting_from_obj (lines around its definition)
fn prase_quoting_from_obj(vm: &VirtualMachine, obj: &PyObject) -> PyResult<QuoteStyle> {
match_class!(match obj.get_attr("quoting", vm)? {
i @ PyInt => {
Ok(i.try_to_primitive::<isize>(vm)?.try_into().map_err(|_e| {
let msg = r#"bad "quoting" value"#;
vm.new_type_error(msg.to_owned())
})?)
}
attr => {
let msg = format!(r#""quoting" must be string or None, not {}"#, attr.class());
Err(vm.new_type_error(msg.to_owned()))
}
})
}
[crates/stdlib/src/csv.rs] prase_dialect_item_from_arg (lines around its definition)
fn prase_dialect_item_from_arg(
vm: &VirtualMachine,
obj: PyObjectRef,
) -> Result<DialectItem, ArgumentError> {
match_class!(match obj {
s @ PyStr => {
Ok(DialectItem::Str(s.as_str().to_string()))
}
PyNone => {
Err(ArgumentError::InvalidKeywordArgument("dialect".to_string()))
}
t @ PyType => {
let temp = t
.as_object()
.call(vec![], vm)
.map_err(|_e| ArgumentError::InvalidKeywordArgument("dialect".to_string()))?;
Ok(DialectItem::Obj(
PyDialect::try_from_object(vm, temp).map_err(|_| {
ArgumentError::InvalidKeywordArgument("dialect".to_string())
})?,
))
}
obj => {
if let Ok(cur_dialect_item) = PyDialect::try_from_object(vm, obj) {
Ok(DialectItem::Obj(cur_dialect_item))
} else {
let msg = "dialect".to_string();
Err(ArgumentError::InvalidKeywordArgument(msg))
}
}
})
}
[crates/stdlib/src/csv.rs] PyDialect::try_from_object usage context (via related parsing)
impl TryFromObject for PyDialect {
fn try_from_object(vm: &VirtualMachine, obj: PyObjectRef) -> PyResult<Self> {
let delimiter = parse_delimiter_from_obj(vm, &obj)?;
let quotechar = parse_quotechar_from_obj(vm, &obj)?;
let escapechar = parse_escapechar_from_obj(vm, &obj)?;
let doublequote = obj.get_attr("doublequote", vm)?.try_to_bool(vm)?;
let skipinitialspace = obj.get_attr("skipinitialspace", vm)?.try_to_bool(vm)?;
let lineterminator = prase_lineterminator_from_obj(vm, &obj)?;
let quoting = prase_quoting_from_obj(vm, &obj)?;
let strict = if let Ok(t) = obj.get_attr("strict", vm) {
t.try_to_bool(vm).unwrap_or(false)
} else {
false
};
Ok(Self {
delimiter,
quotechar,
escapechar,
doublequote,
skipinitialspace,
lineterminator,
quoting,
strict,
})
}
}
[crates/stdlib/src/csv.rs] Examples of trait/conversion behavior informing PyStr.len scope
fn parse_delimiter_from_obj(vm: &VirtualMachine, obj: &PyObject) -> PyResult<u8> { ... }
Note: The above snippets illustrate how PyStr length checks (e.g., ensuring a single-character string via .bytes().exactly_one() and using .len()) factor into delimiter/quotechar handling in this csv module, which is relevant to understanding how PyStr.len contributes to scope in this code path.
✏️ Learnings added
Learnt from: ShaharNaveh
Repo: RustPython/RustPython PR: 7181
File: crates/stdlib/src/csv.rs:20-20
Timestamp: 2026-02-17T07:54:36.200Z
Learning: In RustPython's codebase (especially in `**/*.rs`), the `SliceableSequenceOp` trait import from `rustpython_vm::sliceable` is necessary to bring methods like `.len()` into scope for `PyStr`, even if the trait name itself is not directly referenced in the code.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-06T05:30:53.056Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-06T00:40:12.400Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
| attr => { | ||
| let msg = format!("\"quoting\" must be string or None, not {}", attr.class()); | ||
| let msg = format!(r#""quoting" must be string or None, not {}"#, attr.class()); | ||
| Err(vm.new_type_error(msg.to_owned())) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
CPython csv module quoting parameter type validation error message
💡 Result:
In CPython’s csv module, the quoting parameter is validated as an integer (one of the csv.QUOTE_* constants):
-
If you pass a non-integer (e.g., a string like
"csv.QUOTE_NONE"), you get:
TypeError: "quoting" must be an integer. [1], [3] -
If you pass an integer that’s not a valid quoting enum value (e.g.,
quoting=99), you get:
TypeError: bad "quoting" value. [2]
Sources: [1] [2] [3]
🏁 Script executed:
cd crates/stdlib/src && head -n 250 csv.rs | tail -n 50Repository: RustPython/RustPython
Length of output: 2536
Error message says "quoting" must be string or None but quoting expects an integer.
The function accepts only PyInt (line 228), so the fallthrough error should say "quoting" must be an integer, not {class} to match CPython's actual error message for this parameter.
- let msg = format!(r#""quoting" must be string or None, not {}"#, attr.class());
+ let msg = format!(r#""quoting" must be an integer, not {}"#, attr.class());📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| attr => { | |
| let msg = format!("\"quoting\" must be string or None, not {}", attr.class()); | |
| let msg = format!(r#""quoting" must be string or None, not {}"#, attr.class()); | |
| Err(vm.new_type_error(msg.to_owned())) | |
| } | |
| attr => { | |
| let msg = format!(r#""quoting" must be an integer, not {}"#, attr.class()); | |
| Err(vm.new_type_error(msg.to_owned())) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/stdlib/src/csv.rs` around lines 234 - 237, The error message in the
fallback arm for attr currently says `"quoting" must be string or None` but the
code only accepts PyInt (see the PyInt handling above), so update the
constructed message in that match arm (where msg is built and
Err(vm.new_type_error(...)) is returned) to say `"quoting" must be an integer or
None, not {}"`, keeping the existing use of attr.class() to show the offending
type; this ensures the TypeError text matches the PyInt expectation and CPython
style.
There was a problem hiding this comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/stdlib/src/csv.rs`:
- Around line 238-241: The error message in the match arm that handles
unexpected types for quoting is wrong: it says `"quoting" must be string or
None` but quoting accepts a PyInt (see the branch handling PyInt around the
quoting parsing code), so update the error message in the attr => arm to:
`"quoting" must be an integer, not {}"`, keeping the same use of attr.class()
and vm.new_type_error to construct and return the TypeError (i.e., change the
format! string around the attr.class() call in that match arm).
- Line 20: Remove the unused import SliceableSequenceOp from the rustpython_vm
import list in the csv module; update the use statement to only import the
symbols actually referenced (e.g., keep match_class) so the import line no
longer includes SliceableSequenceOp and the compiler warning is resolved.
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/csv.py dependencies:
dependent tests: (2 tests)
Legend:
|
Summary by CodeRabbit