Skip to content

Conversation

bschoenmaeckers
Copy link
Member

No description provided.

@bschoenmaeckers bschoenmaeckers force-pushed the PyUnicodeWriter branch 3 times, most recently from f0df624 to eb46ca2 Compare June 19, 2025 13:09
@bschoenmaeckers bschoenmaeckers force-pushed the PyUnicodeWriter branch 7 times, most recently from 0897456 to 203dec2 Compare June 20, 2025 07:16
@bschoenmaeckers bschoenmaeckers force-pushed the PyUnicodeWriter branch 3 times, most recently from aea618c to c28a9e4 Compare July 14, 2025 08:34
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking so long to come to this. This API looks pretty tempting as a way to squeeze some performance out of e.g. error pathways where we need to format Python strings from error values.

It would be interesting to see benchmarks compared to the existing pattern of formatting to string and then copying into a new Python string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The choice to write compat logic using private APIs of older Python versions makes me a little uneasy. Is the performance advance so significant that we can justify it?

Comment on lines +84 to +107
#[cfg(not(any(Py_LIMITED_API, PyPy)))]
impl fmt::Write for PyUnicodeWriter {
fn write_str(&mut self, s: &str) -> fmt::Result {
let result = unsafe {
PyUnicodeWriter_WriteUTF8(self.as_ptr(), s.as_ptr().cast(), s.len() as isize)
};
if result < 0 {
self.set_error();
Err(fmt::Error)
} else {
Ok(())
}
}

fn write_char(&mut self, c: char) -> fmt::Result {
let result = unsafe { PyUnicodeWriter_WriteChar(self.as_ptr(), c as u32) };
if result < 0 {
self.set_error();
Err(fmt::Error)
} else {
Ok(())
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider any buffering on the Rust side to avoid making hotly repeated FFI calls? Possibly a follow-up.

Comment on lines +22 to +28
/// This is like the `format!` macro, but it returns a `PyString` instead of a `String`.
#[macro_export]
macro_rules! py_format {
($py: expr, $($arg:tt)*) => {
$crate::types::PyString::from_fmt($py, format_args!($($arg)*))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to decide if we need this API if we already have PyString::from_fmt. I like the macro conciseness, though in the past we've had pushback against adding too many macros.

Interesting thing to note here is that format_args! has an as_str() const fn, so I wonder if there's a crazy way that we can make this automatically intern! the value if the format string is a constant 🤔

Comment on lines +30 to +35
#[cfg(not(any(Py_LIMITED_API, PyPy)))]
/// The `PyUnicodeWriter` is a utility for efficiently constructing Python strings
pub struct PyUnicodeWriter {
writer: NonNull<ffi::PyUnicodeWriter>,
last_error: Option<PyErr>,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an advantage to having this as a public type rather than an implementation detail to PyString::from_fmt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants