Skip to content

Object header slimming: prefix allocation for ObjExt#7334

Merged
youknowone merged 6 commits intoRustPython:mainfrom
youknowone:worktree-obj-header-slim
Mar 5, 2026
Merged

Object header slimming: prefix allocation for ObjExt#7334
youknowone merged 6 commits intoRustPython:mainfrom
youknowone:worktree-obj-header-slim

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Mar 3, 2026

Extract dict, weak_list, and slots fields from PyInner into a separate ObjExt struct allocated as a prefix before PyInner using Layout::extend(). Objects that don't need these fields (int, str, float, list, tuple, dict, etc.) skip the prefix entirely.

  • Add HAS_WEAKREF flag to PyTypeFlags for per-type weakref control
  • Add HAS_EXT bit to GcBits indicating prefix presence
  • Define ObjExt struct with dict, weak_list, slots
  • Shrink PyInner header from ~80-88 bytes to ~32 bytes for lightweight objects
  • Update all accessor methods to go through ext_ref()
  • Update bootstrap type hierarchy to use prefix allocation
  • Add weakref getset descriptor for heap types
  • Set HAS_WEAKREF on builtin types that support weak references
  • Remove test_weak_keyed_bad_delitem expectedFailure (now passes)

Summary by CodeRabbit

  • New Features
    • Broad weak-reference support added to many built-in types (functions, methods, modules, generators/coroutines/async generators, collections, I/O types, memoryview, sets, properties, type objects and type-parameter classes), enabling creation of weak refs to those instances without changing runtime behavior.
  • Chores
    • Internal object layout and type initialization updated to consistently support weak-reference metadata.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduce HAS_WEAKREF across many pyclass declarations; add a prefixed ObjExt (dict, weak_list, slots) to object layout and update allocation/deallocation, type construction, weakref descriptor injection, and GC/traversal to use the ObjExt prefix. (36 words)

Changes

Cohort / File(s) Summary
Core object layout & traversal
crates/vm/src/object/core.rs, crates/vm/src/object/traverse_object.rs
Add ObjExt prefix (dict, weak_list, slots), EXT_OFFSET, and ext_ref() accessors; change allocation/deallocation to support prefixed objects; make traversal/GC conditional on ObjExt presence; update weakref/dict/slot accessors to use ext_ref.
Type system & slot handling
crates/vm/src/builtins/type.rs, crates/vm/src/types/slot.rs
Introduce HAS_WEAKREF PyType flag; propagate weakref capability from bases/MRO and slots; add weakref descriptor injection and related subtype_get/set_weakref; annotate PyType with HAS_WEAKREF; adjust slot validation and creation paths.
Builtin pyclass flags (many files)
crates/vm/src/builtins/...
asyncgenerator.rs, builtin_func.rs, classmethod.rs, code.rs, coroutine.rs, function.rs, generator.rs, genericalias.rs, memory.rs, module.rs, namespace.rs, property.rs, set.rs, staticmethod.rs, union.rs
Add HAS_WEAKREF to numerous builtin pyclass flags(...) annotations — metadata-only changes for many builtins; verify implications for allocation/ObjExt.
Stdlib pyclass flags (many files)
crates/stdlib/src/*, crates/vm/src/stdlib/*
_asyncio.rs, _sqlite3.rs, array.rs, mmap.rs, re.rs, io.rs, collections.rs, functools.rs, itertools.rs, sre.rs, thread.rs, typevar.rs
Add HAS_WEAKREF to many stdlib pyclass annotations (IO types, sqlite3, array, mmap, regex patterns, threading locks, typing-related types, deque, partial, itertools, etc.). Mostly annotation updates without behavior changes.
Stdlib IO (wide edits)
crates/vm/src/stdlib/io.rs
Extensive addition of HAS_WEAKREF to many IO-related pyclass flags (TextIOWrapper, FileIO, Buffered*, StringIO/BytesIO, etc.); metadata-only edits but intersect allocation/ObjExt behavior.
Type flags definition
crates/vm/src/types/slot.rs
Add new bitflag HAS_WEAKREF = 1 << 41 to PyTypeFlags.

Sequence Diagram

sequenceDiagram
    participant Py as Python code
    participant TypeSys as PyType constructor / type system
    participant MRO as Base types / MRO
    participant Alloc as VM allocator (alloc_type_with_ext / ObjExt)
    participant Obj as Instance
    participant GC as Garbage Collector

    Py->>TypeSys: define/prepare type (may include __slots__)
    TypeSys->>MRO: query bases for HAS_DICT / HAS_WEAKREF
    MRO-->>TypeSys: return inherited flags
    TypeSys->>Alloc: create PyType (set HAS_WEAKREF, install __weakref__ descriptor if needed)
    Py->>Obj: instantiate object of PyType
    Obj->>Alloc: allocate instance (with ObjExt prefix if HAS_WEAKREF/HAS_DICT)
    Alloc-->>Obj: initialize ObjExt (dict, weak_list, slots)
    Py->>Obj: create weakref to instance
    Obj->>Obj: weak_list updated via ext_ref()
    GC->>Obj: traverse ext_ref() fields when present
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested reviewers

  • arihant2math
  • ShaharNaveh

"I am a rabbit in allocation's glen,
I hop where ObjExt nests dicts again —
weakref drifts softly, slots find their place,
Types inherit kindness, objects gain space.
A tiny hop for GC, a joyful trace." 🐇

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Object header slimming: prefix allocation for ObjExt' clearly and specifically describes the main change—implementing prefix allocation (ObjExt) to reduce object header size.
Docstring Coverage ✅ Passed Docstring coverage is 98.44% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] lib: cpython/Lib/dataclasses.py
[x] test: cpython/Lib/test/test_dataclasses (TODO: 4)

dependencies:

  • dataclasses

dependent tests: (90 tests)

  • dataclasses: test__colorize test_copy test_ctypes test_enum test_genericalias test_patma test_pprint test_pydoc test_regrtest test_typing test_zoneinfo
    • pprint: test_htmlparser test_ssl test_sys_setprofile test_unittest
      • pickle: test_annotationlib test_array test_ast test_asyncio test_bool test_builtin test_bytes test_bz2 test_codecs test_collections test_concurrent_futures test_configparser test_coroutines test_csv test_ctypes test_decimal test_defaultdict test_deque test_descr test_dict test_dictviews test_email test_enumerate test_exceptions test_fractions test_functools test_generators test_http_cookies test_importlib test_inspect test_io test_ipaddress test_iter test_itertools test_list test_logging test_lzma test_memoryio test_memoryview test_minidom test_opcache test_operator test_ordered_dict test_os test_pathlib test_pickle test_picklebuffer test_pickletools test_platform test_plistlib test_positional_only_arg test_posix test_random test_range test_re test_set test_shelve test_slice test_socket test_statistics test_str test_string test_time test_trace test_tuple test_type_aliases test_type_params test_types test_unittest test_uuid test_xml_dom_minicompat test_xml_etree test_zipfile test_zlib test_zoneinfo

[ ] test: cpython/Lib/test/test_descr.py (TODO: 44)
[ ] test: cpython/Lib/test/test_descrtut.py (TODO: 3)

dependencies:

dependent tests: (no tests depend on descr)

[x] lib: cpython/Lib/weakref.py
[x] lib: cpython/Lib/_weakrefset.py
[x] test: cpython/Lib/test/test_weakref.py (TODO: 20)
[ ] test: cpython/Lib/test/test_weakset.py

dependencies:

  • weakref

dependent tests: (203 tests)

  • weakref: test_array test_ast test_asyncio test_code test_concurrent_futures test_context test_contextlib test_copy test_ctypes test_deque test_descr test_dict test_enum test_exceptions test_file test_fileio test_frame test_functools test_gc test_generators test_genericalias test_importlib test_inspect test_io test_ipaddress test_itertools test_logging test_memoryio test_memoryview test_mmap test_ordered_dict test_pickle test_picklebuffer test_queue test_re test_scope test_set test_slice test_socket test_sqlite3 test_ssl test_struct test_sys test_tempfile test_thread test_threading test_threading_local test_type_params test_types test_typing test_unittest test_uuid test_weakref test_weakset test_xml_etree
    • asyncio: test_asyncio test_contextlib_async test_os test_sys_settrace test_unittest
    • bdb: test_bdb
    • concurrent.futures.process: test_compileall test_concurrent_futures
    • copy: test_bytes test_codecs test_collections test_copyreg test_coroutines test_csv test_decimal test_defaultdict test_dictviews test_email test_fractions test_http_cookies test_minidom test_opcache test_optparse test_platform test_plistlib test_posix test_site test_statistics test_sysconfig test_tomllib test_urllib2 test_xml_dom_minicompat test_zlib
      • argparse: test_argparse
      • collections: test_annotationlib test_bisect test_builtin test_c_locale_coercion test_call test_configparser test_contains test_ctypes test_exception_group test_fileinput test_funcattrs test_hash test_httpservers test_iter test_iterlen test_json test_math test_monitoring test_pathlib test_patma test_pprint test_pydoc test_random test_reprlib test_richcmp test_shelve test_sqlite3 test_string test_traceback test_tuple test_urllib test_userdict test_userlist test_userstring test_with
      • dataclasses: test__colorize test_ctypes test_regrtest test_zoneinfo
      • email.generator: test_email
      • gettext: test_gettext test_tools
      • http.cookiejar: test_http_cookiejar
      • http.server: test_robotparser test_urllib2_localnet test_xmlrpc
      • logging.handlers: test_pkgutil
      • mailbox: test_mailbox
      • smtplib: test_smtplib test_smtpnet
      • tarfile: test_shutil test_tarfile
      • webbrowser: test_webbrowser
    • inspect: test_abc test_asyncgen test_buffer test_grammar test_ntpath test_operator test_posixpath test_signal test_type_annotations test_yield_from test_zipimport
      • ast: test_compile test_compiler_codegen test_dis test_fstring test_future_stmt test_type_comments test_ucn test_unparse
      • cmd: test_cmd
      • importlib.metadata: test_importlib
      • pkgutil: test_runpy
      • rlcompleter: test_rlcompleter
      • trace: test_trace
    • logging: test_hashlib test_support test_urllib2net
      • hashlib: test_hmac test_unicodedata
      • multiprocessing.util: test_concurrent_futures
      • venv: test_venv
    • multiprocessing: test_fcntl test_multiprocessing_main_handling
    • symtable: test_symtable
    • tempfile: test_bz2 test_cmd_line test_ctypes test_doctest test_ensurepip test_faulthandler test_filecmp test_importlib test_launcher test_linecache test_pkg test_py_compile test_selectors test_string_literals test_subprocess test_tabnanny test_termios test_threadedtempfile test_urllib_response test_winconsoleio test_zipapp test_zipfile test_zipfile64 test_zstd
      • ctypes.util: test_ctypes
      • urllib.request: test_sax test_urllibnet

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@youknowone youknowone force-pushed the worktree-obj-header-slim branch from c99a5ae to 8d9a54c Compare March 3, 2026 20:40
@youknowone youknowone marked this pull request as ready for review March 4, 2026 04:51
@youknowone youknowone force-pushed the worktree-obj-header-slim branch from 8d9a54c to dc0dec0 Compare March 4, 2026 04:55
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/object/core.rs (1)

281-318: ⚠️ Potential issue | 🟠 Major

Use Layout::extend to compute ObjExt back-offset instead of a fixed constant

The allocation code correctly uses Layout::extend to compute inner_offset when allocating ObjExt + PyInner. However, ext_ref() uses a hard-coded EXT_OFFSET constant based on size_of::<ObjExt>(), creating a mismatch. While the current code works because ObjExt has #[repr(C, align(8))], this dependency on implementation details is fragile. If the alignment of PyInner<T> changes (e.g., from a future generic payload with higher alignment), Layout::extend will add padding between ObjExt and PyInner, making the constant offset incorrect and causing undefined behavior in ext_ref().

Compute the offset dynamically in ext_ref() to stay in sync with the allocation logic.

Proposed fix
-const EXT_OFFSET: usize = core::mem::size_of::<ObjExt>();

 impl<T> PyInner<T> {
     /// Access the ObjExt prefix at a negative offset from this PyInner.
     pub(super) fn ext_ref(&self) -> Option<&ObjExt> {
         if !GcBits::from_bits_retain(self.gc_bits.load(Ordering::Relaxed)).contains(GcBits::HAS_EXT) {
             return None;
         }
+        let (_, inner_offset) = core::alloc::Layout::new::<ObjExt>()
+            .extend(core::alloc::Layout::new::<Self>())
+            .expect("ObjExt + PyInner layout overflow");
         let self_addr = (self as *const Self as *const u8).addr();
-        let ext_ptr = core::ptr::with_exposed_provenance::<ObjExt>(self_addr.wrapping_sub(EXT_OFFSET));
+        let ext_ptr = core::ptr::with_exposed_provenance::<ObjExt>(self_addr.wrapping_sub(inner_offset));
         Some(unsafe { &*ext_ptr })
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/object/core.rs` around lines 281 - 318, ext_ref() currently
uses the hard-coded EXT_OFFSET (size_of::<ObjExt>()) which can be wrong if
Layout::extend inserted padding during allocation; change ext_ref() to recompute
the true back-offset using the same allocation logic by calling
Layout::new::<ObjExt>().extend(Layout::new::<Self>()) (or equivalent) to obtain
the inner offset used when allocating ObjExt+PyInner, then use that computed
offset instead of EXT_OFFSET when calling with_exposed_provenance; reference
ext_ref(), EXT_OFFSET, ObjExt, PyInner, and with_exposed_provenance in your
changes and handle the Result/Err from Layout::extend appropriately (unwrap or
early-return) so the computed offset stays in sync with allocation.
🧹 Nitpick comments (2)
crates/vm/src/builtins/type.rs (1)

1889-1927: Extract shared descriptor-injection logic for __dict__/__weakref__.

The two branches are nearly identical (presence checks + inherited check + getset insertion). A small helper would reduce drift risk and improve maintainability.

As per coding guidelines: "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/builtins/type.rs` around lines 1889 - 1927, The two blocks that
inject a getset descriptor for "__dict__" and "__weakref__" duplicate the same
existence/inheritance checks and insertion logic; extract a small helper (e.g.,
inject_descriptor_if_missing) that takes the VM/context (vm or vm.ctx), the type
object (typ), the interned name (or name string), the getter and setter
functions (subtype_get_dict/subtype_set_dict or
subtype_get_weakref/subtype_set_weakref) and performs: check
typ.slots.flags.has_feature for the relevant PyTypeFlags value, check inherited
presence via typ.mro.read().iter().any(...), check
typ.attributes.read().contains_key(...), and if absent create the descriptor via
vm.ctx.new_getset and insert it into typ.attributes.write().insert(...); then
replace both branches with calls to this helper using the appropriate name,
getter and setter and flag test (PyTypeFlags::HAS_DICT /
PyTypeFlags::HAS_WEAKREF) to remove duplication while preserving behavior.
crates/vm/src/object/core.rs (1)

2123-2147: Extract common bootstrap-type initialization to reduce unsafe duplication

Line 2123-2147 repeats nearly identical unsafe initialization for type_type_ptr and object_type_ptr with only payload differences. A tiny helper would reduce drift risk in this memory-sensitive path.

As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/object/core.rs` around lines 2123 - 2147, The two unsafe blocks
initializing type_type_ptr and object_type_ptr are duplicated; create a small
helper (e.g., init_type_with_ext or write_type_ptr) that calls
alloc_type_with_ext(), takes the differing payload (type_payload or
object_payload) and the Uninit { typ } marker, performs the unsafe
partially_init! write of a PyInner::<PyType> (setting ref_count, vtable via
PyObjVTable::of::<PyType>(), gc_bits, and payload) and returns the allocated
pointer; then replace the duplicated unsafe blocks by calling this helper twice
with the appropriate payloads (type_payload and object_payload) to centralize
the unsafe logic and avoid drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/object/core.rs`:
- Around line 1061-1074: In get_weakrefs(), you must acquire the weakref list's
stripe lock before dereferencing head_ptr to avoid racing with
add/clear/drop_inner; after obtaining wrl via weak_ref_list(), compute the
stripe index for head_ptr (or use the existing stripe-lock API on the weakref
list), lock that stripe, re-read head_ptr and only then dereference into head
and attempt the ref_count.safe_inc(), then release the stripe lock; update the
function to use the weakref list's stripe locking primitives around the head_ptr
load/dereference (referencing get_weakrefs, weak_ref_list, head_ptr, and head).

---

Outside diff comments:
In `@crates/vm/src/object/core.rs`:
- Around line 281-318: ext_ref() currently uses the hard-coded EXT_OFFSET
(size_of::<ObjExt>()) which can be wrong if Layout::extend inserted padding
during allocation; change ext_ref() to recompute the true back-offset using the
same allocation logic by calling
Layout::new::<ObjExt>().extend(Layout::new::<Self>()) (or equivalent) to obtain
the inner offset used when allocating ObjExt+PyInner, then use that computed
offset instead of EXT_OFFSET when calling with_exposed_provenance; reference
ext_ref(), EXT_OFFSET, ObjExt, PyInner, and with_exposed_provenance in your
changes and handle the Result/Err from Layout::extend appropriately (unwrap or
early-return) so the computed offset stays in sync with allocation.

---

Nitpick comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 1889-1927: The two blocks that inject a getset descriptor for
"__dict__" and "__weakref__" duplicate the same existence/inheritance checks and
insertion logic; extract a small helper (e.g., inject_descriptor_if_missing)
that takes the VM/context (vm or vm.ctx), the type object (typ), the interned
name (or name string), the getter and setter functions
(subtype_get_dict/subtype_set_dict or subtype_get_weakref/subtype_set_weakref)
and performs: check typ.slots.flags.has_feature for the relevant PyTypeFlags
value, check inherited presence via typ.mro.read().iter().any(...), check
typ.attributes.read().contains_key(...), and if absent create the descriptor via
vm.ctx.new_getset and insert it into typ.attributes.write().insert(...); then
replace both branches with calls to this helper using the appropriate name,
getter and setter and flag test (PyTypeFlags::HAS_DICT /
PyTypeFlags::HAS_WEAKREF) to remove duplication while preserving behavior.

In `@crates/vm/src/object/core.rs`:
- Around line 2123-2147: The two unsafe blocks initializing type_type_ptr and
object_type_ptr are duplicated; create a small helper (e.g., init_type_with_ext
or write_type_ptr) that calls alloc_type_with_ext(), takes the differing payload
(type_payload or object_payload) and the Uninit { typ } marker, performs the
unsafe partially_init! write of a PyInner::<PyType> (setting ref_count, vtable
via PyObjVTable::of::<PyType>(), gc_bits, and payload) and returns the allocated
pointer; then replace the duplicated unsafe blocks by calling this helper twice
with the appropriate payloads (type_payload and object_payload) to centralize
the unsafe logic and avoid drift.

ℹ️ Review info
Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: f9b2f09a-8121-4596-9e26-58a6f2a80429

📥 Commits

Reviewing files that changed from the base of the PR and between bc108df and 8d9a54c.

⛔ Files ignored due to path filters (3)
  • Lib/test/test_dataclasses/__init__.py is excluded by !Lib/**
  • Lib/test/test_descr.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
📒 Files selected for processing (31)
  • crates/stdlib/src/_asyncio.rs
  • crates/stdlib/src/_sqlite3.rs
  • crates/stdlib/src/array.rs
  • crates/stdlib/src/mmap.rs
  • crates/stdlib/src/re.rs
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/builtins/builtin_func.rs
  • crates/vm/src/builtins/classmethod.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/builtins/coroutine.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/generator.rs
  • crates/vm/src/builtins/genericalias.rs
  • crates/vm/src/builtins/memory.rs
  • crates/vm/src/builtins/module.rs
  • crates/vm/src/builtins/namespace.rs
  • crates/vm/src/builtins/property.rs
  • crates/vm/src/builtins/set.rs
  • crates/vm/src/builtins/staticmethod.rs
  • crates/vm/src/builtins/type.rs
  • crates/vm/src/builtins/union.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/traverse_object.rs
  • crates/vm/src/stdlib/collections.rs
  • crates/vm/src/stdlib/functools.rs
  • crates/vm/src/stdlib/io.rs
  • crates/vm/src/stdlib/itertools.rs
  • crates/vm/src/stdlib/sre.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/stdlib/typevar.rs
  • crates/vm/src/types/slot.rs

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/vm/src/object/core.rs (1)

1062-1070: ⚠️ Potential issue | 🔴 Critical

Take the weakref stripe lock before dereferencing head_ptr.

Line 1064 loads and Line 1068 dereferences wrl.head without synchronization, which can race with list mutation/unlink and read freed memory.

🔒 Suggested fix
 pub(crate) fn get_weakrefs(&self) -> Option<PyObjectRef> {
     let wrl = self.weak_ref_list()?;
+    let _lock = weakref_lock::lock(self as *const PyObject as usize);
     let head_ptr = wrl.head.load(Ordering::Relaxed);
     if head_ptr.is_null() {
         None
     } else {
         let head = unsafe { &*head_ptr };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/object/core.rs` around lines 1062 - 1070, get_weakrefs
currently loads wrl.head into head_ptr and then dereferences it without
synchronization, risking a race with list mutation/unlink; fix by acquiring the
weakref stripe lock used for list mutations before reading/dereferencing
wrl.head (use the same stripe-lock API on the wrl returned by weak_ref_list()),
re-check head_ptr/is_null after acquiring the lock, and only then call unsafe {
&*head_ptr } and PyRef::from_raw(head_ptr); ensure the lock is released after
these checks so you hold the same lock that protects unlinking/mutation during
the deref.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/object/core.rs`:
- Around line 281-285: The const EXT_OFFSET calculated as size_of::<ObjExt>() is
unsafe for types where PyInner<T> requires greater alignment; change ext_ref()
to compute the prefix offset dynamically using
core::alloc::Layout::new::<ObjExt>() and Layout::new::<PyInner<T>>() with
Layout::extend() (matching the allocation/deallocation code) instead of using
EXT_OFFSET, so pointer arithmetic from PyInner<T> back to ObjExt uses the
extended layout.offset() result; update references in ext_ref() to use that
dynamic offset and keep ObjExt, PyInner<T>, and allocation/deallocation logic
consistent.

---

Duplicate comments:
In `@crates/vm/src/object/core.rs`:
- Around line 1062-1070: get_weakrefs currently loads wrl.head into head_ptr and
then dereferences it without synchronization, risking a race with list
mutation/unlink; fix by acquiring the weakref stripe lock used for list
mutations before reading/dereferencing wrl.head (use the same stripe-lock API on
the wrl returned by weak_ref_list()), re-check head_ptr/is_null after acquiring
the lock, and only then call unsafe { &*head_ptr } and
PyRef::from_raw(head_ptr); ensure the lock is released after these checks so you
hold the same lock that protects unlinking/mutation during the deref.

ℹ️ Review info
Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 0080d147-cf66-4742-a65f-8a6ee6476723

📥 Commits

Reviewing files that changed from the base of the PR and between 8d9a54c and dc0dec0.

⛔ Files ignored due to path filters (3)
  • Lib/test/test_dataclasses/__init__.py is excluded by !Lib/**
  • Lib/test/test_descr.py is excluded by !Lib/**
  • Lib/test/test_weakref.py is excluded by !Lib/**
📒 Files selected for processing (31)
  • crates/stdlib/src/_asyncio.rs
  • crates/stdlib/src/_sqlite3.rs
  • crates/stdlib/src/array.rs
  • crates/stdlib/src/mmap.rs
  • crates/stdlib/src/re.rs
  • crates/vm/src/builtins/asyncgenerator.rs
  • crates/vm/src/builtins/builtin_func.rs
  • crates/vm/src/builtins/classmethod.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/builtins/coroutine.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/generator.rs
  • crates/vm/src/builtins/genericalias.rs
  • crates/vm/src/builtins/memory.rs
  • crates/vm/src/builtins/module.rs
  • crates/vm/src/builtins/namespace.rs
  • crates/vm/src/builtins/property.rs
  • crates/vm/src/builtins/set.rs
  • crates/vm/src/builtins/staticmethod.rs
  • crates/vm/src/builtins/type.rs
  • crates/vm/src/builtins/union.rs
  • crates/vm/src/object/core.rs
  • crates/vm/src/object/traverse_object.rs
  • crates/vm/src/stdlib/collections.rs
  • crates/vm/src/stdlib/functools.rs
  • crates/vm/src/stdlib/io.rs
  • crates/vm/src/stdlib/itertools.rs
  • crates/vm/src/stdlib/sre.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/vm/src/stdlib/typevar.rs
  • crates/vm/src/types/slot.rs
🚧 Files skipped from review as they are similar to previous changes (17)
  • crates/vm/src/object/traverse_object.rs
  • crates/stdlib/src/_sqlite3.rs
  • crates/vm/src/stdlib/collections.rs
  • crates/vm/src/builtins/memory.rs
  • crates/stdlib/src/_asyncio.rs
  • crates/vm/src/types/slot.rs
  • crates/vm/src/stdlib/functools.rs
  • crates/vm/src/builtins/staticmethod.rs
  • crates/vm/src/builtins/property.rs
  • crates/vm/src/builtins/set.rs
  • crates/vm/src/stdlib/thread.rs
  • crates/stdlib/src/mmap.rs
  • crates/vm/src/builtins/function.rs
  • crates/stdlib/src/array.rs
  • crates/vm/src/builtins/code.rs
  • crates/vm/src/stdlib/typevar.rs
  • crates/vm/src/builtins/generator.rs

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/vm/src/object/core.rs (1)

281-323: ⚠️ Potential issue | 🟠 Major

Derive ObjExt offset from Layout::extend inside ext_ref() instead of hardcoding EXT_OFFSET.

Line 285 assumes zero inter-padding universally, but Line 289 only proves alignment against PyInner<()>. If any PyInner<T> has stricter alignment, ext_ref() can compute the wrong prefix address.

Suggested fix
-const EXT_OFFSET: usize = core::mem::size_of::<ObjExt>();
-// Guarantee: ObjExt size is a multiple of its alignment, and its alignment
-// is >= any PyInner alignment, so Layout::extend produces no inter-padding.
-const _: () = assert!(core::mem::size_of::<ObjExt>() % core::mem::align_of::<ObjExt>() == 0);
-const _: () = assert!(core::mem::align_of::<ObjExt>() >= core::mem::align_of::<PyInner<()>>());
+const _: () = assert!(core::mem::size_of::<ObjExt>() % core::mem::align_of::<ObjExt>() == 0);

 impl<T> PyInner<T> {
@@
     pub(super) fn ext_ref(&self) -> Option<&ObjExt> {
         if !GcBits::from_bits_retain(self.gc_bits.load(Ordering::Relaxed)).contains(GcBits::HAS_EXT)
         {
             return None;
         }
+        let inner_layout = core::alloc::Layout::new::<Self>();
+        let ext_layout = core::alloc::Layout::new::<ObjExt>();
+        let (_, inner_offset) = ext_layout
+            .extend(inner_layout)
+            .expect("ObjExt + PyInner layout overflow");
         let self_addr = (self as *const Self as *const u8).addr();
-        let ext_ptr =
-            core::ptr::with_exposed_provenance::<ObjExt>(self_addr.wrapping_sub(EXT_OFFSET));
+        let ext_ptr = core::ptr::with_exposed_provenance::<ObjExt>(
+            self_addr.wrapping_sub(inner_offset),
+        );
         Some(unsafe { &*ext_ptr })
     }
 }
#!/bin/bash
set -euo pipefail

echo "== Potential >8-byte alignment payload indicators =="
rg -nP --type rust 'repr\s*\(\s*align\((?:1[6-9]|[2-9][0-9])\)\s*\)|\b(u128|i128|__m\d+)\b' crates/vm/src -C1 || true

echo
echo "== PyPayload impls to audit for effective alignment =="
rg -nP --type rust 'impl\s+.*\bPyPayload\b\s+for\s+' crates/vm/src -C1

echo
echo "== Current ext-offset derivation sites =="
rg -nP --type rust 'EXT_OFFSET|with_exposed_provenance|Layout::extend' crates/vm/src/object/core.rs -C2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/object/core.rs` around lines 281 - 323, The ext_ref() method
must compute the ObjExt prefix offset using Layout::extend at runtime for the
actual PyInner<T> instead of using the hardcoded EXT_OFFSET; change ext_ref() to
compute let ( _objext_layout, offset) =
core::alloc::Layout::new::<ObjExt>().extend(core::alloc::Layout::new::<PyInner<T>>()).unwrap();
then use self_addr.wrapping_sub(offset) with
core::ptr::with_exposed_provenance::<ObjExt>(...) to get ext_ptr; update
references to EXT_OFFSET in ext_ref() (and remove or keep the const as unused)
so alignment/padding for the concrete T is respected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/vm/src/object/core.rs`:
- Around line 281-323: The ext_ref() method must compute the ObjExt prefix
offset using Layout::extend at runtime for the actual PyInner<T> instead of
using the hardcoded EXT_OFFSET; change ext_ref() to compute let (
_objext_layout, offset) =
core::alloc::Layout::new::<ObjExt>().extend(core::alloc::Layout::new::<PyInner<T>>()).unwrap();
then use self_addr.wrapping_sub(offset) with
core::ptr::with_exposed_provenance::<ObjExt>(...) to get ext_ptr; update
references to EXT_OFFSET in ext_ref() (and remove or keep the const as unused)
so alignment/padding for the concrete T is respected.

ℹ️ Review info
Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: c232500b-f2de-4809-a14d-d9bdcf433963

📥 Commits

Reviewing files that changed from the base of the PR and between dc0dec0 and 273fa80.

📒 Files selected for processing (1)
  • crates/vm/src/object/core.rs

Comment on lines +890 to +896
let needs_ext = dict.is_some()
|| typ
.slots
.flags
.has_feature(crate::types::PyTypeFlags::HAS_WEAKREF)
|| member_count > 0;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, verify the file exists and examine the specific lines mentioned
wc -l crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 98


🏁 Script executed:

# View the context around lines 890-896
sed -n '880,905p' crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 1149


🏁 Script executed:

# View the context around lines 1195-1213
sed -n '1185,1225p' crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 1280


🏁 Script executed:

# Run the audit commands from the review comment
echo "== Call sites passing None as dict arg to PyRef::new_ref =="
rg -nPU --type rust 'PyRef::new_ref\(\s*[^,]+,\s*[^,]+,\s*None\s*\)' crates/vm/src -C2 || true

echo
echo "== Constructors that pass dict through to PyRef::new_ref (manual HAS_DICT audit) =="
rg -nP --type rust '\bPyRef::new_ref\s*\(' crates/vm/src -C2 | head -100

Repository: RustPython/RustPython

Length of output: 9946


🏁 Script executed:

# Find where HAS_DICT flag is set and which types use it
rg -nP 'HAS_DICT|has_dict' crates/vm/src --type rust -C2 | head -150

Repository: RustPython/RustPython

Length of output: 8207


🏁 Script executed:

# Check PyModule definition - does it have HAS_DICT?
rg -nP 'class.*module|impl.*PyModule' crates/vm/src --type rust -A5 -B2 | head -80

Repository: RustPython/RustPython

Length of output: 4798


🏁 Script executed:

# Check exception types - do they have HAS_DICT?
rg -nP 'HAS_DICT.*flags|flags.*HAS_DICT' crates/vm/src --type rust -B5 -A2 | head -100

Repository: RustPython/RustPython

Length of output: 5872


🏁 Script executed:

# Check if there are any HAS_DICT types that might be allocated with None
# Let's look at how PyModule is allocated - does it use new_ref or something else?
rg -nP 'PyModule|module_type' crates/vm/src/vm/vm_new.rs -A3 -B3 | head -60

Repository: RustPython/RustPython

Length of output: 723


🏁 Script executed:

# Let's check the PyObjectPayload trait and see how it handles dict allocation
cat -n crates/vm/src/object/payload.rs | sed -n '55,70p'

Repository: RustPython/RustPython

Length of output: 528


🏁 Script executed:

# Check if there are any direct calls to PyInner::new that bypass the payload layer
rg -nP 'PyInner::new\s*\(' crates/vm/src --type rust -C2 | head -40

Repository: RustPython/RustPython

Length of output: 458


🏁 Script executed:

# Check what happens when set_dict() is called on object without dict support
# Looking at the instance_dict and set_dict implementation more carefully
sed -n '1195,1225p' crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 1013


🏁 Script executed:

# Check if ext_ref() returns None when needs_ext was false
sed -n '920,970p' crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 1356


🏁 Script executed:

# Let's verify what happens in the allocation when needs_ext is false
sed -n '890,945p' crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 2151


🏁 Script executed:

# Let's check if there are actual HAS_DICT types being allocated with None
# First, check what classes have HAS_DICT flag
rg -nP 'HAS_DICT' crates/vm/src --type rust -B2 -A2 | grep -E 'pyclass|flags' | head -40

Repository: RustPython/RustPython

Length of output: 3068


🏁 Script executed:

# Now let's specifically check if any places allocate HAS_DICT types with dict=None
# Look at heap type creation (which auto-adds HAS_DICT)
sed -n '545,560p' crates/vm/src/vm/context.rs

Repository: RustPython/RustPython

Length of output: 584


🏁 Script executed:

# Check ext_ref implementation to understand when it returns None
rg -nP 'fn ext_ref' crates/vm/src/object/core.rs -A10

Repository: RustPython/RustPython

Length of output: 534


🏁 Script executed:

# Let's check if there are any other callers of PyRef::new_ref that might bypass the payload trait
rg -nP 'PyRef::new_ref' crates/vm/src --type rust | wc -l

Repository: RustPython/RustPython

Length of output: 67


🏁 Script executed:

# Check for exception allocation to see if they all properly pass dicts
sed -n '100,130p' crates/vm/src/vm/vm_new.rs

Repository: RustPython/RustPython

Length of output: 1321


🏁 Script executed:

# Let's trace through one complete example - exception allocation
rg -nP 'new_exception|PyBaseException::new' crates/vm/src/vm/vm_new.rs -A5 | head -50

Repository: RustPython/RustPython

Length of output: 1998


🏁 Script executed:

# Let's check if all the places that should use PayloadObjectPayload::_into_ref actually do
rg -nP '\.into_ref\(' crates/vm/src --type rust | head -20

Repository: RustPython/RustPython

Length of output: 1654


🏁 Script executed:

# Check how heap types actually allocate instances - is there a standard path?
rg -nP 'construct|__new__|__init__' crates/vm/src/types/slot.rs -A3 -B3 | head -60

Repository: RustPython/RustPython

Length of output: 2084


🏁 Script executed:

# Let's understand the full picture - check new_base_object usage and if exceptions follow it
rg -nP 'new_base_object' crates/vm/src --type rust -B2 -A2

Repository: RustPython/RustPython

Length of output: 2054


🏁 Script executed:

# Let's verify the exact code at the two locations mentioned: lines 890-896 and 1195-1213
echo "=== Lines 890-896 ==="
sed -n '890,896p' crates/vm/src/object/core.rs

echo
echo "=== Lines 1195-1213 ==="
sed -n '1195,1213p' crates/vm/src/object/core.rs

Repository: RustPython/RustPython

Length of output: 918


🏁 Script executed:

# Check if there are any other places where a HAS_DICT type could be allocated with None
# Look for all pyclass declarations with HAS_DICT to see if they're always properly initialized
rg -nP '#\[pyclass\(.*HAS_DICT' crates/vm/src --type rust -A8 | grep -E 'pyclass|into_ref|new_ref|_into_ref' | head -50

Repository: RustPython/RustPython

Length of output: 1082


Add debug-time assertion to enforce the HAS_DICT allocation invariant.

When PyInner::new() allocates an object, needs_ext is computed based on dict.is_some(), but types with the HAS_DICT flag must always receive an initialized instance dict. If a HAS_DICT type is allocated with dict = None, instance_dict() and set_dict() will silently fail. The high-level APIs like new_base_object() have protection, but direct calls to PyRef::new_ref() bypass these checks.

Suggested guard (debug-time invariant)
     fn new(payload: T, typ: PyTypeRef, dict: Option<PyDictRef>) -> *mut Self {
         let member_count = typ.slots.member_count;
+        let has_dict_flag = typ
+            .slots
+            .flags
+            .has_feature(crate::types::PyTypeFlags::HAS_DICT);
+        debug_assert!(
+            !has_dict_flag || dict.is_some(),
+            "HAS_DICT types must pass an initialized instance dict when allocating"
+        );
         let needs_ext = dict.is_some()
             || typ
                 .slots
                 .flags
                 .has_feature(crate::types::PyTypeFlags::HAS_WEAKREF)
             || member_count > 0;

@github-actions
Copy link
Contributor

github-actions bot commented Mar 4, 2026

Code has been automatically formatted

The code in this PR has been formatted using:

  • cargo fmt --all
    Please pull the latest changes before pushing again:
git pull origin worktree-obj-header-slim

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
crates/vm/src/object/core.rs (2)

890-897: ⚠️ Potential issue | 🟠 Major

Enforce HAS_DICT allocation invariant at the constructor boundary.

needs_ext is currently derived from dict.is_some(), so a direct PyRef::new_ref(..., dict=None) for a HAS_DICT type can allocate without dict/ext and make instance_dict()/set_dict() fail unexpectedly. Add a debug assertion in PyInner<T>::new.

Proposed fix
     fn new(payload: T, typ: PyTypeRef, dict: Option<PyDictRef>) -> *mut Self {
         let member_count = typ.slots.member_count;
+        let has_dict_flag = typ
+            .slots
+            .flags
+            .has_feature(crate::types::PyTypeFlags::HAS_DICT);
+        debug_assert!(
+            !has_dict_flag || dict.is_some(),
+            "HAS_DICT types must pass an initialized instance dict when allocating"
+        );
         let needs_ext = dict.is_some()
             || typ
                 .slots
                 .flags
                 .has_feature(crate::types::PyTypeFlags::HAS_WEAKREF)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/object/core.rs` around lines 890 - 897, The constructor
currently derives needs_ext from dict.is_some(), which allows creating a PyInner
for a type with PyTypeFlags::HAS_DICT without providing a dict; add a debug
assertion inside PyInner<T>::new that enforces the allocation invariant: if
typ.slots.flags.has_feature(crate::types::PyTypeFlags::HAS_DICT) then
dict.is_some() (or equivalent) must be true; update the assertion location in
PyInner::new so it runs before using needs_ext/allocating extensions and
reference the existing typ.slots.flags and dict parameters to locate where to
add the check.

281-285: ⚠️ Potential issue | 🔴 Critical

Hardcoded EXT_OFFSET can misaddress ObjExt for stricter PyInner<T> alignment.

PyInner<T>::new/PyInner<T>::dealloc correctly use Layout::extend, but ext_ref() subtracts size_of::<ObjExt>() directly. That offset is only guaranteed when no inter-layout padding is inserted; this can silently break pointer arithmetic if a payload raises PyInner<T> alignment.

Proposed fix
-const EXT_OFFSET: usize = core::mem::size_of::<ObjExt>();
-
 impl<T> PyInner<T> {
@@
     pub(super) fn ext_ref(&self) -> Option<&ObjExt> {
         if !GcBits::from_bits_retain(self.gc_bits.load(Ordering::Relaxed)).contains(GcBits::HAS_EXT)
         {
             return None;
         }
         let self_addr = (self as *const Self as *const u8).addr();
-        let ext_ptr =
-            core::ptr::with_exposed_provenance::<ObjExt>(self_addr.wrapping_sub(EXT_OFFSET));
+        let (_, inner_offset) = core::alloc::Layout::new::<ObjExt>()
+            .extend(core::alloc::Layout::new::<Self>())
+            .expect("ObjExt + PyInner layout overflow");
+        let ext_ptr = core::ptr::with_exposed_provenance::<ObjExt>(
+            self_addr.wrapping_sub(inner_offset),
+        );
         Some(unsafe { &*ext_ptr })
     }
 }
Does Rust `core::alloc::Layout::extend` insert padding before the second layout when it has stricter alignment than the first (so offset may be greater than `size_of::<prefix>()`)?

Also applies to: 316-324

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/object/core.rs` around lines 281 - 285, EXT_OFFSET is hardcoded
to size_of::<ObjExt>(), which can be wrong if PyInner<T> has stricter alignment
than ObjExt; update ext_ref() (and the analogous code at 316-324) to compute the
prefix->payload offset using Layout::extend rather than assuming no padding.
Specifically, replace uses of EXT_OFFSET (and the const) with a computed offset
obtained by creating Layout::new::<ObjExt>() and calling
.extend(Layout::new::<PyInner<T>>()) (or otherwise using Layout::extend for the
actual payload type) and use the returned .0.size()/.1 or the computed offset to
do the pointer subtraction so pointer arithmetic matches what PyInner::new and
PyInner::dealloc use.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/vm/src/object/core.rs`:
- Around line 890-897: The constructor currently derives needs_ext from
dict.is_some(), which allows creating a PyInner for a type with
PyTypeFlags::HAS_DICT without providing a dict; add a debug assertion inside
PyInner<T>::new that enforces the allocation invariant: if
typ.slots.flags.has_feature(crate::types::PyTypeFlags::HAS_DICT) then
dict.is_some() (or equivalent) must be true; update the assertion location in
PyInner::new so it runs before using needs_ext/allocating extensions and
reference the existing typ.slots.flags and dict parameters to locate where to
add the check.
- Around line 281-285: EXT_OFFSET is hardcoded to size_of::<ObjExt>(), which can
be wrong if PyInner<T> has stricter alignment than ObjExt; update ext_ref() (and
the analogous code at 316-324) to compute the prefix->payload offset using
Layout::extend rather than assuming no padding. Specifically, replace uses of
EXT_OFFSET (and the const) with a computed offset obtained by creating
Layout::new::<ObjExt>() and calling .extend(Layout::new::<PyInner<T>>()) (or
otherwise using Layout::extend for the actual payload type) and use the returned
.0.size()/.1 or the computed offset to do the pointer subtraction so pointer
arithmetic matches what PyInner::new and PyInner::dealloc use.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: c24b9ffb-f67a-45cc-a23b-07663bfa7d2b

📥 Commits

Reviewing files that changed from the base of the PR and between 273fa80 and 9013af9.

📒 Files selected for processing (1)
  • crates/vm/src/object/core.rs

@youknowone youknowone force-pushed the worktree-obj-header-slim branch from 9013af9 to 7b83655 Compare March 4, 2026 09:18
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
crates/vm/src/object/core.rs (1)

279-289: ⚠️ Potential issue | 🟠 Major

EXT_OFFSET still assumes no extra Layout::extend padding for all payload alignments.

Line 288 only proves the constraint for PyInner<()>, not for all PyInner<T>. If any payload has alignment > 8, alloc/dealloc offsets (computed via Layout::extend) can diverge from Line 339’s fixed subtraction. This was already flagged earlier and still applies here.

#!/bin/bash
set -euo pipefail

echo "== Candidate high-alignment declarations in vm =="
rg -nP --type rust '#\s*\[\s*repr\s*\(\s*align\((?:[9-9]|[1-9][0-9]+)\)\s*\)\s*\]' crates/vm/src || true

echo
echo "== Payload candidates that often imply >8 alignment =="
rg -nP --type rust '\b(u128|i128|core::arch::|std::arch::|__m\d+)\b' crates/vm/src || true

echo
echo "== PyPayload implementations =="
rg -nP --type rust 'impl\s+.*\bPyPayload\b\s+for\s+' crates/vm/src -C1

Expected verification result: either (1) a hard guarantee that all payload alignments are <= align_of::<ObjExt>(), or (2) ext pointer recovery that does not rely on a fixed constant offset.

Also applies to: 333-340

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/object/core.rs`:
- Around line 1506-1511: gc_clear_raw() currently only drains ext.slots but
gc_has_clear() advertises clearing when ext.dict.is_some(), so dict edges are
left intact; modify gc_clear_raw (the block starting with obj.0.ext_ref()) to
also remove and push the dict entry when ext.dict is Some(...) — i.e., if let
Some(dict) = ext.dict.take() { result.push(dict); } (or equivalent removal) so
that dict references are cleared consistently with gc_has_clear(); ensure you
reference ext_ref(), ext.slots, and ext.dict in the same function.
- Around line 311-321: The class-reassignment validation in set___class__() only
compares basicsize and HAS_DICT, but has_ext() (used by ext_ref() and dealloc())
depends on HAS_DICT, HAS_WEAKREF, and typ.slots.member_count; to avoid
allocation-layout mismatch, update set___class__() to also require the new
type's HAS_WEAKREF flag equals the old type's HAS_WEAKREF and
typ.slots.member_count equality (or alternatively record the object's ext
presence at allocation and make it immutable); implement the checks in
set___class__() and return a clear TypeError if either HAS_WEAKREF or
member_count differ so has_ext() remains invariant across a type swap.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: a3baf0a5-40cf-41de-94a0-b91c04dd0ebc

📥 Commits

Reviewing files that changed from the base of the PR and between 9013af9 and 7b83655.

📒 Files selected for processing (1)
  • crates/vm/src/object/core.rs

@youknowone youknowone force-pushed the worktree-obj-header-slim branch 5 times, most recently from e12914f to d86ddd5 Compare March 4, 2026 17:04
# gh-101634: Catch UnicodeDecodeError if stdout cannot be
# decoded from encoding
err_msg = f"Failed to parse worker process JSON: {exc}"
# TODO: RUSTPYTHON; include JSON for debugging flaky worker bugs
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert test code

tuple(loc) for loc in data['covered_lines']
]
return TestResult(**data)
# TODO: RUSTPYTHON; diagnostics for flaky worker bug
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert test code

@youknowone youknowone force-pushed the worktree-obj-header-slim branch 3 times, most recently from 64bd634 to 158f155 Compare March 5, 2026 15:16
Extract dict, weak_list, and slots fields from PyInner<T> into a
separate ObjExt struct allocated as a prefix before PyInner using
Layout::extend(). Objects that don't need these fields (int, str,
float, list, tuple, dict, etc.) skip the prefix entirely.

- Add HAS_WEAKREF flag to PyTypeFlags for per-type weakref control
- Add HAS_EXT bit to GcBits indicating prefix presence
- Define ObjExt struct with dict, weak_list, slots
- Shrink PyInner header from ~80-88 bytes to ~32 bytes for lightweight objects
- Update all accessor methods to go through ext_ref()
- Update bootstrap type hierarchy to use prefix allocation
- Add __weakref__ getset descriptor for heap types
- Set HAS_WEAKREF on builtin types that support weak references
- Remove test_weak_keyed_bad_delitem expectedFailure (now passes)
- Add HAS_WEAKREF flag to PyFuture and PyTask (matches CPython)
- Rename subtype_getweakref/setweakref to subtype_get_weakref/set_weakref
  to fix cspell unknown word lint
…kers

- Add HAS_WEAKREF to PyArray and PyDeque (matches CPython)
- Add HAS_WEAKREF to PyItertoolsGrouper (internal use by groupby)
- Remove 6 expectedFailure markers from test_dataclasses for weakref/slots
  tests that now pass
…, typevar types

Add HAS_WEAKREF flag to built-in types that support weakref:
- PyCode, PyUnion, PyPartial, Lock, RLock
- All IO base/concrete classes (_IOBase, _RawIOBase, _BufferedIOBase,
  _TextIOBase, BufferedReader, BufferedWriter, BufferedRandom,
  BufferedRWPair, TextIOWrapper, StringIO, BytesIO, FileIO,
  WindowsConsoleIO)
- PyMmap, sre Pattern, sqlite3 Connection/Cursor
- TypeVar, ParamSpec, ParamSpecArgs, ParamSpecKwargs, TypeVarTuple

Remove 3 expectedFailure markers from test_descr for now-passing tests.
- Add HAS_DICT flag to PyType (type metaclass) alongside existing
  HAS_WEAKREF. All type objects are instances of type and need dict
  support, matching CPython's PyType_Type.
- Replace unimplemented!() in PyMethodDescriptor::descr_get with
  fallback to bind obj directly, matching CPython's method_get which
  uses PyCFunction_NewEx for non-METH_METHOD methods.
- Remove HAS_EXT gc_bits flag; detect ext from type flags
  using raw pointer reads to avoid Stacked Borrows violations
- Store HeapMethodDef owner in payload instead of dict hack
- Clear dict entries in gc_clear_raw to break cycles
- Add WASM error fallback when exception serialization fails
@youknowone youknowone force-pushed the worktree-obj-header-slim branch from 158f155 to a2fe158 Compare March 5, 2026 16:18
@youknowone youknowone merged commit 04cf5da into RustPython:main Mar 5, 2026
4 of 14 checks passed
@youknowone youknowone deleted the worktree-obj-header-slim branch March 5, 2026 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant