Skip to content

Align specialization guards and caching with CPython#7341

Merged
youknowone merged 31 commits intoRustPython:mainfrom
youknowone:specialization
Mar 4, 2026
Merged

Align specialization guards and caching with CPython#7341
youknowone merged 31 commits intoRustPython:mainfrom
youknowone:specialization

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Mar 4, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Improved tuple index bounds checking with clearer "tuple index out of range" errors.
  • Refactor

    • Revamped bytecode storage and adaptive counter handling for safer concurrent access and more stable instruction iteration.
    • De-specialized instruction streams to avoid runtime-mutated code during compilation.
    • Hardened JIT code caching and invalidation.
  • Performance

    • Added hint-based fast-paths for exact-dict lookups to speed common dictionary accesses.
  • Behavior

    • Expanded instrumentation to treat an additional resume instruction as traceable.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: f7913c33-3baf-4654-95d5-df8bc8685b4a

📥 Commits

Reviewing files that changed from the base of the PR and between 968372a and cc3fd09.

📒 Files selected for processing (2)
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/frame.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/vm/src/dict_inner.rs

📝 Walkthrough

Walkthrough

Refactors adaptive counters into out-of-line bit‑encoded AtomicU16 storage and updates CodeUnits layout/APIs; routes JIT compilation over cleaned bytecode bytes; adds dict hint/get_hint fast‑path helpers; adds PyBoundMethod accessors and JIT guards; tightens tuple index bounds handling and expands instrumentation trace detection.

Changes

Cohort / File(s) Summary
Adaptive counter encoding & CodeUnits layout
crates/compiler-core/src/bytecode.rs
Replaces in‑place u8 counters with bit‑encoded u16 counters and helpers (adaptive_counter_bits, adaptive_counter_triggers, advance_adaptive_counter, adaptive_counter_backoff). CodeUnits becomes a struct with units: UnsafeCell<Box<[CodeUnit]>> and adaptive_counters: Box<[AtomicU16]>. Adds constants (BACKOFF_BITS, MAX_BACKOFF, UNREACHABLE_BACKOFF, ADAPTIVE_COOLDOWN_VALUE, JUMP_BACKWARD_INITIAL_VALUE), updates adaptive counter read/write APIs to u16, implements Clone and unsafe impl Sync, and updates opcode/counter access paths.
JIT instruction sourcing
crates/jit/src/instructions.rs
Switches iteration to a cleaned/stable instruction stream derived from bytecode.instructions.original_bytes().as_slice()CodeUnits, used for label collection and compilation; propagates BadBytecode on conversion failure and ensures iteration over de‑specialized (zeroed CACHE) opcodes.
Dictionary cached-hint APIs
crates/vm/src/builtins/dict.rs, crates/vm/src/dict_inner.rs
Adds hint_for_key (returns optional u16 index hint for exact dicts) and get_item_opt_hint / get_hint fast‑path APIs that validate a cached entry index and return the value when valid; falls back to normal lookup when hint is absent or stale.
Function / bound‑method tweaks & JIT guards
crates/vm/src/builtins/function.rs
Adds jit_guard logic (cfg(feature = "jit")) to guard invalidation/access of jitted_code during code swaps and assignment. Adds pub(crate) accessors function_obj() and self_obj() on PyBoundMethod. Minor argument‑binding iteration adjustment in fill_locals_from_args.
Tuple bounds & error messaging
crates/vm/src/builtins/tuple.rs
Replaces direct index access with wrap_index(i); returns IndexError("tuple index out of range") when wrapping fails, otherwise returns cloned element at the wrapped index.
Instrumentation trace detection
crates/vm/src/stdlib/sys/monitoring.rs
Treats Instruction::ResumeCheck as traceable alongside Instruction::Resume, altering which instructions become instrumented in later phases.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh
  • coolreader18

Poem

🐰 Bits and hops in tidy rows,
Counters tucked where quiet grows.
Dict hints blink, a furtive wink,
JIT reads clean — no cached mislink.
A rabbit cheers: bytecode’s bright! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Align specialization guards and caching with CPython' directly addresses the main theme of the pull request, which involves refactoring adaptive counters, instruction caching, and bytecode handling across multiple files to align specialized instruction optimization with CPython's approach.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the specialization branch 2 times, most recently from d1dc3d6 to 2d7c2f7 Compare March 4, 2026 11:00
@youknowone youknowone changed the title Specialization Align specialization guards and caching with CPython Mar 4, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 4, 2026

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] lib: cpython/Lib/pickle.py
[ ] lib: cpython/Lib/_compat_pickle.py
[ ] test: cpython/Lib/test/test_pickle.py (TODO: 20)
[ ] test: cpython/Lib/test/test_picklebuffer.py (TODO: 12)
[ ] test: cpython/Lib/test/test_pickletools.py (TODO: 8)

dependencies:

  • pickle (native: _pickle, itertools, sys)
    • _compat_pickle
    • _compat_pickle
    • argparse, codecs, copyreg, functools, io, pprint, struct, types

dependent tests: (97 tests)

  • pickle: test_annotationlib test_array test_ast test_asyncio test_bool test_builtin test_bytes test_bz2 test_codecs test_collections test_concurrent_futures test_configparser test_coroutines test_csv test_ctypes test_decimal test_defaultdict test_deque test_descr test_dict test_dictviews test_email test_enum test_enumerate test_exceptions test_fractions test_functools test_generators test_genericalias test_http_cookies test_importlib test_inspect test_io test_ipaddress test_iter test_itertools test_list test_logging test_lzma test_memoryio test_memoryview test_minidom test_opcache test_operator test_ordered_dict test_os test_pathlib test_pickle test_picklebuffer test_pickletools test_platform test_plistlib test_positional_only_arg test_posix test_random test_range test_re test_set test_shelve test_slice test_socket test_statistics test_str test_string test_time test_trace test_tuple test_type_aliases test_type_params test_types test_typing test_unittest test_uuid test_xml_dom_minicompat test_xml_etree test_zipfile test_zlib test_zoneinfo
    • logging: test_asyncio test_concurrent_futures test_hashlib test_pkgutil test_support test_urllib2net
      • asyncio.futures: test_asyncio
      • concurrent.futures._base: test_concurrent_futures
      • hashlib: test_hmac test_smtplib test_tarfile test_unicodedata test_urllib2_localnet
      • http.cookiejar: test_http_cookiejar test_urllib2
      • multiprocessing.util: test_asyncio test_compileall test_concurrent_futures
      • venv: test_venv

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

- Disable ForIterGen specialization (falls through to generic
  path) because inline generator frame resumption is needed
  for correct debugger StopIteration visibility (test_bdb)
- Use downcast_ref_if_exact for PyNativeFunction in CALL
  specialization guards
- Add can_specialize_call guard for class __init__ specialization
- Remove expectedFailure for test_bad_newobj_args (now passing)
@youknowone youknowone marked this pull request as ready for review March 4, 2026 14:36
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/vm/src/dict_inner.rs`:
- Around line 366-374: Snapshot the entry's key and value while holding the dict
read guard, drop the guard, then perform equality; specifically, in the block
that uses self.read() and inner.entries.get(hint) (and currently calls
key.key_is and key.key_eq on entry.key), clone/copy the entry.key and
entry.value out of the inner guard, release the guard, and then run
key.key_is(&cloned_key) and only if needed call key.key_eq(vm, &cloned_key) to
decide the result and return the cloned value; this avoids invoking key_eq while
holding the read lock.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 1b01ab44-3238-428f-8cd8-393e5623cbb5

📥 Commits

Reviewing files that changed from the base of the PR and between 68ad332 and d464119.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_pickle.py is excluded by !Lib/**
📒 Files selected for processing (8)
  • crates/compiler-core/src/bytecode.rs
  • crates/jit/src/instructions.rs
  • crates/vm/src/builtins/dict.rs
  • crates/vm/src/builtins/function.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/dict_inner.rs
  • crates/vm/src/frame.rs
  • crates/vm/src/stdlib/sys/monitoring.rs

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d464119937

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 322 to 324
builtins: PyObjectRef,
closure: &[PyCellRef],
func_obj: Option<PyObjectRef>,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 Badge Restore Frame API compatibility for callers

This change drops the use_datastack parameter from Frame::new (and removes related frame-local APIs) without updating the existing call sites, so the crate no longer builds: cargo check -p rustpython-vm fails with unresolved methods/arity mismatches in builtins/function.rs, builtins/frame.rs, builtins/super.rs, and vm/mod.rs. As written, the commit is build-blocking and cannot be integrated until the callers are updated or compatibility shims are restored.

Useful? React with 👍 / 👎.

@youknowone youknowone marked this pull request as draft March 4, 2026 15:16
Introduce FastLocalsData enum with Heap and DataStack variants
so non-generator/coroutine frames allocate localsplus on the VM
datastack instead of the heap. Includes materialize_to_heap for
migration when needed (e.g. generator suspension).
@youknowone youknowone marked this pull request as ready for review March 4, 2026 15:19
Move key_eq call outside the read lock guard to avoid
potential deadlock when Python __eq__ re-enters dict
mutation paths. Matches the existing pattern in lookup().
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc3fd09611

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@@ -2805,7 +2828,7 @@ impl ExecutingFrame<'_> {
Ok(None)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Treat ResumeCheck as non-traceable at frame entry

By dispatching Instruction::ResumeCheck through the normal execution path here, functions whose first opcode is RESUME_CHECK now run, but the trace-event filters in run() still only skip Resume/InstrumentedResume (see the matches! checks around line and opcode tracing in crates/vm/src/frame.rs). That causes extra line/opcode callbacks at function entry for RESUME_CHECK code objects (for example, a code object with the first opcode patched from RESUME to RESUME_CHECK emits an extra event on the def line), which breaks tracing parity with the existing RESUME path.

Useful? React with 👍 / 👎.

@youknowone youknowone merged commit 9a0511b into RustPython:main Mar 4, 2026
14 checks passed
@youknowone youknowone deleted the specialization branch March 4, 2026 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant