Fix container-overflow in split_with_sizes_copy_out_cuda #183777

Open
jeffdaily wants to merge 1 commit into main from jeffdaily/split-with-sizes-copy-overflow
@jeffdaily
Collaborator

`block_idx_to_split_idx` was `reserve()`-d using a `num_blocks` count computed *before* `iters_per_chunk` was applied, then grown via `insert()` with the `iters_per_chunk`-scaled block count. This left `size() < capacity()`, so the trailing region inside the vector's allocated storage carried libstdc++'s ASAN container-overflow annotation. `pack_vecs` then memcpy'd `vec->data()` for `vec->size()` bytes — the access landed in the annotated region and ASAN flagged it.

Compute the per-split block counts after `iters_per_chunk` is known, size the vector exactly to the total, and fill by index so `size() == capacity()` and no trailing annotation remains.
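
The fix described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `div_up`, `build_block_idx_to_split_idx`, and the `chunks_per_split` input are hypothetical names chosen for the example, and the real kernel launcher may derive its block counts differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: ceiling division for positive integers.
static int64_t div_up(int64_t a, int64_t b) { return (a + b - 1) / b; }

// Hypothetical sketch: map every block index to the split it serves.
// `chunks_per_split` is an illustrative input, not the actual kernel parameter.
std::vector<int64_t> build_block_idx_to_split_idx(
    const std::vector<int64_t>& chunks_per_split, int64_t iters_per_chunk) {
  // 1. Per-split block counts, computed with iters_per_chunk already applied.
  std::vector<int64_t> blocks_per_split(chunks_per_split.size());
  for (std::size_t i = 0; i < chunks_per_split.size(); ++i) {
    blocks_per_split[i] = div_up(chunks_per_split[i], iters_per_chunk);
  }

  // 2. Total number of blocks across all splits.
  const int64_t total = std::accumulate(
      blocks_per_split.begin(), blocks_per_split.end(), int64_t{0});

  // 3. Size the vector to exactly `total` and fill by index, so the vector
  //    never grows after construction (no reserve() + insert()).
  std::vector<int64_t> out(static_cast<std::size_t>(total));
  std::size_t block_idx = 0;
  for (std::size_t split = 0; split < blocks_per_split.size(); ++split) {
    for (int64_t b = 0; b < blocks_per_split[split]; ++b) {
      out[block_idx++] = static_cast<int64_t>(split);
    }
  }
  assert(block_idx == out.size());  // every slot was written exactly once
  return out;
}
```

For example, `build_block_idx_to_split_idx({5, 1, 4}, 2)` yields six entries: three for split 0, one for split 1, and two for split 2. The C++ standard does not guarantee `capacity() == size()` after a sized construction, but libstdc++ typically allocates exactly the requested number of elements, which is the property the description relies on.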

Reproduced under an ASAN build of PyTorch on ROCm; this is a real latent correctness bug that the standard libstdc++ allocator behavior masks on non-ASAN builds.

Authored with Claude.
@pytorch-bot pytorch-bot Bot added the release notes: cuda release notes category label May 14, 2026

pytorch-bot Bot commented May 14, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/183777

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 1 Unclassified Failure

As of commit ca45f9c with merge base a1cc64b (image):

UNCLASSIFIED FAILURE - DrCI could not classify the following job because the workflow did not run on the merge base. The failure may be pre-existing on trunk or introduced by this PR:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
