Gemma4: fix failed test cases#45568

Merged
vasqu merged 13 commits into huggingface:main from kaixuanliu:gemma4-fix
May 5, 2026

Conversation

@kaixuanliu
Contributor

@kaixuanliu kaixuanliu commented Apr 22, 2026

What does this PR do?

This PR does several things:

  1. Skips some test cases that are not suitable for the gemma4 model
  2. Fixes a bug when attention_mask is None (tests/models/gemma4/test_modeling_gemma4.py::Gemma4Audio2TextModelTest::test_eager_matches_fa2_generate)
  3. Fixes some failed test cases related to test_flash_attn_x_from_config
  4. Adds XPU-related Expectations
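
Item 2 boils down to guarding a mask-shape conversion so it is skipped when no mask is passed. A shape-level sketch of that guard (the helper name comes from the diff in this PR; the block size and the shape arithmetic here are hypothetical stand-ins, not the actual model code):

```python
def convert_4d_mask_to_blocked_5d(mask_shape, block_size):
    # Hypothetical shape-level stand-in for _convert_4d_mask_to_blocked_5d:
    # split the key dimension of a (batch, heads, q_len, k_len) mask into
    # (batch, heads, q_len, k_len // block_size, block_size) blocks.
    batch, heads, q_len, k_len = mask_shape
    assert k_len % block_size == 0, "key length must be divisible by block size"
    return (batch, heads, q_len, k_len // block_size, block_size)

def prepare_mask(attention_mask, block_size=4):
    # The fix: only convert when a mask was actually provided, so
    # attention_mask=None (as in the eager-vs-FA2 generate test) no
    # longer crashes on the conversion step.
    if attention_mask is None:
        return None
    return convert_4d_mask_to_blocked_5d(attention_mask, block_size)
```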

Fixes # (issue)

Code Agent Policy

  • I confirm that this is not a pure code agent PR.

Who can review?

@ydshieh pls help review

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu kaixuanliu changed the title Gemma4 fix Gemma4: fix failed test cases Apr 22, 2026
@kaixuanliu kaixuanliu marked this pull request as ready for review April 22, 2026 09:25
Comment on lines +1944 to +1945
if attention_mask is not None:
attention_mask = self._convert_4d_mask_to_blocked_5d(attention_mask)
Collaborator


@Cyrilvallez any opinion?

From the PR description:

Fix bug when attention_mask is None (tests/models/gemma4/test_modeling_gemma4.py::Gemma4Audio2TextModelTest::test_eager_matches_fa2_generate)

Comment thread tests/models/gemma4/test_modeling_gemma4.py
Comment on lines +464 to +478
@require_flash_attn
@require_torch_accelerator
@mark.flash_attn_test
@slow
def test_flash_attn_2_from_config(self):
# Gemma4 requires mm_token_type_ids in train mode, so we test in eval mode
self.flash_attn_from_config(attn_implementation="flash_attention_2", test_fwd_in_train=False)

@require_flash_attn_3
@require_torch_gpu
@mark.flash_attn_3_test
@slow
def test_flash_attn_3_from_config(self):
# Gemma4 requires mm_token_type_ids in train mode, so we test in eval mode
self.flash_attn_from_config(attn_implementation="flash_attention_3", test_fwd_in_train=False)
Collaborator


@kaixuanliu I didn't see these 2 failing on our Flash Attn CI job.

Could you share more info / error logs?

Contributor


Our flash attn CI doesn't have FA3 - I think it's hard to install because you need to compile it from source, and that takes much longer than the FA2 build from source.

Maybe we could add a separate FA4 CI - not sure how stable it is though, since it's still in beta.

Contributor Author


Well, for FA3 and FA4, on my env they are skipped as well. I can delete these two.

Contributor


Ah no, see my comment below #45568 (comment)

Collaborator


No, I mean for

test_flash_attn_2_from_config

our CI is [PASSED]. So I am not sure why we need this fix, at least for FA2.

Our CI runners don't have FA3 or FA4, so those are skipped. But the question may still be valid: do we really need this fix?

Contributor Author

@kaixuanliu kaixuanliu Apr 29, 2026


Well, this code was added before #45454 was merged; this case would crash until that part was removed. After that PR, this case can pass. I will update the code.

Contributor


cc @zucchini-nlp for visibility, don't think it's super important but would still be nice to fix at some point, I guess

Member


"Gemma4 requires mm_token_type_ids in train mode, so we test in eval mode" - was this fixed already?

If not, @kaixuanliu can you open an issue and ping me there. I might forget to come back when bugs are reported under PRs 😅

Contributor Author


It's already fixed, and I have removed this part.

Comment thread tests/models/gemma4/test_modeling_gemma4.py Outdated
@Qodo-Free-For-OSS

Hi, the integration tests add device-specific Expectations entries for XPU without a default fallback, so running these tests on an unsupported accelerator type (or an XPU generation not covered) can select an unintended expectation or raise an error if no expectation matches. This makes the tests more brittle to new device properties.

Severity: informational | Category: reliability

How to fix: Add default expectation fallback

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

XPU expectations were added without a (None, None) default. This can make the test brittle when run on different XPU generations or unexpected device properties.

Fix Focus Areas

  • tests/models/gemma4/test_modeling_gemma4.py[534-542]
  • tests/models/gemma4/test_modeling_gemma4.py[575-593]
  • tests/models/gemma4/test_modeling_gemma4.py[621-629]
  • tests/models/gemma4/test_modeling_gemma4.py[675-681]
  • tests/models/gemma4/test_modeling_gemma4.py[745-755]

Recommended changes

  • Add a default expectation (None, None): <existing cuda expectation> if you want to preserve previous behavior for other devices.
  • Or, add additional XPU keys if multiple gens are expected to run these tests.
  • Ensure the intended device coverage is explicit to avoid accidental matching on future hardware.
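
The recommended fallback can be illustrated with a minimal stand-in for an Expectations-style table (the class shape, method name, and lookup order here are assumptions for illustration, not the actual transformers implementation): entries are keyed by (device_type, major_version), and a (None, None) entry acts as the catch-all default so unknown accelerators still resolve.

```python
class Expectations:
    """Hypothetical sketch of a device-keyed expectations table."""

    def __init__(self, data):
        self.data = data

    def get_expectation(self, device_type, major=None):
        # Try the most specific key first, then fall back to the
        # (None, None) default so an uncovered device still resolves.
        for key in ((device_type, major), (device_type, None), (None, None)):
            if key in self.data:
                return self.data[key]
        raise KeyError(f"no expectation for {device_type!r} and no default set")


EXPECTED_LOGITS = Expectations({
    ("cuda", 8): [1.00, 2.00],
    ("xpu", None): [1.01, 2.01],
    (None, None): [1.00, 2.00],  # default fallback suggested by the review
})
```

With the (None, None) entry present, a future device type (say "rocm") falls back to the default instead of raising; without it, the lookup fails loudly.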

Found by Qodo code review

Contributor

@vasqu vasqu left a comment


It kind of got lost, sorry. Let me run the slow tests just for sanity checking and then merge.

@vasqu
Contributor

vasqu commented May 5, 2026

run-slow: gemma4

@github-actions
Contributor

github-actions Bot commented May 5, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma4

@github-actions
Contributor

github-actions Bot commented May 5, 2026

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/gemma4"]
quantizations: []

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

github-actions Bot commented May 5, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN a3c44435 workflow commit (merge commit)
PR 37b8baa4 branch commit (from PR)
main a6ccf935 base commit (on main)

Model CI Report

1 new failed test from this PR 😭

  • gemma4:
    tests/models/gemma4/test_modeling_gemma4.py::Gemma4IntegrationTest::test_export_text_only (❌ ⟹ ❌)

@vasqu vasqu added this pull request to the merge queue May 5, 2026
Merged via the queue into huggingface:main with commit df2f2b5 May 5, 2026
22 of 23 checks passed
Exile333 pushed a commit to Exile333/transformers that referenced this pull request May 6, 2026
* set eval mode for flash attn tests

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* skip flash_attn tests

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* fix bug when attention_mask is None

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* add XPU expectations

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* add deterministic decorator

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* skip 2 compile related tests

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* nice code

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* fix code quality check

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update comment

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
@kaixuanliu kaixuanliu deleted the gemma4-fix branch May 8, 2026 01:56


7 participants