Gemma4: fix failed test cases #45568
Conversation
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
```python
if attention_mask is not None:
    attention_mask = self._convert_4d_mask_to_blocked_5d(attention_mask)
```
@Cyrilvallez any opinion?
From the PR description:

> Fix bug when `attention_mask` is None (tests/models/gemma4/test_modeling_gemma4.py::Gemma4Audio2TextModelTest::test_eager_matches_fa2_generate)
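For illustration, a minimal standalone sketch of the None-guard this fix applies. The `convert` callable stands in for the model's `_convert_4d_mask_to_blocked_5d` method (names other than that one are illustrative); the key point is that the conversion only runs when a mask was actually passed, since `attention_mask=None` means "no masking" and must be forwarded unchanged.

```python
# Sketch of the guard: only convert the attention mask when one exists.
# `convert` is a stand-in for the model's mask-reshaping method, which
# would raise if called on None.

def prepare_attention_mask(attention_mask, convert):
    if attention_mask is not None:
        attention_mask = convert(attention_mask)
    return attention_mask
```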
```python
@require_flash_attn
@require_torch_accelerator
@mark.flash_attn_test
@slow
def test_flash_attn_2_from_config(self):
    # Gemma4 requires mm_token_type_ids in train mode, so we test in eval mode
    self.flash_attn_from_config(attn_implementation="flash_attention_2", test_fwd_in_train=False)


@require_flash_attn_3
@require_torch_gpu
@mark.flash_attn_3_test
@slow
def test_flash_attn_3_from_config(self):
    # Gemma4 requires mm_token_type_ids in train mode, so we test in eval mode
    self.flash_attn_from_config(attn_implementation="flash_attention_3", test_fwd_in_train=False)
```
@kaixuanliu I didn't see these 2 failing on our Flash Attn CI job.
Could you share more info / error logs?
Our flash attn CI doesn't have FA3. I think it's hard to install because you need to compile from source, and it takes much longer than building FA2 from source.
Maybe we could add a separate FA4 CI, though I'm not sure how stable it is since it's still in beta.
Well, for FA3 and FA4, on my env they are skipped as well. I can delete these two.
No, I mean for
test_flash_attn_2_from_config
our CI is [PASSED]. So I am not sure why we need this fix, at least for FA2.
Our CI runners don't have FA3 or FA4, so they are skipped. But the question may still be valid: do we really need this fix?
cc @zucchini-nlp for viz, don't think it's super important but would still be nice to fix at some point ig
> Gemma4 requires mm_token_type_ids in train mode, so we test in eval mode

This was fixed already, no?
If not, @kaixuanliu can you open an issue and ping me there. I might forget to come back when bugs are reported under PRs 😅
It's already fixed. And I have removed this part.
Hi, integration tests add device-specific Expectations entries for XPU without a default fallback, so running these tests on an unsupported accelerator type (or an XPU generation not covered) can select an unintended expectation or raise if no expectation matches. This makes the tests more brittle to new device properties.

Severity: informational | Category: reliability

How to fix: add a default expectation fallback.

Found by Qodo code review
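A hedged sketch of the fallback the review suggests: a device-keyed expectations table that resolves to a device-wide entry, then a global default, instead of raising on accelerators (or XPU generations) it has never seen. The `(device_type, major_version)` keying is modeled loosely on how per-device test expectations are commonly keyed; the function name and all values below are illustrative, not the actual transformers test data.

```python
# Illustrative lookup with a default fallback, so tests never pick an
# unintended expectation or raise on an unknown accelerator.

def get_expectation(expectations, device_type, major=None):
    """Return the most specific matching expectation, with a default fallback."""
    if (device_type, major) in expectations:    # exact device + generation match
        return expectations[(device_type, major)]
    if (device_type, None) in expectations:     # any generation of this device
        return expectations[(device_type, None)]
    if (None, None) in expectations:            # global default fallback
        return expectations[(None, None)]
    raise KeyError(f"no expectation for {device_type!r} and no default entry")

# Example table; the (None, None) entry keeps unknown accelerators working.
EXPECTED_LOGITS = {
    ("cuda", 8): [0.12, 0.34],
    ("xpu", None): [0.11, 0.33],
    (None, None): [0.10, 0.32],
}
```

With the `(None, None)` entry present, a new accelerator type (or an uncovered XPU generation) resolves to the default instead of raising, which addresses the brittleness the review flags.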
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
vasqu left a comment
It kind of got lost, sorry. Let me run slow tests just for sanity checking and then merge.
run-slow: gemma4

[For maintainers] Suggested jobs to run (before merge): run-slow: gemma4

This comment contains models: ["models/gemma4"]

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
CI Results / Commit Info

Model CI Report: ❌ 1 new failed test from this PR 😭
* set eval mode for flash attn tests
* skip flash_attn tests
* fix bug when attention_mask is None
* add XPU expectations
* add deterministic decorator
* skip 2 compile related tests
* update
* update
* nice code
* fix code quality check
* update comment

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
What does this PR do?
This PR does several things:

- Fix bug when `attention_mask` is None (tests/models/gemma4/test_modeling_gemma4.py::Gemma4Audio2TextModelTest::test_eager_matches_fa2_generate)
- Adjust the `test_flash_attn_x_from_config` tests

Fixes # (issue)
Code Agent Policy
Who can review?
@ydshieh pls help review