Skip to content

fix(kcore): swap sendMsgToSrc/sendMsgToDst column references#802

Merged
SemyonSinchenko merged 2 commits intographframes:mainfrom
DavidGi:fix/kcore-swapped-pregel-messages
Mar 4, 2026
Merged

fix(kcore): swap sendMsgToSrc/sendMsgToDst column references#802
SemyonSinchenko merged 2 commits intographframes:mainfrom
DavidGi:fix/kcore-swapped-pregel-messages

Conversation

@DavidGi
Copy link
Contributor

@DavidGi DavidGi commented Mar 4, 2026

Issue 801

Each directed edge (u,v) was sending u's own kcore back to u and v's own kcore back to v, so no vertex ever received a neighbour's value. The algorithm converged immediately after one superstep returning kcore = degree for all vertices.

Fix: swap Pregel.src <-> Pregel.dst so each endpoint receives the opposite end's value, matching the Mandal & Hasan (2017) algorithm.

Also:

  • Document that the algorithm, by definition, expects an undirected graph; supplying both (u,v) and (v,u) double-counts and produces wrong results.
  • Fix star-graph test: expected center kcore 3 -> 1 (correct textbook value; no 2-core can form when leaves each have one neighbour).
  • Replace chain-graph test: was a copy of the triangle test with the same cyclic edges; now uses an actual open chain (0-1-2) with the correct assertion kcore=1 for all vertices and an explanatory comment.
  • Tighten medium-graph range assertions to values produced by the correct algorithm (distinct count >= 3, central vertex >= 3).
  • Add triangle-with-tail regression test with exact oracle values (kcore={1:2,2:2,3:2,4:1,5:1}) that catches the swapped-message bug.

Tested with Spark 3.5.x, 4.0.x, 4.1.x, and verified against NetworkX core_number().

Each directed edge (u,v) was sending u's own kcore back to u and v's
own kcore back to v, so no vertex ever received a neighbour's value.
The algorithm converged immediately after one superstep returning
kcore = degree for all vertices.

Fix: swap Pregel.src <-> Pregel.dst so each endpoint receives the
opposite end's value, matching the Mandal & Hasan (2017) algorithm.

Also:
- Document that the implementation treats the graph as undirected
  (total degree + bidirectional message passing per edge); supplying
  both (u,v) and (v,u) double-counts and produces wrong results.
- Fix star-graph test: expected center kcore 3 -> 1 (correct textbook
  value; no 2-core can form when leaves each have one neighbour).
- Replace chain-graph test: was a copy of the triangle test with the
  same cyclic edges; now uses an actual open chain (0->1->2) with the
  correct assertion kcore=1 for all vertices and an explanatory comment.
- Tighten medium-graph range assertions to values produced by the
  correct algorithm (distinct count >= 3, central vertex >= 3).
- Add triangle-with-tail regression test with exact oracle values
  (kcore={1:2,2:2,3:2,4:1,5:1}) that catches the swapped-message bug.

Co-Authored-By: Claude <noreply@anthropic.com>
@DavidGi DavidGi marked this pull request as ready for review March 4, 2026 09:48
Copy link
Collaborator

@SemyonSinchenko SemyonSinchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks a lot @DavidGi!

@SemyonSinchenko
Copy link
Collaborator

@DavidGi running pre-commit run --all-files should be enough. Or you can run scalafmtAll + scalafixAll from ths sbt

@DavidGi
Copy link
Contributor Author

DavidGi commented Mar 4, 2026

@DavidGi running pre-commit run --all-files should be enough. Or you can run scalafmtAll + scalafixAll from ths sbt

thanks @SemyonSinchenko , i've applied the formatting!

@SemyonSinchenko SemyonSinchenko merged commit 12ad0c8 into graphframes:main Mar 4, 2026
7 checks passed
@SemyonSinchenko
Copy link
Collaborator

Thanks for the contribution, @DavidGi !

@DavidGi DavidGi deleted the fix/kcore-swapped-pregel-messages branch March 4, 2026 13:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: KCore: algorithm returns wrong kcore values (kcore = degree for all vertices)

2 participants