[distGB] GraphBolt graph edge masks are filled with 0 if the edges have no initial mask #7846

Open
wants to merge 9 commits into
base: master

Collaborator

@CfromBU CfromBU commented Dec 12, 2024

Add a new parameter, "padding", to add_edge_attribute: edges that have no initial mask are filled with padding rather than 0.
When DistEdgeDataloader is used, padding is set to 1 so that edges with no mask are fully sampled.
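
The intended effect can be sketched in plain Python, without DGL. `fill_edge_mask` below is a hypothetical helper, not the actual add_edge_attribute implementation; it only illustrates, under that assumption, how the fill value decides whether edges that never received a mask stay eligible for sampling.

```python
def fill_edge_mask(num_edges, known_mask, padding=0):
    """Build a dense per-edge mask.

    known_mask maps edge id -> 0/1 for edges that carry an explicit mask;
    every other edge gets `padding`. Hypothetical helper for illustration.
    """
    mask = [padding] * num_edges
    for eid, value in known_mask.items():
        mask[eid] = value
    return mask


# Edges 0 and 1 have an explicit mask; edges 2 and 3 have none.
known = {0: 1, 1: 0}

zero_padded = fill_edge_mask(4, known, padding=0)
one_padded = fill_edge_mask(4, known, padding=1)

print(zero_padded)  # [1, 0, 0, 0]: unmasked edges 2 and 3 can never be sampled
print(one_padded)   # [1, 0, 1, 1]: unmasked edges 2 and 3 remain eligible
```

With a fill value of 0, an edge with no mask is indistinguishable from an edge that was explicitly masked out, which is the behavior this PR changes.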

@CfromBU
Collaborator Author

CfromBU commented Dec 12, 2024

@thvasilo, this PR can fix the GraphBolt mask issue in GraphStorm.

@CfromBU CfromBU requested a review from Rhett-Ying December 12, 2024 09:45
Collaborator

@Rhett-Ying Rhett-Ying left a comment

Please add dedicated test cases for the bug.

@@ -1858,6 +1859,92 @@ def test_local_sampling_heterograph(num_parts, use_graphbolt, prob_or_mask):
)


def check_mask_hetero_sampling_gb(tmpdir, num_server, use_graphbolt=True):
    def create_hetero_graph(dense=False, empty=False):
Contributor

I see there's already a function named create_random_hetero(). Could that be used here?

Collaborator Author

The graph created by create_random_hetero() contains r12, r23, and r13 relations.

When applying a mask to r12, neither DGLGraph nor GB format can sample the n3 node.
When applying a mask to r23, both DGLGraph and GB format can sample the n1 node.
When applying a mask to r13, both DGLGraph and GB format can sample the n2 node.

So that graph can't be used here, and I created a new function.

Contributor

OK. Can we add a comment explaining why this particular graph structure is needed for this test?

Collaborator Author

Oh, I re-ran the tests and found that my previous statement about create_random_hetero was incorrect. Specifically, before the bug fix, n1 couldn't be sampled when a mask was applied only to r23; after the fix, n1 can be sampled through r13. So I've switched to using create_random_hetero for the tests.
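
To illustrate the before/after behavior, here is a toy sketch in plain Python (no DGL). It assumes r13 runs from n1 nodes to n3 nodes and r23 from n2 nodes to n3 nodes, and it selects every unmasked in-neighbor instead of sampling randomly; the node names and the `sample_in_neighbors` helper are made up for the example.

```python
# Toy heterograph: relation name -> list of (src, dst) edges.
# Assumption: r13 connects n1 nodes to n3 nodes, r23 connects n2 nodes to n3 nodes.
edges = {
    "r13": [("n1_0", "n3_0"), ("n1_1", "n3_0")],
    "r23": [("n2_0", "n3_0"), ("n2_1", "n3_0")],
}

# Only r23 carries an explicit mask (one entry per edge).
explicit_mask = {"r23": [1, 0]}


def sample_in_neighbors(seed, padding):
    """Return every source node reachable over an edge whose mask is nonzero.

    Relations with no explicit mask get a mask filled with `padding`.
    """
    picked = []
    for rel, rel_edges in edges.items():
        mask = explicit_mask.get(rel, [padding] * len(rel_edges))
        for (src, dst), keep in zip(rel_edges, mask):
            if dst == seed and keep:
                picked.append(src)
    return picked


print(sample_in_neighbors("n3_0", padding=0))  # ['n2_0']
print(sample_in_neighbors("n3_0", padding=1))  # ['n1_0', 'n1_1', 'n2_0']
```

With padding 0, the unmasked r13 relation contributes nothing, so no n1 node is reached; with padding 1, the n1 nodes are reached through r13.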

Comment on lines 1938 to 1941
@pytest.mark.parametrize("num_parts", [1])
def test_local_masked_sampling_heterograph_gb(
    num_server,
):
Contributor

If there's only one value, do we need to parametrize?

Also, make sure the parametrized name string and the function argument have the same name.
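
A minimal sketch of the two options; the test names are hypothetical and the bodies are placeholders:

```python
import pytest


# Option 1: a single fixed value needs no parametrization.
def test_single_value():
    num_server = 1
    assert num_server == 1


# Option 2: when parametrizing, the name string must match the argument name.
@pytest.mark.parametrize("num_server", [1, 2])
def test_matching_name(num_server):
    assert num_server >= 1
```

If the string and the argument differ (for example "num_parts" against an argument named num_server), pytest reports an error at collection time because the function does not use the parametrized name.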

Collaborator Author

Sorry, the code here was a bit rough, but it has been modified now.
