[distGB] graphbolt graph edge's mask will be filled with 0 if these edges have no mask initial #7846
@thvasilo, this PR fixes the GraphBolt mask issue in GraphStorm.
Add dedicated test cases for the bug.
@@ -1858,6 +1859,92 @@ def test_local_sampling_heterograph(num_parts, use_graphbolt, prob_or_mask):
    )


def check_mask_hetero_sampling_gb(tmpdir, num_server, use_graphbolt=True):
    def create_hetero_graph(dense=False, empty=False):
I see there's already a function named create_random_hetero(). Could that be used here?
The graph created by create_random_hetero() contains r12, r23, and r13 relations.
When applying a mask to r12, neither DGLGraph nor GB format can sample the n3 node.
When applying a mask to r23, both DGLGraph and GB format can sample the n1 node.
When applying a mask to r13, both DGLGraph and GB format can sample the n2 node.
So that graph can't be used here, and I created a new function instead.
OK. Can we add a comment explaining why this particular graph structure is needed for this test?
Oh, I re-ran the tests and found that my previous statement about create_random_hetero was incorrect. Specifically, before the bug fix, n1 couldn't be sampled when a mask was applied only to r23, but after the fix, n1 can be sampled via r13. So I've switched to using create_random_hetero for the tests.
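For readers following the thread, the scenario can be sketched in plain Python without DGL. This is only an illustration under stated assumptions: the edge directions (r13 as n1 -> n3, r23 as n2 -> n3), the helper name reachable_src_types, and the toy edge list are all hypothetical and are not taken from the PR.

```python
from typing import Dict, List, Tuple

# Each in-edge of an n3 seed node: (source node type, relation, index within relation).
# Assumption: r13 goes n1 -> n3 and r23 goes n2 -> n3.
InEdge = Tuple[str, str, int]


def reachable_src_types(
    in_edges: List[InEdge],
    masks_by_relation: Dict[str, List[int]],
    padding: int,
) -> List[str]:
    """Return the source node types that survive mask-based sampling.

    Relations without an explicit mask fall back to `padding`.
    """
    kept = set()
    for src_type, relation, idx in in_edges:
        if relation in masks_by_relation:
            mask_value = masks_by_relation[relation][idx]
        else:
            mask_value = padding  # relation has no mask of its own
        if mask_value:
            kept.add(src_type)
    return sorted(kept)


in_edges_of_n3 = [("n1", "r13", 0), ("n2", "r23", 0), ("n2", "r23", 1)]
mask_only_on_r23 = {"r23": [1, 1]}

# Zero fill (behaviour described as the bug): r13 edges are silently dropped.
before_fix = reachable_src_types(in_edges_of_n3, mask_only_on_r23, padding=0)
# Fill with 1 (behaviour described as the fix): r13 edges stay eligible.
after_fix = reachable_src_types(in_edges_of_n3, mask_only_on_r23, padding=1)
print(before_fix, after_fix)
```

With a mask only on r23, the zero-filled variant reaches n2 alone, while the variant padded with 1 also reaches n1 through r13, which matches the observation above.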
@pytest.mark.parametrize("num_parts", [1])
def test_local_masked_sampling_heterograph_gb(
    num_server,
):
If there's only one value, do we need to parametrize?
Also, make sure the name in the parametrize string matches the name of the function argument.
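A minimal sketch of the two options raised in this comment, assuming pytest is installed. The test names and bodies are hypothetical placeholders, not the PR's actual tests.

```python
import pytest


# Option 1: a single value needs no parametrization; use a plain local instead.
def test_single_server_sketch():
    num_server = 1
    assert num_server >= 1


# Option 2: when parametrizing, the name in the string must match the argument.
# pytest reports an error at collection time if the function does not take
# an argument with the name given in the string.
@pytest.mark.parametrize("num_server", [1, 2])
def test_masked_sampling_sketch(num_server):
    assert num_server >= 1
```

In the hunk above, the string says "num_parts" while the function takes num_server, which is the mismatch being pointed out.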
Sorry, the code here was a bit rough, but it has been modified now.
Add a new parameter "padding" to add_edge_attribute; edges that have no mask are now filled with padding rather than 0.
When DistEdgeDataLoader is used, padding is set to 1 so that edges without a mask are fully sampled.
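The padding idea can be sketched in plain Python as follows. This is not the actual add_edge_attribute implementation: build_edge_mask, its signature, and the flat per-edge layout are assumptions made for illustration only.

```python
from typing import Dict, List


def build_edge_mask(
    edge_types: List[str],
    masks_by_type: Dict[str, List[int]],
    padding: int = 0,
) -> List[int]:
    """Return one mask value per edge, in the order given by `edge_types`.

    Edges whose type has an entry in `masks_by_type` take the next value
    from that entry; every other edge receives `padding`.
    """
    cursors = {etype: iter(values) for etype, values in masks_by_type.items()}
    return [
        next(cursors[etype]) if etype in cursors else padding
        for etype in edge_types
    ]


# Hypothetical flat edge list in which only r23 carries a mask.
edge_types = ["r13", "r23", "r23", "r13"]
masks_by_type = {"r23": [1, 0]}

zero_padded = build_edge_mask(edge_types, masks_by_type, padding=0)
one_padded = build_edge_mask(edge_types, masks_by_type, padding=1)
print(zero_padded)
print(one_padded)
```

Making the fill value a parameter leaves the zero default in place for callers that rely on it, while a caller that wants unmasked edges to remain fully sampled can pass 1.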