Non-blocking PUT in CHPL_COMM=ofi #25977

Merged
merged 25 commits into from
Oct 9, 2024

Commits on Oct 9, 2024

  1. Non-blocking PUT implementation

    Previously, non-blocking PUTs were implemented via blocking PUTs, which could
    severely limit performance. Prior to 2.0, small PUTs invoked fi_inject_write,
    which essentially turned them into non-blocking PUTs, but chpl_comm_put
    returned as if the PUT had completed. This could cause memory consistency
    model (MCM) violations as well as hangs caused by not progressing the network
    stack properly. These deficiencies were fixed in 2.0, but the fix led to a
    performance regression. This commit implements true non-blocking PUTs, so
    that the chpl_comm_*nb* functions behave as intended. This should restore
    1.32.0 performance while avoiding MCM violations and hangs.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    ffbfdd3
  2. Added comments

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    4ba2770
  3. Free non-blocking handle after operation completes

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    132677d
  4. Cleanup

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    11eccc1
  5. Rewrote PUT logic

    Rewrote PUT logic so that low-level functions are non-blocking, and a blocking
    PUT is implemented by initiating a non-blocking PUT and waiting for it to
    complete. This simplifies the implementation and avoids code duplication.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    a77c004
  6. Add environment variables for testing

    Allow specifying the maximum message size and maximum number of endpoints
    via environment variables. These are intended primarily for testing.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    08861b5
  7. Free dynamically-allocated handles in ofi_put

    Also some code cleanup.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    6baa0e0
  8. Change forceMemFxVisAllNodes to work on unbound endpoints

    We are now using this function to force visibility when an unbound endpoint is
    released, so it needs to work on unbound endpoints.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    2e0c389
  9. Improved tci debugging output

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    47e75cb
  10. Allocate visibility bitmaps for unbound endpoints

    Operations to force visibility are deferred until the endpoint is released,
    which requires the visibility bitmaps.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    9f4079e
  11. Fixed number of transmit contexts computation

    Fixed how the number of transmit contexts needed is computed, and added some
    comments.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    06e11af
  12. Added tciAlloc call site debug info

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    1872b30
  13. Change type of numTxCtxs and numRxCtxs to size_t

    Change type of numTxCtxs and numRxCtxs to size_t to match type of
    info->domain_attr->ep_cnt.
    
    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    b07b36d
  14. numTxCtxs is now of type size_t

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    27f9312
  15. Better comments

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    e3bc955
  16. Remove trailing whitespace

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    80d1bc1
  17. Run bigTransfer test with unbound endpoints

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    b58f4e0
  18. Only run PUT

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    a174b15
  19. Run bigTransfer tests with small fabric message size

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    c06b0b4
  20. Add chpl_comm_free_nb_handle to CHPL_COMM=none

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    c4ff0a1
  21. Add chpl_comm_free_nb_handle to CHPL_COMM=gasnet

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    95a418e
  22. Added chpl_comm_free_nb_handle to CHPL_COMM=ugni

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    1efd4f4
  23. Added chpl_comm_free_nb_handle to gasnet-ex

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    9bbb972
  24. Fixed typos

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    2dfb923
  25. Addressed reviewer's comments

    Signed-off-by: John H. Hartman <jhh67@users.noreply.github.com>
    jhh67 committed Oct 9, 2024
    b260c4f
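
Illustration of the design described in commits 1, 3, and 5: the low-level PUT is non-blocking, a blocking PUT is a non-blocking PUT followed by a wait, and the handle is freed once the operation completes. The sketch below is a toy model of that layering, not the runtime's code. Every `sketch_*` name and the handle layout are hypothetical, and a `memcpy` stands in for the network transfer that the real implementation would post through libfabric.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle for one in-flight PUT (not the Chapel runtime's type). */
typedef struct {
  void *dst;
  const void *src;
  size_t size;
  bool complete;
} sketch_nb_handle;

/* Initiate a PUT and return immediately. Here the transfer is only recorded;
   a real implementation would post it to the network instead. */
static sketch_nb_handle *sketch_put_nb(void *dst, const void *src, size_t size) {
  sketch_nb_handle *h = malloc(sizeof(*h));
  if (h == NULL) return NULL;
  h->dst = dst;
  h->src = src;
  h->size = size;
  h->complete = false;
  return h;
}

/* Stand-in for progressing the network stack: performs the pending copy
   and marks the operation complete. */
static void sketch_progress(sketch_nb_handle *h) {
  if (!h->complete) {
    memcpy(h->dst, h->src, h->size);
    h->complete = true;
  }
}

/* Drive progress once and report whether the PUT has completed. */
static bool sketch_test_nb(sketch_nb_handle *h) {
  sketch_progress(h);
  return h->complete;
}

/* Wait until the PUT has completed, driving progress while waiting. */
static void sketch_wait_nb(sketch_nb_handle *h) {
  while (!sketch_test_nb(h)) {
    /* spin; a real runtime might yield to other tasks here */
  }
}

/* Release the handle once the operation is known to be complete. */
static void sketch_free_nb_handle(sketch_nb_handle *h) {
  free(h);
}

/* Blocking PUT built from the non-blocking pieces: initiate, wait, free.
   Returns 0 on success, -1 if the handle could not be allocated. */
static int sketch_put(void *dst, const void *src, size_t size) {
  sketch_nb_handle *h = sketch_put_nb(dst, src, size);
  if (h == NULL) return -1;
  sketch_wait_nb(h);
  sketch_free_nb_handle(h);
  return 0;
}
```

In this model a caller of `sketch_put_nb` must not assume the destination has been written until `sketch_wait_nb` or `sketch_test_nb` reports completion, which mirrors the pre-2.0 problem described in commit 1, where chpl_comm_put returned as if an injected PUT had already completed. Keeping a single non-blocking path and layering the blocking call on top of it is what lets the two variants share code.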