-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PairG header-only library #1
base: master
Are you sure you want to change the base?
Conversation
The goal is to minimize the number of dependencies of PairG header-only library. PairG ends up to be only dependent on Kokkos and kokkos-kernels libraries. This commit decouples argument parsing module from header files which makes PairG library independent from `clipp` library. It also decouples `PaSGAL` from the library by using a template type for directed char graph.
The vg file included in the `tests` module has been accordingly updated.
These libraries recently support standalone CMake build system.
The function `get_max_result_nnz` has been renamed to `get_c_nnz` in commit kokkos-kernels@39e8503.
- employ modern CMake and raise minimum required version of CMake to 3.13 - use `find_package` to resolve dependencies - use installed dependencies if there are already present on the system - use `add_subdirectory` call or CMake `ExternalProject` module for bundled dependencies - change handling of Protobuf, Kokkos, kokkos-kernels, Catch2, htslib, and libvgio - define target `libpairg` for PairG library - define target `pairg` for PairG command-line tool - avoid modifying `CMAKE_CXX_FLAGS` since it is set by CMake for different build types - set compiler standard flag through `target_compile_feature` for `libpairg` target - different build type for `pairg` and `tests` targets - rename `BUILD_TESTS` to `BUILD_TESTING` which is more conventional and set it off by default **Note**: This commit makes `libvgio` a **required** dependency which was bundled before. This is because libvgio does not handle its dependencies properly (e.g. libhandlegraph). Therefore, it cannot be simply bundled by calling `add_subdirectory`. Considering that handling libvgio's dependencies should be done in its own package, it seems reasonable to ask users to have it installed as a requirement. This commit, however, does not remove `libvgio` submodule in case this change needs to be reverted to make it bundled again. The required procedure for bundling VGio is included but commented out because of aforementioned issue in VGio build script.
I briefly explained the reason behind each change in its corresponding commit message. |
Thanks! I'm curious to know if you are using PairG in your project(s)? |
Yes, I am trying out PairG for distance estimation between two alignments. |
You are welcome to contribute to this repo directly. Just sent an invite. |
In the last iteration of the while loop, the unnecessary matrix multiplication `A_copy` x `A_copy` will be avoided by this change.
Thanks for the invite. Just an update: I also extended the PairG a little bit to be able to index the whole human genome graph. For distance constraints of length 100bp, the original PairG index matrix would take around 1.7TB which is infeasible to construct based on my computational resources. While I managed to compress it to a 27GB-index with peak memory of 200GB during construction. I will share more details in the near future. I would be happy to make a branch and add my implementation in another PR. However, since I am writing my thesis right now, I am not sure when I get the time to make it. |
Great! |
- Use `GNUInstallDirs` module for installation directories - Define build and install interfaces for pairg target properties
The CMake build script no longer install bundled dependencies unless it is explicitly requested by using `USE_BUNDLED_<LIBRARY>` variables.
The template parameters for `spadd` have changed in the new version and unnecessary template parameters have been dropped. To adjust with it, we just let the compiler deduce the parameters.
Since the function definitions might appear across multiple translation units, they should be defined inline.
This commit also changes CMake minimum required version to 3.16.
All types; i.e. ordinal type (`lno_t`), device type, and scalar type can now be provided as a template parameter. The class is defaulted to the old hard-coded types.
All interface functions accept a template parameter specifying a specialisation of `matrixOps` template class. The default type for this parameter is the old `matrixOps` type.
To avoid multiple initialisation or finalisation of kokkos, they are called inside a global fixture.
This PR mainly refactors the build script and a little bit of the library itself in order to make it usable as a header-only library.