
[ML] Framework for testing effect of various MKL settings #2714

Draft · wants to merge 1 commit into main
@edsavage (Contributor) commented Aug 30, 2024

Checkpointing current status for visibility.

Contains the build system changes required to compile and link code that refers to MKL functions, along with various scripts to exercise those functions and gather data on their impact.

Tools and packages needed for testing MKL settings with pytorch_inference

  • Start with the latest ml_linux_build Docker image (30)
  • Build heaptrack and gperftools and install them under /usr/local/gcc103 (a rough sketch of these steps follows after this list)
  • yum install python3 (needed to run the inference test scripts)
  • Install intel-oneapi-mkl-devel-2024.0 as per the linux_image Dockerfile, then copy the headers into the toolchain prefix:
(cd /opt/intel/oneapi/mkl/2024.0 && tar cf - include) | (cd /usr/local/gcc103 && tar xvf -)
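
A rough sketch of those setup steps as shell commands (the prefixes and configure options are assumptions based on the list above, not a tested recipe):

# gperftools (autotools build) into the toolchain prefix
git clone https://github.com/gperftools/gperftools.git
(cd gperftools && ./autogen.sh && ./configure --prefix=/usr/local/gcc103 && make -j`nproc` && make install)

# heaptrack (CMake build) into the same prefix
git clone https://github.com/KDE/heaptrack.git
cmake -S heaptrack -B heaptrack/build -DCMAKE_INSTALL_PREFIX=/usr/local/gcc103
cmake --build heaptrack/build -j`nproc` -t install

# Python for the test scripts
yum install -y python3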

Compiling the code

Check out the code in this PR on a Linux x86_64 machine and configure CMake as normal, but ensure that pytorch_inference is linked against libtcmalloc, e.g.

cmake -B cmake-build-relwithdebinfo -DLINK_TCMALLOC=ON
cmake --build cmake-build-relwithdebinfo -j`nproc` -t install
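
To confirm that the resulting binary really did pick up tcmalloc, a quick check (the path assumes the default build layout, relative to the repository root) is:

ldd build/distribution/platform/linux-x86_64/bin/pytorch_inference | grep tcmalloc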

Running pytorch_inference

There are several Python scripts in the bin/pytorch_inference directory that can run pytorch_inference against various models, for example:

python3 main.py elser_model_2_linux-x86_64.pt ../../build/distribution/platform/linux-x86_64/bin/pytorch_inference inference_requests.json --num_threads_per_allocation 8 --cache_size 274756282
python3 evaluate.py bert-base-uncased-fill-mask.pt --memory_benchmark --num_threads_per_allocation=4

These scripts can be tweaked in various ways before running. In the case of evaluate.py, edit the script to:

  • use either the gperftools heap profiler (heapprof) or heaptrack (an example of the former is shown after this list),
  • alter how many inferences are requested and in how many batches,
  • choose how frequently to send the mkl_free_buffers control request.
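
As a minimal sketch of the heap profiler route (assuming pytorch_inference is linked against libtcmalloc as above and inherits the environment from the Python script, which it does by default), it is enough to set HEAPPROFILE before launching:

HEAPPROFILE=/tmp/heapprof python3 evaluate.py bert-base-uncased-fill-mask.pt --memory_benchmark --num_threads_per_allocation=4

This is what produces the /tmp/heapprof.NNNN.heap snapshots discussed below.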

Viewing results

If pytorch_inference is run under the heap profiler, a reasonably large number of output files is generated, e.g. /tmp/heapprof.0040.heap. These need to be post-processed with a tool called pprof, e.g.:

pprof ../../build/distribution/platform/linux-x86_64/bin/pytorch_inference /tmp/heapprof.0040.heap --pdf > pytorch_inference_heapprof_0040.pdf

to generate a PDF of the heap profile (other output formats are available).
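
For a quick look without generating graphics, pprof can also print a plain text summary of the same data, e.g.:

pprof --text ../../build/distribution/platform/linux-x86_64/bin/pytorch_inference /tmp/heapprof.0040.heap | head -20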

Heaptrack has its own GUI for viewing results (https://github.com/KDE/heaptrack?tab=readme-ov-file#heaptrack_gui), but can also display results as plain text.
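
For the plain text route, heaptrack_print summarises a recorded data file; the file name below is a placeholder for whatever heaptrack actually wrote (it embeds the PID, and the extension may be .gz or .zst depending on how heaptrack was built):

heaptrack_print heaptrack.pytorch_inference.<pid>.gz | less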
