Closes 3428 putmask optimization #3749

drculhane · 2024-09-05T13:34:52Z

Adds the distributed/aggregated code for putmask

closes #3428
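For readers unfamiliar with the semantics being distributed here: `np.putmask(a, mask, values)` sets `a.flat[n] = values[n]` wherever `mask.flat[n]` is true, and repeats `values` when it is shorter than `a`. The sketch below spells that rule out in plain NumPy. `putmask_reference` is a hypothetical helper written for illustration only; it is not part of arkouda and says nothing about how the Chapel side is implemented.

```python
import numpy as np


def putmask_reference(a, mask, values):
    """Illustrative reference: out.flat[n] = values[n % len(values)] where mask is True."""
    values = np.asarray(values)
    out = a.copy()
    flat = out.reshape(-1)  # view onto the fresh (contiguous) copy
    for n in np.flatnonzero(np.asarray(mask).ravel()):
        flat[n] = values.flat[n % values.size]  # values wrap around when shorter
    return out


data = np.array([7, 1, 9, 3, 8, 6])
mask = data > 5
short_values = np.arange(3)

# NumPy's own result, for comparison
expected = data.copy()
np.putmask(expected, mask, short_values)

result = putmask_reference(data, mask, short_values)
print(result, np.array_equal(result, expected))
```

Note that the index into `values` depends on the global position `n`, not on how many masked elements came before it, which is why a distributed version needs each locale to know its global offsets.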

ajpotts

Great work! Just some minor comments.

arkouda/numeric.py

tests/numeric_test.py

arkouda/numeric.py

src/EfuncMsg.chpl

stress-tess · 2024-09-17T16:23:38Z

@drculhane I was in the process of reviewing this and was seeing a lot of the same things as @ajpotts before I realized that the changes made in response to her review comments hadn't been pushed yet. So I'm just dropping this note in case that's not on purpose.

stress-tess

I know this hasn't been updated yet, but I wanted to go ahead and post this review because I found a small perf bug.

src/EfuncMsg.chpl

stress-tess · 2024-09-18T14:53:14Z

Here are my findings for the putmask performance I'm seeing on my laptop before and after this PR. This isn't as accurate as the communication numbers, since this is only simulated multi-locale, but I don't know of a great way to do comm diagnostics on the original, since it was implemented on the Python side (i.e. it calls out to lots of different chpl functions).

I copied @drculhane's tests for the different cases into a python script with problem size 10**6

python code for `putmask_timing.py`

```python
import time

import numpy as np

import arkouda as ak

ak.connect()
prob_size = 10**6

# stealing from andy's tests for the different edge cases

# values same size as data
print("Case 1: Values same size")
nda = np.random.randint(0, 10, prob_size)
pda = ak.array(nda)
nda2 = nda**2
pda2 = ak.array(nda2)
hold_that_thought = nda.copy()
np.putmask(nda, nda > 5, nda2)

t1 = time.time()
ak.putmask(pda, pda > 5, pda2)
t2 = time.time()
print(f"took {t2-t1} seconds")
print(f"matches numpy: {np.allclose(nda, pda.to_ndarray())}\n")

# values potentially much shorter than data
print("Case 2: Values much shorter")
nda = hold_that_thought.copy()
pda = ak.array(nda)
npvalues = np.arange(3)
akvalues = ak.array(npvalues)
np.putmask(nda, nda > 5, npvalues)

t1 = time.time()
ak.putmask(pda, pda > 5, akvalues)
t2 = time.time()
print(f"took {t2-t1} seconds")
print(f"matches numpy: {np.allclose(nda, pda.to_ndarray())}\n")

# values shorter than data, but likely not to fit on one locale in a multi-locale test
print("Case 3: Values shorter but not small enough to fit on a single locale")
nda = hold_that_thought.copy()
pda = ak.array(nda)
npvalues = np.arange(prob_size // 2 + 1)
akvalues = ak.array(npvalues)
np.putmask(nda, nda > 5, npvalues)

t1 = time.time()
ak.putmask(pda, pda > 5, akvalues)
t2 = time.time()
print(f"took {t2-t1} seconds")
print(f"matches numpy: {np.allclose(nda, pda.to_ndarray())}\n")

# values longer than data
print("Case 4: Values longer")
nda = hold_that_thought.copy()
pda = ak.array(nda)
npvalues = np.arange(prob_size + 1000)
akvalues = ak.array(npvalues)
np.putmask(nda, nda > 5, npvalues)

t1 = time.time()
ak.putmask(pda, pda > 5, akvalues)
t2 = time.time()
print(f"took {t2-t1} seconds")
print(f"matches numpy: {np.allclose(nda, pda.to_ndarray())}\n")
```

Timings on master:

```
Case 1: Values same size
took 0.09863877296447754 seconds
matches numpy: True

Case 2: Values much shorter
took 188.12161302566528 seconds
matches numpy: True

Case 3: Values shorter but not small enough to fit on a single locale
took 0.24857282638549805 seconds
matches numpy: True

Case 4: Values longer
took 0.13264894485473633 seconds
matches numpy: True
```

Timings using this PR:

```
Case 1: Values same size
took 0.03622913360595703 seconds
matches numpy: True

Case 2: Values much shorter
took 0.060797929763793945 seconds
matches numpy: True

Case 3: Values shorter but not small enough to fit on a single locale
took 0.19672393798828125 seconds
matches numpy: True

Case 4: Values longer
took 0.058113813400268555 seconds
matches numpy: True
```
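To make the comparison easier to read, the per-case ratio between the two runs can be computed directly from the numbers reported above. This is only arithmetic on those single measurements (one run each, simulated multi-locale on a laptop), so the ratios should be treated as rough; the dictionary names are just for illustration.

```python
# Wall-clock seconds copied from the two runs reported above.
master_seconds = {
    "Case 1": 0.09863877296447754,
    "Case 2": 188.12161302566528,
    "Case 3": 0.24857282638549805,
    "Case 4": 0.13264894485473633,
}
pr_seconds = {
    "Case 1": 0.03622913360595703,
    "Case 2": 0.060797929763793945,
    "Case 3": 0.19672393798828125,
    "Case 4": 0.058113813400268555,
}

# Speedup = time on master / time with this PR (values > 1 mean the PR is faster).
speedup = {case: master_seconds[case] / pr_seconds[case] for case in master_seconds}

for case, factor in speedup.items():
    print(f"{case}: {factor:,.2f}x faster")
```

Cases 1 and 4 come out at a bit more than 2x, case 3 at roughly 1.3x, and case 2 at roughly three orders of magnitude.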

So I'm seeing better performance across the board, but the significant gain comes from case 2 (about 188 s on master versus about 0.06 s with this PR), where we localize the values to each locale when they are small enough. I'm glad this is the case, considering this was the biggest reason I suggested the optimization in the first place (see my comment on the original PR).
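As a rough illustration of the "localize the values to each locale" idea, here is a NumPy-only simulation. It is a sketch under assumptions, not the Chapel code in `src/EfuncMsg.chpl`: locales are imitated by contiguous 1-D blocks, and `putmask_chunked` and `num_locales` are hypothetical names. Each block takes its own copy of the small `values` array once, then indexes into that copy using global positions, so no per-element lookup has to reach outside the block.

```python
import numpy as np


def putmask_chunked(a, mask, values, num_locales=4):
    """Simulated per-locale putmask for 1-D arrays (illustrative only)."""
    out = a.copy()
    # Global start/end offsets of each contiguous block (stand-ins for locales).
    bounds = [i * a.size // num_locales for i in range(num_locales + 1)]
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        local_values = values.copy()  # stands in for one bulk copy of the small array
        global_idx = np.arange(lo, hi)  # global positions owned by this block
        hit = mask[lo:hi]
        block = out[lo:hi]  # view into out, so the assignment below updates out
        block[hit] = local_values[global_idx[hit] % local_values.size]
    return out


rng = np.random.default_rng(0)
data = rng.integers(0, 10, size=1000)
values = np.arange(3)

# NumPy's result on the whole array, for comparison
expected = data.copy()
np.putmask(expected, data > 5, values)

chunked = putmask_chunked(data, data > 5, values, num_locales=4)
print(np.array_equal(chunked, expected))
```

The point of the design is that the cost of replicating `values` is paid once per block rather than once per masked element, which only makes sense when `values` is small relative to the data.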

Great job @drculhane!!!! 🎉

jaketrookman

Looks good to me

stress-tess

all the logic looks good to me and a great speedup!! 🚀

drculhane requested review from ajpotts, stress-tess and jaketrookman September 5, 2024 13:34

ajpotts approved these changes Sep 5, 2024

stress-tess reviewed Sep 18, 2024

src/EfuncMsg.chpl (resolved)

src/EfuncMsg.chpl (outdated, resolved)

Rebased

e4bf8fb

drculhane force-pushed the Closes-3428-putmask-optimization branch from 4688a30 to de49188 Compare September 18, 2024 17:59

drculhane and others added 4 commits September 18, 2024 14:01

Rebased with Updates

de49188

fixes flake8 errors

d79a6db

Fixes multi-dim make issue

14ea085

Merge branch 'master' into Closes-3428-putmask-optimization

6f4a3b2

jaketrookman approved these changes Sep 18, 2024

stress-tess approved these changes Sep 18, 2024

stress-tess added this pull request to the merge queue Sep 18, 2024

Merged via the queue into Bears-R-Us:master with commit 07dc332 Sep 18, 2024
10 checks passed
