
Conversation

@JostMigenda (Collaborator)

As noted in #9, there are currently very few exercises in the optimisation part of the course. In particular, there are none on array broadcasting, despite it being one of the most powerful performance optimisations covered.

This turns one of the existing examples (vectorising over a pandas.Series) into an exercise. It also includes minor changes to make the existing example more legible.
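
For context, here is a minimal sketch of the comparison the new exercise is built around; the Series contents, the arithmetic and the repeat count are illustrative only, not the lesson's actual code:

import numpy as np
import pandas as pd
from timeit import timeit

s = pd.Series(np.arange(10_000))

def loop_over_series():
    # Element-wise Python loop: one interpreter round trip per element.
    out = pd.Series(np.zeros(len(s)))
    for i in range(len(s)):
        out[i] = s[i] * 2 + 1
    return out

def vectorised():
    # Single vectorised expression: the work happens in compiled NumPy/pandas code.
    return s * 2 + 1

repeats = 10
print(f"loop:       {timeit(loop_over_series, number=repeats):.2f}s total")
print(f"vectorised: {timeit(vectorised, number=repeats):.2f}s total")

On typical hardware the vectorised version is faster by several orders of magnitude, which is the gap the exercise asks learners to measure.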

@JostMigenda requested a review from Robadob on November 4, 2025 at 22:01
github-actions bot commented Nov 4, 2025

Thank you!

Thank you for your pull request 😃

🤖 This automated message can help you check the rendered files in your submission for clarity. If you have any questions, please feel free to open an issue in {sandpaper}.

If you have files that automatically render output (e.g. R Markdown), then you should check for the following:

  • 🎯 correct output
  • 🖼️ correct figures
  • ❓ new warnings
  • ‼️ new errors

Rendered Changes

🔍 Inspect the changes: https://github.com/carpentries-incubator/pando-python/compare/md-outputs..md-outputs-PR-18

The following changes were observed in the rendered markdown documents:

 md5sum.txt            |  2 +-
 optimisation-numpy.md | 62 ++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 57 insertions(+), 7 deletions(-)
What does this mean?

If you have source files that require output and figures to be generated (e.g. R Markdown), then it is important to make sure the generated figures and output are reproducible.

This output provides a way for you to inspect the output in a diff-friendly manner so that it's easy to see the changes that occur due to new software versions or randomisation.

⏱️ Updated at 2025-11-04 22:01:55 +0000

github-actions bot pushed a commit that referenced this pull request Nov 4, 2025
@Robadob (Collaborator) left a comment


I really don't like the redundant "divide repeats by 20, then multiply the time by 20".
IIRC, the 1000 repeats is a lazy gimmick of mine to avoid dividing the result to present an average; it's probably easier to drop the gimmick and refer to it as total time.

Otherwise the new exercise looks good, though it makes me tempted to extend the vectorisation coverage to include np.where() and other conditional operations, to demonstrate that it's more flexible than often assumed (see the sketch after this comment).

Approved in principle, though I'd get rid of the scaling you added in practice.

Not looking to increase your workload, so #19 for the future.
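
To illustrate the kind of conditional vectorisation mentioned above, here is a sketch (not part of this PR or the lesson) using np.where(): both branches are evaluated array-wide and the boolean condition selects per element, so even branching logic can stay vectorised:

import numpy as np

values = np.random.default_rng(1).random(100_000)

def conditional_loop():
    # Per-element branch in a Python loop.
    out = np.empty_like(values)
    for i, v in enumerate(values):
        out[i] = v * 2 if v > 0.5 else v / 2
    return out

def conditional_vectorised():
    # np.where() computes both branches for the whole array,
    # then picks one value per element based on the condition.
    return np.where(values > 0.5, values * 2, values / 2)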

print(f"for_range: {timeit(for_range, number=repeats)*10-gentime:.2f}ms")
print(f"for_iterrows: {timeit(for_iterrows, number=repeats)*10-gentime:.2f}ms")
print(f"pandas_apply: {timeit(pandas_apply, number=repeats)*10-gentime:.2f}ms")
print(f"for_range: {timeit(for_range, number=int(repeats/20))*20-gentime:.2f}ms") # scale with factor 20, otherwise it takes too long
A collaborator commented on the timing lines quoted above:

Isn't the scaling kind of redundant complexity here? Just repeat it 50 times and have lower absolute times?
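
A sketch of the suggested simplification, with stand-in function bodies (the real for_range, for_iterrows and pandas_apply are defined in the lesson material): a single fixed repeat count for every variant, and the result reported as a total time with no scaling factors:

from timeit import timeit
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(5_000)})

# Placeholder bodies; the lesson's functions do the real work.
def for_range():
    total = 0
    for i in range(len(df)):
        total += df["a"][i]
    return total

def for_iterrows():
    total = 0
    for _, row in df.iterrows():
        total += row["a"]
    return total

def pandas_apply():
    return df["a"].apply(lambda x: x * 2).sum()

repeats = 50  # small enough that even the slowest loop variant stays affordable
for fn in (for_range, for_iterrows, pandas_apply):
    print(f"{fn.__name__}: {timeit(fn, number=repeats):.2f}s total")

With one repeat count the printed numbers are directly comparable, and a per-call average can still be recovered by dividing by repeats if wanted.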

