[Bug] Map function running multiple times when using Stage object as result generator #109

Open
joejonespushsecurity opened this issue Sep 5, 2023 · 0 comments
Labels
bug Something isn't working

Describe the bug
We are not sure whether this is a bug or a misunderstanding on our part of how this is meant to work. When mapping a function over a list of objects, we expected the mapped function to run only once per item, no matter how many times we iterate over the returned Stage object or convert it to a list, as we do below.

Minimal code to reproduce

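import pypeln as pl
from datetime import datetime

# Note: the name `map` shadows the built-in; it is kept so the output below matches.
def map(count):
    # A set holding the input value and the time at which this call ran.
    return {count, datetime.now()}

res = pl.task.map(map, [1, 2])

print(res)        # the Stage object itself
print(list(res))  # first pass over the stage
print(list(res))  # second pass over the same stage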

The output we get from the above is

Stage(process_fn=Map(f=<function map at 0x1006ed3a0>), workers=1, maxsize=0, total_sources=1, timeout=0, dependencies=[Stage(process_fn=FromIterable(iterable=[1, 2], maxsize=0), workers=1, maxsize=0, total_sources=1, timeout=0, dependencies=[], on_start=None, on_done=None, f_args=[])], on_start=None, on_done=None, f_args=['count'])
[{1, datetime.datetime(2023, 9, 5, 11, 5, 44, 348977)}, {datetime.datetime(2023, 9, 5, 11, 5, 44, 349063), 2}]
[{1, datetime.datetime(2023, 9, 5, 11, 5, 44, 349627)}, {2, datetime.datetime(2023, 9, 5, 11, 5, 44, 349685)}]

As you can see, the datetime objects are regenerated each time we build a list from the Stage object. Is that the intended behavior?
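To make the observation concrete without depending on pypeln, here is a minimal plain-Python sketch of an object that behaves the same way. `LazyMap` and `stamp` are hypothetical names invented for this illustration; this is not pypeln's implementation, only an analogy for an iterable that re-runs its function on every pass.

```python
from datetime import datetime


class LazyMap:
    """Hypothetical stand-in for a lazy stage (not pypeln's implementation)."""

    def __init__(self, fn, items):
        self.fn = fn
        self.items = items
        self.calls = 0  # counts how many times fn has been invoked

    def __iter__(self):
        # Every new iteration starts from scratch and calls fn again.
        for item in self.items:
            self.calls += 1
            yield self.fn(item)


def stamp(count):
    # A tuple keeps the element order stable, unlike the set in the report.
    return (count, datetime.now())


stage = LazyMap(stamp, [1, 2])
first = list(stage)
second = list(stage)
print(stage.calls)  # two items, two passes: fn ran four times
```

If Stage behaves like this, each `list(res)` would be a fresh run of the pipeline rather than a replay of earlier results, which would match the differing timestamps above.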

Expected behavior
We expected the results to be cached within the Stage object so that the mapped function runs only once. We would therefore expect the same result no matter how many times we convert the Stage into a list. The same thing happens if we iterate over the Stage object in a for loop.
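For reference, the caching we had in mind can be sketched in plain Python as a thin wrapper that runs its source once and replays the stored results afterwards. `CachedIterable`, `Rerunnable`, and `stamp` are hypothetical names used only for this illustration, and `Rerunnable` stands in for any object that recomputes on each pass. We assume, without having verified it, that a Stage could be passed as the source in the same way; simply calling `list(res)` once and reusing that list would have the same effect.

```python
from datetime import datetime

calls = []  # records every invocation of stamp


def stamp(count):
    calls.append(count)
    return (count, datetime.now())


class Rerunnable:
    """Hypothetical source that recomputes its items on every iteration."""

    def __iter__(self):
        return (stamp(c) for c in [1, 2])


class CachedIterable:
    """Hypothetical helper: consume the source once, then replay the results."""

    def __init__(self, source):
        self._source = source
        self._cache = None

    def __iter__(self):
        if self._cache is None:
            # First pass: run the source and remember what it produced.
            self._cache = list(self._source)
        return iter(self._cache)


cached = CachedIterable(Rerunnable())
first = list(cached)
second = list(cached)
print(first == second)  # both passes return the stored results
```

The trade-off is memory: the wrapper holds every result, so it only suits outputs that fit comfortably in memory.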

Library Info

import pypeln
print(pypeln.__version__)

Pypeln version 0.4.9

Screenshots
N/A

Additional context
N/A

@joejonespushsecurity joejonespushsecurity added the bug Something isn't working label Sep 5, 2023