Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing primitive take_from(bag, amount) #209

Open
spicychickensauce opened this issue Mar 9, 2025 · 10 comments
Open

Missing primitive take_from(bag, amount) #209

spicychickensauce opened this issue Mar 9, 2025 · 10 comments

Comments

@spicychickensauce
Copy link

First of, thank you for creating and maintaining this great library!

I think there is a missing primitive which I needed quite often when writing generators: take_from(bag, amount)

Sample usage:

cards = [:jack, :queen, :king, :ace, 2, 3, 4, 5, 6, 7, 8, 9, 10]
check all(
        {hand_p1, remaining} <- take_from(cards, 2),
        {hand_p2, _remaining} <- take_from(remaining, 2)
      ) do
  # assert some poker logic
end

I think this should be in the library, since I struggled quite a bit with implementing it myself.
First I tried this horrible mess:
It works, but throws StreamData.TooManyDuplicatesError quite often.

defp take_from(bag, n) do
  indexed_bag =
    Enum.with_index(bag)

  StreamData.uniq_list_of(StreamData.one_of(indexed_bag |> Enum.map(&StreamData.constant/1)),
    length: n
  )
  |> StreamData.map(fn indexed_items ->
    {taken_items, taken_indexes} =
      Enum.unzip(indexed_items)

    taken_indexes = MapSet.new(taken_indexes)

    new_bag =
      indexed_bag
      |> Enum.reject(fn {_, i} -> MapSet.member?(taken_indexes, i) end)
      |> Enum.map(&elem(&1, 0))

    {new_bag, taken_items}
  end)
end

I then improved it quite a bit with this recursive version:

defp take_from(bag, n), do: do_take_from(bag, n, [])

defp do_take_from(bag, 0, acc), do: StreamData.constant({acc, bag})

defp do_take_from(bag, n, acc) do
  StreamData.one_of(Enum.with_index(bag) |> Enum.map(&StreamData.constant/1))
  |> StreamData.bind(fn {item, i} ->
    new_bag = List.delete_at(bag, i)
    do_take_from(new_bag, n - 1, [item | acc])
  end)
end

But I still think there could be a better version, probably one that would use Enum.take_random/2.

But I can't figure it out.

Any ideas? Could we get this into the library?

@spicychickensauce
Copy link
Author

Side note: I'm using this construct quite often, maybe one_of could be made to support lists as an argument directly?

StreamData.one_of(things |> Enum.map(&StreamData.constant/1))

@spicychickensauce spicychickensauce changed the title Missing primitive take_from(bag, length) Missing primitive take_from(bag, amount) Mar 9, 2025
@whatyouhide
Copy link
Owner

The issue with Enum.take_random/2 here is that shrinking won't really work properly. take_from/2 would have to take elements with a randomness that's proportionate to the current generator size, so that a lower generator size means that, say, take_from(bag, 2) will always take the first two elements. I’m open to having this version in the library but shrinking support is a must. Thoughts?

(As for the one_of + constant thing, can you open a separate issue?)

@spicychickensauce
Copy link
Author

I do not understand the intricacies of shrinking, so I can't really contribute much here.

Is one_of shrunk towards earlier elements? If so, I would expect the recursive version to indeed shrink towards earlier elements.

My actual main use case at the moment is generating chess positions. For this it's not clear what kind of shrinking would actually be desired.
I guess it would be to go from complex positions to simpler ones, which might mean to replace queens with pawns. So in that sense shrinking towards the beginning of the list would probably be correct and desired.

Other idea for more efficient implementation:
Shuffle the list, take first n elements.
(assuming shuffle does shrinking towards not shuffled)

I don't know enough haskell to understand the source, but maybe this helps?
https://hackage.haskell.org/package/QuickCheck-2.15.0.1/docs/Test-QuickCheck-Gen.html#v:shuffle

So maybe the missing primitive is actually shuffle?

@whatyouhide
Copy link
Owner

For this it's not clear what kind of shrinking would actually be desired.

Agreed for this case but all generators must think of what is a good way to shrink. In general yes shrinking towards "earlier" elements in the list is a good general strategy.

I think the primitive is not necessarily missing, we just need to extend member_of/1 to support "multiple members".

@spicychickensauce
Copy link
Author

If we extend it, then it should be called members_of, right?

I would still prefer take_from as I proposed, as that clearly implies that elements are removed from the bag as they are taken out. That's also why I included the new bag in the return value, as I always needed that.

@whatyouhide
Copy link
Owner

If the behavior is taking from the collection then yeah but I’m not sure that's helpful. The example above could be rewritten as

[card1, card2, card3, card4] <- members_of(bag, 4),
hand1 = {card1, card2},
hand2 = {card3, card4}

right?

@spicychickensauce
Copy link
Author

spicychickensauce commented Mar 10, 2025

members_of to me does not imply that it would prevent drawing the same card twice.
If it would behave that way then it probably should be named uniq_members_of.
But this now might sound too similar to uniq_list_of, not sure.
Also, I do want to actually allow duplicates, if they appear multiple times in the input list. E.g. 8 pawns, 2 bishops, etc..
So take_from still seems best. take_from_shuffled would also be quite good.

But maybe just having shuffle would really be enough, as then I could use StreamData.shuffle(bag) |> StreamData.map(& Enum.split(&1, n))
(Not quite the same as my recursive implementation, as it also shuffles the remaining list, but that would be ok for my use case)


The example I provided is a bit too simplified, as here taking 4 cards at once would work as well.
But for my actual use case it wouldn't work, since later draws depend on earlier draws. E.g. for my chess example:

In the first draw I place all white pieces, then I place the black pieces. The placement of the black pieces is constrained by where the white pieces where placed, for instance I can't place the black king right next to the white one.

@whatyouhide
Copy link
Owner

I think shuffling is probably a good base though, that's right.

@spicychickensauce
Copy link
Author

Alright then. I think I understand the Haskell implementation now and have replicated it below:

def shuffle(bag) do
  StreamData.list_of(StreamData.float(), length: length(bag))
  |> StreamData.map(fn sort ->
    Enum.zip(bag, sort)
    |> Enum.sort_by(&elem(&1, 1))
    |> Enum.unzip()
    |> elem(0)
  end)
end

Would that be good for a PR? Would this shrink towards unsorted? How do I test the shrinking?

@whatyouhide
Copy link
Owner

Don't worry about testing the shrinking for now. I think it would be nice to implement this with a lazy tree directly since it could be a pretty fundamental primitive, so optimizing would be great, but the implementation is a bit less intuitive. Maybe diving through the source a bit would help, plus https://elixir-lang.org/blog/2017/10/31/stream-data-property-based-testing-and-data-generation-for-elixir/ 🙃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants