Adds List.sample and List.sampleN #3398
🦋 Changeset detected. Latest commit: a241b8b. The changes in this PR will be included in the next version bump. This PR includes changesets to release 7 packages.
export function sampleN<T>(array: readonly T[], n: number, rng: PRNG): T[] {
  const size = Math.max(0, Math.floor(n));
  if (size === 0) return [];
  if (size >= array.length) return shuffle(array, rng);
Sampling 10 items from a list of 3 items returns the 3-item list. Is this intentional?
I'd expect it to create duplicates, which would be consistent with how Dist.sampleN works.
OK, I see that this defaults to sampling without replacement for lower N values, so it makes sense to keep it consistent.
It's a pity that it's inconsistent with how Dist.sampleN works, though. I assume it was intentional, and also the reason why you didn't make un-namespaced sampleN non-polymorphic?
Seems like a potential footgun...
How about this: we rename the version with replacement to pickN, and make sampleN consistent for lists and dists (with replacement)?
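To make the two behaviours under discussion concrete, here is a minimal sketch of sampling with and without replacement. The names `sampleWithReplacement`, `sampleWithoutReplacement`, and `mulberry32` are hypothetical and are not part of this PR; the `PRNG` type is assumed to be a function returning a float in [0, 1).

```typescript
// Assumption: PRNG is a function returning a float in [0, 1).
type PRNG = () => number;

// Small seeded generator (mulberry32), used only to make the sketch reproducible.
function mulberry32(seed: number): PRNG {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: every draw is independent, so duplicates are possible
// and the result always has exactly `n` items (unless the list is empty).
function sampleWithReplacement<T>(array: readonly T[], n: number, rng: PRNG): T[] {
  const size = Math.max(0, Math.floor(n));
  if (array.length === 0) return [];
  const result: T[] = [];
  for (let i = 0; i < size; i++) {
    result.push(array[Math.floor(rng() * array.length)]);
  }
  return result;
}

// Hypothetical helper: partial Fisher-Yates shuffle on a copy, so each item
// appears at most once and the result is capped at the list length.
function sampleWithoutReplacement<T>(array: readonly T[], n: number, rng: PRNG): T[] {
  const size = Math.min(array.length, Math.max(0, Math.floor(n)));
  const pool = array.slice();
  for (let i = 0; i < size; i++) {
    const j = i + Math.floor(rng() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, size);
}

const demoRng = mulberry32(1);
// 10 draws from 3 items: length 10, so duplicates are unavoidable.
const demoWith = sampleWithReplacement(["a", "b", "c"], 10, demoRng);
// 10 requested from 3 items: capped at 3 distinct items.
const demoWithout = sampleWithoutReplacement(["a", "b", "c"], 10, demoRng);
```

With these two helpers, asking for 10 items from a 3-item list gives 10 items in the first case and 3 in the second, which is the inconsistency raised above.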
Good points. Agreed it's gnarly.
Another option is to add configs. Maybe something like,
sampleN(samples, {withReplacement: boolean, shuffle: boolean})
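A rough sketch of what such an options object could look like. Everything here is an assumption for illustration: the name `sampleNWithOptions`, the defaults, and in particular the meaning of `shuffle` (taken here to mean "randomize the order of the result", with `false` keeping the picked items in their original list order).

```typescript
// Assumption: PRNG is a function returning a float in [0, 1).
type PRNG = () => number;

// Hypothetical options shape; the defaults below are illustrative only.
interface SampleNOptions {
  withReplacement?: boolean; // default false in this sketch
  shuffle?: boolean; // default true; false keeps original list order
}

function sampleNWithOptions<T>(
  array: readonly T[],
  n: number,
  options: SampleNOptions = {},
  rng: PRNG = Math.random
): T[] {
  const { withReplacement = false, shuffle = true } = options;
  const requested = Math.max(0, Math.floor(n));
  if (array.length === 0 || requested === 0) return [];

  const indices: number[] = [];
  if (withReplacement) {
    // Independent draws: duplicates allowed, always `requested` items.
    for (let i = 0; i < requested; i++) {
      indices.push(Math.floor(rng() * array.length));
    }
  } else {
    // Partial Fisher-Yates over the index range: distinct items, capped at list length.
    const size = Math.min(requested, array.length);
    const pool = array.map((_, i) => i);
    for (let i = 0; i < size; i++) {
      const j = i + Math.floor(rng() * (pool.length - i));
      [pool[i], pool[j]] = [pool[j], pool[i]];
      indices.push(pool[i]);
    }
  }

  // With shuffle disabled, return the picked items in original list order.
  if (!shuffle) indices.sort((a, b) => a - b);
  return indices.map((i) => array[i]);
}
```

For example, `sampleNWithOptions([1, 2, 3], 10, { withReplacement: true })` would return 10 items, while the same call without options would return 3.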
Yeah, options could work, but if we want to combine this with polymorphism on dists vs lists (which seems natural and good), and have different replacement defaults, then it's still problematic.
const index = Math.floor(rng() * array.length);
if (!indices.has(index)) {
  indices.add(index);
  result.push(array[index]);
This doesn't look optimal, but I tested it on List.make(n,1) -> List.sampleN(n-1), and even for n = 1M it does only 14M loops, so I guess it's fine.
(asymptote is probably something like O(n*log(n)))
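The loop count can be checked with a small sketch. Assuming the loop keeps drawing uniform random indices and rejects repeats (as in the snippet above), drawing `size` distinct indices from `length` is the coupon collector problem, and the expected number of iterations is the sum of `length / (length - i)` for `i` from 0 to `size - 1`. For `size = length - 1` this is `length * (H_length - 1)`, which grows like n*log(n) and comes to roughly 13.4M for n = 1M, in line with the measurement above. The function names below are hypothetical.

```typescript
// Assumption: PRNG is a function returning a float in [0, 1).
type PRNG = () => number;

// Hypothetical helper: counts iterations of a rejection loop that draws
// uniform indices until `size` distinct ones have been collected.
function countRejectionLoops(length: number, size: number, rng: PRNG = Math.random): number {
  if (size > length) throw new Error("size must not exceed length");
  const indices = new Set<number>();
  let loops = 0;
  while (indices.size < size) {
    loops++;
    const index = Math.floor(rng() * length);
    if (!indices.has(index)) indices.add(index);
  }
  return loops;
}

// Expected iteration count: once i distinct indices are held, a draw is new
// with probability (length - i) / length, so it takes length / (length - i)
// draws on average to find the next one.
function expectedLoops(length: number, size: number): number {
  let total = 0;
  for (let i = 0; i < size; i++) total += length / (length - i);
  return total;
}

// Compare one simulated run against the expectation for a modest n.
const demoN = 10000;
const simulated = countRejectionLoops(demoN, demoN - 1);
const expected = expectedLoops(demoN, demoN - 1);
```

The worst case is `size` close to `length`; for small `size` relative to `length` the expected count stays close to `size` itself.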
Closes #3337