Support sparse/overlapping labels #12

Open
ericphanson opened this issue Jan 26, 2023 · 1 comment

Comments

@ericphanson
Member

We have tables of labels, which could be sparse throughout a recording, and could be overlapping. That makes them a bad fit for LabeledSignals. Is there another way to use OndaBatches? What needs to happen here to support that?

@kleinschmidt kleinschmidt transferred this issue from another repository Jan 31, 2023
@kleinschmidt
Member

kleinschmidt commented Mar 14, 2023

I think there are two (possibly related) cases we may want to support here:

  • sparse/overlapping labels where you only want to load exactly the labeled span in a batch (e.g., you have a bunch of 1s events, and you want to build batches by stacking those 1s items into a batch)
  • sparse/overlapping labels where you want to build batches for longer spans that may overlap some labels (handling the unlabeled bits properly)
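To make the two cases concrete, here is a minimal sketch in plain Python. It is not OndaBatches code: the `(start, stop, label)` tuple layout, the function names, and the use of `None` for unlabeled samples are all assumptions made for illustration, and spans are given directly in sample indices.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical layout: (start_sample, stop_sample_exclusive, label)
Span = Tuple[int, int, str]


def batch_from_labeled_spans(signal: Sequence[float], spans: List[Span]):
    """Case 1: one batch item per label, covering exactly the labeled span."""
    xs = [list(signal[start:stop]) for start, stop, _ in spans]
    ys = [label for _, _, label in spans]
    return xs, ys


def labels_for_window(window_start: int, window_stop: int,
                      spans: List[Span]) -> List[Optional[str]]:
    """Case 2: per-sample labels for an arbitrary window.

    Samples not covered by any span stay None so they can be masked later.
    Overlaps are not resolved here: a later span overwrites an earlier one.
    """
    out: List[Optional[str]] = [None] * (window_stop - window_start)
    for start, stop, label in spans:
        lo = max(start, window_start)
        hi = min(stop, window_stop)
        for i in range(lo, hi):
            out[i - window_start] = label
    return out


signal = list(range(10))
spans = [(1, 3, "a"), (6, 8, "b")]
xs, ys = batch_from_labeled_spans(signal, spans)
window_labels = labels_for_window(0, 5, spans)
```

Case 1 only works as a stacked batch when every span has the same length (as with the 1s events above). Case 2 is where the unlabeled bits need explicit handling, here a `None` placeholder.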

If the sparse label starts/stops are aligned with some reasonable sampling rate (edit: and there are no overlaps), we could already support the first one (if we move some internal code into this package). However, in using that code, we're running into trouble when some spans' starts/stops are not aligned with the label sampling rate (e.g., if we assume labels are sampled at 1Hz and we have spans like 1.4-2.6s). In that situation, we have to decide how to convert the span into a 1Hz signal. Up to this point we've been able to get by by "snapping" the starts/stops to sample times, which allows us to use the standard mechanism implemented here, but an alternative would be to create an all-zero signal and index into it with an AlignedSpan for each label. Dealing with overlaps would be annoying though, unless you do it with "soft labels" (adding up the votes associated with the one-hot encoding of hard labels, or just using soft labels directly).
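A rough sketch of both ideas in plain Python, under stated assumptions: sample `i` is taken to cover the interval `[i / rate, (i + 1) / rate)`, the two rounding helpers are hypothetical and are not the AlignedSpans API, and a nested list stands in for the all-zero label signal. It shows why a span like 1.4-2.6s is awkward at 1Hz (rounding inward leaves no samples at all), and how summing one-hot votes turns overlaps into soft labels.

```python
import math
from typing import List, Tuple

TimeSpan = Tuple[float, float, str]  # hypothetical: (start_s, stop_s, label)


def snap_outward(start_s: float, stop_s: float, rate_hz: float = 1.0):
    """Half-open sample range [first, end) of every sample the span touches."""
    return math.floor(start_s * rate_hz), math.ceil(stop_s * rate_hz)


def snap_inward(start_s: float, stop_s: float, rate_hz: float = 1.0):
    """Half-open sample range of samples fully inside the span (may be empty)."""
    first = math.ceil(start_s * rate_hz)
    end = math.floor(stop_s * rate_hz)
    return first, max(first, end)


def soft_labels(spans: List[TimeSpan], classes: List[str],
                n_samples: int, rate_hz: float = 1.0) -> List[List[float]]:
    """Start from an all-zero (n_samples x n_classes) signal and add one vote
    per label over its snapped span, then normalize rows that got any votes.
    Rows left all-zero mark unlabeled samples."""
    votes = [[0.0] * len(classes) for _ in range(n_samples)]
    for start_s, stop_s, label in spans:
        first, end = snap_outward(start_s, stop_s, rate_hz)
        k = classes.index(label)
        for i in range(max(first, 0), min(end, n_samples)):
            votes[i][k] += 1.0
    for row in votes:
        total = sum(row)
        if total > 0:
            for k in range(len(row)):
                row[k] /= total
    return votes


outward = snap_outward(1.4, 2.6)   # samples 1 and 2
inward = snap_inward(1.4, 2.6)     # empty range: no 1Hz sample fits inside
labels = soft_labels([(1.4, 2.6, "a"), (2.0, 4.0, "b")], ["a", "b"], n_samples=5)
```

Which rounding direction is right depends on the use case; the point of the sketch is only that the choice has to be made explicitly once spans stop lining up with the label sampling rate.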
