-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SleepStaging returns a yasa.Hypnogram instance #127
base: master
Are you sure you want to change the base?
Conversation
3320530
to
51d3fcd
Compare
I think it should stay as full spellings WAKE and REM, so SleepStaging should conform to Hypnogram. I see no huge benefit to abbreviations, and the full spellings are clearer, especially in the lower
Do you think it's possible to add a probabilities attribute to Of course in this case they could just be added to |
Good point. I'll make the change.
Yep, also a good idea. There's going to be some tricky edge cases. For example, when using |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #127 +/- ##
==========================================
- Coverage 92.59% 83.93% -8.66%
==========================================
Files 23 24 +1
Lines 3104 3536 +432
==========================================
+ Hits 2874 2968 +94
- Misses 230 568 +338 ☔ View full report in Codecov by Sentry. |
Oooh, you're right, that's a weird one. I think passing
Yes definitely. As you said, keep it where it is for now. I can submit a separate PR for that later. |
@raphaelvallat I don't want to throw any wrenches in at this point, but I just considered this. Maybe you can tell me why it's a bad idea? If users aren't going to need the I'm asking now, because if we liked this, then this would be a nice time to implement it without breaking the API, because you could actually keep the normal behavior of |
It's not a bad idea at all. I'll have to think more about it. My initial preference would be to vote no, mostly because I'm being lazy and because I will have much less time to work on YASA in the coming weeks, but also:
That said, I'm not against implementing the shortcut |
@remrama PR ready for review! I still need to update the changelog and FAQ but I think this can be done in a final PR before we release v0.7 — together with some cleaning of the existing notebooks. Also, if we decide to switch to Pooch than most of the examples might change anyway. |
Btw I'm removing the FutureWarning in |
I'm moving to this after #130 is finalized. Then we can swap reviews 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@raphaelvallat I forgot how rowdy this PR is. This Hypnogram class is soooo nice! Great idea and clear implementation.
I think you can merge this as soon as you'd like. Once it is in together with the evaluation module we can start to play around with them simultaneously and work out any kinks there.
My comments are minor, though I'll emphasize that I think the assertion messages are important. I love this new object-oriented approach to hypnograms, but it's also true that many users are transitioning to YASA and Python at the same time, and the object-oriented approach might be a bit more intimidating. Maybe not, but if so, a few helpful messages here-and-there might go a long way.
@@ -215,7 +219,7 @@ class Hypnogram: | |||
'%REM': 8.9713} | |||
""" | |||
|
|||
def __init__(self, values, n_stages=5, *, freq="30s", start=None, scorer=None): | |||
def __init__(self, values, n_stages=5, *, freq="30s", start=None, scorer=None, proba=None): | |||
assert isinstance( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right after checking that values
is a list, I would check they are all strings, with a useful error message telling the use that "yasa expects strings for each epoch now". The first thing I tried when playing around was:
>>> yasa.Hypnogram([1, 2, 3, 4])
# AttributeError: 'int' object has no attribute 'upper'
I'm quite sure this will happen for many users when upgrading to 0.7, and there should probably be as much hand-holding as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea — added a new check
@@ -228,6 +232,9 @@ def __init__(self, values, n_stages=5, *, freq="30s", start=None, scorer=None): | |||
assert isinstance( | |||
scorer, (type(None), str, int) | |||
), "`scorer` must be either None, or a string or an integer." | |||
assert isinstance( | |||
proba, (pd.DataFrame, type(None)) | |||
), "`proba` must be either None or a pandas.DataFrame" | |||
if n_stages == 2: | |||
accepted = ["W", "WAKE", "S", "SLEEP", "ART", "UNS"] | |||
mapping = {"WAKE": 0, "SLEEP": 1, "ART": -1, "UNS": -2} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A similar hand-holding opportunity is here, where I think it'd be nice to inform the user that they need to specify n_stages
if they are trying to do specify a hypnogram that has less than 5 possible stages.
>>> yasa.Hypnogram(["S", "W"])
# AssertionError: ['S' 'W'] do not match the accepted values for a 5 stages hypnogram: ['WAKE', 'W', 'N1', 'N2', 'N3', 'REM', 'R', 'ART', 'UNS']
Sidenote, and minor opinion, but I think the wording throughout should be "N-stage hypnogram" instead of "N stages hypnogram". It's varied in a few places, but I think stages should be singular and the hyphen should be there. To me it seems clearer that it's referring to the number of potential stages (which for many sleep researchers, having <5 is not really an obvious need).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I think adding the word "unique" to the property string would be helpful: <Hypnogram | 2 epochs x 30s (1.00 minutes), 2 unique stages>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See edits!
@@ -74,6 +74,10 @@ class Hypnogram: | |||
scorer : str | |||
An optional string indicating the scorer name. If specified, this will be set as the name | |||
of the :py:class:`pandas.Series`, otherwise the name will be set to "Stage". | |||
proba : :py:class:`pandas.DataFrame` | |||
An optional dataframe with the probability of each sleep stage for each epoch in hypnogram. | |||
Each row must sum to 1. This is automatically included if the hypnogram is created with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might not be the best place to comment this, but regarding the proba
that is automatically returned from SleepStaging
, should this not have a Hynogram.scorer
attached to it? Maybe just "YASA"
or even something with the version number like "YASA-vX.X"
? Or I'm not sure if you have a separate name and/or version control for the underlying sleep stager.
But in any case, I think the usefulness of the scorer
attribute is that it allows for convenient comparisons, especially if someone is saving their data and looking back at it after a new stager comes out. Note also that the evaluation
module plotting functions often take advantage of this attribute for plotting and dataframe organization, so using it in the returned dataframe might help out in those instances too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I like the idea but what would be the best way to attach such an attribute to a pandas DataFrame? In any cases, I think this can be done in a separate PR.
@@ -280,6 +300,7 @@ def __init__(self, values, n_stages=5, *, freq="30s", start=None, scorer=None): | |||
self._labels = labels | |||
self._mapping = mapping | |||
self._scorer = scorer | |||
self._proba = proba | |||
|
|||
def __repr__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what causes this, but note the differences in iPython:
In: pd.read_csv
Out: <function pandas.io.parsers.readers.read_csv(filepath_or_buffer: 'FilePath ...
In: h.as_int
Out:
<bound method Hypnogram.as_int of <Hypnogram | 3 epochs x 30s (1.50 minutes), 3 stages>
- Use `.hypno` to get the string values as a pandas.Series
- Use `.as_int()` to get the integer values as a pandas.Series
- Use `.plot_hypnogram()` to plot the hypnogram
See the online documentation for more details.>
I often hit enter on methods without calling them, just to get a quick (if ugly) view of the arguments and such. When I do this on the Hypnogram methods it just returns the object string representation. I'm not sure what is necessary to switch it over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yasa/hypno.py
Outdated
for each epoch in hypnogram. | ||
""" | ||
return self._proba | ||
|
||
# CLASS METHODS BELOW | ||
|
||
def as_annotations(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The MNE annotations glossary link in the docstrings is dead now. And that actually highlights an anxiety of mine about this nice and convenient as_annotations
method; I really like using the BIDS standards for events files, but to-date I've struggled with comprehending the terminology in MNE between events
and annotations
. They have events_to_annotations
and annotations_to_events
methods, and I'm sure that if I look at the docs today there will be a clear differentiation, but as I recall I am consistently "confident-and-subsequently-confused" about how they handle this.
So I think this method is useful but I don't know about the name of it. Maybe it's just as_events
for now? Because that is also consistent with BIDS, which is just looking for an events dataframe, and that's what this is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed — renamed to as_events
Work in progress / Do not review
This PR changes the return type of
SleepStaging.predict()
from anumpy.array
to the newly-createdyasa.Hypnogram
class.Remaining tasks
Warning inSleepStaging.predict
that the hypnogram values have changed, specifically: "W" -> "WAKE", "R" -> "REM"!