Using ruff rule to enforce the existence of docstrings in public methods #1063

h-mayorquin · 2024-09-07T03:15:41Z

Should come after #1062

This was the most annoying one.

pauladkisson

Thank you @h-mayorquin for working through all of this!

One point of discussion: should we include Returns sections for standard methods like get_metadata, get_metadata_schema, etc., or should we just keep them as one-liners?

pauladkisson · 2024-09-18T17:30:08Z

src/neuroconv/datainterfaces/ecephys/cellexplorer/cellexplorerdatainterface.py

+
+        Warnings
+        --------
+        Ensure that the `.session.mat` file is correctly located in the expected session path, or the method will not generate


I think it is worthwhile to explicitly specify the expected path: self.session_path / f"{self.session_id}.session.mat"

That's a good point.

Done, will be up in the next commit.
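A minimal sketch of how that expected path could be built and checked. Only the path expression comes from the comment above; the helper names `expected_session_file` and `warn_if_session_file_missing` are hypothetical and not part of neuroconv.

```python
import warnings
from pathlib import Path


def expected_session_file(session_path: Path, session_id: str) -> Path:
    """Return the path where the `.session.mat` file is expected to live."""
    return Path(session_path) / f"{session_id}.session.mat"


def warn_if_session_file_missing(session_path: Path, session_id: str) -> bool:
    """Emit a warning and return False when the expected session file is absent."""
    session_file = expected_session_file(session_path, session_id)
    if not session_file.is_file():
        warnings.warn(f"Expected session file not found: {session_file}")
        return False
    return True
```

Spelling out the full path in the warning text (and in the docstring) lets a user see exactly which file to look for, rather than having to infer the naming convention.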

pauladkisson · 2024-09-18T17:31:36Z

src/neuroconv/datainterfaces/ophys/baseimagingextractorinterface.py

        return self.imaging_extractor.frame_to_time(frames=np.arange(stop=self.imaging_extractor.get_num_frames()))

    def set_aligned_timestamps(self, aligned_timestamps: np.ndarray):
+        """Replace all timestamps for this interface with those aligned to the common session start time."""


This needs a Parameters section for aligned_timestamps

Changed to noqa, this docstring is identical and should be inherited from the base class.

pauladkisson · 2024-09-18T17:33:11Z

src/neuroconv/datainterfaces/ophys/basesegmentationextractorinterface.py

        return self.segmentation_extractor.frame_to_time(
            frames=np.arange(stop=self.segmentation_extractor.get_num_frames())
        )

    def set_aligned_timestamps(self, aligned_timestamps: np.ndarray):
+        """set the aligned timestamps for the segmentation extractor."""


Again here, needs Parameters section for aligned_timestamps

Changed to noqa, this docstring is identical and should be inherited from the base class:

neuroconv/src/neuroconv/basetemporalalignmentinterface.py

Lines 42 to 56 in 0d5d6cc

    @abstractmethod
    def set_aligned_timestamps(self, aligned_timestamps: np.ndarray) -> None:
        """
        Replace all timestamps for this interface with those aligned to the common session start time.

        Must be in units seconds relative to the common 'session_start_time'.

        Parameters
        ----------
        aligned_timestamps : numpy.ndarray
            The synchronized timestamps for data in this interface.
        """
        raise NotImplementedError(
            "The protocol for synchronizing the timestamps of this interface has not been specified!"
        )

pauladkisson · 2024-09-18T17:34:37Z

src/neuroconv/datainterfaces/ophys/brukertiff/brukertiffconverter.py

+            by default False.
+        stub_frames : int, optional
+            The number of frames to include in the subset if `stub_test` is True, by default 100.
+


remove empty line

pauladkisson · 2024-09-18T17:35:28Z

src/neuroconv/datainterfaces/ophys/brukertiff/brukertiffdatainterface.py

@@ -28,6 +29,7 @@ def get_streams(
        folder_path: DirectoryPath,
        plane_separation_type: Literal["contiguous", "disjoint"] = None,
    ) -> dict:
+        """get streams for the Bruker TIFF imaging data."""


Needs Parameters section for folder_path and plane_separation_type

pauladkisson · 2024-09-18T17:37:46Z

src/neuroconv/datainterfaces/ophys/scanimage/scanimageimaginginterfaces.py

@@ -87,6 +88,7 @@ class ScanImageLegacyImagingInterface(BaseImagingExtractorInterface):

    @classmethod
    def get_source_schema(cls) -> dict:
+        """ " "Get the source schema for the ScanImage legacy imaging interface."""


Looks like a typo with the extra "s

pauladkisson · 2024-09-18T17:42:22Z

src/neuroconv/datainterfaces/ophys/suite2p/suite2pdatainterface.py

+            Whether to include the centroids of regions of interest (ROIs) in the data, by default True.
+        include_roi_acceptance : bool, optional
+            Whether to include acceptance status of ROIs, by default True.
+        mask_type : str, optional


mask_type should include the full info from basesegmentationinterface:

mask_type : str, default: 'image'
    There are three types of ROI masks in NWB, 'image', 'pixel', and 'voxel'.

    * 'image' masks have the same shape as the reference images the segmentation was applied to, and weight each pixel by its contribution to the ROI (typically boolean, with 0 meaning 'not in the ROI').
    * 'pixel' masks are instead indexed by ROI, with the data at each index being the shape of the image by the number of pixels in each ROI.
    * 'voxel' masks are instead indexed by ROI, with the data at each index being the shape of the volume by the number of voxels in each ROI.

    Specify your choice between these two as mask_type='image', 'pixel', 'voxel', or None.
    If None, the mask information is not written to the NWB file.

Thanks, should be done in the next commit.

pauladkisson · 2024-09-18T17:44:13Z

src/neuroconv/datainterfaces/text/timeintervalsinterface.py

+        """
+        Get metadata for the time intervals.
+
+        Returns


This get_metadata fn has a Returns section, but all the others do not.

We should probably pick one style and stick with it.

Also changing this to noqa; it would be easier to unify at the base class and then propagate specificities if needed.

pauladkisson · 2024-09-18T17:46:46Z

src/neuroconv/nwbconverter.py

        conversion_options = conversion_options or dict()
        for interface_name, data_interface in self.data_interface_objects.items():
            data_interface.add_to_nwbfile(
                nwbfile=nwbfile, metadata=metadata, **conversion_options.get(interface_name, dict())
            )

+        return nwbfile


I'm not sure why you added a return value to add_to_nwbfile (it may need one, idk), but it definitely should not be in this PR.

pauladkisson · 2024-09-18T17:47:53Z

src/neuroconv/tools/hdmf.py

@@ -50,6 +50,8 @@ def estimate_default_chunk_shape(chunk_mb: float, maxshape: tuple[int, ...], dty
    def estimate_default_buffer_shape(
        buffer_gb: float, chunk_shape: tuple[int, ...], maxshape: tuple[int, ...], dtype: np.dtype
    ) -> tuple[int, ...]:
+        """ "Add docstring to this"""


Let's do it then.

Actually, will noqa this as well. I don't have the expertise to write the docstring of this and I don't want to clash with the version upstream.

h-mayorquin · 2024-09-18T20:12:23Z

One point of discussion is should we include Returns sections for standard methods like get_metadata, get_metadata_schema, etc.? Or should we just have them be 1-liners.

I am fine with adding it, but this is just a minimal effort to get the ruff passing and then we can improve piecewise and unify.

bendichter · 2024-09-26T14:57:58Z

When a method in a child class is overriding a parent class and does not have a docstring, it takes on the docstring of the parent. I think this works fine and in many cases makes it unnecessary to create a new docstring. However, it looks like this ruff inspector does not respect this and requires docstrings to be defined even on child classes where the parent has an appropriate docstring defined. Is that correct? If that's the case, I think we should be able to write a custom check that will work better for us.
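To make the inheritance behaviour concrete, here is a small sketch with hypothetical `Base` and `Child` classes. The override's own `__doc__` attribute stays empty; it is `inspect.getdoc` that falls back to the parent class, which is presumably why a static linter that only looks at the source of the child method reports it as undocumented.

```python
import inspect


class Base:
    def get_metadata(self) -> dict:
        """Return the metadata for this interface."""
        return {}


class Child(Base):
    def get_metadata(self) -> dict:  # override without its own docstring
        return {"source": "child"}


# The override's own __doc__ attribute is empty ...
own_doc = Child.get_metadata.__doc__

# ... but inspect.getdoc walks the class hierarchy and finds the parent's docstring.
resolved_doc = inspect.getdoc(Child().get_metadata)
```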

bendichter · 2024-09-26T15:02:50Z

e.g.

import importlib
import inspect
import pkgutil

def check_package_docstrings(package_name):
    """
    Check if all public methods of public classes in public modules of a package have docstrings.

    Args:
    package_name (str): The name of the package to check.

    Returns:
    dict: A dictionary containing information about missing docstrings.
    """
    missing_docstrings = {}
    package = importlib.import_module(package_name)
    
    for _, module_name, _ in pkgutil.walk_packages(package.__path__, package.__name__ + '.'):
        try:
            module = importlib.import_module(module_name)
            
            for name, obj in inspect.getmembers(module):
                if inspect.isclass(obj) and not name.startswith('_'):
                    class_missing = check_class_docstrings(obj)
                    if class_missing:
                        missing_docstrings[f"{module_name}.{name}"] = class_missing
        
        except ImportError as e:
            print(f"Error importing module {module_name}: {e}")
    
    return missing_docstrings

def check_class_docstrings(cls):
    """
    Check if all public methods of a class have docstrings.

    Args:
    cls (type): The class to check.

    Returns:
    list: A list of method names without docstrings.
    """
    missing = []
    for name, method in inspect.getmembers(cls, inspect.isfunction):
        if not name.startswith('_') and not method.__doc__:
            missing.append(name)
    return missing

# Example usage
if __name__ == "__main__":
    package_name = "your_package_name"
    result = check_package_docstrings(package_name)
    
    if result:
        print("Missing docstrings found:")
        for class_name, methods in result.items():
            print(f"  {class_name}: {', '.join(methods)}")
    else:
        print("All public methods have docstrings.")
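Note that `check_class_docstrings` above tests `method.__doc__`, which is empty on an override even when the parent method is documented. A possible inheritance-aware variant is sketched below; the names `find_inherited_doc` and `check_class_docstrings_with_inheritance` and the demo classes are hypothetical, and the explicit MRO walk is one way to do the lookup, not necessarily the approach intended in this thread.

```python
import inspect
from typing import Optional


def find_inherited_doc(cls: type, method_name: str) -> Optional[str]:
    """Return the first non-empty docstring for `method_name` along the MRO of `cls`."""
    for klass in cls.__mro__:
        if method_name in vars(klass):
            doc = getattr(klass, method_name).__doc__
            if doc:
                return inspect.cleandoc(doc)
    return None


def check_class_docstrings_with_inheritance(cls: type) -> list:
    """List public methods of `cls` that have no docstring anywhere in the MRO."""
    missing = []
    for name, _ in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue
        if find_inherited_doc(cls, name) is None:
            missing.append(name)
    return missing


class Base:
    def run(self):
        """Run the conversion."""


class Child(Base):
    def run(self):  # no docstring here, but Base.run has one
        pass

    def extra(self):  # undocumented everywhere, so it should be reported
        pass


missing_in_child = check_class_docstrings_with_inheritance(Child)
```

With these demo classes, only `extra` is reported, because `run` picks up the docstring defined on `Base`.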

h-mayorquin · 2024-09-26T16:22:55Z

When a method in a child class is overriding a parent class and does not have a docstring, it takes on the docstring of the parent. I think this works fine and in many cases makes it unnecessary to create a new docstring. However, it looks like this ruff inspector does not respect this and requires docstrings to be defined even on child classes where the parent has an appropriate docstring defined. Is that correct? If that's the case, I think we should be able to write a custom check that will work better for us.

Yeah, that's a downside. The approach that I am taking here is just annotating it like this so it gets ignored:

This is good because it forces one to be explicit about it (you add it when you know the docstrings should be inherited) and is less complex than implementing something on our own. Although, @pauladkisson already implemented an action that I think accomplishes this if you prefer to avoid having to add #noqa annotations
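A minimal sketch of the kind of annotation described above, assuming it takes the form of a `# noqa: D102` comment on the `def` line of the overriding method. The class names are hypothetical stand-ins, and a plain list replaces the numpy array to keep the example dependency-free.

```python
from abc import ABC, abstractmethod


class BaseTemporalAlignment(ABC):  # hypothetical stand-in for the real base class
    @abstractmethod
    def set_aligned_timestamps(self, aligned_timestamps: list) -> None:
        """Replace all timestamps for this interface with those aligned to the common session start time."""


class ExampleInterface(BaseTemporalAlignment):  # hypothetical child class
    # The noqa comment tells ruff to skip D102 for this method only,
    # signalling that the docstring is meant to be inherited from the base class.
    def set_aligned_timestamps(self, aligned_timestamps: list) -> None:  # noqa: D102
        self._timestamps = list(aligned_timestamps)
```

The suppression is scoped to a single rule on a single line, so any other undocumented public method in the same class would still be flagged.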

codecov · 2024-09-26T17:31:22Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.58%. Comparing base (36464df) to head (716d400).
Report is 16 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1063      +/-   ##
==========================================
+ Coverage   90.44%   90.58%   +0.13%     
==========================================
  Files         129      129              
  Lines        8055     8164     +109     
==========================================
+ Hits         7285     7395     +110     
+ Misses        770      769       -1

Flag	Coverage Δ
unittests	`90.58% <100.00%> (+0.13%)`	⬆️


Files with missing lines	Coverage Δ
src/neuroconv/basedatainterface.py	`95.34% <ø> (+0.82%)`	⬆️
src/neuroconv/baseextractorinterface.py	`100.00% <100.00%> (ø)`
...nv/datainterfaces/behavior/audio/audiointerface.py	`87.95% <100.00%> (ø)`
...ces/behavior/deeplabcut/deeplabcutdatainterface.py	`93.18% <100.00%> (ø)`
...nterfaces/behavior/fictrac/fictracdatainterface.py	`91.78% <100.00%> (ø)`
...s/behavior/lightningpose/lightningposeconverter.py	`96.82% <100.00%> (ø)`
...havior/lightningpose/lightningposedatainterface.py	`95.32% <100.00%> (ø)`
...atainterfaces/behavior/medpc/medpcdatainterface.py	`94.31% <100.00%> (ø)`
...faces/behavior/miniscope/miniscopedatainterface.py	`100.00% <100.00%> (ø)`
...aces/behavior/neuralynx/neuralynx_nvt_interface.py	`98.36% <100.00%> (ø)`
... and 50 more

... and 1 file with indirect coverage changes

bendichter · 2024-09-26T18:22:09Z

I tend to side more with not adding chores for devs. They tend to pile up in a way that creates a lot of busy work, makes it difficult to onboard people, and is unwelcoming to new contributors. In this particular case, I suppose I could go either way.

h-mayorquin · 2024-10-29T16:09:12Z

pyproject.toml

@@ -131,6 +131,7 @@ select = [
    "F401",  # Unused import
    "I",  # All isort rules
    "D101",  # Missing docstring in public class
+    "D102",  # Missing docstring in public method


Suggested change

"D102", # Missing docstring in public method

Removing this line
@pauladkisson this eliminates the check

We still have the problem that I added many #noqa in this PR.

h-mayorquin added 7 commits September 6, 2024 19:16

add ruff check for public functions

9adc03b

changelog

87c6cb5

work in progress

8623806

work in progress

e8a2ccd

noqa in proress

4135939

almost done

7afe65e

DONE

438913b

h-mayorquin changed the title ~~Using ruff rule to enforce the existence of docsrings in public functions~~ Using ruff rule to enforce the existence of docsrings in public methods Sep 7, 2024

h-mayorquin self-assigned this Sep 7, 2024

Base automatically changed from more_ruff_rule to main September 10, 2024 04:49

h-mayorquin changed the title ~~Using ruff rule to enforce the existence of docsrings in public methods~~ Using ruff rule to enforce the existence of docstrings in public methods Sep 10, 2024

Merge branch 'main' into the_ruffest_rule_of_all

3066217

h-mayorquin marked this pull request as ready for review September 10, 2024 20:17

h-mayorquin added 2 commits September 17, 2024 17:23

Merge branch 'main' into the_ruffest_rule_of_all

f2ef463

Merge branch 'main' into the_ruffest_rule_of_all

fb2e484

h-mayorquin requested a review from pauladkisson September 17, 2024 23:35

missing stuff

2cd9b55

pauladkisson requested changes Sep 18, 2024


h-mayorquin added 11 commits September 18, 2024 13:56

first suggestion

228b32f

second suggestion

0d5d6cc

removed

8035298

bruker tiff docstring

c69487b

bruker tiff single parameters

564703c

miniscope imaging

d95364b

scanimage

adcf7fb

suit2p paul suggestion

9d7f734

time intervals

dd85b3c

reverse return in add_to_nwbfile

1cca391

hdmf

de12b9a

pauladkisson mentioned this pull request Sep 18, 2024

[Documentation]: Should we add Returns section to the docstrings of our get methods (get_metadata, etc.) #1097

Open

2 tasks

Merge branch 'main' into the_ruffest_rule_of_all

716d400
