Add Docstring in SnowStormDataset
#868
Open
christianlocatelli wants to merge 10 commits into graphnet-team:main from christianlocatelli:26_01_20_review_docstrings
Commits
737ce6c christianlocatelli: I changed the Docstring for SnowStormDataset and added a Docstring fo…
36d1ed6 christianlocatelli: Merge branch 'graphnet-team:main' into 26_01_20_review_docstrings
a4f84c3 christianlocatelli: I did remove the flake8 errors from the pull request regarding SnowSt…
ed9bbcf christianlocatelli: I removed the whitespace in the blank lines
37a66dc christianlocatelli: I removed whitespace in line 38
27d2c7b christianlocatelli: Changed GraphNet to GraphNeT
b90a00b christianlocatelli: Tiny change to see if black error goes away
e831b17 christianlocatelli: Installed black version 26.1.0 and ran it locally before commiting
fefd716 christianlocatelli: Use black --config ../.../black.toml .../snowstorm_dataset.py
e10a00e christianlocatelli: Merge branch 'graphnet-team:main' into 26_01_20_review_docstrings
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -20,12 +20,20 @@ | |
|
|
||
|
|
||
| class SnowStormDataset(IceCubeHostedDataset): | ||
| """IceCube SnowStorm simulation dataset. | ||
| """IceCube SnowStorm Monte Carlo simulation dataset. | ||
|
|
||
| More information can be found at | ||
| https://wiki.icecube.wisc.edu/index.php/SnowStorm_MC#File_Locations | ||
| This is a IceCube Collaboration simulation dataset. | ||
| Requires a username and password. | ||
| This module provides access to the SnowStorm simulation data and prepares it | ||
| for the training and evaluation of deep learning models in GraphNeT by parsing | ||
| the data into the GraphNeT-compatible CuratedDataset format. | ||
|
Collaborator
The data is already parsed in GraphNeT-compatible format (SQLite). The
||
|
|
||
| The data is organized by SnowStorm RunIDs containing pulsemaps input features | ||
| along with event-level truth information. | ||
|
|
||
| The access to the data requires an IceCube Collaboration account. | ||
|
|
||
| References: | ||
| SnowStorm documentation: https://wiki.icecube.wisc.edu/index.php/SnowStorm_MC#File_Locations | ||
| SnowStorm paper: arXiv:1909.01530 | ||
| """ | ||
|
|
||
| _experiment = "IceCube SnowStorm dataset" | ||
|
|
@@ -91,7 +99,15 @@ def __init__( | |
| def _prepare_args( | ||
| self, backend: str, features: List[str], truth: List[str] | ||
| ) -> Tuple[Dict[str, Any], Union[List[int], None], Union[List[int], None]]: | ||
| """Prepare arguments for dataset.""" | ||
| """Prepare arguments for dataset. | ||
|
|
||
| Args: | ||
| backend: backend of dataset. Only "sqlite" is supported. | ||
| features: List of features from user to use as input. | ||
| truth: List of event-level truth from user. | ||
|
|
||
| Returns: Dataset arguments, train/val selection, test selection | ||
| """ | ||
| assert backend == "sqlite" | ||
| dataset_paths = [] | ||
| for rid in self._run_ids: | ||
|
|
@@ -106,7 +122,6 @@ def _prepare_args( | |
| # get RunID | ||
| pattern = rf"{re.escape(self.dataset_dir)}/(\d+)/.*" | ||
| event_counts: Dict[str, int] = {} | ||
| event_counts = {} | ||
| for path in dataset_paths: | ||
|
|
||
| # Extract the ID | ||
|
|
@@ -175,7 +190,7 @@ def _create_comment(cls, event_counts: Dict[str, int] = {}) -> None: | |
| runid_string += f"RunID {k} contains {v:10d} events\n" | ||
| tot += v | ||
| cls._comments = ( | ||
| f"Contains ~{tot/1e6:.1f} million events:\n" | ||
| f"Contains ~{tot / 1e6:.1f} million events:\n" | ||
| + runid_string | ||
| + fixed_string | ||
| ) | ||
|
|
||
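The last two hunks touch the RunID extraction pattern and the event-count summary. A standalone sketch of how those two pieces behave is given here, assuming paths shaped like `<dataset_dir>/<RunID>/<file>`. The helper names `extract_run_id` and `summarize_counts`, the example directory, and the RunID values are hypothetical and not part of the PR; only the regular expression and the f-strings are taken from the diff.

```python
import re
from typing import Dict, Optional


def extract_run_id(dataset_dir: str, path: str) -> Optional[str]:
    """Return the numeric RunID directory found right below dataset_dir."""
    # Same pattern as in the diff: escape the directory, capture the digits.
    pattern = rf"{re.escape(dataset_dir)}/(\d+)/.*"
    match = re.match(pattern, path)
    return match.group(1) if match else None


def summarize_counts(event_counts: Dict[str, int]) -> str:
    """Build the per-RunID summary text in the format shown in the diff."""
    runid_string = ""
    tot = 0
    for k, v in event_counts.items():
        runid_string += f"RunID {k} contains {v:10d} events\n"
        tot += v
    return f"Contains ~{tot / 1e6:.1f} million events:\n" + runid_string


if __name__ == "__main__":
    # Hypothetical directory, file name, RunIDs and counts.
    print(extract_run_id("/data/snowstorm", "/data/snowstorm/22010/events.db"))
    print(summarize_counts({"22010": 1_500_000, "22011": 500_000}))
```

The `tot / 1e6` spacing change in the diff is purely cosmetic (black formatting); the formatted output is identical to the old `tot/1e6` expression.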
It does not provide access to ALL SnowStorm simulations; it only provides access to a few run_ids (see the global variable AVAILABLE_RUN_IDS). Maybe clarify this in the comment :)
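One possible wording that takes both review comments into account is sketched here. It is only a suggestion, not the text that was merged: `SnowStormDatasetSketch` is a stand-in class that does not inherit from `IceCubeHostedDataset`, and the values assigned to `AVAILABLE_RUN_IDS` are placeholders, since the real list is not shown in this diff.

```python
from typing import List

# Placeholder values for illustration only. The real list is the
# module-level AVAILABLE_RUN_IDS mentioned in the review.
AVAILABLE_RUN_IDS: List[int] = [0, 1]


class SnowStormDatasetSketch:
    """IceCube SnowStorm Monte Carlo simulation dataset.

    Provides access to a subset of the SnowStorm simulation, limited to
    the RunIDs listed in ``AVAILABLE_RUN_IDS``. The data is already
    stored in a GraphNeT-compatible format (SQLite).

    The data is organized by SnowStorm RunIDs containing pulsemaps input
    features along with event-level truth information.

    The access to the data requires an IceCube Collaboration account.
    """
```

The lines are kept under 80 characters so that the flake8 checks mentioned in the commit history would not flag the docstring.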