-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added more documentation for dealing with big datasets #408
Added more documentation for dealing with big datasets #408
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## RC_v1.5.x #408 +/- ##
==========================================
Coverage 60.91% 60.91%
==========================================
Files 23 23
Lines 3541 3541
==========================================
Hits 2157 2157
Misses 1384 1384
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Linting results by Pylint:Your code has been rated at 8.70/10 (previous run: 8.70/10, +0.00) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. The only change I would suggest to the notebook is performing tracking to show that this needs to be performed on a single dataframe, but that it can be done without loading all spatial data into memory. The notebook isn't yet displayed in the docs, is that intended? And the failure of the jupyter notebooks CI can be fixed with the inclusion of dask in example_requirements.txt
, which has been done in #334
I'm ready for a review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good to me! Happy for this to be merged when ready
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice addition. Thank you for your work!
I requested some formatting changes - which will be easy to do
doc/big_datasets_examples/notebooks/parallel_processing_tobac.ipynb
Outdated
Show resolved
Hide resolved
…ataset_update # Conflicts: # doc/conf.py # doc/examples.rst
Ok, @fsenf, I believe I've addressed your comments and I am now ready for a re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for update! I am happy with the changes. I especially like that the doc notebooks have been integrated into the example gallery: https://tobac--408.org.readthedocs.build/en/408/examples.html
Please go ahead and merge!
Cheers, Fabian.
PS: It would be cool to have nice looking thumbnails for the new gallery entry. I will create a new issue for that request.
Added more details to our documentation around dealing with big datasets, including a new example notebook of running feature detection in parallel.
With @RBhupi from tobathon February 2024