Skip to content

main branch review wrt JOSS/publication #169

@MaHaWo

Description

@MaHaWo

As discussed per mail, I have collected a few things that stood out to me while having a look at the current main branch:
Didn't have the capacity to go over everything with a focus on algorithm or conduct speed tests, but anyway, here are my 2 cents:

General: project setup, deps...

  • maybe extract cpu jax from the dependencies and have [cpu], [cuda] options. Not important though
  • min python could perhaps be upgraded to 3.11 or even 3.12, 3.14 has been released recently
  • the same for setuptools, current version is 80.9 according to pip
  • remove the TODO.md. this is a cookiecutter setup how-to file.
  • consider adding a contribution guide CONTRIBUTION.md building on the current README.md and possibly a coding style guide if you want to have that, or subsume it under the the contribution guide.
  • how to build the docs locally is missing from the readme? useful for local development. at least I didn't find it.
  • ! We should fix the 'unknown license found' in the repo overview page (rhs at the top). Apparently the copying.md is found by github as 'unknown license'.
  • ! the copying.md currently effectively functions as a contributor license agreement, at least that is my understanding. However, this is not obvious, and there is no mechanism for contributors to agree to it or not. By default, under open source licenses, contributors retain the copyright to their contributions unless agreed otherwise, which must be explicit. The default for most non-company open source projects is that this stays as is, and the copyright is not modified or transfered. If we want this to be the case, I think we should delete COPYING.md completely and only have the MIT license. If otherwise , we should have a proper Contributor Lisence Agreement I think, including an obvious mechanism for contributors to transfer the copyright to the ASTRO-AI lab. However, that is somewhat unusual for open source projects of RUBIX' class and could deter contributions and usage, and the need for it is not obvious I think.

Code

  • what is debug.py for?
  • why do we have config.yml and /config/*.yml?
  • In the data.py pytree classes, should the data not be jax arrays? Afaik jax pytrees don't work when the leafs are not jax.Array
  • There's a TODO left in data.py line 380. should this go into an issue?
  • type annotations missing in pynbody.py
  • some type annotations and explanatory comments missing in private functions in illustris.py. Can help with understanding code after a longer break.
  • IllustrisAPI.py, line 209 seems to miss an else branch: either throw or do whatever else is appropriate.
  • what's the reason that logger.py, utils.py etc are outside of the rubix directory?
  • type annotations are sometimes present, sometimes not throughout the code base
  • from x import * is often considered inferior to from x import what, ever, you, need because it can pollute the namespace with all kinds of things you don't necessarily need and you have no control over what's gonna be imported when the imported package changes its exports. happens in from rubix.spectra.dust.extinction_models import *. Same in from .extinction_models import *.
  • rotation.py line 78 has the logger.info(f"Rotating galaxy with alpha={alpha}, beta={beta}, gamma={gamma}") line atop the docstring. might interfere with auto detection for docs or so?
  • TODO in telescope.py. --> issue?
  • type hints missing in the visualization.py file.
  • docstrings missing in the visualizatoin.py file. makes them hard to find in the api docs.
  • do we still need the transformer.py?
  • cue/cue directory empty?
  • mH not used in dust_extinction.py line 137
  • can the constants in extinction_models.py be made into class fields?
  • if lookup dicts like in extinction_models.py come up more often, consider bundling them into their own module for easier maintenance.
  • unused variable in generic_models.py: x_range
  • fits.py doesn't seem to have any tests?
  • calculate_dusty_datacube_particlewise doesn't have either?
    ==> keep an eye on branch coverage too
  • I got failing tests locally on a fresh install:
    • test_get_lookup_interpolation_with_valid_config
    • test_get_lookup_interpolation_vmap
    • test_get_lookup_interpolation_pmap
    • test_get_lookup_interpolation in test_ssp_grid
    • test_apply_rotation -> arrays not equal?
    • test_rotate_galaxy -> arrays not equal?
      not sure what's wrong here, the errors seem to be numeric often.

suggestions for future development directions

  • consider using Zarr instead of or in parallel to hdf5 if you end up working a lot with stored data and the simulation data allows it: Simpler, some load-direct-to-gpu support, better support for parallelization.
    Have used this for an ML project recently and experiences so far have been good.

The license / copyright thing is rather important to make clear.
nothing else I can see here is make-or-break for publication I would say, but JOSS reviewers might pick up on some of it (or def. will in case of the lisence file).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions