Dependency Management: Fixing the (Conda?) Environment #109

Closed
zedrdave opened this issue Dec 28, 2020 · 5 comments

Comments

@zedrdave

From my (limited) understanding, the current codebase is tied to geopandas 0.6.1, which is slowly becoming obsolete relative to other downstream packages (e.g. PROJ 6+ changing the way CRSs are handled). In the meantime, it generates a lot of FutureWarning messages.

From a cursory look at the code, it seems like it might be possible to make the code compatible with some superficial changes to CRS-related code. Are there other known issues holding this version bump back?

I'll be happy to take a stab at a PR, if that seems a good idea.
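
To illustrate the kind of superficial CRS change meant above: newer pyproj versions deprecate the `+init=<authority>:<code>` syntax (and the equivalent `{'init': ...}` dict) in favour of plain `"EPSG:4326"`-style strings, which is one common source of these FutureWarnings. Below is a minimal, stdlib-only sketch of a normalising helper. The name `normalize_crs` is hypothetical and not part of CLIMADA or geopandas, and I have not checked how CLIMADA actually passes CRS values around.

```python
import re

# Matches legacy PROJ init strings such as "+init=epsg:4326" or "init=epsg:4326".
_INIT_RE = re.compile(r"^\+?init=(?P<auth>[A-Za-z]+):(?P<code>\d+)$")


def normalize_crs(crs):
    """Rewrite legacy init-style CRS inputs as 'AUTH:CODE'; pass anything else through.

    Hypothetical helper for illustration only.
    """
    # Legacy dict form used with older geopandas: {'init': 'epsg:4326'}
    if isinstance(crs, dict) and set(crs) == {"init"}:
        crs = "init=" + crs["init"]
    if isinstance(crs, str):
        match = _INIT_RE.match(crs.strip())
        if match:
            return f"{match['auth'].upper()}:{match['code']}"
    # Already in a modern form, or something this sketch does not understand.
    return crs


print(normalize_crs({"init": "epsg:4326"}))  # EPSG:4326
print(normalize_crs("+init=epsg:2056"))      # EPSG:2056
```

The result could then be passed wherever a CRS is accepted. This only covers the syntax change; any axis-order differences between the two forms would need separate checking.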

@mmyrte
Collaborator

mmyrte commented Dec 31, 2020

Hi Dave, you're absolutely right to want this; I did, too. As a hackish solution for the moment, you might have a look at my fork's develop branch, which was upgraded to "bleeding edge" two months ago.

Keep in mind that conda will never be as progressive as PyPI, because there are often not only minimum version requirements, but also maximum constraints. This arguably makes package ecosystems more stable, at the cost of being a complete pain to install. As @emanuel-schmid mentioned on another issue, system dependencies such as GEOS are easier to install via conda for most of our users.
While we generally agreed to stick with conda, you could use brew for the system deps. I personally have sufficient faith in the unit and integration tests that I would use the resulting installation if the tests ran smoothly.

If you don't mind, I would like to hijack/rename this thread to organise a general upgrade - since forcibly upgrading only geopandas to >0.8 is very likely to break a lot of stuff.

@zedrdave
Author

zedrdave commented Dec 31, 2020

@mmyrte Thanks for enlightening me on possible reasons to be cautious. As it happens, I also posted in a thread about reliance on Conda.

Given Climada's scope and requirements, it might make more sense to use a sophisticated package manager like pipenv or poetry (most of the same benefits as Conda, without the propensity to wreak havoc on other virtual envs). But I know how tiresome it is to have to continuously chase the latest fad in package managers, and could understand the reluctance to engage in yet another migration.

For the standard reasons, I am not able/willing to install Conda on prod machines, but have managed to get Climada running fine with a manually created virtual env (pyenv+pipenv) that currently reproduces the exact Conda version combo. Taking care of external dependencies is indeed easier if you have a tool like apt or homebrew (not so much on Windows… so I can understand why Conda would be seen as the best compromise).

I wasn't aware that the version being held back was a consequence of Conda's version availability… That being said, in the time it would take to produce a fully working and tested upgrade, a satisfying version combo should certainly be available through Conda too?

> If you don't mind, I would like to hijack/rename this thread to organise a general upgrade - since forcibly upgrading only geopandas to >0.8 is very likely to break a lot of stuff.

By all means. And as I've said before: I'd be happy to contribute in any way that makes sense. My knowledge of Climada is still pretty limited, but I have fairly solid experience with large Python projects (I'm trying to skirt the line between helpful and annoyingly clueless newly-arrived-on-a-project).

@mmyrte
Collaborator

mmyrte commented Dec 31, 2020

I'm not at all calling the shots on this project, I'm just a master's student. If I were, however, I would consider it very valuable to rethink our reliance on conda. Iff there is an alternative that makes it trivial for beginners to install the system deps, then that would be great. (I've just run into problems again with conda; it claims that Python packages are incompatible with Python 3.5 through 3.9, which is patently absurd.)

I think it's worth giving you the background of our target audience:

  • Most people who install climada are Master's students, some of whom have never fired up a Python shell before. Most of them will install it once, use it for part of a semester, and move on with their lives.
  • A minority of users are scientists and students involved with climada for their theses. They tend to tinker more with their environments and would profit from something more stable than conda.
  • And then there is a small minority, at least that I know of, with your level of skill, who even think in terms of production code. (Deploying to one of ETH's supercomputers could certainly also be called production, but most projects are one-shots where the code needs to be run for one specific publication, not operationally.)

Maybe there is a way to maintain several environment files to accommodate all use cases. At least for those of us who regularly work with climada, it's become obvious that there are many problems with conda. I'd rather rely on @emanuel-schmid to decide whether that's down to our complex dependencies, or due to conda.

mmyrte changed the title from "Compatibility with geopandas 0.8.x" to "Dependency Management: Fixing the (Conda?) Environment" on Dec 31, 2020

@zedrdave
Author

zedrdave commented Jan 5, 2021

@mmyrte Agreed on the need to cover a number of different audiences, with different thresholds for installation effort vs maintenance vs other technical concerns.

There are 2 services generally provided by standard packaging tools:

  1. handling sub-dependencies automatically
  2. guaranteeing perfect reproducibility of a given dependency tree snapshot (.lock files)

PyPI/pip does provide 1, but not really 2.

Poetry and pipenv generally add the second, with the ability to sandbox if necessary, to prevent dependency conflicts.

Conda is more geared toward the latter, with a rather heavy-handed approach, and with the added benefit that it can handle external dependencies (removing the need for a separate install through apt or brew). Unfortunately it does not do clean sandboxing and therefore doesn't play nice with other projects.
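
As a rough illustration of what service 2 amounts to, here is a minimal sketch that snapshots the currently active environment into exact `name==version` pins, using only the standard library's `importlib.metadata` (Python 3.8+). The function name `snapshot_environment` is made up for this example. Real lock files (Pipfile.lock, poetry.lock) also record hashes and more, so treat this as the bare idea rather than a replacement.

```python
from importlib import metadata


def snapshot_environment():
    """Return sorted 'name==version' pins for every installed distribution.

    Illustrative only: a real lock file also stores hashes and markers.
    """
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    for pin in snapshot_environment():
        print(pin)
```

Installing from such a list reproduces the Python packages exactly, but says nothing about external libraries such as GEOS, which is the part Conda additionally covers.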

In theory, I don't think it would be impossible to support many of these packagers simultaneously (at least for a while):

  • Define minimum possible requirements in setup.py (and only freeze minor versions when there is no alternative), which should eventually let people install through pip (assuming they are able to deal with external dependencies etc).

  • Define a version snapshot for Conda and/or Pipenv (the two files should be extremely similar and can possibly be generated from one another automatically) that guarantees a working environment out-of-the-box for people who want to run with the least hassle possible (and incidentally to run unit tests).
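
To give an idea of how one snapshot could be generated from the other, here is a minimal sketch that turns conda-style dependency specs (as they would appear under `dependencies:` in an environment file) into pip-style requirement strings. The function names are hypothetical. It assumes package names are identical on conda and PyPI, which is not always true, and it treats conda's single `=` (a fuzzy match) as an exact pin.

```python
import re

# Package name, followed by whatever version constraint remains.
_SPEC_RE = re.compile(r"^(?P<name>[A-Za-z0-9_.\-]+)\s*(?P<constraint>.*)$")


def conda_spec_to_pip(spec):
    """Translate one conda spec (e.g. 'geopandas=0.6.1') into a pip requirement."""
    match = _SPEC_RE.match(spec.strip())
    if match is None:
        raise ValueError(f"unrecognised spec: {spec!r}")
    name, constraint = match["name"], match["constraint"].strip()
    if constraint.startswith("=") and not constraint.startswith("=="):
        # 'name=version' or 'name=version=build': pip has no build strings.
        version = constraint[1:].split("=")[0]
        return f"{name}=={version}"
    if constraint[:1].isdigit():
        # Space-separated form: 'name version [build]'.
        return f"{name}=={constraint.split()[0]}"
    # Bare name, or operators pip also understands (==, >=, <, ...).
    return f"{name}{constraint}"


def conda_deps_to_requirements(deps, skip=("python", "pip")):
    """Convert a conda dependency list, dropping entries pip cannot install."""
    requirements = []
    for dep in deps:
        if not isinstance(dep, str):
            continue  # nested {'pip': [...]} section; handle separately
        req = conda_spec_to_pip(dep)
        name = re.split(r"[=<>!~ ]", req, maxsplit=1)[0]
        if name not in skip:
            requirements.append(req)
    return requirements


print(conda_deps_to_requirements(["python=3.7", "geopandas=0.6.1", "numpy>=1.17"]))
```

Going the other way (pip pins to conda specs) would be the mirror image, with the same caveat about names and about system-level packages that only exist on the conda side.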

Does this make sense?

@mmyrte
Collaborator

mmyrte commented Mar 16, 2021

I'm so sorry for not getting back earlier – I was busy finishing my thesis etc. In the meantime, @emanuel-schmid has done a lot of work on the dependency side of things, though I don't know what exactly. (See the issues under dependencies: #167 #161 #159 #158 #157 #107).

Re:

> […] I don't think it would be impossible to support many of these packagers simultaneously […] Does this make sense?

It does make sense to me, but it's a question that IMHO needs to be answered by one single person; I think that's Emanuel. I'm currently also working with CLIMADA in an operational context, so I'd be interested in a stable-but-slightly-unfriendly solution. We could branch and regularly pull from upstream as a last resort. (Tagging @bguillod so he knows about this.)

I'm closing this issue, since the dependency discussion obviously did not get consolidated here.

ps: I'm sorry for hijacking your original issue, but it looks as though the geopandas upgrade you wanted is taking place in version 2.

mmyrte closed this as completed on Mar 16, 2021