Discussion: emirge's role #75

Open
majosm opened this issue Aug 20, 2020 · 30 comments

@majosm
Contributor

majosm commented Aug 20, 2020

After some back-and-forth on Slack, I noticed we still have some differing thoughts on emirge, so I wanted to start up one final (hah) discussion in an attempt to iron this out.

In emirgecom, we (tentatively) agreed that emirge should no longer parse mirgecom's requirements.txt. However, there seem to still be two views on what role emirge should play:

View A: emirge is a tool that installs an environment (for our purposes, a CEESD environment).

View B: emirge is a CEESD environment.

In View A, emirge would take as parameters (some of them optional): 1) a conda environment name, 2) a list of conda packages, 3) a requirements.txt, and 4) an installation directory. It would create the conda environment, install the conda packages in that environment, and install the pip packages in the specified installation directory. Everyone would use the master branch and construct their own requirements.txt for whatever package versions they want to use. These would exist outside of emirge (i.e., not tracked). (@MTCam please let me know (or edit this directly) if I've gotten any of this wrong.)
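As a rough illustration of the View A flow, the four parameters could map onto two commands: one that creates the conda environment and one that runs pip inside it. This is only a sketch under stated assumptions, not emirge's actual implementation. The function name `plan_install` is hypothetical, and the sketch assumes `conda create`, `conda run`, and pip's `--src` option (the directory into which editable VCS requirements are checked out) are an acceptable way to realize the steps described.

```python
def plan_install(env_name, conda_packages, requirements_file, install_dir):
    """Return the commands a View A style installer might run (hypothetical sketch)."""
    # Step 1: create the conda environment with the requested conda packages.
    create_env = ["conda", "create", "--yes", "--name", env_name, *conda_packages]
    # Step 2: install the pip requirements inside that environment, placing
    # editable VCS checkouts under install_dir (pip's --src option).
    pip_install = [
        "conda", "run", "--name", env_name,
        "python", "-m", "pip", "install",
        "--src", install_dir,
        "--requirement", requirements_file,
    ]
    return [create_env, pip_install]


if __name__ == "__main__":
    # Example values are placeholders, not a recommended package set.
    for command in plan_install("ceesd", ["pyopencl"], "requirements.txt", "src"):
        print(" ".join(command))
```

Returning the commands instead of running them keeps the sketch side-effect free; a real tool would pass each list to `subprocess.run`.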

View B is much like View A, but a given branch in emirge would encode a set of CEESD package versions via a single tracked requirements.txt. Users would maintain different environments by creating branches with different versions of this file. Installation would be performed once*, with CEESD packages going into the emirge directory. Switching between environments would be done by: 1) checking out a different branch of emirge, and 2) running a helper script to go into the package directories and check out the branches specified in emirge's requirements.txt. (No conda environment switching is done as there is only one environment needed.)

(* Multiple installations are still supported via separate clones, as with anything else.)
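The helper script mentioned for View B would need to read the branch pinned for each package out of emirge's requirements.txt. A minimal sketch of that step follows, assuming the file uses pip's editable VCS syntax (`-e git+<url>@<ref>#egg=<name>`) and that each package is checked out in a directory named after its egg name. The function names and the sample URLs are illustrative, not taken from emirge.

```python
def parse_pinned_refs(requirements_text):
    """Map package name -> pinned git ref (None if the line pins nothing)."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if line.startswith("-e "):
            line = line[len("-e "):].strip()
        elif line.startswith("--editable "):
            line = line[len("--editable "):].strip()
        if not line.startswith("git+") or "#egg=" not in line:
            continue  # comment, blank line, or plain PyPI requirement
        location, _, name = line.partition("#egg=")
        # Drop the scheme and the host (which may contain "git@"), so the
        # first "@" left in the path separates the repository from the ref.
        _, _, rest = location.partition("://")
        _, _, path = rest.partition("/")
        _, at, ref = path.partition("@")
        pins[name.strip()] = ref if at else None
    return pins


def checkout_commands(pins):
    """Build one `git -C <dir> checkout <ref>` command per pinned package."""
    return [
        ["git", "-C", name, "checkout", ref]
        for name, ref in sorted(pins.items())
        if ref
    ]


if __name__ == "__main__":
    sample = (
        "-e git+https://github.com/illinois-ceesd/meshmode.git@x#egg=meshmode\n"
        "-e git+https://github.com/illinois-ceesd/mirgecom.git@z#egg=mirgecom\n"
    )
    for command in checkout_commands(parse_pinned_refs(sample)):
        print(" ".join(command))
```

Splitting off the host before looking for `@` lets refs that contain a slash, and ssh-style URLs, parse the same way as plain https ones.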

I lean towards View B. I think there is a need for something to sit at the top level and keep track of our soon-to-be many different development environments, and now that it looks like mirgecom isn't going to fill that role anymore it has left a bit of a void. I don't think the approach in View A alone can be made to deal with the issues discussed in #53, and manually passing around requirements.txts when someone wants to share an environment or move to a different machine sounds like a mess.

As I understand it, the primary motivation for View A is that this could become something useful in a more general sense (i.e., to install things other than just a CEESD environment). This may be true, and I don't want to discourage that from being explored; but I don't think it needs to be emirge that does this, per se. We can extract that functionality into a separate package (with a more appropriate name; there isn't really anything "mirge" about it when it's installing something else) and then have emirge depend on it.

Thoughts?

@MTCam
Member

MTCam commented Aug 20, 2020

I think that version B is just version A with some default environments included. For example, one for people who intend to develop packages, and one for those who just want to run stuff.

Edit: I agree with everything you just said @majosm about the functionality and motivation of emirge. However the motivation behind our continued discussions about the functionality isn't about whether emirge can be used as an external, or more general tool, but remaining general enough to support one of my use cases - but I think it's an important one.

What does a person do when there is a need to run multiple simulations at the same time with different versions of the code? This is often faced by our main end-user (e.g. the person doing the prediction).

It is my understanding that since only one version of each package can be used at the same time in any given environment - that this situation requires multiple installations of mirgecom, each in its own environment. Is that true?

@majosm
Contributor Author

majosm commented Aug 20, 2020

I think that version B is just version A with some default environments included. For example, one for people who intend to develop packages, and one for those who just want to run stuff.

No, that doesn't cover it at all. Suppose I'm working on something that requires branch x of meshmode, branch y of grudge, and branch z of mirgecom (because I'm changing something that impacts all three). Then say I have another change that just requires branch m of mirgecom. Then somebody else asks for help debugging something that uses branches i, j, k, etc... 😱

Defaults don't cover this. We either encode this in git or manually pass these requirements.txt files around.

@matthiasdiener
Member

I think that version B is just version A with some default environments included. For example, one for people who intend to develop packages, and one for those who just want to run stuff.

No, that doesn't cover it at all. Suppose I'm working on something that requires branch x of meshmode, branch y of grudge, and branch z of mirgecom (because I'm changing something that impacts all three). Then say I have another change that just requires branch m of mirgecom. Then somebody else asks for help debugging something that uses branches i, j, k, etc... 😱

Defaults don't cover this. We either encode this in git or manually pass these requirements.txt files around.

There is imho a simpler third option: Telling users to run pip install -r requirements.txt inside the mirgecom dir whenever they checkout a new mirgecom branch.

@MTCam
Member

MTCam commented Aug 20, 2020

I think that version B is just version A with some default environments included. For example, one for people who intend to develop packages, and one for those who just want to run stuff.

No, that doesn't cover it at all. Suppose I'm working on something that requires branch x of meshmode, branch y of grudge, and branch z of mirgecom (because I'm changing something that impacts all three). Then say I have another change that just requires branch m of mirgecom. Then somebody else asks for help debugging something that uses branches i, j, k, etc... 😱

We agree on this, @majosm. I am not arguing against continuing to use version control with emirge, and being able to modify the "requirements.txt" file there, and check that in as a branch. To me VersionB still looks like VersionA, but with source control. I've never advocated that any of our files should be taken out of source control - and I missed that about your initial description of VersionA.

My real issue is dealing with the case where a single user needs to run multiple versions of the code at the same time on the same machine. I just need emirge to have enough functionality to allow that use case. That's the case behind all of my issues with emirge. emirge@master currently works for me - I just want to retain some of its behaviors.

@majosm
Contributor Author

majosm commented Aug 20, 2020

We agree on this, @majosm. I am not arguing against continuing to use version control with emirge, and being able to modify the "requirements.txt" file there, and check that in as a branch. To me VersionB still looks like VersionA, but with source control. I've never advocated that any of our files should be taken out of source control - and I missed that about your initial description of VersionA.

Ok, cool. Slack comments seemed to suggest otherwise, but I'm glad we're on the same page now.

My real issue is dealing with the case where a single user needs to run multiple versions of the code at the same time on the same machine. I just need emirge to have enough functionality to allow that use case. That's the case behind all of my issues with emirge. emirge@master currently works for me - I just want to retain some of its behaviors.

Does that come from this part:

ABATE wakes up and reads a file that tells it what projects to test (currently I test mirgecom@mtc/euler only - but each "project" is a branch of mirgecom for us)

?

To me, this isn't the approach ABATE/TEESD should be taking in a post-#72 world. Part of what I was trying to get at above and in Slack is that emirge's (or whatever "top-level" package we decide on) requirements.txt should be considered the official source for an environment's package version information, not mirgecom's. Trying to circumvent it to use mirgecom's instead is the wrong way to go about it IMO.

Can ABATE/TEESD be adjusted such that its projects can be branches of emirge (or whatever "top-level") instead of mirgecom? It would be nice if we could create a branch in emirge with a specified set of package versions, then just tell TEESD, "Hey, go test this emirge branch".

@MTCam
Member

MTCam commented Aug 20, 2020

Can ABATE/TEESD be adjusted such that its projects can be branches of emirge (or whatever "top-level") instead of mirgecom? It would be nice if we could create a branch in emirge with a specified set of package versions, then just tell TEESD, "Hey, go test this emirge branch".

Yes. That is how it works now, and how I would like for it to continue working. I misspoke before, each project is a different branch of emirge, not mirgecom. But each version of emirge that I test is associated with a particular branch of mirgecom. I was using emirge@teesd for that - but since all the emirge changes, I have switched to emirge@master and specify the --branch option to get a particular branch. I'm ok if this changes, it is sort of orthogonal to the troubles I have. As long as emirge@some_branch can install mirgecom@a-branch - any branch I need, then I'm fine with that.

If I limit the testing to one branch, this all works fine. If I need more than one version of the code to be running at the same time on the same platform (e.g. like during prediction or when testing changes or testing multiple branches) - that's when the trouble starts for me.

The feature of emirge that allows the multiple simultaneous installs to work is its ability to install each mirgecom installation into its own environment using a common conda. Current emirge does this. I seek to retain this behavior, and I don't think this feature runs afoul of our collective vision for emirge.

@matthiasdiener
Member

It is my understanding that since only one version of each package can be used at the same time in any given environment - that this situation requires multiple installations of mirgecom, each in its own environment. Is that true?

I believe that's true.

@matthiasdiener
Member

ABATE wakes up and reads a file that tells it what projects to test (currently I test mirgecom@mtc/euler only - but each "project" is a branch of mirgecom for us)

?

To me, this isn't the approach ABATE/TEESD should be taking in a post-#72 world. Part of what I was trying to get at above and in Slack is that emirge's (or whatever "top-level" package we decide on) requirements.txt should be considered the official source for an environment's package version information, not mirgecom's. Trying to circumvent it to use mirgecom's instead is the wrong way to go about it IMO.

Uhh, I think this would be hard to do. How would you keep emirge's requirements.txt in sync with whatever branch X of mirgecom's requirements.txt describes?

@MTCam
Member

MTCam commented Aug 20, 2020

It is my understanding that since only one version of each package can be used at the same time in any given environment - that this situation requires multiple installations of mirgecom, each in its own environment. Is that true?

I believe that's true.

Then I would like if emirge could (continue to) do that.

Uhh, I think this would be hard to do. How would you keep emirge's requirements.txt in sync with whatever branch X of mirgecom's requirements.txt describes?

I think I can just check out emirge@xyz, and know that it install(s) the version of mirgecom that I need. Can't I?

@matthiasdiener
Member

matthiasdiener commented Aug 20, 2020

It is my understanding that since only one version of each package can be used at the same time in any given environment - that this situation requires multiple installations of mirgecom, each in its own environment. Is that true?

I believe that's true.

Then I would like if emirge could (continue to) do that.

Ok, I think the current version of PR #72 can still do that.

Uhh, I think this would be hard to do. How would you keep emirge's requirements.txt in sync with whatever branch X of mirgecom's requirements.txt describes?

I think I can just check out emirge@xyz, and know that it install(s) the version of mirgecom that I need. Can't I?

I don't think there is currently a way to do this automatically, and I struggle to imagine even a theoretical way how to do that automatically. The manual way to do this (either in emirge@master or with PR #72) is to manually select a branch of mirgecom, or just tell emirge to install mirgecom@master, run git checkout <mybranch> in mirgecom/, and run pip install -r requirements.txt (to get the correct version of dependencies).
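The manual route described here amounts to two commands run inside the mirgecom checkout. A minimal sketch, assuming the checkout lives in a directory called `mirgecom` and that the active Python is the one from the target environment; the function names are hypothetical:

```python
import subprocess


def switch_branch_steps(branch):
    """Commands to switch branch and refresh dependencies (hypothetical sketch)."""
    return [
        ["git", "checkout", branch],
        # Re-install so dependency versions match the new branch's requirements.txt.
        ["python", "-m", "pip", "install", "-r", "requirements.txt"],
    ]


def run_steps(steps, package_dir="mirgecom"):
    """Run each step inside the package checkout, stopping at the first failure."""
    for argv in steps:
        subprocess.run(argv, cwd=package_dir, check=True)


if __name__ == "__main__":
    # Print the steps only; call run_steps(...) to actually execute them.
    for argv in switch_branch_steps("mybranch"):
        print(" ".join(argv))
```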

@MTCam
Member

MTCam commented Aug 20, 2020

I think I can just check out emirge@xyz, and know that it install(s) the version of mirgecom that I need. Can't I?

I don't think there is currently a way to do this automatically, and I struggle to imagine even a theoretical way how to do that automatically. The manual way to do this (either in emirge@master or with PR #72) is to manually select a branch of mirgecom, or just tell emirge to install mirgecom@master, run git checkout <mybranch> in mirgecom/, and run pip install -r requirements.txt (to get the correct version of dependencies).

Wouldn't it amount to just checking out emirge, make a branch, edit the requirements.txt to be the version of mirgecom that I want and then just check that back in as a branch of emirge? What would stop me from doing that?

@matthiasdiener
Member

matthiasdiener commented Aug 20, 2020

I think I can just check out emirge@xyz, and know that it install(s) the version of mirgecom that I need. Can't I?

I don't think there is currently a way to do this automatically, and I struggle to imagine even a theoretical way how to do that automatically. The manual way to do this (either in emirge@master or with PR #72) is to manually select a branch of mirgecom, or just tell emirge to install mirgecom@master, run git checkout <mybranch> in mirgecom/, and run pip install -r requirements.txt (to get the correct version of dependencies).

Wouldn't it amount to just checking out emirge, make a branch, edit the requirements.txt to be the version of mirgecom that I want and then just check that back in as a branch of emirge? What would stop me from doing that?

Are you going to create a branch in emirge for every branch in mirgecom? That seems... not good...

@MTCam
Member

MTCam commented Aug 20, 2020

I think I can just check out emirge@xyz, and know that it install(s) the version of mirgecom that I need. Can't I?

I don't think there is currently a way to do this automatically, and I struggle to imagine even a theoretical way how to do that automatically. The manual way to do this (either in emirge@master or with PR #72) is to manually select a branch of mirgecom, or just tell emirge to install mirgecom@master, run git checkout <mybranch> in mirgecom/, and run pip install -r requirements.txt (to get the correct version of dependencies).

Wouldn't it amount to just checking out emirge, make a branch, edit the requirements.txt to be the version of mirgecom that I want and then just check that back in as a branch of emirge? What would stop me from doing that?

Are you going to make a branch in emirge for every branch in mirgecom? That seems... not good...

Naw, just the ones I want to share or be able to recreate remotely. Like if I need you or Matt to look at something (for the umpteenth time), then I can just check in an emirge branch that will install exactly the packages you need to recreate my environment. Look at it, run it, then blow it away.

@matthiasdiener
Member

I think I can just check out emirge@xyz, and know that it install(s) the version of mirgecom that I need. Can't I?

I don't think there is currently a way to do this automatically, and I struggle to imagine even a theoretical way how to do that automatically. The manual way to do this (either in emirge@master or with PR #72) is to manually select a branch of mirgecom, or just tell emirge to install mirgecom@master, run git checkout <mybranch> in mirgecom/, and run pip install -r requirements.txt (to get the correct version of dependencies).

Wouldn't it amount to just checking out emirge, make a branch, edit the requirements.txt to be the version of mirgecom that I want and then just check that back in as a branch of emirge? What would stop me from doing that?

Are you going to make a branch in emirge for every branch in mirgecom? That seems... not good...

Naw, just the ones I want to share or be able to recreate remotely. Like if I need you or Matt to look at something (for the umpteenth time), then I can just check in an emirge branch that will install exactly the packages you need to recreate my environment. Look at it, run it, then blow it away.

That seems like a very labor-intensive process and would also duplicate the requirements in two different files, like I mentioned above.

@MTCam
Member

MTCam commented Aug 20, 2020

I think I can just check out emirge@xyz, and know that it install(s) the version of mirgecom that I need. Can't I?

I don't think there is currently a way to do this automatically, and I struggle to imagine even a theoretical way how to do that automatically. The manual way to do this (either in emirge@master or with PR #72) is to manually select a branch of mirgecom, or just tell emirge to install mirgecom@master, run git checkout <mybranch> in mirgecom/, and run pip install -r requirements.txt (to get the correct version of dependencies).

Wouldn't it amount to just checking out emirge, make a branch, edit the requirements.txt to be the version of mirgecom that I want and then just check that back in as a branch of emirge? What would stop me from doing that?

Are you going to make a branch in emirge for every branch in mirgecom? That seems... not good...

Naw, just the ones I want to share or be able to recreate remotely. Like if I need you or Matt to look at something (for the umpteenth time), then I can just check in an emirge branch that will install exactly the packages you need to recreate my environment. Look at it, run it, then blow it away.

That seems like a very labor-intensive process and would also duplicate the requirements in two different files, like I mentioned above.

I don't follow that it is labor intensive, or the duplication concern. I'm not saying there would be a new requirements file - just that the requirements file can differ between emirge branches, like any other file can differ between branches.

@matthiasdiener
Member

matthiasdiener commented Aug 20, 2020

I don't follow that it is labor intensive, or the duplication concern. I'm not saying there would be a new requirements file - just that the requirements file can differ between emirge branches, like any other file can differ between branches.

Regarding the labor intensity: Imagine you create a branch X in mirgecom, and have to create a corresponding branch X in emirge. Some days later, you change the requirements in mirgecom's branch X, and have to remember to make the corresponding changes in emirge's branch X.

Regarding the duplication: instead of the relatively coarse-grained dependencies tracked in the #72 PR, you now need to track every change to mirgecom requirements.txt (whether to the master branch or another branch) in a second file that is outside mirgecom (emirge's requirements.txt).

@MTCam
Member

MTCam commented Aug 20, 2020

I don't follow that it is labor intensive, or the duplication concern. I'm not saying there would be a new requirements file - just that the requirements file can differ between emirge branches, like any other file can differ between branches.

Regarding the labor intensity: Imagine you create a branch X in mirgecom, and have to create a corresponding branch X in emirge. Some days later, you change the requirements in mirgecom's branch X, and have to remember to make the corresponding changes in emirge's branch X.

This seems like the required amount of labor to me. Iff I want/need emirge to be able to install my branch directly, then yes - I make a branch of emirge that does it (simply by editing the emirge/requirements.txt), and then checking that branch in.
Edit: I don't think I need to explicitly keep emirge@abc/requirements.txt in sync with mirgecom@abc/requirements.txt, because as long as emirge@abc/requirements.txt indicates mirgecom@abc as the installed branch, then when pip installs mirgecom@abc, it looks at mirgecom@abc/requirements.txt.

In many cases (including for our personal individual use), simply checking out emirge@master and installing will be enough, because we can easily (some of us more easily than others) switch between our branches for our current environment. No need to make an emirge branch for that.

Regarding the duplication: instead of the relatively coarse-grained dependencies tracked in the #72 PR, you now need to track every change to mirgecom requirements.txt (whether to the master branch or another branch) in a second file that is outside mirgecom (emirge's requirements.txt).

Sorry, I still don't follow where this external requirements.txt is coming from. The requirements.txt file I've been thinking about is the one in emirge. Oh I see, an extra one outside mirgecom, you said. For me this seems OK, since every package we work with has a requirements.txt file. I do not think of this as two different requirements.txt files for mirgecom, but one emirge@branch/requirements.txt for CEESD development environment (that uses mirgecom@xyz). mirgecom@xyz/requirements.txt is just requirements for mirgecom@xyz.

@majosm
Contributor Author

majosm commented Aug 20, 2020

Edit: I agree with everything you just said @majosm about the functionality and motivation of emirge. However the motivation behind our continued discussions about the functionality isn't about whether emirge can be used as an external, or more general tool, but remaining general enough to support one of my use cases - but I think it's an important one.

What does a person do when there is a need to run multiple simulations at the same time with different versions of the code? This is often faced by our main end-user (e.g. the person doing the prediction).

It is my understanding that since only one version of each package can be used at the same time in any given environment - that this situation requires multiple installations of mirgecom, each in its own environment. Is that true?

Gotcha. Right, I think you'd want to clone emirge several times and install each into a separate environment. If #72 doesn't currently allow this, it should be fixed up to do so.

That seems like a very labor-intensive process and would also duplicate the requirements in two different files, like I mentioned above.

I don't see it as being very labor intensive. Keep in mind, this type of duplication already exists, e.g.: if you want to use a different branch of loopy, you already need to modify requirements.txt for meshmode, grudge, and mirgecom. This just adds one more step in the chain.

Also, emirge's requirements.txt may not always remain a simple duplication of mirgecom's. It's possible we may add some extra packages that are useful within the center, but aren't strictly required for mirgecom, etc.

@matthiasdiener
Member

I don't follow that it is labor intensive, or the duplication concern. I'm not saying there would be a new requirements file - just that the requirements file can differ between emirge branches, like any other file can differ between branches.

Regarding the labor intensity: Imagine you create a branch X in mirgecom, and have to create a corresponding branch X in emirge. Some days later, you change the requirements in mirgecom's branch X, and have to remember to make the corresponding changes in emirge's branch X.

This seems like the required amount of labor to me. Iff I want/need emirge to be able to install my branch directly, then yes - I make a branch of emirge that does it (simply by editing the emirge/requirements.txt), and then checking that branch in.

In many cases (including for our personal individual use), simply checking out emirge@master and installing will be enough, because we can easily (some of us more easily than others) switch between our branches for our current environment. No need to make an emirge branch for that.

Ok, fair enough.

Regarding the duplication: instead of the relatively coarse-grained dependencies tracked in the #72 PR, you now need to track every change to mirgecom requirements.txt (whether to the master branch or another branch) in a second file that is outside mirgecom (emirge's requirements.txt).

Sorry, I still don't follow where this external requirements.txt is coming from. The requirements.txt file I've been thinking about is the one in emirge.

We'll need to duplicate changes in mirgecom's requirements.txt to emirge's requirements.txt.

@matthiasdiener
Member

matthiasdiener commented Aug 20, 2020

Keep in mind, this type of duplication already exists, e.g.: if you want to use a different branch of loopy, you already need to modify requirements.txt for meshmode, grudge, and mirgecom. This just adds one more step in the chain.

I don't think we need to modify meshmode's etc. requirements.txt if we want to use a different branch of loopy. This should come in automatically if we change mirgecom's requirements.txt.

Also, emirge's requirements.txt may not always remain a simple duplication of mirgecom's. It's possible we may add some extra packages that are useful within the center, but aren't strictly required for mirgecom, etc.

Certainly, and PR #72 already has such a facility, by using requirements.txt for "our" packages (such as loopy), and requirements_dev.txt for nice-to-have packages such as flake8.

@majosm
Contributor Author

majosm commented Aug 20, 2020

I don't think we need to modify meshmode's etc. requirements.txt if we want to use a different branch of loopy. This will come in automatically if we change mirgecom's requirements.txt.

Oh good, then we don't need to change mirgecom's either, we can just change emirge's. 😄

Certainly, and PR #72 already has such a facility, by using requirements.txt for "our" packages (such as loopy), and requirements_dev.txt for nice-to-have packages such as flake8.

I'm not really talking about optional dev packages, I was thinking more along the lines of things we would use in our simulations that fall outside of the scope of mirgecom as a solver-focused package.

@matthiasdiener
Member

I don't think we need to modify meshmode's etc. requirements.txt if we want to use a different branch of loopy. This will come in automatically if we change mirgecom's requirements.txt.

Oh good, then we don't need to change mirgecom's either, we can just change emirge's. 😄

Not really, since mirgecom's requirements.txt should reflect accurate requirements as well. Not everyone will use emirge to install mirgecom.

Certainly, and PR #72 already has such a facility, by using requirements.txt for "our" packages (such as loopy), and requirements_dev.txt for nice-to-have packages such as flake8.

I'm not really talking about optional dev packages, I was thinking more along the lines of things we would use in our simulations that fall outside of the scope of mirgecom as a solver-focused package.

Ok, we could add another file, requirements_sim.txt or so.

@majosm
Contributor Author

majosm commented Aug 20, 2020

Not really, since mirgecom's requirements.txt should reflect accurate requirements as well. Not everyone will use emirge to install mirgecom.

Nor will everyone use mirgecom to install meshmode or grudge. 🙂

To clarify one thing: I don't think every dependency of mirgecom should go in emirge's requirements.txt. It should just contain our CEESD packages (and maybe a few odds and ends that don't make sense as dependencies of any particular subpackage). Other dependencies will get installed by the package that needs them.

Ok, we could add another file, requirements_sim.txt or so.

Why?

@matthiasdiener
Member

matthiasdiener commented Aug 20, 2020

Not really, since mirgecom's requirements.txt should reflect accurate requirements as well. Not everyone will use emirge to install mirgecom.

Nor will everyone use mirgecom to install meshmode or grudge. 🙂

Sure, but wouldn't that be the responsibility of the meshmode etc. package then? Just to clarify: If you need a different loopy branch due to a change in mirgecom, changing the requirements.txt in mirgecom should be enough; you normally don't need to change the requirements.txt in other packages that also happen to require loopy.

To clarify one thing: I don't think every dependency of mirgecom should go in emirge's requirements.txt. It should just contain our CEESD packages (and maybe a few odds and ends that don't make sense as dependencies of any particular subpackage). Other dependencies will get installed by the package that needs them.

Right, and that's the situation with #72 currently (minus the setup.py vs. requirements.txt shenanigans), right?

Ok, we could add another file, requirements_sim.txt or so.

Why?

To make them easier to install (optionally)?

@MTCam
Member

MTCam commented Aug 20, 2020

This might call for a few model use cases - especially the ones we all think we want and/or need. Maybe we can tweak our use cases and the proposed solution i.e. PR #72 (if any tweaking is needed), until they meet up. I have some expectation that my use case may be part of what needs "debugged" here.

But here it is:

Multiple working installs for the end-user.

The end-user has some additional packages to install to their environment specified by conda_extra_package_names.txt and pip_extra_requirements.txt. Assume the user has already tailored those lists for platform compatibility. The end-user requires two different branches of mirgecom, say branch1, and branch2. These two branches are already assumed to have working, valid requirements.txt files inside - indicating whatever branches of subprojects, etc.

Install 1:

git clone git@github.com:/illinois-ceesd/emirge emirge.branch1
cd emirge.branch1
./install.sh --conda-path= --env-name=branch1.env --branch=branch1 --conda-pkgs=conda_extra_package_names.txt --pip-pkgs=pip_extra_requirements.txt

Some explaining of the arguments:

  • --conda-path= is required because two different installations of conda fight with each other when installed side-by-side
  • --env-name=branch1.env is really optional, end-user could let this default to "dgfem", but he opts to name the environment (the only requirement is that this environment is named other than the one used for Install 2)
  • --branch=branch1 is also optional because end-user can always just install master, and switch the branch manually, but the option is there, and it works so he uses it [ this option is removed in use pip to install dependencies #72 ]
  • --pip-pkgs=pip_req.txt - Optional, could apply manually [ option removed by use pip to install dependencies #72 ]
  • --conda-pkgs=conda_extra.txt - This is optional, end-user could apply them manually
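As a rough illustration of the options listed above, here is a minimal sketch of how an installer could parse them. The option names and the "dgfem" default come from this thread; the function name parse_install_args, the variable names, and the "master" default are assumptions for illustration and are not taken from the actual install.sh.

```shell
# Hypothetical sketch: parse the --key=value options discussed in this thread.
# Not the real install.sh; names other than the option flags are assumptions.
parse_install_args() {
    conda_path=""        # required, per the discussion
    env_name="dgfem"     # default environment name mentioned in the thread
    branch="master"      # assumed default: install master unless told otherwise
    conda_pkgs=""        # optional file listing extra conda packages
    pip_pkgs=""          # optional pip requirements file

    for arg in "$@"; do
        case "$arg" in
            --conda-path=*) conda_path="${arg#*=}" ;;
            --env-name=*)   env_name="${arg#*=}" ;;
            --branch=*)     branch="${arg#*=}" ;;
            --conda-pkgs=*) conda_pkgs="${arg#*=}" ;;
            --pip-pkgs=*)   pip_pkgs="${arg#*=}" ;;
            *) echo "unknown option: $arg" >&2; return 1 ;;
        esac
    done

    # The thread treats --conda-path as the one mandatory option.
    if [ -z "$conda_path" ]; then
        echo "--conda-path is required" >&2
        return 1
    fi
}
```

Keeping each option as a plain variable with a default makes it easy to see which ones the end-user can skip.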

After #72 and the removal of those options - the user must follow either:
manual install path 1: install mirgecom@master in editable mode, then switch branch(es) -or-
manual install path 2: edit the emirge@requirements.txt to indicate the branch he needs, branch1, and optionally add the additional pip packages he needs
Question: In (manual install path 1), does the switch of mirgecom branch automatically take care of switching the sub-package branches if required in mirgecom@branch/requirements.txt?

Now the end-user has a fully working mirgecom@branch1 installed on the platform using <common conda> with environment named branch1.env.

Install 2

git clone git@github.com:/illinois-ceesd/emirge emirge.branch2
cd emirge.branch2
./install.sh --conda-path= --env-name=branch2.env --branch=branch2 --conda-pkgs=conda_extra_package_names.txt --pip-pkgs=pip_extra_requirements.txt

After #72, the user will have used one of the manual install paths, MIP1, or MIP2 or some combo thereof. I contend that MIP2 is superior because:

Now the user wants to take his simulations or a portion of those say from Quartz to Lassen, or share his setup with others (which he does). Having followed MIP2, then he can just check in his emirge changes in a branch (e.g. emirge@eu_branch1).

In this sense, MIP2 seems superior even to emirge@master. Additionally if the user wants to automate this process later, it is easier because the automaton can just checkout emirge@eu_branch1 and "hit go", instead of scripting manual steps.
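For concreteness, the "edit requirements.txt to indicate the branch" step of MIP2 could be sketched as below. This assumes the requirements file uses pip's editable VCS form (a line ending in `.git@<branch>#egg=<name>` or `.git#egg=<name>`), which may not match the actual emirge file; pin_branch is a hypothetical helper, not part of emirge.

```shell
# Hypothetical helper: point one package's editable git requirement at a branch.
# Assumes lines of the form  git+<url>.git[@<branch>]#egg=<pkg>  (an assumption).
pin_branch() {
    req_file="$1"; pkg="$2"; branch="$3"
    tmp_file=$(mktemp)
    # Replace (or add) the @<branch> part on the line whose egg name is $pkg.
    sed -E "s,(git\+[^#]*\.git)(@[^#@]+)?(#egg=${pkg})\$,\1@${branch}\3," \
        "$req_file" > "$tmp_file" &&
        cat "$tmp_file" > "$req_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

After something like `pin_branch requirements.txt mirgecom branch1`, the change could be committed on an emirge branch such as eu_branch1 and shared or checked out on another platform.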

@majosm
Contributor Author

majosm commented Aug 21, 2020

Right, and that's the situation with #72 currently (minus the setup.py vs. requirements.txt shenanigans), right?

Yeah, and so apart from occasional hacks to me this doesn't seem like it's creating too much duplication or making it very labor-intensive to maintain.

Sure, but wouldn't that be the responsibility of the meshmode etc. package then? Just to clarify: If you need a different loopy branch due to a change in mirgecom, changing the requirements.txt in mirgecom should be enough; you normally don't need to change the requirements.txt in other packages that also happen to require loopy.

To make them easier to install (optionally)?

(I was thinking more non-optional packages.) Anyway, the point I was trying to make with these two lines of discussion is that we're trying not to treat mirgecom as the "head honcho" package. The idea is for the mirgecom package to have a targeted role (similar to, e.g., grudge), with the possibility that there may be other packages we create in CEESD that sit at the same (or higher?) level in the dependency hierarchy.

@majosm
Contributor Author

majosm commented Aug 21, 2020

  • --conda-path= is required because two different installations of conda fight with each other when installed side-by-side
  • --env-name=branch1.env is really optional, end-user could let this default to "dgfem", but he opts to name the environment (the only requirement is that this environment is named other than the one used for Install 2)

Personally, I'd like to see the conda installation/environment creation done in a separate (optional) script, if possible.

  • --branch=branch1 is also optional because end-user can always just install master, and switch the branch manually, but the option is there, and it works so he uses it [ this option is removed in use pip to install dependencies #72 ]

I'm leaning towards saying this should go away...

  • --pip-pkgs=pip_req.txt - Optional, could apply manually [ option removed by use pip to install dependencies #72 ]
  • --conda-pkgs=conda_extra.txt - This is optional, end-user could apply them manually

Question: does the install script do anything extra with these beyond pip install x and conda install y for each x and y in the lists? If it does, I think I'm ok with these, at least for now. This might be motivation to separate the general installer script functionality into its own package at some point, though.

After #72 and the removal of those options - the user must follow either:
manual install path 1: install mirgecom@master in editable mode, then switch branch(es) -or-

I think you would want to switch branches before installing, right?

Question: In (manual install path 1), does the switch of mirgecom branch automatically take care of switching the sub-package branches if required in mirgecom@branch/requirements.txt?

It doesn't, as far as I know. We would need a script for that I think.
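A script for that might look roughly like the following sketch. It assumes the requirements file pins branches with pip's editable VCS syntax (`-e git+<url>@<branch>#egg=<name>`) and that each package is checked out in a directory named after its egg name under a common source directory; both are assumptions, and the function names are hypothetical.

```shell
# Print "<package> <branch>" for every editable git requirement that pins a branch.
# Requirements without an explicit @<branch> are skipped.
list_pinned_branches() {
    sed -n -E 's,^(-e|--editable)[[:space:]]+git\+.*@([^#@]+)#egg=([A-Za-z0-9_.-]+).*$,\3 \2,p' "$1"
}

# Check out the pinned branch in each package directory under src_dir.
# Packages without a git checkout there are reported and skipped.
switch_pinned_branches() {
    req_file="$1"
    src_dir="$2"
    list_pinned_branches "$req_file" | while read -r pkg branch; do
        if [ -d "$src_dir/$pkg/.git" ]; then
            git -C "$src_dir/$pkg" fetch --quiet origin &&
                git -C "$src_dir/$pkg" checkout --quiet "$branch"
        else
            echo "skipping $pkg: no git checkout in $src_dir" >&2
        fi
    done
}
```

It would be run after switching the mirgecom branch, for example `switch_pinned_branches mirgecom/requirements.txt .`, where the paths are placeholders.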

@MTCam
Member

MTCam commented Aug 21, 2020

  • --conda-path= is required because two different installations of conda fight with each other when installed side-by-side
  • --env-name=branch1.env is really optional, end-user could let this default to "dgfem", but he opts to name the environment (the only requirement is that this environment is named other than the one used for Install 2)

Personally, I'd like to see the conda installation/environment creation done in a separate (optional) script, if possible.

I've been advocating for making the conda step separate for some time. It is a convenience at best, and just in the way most of the time.

  • --branch=branch1 is also optional because end-user can always just install master, and switch the branch manually, but the option is there, and it works so he uses it [ this option is removed in use pip to install dependencies #72 ]

I'm leaning towards saying this should go away...

Indeed, this goes away with #72.

  • --pip-pkgs=pip_req.txt - Optional, could apply manually [ option removed by use pip to install dependencies #72 ]
  • --conda-pkgs=conda_extra.txt - This is optional, end-user could apply them manually

Question: does the install script do anything extra with these beyond pip install x and conda install y for each x and y in the lists? If it does, I think I'm ok with these, at least for now. This might be motivation to separate the general installer script functionality into its own package at some point, though.

No, this is just convenience. Consider that if you wanted to automate this, then first you'd need to get emirge, install the ceesd env to get conda and a new compatible environment, then script installing extra stuff above-and-beyond ceesd requirements into the new conda/env. This is more difficult and error-prone than you might think. If the install script has the option to put in extra stuff just by listing packages - then this makes it much easier to automate environment customization.

After #72 and the removal of those options - the user must follow either:
manual install path 1: install mirgecom@master in editable mode, then switch branch(es) -or-

I think you would want to switch branches before installing, right?

As @matthiasdiener has been saying - we can just install from master, then go in and switch the branch of mirgecom. Before installing, I have no mirgecom directory to go into to switch the branch. Manual install path 2 is the one where I switch the branch in the emirge/requirements.txt before installing.

Question: In (manual install path 1), does the switch of mirgecom branch automatically take care of switching the sub-package branches if required in mirgecom@branch/requirements.txt?

It doesn't, as far as I know. We would need a script for that I think.

So - another superiority of manual install path 2. Switch the branch before installing.

@majosm
Contributor Author

majosm commented Aug 21, 2020

This is more difficult and error-prone than you might think.

This is kind of what I meant by "doing something extra" (maybe poor wording). i.e., it's not exactly equivalent to something like:

./install.sh <args>
for x in <extra conda packages>; do
    conda install "$x"
done
for y in <extra pip packages>; do
    pip install "$y"
done

Instead there's some additional processing going on inside the script that makes having those options worthwhile. Right?

As @matthiasdiener has been saying - we can just install from master, then go in and switch the branch of mirgecom. Before installing, I have no mirgecom directory to go into to switch the branch. Manual install path 2 is the one where I switch the branch in the emirge/requirements.txt before installing.

So - another superiority of manual install path 2. Switch the branch before installing.

Ahhh ok, I misunderstood what MIP1 was doing. Got it now.

@MTCam
Member

MTCam commented Aug 21, 2020

Instead there's some additional processing going on inside the script that makes having those options worthwhile. Right?

I meant to say, no, there is nothing special about allowing these extra options. It only makes it more convenient for adding stuff to the environment in the install process. The --pip-pkgs=additional_requirements.txt option just passes that requirements file through to pip. The --conda-pkgs=my_list.txt option does basically the operation you outlined.

The thing that makes these options nice is that they allow the user to insert additional things that the emirge install script can easily put into the CEESD environment on-the-fly. Without the options, I need to extract knowledge about which conda to use and which environment name to add the packages to, and then ensure that my additional install scripts pick up the right environment settings. On the command line, this is a trivial thing to do, but in scripts it is sort of cumbersome - and introduces yet another set of scripts to run to set up testing environments, etc.

Since these options do no harm, and provide a useful function (useful to me), I'd say keep them.
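To make the pass-through behaviour concrete, here is a minimal sketch of what handling the two options could look like once the environment name is known. It is not the actual emirge code: install_extras, read_package_list, and the RUN dry-run variable are hypothetical, and it assumes a conda recent enough to provide `conda run`.

```shell
# Print package names from a list file, skipping blank lines and # comments.
read_package_list() {
    grep -v -E '^[[:space:]]*(#|$)' "$1"
}

# Install extras into a named conda environment.
# Set RUN=echo to print the commands instead of running them (dry run).
install_extras() {
    env_name="$1"; conda_list="$2"; pip_reqs="$3"
    run="${RUN:-}"
    if [ -n "$conda_list" ]; then
        read_package_list "$conda_list" | while read -r pkg; do
            # </dev/null keeps conda from consuming the rest of the list on stdin
            $run conda install --yes --name "$env_name" "$pkg" </dev/null
        done
    fi
    if [ -n "$pip_reqs" ]; then
        # The requirements file is handed to pip unchanged.
        $run conda run --name "$env_name" python -m pip install -r "$pip_reqs"
    fi
}
```

Because the installer already knows the environment name, the caller never has to rediscover it, which is the convenience described above. A dry run such as `RUN=echo install_extras branch1.env conda_extra_package_names.txt pip_extra_requirements.txt` shows what would be executed.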

One other thing that you asked about earlier @majosm and I've just experienced again today reminding me why it was there....

You asked me something like "why do you check out your own mirgecom since emirge has just done already when it installed mirgecom?".

I gave the wrong answer when I said I don't need to any more. When ABaTe does its Continuous mode building, it checks the repo for updates every 5-10 minutes. It does this just by doing git pull and seeing whether the revision number changes (it detects changes in submodules too). Detecting one, it triggers a build and test on the remote platforms. When emirge had submodules (including mirgecom), this worked great. A change in mirgecom could trigger the builds. But when emirge changed to not have submodules - this became much harder. Now Continuous testing needs to detect changes in mirgecom to trigger a build-and-test of emirge. So I toyed with a construction that would have an independent mirgecom submodule just so the revision number would update when the Continuous ABaTe runner checked the repo.
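One possible way to detect upstream changes without a submodule is to poll the remote branch tip directly and compare it with the last hash seen. The sketch below is only an illustration of that idea, not how ABaTe works; the function names and the state file are assumptions.

```shell
# Print the commit hash at the tip of a remote branch (needs network access).
remote_head() {
    git ls-remote "$1" "refs/heads/$2" | cut -f1
}

# Return 0 and record the new hash if it differs from the last one recorded
# in state_file; return 1 if nothing changed or no hash was supplied.
revision_changed() {
    state_file="$1"; new_rev="$2"
    old_rev=""
    [ -f "$state_file" ] && old_rev=$(cat "$state_file")
    if [ -n "$new_rev" ] && [ "$new_rev" != "$old_rev" ]; then
        printf '%s\n' "$new_rev" > "$state_file"
        return 0
    fi
    return 1
}
```

A polling job could then call `revision_changed .mirgecom_rev "$(remote_head <mirgecom url> master)"` every few minutes and trigger a build-and-test only when it returns success.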

The Continuous issue was never solved. But we sidestep it by just not doing Continuous-type ABaTe testing. Nightlies will be enough for the production compute platforms and we'll let the github CI do all the continuous testing.
