Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dev container; fix test discovery bug; commit LF on all systems #45

Merged
merged 60 commits into from
Feb 2, 2023

Conversation

EliahKagan
Copy link
Owner

This adds an experimental dev container configuration. It also makes two other significant changes that support it, which I think are reasonable even if dev containers are not used.

I have not made a custom Docker image. As detailed in doc/install-dev-container.md, this dev container config uses a general-purpose Python image, adds dev container features, and adds more customization and tools as well as installing palgoviz and its dependencies via both mamba and poetry in a script triggered by a postCreateCommand. So the dev container is quite fresh each time it is created, but it also takes a long time to create. It would be an improvement to have a Docker image, provided it were updated frequently enough. As further detailed there, JupyterLab is not yet working well enough in this dev container to recommend the dev container as a good alternative to other workflows that use JupyerLab--at least not when the dev container is used in a codespace. The dev container can also be used locally, with VS Code and Docker.

On the test discovery bug, this was due to conda/conda#12082 (see also microsoft/vscode-python#19069 and microsoft/vscode-python#18493). That conda bug, as indicated there, affects all systems in general, but it never seems to have affected palgoviz on Windows. It affects palgoviz on some GNU/Linux systems and not others. But one affected system was the dev container when used locally (it worked fine in a codespace), so I decided to fix that here, too. The fix affects the procedures we document for installing palgoviz, recommending pip install -e . in place of conda develop ., allowing a dependency that triggered the problem to be removed. This might also be a better recommendation for other reasons, in view of conda/conda-build#1992 (comment) and conda/conda-build#4251. The main commits in palgoviz with more information about this are c99cb6f and 84dbe59 (but 0b56e9e is also relevant).

On using LF line endings in our source code files on all systems--even Windows, which usually uses CRLF line endings--there is controversy in general about whether this is a better default than using system-specific line endings. The factor that made this change important to dev containers is microsoft/vscode-remote-release#302. When the dev container is used locally, and when the system it is used on is Windows, the git command in Windows is used to clone and check out the repository initially. When git is configured in the usual way in Windows (which is often expected), CLRF line endings are checked out and LF line endings are committed. But that causes the container to get CLRF line endings, even though git in the system running in the container--which in this case is Debian--expects LF line endings! See bdd5789 (and 3bfbb01) for the change.

It may need more customization later.
This is simpler than what I was doing before, and I'm hoping it
will work better, too. (VS Code in the dev container has not been
automatically selecting the virtual environment.)
This no longer puts any venv in the project dir.

Both mamba and poetry are installed in a synchronous postCreate.
Sourcing ~/.bashrc doesn't help, since the shell is noninteractive
(it checks). But it's also not the best way to do this.

Also, I forgot to pass --yes, even though the update otherwise
prompts for confirmation.
It changed the meaning in the comment.
I'm using absolute paths (with tilde expansion) everywhere, except
in the command passed to "mamba run".

This also undoes the change in the last commit, which didn't help
and shouldn't be necessary.
For when using poetry.

(conda/mamba put graphviz in the environment, taking precedence.)
Instead of downloading and installing them in postCreate.

+ Modify postCreate accordingly, to still do config
+ Add shellcheck as a dev feature
Permissions are not set to allow it, at least in the case of
poetry, installed via pipx but through a feature. I don't want to
elevate privileges to do it, and I'm not sure it's a good idea in
general, really.

If we get a sufficiently recent version of poetry (and differences
to the latest stable are not security-related), then it's fine to
use a slightly older version.
This adds commands to postCreate to set up Git prompt integration
in PowerShell (pwsh) using the posh-git module.
This removes the code from postCreate to install Graphviz by
downloading and building it from source, and instead installs it
via Homebrew, using the homebrew-package dev container feature.

Instead of building it from source.
+ Configure it for the workspace (not for just the dev container).
palgoviz (not specifically the ek/container branch) is affected by
conda/conda#12082.

The effect in palgoviz is that whether pytest discovery works in VS
Code seems to be determined by details of how conda is installed.
In practice, test discovery seems always to work on Windows but only
occasionally on GNU/Linux (even though that does not appear to be
the general pattern of the conda bug).

See also:

- microsoft/vscode-python#19069
- microsoft/vscode-python#18493

This bug should be worked around in palgoviz because, currently,
the conda instructions in README.md and doc/install-with-conda.md
will likely fail to work for a substantial fraction of users.

We depend on conda-build for "conda develop ." or "mamba develop ."
but "pip install -e ." should meet our needs (though I feel it is
less elegant). This commit removes the conda-build dependency and
runs "pip install -e ." in the dev container.

NOTE: These are NOT the only changes that need to be made. If we
are going use this workaround, then those two documentation files
must also be updated to recommend it.
Homebrew was installing a lot of dependencies for it, including
linux-headers, which failed with "fatal: not in a git directory".

Unlike Homebrew, which has bottles, I don't think Nix installs
binary packages, so that is a disadvantage. It might be worthwhile
to get Homebrew working for it and compare the size and time to
install. For now, I'm trying Nix.
It appears I was mistaken in 7044a6a: the powershell feature break
was not due to the apt-packages feature being added. Instead, it
seems to have been a coincidence, where access to that container
was temporarily disrupted.

I was able to produce that with several versions of the dev
container config, including multiple versions that had worked
previously. Now I cannot reproduce it at all, so I am hoping it
was a temporary problem that was fixed or otherwise went away.

This commit brings things back to the way they were in e823fc1.
This only affects the dev container. It has a few advantages:

(1) Mambaforge is placed in a directory where users are more likely
    to expect it, and where it is more obviously owned by the user.

(2) Only the "condabin" directory is added to $PATH when no conda
    environment is activated, so python and python3 continue to
    refer to the Python 3.11.1 installation that is supplied by
    mcr.microsoft.com/devcontainers/python:0-3.11, rather than the
    Mambaforge base environment's Python 3.10.8.

(3) Maybe also related to $PATH, it avoids a pytest test discovery
    error in VS Code in the dev container. The error is due to:

    conda/conda#12082

    For more on the connection to test discovery in VS Code, see:

    microsoft/vscode-python#18493
    microsoft/vscode-python#19069

    But I don't know why this avoids it. In particular, merely
    keeping the conda/mamba-provided bin directory containing its
    base environment's python out of $PATH does not fix that. This
    problem seems to happen on many, though not all, GNU/Linux
    systems with palgoviz. It seems never to happen in Windows,
    with palgoviz. I don't know why that is, either.

    Unless conda fixes the bug soon, palgoviz ought to work around
    it. See c99cb6f, though that branch may or may not be merged.
    By having the dev container install Mambaforge in postCreate,
    assuming this continues to avoid the effects of the bug, it may
    be feasible/reasonable to merge the dev container config to
    main before deciding if/how palgoviz should work around the
    bug. This could be beneficial because having the dev container
    should make some kinds of testing easier.
It's a bash script, so it should have Unix-style (LF) line endings
no matter where it is cloned. This may matter for some ways of
cloning the repository into a Docker container via Git for Windows.
This puts devcontainer.json and the accompanying postCreate script
in .devcontainer/experimental instead of directly in .devcontainer,
so that VS Code doesn't automatically suggest reopening in a dev
container, which most users, at least at this point, should not do.
For both the command line and VS Code.
Before it was sort of the default, but only in VS Code settings.
This resulted in bash being used when resuming the container and in
various other contexts, including terminals inside JupyterLab.
It needs to have the protocol (it is a URL, not just a hostname).
This changes postStart so it:

(1) Rewrites the JupyterLab configuration file so long as it has
    no content not generated by postStart. This includes the case
    where the file doesn't exist, so it still covers newly created
    containers. But it should also cover containers that have been
    renamed and then restarted. (And those that are restarted
    without having been renamed. But in that case, the new
    configuration is the same as the old.)

(2) Permits any port from 8000 to 9999, rather than only port 8888.
This doesn't name the container created from it, but it does serve
to document that the configuration is experimental.

(I might be moving the config back up directly into .devcontainer
soon, if it looks like it is nicer to use it -- even though it is
still experimental -- than the "Default Codespaces configuration".)
This puts the dev container config (devcontainer.json and the
associated scripts) back directly in .devcontainer, rather than in
a subdirectory.

Note that the dev container is still experimental. I just want to
make it easier to use, and harder to not use when one means to.
I forgot to change the paths in .gitattributes when moving files,
including scripts whose line endings must always be LF (never
CRLF), back from ~/.devcontainer/experimental to ~/.devcontainer.
The dev containers extension for VS Code on Windows needs this, or
the scripts fail to run.
When the dev container is not running in a codespace, such as when
running it locally, this produced an error.
This goes with the Git configuration change in bdd5789.
The chsh command in postCreate, and the VS Code configuration
(terminal.integrated.defaultProfile.linux provided by the fish dev
container feature), were together sufficient for most purposes.

But when using the container *not* in a GitHub Codespace (with the
VS Code dev containers extension), $SHELL is still /bin/bash. Maybe
this will help with that, and address the occasional situations
where bash runs when fish is expected, in non-codespace containers.
This changes the top-level readme and the detailed conda/mamba
instructions to recommend running "pip install -e ." instead of
"conda develop .".

This documentation change corresponds to c99cb6f.

See also 0b56e9e. I'm doing this now, rather than as a completely
separate change from the introduction of the dev container config,
because while the dev container config seems unaffected by this bug
when used in a GitHub codespace, it is affected when deployed
locally via the VS Code dev containers extension.

The changes to the poetry instructions are just copyediting. The
actual procedure using poetry is not affected. ("poetry install"
takes care of the corresponding step automatically.)
This follows up on 84dbe59 (the preceding commit) by changing
affected documentation other than our two main sources of conda
instructions (which were taken care of in that preceding commit).
In addition to making that change in notes/omissions.md, this
commit also applies some small copyediting changes in another file.

NOTE: None of the changes in this commit are related to the recent
documentation changes recommending "pip install -e ." instead of
"conda develop .".
We're using 0BSD, which imposes no restrictions, but it's still
best for stated copyright years to be correct. There are two
changes to make:

(1) Some new files this year have 2023 as their correctly stated
    copyright year, but this year has not been added to LICENSE.

(2) While we're at it, there is actually some material here from
    late 2021, so 2021 should be listed in LICENSE, and in files
    that list copyright years (i.e., those with the full 0BSD
    statement, such as Python modules).

To help find files to update, especially for case (2), I've written
a small script. This commit just adds the script; it does not
update any copyright years. The script is pretty limited and I
think it can be removed in the forthcoming commit that updates the
years (but it may be useful to have it in the commit history).
As detailed in 1392b94 (the previous commit), we're using 0BSD,
which imposes no restrictions, but it is still best for stated
copyright years to be fully accurate.

This adds 2023 to LICENSE, reflecting substantial additions this
year, which include the configuration and scripts in .devcontainer.

It also corrects an earlier omission of mine where Python modules
we made and committed in late 2021 (rather than 2022) didn't show
that year.

There are also notebooks from December 2021, but we're not showing
full notices with copyright years in notebooks, so those don't have
to be updated.

This commit also removes the script added in 1392b94 that I used to
help me find some of the relevant files.
This also updates some other documentation.
@EliahKagan EliahKagan merged commit c665c7e into main Feb 2, 2023
@EliahKagan EliahKagan deleted the ek/container branch February 2, 2023 05:05
EliahKagan added a commit that referenced this pull request Feb 13, 2023
Currently the repo is set to LF for all text files except batch
files, which applies to all platforms, so it's not necessary to
carry in core.autocrlf.

Since there are arguably some advantages of cloning most files with
native line endings (CRLF on Windows, LF on other systems), and the
original reason for departing from that behavior was the absence of
a technique, like the on here, to carry core.autocrlf in from the
host, that might be undone. If so, then the initialize shell script
will have to be modified correspondingly. Carrying in core.autocrlf
at this time was unintentional and a bug.

Probably the work so far on the ek/container branch should be
merged to main before doing line ending changes affecting how most
files in the repository are handled.

See #45 for context.
EliahKagan added a commit to EliahKagan/EmbeddingScratchwork that referenced this pull request Mar 18, 2023
This adds a pyproject.toml file. It is handwritten, but it declares
poetry as its backend tool. Yet we are not (as of now) using or
supporting Poetry, and that tool need not actually be installed.

The actual project does have dependencies (per environment.yml),
but this new handwritten pyproject.toml file does not list them.
Currently, that file only exists to make the project installable,
so that an editable install can be made using: "pip install -e ."

That command should, of course, only be run in the conda
environment. (If we later also support Poetry, then
"poetry install" could be used outside the conda environment, but
"pip install -e ." would still not be used.)

This makes it so that "import embed" works even when run from a
different directory. That's important because the current directory
in a Jupyter notebook varies by application, with VS Code having it
be the directory in which the notebook resides, even if that is not
the folder that VS Code has open. Now we can run "pip install -e ."
to make "import embed" statements in notebooks work again.

A possible alternative is running "conda develop ." (or
"mamba develop .") but this requires the conda-build package to be
installed. We shouldn't require that it be installed in the base
environment, because while that would be fine for the dev
container, it would impose a requirement to make a global change on
users who are running locally (without a dev container). But when
conda-build is installed in the project environment, a conda bug
causes test discovery to break in the VS Code test runner (beaker).
See the "On the test discovery bug" paragraph and its links, in
EliahKagan/palgoviz#45, for details.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant