tests for Python 3.10 & 3.11, bump version to 0.2.0, assorted WIP paper updates #89

Merged
merged 11 commits into from
Aug 7, 2023
15 changes: 12 additions & 3 deletions .github/workflows/ci-tests-jupyter.yml
@@ -50,11 +50,19 @@ jobs:
- python-version: 3.8
ipython-version: 7.3.0 # earliest version to support Python 3.8
- python-version: 3.8
ipython-version: latest
ipython-version: 8.12.2 # latest version to support Python 3.8
- python-version: 3.9
ipython-version: 7.15 # earliest version to support Python 3.9
- python-version: 3.9
ipython-version: latest
- python-version: '3.10'
ipython-version: 8.0 # earliest version to support Python 3.10
- python-version: '3.10'
ipython-version: latest
- python-version: 3.11
ipython-version: 8.8.0 # earliest version to support Python 3.11
- python-version: 3.11
ipython-version: latest
env:
HEAD_FORK: ${{ github.repository_owner }}
HEAD_SHA: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
@@ -140,9 +148,10 @@ jobs:
pip install "ipykernel==5.0.0" \
ipython-genutils \
requests \
"scipy<=1.7.3" \
fastdtw==0.3.4 \
tqdm==4.41.1
tqdm==4.41.1 \
"numpy<=1.23.5"
[[ "$PYTHON_VERSION" =~ ^3.11$ ]] && pip install scipy==1.11.1 || pip install "scipy<=1.7.3"
if [[ "$IPYTHON_VERSION" == "latest" ]]; then
pip install --upgrade IPython
else
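The conditional SciPy pin added in this hunk can be restated outside the shell for clarity. The sketch below is a hypothetical Python paraphrase of that one line, not part of the workflow; the function name `scipy_requirement` is an assumption made for illustration. It also shows why the unescaped `.` in the `^3.11$` pattern is looser than it looks: in both Bash extended regular expressions and Python's `re`, an unescaped dot matches any single character.

```python
import re


def scipy_requirement(python_version):
    """Return the SciPy pin the CI step would choose for a Python version string."""
    # Mirrors the shell logic: Python 3.11 gets scipy==1.11.1,
    # every other version keeps the older "scipy<=1.7.3" cap.
    if python_version == "3.11":
        return "scipy==1.11.1"
    return "scipy<=1.7.3"


# An unescaped dot matches any character, so "3x11" also satisfies the pattern.
loose_match = re.fullmatch(r"3.11", "3x11") is not None
# Escaping the dot restricts the match to the literal version string.
strict_match = re.fullmatch(r"3\.11", "3x11") is not None
```

Note that a shell chain of the form `A && B || C` also runs `C` when `B` fails, so an explicit `if`/`else` is the more predictable way to express this choice.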
78 changes: 43 additions & 35 deletions paper/main.bib
@@ -5,49 +5,55 @@ @misc{BickEtal07
title = {{v}irtualenv: Virtual {P}ython {E}nvironment builder},
year = {2007}}

@misc{Chatify,
author = {{Chatify~Team}},
@misc{MannEtal23b,
author = {J R Manning and H Manjunatha and K P K{\"{o}}rding},
doi = {10.5281/zenodo.8152315},
howpublished = {\url{https://github.com/ContextLab/chatify}},
month = {July},
title = {Chatify: {A} {J}upyter extension for adding LLM-driven chatbots to interactive notebooks},
year = {2023},
doi = {10.5281/zenodo.8152315}}
title = {Chatify: {A} {J}upyter extension for adding {LLM}-driven chatbots to interactive notebooks},
year = {2023}}

@misc{Neuromatch,
author = {{Neuromatch~Team}},
howpublished = {\url{https://compneuro.neuromatch.io/}},
@article{vanVEtal21,
author = {T van Veigen and A Akrami and K Bonnen and E DeWitt and A Hyafil and H Ledmyr and G W Lindsay and P Mineault and J D Murray and X Pitkow and A Puce and M Sedigh-Savestani and C Stringer and T Achakulvisut and E Alikarami and M Selim Atay and E Batty and J C Erlich and B V Galbraith and Y Guo and A L Juavinett and M R Krause and S Li and M Pachitariu and E Straley and D Valeriani and E Vaughan and M Vaziri-Pashkam and M L Waskom and G Blohm and K P K{\"{o}}rding and P Schrater and B Wyble and S Escola and M A K Peters},
doi = {10.1016/j.tics.2021.03.018},
journal = {Trends in Cognitive Sciences},
month = {July},
publisher = {Online},
title = {{Neuromatch Academy: A free, online, and hands-on computational neuroscience school}},
year = {2023}}
number = {7},
pages = {535--538},
title = {Neuromatch {A}cademy: Teaching computational neuroscience with global accessibility},
volume = {25},
year = {2021}}

@article{MannEtal23,
@article{MannEtal23a,
author = {J R Manning and E C Whitaker and P C Fitzpatrick and M R Lee and A M Frantz and B J Bollinger and D Romanova and C E Field and A C Heusser},
doi = {10.31234/osf.io/erzfp},
journal = {{PsyArXiv}},
pages = {doi.org/10.31234/osf.io/erzfp},
month = {January},
title = {Feature and order manipulations in a free recall task affect memory for current and future lists},
year = {2023}}

@article{OwenMann23,
author = {L L W Owen and J R Manning},
doi = {10.1101/2023.03.17.533152},
journal = {{bioRxiv}},
pages = {doi.org/10.1101/2023.03.17.53315},
month = {March},
title = {High-level cognition is supported by information-rich but compressible brain activity patterns},
year = {2023}}

@article{ZimaEtal23,
author = {K Ziman and M R Lee and A R Martinez and E D Adner and J R Manning},
doi = {10.31234/osf.io/2ps6e},
journal = {{PsyArXiv}},
title = {Category-based and location-based volitional covert attention affect memory at different timescales},
volume = {doi.org/10.31234/osf.io/2ps6e},
year = {2023}}

@article{PimeEtal19,
@inproceedings{PimeEtal19,
author = {J F Pimentel and L Murta and V Braganholo and J Freire},
journal = {{IEEE} International Conference on Mining Software Repositories},
booktitle = {2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR)},
doi = {10.1109/MSR.2019.00077},
organization = {{IEEE}},
pages = {507--517},
title = {A large-scale study about quality and reproducibility of {Jupyter} notebooks},
volume = {16},
year = {2019}}

@book{vanR95,
@@ -67,10 +73,9 @@ @misc{Pyth03
year = {2003}}

@misc{cond15,
author = {{Conda-forge~community}},
author = {{conda-forge community}},
doi = {10.5281/zenodo.4774217},
force = {True},
howpublished = {\url{https://doi.org/10.5281/zenodo.4774217}},
month = {July},
publisher = {Zenodo},
title = {{The conda-forge Project: Community-based Software Distribution Built on the conda Package Format and Ecosystem}},
@@ -131,7 +136,7 @@ @article{Gold74

@article{AltiEtal05,
author = {Y Altintas and C Brecher and M Weck and S Witt},
doi = {https://doi.org/10.1016/S0007-8506(07)60022-5},
doi = {10.1016/S0007-8506(07)60022-5},
journal = {{CIRP} Annals},
number = {2},
pages = {115--138},
@@ -154,16 +159,16 @@ @article{Merk14
month = {March},
number = {2},
pages = {2},
title = {Docker: lightweight linux containers for consistent development and deployment},
title = {Docker: {L}ightweight {L}inux containers for consistent development and deployment},
volume = {239},
year = {2014}}

@article{KurtEtal17,
author = {Gregory M Kurtzer and Vanessa Sochat and Michael W Bauer},
journal = {{PLoS} One},
doi = {10.1371/journal.pone.0177459},
journal = {{PLoS} {ONE}},
number = {5},
pages = {e0177459},
publisher = {Public Library of Science San Francisco, {CA} {USA}},
title = {Singularity: {S}cientific containers for mobility of compute},
volume = {12},
year = {2017}}
@@ -239,14 +244,13 @@ @article{HarrEtal20
@misc{AbadEtal15,
author = {Mart\'{i}n~Abadi and Ashish~Agarwal and Paul~Barham and Eugene~Brevdo and Zhifeng~Chen and Craig~Citro and Greg~S.~Corrado and Andy~Davis and Jeffrey~Dean and Matthieu~Devin and Sanjay~Ghemawat and Ian~Goodfellow and Andrew~Harp and Geoffrey~Irving and Michael~Isard and Yangqing Jia and Rafal~Jozefowicz and Lukasz~Kaiser and Manjunath~Kudlur and Josh~Levenberg and Dandelion~Man\'{e} and Rajat~Monga and Sherry~Moore and Derek~Murray and Chris~Olah and Mike~Schuster and Jonathon~Shlens and Benoit~Steiner and Ilya~Sutskever and Kunal~Talwar and Paul~Tucker and Vincent~Vanhoucke and Vijay~Vasudevan and Fernanda~Vi\'{e}gas and Oriol~Vinyals and Pete~Warden and Martin~Wattenberg and Martin~Wicke and Yuan~Yu and Xiaoqiang~Zheng},
force = {True},
note = {Software available from tensorflow.org},
title = {{TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems}},
url = {https://www.tensorflow.org/},
year = {2015}}

@article{McInEtal18b,
@article{McInEtal18,
author = {L McInnes and J Healy and N Saul and L Gro{\ss}berger},
doi = {https://doi.org/10.21105/joss.00861},
doi = {10.21105/joss.00861},
journal = {Journal of Open Source Software},
number = {29},
pages = {861},
@@ -276,9 +280,11 @@ @article{Wask21

@article{HeusEtal17,
author = {A C Heusser and P C Fitzpatrick and C E Field and K Ziman and J R Manning},
doi = {10.21105/joss.00424},
journal = {Journal of Open Source Software},
title = {Quail: a {Python} toolbox for analyzing and plotting free recall data},
volume = {10.21105/joss.00424},
number = {18},
title = {Quail: A {Python} toolbox for analyzing and plotting free recall data},
volume = {2},
year = {2017}}

@misc{FredEtal15,
Expand Down Expand Up @@ -306,7 +312,7 @@ @article{BleiEtal03
volume = {3},
year = {2003}}

@misc{Mann21d,
@misc{Mann21a,
author = {J R Manning},
doi = {10.5281/zenodo.5182775},
howpublished = {\url{https://github.com/ContextLab/storytelling-with-data}},
@@ -316,7 +322,7 @@ @misc{Mann21d
year = {2021}}

@misc{Mann22,
author = {Jeremy Manning},
author = {J R Manning},
doi = {10.5281/zenodo.6596762},
force = {True},
howpublished = {\url{https://github.com/ContextLab/experimental-psychology/tree/v1.0}},
@@ -325,7 +331,7 @@ @misc{Mann22
title = {{ContextLab/experimental-psychology: v1.0 (Spring, 2022)}},
year = {2022}}

@misc{Mann21e,
@misc{Mann21b,
author = {J R Manning},
doi = {10.5281/zenodo.7261831},
force = {True},
@@ -336,20 +342,22 @@ @misc{Mann21e

@article{GaoEtal20,
author = {Gao, Leo and Biderman, Stella and Black, Sid and Golding, Laurence and Hoppe, Travis and Foster, Charles and Phang, Jason and He, Horace and Thite, Anish and Nabeshima, Noa and Presser, Shawn and Leahy, Connor},
doi = {10.48550/arXiv.2101.00027},
force = {True},
journal = {{arXiv} preprint ar{X}iv:2101.00027},
journal = {{arXiv} preprint},
title = {{The Pile: An 800GB Dataset of Diverse Text for Language Modeling}},
year = {2020}}

@misc{BlacEtal21,
author = {Black, Sid and Gao, Leo and Wang, Phil and Leahy, Connor and Biderman, Stella},
doi = {10.5281/zenodo.5297715},
force = {True},
howpublished = {\url{http://github.com/eleutherai/gpt-neo}},
month = {March},
title = {{GPT-Neo: Large Scale Autoregressive Language Modeling with Mesh-Tensorflow}},
version = {1.0},
year = {2021}}

@article{HeusEtal18a,
@article{HeusEtal18,
author = {A C Heusser and K Ziman and L L W Owen and J R Manning},
journal = {Journal of Machine Learning Research},
number = {152},
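Several hunks in this file move DOIs out of `pages` and `volume` fields and strip the `https://doi.org/` prefix from `doi` values. A minimal sketch of that normalization is given below, assuming a hypothetical helper named `normalize_doi`; it is an illustration of the cleanup pattern, not a tool used in this pull request.

```python
import re

# Matches an optional scheme and the doi.org (or dx.doi.org) resolver host.
_DOI_PREFIX = re.compile(r"^(?:https?://)?(?:dx\.)?doi\.org/", re.IGNORECASE)


def normalize_doi(value):
    """Return a bare DOI by removing any resolver URL prefix and outer whitespace."""
    return _DOI_PREFIX.sub("", value.strip())
```

For example, `normalize_doi("https://doi.org/10.21105/joss.00861")` yields the bare identifier that the updated `doi` field now holds, while a value that is already bare passes through unchanged.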
Binary file modified paper/main.pdf
Binary file not shown.
46 changes: 15 additions & 31 deletions paper/main.tex
@@ -309,7 +309,7 @@ \subsubsection{The \texttt{smuggle} statement}\label{subsec:smuggle}
missing dependencies of those packages) into a notebook-specific, virtual
environment-like directory called a ``project'' (see
Sec.~\ref{subsec:projects}). In turn, \texttt{smuggle} statements executed in a
particular notebook will preferentially load packages from the notebook's
particular notebook will preferentially load packages from that notebook's
project directory whenever they are available, rather than searching for them
in the user's main Python environment. In this way, \texttt{smuggle}
statements can be substituted for \texttt{import} statements to automatically
@@ -387,32 +387,23 @@ \subsubsection{The onion comment}\label{subsec:onion}

\subsubsection{Projects}\label{subsec:projects}

Standard approaches to installing new packages from within a notebook can alter the local Python environment in potentially unexpected and undesired ways. For example, running a notebook that installs its dependencies via IPython system shell commands (prefixed with ``\texttt{!}'') or magic commands (prefixed with ``\texttt{\%}'') may cause other existing packages in the user's environment to be uninstalled and replaced with alternate versions. This can lead to incompatibilities between packages, affect the behavior of other scripts or notebooks, or even interfere with system applications.
Standard approaches to installing packages from within a notebook can alter the local Python environment in potentially unexpected and undesired ways. For example, running a notebook that installs its dependencies via system shell commands (prefixed with ``\texttt{!}'') or IPython magic commands (prefixed with ``\texttt{\%}'') may cause other existing packages in the user's environment to be uninstalled and replaced with alternate versions. This can lead to incompatibilities between installed packages, affect the behavior of the user's other scripts or notebooks, or even interfere with system applications.

To prevent Davos-enhanced notebooks from having unwanted side-effects on the user's environment, Davos isolates packages installed via \texttt{smuggle} statements using a custom scheme called ``projects.'' Davos projects function similarly to simplified versions of standard Python virtual environments (e.g., created with the standard library's \texttt{venv} module or a third-party tool like \texttt{virtualenv}~\cite{BickEtal07}) with a few differences: they do not need to be manually activated and deactivated, they do not contain separate Python or \texttt{pip} executables, and they \textit{extend} the main Python environment rather than replace it.
To prevent Davos-enhanced notebooks from having unwanted side-effects on the user's environment, Davos automatically isolates packages installed via \texttt{smuggle} statements using a custom scheme called ``projects.'' Functionally, a Davos project is similar to a standard Python virtual environment (e.g., created with the standard library's \texttt{venv} module or a third-party tool like \texttt{virtualenv}~\cite{BickEtal07}): it consists of a directory (within a hidden \texttt{.davos} folder in the user's home directory) that houses third-party packages needed for a particular project or task. However, Davos projects do not need to be manually activated and deactivated, do not contain separate Python or \texttt{pip} executables, and \textit{extend} the user's main Python environment rather than replace it.

When Davos is imported into a notebook, a notebook-specific project directory is automatically created (if it does not exist already).
%When Davos is imported into a notebook, a notebook-specific project directory is automatically created (if it does not exist already), named for the absolute path to the notebook file.


% Davos implements a custom scheme for isolating packages installed by \texttt{smuggle} statements, called ``projects.''
Notebook-specific projects are named for the absolute path to the notebook file.
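The idea of a project directory that extends, rather than replaces, the main environment can be sketched in a few lines of Python. This is not Davos's implementation: the function names `project_dir_for` and `extend_search_path` and the rule for flattening the notebook path into a directory name are assumptions made purely for illustration. Only the `.davos` folder in the home directory and the naming by absolute notebook path come from the text above.

```python
import sys
from pathlib import Path


def project_dir_for(notebook_path, root=None):
    """Map a notebook file to a project directory named for its absolute path.

    The "___" separator is a hypothetical flattening rule, not Davos's own.
    """
    root = Path(root) if root is not None else Path.home() / ".davos"
    absolute = Path(notebook_path).resolve()
    # Drop the filesystem anchor ("/" or a drive) and join the remaining parts.
    name = "___".join(absolute.parts[1:])
    return root / name


def extend_search_path(project_dir):
    """Put the project directory first on sys.path.

    Packages found there take precedence, and everything else still
    resolves from the main environment further down the path.
    """
    entry = str(project_dir)
    if entry in sys.path:
        sys.path.remove(entry)
    sys.path.insert(0, entry)
```

Because the main environment's entries stay on `sys.path`, nothing needs to be activated or deactivated, which matches the contrast with `venv` drawn in the paragraph above.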

%A Davos project consists of a directory
%
%
% function similarly to simplified versions of
%
%
%
%Additionally, Davos projects \textit{extend}, rather than replace, the main Python environment.
%
%
%Davos projects function similarly to a simplified version of a virtual environment created with Python's built-in \texttt{venv} module, or a third-party tool like

%Davos projects function similarly to simplified versions of standard Python virtual environments (e.g., created with the standard library's \texttt{venv} module or a third-party tool like \texttt{virtualenv}~\cite{BickEtal07}) with a few differences: they do not need to be manually activated and deactivated, they do not contain separate Python or \texttt{pip} executables, and they \textit{extend} the main Python environment rather than replace it.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% ADD THIS EITHER TO START OF PROJECTS SUBSUBSECTION OR IN IMPACT SECTION %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Standard approaches to installing new packages from within a notebook can alter a user's Python environment in potentially unexpected and undesired ways. For example, running a notebook that installs its dependencies using IPython system shell commands (prefixed with ``\texttt{!}'') or magic commands (prefixed with ``\texttt{\%}'') may cause other existing packages in the user's environment to be uninstalled and replaced with alternate versions. This can lead to conflicts between package versions, affect the behavior of the user's other scripts or notebooks, or even interfere with system applications.
%
%A common way of avoiding this is to create a virtual environment in which to run the notebook, and instruct anyone with whom the notebook is shared to do the same. While effective, this added requirement introduces additional
%
%
@@ -425,13 +416,6 @@ \subsubsection{Projects}\label{subsec:projects}
% \item{davos's solution is projects}
%\end{itemize}

%Standard approaches to installing new packages from within a notebook (e.g., system shell commands prefixed with ``\texttt{!}'' or IPython magic commands prefixed with ``\texttt{\%}'') can alter the local Python environment in potentially unexpected and undesired ways. For example, running a notebook that installs its dependencies in the user's main environment may cause other existing packages to be uninstalled and replaced with alternate versions. This can lead to conflicts between package versions, affect the behavior of the user's other scripts or notebooks, or even break system applications.

%To prevent Davos-enhanced notebooks from having unwanted side-effects on the user's environment, Davos implements its own virtual environment-like scheme for isolating packages it installs.




\bigskip\bigskip\textcolor{red}{\textbf{========== TODO: finish editing from here to end ==========}}\bigskip\bigskip

%Because Davos can install new packages, running the code in a
@@ -755,7 +739,7 @@ \section{Illustrative Example}\label{sec:illustrative-example}
these options enabled, lines 18--19 \texttt{smuggle}
\texttt{TensorFlow}~\cite{AbadEtal15}, a powerful end-to-end platform
for building and working with machine learning models, and
\texttt{UMAP}~\cite{McInEtal18b}, a package that implements a family
\texttt{UMAP}~\cite{McInEtal18}, a package that implements a family
of related manifold learning techniques. The onion comment in line 19
also specifies that \texttt{UMAP} should be installed with the
optional requirements needed for its ``plot'' and ``parametric\_umap''
@@ -904,19 +888,19 @@ \section{Impact}

Since its initial release, Davos has found use in a variety of applications. In
addition to managing computing environments for multiple prior and ongoing
research studies~\citep{MannEtal23, OwenMann23, ZimaEtal23}, Davos is being
research studies~\citep{MannEtal23a, OwenMann23, ZimaEtal23}, Davos is being
used by both students and instructors in programming and methods courses such
as Storytelling with Data~\cite{Mann21d} (an open course on data science,
as Storytelling with Data~\cite{Mann21a} (an open course on data science,
visualization, and communication) and Laboratory in Psychological
Science~\cite{Mann22} (an open course on experimental and statistical methods
for psychology research) to simplify distributing lessons and submitting
assignments, as well as in online demos such as
\texttt{abstract2paper}~\cite{Mann21e} (an example application of
\texttt{abstract2paper}~\cite{Mann21b} (an example application of
GPT-Neo~\cite{GaoEtal20, BlacEtal21}) to share ready-to-run code that installs
dependencies automatically. The 2023 offering of Neuromatch
Academy~\cite{Neuromatch} also included an ``experimental'' module that uses
Academy~\cite{vanVEtal21} also included an ``experimental'' module that uses
Davos to manage dependencies related to a large language model-based
tutor~\cite{Chatify}.
tutor~\cite{MannEtal23b}.

Our work also has several more subtle ``advanced'' use cases and potential
impacts. Whereas Python's built-in \texttt{import} statement is agnostic to
@@ -940,7 +924,7 @@ \section{Impact}
IPython notebooks' internal code parsing and execution machinery. We
note that, while other popular packages similarly use these mechanisms
to provide notebook-specific functionality (e.g.,
\cite{Hunt07,HeusEtal18a}), this approach also has the potential to be
\cite{Hunt07,HeusEtal18}), this approach also has the potential to be
exploited for more nefarious purposes. For example, a malicious user
could design a Python package that, when imported, substantially
changes the notebook's functionality by adding new \textit{unexpected}
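This pull request renames several BibTeX keys in `paper/main.bib` (for example, `McInEtal18b` to `McInEtal18`) and updates the matching `\cite` calls in `paper/main.tex`. A rough consistency check for that kind of rename is sketched below. The function names are hypothetical and the regular expressions are deliberately simple: they do not skip commented-out LaTeX lines or handle every citation command, so treat this as a starting point rather than a complete checker.

```python
import re

# "@article{Key," or "@misc{Key," at the start of a BibTeX entry.
BIB_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")
# \cite{...}, \citep{...}, \citet{...}, with optional [..] arguments.
CITE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")


def bib_keys(bib_text):
    """Collect every entry key defined in the BibTeX source."""
    return set(BIB_KEY.findall(bib_text))


def cited_keys(tex_text):
    """Collect every key referenced by a citation command in the LaTeX source."""
    keys = set()
    for group in CITE.findall(tex_text):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def missing_keys(bib_text, tex_text):
    """Return the cited keys that have no matching BibTeX entry, sorted."""
    return sorted(cited_keys(tex_text) - bib_keys(bib_text))
```

Running `missing_keys` on the contents of the two files after a rename would list any citation that still points at an old key.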