Embed Python #537

Closed
casey opened this issue Nov 14, 2019 · 38 comments

@casey
Owner

casey commented Nov 14, 2019

@runeimp and I discussed this a bit in #531, but I thought that it would be good to give it its own issue.

In that thread, we discussed embedding a scripting language to give people an alternative to shell.

I think Python would be the best choice, since it's widely known and very nice for scripting tasks. Unfortunately, Python is hard to embed. That might change with RustPython though.

@casey
Owner Author

casey commented Nov 14, 2019

@runeimp, continuing the discussion from #531: I think implementing our own interpreter is probably out of the question, but there's actually a Rust implementation of Python 3, and that might be easy to embed one day.

It would probably increase the size of the binary by a few megs, but it could be optional, so people could turn it off if they wanted. (Although honestly, in an age of 100+ meg electron apps, I don't think anyone would notice :P)

Can you give me some examples of how you would use embedded scripting?

@runeimp

runeimp commented Nov 16, 2019

RustPython looks cool. Installed it and immediately ran into a problem. But it's still in development so not too surprising. I also may have done something dumb in a rush to play with it. 👼

Examples of scripting I use in Justfiles:

  • Flow control: if then else. I'd also like switch statements but it's the one major failing of Python. Switch statements in Go and Ruby are sooo sweet!
  • I/O: File and path creation, file update and deletion. Temp directory generation.
  • Strings: Creation, modification, and deletion. Formatted strings for printing.
  • Date & Time: ISO 8601 DateTime creation for generating parts of file names and logging.
  • Terminal Control: I like to clear the screen buffer before running the content of most recipes. Also renaming the screen.

That's about 95% of what I do scripting wise. I typically do it in Bash as that's what's common for my terminal environment most of the time. Though for Windows I'll use CMD.EXE if Git Bash is not installed. Being able to do it all in Python without having to install Python would be awesome! 😃
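As a rough illustration of the tasks listed above, here is a minimal Python sketch using only the standard library. The function names (`timestamped_name`, `clear_screen`, `write_log`) and the `build` prefix are hypothetical, chosen for this example; nothing here reflects how just would expose an embedded interpreter.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def timestamped_name(prefix: str, suffix: str, now: datetime) -> str:
    """Build a file name containing a compact ISO 8601 UTC timestamp."""
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}-{stamp}{suffix}"


def clear_screen() -> str:
    """Return the ANSI sequence that clears the screen and homes the cursor."""
    return "\033[2J\033[H"


def write_log(directory: Path, message: str, now: datetime) -> Path:
    """Create a timestamped log file in `directory` and return its path."""
    path = directory / timestamped_name("build", ".log", now)
    path.write_text(f"{now.isoformat()} {message}\n")
    return path


if __name__ == "__main__":
    print(clear_screen(), end="")
    # Temp directory generation, file creation, and simple flow control.
    with tempfile.TemporaryDirectory() as tmp:
        log = write_log(Path(tmp), "recipe started", datetime.now(timezone.utc))
        if log.exists():
            print(f"wrote {log.name}")
        else:
            print("log file missing")
```

The ANSI escape sequence assumes a terminal that understands it, which is not guaranteed in older Windows consoles.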

@casey
Owner Author

casey commented Nov 16, 2019

Gotcha, that all makes sense and sounds reasonable. I suppose that the feature would take the form of a recipe annotation indicating to Just that the recipe should be evaluated with a built-in python3 interpreter:

[python]
foo:
  print("Hello from python!")

@casey casey added this to the eventually milestone Nov 21, 2019
@dionjwa

dionjwa commented Dec 19, 2019

I'm against this. I really like that the just binary is small, and python is trivial to add to the host (either you own the host, or you're in a docker container). With the addition of python, just becomes very opinionated.

@runeimp

runeimp commented Dec 19, 2019

@dionjwa the binary is already about 2.6MB. It's hardly a lightweight. The addition of a meg or so shouldn't really affect the choice to utilize it. It needs a bit of scripting ability for flow control and the like. Relying on the presence of sh is not a good cross-platform solution. Relying on anything outside of itself makes it less portable and prone to deployment issues. It needs to have something built in that is cross-platform and ideally has a similar flow (whitespace-delimited blocks). Python ticks all those boxes. And it doesn't preclude using sh or anything else if you're dead set against Python for some reason. So how is it very opinionated just by having the availability of Python included?

@dionjwa

dionjwa commented Dec 20, 2019

I agree that sh/bash is actually a pretty terrible scripting solution compared to e.g. python. If it's only a meg or so (I wasn't aware it would be that size) then that criticism of mine is not valid. Would there be issues if you required a specific version of python? Like if the embedded version was different to the host? If so, then I would be supportive, because scripting complex commands in sh/bash is somehow always painful, and often hard to read later.

@runeimp

runeimp commented Dec 20, 2019

Yeah, I expected it would be bigger too, but in my earlier discussions on the topic with @casey he'd already looked into it a bit and expects a megabyte or so of additional size. As I understand it (I'm not on the dev team, nor am I a Rust programmer), the single just binary would have an embedded version completely independent of any version installed on the system. Just already has the ability to specify what interpreter to use. The default is currently (will always be?) sh, but the ability to specify the interpreter for an individual recipe has been present since the beginning via a shebang as the first line of the recipe. At this point the ability to specify the default interpreter for the entire Justfile is also in place. I don't expect either of those options to disappear soon, if ever. So you could potentially do something evil like set the Justfile to use python2.7 when there is (well, eventually will be) a perfectly good built-in version of python3.

I just hope it's at least Python 3.6 so I can use f-strings. I love python f-strings. One of the few things I really miss in every other language I use.

@casey casey removed this from the eventually milestone Jul 2, 2020
@runeimp

runeimp commented Aug 2, 2022

Hey @casey might Starlark for Rust be an option? It's modeled on Python 3, is the primary language for Bazel, and is thus geared more towards configuration and such. Just came across the Rust version today.

@dionjwa

dionjwa commented Aug 2, 2022

An alternative to python would be deno (typescript)

Pros:

  • probably a lot easier to embed, deno is built in rust and designed for this scenario
  • the import story with deno is amazing and fits wonderfully with just
    • you just import URLs so that sharing code and versioning is simple, easy, and works across repositories
    • importing shared python libraries is a huge PITA
  • it's great for small tasks that include web servers, and interacting with the web
  • its async model is better than Python's
  • TypeScript's typing is so much better than the mess that is Python's typing
  • I would argue its use cases and story are more aligned with just than Python's

Cons:

  • it's not python and that has a similar user base

I tried using python with just but it just didn't seem to fit, mostly the import part, whereas deno has been seamless and now I can share and version all the scripts I use with just.

@casey
Owner Author

casey commented Aug 3, 2022

@runeimp I think the disadvantage of Starlark is that it's pretty similar to Python, but just different enough that it will confuse people.

@dionjwa Deno sounds like a good option. As much as I dislike TypeScript, importing URLs sounds pretty nice.

@dbohdan

dbohdan commented May 25, 2024

I like the idea of (optionally) embedding RustPython.

One instance of prior art: Task embeds a cross-platform shell interpreter written in Go. I have not found a POSIX-compatible shell in Rust that is similarly mature, or one that builds on Windows.

@laniakea64
Contributor

As nice as an embedded scripting language could be for performance, isn't it likely to make backwards-compatibility a minefield?

If just uses a third-party crate for embedded scripting support, then just's backwards-compatibility guarantee becomes dependent on either

  • the third-party crate having a similar backwards-compat guarantee, or
  • a particular version/series of the third-party crate both continuing to compile with newer Rust versions and not having security vulnerabilities.

In the case of Python specifically, I have a lot of custom Python scripts and some of them have broken from one minor Python 3.x version to the next. So just would have to stick with a specific 3.x minor version of Python in order to maintain backwards compatibility.
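One well-known instance of this kind of minor-version breakage, offered purely as an illustration and not as the breakage the scripts above actually hit: the ABC aliases in `collections` (such as `collections.Mapping`) were removed in Python 3.10, so code that relied on them stopped working on upgrade. The helper name `legacy_alias_available` is hypothetical.

```python
import collections
import sys
import warnings
from collections.abc import Mapping  # stable location since Python 3.3


def legacy_alias_available() -> bool:
    """Report whether the old `collections.Mapping` alias still exists."""
    with warnings.catch_warnings():
        # Some older 3.x versions emit a DeprecationWarning on this lookup.
        warnings.simplefilter("ignore")
        return hasattr(collections, "Mapping")


if __name__ == "__main__":
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Python {version}: collections.Mapping available = {legacy_alias_available()}")
    print(f"collections.abc.Mapping works: {isinstance({}, Mapping)}")
```

A script written against `collections.abc` runs on both sides of the 3.10 boundary, whereas one using the old alias does not, which is the sort of drift an embedded interpreter would have to either freeze or accept.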

Theoretically maybe just could avoid some of these concerns by implementing its own interpreter, but is that as much too involved as it sounds?

It would be nice if there is a solution that would make all these concerns a non-issue, but can't think of what that might be ☹️

@dbohdan

dbohdan commented May 26, 2024

You are right that Python version compatibility is a potential problem. In my opinion, if just embeds Python, it should not include it in the semver public interface. Anything else would entail forking Python, which seems like an unsustainable amount of work, and would also lead to users asking, "Where is feature X? I used it two Python releases ago."

This is something where POSIX shell has an advantage. It is not going to break compatibility. Lua 5.1 and JavaScript have it, too. While 5.x releases of Lua make breaking changes, Lua 5.1 is going to be supported indefinitely thanks to LuaJIT. The situation is similar with Luau, the gradually-typed fork of Lua 5.1, because it is the scripting language of Roblox.

My own preference goes, Python > POSIX-compatible shell > Lua > JavaScript. When it comes to JavaScript, note that Deno is planning a 2.0 release. Among other things, Deno 2.0 is going to remove Deno.run. This change would affect just if it embedded Deno.

An option I have thought of is to embed a Wasm VM and let the user choose their own programming language runtime. With Wasmer, you could leverage their packages, including the versioning. This means that example 1 could become something like example 2.

Example 1:

# This works right now.
# It requires Wasmer to be installed as a separate binary.
test:
  #! /usr/bin/env -S wasmer run python/python@0.1.0 --mapdir /tmp:/tmp --mapdir .:.
  from pathlib import Path

  test_file = Path("test.txt")
  test_file.write_text("Hello from Python!\n")
  print(test_file.read_text(), end="")

Example 2:

# A theoretical example of how things could work
# with Wasmer embedded in just.
set script-wasmer-package := "python/python@0.1.0"
# Access to the temporary directory just creates is granted implicitly.
set script-wasmer-dirs := [".:."]
# set script-wasmer-dirs := [".", "."] # ?

[script]
test:
  from pathlib import Path

  test_file = Path("test.txt")
  test_file.write_text("Hello from Python!\n")
  print(test_file.read_text(), end="")

@runeimp

runeimp commented May 29, 2024

@laniakea64 I don't see the potential for a backwards compatibility problem. If just ever embeds a Python interpreter, or any other scripting language interpreter, you're good. The inclusion of a scripting language does not itself mean that the language or its interpreter would ever need to be updated. I personally couldn't care less if the interpreter is ever updated, as long as it exists and my Justfiles that use said language continue to work. It would, in fact, be beneficial for the interpreter never to be updated except to fix known bugs. The feature wouldn't be to promote the language itself. It would simply be to have a cross-platform scripting solution.

Any special features added via a language/interpreter update could be locked behind a specific edition.

@runeimp

runeimp commented May 29, 2024

@dbohdan I'd love to see Lua used but the discussion has already happened and Python was the preference. See #531 and others.

Having wasmer or anything that has its own package manager just to get scripting to do anything seems like hoops creating hoops to jump through. Maybe I'm "not seeing the forest for the trees" or something, but if I were interested in such a solution I'd just use the code as-is in your example. Even then I'm unlikely to bother, as it would require installing wasmer and the desired modules on every system that I'd use this with. Even if wasmer were built in, I'd have to make sure the modules installed successfully on every target system.

In my mind, the point is to have much better scripting that is cross-platform and I only have to update the one executable (just) as needed.

@dbohdan

dbohdan commented May 29, 2024

It would, in fact, be beneficial for the interpreter never to be updated except to fix known bugs.

This would effectively mean forking RustPython and maintaining the fork. According to tokei, just has 17k lines of Rust code in src/, while RustPython has 70k only in vm/. It would be a lot of extra code to maintain. As just's fork aged, the gap between it and current Python would widen. Users would increasingly complain about not being able to write Python the way they were used to.

I think a never-upgrade approach isn't viable because it amounts to forking. I would prefer either tracking stable RustPython without it being part of just's commitment to backward compatibility or some way to choose the Python version.

Having wasmer or anything that has its own package manager just to get scripting to do anything seems like hoops creating hoops to jump through.

The idea is to let the user specify a fixed version of Python (or another interpreter) to run their scripts with. This would give you both stability (old versions don't change) and upgrades (new versions become available).

Even if wasmer was built-in I'd have to make sure the modules installed successfully on every target system.

Wasmer automatically downloads packages before running them. I am not sure how reliable it is. If it is reliable, you would only need to install things manually on a machine that wasn't connected to the Internet.

In my mind, the point is to have much better scripting that is cross-platform and I only have to update the one executable (just) as needed.

This would be the main reason to embed Wasmer. That being said, I am not advocating for Wasmer over tracking RustPython. I am suggesting it as a more speculative high-risk, high-reward option.

@runeimp

runeimp commented May 29, 2024

@dbohdan one of the many problems of adding anything to just is that it is often used in CI/CD systems. So features can be added but they shouldn't change in a way that could break one of these systems. And in my mind adding something as useful as built-in Python would end up getting used heavily by people using these systems. Once in the tool, the feature's interface and essential functionality should probably never change. Features could be added but initial features of the interpreter should remain essentially the same.

I'm not the project author so that would be up to @casey on how he wants to handle all of that. But as we've been discussing this feature for several years now I'm guessing that is going to be close to how he sees it. High-risk, high-reward is rarely done these days. When it started out that may have been an option. But lots of users depend on the stability of just so high-risk is typically not an option. But maybe locked behind an edition? #1201

@casey there was a time where the Rust concept of Editions was going to be added to just but I don't see it noted in the docs anymore. Maybe never implemented?

@dbohdan

dbohdan commented May 29, 2024

@runeimp

one of the many problems of adding anything to just is that it is often used in CI/CD systems. So features can be added but they shouldn't change in a way that could break one of these systems. And in my mind adding something as useful as built-in Python would end up getting used heavily by people using these systems.

Right, I agree this is a concern. How do you see this being implemented in just?

Edit: What I want to ask about is how you see it happening on a technical level. Do you consider forking RustPython an acceptable solution (the correct solution)? Are you thinking of adding a certain version of RustPython as a dependency and never going past it (without a fork)?

Features could be added but initial features of the interpreter should remain essentially the same.

I am not sure you meant this, but if staying essentially the same is the goal (i.e., minor breaks in compatibility are allowed), this is pretty much how Python is developed.

@starthal

starthal commented Jun 6, 2024

Just since it hasn't been mentioned: is Nushell under consideration here?
Disadvantages:

  • Not widely used/known
  • Not 1.0 yet (implying its use would have to be --unstable until then)

Advantages:

  • Designed as a shell/scripting language first, with sugar like non-quoted strings
  • Easier to embed in the just binary
  • Possible to synchronize with the just version as needed for stability
  • Its approach to linewise execution vs script execution maps closely to just's (AFAICT)

@runeimp

runeimp commented Jun 19, 2024

@starthal I think Nu is a great option and it was suggested early on in the discussion. But it suffers from its current lack of popularity. Python, on the other hand, is one of the most popular languages, if not the most popular, across several domains. So chances are high that anyone who would benefit from the tool is already familiar with Python.

@alluring-mushroom

I understand the desire to use a language that is well known, but I want to make the claim that Python doesn't succeed at that:

Most Justfile usage calls out to external programs. Python makes this very verbose and difficult, and I believe this is not a common task for most Python programmers, meaning they will need to check the documentation to perform this.

This code:

subprocess.run(["git", "lfs", "lock", path])

is equivalent to this:

git lfs lock "$path"

Definitely the quoting isn't great in the Bash version, but it also isn't fantastic in the Python version.
When conditionals are used, the case for a normal shell obviously falls, but I would argue these are less common.
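To make the verbosity point concrete, here is a hedged sketch of the kind of thin wrapper people tend to write around `subprocess.run` to get closer to the shell form. The names `build_argv` and `run_cmd` are hypothetical helpers invented for this example and are not part of just or of any library discussed here.

```python
import shlex
import subprocess
import sys


def build_argv(command: str, *args: str) -> list:
    """Split `command` shell-style and append `args` as single arguments.

    Each item in `args` becomes exactly one argv entry, so a path that
    contains spaces needs no quoting.
    """
    return shlex.split(command) + list(args)


def run_cmd(command: str, *args: str) -> subprocess.CompletedProcess:
    """Run the command without a shell; raise if it exits non-zero."""
    return subprocess.run(build_argv(command, *args), check=True)


if __name__ == "__main__":
    # The shell line `git lfs lock "$path"` would become:
    print(build_argv("git lfs lock", "assets/big file.bin"))
    # Runnable demonstration that does not depend on git being installed:
    run_cmd(shlex.quote(sys.executable), "-c", "print('ran a child process')")
```

Even with a helper like this, every recipe would need the import and the wrapper, which is the overhead the comment above is pointing at.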

But what I'm really arguing here is not to go with bash, but to go with any language where the shell is first class. This includes nushell, oilshell and also Xonsh. If you want the familiarity of Python loops and conditionals but still want a first-class shell, Xonsh would be it, but I don't know how well it would embed into something like Just.

All of these languages have a better story around quoting, conditionals and loops.

Conditionals are likely the next most common action, and I think basically all newer shells fix this, and all modern shells look similar. Similar argument for loops.

Making these scripts is going to require everyone to learn something new, whether it be subprocess calling notation, or the exact syntax for an if statement. But the majority will simply want to string a series of commands together, and this should be the easiest action in the chosen language. People will still be able to choose an external language to run a script in.

@lucabello

My two cents on this issue, specifically on the Python version compatibility.

uv is extremely fast at managing different Python environments, and it's written in Rust as well. Compared to the wasmer suggestion, you don't need a separate place to specify dependencies; you can do so in the recipe itself (see here).

TL;DR: you can specify python version and dependencies in the recipe itself.

Here are some examples:

[python]
hello:
  print("Hello from Python!")

[python]
goodbye:
  # /// script
  # requires-python = ">=3.11"
  # dependencies=["sh"]
  # ///
  import sh
  sh.echo("Goodbye from Python!")

All just would need to do is convert the [python] attribute to the following shebang (respectively):

#!/usr/bin/env -S uv run --script
#!/usr/bin/env -S uv run --python='>=3.11' --script

I recognize this means depending on uv, but maybe there is a good way to connect the two things?
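For readers curious what the `# /// script` block contains from a tool's point of view, the following sketch extracts it using the regular expression from the PEP 723 reference implementation. The function names `script_metadata` and `uv_argv` are hypothetical; how just would actually wire this up is not specified in this thread, and the sketch returns the raw TOML text rather than parsing it.

```python
import re
from typing import Optional

# Regular expression taken from the PEP 723 reference implementation.
BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

GOODBYE = '''# /// script
# requires-python = ">=3.11"
# dependencies=["sh"]
# ///
import sh
sh.echo("Goodbye from Python!")
'''


def script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the `script` block, or None if absent."""
    for match in BLOCK.finditer(source):
        if match.group("type") == "script":
            lines = match.group("content").splitlines(keepends=True)
            # Drop the leading "# " (or a bare "#") from every line.
            return "".join(l[2:] if l.startswith("# ") else l[1:] for l in lines)
    return None


def uv_argv(script_path: str) -> list:
    """Command line matching the `uv run --script` shebang shown above."""
    return ["uv", "run", "--script", script_path]


if __name__ == "__main__":
    print(script_metadata(GOODBYE), end="")
    print(uv_argv("goodbye.py"))
```

The sketch assumes the recipe body has already been dedented before it is inspected, since the metadata comment lines must start at the beginning of a line to match.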

@casey
Owner Author

casey commented Dec 12, 2024

@lucabello That's a really good suggestion! I think with Python packaging being the way it is, this might be the best option. Python packaging and versioning is such a mess that, even if just were to embed a pure-rust Python implementation, it would leave a lot of other issues to be solved. And since uv is portable and written in Rust, installing it is probably about as easy as installing just.

Also, this works very nicely with the [script] attribute and script-interpreter setting:

set unstable

set script-interpreter := ['uv', 'run', '--script']

[script]
hello:
  print("Hello from Python!")

[script]
goodbye:
  # /// script
  # requires-python = ">=3.11"
  # dependencies=["sh"]
  # ///
  import sh
  sh.echo("this doesn't print anything")
  print("this does print something")

However, the sh.echo line doesn't print anything, but the print line below does. Do you know what's going wrong?

@starthal

@casey by default, sh.foo() returns foo's stdout as a string. Use print(sh.foo()) to see it directly.
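The capture-versus-print distinction can be shown with the standard library alone. This is an analogy built on `subprocess`, not the `sh` package's implementation, and the helper name `capture` is made up for the example.

```python
import subprocess
import sys


def capture(argv: list) -> str:
    """Run a child process and return its stdout instead of displaying it."""
    done = subprocess.run(argv, check=True, capture_output=True, text=True)
    return done.stdout


if __name__ == "__main__":
    child = [sys.executable, "-c", "print('hello')"]
    text = capture(child)   # nothing appears on the terminal at this point
    print(text, end="")     # the captured output becomes visible only now
```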

@casey
Owner Author

casey commented Dec 12, 2024

@starthal Okay nice.

Yah, I think maybe this is what we should be recommending. I think when I first opened this issue, I was hopeful we would see a feature-complete Rust implementation of Python, but that doesn't seem to be materializing.

Suggesting uv with the [script] attribute seems like it's actually a really great option, at the expense of one additional dependency.

@casey
Owner Author

casey commented Dec 12, 2024

Added a little documentation in #2526. I think I'm probably inclined to close this issue now, just because I think embedding python would wind up being a huge headache, and would require a lot of supporting features to manage the installation, environment, etc, which uv does out of the box.

@starthal

Maybe out-of-scope for this issue, but: in the hypothetical future where just embeds a scripting language, do you have an annotation syntax in mind to distinguish "use embedded interpreter X" from [script]'s "invoke this binary"?

I guess at worst this would result in a new annotation, so probably not a major concern.

@casey
Owner Author

casey commented Dec 12, 2024

@starthal I haven't thought too much about it, but yeah, probably a different annotation, or something like set script-interpreter := builtin.

@runeimp

runeimp commented Dec 13, 2024

@casey by default, sh.foo() returns foo's stdout as a string. Use print(sh.foo()) to see it directly.

I'm down voting this just because sh.echo() absolutely should push to stdout. That's just lame. Other than that seems like a pretty comprehensive setup. 👍

@casey
Owner Author

casey commented Dec 13, 2024

Yah I think this is a pretty comprehensive solution, so I'll go ahead and close this. God bless the uv developers.

@casey casey closed this as completed Dec 13, 2024
@macintacos

Once [script] becomes a stable feature, could the above solution about using uv + requirements / dependencies inlined be provided as an example somewhere in the documentation? Apologies if it's already documented.

@arathunku

arathunku commented Dec 14, 2024

👋 I'd like to mention that in the same way that uv can be used, you can use Elixir too!

set unstable
set script-interpreter := ['elixir']
# add this if you want to force a different directory for dependencies, and easily cache them on CI
# export MIX_INSTALL_DIR := justfile_dir() + "/.cache"

[script]
my-ip:
  Mix.install([{:req, "~> 0.5"}])
  Req.get!("https://icanhazip.com").body |> IO.puts()

[script]
error:
  IO.puts(:stderr, "Some error")

$ just my-ip
82.125.67.147

$ just error
Some error

I wrote about it here and there are a lot more examples in the mix install examples repo.

I wish just embedded nushell/fish, not python, but with default interpreter, there's not much need for a builtin.

@casey
Owner Author

casey commented Dec 15, 2024

Once [script] becomes a stable feature, could the above solution about using uv + requirements / dependencies inlined be provided as an example somewhere in the documentation? Apologies if it's already documented.

Yup, added! https://github.com/casey/just/?tab=readme-ov-file#python-recipes-with-uv

@casey
Owner Author

casey commented Dec 15, 2024

👋 I'd like to mention that in the same way that uv can be used, you can use Elixir too!

Very nice! If you think this is useful, feel free to open a PR to add it to the readme. (Probably under the Python example with uv.)

@dbohdan

dbohdan commented Dec 15, 2024

I think this is a good resolution. Even half a year ago, rye was on the scene, and the future of Python tooling was less certain. Now it looks like uv will play a big role in it.

👋 I'd like to mention that in the same way that uv can be used, you can use Elixir too!

There are a number of tools and runtimes with this feature. I have compiled a list, which includes uv and Elixir: https://dbohdan.com/scripts-with-dependencies. (To my surprise and joy, it got referenced in PEP 722, which led to PEP 723.)

@casey
Owner Author

casey commented Dec 16, 2024

@dbohdan Damn that's a great post!

@arathunku

@dbohdan that's a great post! Looks like all the most popular languages can do that to some extent! Thanks for sharing. I won't submit a PR just for Elixir in this case, too niche 😄

@dbohdan

dbohdan commented Dec 16, 2024

Thanks, @casey and @arathunku! Feel free to contact me if you would like to suggest additions.


10 participants