How to get remaining arguments #12

cloudhead · 2022-01-25T15:44:55Z

To implement external sub-commands, eg. git foo --bar --baz, I need to invoke a binary eg. git-foo in this case with arguments --bar --baz.

I can't figure out though, how to get the remaining arguments as plain strings, and not as Long/Short/Value. Any ideas how I might achieve that?

The text was updated successfully, but these errors were encountered:

cloudhead · 2022-01-25T15:55:38Z

I found a solution though I don't think it's very nice:

            Value(val) if command.is_none() => {
                let cmd = val.to_string_lossy().into_owned();
                let mut args = vec![cmd];

                while let Some(a) = parser.next()? {
                    match a {
                        Long(s) => args.push(format!("--{}", s)),
                        Short(c) => args.push(format!("-{}", c)),
                        Value(v) => args.push(v.to_string_lossy().into_owned()),
                    }
                }
                command = Some(Command::External(args))
            }

blyxxyz · 2022-01-25T15:58:49Z

Assuming the last argument was a Value, you can do this by repeatedly calling parser.value(). (You can wrap this up in an iterator with std::iter::from_fn(|| parser.value().ok()).) It was not originally intended, but it works out exactly right.

It's mentioned in a single sentence in the docs but I would like to have a proper API for this.

cloudhead · 2022-01-26T19:57:55Z

Yeah, this works perfectly, and I had missed it from the docs. I would suggest creating a function for it to make it easier to find, eg. parser.rest() or parser.collect(). You could even make parser an Iterator, then collect() would work as expected.

blyxxyz · 2022-01-26T20:07:11Z

My current plan is a raw_args() method that borrows self and returns a peekable iterator (or an error if part of an argument is left over). That way you can take as many arguments as you want and still continue parsing afterwards if you don't take them all.

blyxxyz · 2022-02-05T21:47:42Z

I've implemented raw_args() in e0026c3. How does it look?

I'll make a new release in a few days.

cloudhead · 2022-02-07T17:31:53Z

Looks promising!

One related issue is when delegating arguments to a sub-parser, I wonder if you have an ideas:

        while let Some(arg) = parser.next()? {
            match arg {
                Long("foo") => {
                    foo = true;
                }
                _ => {
                    // Here we basically want to delegate the rest of the arguments to be parsed to another parser.
                    // This doesn't quite work because we've already consumed `arg`!
                    let remaining = parser.raw_args();
                    let result = sub_parser.parse_args(remaining)?;
                    sub_options = result;
                    break;
                }
            }
        }

blyxxyz · 2022-02-07T17:51:31Z

If arg is a Value then this should work:

Value(arg) => {
    let remaining = std::iter::once(arg).chain(parser.raw_args()?);
    ...

It gets trickier if it doesn't have to be a Value. One problem is that if the raw argument is --option=123, arg will be Long("option") and you're not allowed to call raw_args until you consume 123 by calling .value().

raw_args might not be the right approach for a subparser. If the subparser also uses lexopt then it'd make more sense to pass parser itself. But you'd also want to pass arg to have the subparser handle it first, and you can't pass parser until arg is gone (or has had its lifetime readjusted using an ugly trick).

I'm going to think about this. I have a few halfbaked ideas.

Do you actually need the _ pattern in this case or would Value(arg) (with another catch-all branch that returns an error) do?

cloudhead · 2022-02-07T19:25:20Z

Yes this is exactly the stuff I've been running into.

I don't think I can match on Value because the sub-parser might need to handle flags as well.

The workaround I've been using so far is this:

let mut unparsed: Vec<OsString> = Vec::new();

while let Some(arg) = parser.next()? {
  match arg {
  ...
  _ => unparsed.push(args::format(arg))
  }
}

let sub = lexopt::Parser::from_args(unparsed);
...

And the implementation of args::format is:

pub fn format(arg: lexopt::Arg) -> OsString {
    match arg {
        lexopt::Arg::Long(flag) => format!("--{}", flag).into(),
        lexopt::Arg::Short(flag) => format!("-{}", flag).into(),
        lexopt::Arg::Value(val) => val,
    }
}

It would already be helpful to have a fmt::Display instance of Arg like the one above as part of the library.

cloudhead · 2022-02-07T19:28:07Z

One option could be to have a way to put back an arg into the parser, eg.

arg => {
  parser.input(arg); // "un-parse" arg
  sub_parser.parse(parser.raw_args());
}

Another interesting function might be Parser::into_raw_args which would consume the parser, returning the remaining args.

EDIT:

Just saw this comment:

It gets trickier if it doesn't have to be a Value. One problem is that if the raw argument is --option=123, arg will be Long("option") and you're not allowed to call raw_args until you consume 123 by calling .value().

This is something I'm not handling indeed, so it would break at the moment, though I don't use those types of options.

blyxxyz · 2022-02-07T20:22:41Z

Does the subparser use lexopt too?

Do I understand correctly that the arguments can be interleaved, so that e.g. for the command line a b --foo c d the subparser should see a b c d?

A solution I have been pondering is a .current() method that returns the last return value of .next(), or a .peek() method that returns the next one. That'd work best if Value held an &OsStr instead of an OsString (which also has other benefits). But there are edge cases I'm unsure about, particularly around errors.

That would be somewhat similar to your parser.input() idea.

Another solution could be to turn your logic inside-out. In the outermost parser, define a closure like this:

let mut foo = false;
let handle = |arg| -> bool {
    match arg {
        Long("foo") => foo = true,
        _ => return false,
    }
    true
};

Then pass handle to the subparser, and let it drive the parsing but give the closure the first say:

while let Some(arg) = parser.next()? {
    match arg {
        arg if handle(&arg) => (),
        ...

(The closure indicates that it "consumed" arg by returning true.)

I'm not sure if that's worthwhile as-is. It's kind of complicated and doesn't let you call parser.value() inside the closure. But there might be something to the basic idea.

cloudhead · 2022-02-08T22:42:12Z

Does the subparser use lexopt too?

Yes

Do I understand correctly that the arguments can be interleaved, so that e.g. for the command line a b --foo c d the subparser should see a b c d?

That's right, at least I support that with my current workaround in some cases.

Will think about some of the ideas you proposed, thanks! I've gone through many options parsers over the last few years: pico-args, argh, structopt and clap. So for this one is the best, so kudos!

cloudhead · 2022-02-10T09:46:37Z

After some more exploration, I've found that the method above with some tweaks is the cleanest and most flexible. Basically, I have a trait like this, implemented by all parsers:

pub trait Args: Sized {
    fn from_args(args: Vec<OsString>) -> anyhow::Result<(Self, Vec<OsString>)>;
}

Then when parsing, I do something like this:

impl Args for MainOptions {
    fn from_args(args: Vec<OsString>) -> anyhow::Result<(Self, Vec<OsString>)> {
        let (sub_opts1, unparsed) = SubOptions1::from_args(args)?;
        let (sub_opts2, unparsed) = SubOptions2::from_args(unparsed)?;
        let (sub_opts3, unparsed) = SubOptions3::from_args(unparsed)?;
        // etc ...

        // Now the main option parser working with the remaining options.
        let mut parser = lexopt::Parser::from_args(unparsed);
        while let Some(arg) = parser.next()? { ... }
        // ...

The sub-parsers build an unparsed list from ignored args with _ => unparsed.push(args::format(arg)). It would of course be cleaner if there was a way to have a lexopt::Arg::to_owned or something to store these unparsed args, and for lexopt::Parser::from_args to work with a Vec<lexopt::Arg>.

blyxxyz · 2022-02-10T10:10:31Z

Ah, that's a nice way to think about it!

I don't think I'll ever support that properly, because to parse arguments the correct/pedantic way you have to do a single pass with all the information available. An argument -ab could mean either -a -b or -a=b. If SubOptions1 handles -b and SubOptions2 handles -a then you get in a tangle because SubOptions2 has to decide whether -b occurs at all.

But I think you said none of your options take arguments, so you don't suffer from this problem.

I've had an OwnedArg implementation stashed for months but I'm not really happy with it because Arg is already half-borrowed, half-owned. And it doesn't need to be part of lexopt proper, it can be implemented outside the crate.

Would something like this work?

use std::ffi::OsString;

use lexopt::Arg;

enum OwnedArg {
    Value(OsString),
    Short(char),
    Long(String),
}

impl From<Arg<'_>> for OwnedArg {
    fn from(arg: Arg<'_>) -> OwnedArg {
        match arg {
            Arg::Value(value) => OwnedArg::Value(value),
            Arg::Short(opt) => OwnedArg::Short(opt),
            Arg::Long(opt) => OwnedArg::Long(opt.to_owned()),
        }
    }
}

fn main() -> anyhow::Result<()> {
    let mut args = Vec::new();
    let mut parser = lexopt::Parser::from_env();
    while let Some(arg) = parser.next()? {
        args.push(OwnedArg::from(arg));
    }
    let (parsed, _) = MainOptions::from_args(args)?;
    Ok(())
}

from_args would take a Vec<Arg> instead of a Vec<OsString>, and you would only ever use a Parser once. I think that if you're never calling .value(), just looping through a Vec is enough.

cloudhead · 2022-02-10T11:35:25Z

Yes, I think that would work!

Parser::from_args<T: Into<OsString>>(args: Vec<T>) perhaps and have an impl for OwnedArg -> OsString.

cloudhead · 2022-02-10T11:36:40Z

I'd probably also implement Arg: ToOwned instead of From/Into

blyxxyz · 2022-02-10T11:50:15Z

What I was getting at is that you only need to use a Parser once. If you have an args: Vec<OwnedArg>, instead of doing this:

let mut parser = lexopt::Parser::from_args(args);
while let Some(arg) = parser.next()? {
    match arg {

Can't you just do this?

for arg in args {
    match arg {

If you're only doing options and positional arguments (and no options with values) then that seems like it would be enough.

A fundamental problem with processing multiple times is that you lose the meaning of --. -a -b -- -c becomes Short('a'), Short('b'), Value("-c"), but if you naively transform that back you get -a -b -c, which is then parsed as Short('a'), Short('b'), Short('c').

Running the arguments through a Parser exactly once avoids that problem, and possibly others as well.

cloudhead · 2022-02-10T12:40:45Z

Can't you just do this?

Ah I see, but how about optional values and stuff? Eg. --foo 123 wouldn't work with the above.

A fundamental problem with processing multiple times is that you lose the meaning of --

Interesting.. This could be solved if -- returned Arg::EndOfFlags or something, at which point you could break out of the loop. But that would break other things. Somehow though you'd want to parse up to -- and leave everything after as positional values for the main parser..

EDIT: Actually doesn't getting a Value(_) signify the end of the flags? At which point the sub-parser could break out, but I guess you'll lose the -- distinction, ie. -a -b -- -c will turn into -a -b - c after the first parser, unless you do:

Value(a) => {
  unparsed.push("--");
  unparsed.push(a);
  unparsed.extend(parser.raw_args());
  break;
}

Or simply split the unparsed flags from the unparsed values..

blyxxyz · 2022-02-10T13:43:42Z

Ah I see, but how about optional values and stuff? Eg. --foo 123 wouldn't work with the above.

Yep, that's a limitation. I thought you didn't use options like those, but I guess you only meant you didn't spell them --foo=123?

EDIT: Actually doesn't getting a Value(_) signify the end of the flags?

That's how POSIX does it, but most modern parsers (including lexopt) don't do it like that. You can put flags after positional arguments if you want. If you do want the POSIX behavior I wrote up an example here.

But I don't think that works for your use case, because the 123 argument to --foo will look like a Value until you get to the subparser that consumes it.

If you want to be able to do --foo 123 then to get all the details right I think you really do need to do it in a single pass. So that would mean only a single while let Some(arg) = parser.next()? { loop, and somehow delegating the arguments to the subparsers inside that loop.

That, or ditching the subparsers and combining it all into one big match. Would that be possible/practical?

cloudhead · 2022-02-10T16:46:03Z

Can you remind me what the issue is with --foo 123 with my approach? Is it not supporting -- to separate flags from values? I'll try a few things and see if I can break it.

The reason I have sub-parsers is simply that some flag sets are shared across tools, and I don't want to have to duplicate the matches and parsing. So it's a question of keeping things in one place.

blyxxyz · 2022-02-10T17:16:26Z

Can you remind me what the issue is with --foo 123 with my approach? Is it not supporting -- to separate flags from values? I'll try a few things and see if I can break it.

The main issue I meant is that the Vec<OwnedArg> approach doesn't work flawlessly. If it did work you wouldn't need to re-parse, and re-parsing indeed breaks --.

Spellings like --foo=123 and -ovalue also break. And if you write --foo -123 where -123 is supposed to be a negative number you run into trouble.

(You could decide that these shortcomings are acceptable.)

The reason I have sub-parsers is simply that some flag sets are shared across tools, and I don't want to have to duplicate the matches and parsing. So it's a question of keeping things in one place.

Ah, I understand now! I don't have a good solution yet but I'll keep thinking about it.

cloudhead · 2022-02-12T17:02:40Z

Ok interesting, these are ok trade-offs for me at the moment, but obviously a more robust solution would be interesting.

blyxxyz · 2022-07-10T18:24:17Z

raw_args() is now released in version 0.2.1.

Feel free to open another issue for subparsers.

blyxxyz closed this as completed Jul 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to get remaining arguments #12

How to get remaining arguments #12

cloudhead commented Jan 25, 2022

cloudhead commented Jan 25, 2022

blyxxyz commented Jan 25, 2022

cloudhead commented Jan 26, 2022

blyxxyz commented Jan 26, 2022

blyxxyz commented Feb 5, 2022 •

edited

Loading

cloudhead commented Feb 7, 2022

blyxxyz commented Feb 7, 2022

cloudhead commented Feb 7, 2022

cloudhead commented Feb 7, 2022 •

edited

Loading

blyxxyz commented Feb 7, 2022

cloudhead commented Feb 8, 2022

cloudhead commented Feb 10, 2022 •

edited

Loading

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 10, 2022

cloudhead commented Feb 10, 2022

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 10, 2022 •

edited

Loading

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 10, 2022

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 12, 2022

blyxxyz commented Jul 10, 2022

How to get remaining arguments #12

How to get remaining arguments #12

Comments

cloudhead commented Jan 25, 2022

cloudhead commented Jan 25, 2022

blyxxyz commented Jan 25, 2022

cloudhead commented Jan 26, 2022

blyxxyz commented Jan 26, 2022

blyxxyz commented Feb 5, 2022 • edited Loading

cloudhead commented Feb 7, 2022

blyxxyz commented Feb 7, 2022

cloudhead commented Feb 7, 2022

cloudhead commented Feb 7, 2022 • edited Loading

blyxxyz commented Feb 7, 2022

cloudhead commented Feb 8, 2022

cloudhead commented Feb 10, 2022 • edited Loading

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 10, 2022

cloudhead commented Feb 10, 2022

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 10, 2022 • edited Loading

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 10, 2022

blyxxyz commented Feb 10, 2022

cloudhead commented Feb 12, 2022

blyxxyz commented Jul 10, 2022

blyxxyz commented Feb 5, 2022 •

edited

Loading

cloudhead commented Feb 7, 2022 •

edited

Loading

cloudhead commented Feb 10, 2022 •

edited

Loading

cloudhead commented Feb 10, 2022 •

edited

Loading