-
Notifications
You must be signed in to change notification settings - Fork 75
[for discussion, POC]: make loader arguments discoverable #1419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Merging this PR will degrade performance by 17.71%
Performance Changes
Comparing |
dissect/target/tools/query.py
Outdated
|
|
||
| # The loader now knows how to print its own help. | ||
| # We just need to instantiate it. Note that the path is not important here. | ||
| loader_instance = loader_cls(pathlib.Path()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing a dummy path is a bit dubious.
generate_argparse_for_plugin_class
dissect/target/loader.py
Outdated
| def map(self, target: Target) -> None: | ||
| """Wrapper around the _map function that handles argument passing.""" | ||
| # The loader now parses its own arguments | ||
| loader_options = self._parser.parse_args(self.loader_args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of rest arguments, use parse_known_args
dissect/target/tools/utils/cli.py
Outdated
|
|
||
|
|
||
| def open_targets(args: argparse.Namespace) -> Iterator[Target]: | ||
| def open_targets(args: argparse.Namespace, loader_args: list[str] | None = None) -> Iterator[Target]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative: use loader namespace
dissect/target/loader.py
Outdated
| self, path: Path, *, parsed_path: urllib.parse.ParseResult | None = None, resolve: bool = True, | ||
| **kwargs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| self, path: Path, *, parsed_path: urllib.parse.ParseResult | None = None, resolve: bool = True, | |
| **kwargs | |
| self, path: Path, *, parsed_path: urllib.parse.ParseResult | None = None, resolve: bool = True, **kwargs |
dissect/target/target.py
Outdated
| elif not path.is_dir(): | ||
| raise TargetError(f"Failed to find loader for {path}: path is not a directory") | ||
|
|
||
| for entry in path.iterdir(): | ||
| for target in _open_all(entry, include_children=include_children): | ||
| at_least_one_loaded = True | ||
| yield target | ||
| if path.is_dir(): | ||
| for entry in path.iterdir(): | ||
| for target in _open_all(entry, include_children=include_children): | ||
| at_least_one_loaded = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why was this changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted
dissect/target/loader.py
Outdated
| # Pass the parsed options directly to the implementation map function | ||
| return self._map(target, **vars(loader_options)) | ||
|
|
||
| def _map(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def _map(self) -> None: | |
| def _map(self, **kwargs) -> None: |
| force_dirfs = "force-directory-fs" in self.parsed_query | ||
| fallback_to_dirfs = "fallback-to-directory-fs" in self.parsed_query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would force this new method of doing these command line arguments where acquire will stop functioning properly as it uses these query parameters. https://github.com/fox-it/acquire/blob/c109fe4f5e642078931fc8c3d80ab7f39e1c496f/acquire/acquire.py#L2433
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can keep both approaches if we combine the force_directory_fs argument with the one from self.parsed_query.
However, I also think it is a bit weird to apply "stringified" arguments when using target from acquire. Perhaps we can get rid of this URI scheme altogether. The question then becomes how to pass arguments to the loader programmatically.
- Do not instantiate full instance of loader when displaying help - Process all known arguments
c09be57 to
f36dc28
Compare
f36dc28 to
95dba81
Compare
dissect/target/loader.py
Outdated
| # Pass the parsed options directly to the implementation map function | ||
| return self._map(target, **vars(loader_options)) | ||
|
|
||
| def _map(self, **kwargs) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def _map(self, **kwargs) -> None: | |
| def _map(self, target: Target, *, **kwargs) -> None: |
After looking at it again, maybe like this instead as a target always gets passed down and we don't want any further positional arguments as far as I know
| # Do not continue processing, we requested a child but got none | ||
| raise | ||
|
|
||
| return target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a good idea to just remove your changes from this specific file
| def _create_parser(cls: type[Loader]) -> argparse.ArgumentParser: | ||
| """Creates the argument parser for this loader.""" | ||
| parser = argparse.ArgumentParser( | ||
| prog=f"loader:{cls.__name__.lower()}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to keep it in line with how the help is done for functions
E.g, like:
$ target-query -f activitiescache -h
usage: target-query -f activitiescache [-h]
as using -h for the loader will now not give any information on how to use it with the tool:
$ target-query -L local -h
usage: loader:localloader [--force-directory-fs] [--fallback-to-directory-fs]
which is a bit odd for a user perspective
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment, can be deferred when we implement it IMO
| if not args.loader: | ||
| parser.print_help() | ||
| return 0 | ||
|
|
||
| if (loader_cls := LOADERS_BY_SCHEME.get(args.loader, None)) is None: | ||
| print(f"Error: Loader '{args.loader}' not found.") | ||
| return 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moving this check to e.g. process generic arguments function in tools/utils/cli.py will add this behaviour to all target-* tooling instead of just query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can do this in the proper implementation
| dict(urllib.parse.parse_qsl(parsed_path.query, keep_blank_values=True)) if parsed_path else {} | ||
| ) | ||
|
|
||
| self._parser = self._create_parser(self.__class__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still not quite sure we want to create the parser in the __init__ of the loader. As it feels (to me) that the argument parser should be created / handled before the loader is initiated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the previous objection was that showing the help message required instantiation of the loader, which is no longer the case?
From an architectural standpoint, parsing arguments typically is done in a different (outermost) layer of a program. Is that the reason you want it done before the loader is initiated?
I don't necessarily disagree with that, but plugin arguments are also processed pretty "late":
| known_args, _ = parser.parse_known_args(args) |
Moreover, originally the parsing of URI arguments was also done "late":
[target.parsed_path = urllib.parse.urlparse(str(target.path))](
| target.parsed_path = urllib.parse.urlparse(str(target.path)) |
TLDR I think the poc is in line what is already there, while simultaneously agreeing it is architecturally unsound. If we change it, I propose we moving the parsing of arguments to the outermost layer of the program for all cases.
No description provided.