Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI: add option --clean-workdir to verdi node delete #6756

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

khsrali
Copy link
Contributor

@khsrali khsrali commented Feb 14, 2025

Implements #6732 also solves #4693.

Like the tiles says, now, with verdi node delete --clean-workdir one can also remove all remote directories of workflows and workchains.

Copy link

codecov bot commented Feb 14, 2025

Codecov Report

Attention: Patch coverage is 98.07692% with 1 line in your changes missing coverage. Please review.

Project coverage is 78.19%. Comparing base (f4c55f5) to head (8555370).

Files with missing lines Patch % Lines
src/aiida/orm/utils/remote.py 93.75% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6756      +/-   ##
==========================================
+ Coverage   78.15%   78.19%   +0.05%     
==========================================
  Files         564      564              
  Lines       42643    42677      +34     
==========================================
+ Hits        33322    33367      +45     
+ Misses       9321     9310      -11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@superstar54 superstar54 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @khsrali , thanks for addressing the issues. I added a few comments on the code. I am not familiar with tests in aiida-core, so no comments on that part.

@click.option(
'--clean-workdir',
is_flag=True,
help='Also clean the remote work directory. (only applies to workflows and CalcJobNodes)',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only applies to workflows and CalcJobNodes

It is better to revise this message because this applies to any node, as a node can be an input of the workchain or calcjob.

e.g., but could be improved.

Suggested change
help='Also clean the remote work directory. (only applies to workflows and CalcJobNodes)',
help='Also clean the remote work directories associated with the nodes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point I wanted to make clear, is that doesn't apply if one pass a RemoteData node.
Just to respect traversal rule, and to make it easier, RemoteData directories can only be cleaned if the parent calcjob is passed to the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point I wanted to make clear, is that doesn't apply if one pass a RemoteData node.

I think it will also work if you pass a RemoteData node, because the associated process node will be deleted thus the RemoteData node itself will be cleaned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. I can improve the help message.

Comment on lines 401 to 412
user = User.collection.get_default()
counter = 0
for computer_uuid, paths in path_mapping.items():
computer = load_computer(uuid=computer_uuid)
transport = AuthInfo.collection.get(
dbcomputer_id=computer.pk, aiidauser_id=user.pk
).get_transport()

with transport:
for remote_folder in paths:
remote_folder._clean(transport=transport)
counter += 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar code is used in calcjob_cleanworkdir. Consider refactoring into a reusable function to avoid duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wanted modularize, but then I thought it's also used only in these two functions (here and calcjob_cleanworkdir), Also it's specific codeme, I'm not sure if it worth to modularize it. 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to modularize as much as we can 😆 .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean it reduces the readability. If it's only for two use case.
But ok, I'll see what I can do. ;)


with transport:
for remote_folder in paths:
remote_folder._clean(transport=transport)
Copy link
Member

@superstar54 superstar54 Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if the remote folder is already cleaned or not, either here or inside the _clean method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That already been done via flag only_not_cleaned:

path_mapping = get_calcjob_remote_paths(calcjobs_pks, only_not_cleaned=True,)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good!

'Remote directories of nodes with the following pks would be cleaned: '
+ ' '.join(map(str, descendant_pks))
)
click.confirm('Shall I continue?', abort=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command asks for confirmation twice: 1) cleaning remote folders and 2) deleting nodes. I would prefer a single confirmation prompt by merging the messages into one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I did this, was to provide additional flexibility so to only clean Remote directories without actually deleting the nodes.
For instance in #4693 that might be useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, it makes sense to me. 👍

Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment in the original OP, I think it should not add to verdi node.

For the implementation, to avoid multi-layer of if..else, please consider using early return.

Comment on lines +373 to +374
if not calcjobs_pks:
echo.echo_report('--clean-workdir ignored. No CalcJobNode associated with the given node, found.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can do early return here to avoid nested if..else.

Comment on lines 381 to 382
if not path_mapping:
echo.echo_report('--clean-workdir ignored. CalcJobNode work directories are already cleaned.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early return here as well?

else:
descendant_pks = [remote_folder.pk for paths in path_mapping.values() for remote_folder in paths]

if not force and not dry_run:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early return?

)
click.confirm('Shall I continue?', abort=True)

if dry_run:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

early return as well?

@khsrali
Copy link
Contributor Author

khsrali commented Feb 15, 2025

@unkcpz Thanks for your suggestions,

For the implementation, to avoid multi-layer of if..else, please consider using early return.

Early return had to be avoided, because --clean-workdir is an additional flag, the command should continue executing even if there is no directories to clean. And cleaning has to come before deleting nodes.

For more detailed behaviour, please also have a look at the tests. All scenarios and the expected behaviour are pictured, there.

In aiida-core, at the moment, only calcjob and workchain will have remote resource that need to cleaned so I think it >makes no sense to add functionality to add --clean-remote-folder to all node type since there are tons of other node type >that has no remote folder attached. Even if you decide to add this to aiida-core. will it make more sense to add to verdi >process subcommand?

The only fact it doesn't apply to add nodes, is not an issue itself, because verdi node delete already supports options that are specific to wrokflow nodes, for example:

  --call-calc-forward / --no-call-calc-forward
                                  Whether to expand the node set by following
                                  CALL links to calculations forwards.
                                  [default: call-calc-forward]
  --call-work-forward / --no-call-work-forward
                                  Whether to expand the node set by following
                                  CALL links to workflows forwards.  [default:
                                  call-work-forward]

Anyways, I like your idea of having it as a subcommand of verdi process, however that introduces more challenge, as the process might already being run so first has to be do a verdi process kill, etc.
And tbh this feature is more of a node feature, so that it cleans remote files as well, while deleting a node. So I more inclined to @superstar54 suggestion that it fits more to verdi node delete

@unkcpz
Copy link
Member

unkcpz commented Feb 15, 2025

Early return had to be avoided, because --clean-workdir is an additional flag

I believe you can ;) Just wrap the block inside if clean_workchain into a function and ask chatgpt how to do early return. The happy path of the code is bit more clear.

For more detailed behaviour, please also have a look at the tests.

Nice.

because verdi node delete already supports options that are specific to wrokflow nodes, for example:

You may misunderstood here for --call-calc-forward and --call-work-forward. This is from graph_traversal_rule and apply for nodes that along the traverse. However, I am talking about the node (the pk or uuid) you passed to the command, the clean_workdir is ONLY applied to workchain and calcjob.

@@ -328,9 +328,14 @@ def extras(nodes, keys, fmt, identifier, raw):
@click.argument('identifier', nargs=-1, metavar='NODES')
@options.DRY_RUN()
@options.FORCE()
@click.option(
'--clean-workdir',
is_flag=True,
Copy link
Member

@unkcpz unkcpz Feb 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put default to True is backward incompatible and potentially danger.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, that's exactly what this command does :-)
It define it as a flag with is_flag=True so the default value is False

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Then could you write default=False to be more explicit?

@unkcpz
Copy link
Member

unkcpz commented Feb 15, 2025

however that introduces more challenge, as the process might already being run so first has to be do a verdi process kill, etc.

I don't get want you mean, sorry. Put the subcommand in verdi node didn't solve the issue you mentioned. The process can still run but your verdi node delete can still called without kill the process first, no?

@khsrali
Copy link
Contributor Author

khsrali commented Feb 17, 2025

You may misunderstood here for --call-calc-forward and --call-work-forward. This is from graph_traversal_rule and apply >for nodes that along the traverse. However, I am talking about the node (the pk or uuid) you passed to the command, the >clean_workdir is ONLY applied to workchain and calcjob.

I understand, but traversal rules affects only descendants of workchain and calcjob. As it has no effect or whatsoever on individual RemoteData nodes for example. The affect only appears if a calcjob node that includes that RemoteData has to be deleted according to the given traversal rules. To be clear:
verdi node delete <RemoteData> --call-calc-forward is exactly identical to:
verdi node delete <RemoteData>

In the exact same way, '--clean-workdir' has no effect on RemoteData nodes, when specified individually:
verdi node delete <RemoteData> --clean-workdir is exactly identical to:
verdi node delete <RemoteData

@khsrali
Copy link
Contributor Author

khsrali commented Feb 17, 2025

I don't get want you mean, sorry. Put the subcommand in verdi node didn't solve the issue you mentioned. The process >can still run but your verdi node delete can still called without kill the process first, no?

True, but I feel like that's the expected behavior of verdi node delete "delete the node no matter what's happening or what's the stage" in fact, it already does that. It does not care at all, if the process is running or whatever.

While for verdi process that's not the case. It's expected behavior is different, one supposed to
pause, play, repair a process. Is not meant to support a "reckless" command, in my opinion.

@superstar54
Copy link
Member

I agree with @khsrali ,

  • verdi process command is designed to interact with Process (most of the time, the living Process), not the Node.
  • because of the graph_traversal_rule, the --clean-workdir flag works not only for WorkChain node, but for all node type.

@khsrali
Copy link
Contributor Author

khsrali commented Feb 18, 2025

@superstar54 applied your review. Now, it's more modular, and more readable.
Please let me know if you have more concerns

@superstar54
Copy link
Member

Please let me know if you have more concerns

thanks for the work, LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add option to clean remote folder when deleting nodes
3 participants