-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLI
: add option --clean-workdir
to verdi node delete
#6756
base: main
Are you sure you want to change the base?
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -328,9 +328,14 @@ def extras(nodes, keys, fmt, identifier, raw): | |||||
@click.argument('identifier', nargs=-1, metavar='NODES') | ||||||
@options.DRY_RUN() | ||||||
@options.FORCE() | ||||||
@click.option( | ||||||
'--clean-workdir', | ||||||
is_flag=True, | ||||||
help='Also clean the remote work directory. (only applies to workflows and CalcJobNodes)', | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It is better to revise this message because this applies to any node, as a node can be an input of the workchain or calcjob. e.g., but could be improved.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The point I wanted to make clear, is that doesn't apply if one pass a RemoteData node. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think it will also work if you pass a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, you are right. I can improve the help message. |
||||||
) | ||||||
@options.graph_traversal_rules(GraphTraversalRules.DELETE.value) | ||||||
@with_dbenv() | ||||||
def node_delete(identifier, dry_run, force, **traversal_rules): | ||||||
def node_delete(identifier, dry_run, force, clean_workdir, **traversal_rules): | ||||||
"""Delete nodes from the provenance graph. | ||||||
|
||||||
This will not only delete the nodes explicitly provided via the command line, but will also include | ||||||
|
@@ -349,6 +354,65 @@ def node_delete(identifier, dry_run, force, **traversal_rules): | |||||
except ValueError: | ||||||
pks.append(NodeEntityLoader.load_entity(obj).pk) | ||||||
|
||||||
if clean_workdir: | ||||||
from aiida.manage import get_manager | ||||||
from aiida.orm import AuthInfo, CalcJobNode, QueryBuilder, User, load_computer | ||||||
from aiida.orm.utils.remote import get_calcjob_remote_paths | ||||||
from aiida.tools.graph.graph_traversers import get_nodes_delete | ||||||
|
||||||
backend = get_manager().get_backend() | ||||||
# For here we ignore missing nodes will be raised via func:``delete_nodes`` in the next block | ||||||
pks_set_to_delete = get_nodes_delete( | ||||||
pks, get_links=False, missing_callback=lambda missing_pks: None, backend=backend, **traversal_rules | ||||||
)['nodes'] | ||||||
|
||||||
qb = QueryBuilder() | ||||||
qb.append(CalcJobNode, filters={'id': {'in': pks_set_to_delete}}, project='id') | ||||||
calcjobs_pks = [result[0] for result in qb.all()] | ||||||
|
||||||
if not calcjobs_pks: | ||||||
echo.echo_report('--clean-workdir ignored. No CalcJobNode associated with the given node, found.') | ||||||
Comment on lines
+385
to
+386
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you can do early return here to avoid nested if..else. |
||||||
else: | ||||||
path_mapping = get_calcjob_remote_paths( | ||||||
calcjobs_pks, | ||||||
only_not_cleaned=True, | ||||||
) | ||||||
|
||||||
if not path_mapping: | ||||||
echo.echo_report('--clean-workdir ignored. CalcJobNode work directories are already cleaned.') | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Early return here as well? |
||||||
else: | ||||||
descendant_pks = [remote_folder.pk for paths in path_mapping.values() for remote_folder in paths] | ||||||
|
||||||
if not force and not dry_run: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Early return? |
||||||
echo.echo_warning( | ||||||
f'YOU ARE ABOUT TO CLEAN {len(descendant_pks)} REMOTE DIRECTORIES! ' 'THIS CANNOT BE UNDONE!' | ||||||
) | ||||||
echo.echo_info( | ||||||
'Remote directories of nodes with the following pks would be cleaned: ' | ||||||
+ ' '.join(map(str, descendant_pks)) | ||||||
) | ||||||
click.confirm('Shall I continue?', abort=True) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The command asks for confirmation twice: 1) cleaning remote folders and 2) deleting nodes. I would prefer a single confirmation prompt by merging the messages into one. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The main reason I did this, was to provide additional flexibility so to only clean Remote directories without actually deleting the nodes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, it makes sense to me. 👍 |
||||||
|
||||||
if dry_run: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. early return as well? |
||||||
echo.echo_report( | ||||||
'Remote folders of these node are marked for deletion: ' + ' '.join(map(str, descendant_pks)) | ||||||
) | ||||||
else: | ||||||
user = User.collection.get_default() | ||||||
counter = 0 | ||||||
for computer_uuid, paths in path_mapping.items(): | ||||||
computer = load_computer(uuid=computer_uuid) | ||||||
transport = AuthInfo.collection.get( | ||||||
dbcomputer_id=computer.pk, aiidauser_id=user.pk | ||||||
).get_transport() | ||||||
|
||||||
with transport: | ||||||
for remote_folder in paths: | ||||||
remote_folder._clean(transport=transport) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Check if the remote folder is already cleaned or not, either here or inside the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That already been done via flag path_mapping = get_calcjob_remote_paths(calcjobs_pks, only_not_cleaned=True,) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good! |
||||||
counter += 1 | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similar code is used in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I wanted modularize, but then I thought it's also used only in these two functions (here and There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's good to modularize as much as we can 😆 . There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I mean it reduces the readability. If it's only for two use case. |
||||||
|
||||||
echo.echo_report(f'{counter} remote folders cleaned on {computer.label}') | ||||||
|
||||||
def _dry_run_callback(pks): | ||||||
if not pks or force: | ||||||
return False | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put default to True is backward incompatible and potentially danger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, that's exactly what this command does :-)
It define it as a flag with
is_flag=True
so the default value isFalse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Then could you write
default=False
to be more explicit?