SantaMcCloud
Contributor

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

@SantaMcCloud
Contributor Author

@bernt-matthias

@SantaMcCloud
Contributor Author

I had to change the naming of the symlink a bit to remove the file extension that was in the name. Now '.' should be fine!

@bernt-matthias changed the title from "Fix senitizer from SemiBin" to "Fix sanitizer from SemiBin" on Oct 20, 2025
#for $e in $mode.multi_fasta.input_fasta
#set $identifier = re.sub('[^\s\w\-\\.]', '_', str($e.element_identifier))
#set $identifier = re.sub('[^\s\w\-]', '_', str($e.element_identifier))
#set $final_identifier = '_'.join($identifier.split('_')[:-1])
Contributor

This is not a good idea. If people have collection element identifiers containing any sanitized character or an underscore, this will remove part of the identifier. Users should remove file extensions during collection creation.

I guess you had some problems in the test due to the file extensions. In that case it's better to adapt the test assumptions.
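
To illustrate the concern, here is a minimal Python sketch of what the two template lines do to a few identifiers. The helper names `sanitize` and `strip_last_component` and the example identifiers are made up for illustration; only the regular expression and the `split`/`join` expression are taken from the diff.

```python
import re


def sanitize(element_identifier: str) -> str:
    # Same pattern as the changed line in the diff: every character that is
    # not whitespace, a word character, or '-' becomes '_' (so '.' does too).
    return re.sub(r"[^\s\w\-]", "_", element_identifier)


def strip_last_component(identifier: str) -> str:
    # Mirrors the $final_identifier line: drop everything after the last '_'.
    return "_".join(identifier.split("_")[:-1])


# Hypothetical collection element identifiers.
examples = ["S1.fasta", "sample_A", "sample_B", "S1"]
results = {name: strip_last_component(sanitize(name)) for name in examples}

for name, result in results.items():
    print(f"{name!r} -> {result!r}")
```

Only `S1.fasta` ends up as intended (`S1`). `sample_A` and `sample_B` both collapse to `sample`, and an identifier without any underscore, such as `S1`, becomes an empty string.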

Contributor Author

The problem in the test was a code error caused by the new naming of the file. The code tried to find something named S1, but since the file was named S1_fasta, it could not find it. So I don't know how to fix it other than by deleting the test, which should not be the point.

Do you have any ideas for this? I hadn't thought about your point. I could also add a check to see whether any file extension is left and, if so, cut it out.
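
One way such a check could look, as a rough Python sketch rather than anything from the wrapper: strip only a known suffix before sanitizing, and leave all other identifiers untouched. The suffix list `KNOWN_SUFFIXES` and the function names are assumptions made for this example.

```python
import re

# Hypothetical suffix list, longest first so '.fasta.gz' wins over '.fasta'.
KNOWN_SUFFIXES = (".fasta.gz", ".fa.gz", ".fasta", ".fna", ".fa")


def strip_known_suffix(element_identifier: str) -> str:
    # Remove one known FASTA suffix if present; otherwise return unchanged.
    lowered = element_identifier.lower()
    for suffix in KNOWN_SUFFIXES:
        if lowered.endswith(suffix):
            return element_identifier[: -len(suffix)]
    return element_identifier


def sanitize(identifier: str) -> str:
    # Same character class as the changed regex line in the diff.
    return re.sub(r"[^\s\w\-]", "_", identifier)


def safe_identifier(element_identifier: str) -> str:
    return sanitize(strip_known_suffix(element_identifier))


for name in ["S1.fasta", "S1.fasta.gz", "sample_A", "v1.2"]:
    print(f"{name!r} -> {safe_identifier(name)!r}")
```

Unlike splitting on the last underscore, this leaves `sample_A` intact and only touches identifiers that really end in one of the listed suffixes.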

Contributor

I guess S1 is hardcoded somewhere in the test data files. That then needs to be changed.

Contributor Author

After some work it should be fine now. I had to add extra files just for the one test case where the naming of the file was a problem.

Contributor Author

You should be able to merge it now. The fix is also implemented in my other PR!

Comment on lines 128 to 132
#if $e.ext.endswith(".gz")
gunzip -c '$e' > '${final_identifier}.fasta' &&
gunzip -c '$e' > '${identifier}.fasta' &&
#else
ln -s '$e' '${final_identifier}.fasta' &&
ln -s '$e' '${identifier}.fasta' &&
#end if
Contributor

@bernt-matthias Oct 22, 2025

Suggested change
#if $e.ext.endswith(".gz")
gunzip -c '$e' > '${final_identifier}.fasta' &&
gunzip -c '$e' > '${identifier}.fasta' &&
#else
ln -s '$e' '${final_identifier}.fasta' &&
ln -s '$e' '${identifier}.fasta' &&
#end if
ln -s '$e' '${identifier}.${e.ext}' &&

SemiBin should be able to decompress the input itself: https://github.com/BigDataBiology/SemiBin/blob/886913b48814cc0cd426ee46ec20d22cee875bf5/SemiBin/fasta.py#L23

Also, we should allow fasta.bz2 in the formats of the input parameter (macros input-fasta-single and input-fasta-multiple). But this probably also requires changes in other tools?
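
The idea behind the suggested change can be sketched in Python: keep the datatype extension in the symlink name so that a reader which picks its decompressor from the file name works without a separate `gunzip` step. This is a sketch under assumptions, not SemiBin's actual code; `open_maybe_compressed` and `link_with_ext` are hypothetical helpers, and detection by file extension is assumed here.

```python
import bz2
import gzip
import os
import tempfile


def open_maybe_compressed(path: str):
    # Assumed behaviour: choose the opener from the file name suffix.
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "rt")


def link_with_ext(src: str, workdir: str, identifier: str, ext: str) -> str:
    # Equivalent of: ln -s '$e' '${identifier}.${e.ext}'
    dst = os.path.join(workdir, f"{identifier}.{ext}")
    os.symlink(src, dst)
    return dst


with tempfile.TemporaryDirectory() as workdir:
    # Stand-in for a Galaxy dataset path, which carries no useful extension.
    src = os.path.join(workdir, "dataset_1.dat")
    with gzip.open(src, "wt") as handle:
        handle.write(">contig_1\nACGT\n")

    linked = link_with_ext(src, workdir, "S1", "fasta.gz")
    with open_maybe_compressed(linked) as handle:
        header = handle.readline().strip()
    linked_name = os.path.basename(linked)

print(linked_name, header)
```

With this naming, a `fasta.bz2` input would follow the same path once the input parameter accepts that format.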

Contributor Author

Will change it later today :)
