Both MS²Rescore and Sage rely on the timsrust package to directly read Bruker raw files (.d directory or miniTDF). However, there is an inconsistency between the two tools in how they interpret scan numbers. This leads to an off-by-one issue. For instance, a spectrum with ID "3033" in MS²Rescore would have spectrum ID "3034" in the Sage results file.
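To make the offset concrete, here is a minimal sketch with made-up scan numbers. The IDs and the `shift_scannr` helper are purely illustrative assumptions, not part of either tool. The sketch only shows that ID matching fails until the Sage values are shifted down by one.

```python
# Hypothetical spectrum IDs as MS²Rescore would report them.
ms2rescore_ids = {"3033", "3040", "3051"}

# Hypothetical `scannr` values for the same spectra in a Sage results file
# (each one is exactly one higher).
sage_scannrs = ["3034", "3041", "3052"]


def shift_scannr(scannr: str, offset: int = -1) -> str:
    """Shift an integer-like scan number string by `offset` (illustrative helper)."""
    return str(int(scannr) + offset)


# Without correction, none of the Sage IDs are found among the MS²Rescore IDs.
matched_before = [s for s in sage_scannrs if s in ms2rescore_ids]

# After subtracting one, every Sage ID lines up with its MS²Rescore counterpart.
corrected = [shift_scannr(s) for s in sage_scannrs]
matched_after = [s for s in corrected if s in ms2rescore_ids]

print(len(matched_before), len(matched_after))
```

Note that this assumes the scan numbers are plain integer strings, as in the example IDs quoted above.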
This problem should be fixed in future Sage versions (lazear/sage#128 and lazear/sage#140). In the meantime, the following Python script can be used to correct the scan number in Sage result files, supporting both TSV and Parquet formats.
correct_sage_scannr.py
from pathlib import Path

import click
import pandas as pd


def correct_scannr(sage_psms: pd.DataFrame) -> None:
    """Shift the Sage scan numbers down by one, in place."""
    sage_psms["scannr"] = (sage_psms["scannr"].astype(int) - 1).astype(str)


@click.command()
@click.argument("input_path", type=click.Path(exists=True, dir_okay=False, path_type=Path))
def correct_sage_scannr(input_path: Path):
    # Read the Sage results file (TSV or Parquet).
    if input_path.suffix == ".tsv":
        sage_psms = pd.read_csv(input_path, sep="\t")
    elif input_path.suffix == ".parquet":
        sage_psms = pd.read_parquet(input_path)
    else:
        raise ValueError("Input file must be a TSV or Parquet file")

    # Write next to the input file, adding a "_corrected" suffix to the name.
    output_path = input_path.with_name(f"{input_path.stem}_corrected{input_path.suffix}")

    correct_scannr(sage_psms)

    if input_path.suffix == ".tsv":
        sage_psms.to_csv(output_path, sep="\t", index=False)
    elif input_path.suffix == ".parquet":
        sage_psms.to_parquet(output_path)


if __name__ == "__main__":
    correct_sage_scannr()
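Run the script with the path to a Sage results file as its only argument, for example `python correct_sage_scannr.py results.sage.tsv` (the file name here is illustrative). The corrected table is written next to the input file with a `_corrected` suffix, in this case `results.sage_corrected.tsv`.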