[BUG] inpainting metrics are not computed correctly #957

neuronflow · 2024-10-10T08:10:15Z

GaNDLF produces metrics that are different from our official inpainting package (https://pypi.org/project/inpainting/).

To compute metrics with the official package:

pip install inpainting

from inpainting.challenge_metrics_2023 import generate_metrics, read_nifti_to_tensor


def compute_image_quality_metrics(
    prediction: str,
    healthy_mask: str,
    reference_t1: str,
    voided_t1: str,
) -> dict:
    print("computing metrics!")
    print("prediction:", prediction)
    print("healthy_mask:", healthy_mask)
    print("reference_t1:", reference_t1)
    print("voided_t1:", voided_t1)

    prediction_data = read_nifti_to_tensor(prediction)
    healthy_mask_data = read_nifti_to_tensor(healthy_mask).bool()
    reference_t1_data = read_nifti_to_tensor(reference_t1)
    voided_t1_data = read_nifti_to_tensor(voided_t1)

    metrics = generate_metrics(
        prediction=prediction_data,
        target=reference_t1_data,
        normalization_tensor=voided_t1_data,
        mask=healthy_mask_data,
    )

    return metrics


if __name__ == "__main__":
    official_metrics = compute_image_quality_metrics(
        prediction="path_to_prediction.nii.gz",
        healthy_mask="path_to_healthy_mask.nii.gz",
        reference_t1="path_to_reference.nii.gz",
        voided_t1="path_to_voided.nii.gz",
    )

    print(official_metrics)

@MarcelRosier will upload some test data to reproduce.

MarcelRosier · 2024-10-10T08:31:13Z

Test data: INP-BraTS-GLI-00000-000.zip
(The prediction was generated using last year's winning algorithm.)

github-actions · 2024-12-13T19:47:53Z

Stale issue message

sarthakpati · 2024-12-16T18:18:18Z

I am getting an error with this. Here are the steps I followed:

> conda create -p ./venv python=3.11 -y
[SNIP!]
> conda activate ./venv
> pip install inpainting
[SNIP!]
> python

>>> from inpainting.challenge_metrics_2023 import generate_metrics, read_nifti_to_tensor
>>> pred=read_nifti_to_tensor(r"C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\prediction.nii.gz")
>>> mask=read_nifti_to_tensor(r"C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\mask-healthy.nii.gz")
>>> reft1=read_nifti_to_tensor(r"C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\t1n-reference.nii.gz")
>>> voit1=read_nifti_to_tensor(r"C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\t1n-voided.nii.gz")
>>> generate_metrics(pred,reft1,voit1,mask)
Error: tensors used as indices must be long, int, byte or bool tensors
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Projects\temp_brats_synthesis_metrics\venv\Lib\site-packages\inpainting\challenge_metrics_2023.py", line 266, in generate_metrics
    output["ssim"] = _structural_similarity_index(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Projects\temp_brats_synthesis_metrics\venv\Lib\site-packages\inpainting\challenge_metrics_2023.py", line 92, in _structural_similarity_index
    return ssim_idx.mean()
           ^^^^^^^^
UnboundLocalError: cannot access local variable 'ssim_idx' where it is not associated with a value
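
The traceback above has two layers: an indexing error is printed first, and the `UnboundLocalError` follows because `ssim_idx` was never assigned. One visible difference from the snippet in the issue description is that this session passes the mask without the `.bool()` cast, which may be what triggers the indexing error. Below is a minimal pure-Python sketch of that failure pattern, assuming the library catches and prints the first exception. The functions `select_swallowing` and `select_with_cast` are hypothetical stand-ins, not the package's actual code, and a float list index is used only as an analogy for a float mask tensor.

```python
def select_swallowing(values, index):
    """Hypothetical stand-in: the indexing error is printed, not re-raised."""
    try:
        selected = values[index]  # raises TypeError when index is a float
    except TypeError as err:
        print("Error:", err)
    # If the try block failed, 'selected' was never bound.
    return selected


def select_with_cast(values, index):
    """Same lookup, but the index is cast to a valid type first."""
    return values[int(index)]


values = [0.1, 0.2, 0.3]

try:
    select_swallowing(values, 1.0)
    outcome = "no error"
except UnboundLocalError as err:
    outcome = type(err).__name__

print(outcome)
print(select_with_cast(values, 1.0))
```

The second function mirrors what the `.bool()` cast does in the original snippet: the index is converted to an accepted type before it is used.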

neuronflow · 2024-12-16T18:19:49Z

Can you try with Python 3.10? I believe that is what we used. I'm not sure whether this is your problem, though.

sarthakpati · 2024-12-16T18:23:45Z

I'm unsure whether this has anything to do with the Python version, but I will give it a go.

sarthakpati · 2024-12-16T18:38:28Z

Same error with 3.10. For reference, here is the result from GaNDLF:

(C:\Projects\GaNDLF\venv) PS C:\Projects\GaNDLF> gandlf generate-metrics -c "C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\config_synthesis.yaml" -i "C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\metrics_data_csv_gandlf.csv"
The ``converters`` are currently experimental. It may not support operations including (but not limited to) Functions in ``torch.nn.functional`` that involved data dimension
C:\Projects\GaNDLF\venv\Lib\site-packages\_distutils_hack\__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
2024-12-16 13:26:55 - INFO - The logs are saved in C:\Users\sarth\.gandlf\20241216_132655.log
WARNING: Initializing 'norm_type' as 'batch'
WARNING: This is a special case for multi-class computation, where different labels are processed together, `reverse_one_hot` will need mapping information to work correctly
WARNING: Defining 'patch_sampler' as a string will be deprecated in a future release, please use a dictionary instead
WARNING: Initializing 'verbose' as False
WARNING: Initializing 'medcam_enabled' as False
WARNING: Initializing 'save_training' as False
WARNING: Initializing 'save_output' as False
WARNING: Initializing 'in_memory' as False
WARNING: Initializing 'pin_memory_dataloader' as False
WARNING: Initializing 'scaling_factor' as 1
WARNING: Initializing 'clip_grad' as None
WARNING: Initializing 'track_memory_usage' as False
WARNING: Initializing 'memory_save_mode' as False
WARNING: Initializing 'print_rgb_label_warning' as True
WARNING: Initializing 'grid_aggregator_overlap' as crop
WARNING: Initializing 'determinism' as False
WARNING: Initializing 'previous_parameters' as None
WARNING: Initializing 'clip_mode' as None
WARNING: Setting default step_size to: 0.1
  0%|                                                                                                         | 0/1 [00:00<?, ?it/s]2024-12-16 13:26:56 - py.warnings - WARNING - warnings:_showwarnmsg:109 - C:\Projects\GaNDLF\venv\Lib\site-packages\torchmetrics\utilities\prints.py:62: FutureWarning: Importing `StructuralSimilarityIndexMeasure` from `torchmetrics` was deprecated and will be removed in 2.0. Import `StructuralSimilarityIndexMeasure` from `torchmetrics.image` instead.
  _future_warning(

2024-12-16 13:26:56 - py.warnings - WARNING - warnings:_showwarnmsg:109 - C:\Projects\GaNDLF\GANDLF\metrics\synthesis.py:34: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen/native/IndexingUtils.h:28.)
  ssim_idx = ssim_idx_full_image[mask]

2024-12-16 13:33:14 - py.warnings - WARNING - warnings:_showwarnmsg:109 - C:\Projects\GaNDLF\GANDLF\cli\generate_metrics.py:382: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen/native/IndexingUtils.h:28.)
  gt_image_infill = gt_image_infill[mask]

2024-12-16 13:33:14 - py.warnings - WARNING - warnings:_showwarnmsg:109 - C:\Projects\GaNDLF\GANDLF\cli\generate_metrics.py:383: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen/native/IndexingUtils.h:28.)
  output_infill = output_infill[mask]

2024-12-16 13:33:14 - py.warnings - WARNING - warnings:_showwarnmsg:109 - C:\Projects\GaNDLF\venv\Lib\site-packages\torchmetrics\utilities\prints.py:62: FutureWarning: Importing `PeakSignalNoiseRatio` from `torchmetrics` was deprecated and will be removed in 2.0. Import `PeakSignalNoiseRatio` from `torchmetrics.image` instead.
  _future_warning(

100%|████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [06:18<00:00, 378.84s/it]
{1: {'mae': 0.0005344419041648507,
     'mse': 5.740170649914944e-07,
     'msle': 5.466881134452706e-07,
     'ncc_max': 0.9996438986031039,
     'ncc_mean': 1.736029613960155e-06,
     'ncc_min': -0.0003618681524009567,
     'ncc_std': 0.0016800671663601758,
     'psnr': 32.42278289794922,
     'psnr_01': 62.41074752807617,
     'psnr_01_eps': 62.41075134277344,
     'psnr_eps': 32.422786712646484,
     'ssim': 0.9964786171913147}}
Finished.

Relevant files:

Config: metrics_data_csv_gandlf.csv
Data csv:

SubjectID,Target,Prediction,Mask
001,"C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\t1n-reference.nii.gz","C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\prediction.nii.gz","C:\Users\sarth\Downloads\INP-BraTS-GLI-00000-000\INP-BraTS-GLI-00000-000\mask-healthy.nii.gz"
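
For anyone reproducing this with their own files, the data CSV can be generated rather than written by hand. This is a minimal sketch using only the column names shown above; the relative paths and the helper name `build_metrics_csv` are placeholders, not part of GaNDLF.

```python
import csv
import io

# Column names taken from the data CSV shown above.
FIELDNAMES = ["SubjectID", "Target", "Prediction", "Mask"]


def build_metrics_csv(rows):
    """Serialize subject rows into CSV text with the expected header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder paths: replace with the real locations of the test data.
rows = [
    {
        "SubjectID": "001",
        "Target": "data/t1n-reference.nii.gz",
        "Prediction": "data/prediction.nii.gz",
        "Mask": "data/mask-healthy.nii.gz",
    }
]

csv_text = build_metrics_csv(rows)
print(csv_text)
```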

neuronflow · 2024-12-16T19:30:00Z

I checked, and the code works for me without issues. Maybe it is not compatible with Windows?

This is a Python 3.10 environment on an Ubuntu machine:

from inpainting.challenge_metrics_2023 import generate_metrics, read_nifti_to_tensor


def compute_image_quality_metrics(
    prediction: str,
    healthy_mask: str,
    reference_t1: str,
    voided_t1: str,
) -> dict:
    print("computing metrics!")
    print("prediction:", prediction)
    print("healthy_mask:", healthy_mask)
    print("reference_t1:", reference_t1)
    print("voided_t1:", voided_t1)

    prediction_data = read_nifti_to_tensor(prediction)
    healthy_mask_data = read_nifti_to_tensor(healthy_mask).bool()
    reference_t1_data = read_nifti_to_tensor(reference_t1)
    voided_t1_data = read_nifti_to_tensor(voided_t1)

    metrics = generate_metrics(
        prediction=prediction_data,
        target=reference_t1_data,
        normalization_tensor=voided_t1_data,
        mask=healthy_mask_data,
    )

    return metrics


if __name__ == "__main__":
    official_metrics = compute_image_quality_metrics(
        prediction="/home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/prediction.nii.gz",
        healthy_mask="/home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/mask-healthy.nii.gz",
        reference_t1="/home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/t1n-reference.nii.gz",
        voided_t1="/home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/t1n-voided.nii.gz",
    )

    print(official_metrics)

(ipt) florian@a4000-21an1:~/flow/inpainting_test$ python check.py 
computing metrics!
prediction: /home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/prediction.nii.gz
healthy_mask: /home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/mask-healthy.nii.gz
reference_t1: /home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/t1n-reference.nii.gz
voided_t1: /home/florian/flow/inpainting_test/INP-BraTS-GLI-00000-000/t1n-voided.nii.gz
{'ssim': 0.9964787364208368, 'mse': 0.000503217859659344, 'rmse': 0.022432517260313034, 'msle': 0.0001831933914218098, 'mae': 0.01582399569451809, 'psnr': 32.57281062728802, 'psnr_eps': 32.57281284650872, 'psnr_01': 32.98243713378906, 'psnr_01_eps': 32.98244094848633}

neuronflow · 2024-12-16T19:36:31Z

What is interesting here is that you get the same or very similar values, though. However, we get different results from Synapse when they run GaNDLF.
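
To see which of the shared metrics actually agree between the two outputs quoted above, the values can be compared key by key. This is a small sketch with the numbers copied from this thread; the 1% relative tolerance is an arbitrary assumption, and `compare` is a hypothetical helper.

```python
import math

# Output of `gandlf generate-metrics`, copied from the comment above.
gandlf = {
    "mae": 0.0005344419041648507,
    "mse": 5.740170649914944e-07,
    "msle": 5.466881134452706e-07,
    "psnr": 32.42278289794922,
    "psnr_01": 62.41074752807617,
    "psnr_01_eps": 62.41075134277344,
    "psnr_eps": 32.422786712646484,
    "ssim": 0.9964786171913147,
}

# Output of the inpainting package on Ubuntu, copied from the comment above.
inpainting = {
    "ssim": 0.9964787364208368,
    "mse": 0.000503217859659344,
    "rmse": 0.022432517260313034,
    "msle": 0.0001831933914218098,
    "mae": 0.01582399569451809,
    "psnr": 32.57281062728802,
    "psnr_eps": 32.57281284650872,
    "psnr_01": 32.98243713378906,
    "psnr_01_eps": 32.98244094848633,
}


def compare(a, b, rel_tol=0.01):
    """Return {metric: True/False} for metrics present in both results."""
    shared = sorted(set(a) & set(b))
    return {k: math.isclose(a[k], b[k], rel_tol=rel_tol) for k in shared}


agreement = compare(gandlf, inpainting)
for name, ok in agreement.items():
    print(f"{name:12s} {'match' if ok else 'DIFFERS'}")
```

With this tolerance, only `ssim`, `psnr`, and `psnr_eps` agree; `mae`, `mse`, `msle`, `psnr_01`, and `psnr_01_eps` do not.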

sarthakpati · 2024-12-16T19:38:54Z

This is using the latest version [ref]. Perhaps they were using an older version? Could you tag the person from Synapse who was running this?

neuronflow · 2024-12-16T19:48:09Z

I don't know their GitHub accounts. Rong and Verena both ran this code base.

Or perhaps some algorithms save files in a different format, and then something goes wrong with the file loading in GaNDLF?

The issue must be somewhere in this direction.

neuronflow · 2024-12-16T20:15:32Z

@vpchung, can you please have a look? Sarthak is unable to reproduce the issue.

sarthakpati · 2024-12-16T20:16:08Z

GaNDLF is letting SimpleITK do its thing WRT loading, so there is nothing special going on there.

Update: sent email to Rong and Verena.

vpchung · 2024-12-17T21:38:15Z

I was able to generate the same scores as shared above:

$ conda create -n inpainting python=3.10 -y && conda activate inpainting
$ pip install inpainting numpy==1.26.4

>>> from inpainting.challenge_metrics_2023 import generate_metrics, read_nifti_to_tensor
>>> def compute_image_quality_metrics(
...     ... (truncated for readability)
...     ...
... )
>>>
>>> scores = compute_image_quality_metrics(
...     prediction=os.path.join(parent, "prediction.nii.gz"),
...     healthy_mask=os.path.join(parent, "mask-healthy.nii.gz"),
...     reference_t1=os.path.join(parent, "t1n-reference.nii.gz"),
...     voided_t1=os.path.join(parent, "t1n-voided.nii.gz")
... )
computing metrics!
prediction: /Users/vchung/Downloads/INP-BraTS-GLI-00000-000/prediction.nii.gz
healthy_mask: /Users/vchung/Downloads/INP-BraTS-GLI-00000-000/mask-healthy.nii.gz
reference_t1: /Users/vchung/Downloads/INP-BraTS-GLI-00000-000/t1n-reference.nii.gz
voided_t1: /Users/vchung/Downloads/INP-BraTS-GLI-00000-000/t1n-voided.nii.gz
>>> 
>>> pprint(scores)
{'mae': 0.01582399569451809,
 'mse': 0.000503217801451683,
 'msle': 0.00018319336231797934,
 'psnr': 32.57281494140625,
 'psnr_01': 32.98244094848633,
 'psnr_01_eps': 32.98244094848633,
 'psnr_eps': 32.57281494140625,
 'rmse': 0.022432517260313034,
 'ssim': 0.996478796005249}
>>>

As mentioned in my email response, I think the mismatch of scores may actually be due to the challenge re-using the BraTS 2023 metrics MLCube, rather than this being a GaNDLF issue.

neuronflow · 2024-12-17T21:41:05Z

> As mentioned in my email response, I think the mismatch of scores may actually be due to the challenge re-using the BraTS 2023 metrics MLCube, rather than this being a GaNDLF issue.

Thanks. What is running under the hood there?

Would it be possible to have an MLCube wrapping our metrics package for the upcoming Lighthouse challenge?

vpchung · 2024-12-17T22:00:44Z

> What is running under the hood there?

I'm not sure, as I did not create any of the metrics MLCubes. My best guess is that this is the source used to create the inpainting metrics MLCube, which, according to the setup README, uses this branch from Felix's GaNDLF fork.

> Would it be possible to have an MLCube wrapping our metrics package for the upcoming Lighthouse challenge?

Yes, in my opinion, it would be best to create a new metrics MLCube. But perhaps @sarthakpati (or someone from MLCommons) has a better suggestion.

sarthakpati · 2024-12-18T14:07:31Z

We are currently working on a common solution for all metrics (see #942). This would allow a single "source of truth" for all metrics, and organizers would only need to incorporate their implementations in GaNDLF. The MLCube generation and subsequent steps will be taken care of automatically.

Any feedback/help would be much appreciated!

vpchung · 2024-12-18T19:15:05Z

@sarthakpati that sounds amazing! Is there a proposed timeline for this effort? i.e. would it be ready in time for the 2025 Lighthouse challenge? I don't know of any of the dates yet (maybe @neuronflow does) but I imagine the MLCube portion would start around July/August again, like the previous BraTS challenges.

sarthakpati · 2024-12-18T20:23:57Z

The goal is for us to have this PR ready for public testing around the end of Jan.

Since this PR ties in with another major effort, I am tagging @hasan7n for more clarification regarding the specific timeline.

neuronflow · 2024-12-19T00:48:55Z

should the inpainting metrics package be incorporated into GaNDLF then?

@vpchung I don't know the exact dates, but from my understanding it will be similar to 2023/2024. Spyros should know.

sarthakpati · 2024-12-19T13:47:38Z

> should the inpainting metrics package be incorporated into GaNDLF then?

Since the outputs are basically the same [1, 2], does it make sense to do so? Might as well have one less package to support from your end, right?

This does raise the question about the segmentation metrics, though.

Regardless, I believe this issue is now resolved, and any further discussion should be done on a separate thread. Thoughts @vpchung @neuronflow?

sarthakpati · 2024-12-19T15:53:51Z

FYI, I discovered a significant source of variation between the metrics calculated by inpainting and GaNDLF: the use of the voided image for normalization. This requirement had not been communicated to the original developer of the synthesis metrics, so they had only implemented the "Mask" option for normalization. I just added it in sarthakpati@1c4afd7, and here are the results:

'mae': 0.01582399569451809,
'mse': 0.000503217801451683,
'msle': 0.00018319336231797934,
'psnr': 32.42278289794922,
'psnr_01': 32.98244094848633,
'psnr_01_eps': 32.98244094848633,
'psnr_eps': 32.422786712646484,
'rmse': 0.022432517260313034,
'ssim': 0.996478796005249

As you can see, the results are pretty much the same as what inpainting calculates. An added advantage of the GaNDLF metrics is that normalization can be based on either a reference brain mask or a voided image [ref].
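
The effect of the normalization change can be checked directly from the numbers reported in this thread. The sketch below assumes that `psnr_01` follows the standard PSNR definition with a data range of 1, i.e. 10 * log10(1 / MSE); that is an assumption about the metric, although the reported values are consistent with it. It also compares how MAE and MSE changed: if the two runs differ only by a global intensity scale, MAE should change by that factor and MSE by its square. This is an inference from the reported numbers, not from the code.

```python
import math


def psnr_unit_range(mse):
    """PSNR in dB for a data range of 1: 10 * log10(1 / MSE)."""
    return 10.0 * math.log10(1.0 / mse)


# GaNDLF output before the voided-image option was added (from this thread).
before = {
    "mae": 0.0005344419041648507,
    "mse": 5.740170649914944e-07,
    "psnr_01": 62.41074752807617,
}

# GaNDLF output after the change (from this thread).
after = {
    "mae": 0.01582399569451809,
    "mse": 0.000503217801451683,
    "psnr_01": 32.98244094848633,
}

for label, metrics in (("before", before), ("after", after)):
    recomputed = psnr_unit_range(metrics["mse"])
    print(label, "recomputed:", round(recomputed, 4), "reported:", metrics["psnr_01"])

# A pure rescaling of intensities by s multiplies MAE by s and MSE by s**2.
mae_ratio = after["mae"] / before["mae"]
mse_ratio = after["mse"] / before["mse"]
print("MAE ratio:", round(mae_ratio, 3))
print("sqrt(MSE ratio):", round(math.sqrt(mse_ratio), 3))
```

Both ratios come out at roughly 29.6, which is what one would expect if the earlier run divided the intensities by a normalization constant about 29.6 times larger.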

github-actions bot added the no-issue-activity label Dec 13, 2024

sarthakpati removed the no-issue-activity label Dec 16, 2024

sarthakpati mentioned this issue Dec 19, 2024

Ensure synthesis metrics have an option to take voided image #981

Merged

11 tasks

sarthakpati closed this as completed in #981 Dec 20, 2024
