Bug Report: Redundant Conditional Block for MCD_mode == "plain" #147

@Zijun-Mo

Description

In the function [average_mcd](https://github.com/modelscope/ClearerVoice-Studio/blob/main/speechscore/scores/mcd.py), the condition

if MCD_mode == "plain":

is checked twice: once as an outer guard and again immediately inside that guard. The inner check is redundant, and the elif branches attached to it are dead code.

Here is the problematic section:

if MCD_mode == "plain":
    # pad 0
    if len(loaded_ref_wav) < len(loaded_syn_wav):
        loaded_ref_wav = np.pad(loaded_ref_wav, (0, len(loaded_syn_wav) - len(loaded_ref_wav)))
    else:
        loaded_syn_wav = np.pad(loaded_syn_wav, (0, len(loaded_ref_wav) - len(loaded_syn_wav)))

    # extract MCEP features
    ref_mcep_vec = self.wav2mcep_numpy(loaded_ref_wav, score_rate)
    syn_mcep_vec = self.wav2mcep_numpy(loaded_syn_wav, score_rate)

    if MCD_mode == "plain":  # ← redundant check
        path = []
        for i in range(len(ref_mcep_vec)):
            path.append((i, i))
    elif MCD_mode == "dtw":
        _, path = fastdtw(ref_mcep_vec[:, 1:], syn_mcep_vec[:, 1:], dist=euclidean)
    elif MCD_mode == "dtw_sl":
        cof = len(ref_mcep_vec)/len(syn_mcep_vec) if len(ref_mcep_vec)>len(syn_mcep_vec) else len(syn_mcep_vec)/len(ref_mcep_vec)
        _, path = fastdtw(ref_mcep_vec[:, 1:], syn_mcep_vec[:, 1:], dist=euclidean)

Expected Behavior

The outer if MCD_mode == "plain": should likely be removed or changed to allow all three modes (plain, dtw, dtw_sl) to follow a similar processing pipeline (padding + feature extraction).
Currently, the inner conditions for "dtw" and "dtw_sl" will never be executed, because they are nested inside a block that only runs if MCD_mode == "plain".

Actual Behavior

  • When MCD_mode is "dtw" or "dtw_sl", the code inside the inner elif blocks is never reached.
  • As a result, DTW-based MCD computation does not execute as intended.
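The reachability problem can be reproduced without any audio. The following is a minimal sketch that mimics only the control flow quoted in this report; `select_path_nested` and the string placeholders are hypothetical and stand in for the real feature extraction and `fastdtw` calls.

```python
def select_path_nested(mode, n_frames):
    """Mimic the nested control flow from the report (not the real function)."""
    path = None  # stays None when no branch assigns a path
    if mode == "plain":
        if mode == "plain":
            # One-to-one frame alignment, as in the quoted snippet.
            path = [(i, i) for i in range(n_frames)]
        elif mode == "dtw":
            path = "dtw"  # unreachable: the outer guard already required "plain"
        elif mode == "dtw_sl":
            path = "dtw_sl"  # unreachable for the same reason
    return path


for mode in ("plain", "dtw", "dtw_sl"):
    print(mode, select_path_nested(mode, 3))
```

Only "plain" yields a path here; the other two modes fall through with `None`, which matches the behavior described in the bullets above.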

Proposed Fix

Move the MCEP extraction part outside of the "plain" condition and structure it like this:

# pad shorter wav
if len(loaded_ref_wav) < len(loaded_syn_wav):
    loaded_ref_wav = np.pad(loaded_ref_wav, (0, len(loaded_syn_wav) - len(loaded_ref_wav)))
else:
    loaded_syn_wav = np.pad(loaded_syn_wav, (0, len(loaded_ref_wav) - len(loaded_syn_wav)))

# extract features
ref_mcep_vec = self.wav2mcep_numpy(loaded_ref_wav, score_rate)
syn_mcep_vec = self.wav2mcep_numpy(loaded_syn_wav, score_rate)

# choose path strategy
if MCD_mode == "plain":
    path = [(i, i) for i in range(len(ref_mcep_vec))]
elif MCD_mode == "dtw":
    _, path = fastdtw(ref_mcep_vec[:, 1:], syn_mcep_vec[:, 1:], dist=euclidean)
elif MCD_mode == "dtw_sl":
    cof = len(ref_mcep_vec) / len(syn_mcep_vec) if len(ref_mcep_vec) > len(syn_mcep_vec) else len(syn_mcep_vec) / len(ref_mcep_vec)
    _, path = fastdtw(ref_mcep_vec[:, 1:], syn_mcep_vec[:, 1:], dist=euclidean)
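One way to check that the flattened structure reaches every branch is to isolate the mode dispatch and pass the aligner in as an argument. This is only a sketch under assumptions: `choose_path` and `fake_align` are hypothetical names, `fake_align` is a toy stand-in for `fastdtw` that returns a `(distance, path)` pair, and the `cof` computation is left out. Raising on an unknown mode is an added suggestion, not something the original code does.

```python
def choose_path(mode, ref_frames, syn_frames, align):
    """Flattened mode dispatch; `align` is a hypothetical stand-in for fastdtw."""
    if mode == "plain":
        # One-to-one alignment over the reference frames.
        return [(i, i) for i in range(len(ref_frames))]
    if mode in ("dtw", "dtw_sl"):
        # Both DTW modes obtain their path from the aligner.
        _, path = align(ref_frames, syn_frames)
        return path
    # Fail loudly so `path` is never left unassigned.
    raise ValueError(f"unknown MCD_mode: {mode!r}")


def fake_align(ref, syn):
    """Toy aligner: pair frames index by index up to the shorter length."""
    n = min(len(ref), len(syn))
    return 0.0, [(i, i) for i in range(n)]


for mode in ("plain", "dtw", "dtw_sl"):
    print(mode, choose_path(mode, [1, 2, 3], [1, 2], fake_align))
```

With the dispatch flattened, all three modes return a path, and a misspelled mode raises an error instead of silently skipping the computation.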

Additional Context

This bug likely causes incorrect MCD computation for the "dtw" and "dtw_sl" modes, leading to invalid or NaN results during evaluation.
