
Tensorboard metrics not loaded properly #6907

Open
timr1101 opened this issue Sep 6, 2024 · 6 comments

timr1101 commented Sep 6, 2024

I have a problem where the metrics in TensorBoard are not loaded correctly (the metric column is always empty), even though the scalars themselves are saved correctly. I am working with torch.utils.tensorboard.

[Screenshot "tensorboard_metrics": the metric column in the HPARAMS tab is empty]

Relevant code:

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir=f'./logs/studies/{study_name}/')

# In the training loop:
writer.add_scalar(tag='validation/min_loss', scalar_value=min_val_loss, global_step=trial.number)

# Add the hyperparameters to the summary writer (args_dict is a dictionary with all hyperparameters)
writer.add_hparams(hparam_dict=args_dict, metric_dict={'validation/min_loss': min_val_loss}, run_name=run_name)
writer.close()
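One thing that may be worth checking here (an assumption on my part, not a confirmed cause): PyTorch's add_hparams writes the hparams summary and its metric scalars into a subdirectory of log_dir (named by run_name), so the HPARAMS plugin only populates the column if it finds both summaries laid out as it expects. A small stdlib-only helper (the function name is mine, purely illustrative) to list which run directories actually contain event files:

```python
import os

def runs_with_event_files(logdir):
    """Map each run directory under logdir to its TensorBoard event files.

    The HPARAMS metric column can stay empty when the hparams summary and
    the metric scalars end up in different run directories, so listing the
    directory layout is a quick sanity check.
    """
    runs = {}
    for root, _dirs, files in os.walk(logdir):
        events = [f for f in files if "tfevents" in f]
        if events:
            runs[os.path.relpath(root, logdir)] = sorted(events)
    return runs
```

For example, runs_with_event_files('./logs/studies/my_study') should show one entry for the top-level writer and one per run_name subdirectory created by add_hparams.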

JamesHollyer (Contributor) commented:
Are the metrics showing up in the Time Series or Scalars tabs? Did you try selecting the "show metrics" checkboxes?

timr1101 (Author) commented:

The scalars associated with the metrics are loaded correctly in both the TIME SERIES and SCALARS tabs. The only problem is that no metrics are displayed in the HPARAMS tab. When I select the "show metrics" checkboxes, a completely empty chart pops up.

JamesHollyer (Contributor) commented:

Wow, that is strange! I do not see why that would happen, and I cannot seem to reproduce it. Is this happening with other logs or just this one?

timr1101 (Author) commented Sep 18, 2024

Yes, it's weird. It doesn't seem to be a problem with these specific logs alone. I've also used other scalars as metrics, but that didn't change the result. It is perhaps also noteworthy that I encountered exactly the same problem with a completely different implementation, namely the code from the official TensorBoard guide to hyperparameter tuning (a TensorFlow implementation). The scalars were displayed correctly in the TIME SERIES and SCALARS tabs, but the column of the corresponding metric "Accuracy" in the HPARAMS tab remained empty.

[Screenshot "IMG_0258": HPARAMS tab with an empty Accuracy column]

Related code (from the official guide):


import tensorflow as tf
from tensorboard.plugins.hparams import api as hp


fashion_mnist = tf.keras.datasets.fashion_mnist

(x_train, y_train),(x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

HP_NUM_UNITS = hp.HParam('num_units', hp.Discrete([16, 32]))
HP_DROPOUT = hp.HParam('dropout', hp.RealInterval(0.1, 0.2))
HP_OPTIMIZER = hp.HParam('optimizer', hp.Discrete(['adam', 'sgd']))

METRIC_ACCURACY = 'accuracy'

with tf.summary.create_file_writer('logs/hparam_tuning').as_default():
  hp.hparams_config(
    hparams=[HP_NUM_UNITS, HP_DROPOUT, HP_OPTIMIZER],
    metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')],
  )

def train_test_model(hparams):
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(hparams[HP_NUM_UNITS], activation=tf.nn.relu),
    tf.keras.layers.Dropout(hparams[HP_DROPOUT]),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),
  ])
  model.compile(
      optimizer=hparams[HP_OPTIMIZER],
      loss='sparse_categorical_crossentropy',
      metrics=['accuracy'],
  )

  model.fit(x_train, y_train, epochs=1) # Run with 1 epoch to speed things up for demo purposes
  _, accuracy = model.evaluate(x_test, y_test)
  return accuracy

def run(run_dir, hparams):
  with tf.summary.create_file_writer(run_dir).as_default():
    hp.hparams(hparams)  # record the values used in this trial
    accuracy = train_test_model(hparams)
    tf.summary.scalar(METRIC_ACCURACY, accuracy, step=1)


session_num = 0

for num_units in HP_NUM_UNITS.domain.values:
  for dropout_rate in (HP_DROPOUT.domain.min_value, HP_DROPOUT.domain.max_value):
    for optimizer in HP_OPTIMIZER.domain.values:
      hparams = {
          HP_NUM_UNITS: num_units,
          HP_DROPOUT: dropout_rate,
          HP_OPTIMIZER: optimizer,
      }
      run_name = "run-%d" % session_num
      print('--- Starting trial: %s' % run_name)
      print({h.name: hparams[h] for h in hparams})
      run('logs/hparam_tuning/' + run_name, hparams)
      session_num += 1
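One more thing that might be worth ruling out (again an assumption, not a confirmed cause): in the hparams API, the metric tag registered via hp.Metric in hparams_config must exactly match the tag later passed to tf.summary.scalar, otherwise the HPARAMS column stays empty even though the scalar itself renders fine. A tiny stdlib-only helper (the function name is mine) to compare the two sets of tags:

```python
def unmatched_metric_tags(registered_tags, logged_tags):
    """Return metric tags registered in hparams_config that were never
    logged via tf.summary.scalar; each one would render as an empty
    column in the HPARAMS tab."""
    return sorted(set(registered_tags) - set(logged_tags))

# A typo or a case mismatch in the tag is enough to leave the column empty:
unmatched_metric_tags(['accuracy'], ['Accuracy'])  # → ['accuracy']
```

In the guide's code the tag is the same constant (METRIC_ACCURACY) in both places, so this should not be the cause there, but it is a cheap check for the PyTorch setup as well.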

JamesHollyer (Contributor) commented:
Is it possible for you to send me your log files?

timr1101 (Author) commented:
Sure. But since I'm currently on vacation, I can't do this until the beginning of next week.
