Configure Metrics (CUL)

Modified: August 16, 2025

Metrics are used to monitor the training and validation processes, and to evaluate the model and algorithm during the testing process.

Within the PyTorch Lightning framework, callbacks add extra actions and functionality at different points of an experiment: before, during, or after the training, validation, or testing process. The metrics in our package are implemented as metric callbacks, which can:

  • Calculate metrics and save their data to files.
  • Visualize metrics as plots from the saved data.
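As a minimal, illustrative sketch of this callback pattern (not the actual CLArena implementation): a metric callback accumulates a metric during testing and writes it out at the end. The dict keys logits and targets in outputs are assumptions about what the test step returns.

import csv

from lightning.pytorch.callbacks import Callback


class SketchAccuracyCallback(Callback):
    """Accumulate test accuracy and save it to a CSV file (sketch only)."""

    def __init__(self, save_dir: str, csv_name: str = "acc.csv"):
        self.save_dir = save_dir  # assumed to already exist
        self.csv_name = csv_name
        self.correct = 0
        self.total = 0

    def on_test_batch_end(
        self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0
    ):
        # Assumes the LightningModule's test_step returns a dict with
        # "logits" and "targets"; adjust to your module's actual outputs.
        logits, targets = outputs["logits"], outputs["targets"]
        self.correct += (logits.argmax(dim=-1) == targets).sum().item()
        self.total += targets.numel()

    def on_test_end(self, trainer, pl_module):
        # Save the metric to a file; a plotting step (e.g. with
        # matplotlib) could follow here.
        with open(f"{self.save_dir}/{self.csv_name}", "w", newline="") as f:
            csv.writer(f).writerow(["accuracy", self.correct / max(self.total, 1)])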

The details of these actions can be configured on each metric callback. Each group of metrics is organized as one metric callback; for example, CULDistributionDistance and CULAccuracyDifference correspond to the distribution distance (DD) and accuracy difference (AD) metrics of continual unlearning. Multiple metrics can be applied at the same time.

Continual unlearning is an experiment built on top of continual learning with unlearning capabilities. It therefore shares the same metrics as CL for measuring regular CL performance; please refer to the Configure Metrics (CL Main) section. The metrics that measure unlearning performance must be used in a CUL full evaluation experiment; please refer to CUL Full Evaluation.

Metrics is a sub-config under the experiment index config (CUL Main), as well as under the experiment index config (CUL Full Evaluation). To configure custom metrics, create a YAML file in the metrics/ folder. At the moment, we only support uniform metrics across all tasks. The example below shows how the metrics config is organized and referenced.

Example

configs
β”œβ”€β”€ __init__.py
β”œβ”€β”€ entrance.yaml
β”œβ”€β”€ experiment
β”‚   β”œβ”€β”€ example_culmain_train.yaml
β”‚   └── ...
β”œβ”€β”€ metrics
β”‚   β”œβ”€β”€ cl_default.yaml
...
configs/experiment/example_culmain_train.yaml
defaults:
  ...
  - /metrics: cl_default.yaml
  ...

The metrics config is a list of metric callback objects:

configs/metrics/cl_default.yaml
- _target_: clarena.metrics.CLAccuracy
  save_dir: ${output_dir}/results/
  test_acc_csv_name: acc.csv
  test_acc_matrix_plot_name: acc_matrix.png
  test_ave_acc_plot_name: ave_acc.png
- _target_: clarena.metrics.CLLoss
  save_dir: ${output_dir}/results/
  test_loss_cls_csv_name: loss_cls.csv
  test_loss_cls_matrix_plot_name: loss_cls_matrix.png
  test_ave_loss_cls_plot_name: ave_loss_cls.png
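The _target_ key and the defaults list follow Hydra's instantiation convention, which the config layout above suggests CLArena uses. As a sketch of what happens under the hood (assuming clarena and Hydra are installed, and not necessarily CLArena's internal wiring), each entry in the list is instantiated into a callback object:

from hydra.utils import instantiate
from omegaconf import OmegaConf

# In-memory equivalent of configs/metrics/cl_default.yaml, with the
# ${output_dir} interpolation replaced by a literal path for this sketch.
cfg = OmegaConf.create(
    [
        {
            "_target_": "clarena.metrics.CLAccuracy",
            "save_dir": "outputs/results/",
            "test_acc_csv_name": "acc.csv",
            "test_acc_matrix_plot_name": "acc_matrix.png",
            "test_ave_acc_plot_name": "ave_acc.png",
        },
    ]
)

# instantiate() resolves the dotted path in _target_ and calls that class
# with the remaining keys as keyword arguments; the resulting callbacks
# can then be handed to the Lightning Trainer.
metric_callbacks = [instantiate(c) for c in cfg]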

Supported Metrics & Required Config Fields

In CLArena, we implement many metric callbacks in the clarena.metrics module that you can use for CUL main and full evaluation experiments.

The _target_ field of each callback must be assigned the corresponding class name, such as clarena.metrics.CLAccuracy for CLAccuracy. Each metric callback has its own required fields, which are the same as the arguments of the class specified by _target_. The arguments of each metric callback class can be found in the API documentation.

API Reference (Metrics) · Source Code (Metrics)

Below is the full list of supported metric callbacks. Note that “Metric Callback” is exactly the class name assigned to the _target_ field.

CL Metrics

These metric callbacks can be applied to the CUL main experiment, because they can be applied to the CL main experiment.

Please refer to Supported Metrics & Required Config Fields in Configure Metrics (CL Main) for more details.

Unlearning Metrics

These metric callbacks can only be applied to the CUL full evaluation experiment.

Metric Callback: CULDistributionDistance

Description: Provides all actions related to the CUL distribution distance (DD) metric, which include:

  • Defining, initializing, and recording the DD metric.
  • Saving the DD metric to files.
  • Visualizing the DD metric as plots.

The callback can produce the following outputs:

  • CSV files for DD in each task.
  • A coloured plot for DD in each task.

Required Config Fields: Same as the CULDistributionDistance class arguments.

Metric Callback: CULAccuracyDifference

Description: Provides all actions related to the CUL accuracy difference (AD) metric, which include:

  • Defining, initializing, and recording the AD metric.
  • Saving the AD metric to files.
  • Visualizing the AD metric as plots.

The callback can produce the following outputs:

  • CSV files for AD in each task.
  • A coloured plot for AD in each task.

Required Config Fields: Same as the CULAccuracyDifference class arguments.
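For a CUL full evaluation experiment, a metrics config could list both unlearning callbacks in the same way as the CL example above. This is only a sketch: the filename and all field names below (including save_dir, which is modeled on the CL callbacks) are hypothetical placeholders; the actual fields are the class arguments listed in the API documentation.

configs/metrics/cul_full_evaluation.yaml (hypothetical sketch)
- _target_: clarena.metrics.CULDistributionDistance
  # Hypothetical fields modeled on the CL callbacks above; see the
  # CULDistributionDistance class arguments for the actual ones.
  save_dir: ${output_dir}/results/
  test_dd_csv_name: dd.csv
  test_dd_plot_name: dd.png
- _target_: clarena.metrics.CULAccuracyDifference
  # Hypothetical fields; see the CULAccuracyDifference class arguments.
  save_dir: ${output_dir}/results/
  test_ad_csv_name: ad.csv
  test_ad_plot_name: ad.png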