Continual Learning Full Experiment

Modified: October 6, 2025

The continual learning main experiment and evaluation produce only basic evaluation results. To fully evaluate a continual learning model, reference experiments are required in addition to the continual learning main experiment:

  • Joint Learning: A Multi-Task Learning (MTL) experiment in which all tasks are trained jointly on the mixed dataset (the CL dataset merged into one big dataset).
  • Independent Learning: Each task is trained independently on a separate copy of the model.
  • Random Learning: A randomly initialized model that is not trained on any task.

Their results are used to compute advanced metrics, namely Backward Transfer (BWT), Forward Transfer (FWT), and Forgetting Rate (FR); this step is called the continual learning full evaluation. The entire pipeline (main experiment, reference experiments, and full evaluation) is called the continual learning full experiment. Running and configuration instructions for each are given below.
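For orientation, below is one common way to define these metrics, chosen to be consistent with the reference experiments each metric draws on here; treat it as a hedged sketch, since the exact formulas CLArena implements are in its source code. In this sketch, a_{t,i} denotes the accuracy on task i after training up to task t, T is the number of tasks, and a_i^ind, a_i^joint, a_i^rand are the accuracies on task i from the independent, joint, and random reference experiments respectively.

$$ \mathrm{BWT} = \frac{1}{T-1} \sum_{i=1}^{T-1} \left( a_{T,i} - a_{i,i} \right) $$

$$ \mathrm{FWT} = \frac{1}{T} \sum_{i=1}^{T} \left( a_{i,i} - a_i^{\mathrm{ind}} \right) \quad \text{(one plausible form using the independent reference)} $$

$$ \mathrm{FR} = \frac{1}{T} \sum_{i=1}^{T} \left( \frac{a_{T,i} - a_i^{\mathrm{rand}}}{a_i^{\mathrm{joint}} - a_i^{\mathrm{rand}}} - 1 \right) \quad \text{(forgetting-ratio style, using the joint and random references)} $$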

Reference Joint Learning Experiment

Instead of constructing it manually, you can run the reference joint learning experiment by specifying the CL_REF_JOINT_EXPR pipeline indicator together with the main experiment's index config in the command:

clarena pipeline=CL_REF_JOINT_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into a joint learning config by:

  • Setting output_dir to the subfolder refjoint/ under the main experiment's output_dir.
  • Using /cl_dataset to construct the corresponding multi-task learning dataset clarena.mtl_datasets.MTLDatasetsFromCL.
  • Adding the field /mtl_algorithm and setting it to joint learning.
  • Removing fields related to continual learning, such as cl_paradigm, /cl_algorithm, and /cl_dataset.
  • Switching /metrics and /callbacks to their multi-task learning counterparts. (A hypothetical sketch of the result follows this list.)
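
For concreteness, the derived config might look like the following sketch. This is an illustration only: the field values below are assumptions, not the actual output of the preprocessing code.

# Hypothetical sketch of the derived joint learning config (illustrative only)
output_dir: outputs/example_cl_main_expr/2023-10-01_12-00-00/refjoint  # refjoint/ subfolder of the main output_dir
defaults:
  - /mtl_dataset: ...    # constructed from /cl_dataset via clarena.mtl_datasets.MTLDatasetsFromCL
  - /mtl_algorithm: ...  # added and set to joint learning
  - /metrics: ...        # switched to the MTL counterpart
  - /callbacks: ...      # switched to the MTL counterpart
# cl_paradigm, /cl_algorithm and /cl_dataset from the main config are removed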

For details, please check the source code.

Reference Independent Learning Experiment

Instead of constructing it manually, you can run the reference independent learning experiment by specifying the CL_REF_INDEPENDENT_EXPR pipeline indicator together with the main experiment's index config in the command:

clarena pipeline=CL_REF_INDEPENDENT_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into an independent learning config by:

  • Setting output_dir to the subfolder refindependent/ under the main experiment's output_dir.
  • Setting the field /cl_algorithm to independent learning.

For details, please check the source code.

Reference Random Learning Experiment

Instead of constructing it manually, you can run the reference random learning experiment by specifying the CL_REF_RANDOM_EXPR pipeline indicator together with the main experiment's index config in the command:

clarena pipeline=CL_REF_RANDOM_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into a random learning config by:

  • Setting output_dir to the subfolder refrandom/ under the main experiment's output_dir.
  • Setting the field /cl_algorithm to random learning.

For details, please check the source code.

Full Evaluation

The continual learning full evaluation pipeline computes the advanced metrics from the saved evaluation result files of the main and reference experiments. Its output results are summarized in Output Results (CL).

Evaluation Pipeline

  • Compute Backward Transfer (BWT) from the main experiment accuracy metrics in the saved CSV files. Save the data and figures.
  • Compute Forward Transfer (FWT) from the accuracy metrics of the main experiment and the reference independent learning experiment in the saved CSV files. Save the data and figures.
  • Compute Forgetting Rate (FR) from the accuracy metrics of the main experiment, the reference joint learning experiment, and the reference random learning experiment in the saved CSV files. Save the data and figures.

Running

To run a continual learning full evaluation, specify the CL_FULL_EVAL indicator in the command:

clarena pipeline=CL_FULL_EVAL index=<index-config-name>

Configuration

To run a custom continual learning full evaluation, create a YAML file in the index/ folder as the index config. Below is an example.

Example

example_configs/index/example_cl_full_eval.yaml
# @package _global_
# make sure to include the above commented global setting!

# evaluation info
pipeline: CL_FULL_EVAL
eval_tasks: 10

# evaluation target
main_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/results/acc.csv
refjoint_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refjoint/results/acc.csv
refindependent_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refindependent/results/acc.csv
refrandom_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refrandom/results/acc.csv

# components
defaults:
  - /hydra: default.yaml
  - /misc: default.yaml

# outputs
output_dir: outputs/example_cl_main_expr/2023-10-01_12-00-00  # output to the same folder as the experiment

bwt_save_dir: ${output_dir}/results
bwt_csv_name: bwt.csv
bwt_plot_name: bwt_plot.png

fwt_save_dir: ${output_dir}/results
fwt_csv_name: fwt.csv
fwt_plot_name: fwt_plot.png

fr_save_dir: ${output_dir}/results
fr_csv_name: fr.csv
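
With this file saved as example_configs/index/example_cl_full_eval.yaml, the evaluation could then be launched as follows (assuming example_configs/ is your config root, as the file path above suggests):

clarena pipeline=CL_FULL_EVAL index=example_cl_full_eval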

Required Config Fields

Below is the list of required config fields for the index config of a continual learning full evaluation. Each entry gives the field name, its description, and its allowed values.

pipeline
The pipeline that clarena runs with this config.
  • Choose from the supported pipeline indicators
  • Only CL_FULL_EVAL and CL_FULL_EVAL_ATTACHED are allowed

eval_tasks
The list of task IDs¹ to evaluate (see the example snippet after this table).
  • A list of integers: each task ID must be at least 1
  • An integer T: equivalent to the list [1, ⋯, T]; T must be at least 0 (0 means no task to evaluate)
  • These tasks must exist in the accuracy CSV files

main_acc_csv_path
The path to the main experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command

refjoint_acc_csv_path
The path to the reference joint learning experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command
  • Optional. If not specified, computing FR will be skipped

refindependent_acc_csv_path
The path to the reference independent learning experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command
  • Optional. If not specified, computing FWT will be skipped

refrandom_acc_csv_path
The path to the reference random learning experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command
  • Optional. If not specified, computing FR will be skipped

/hydra
Configuration for Hydra.
  • Choose from the sub-config YAML files in the hydra/ folder
  • Please refer to the Other Configs section

/misc
Miscellaneous configs that are less related to the experiment.
  • Choose from the sub-config YAML files in the misc/ folder
  • Please refer to the Other Configs section

output_dir
The folder storing the evaluation results.
  • Path relative to the directory where you run the clarena command

bwt_save_dir
The folder storing the BWT metric results.
  • Path relative to the directory where you run the clarena command

bwt_csv_name
The name of the CSV file storing the BWT metrics.
  • Path relative to bwt_save_dir

bwt_plot_name
The name of the figure file plotting the BWT metrics.
  • Path relative to bwt_save_dir
  • Optional. If not specified, the BWT plot figure will not be saved

fwt_save_dir
The folder storing the FWT metric results.
  • Path relative to the directory where you run the clarena command
  • Optional. Can be excluded when refindependent_acc_csv_path is not provided

fwt_csv_name
The name of the CSV file storing the FWT metrics.
  • Path relative to fwt_save_dir
  • Optional. Can be excluded when refindependent_acc_csv_path is not provided

fwt_plot_name
The name of the figure file plotting the FWT metrics.
  • Path relative to fwt_save_dir
  • Optional. If not specified, the FWT plot figure will not be saved; can be excluded when refindependent_acc_csv_path is not provided

fr_save_dir
The folder storing the FR metric results.
  • Path relative to the directory where you run the clarena command
  • Optional. Can be excluded when refjoint_acc_csv_path or refrandom_acc_csv_path is not provided

fr_csv_name
The name of the CSV file storing the FR metrics.
  • Path relative to fr_save_dir
  • Optional. Can be excluded when refjoint_acc_csv_path or refrandom_acc_csv_path is not provided
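
For instance, the two accepted forms of eval_tasks would look like this in the index config (the values are illustrative):

eval_tasks: 10  # an integer T, equivalent to [1, 2, ..., 10]
# or an explicit list of task IDs:
eval_tasks: [1, 3, 5]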
Note

The continual learning full evaluation is managed by a CLFullEvaluation class. To learn how these fields work, please refer to its source code.

Full Experiment

Instead of constructing and executing the above pipelines one by one, CLArena integrates them into a single command for the continual learning full experiment.

Running

To run a continual learning full experiment, specify the CL_FULL_EXPR indicator with the main experiment config in the command:

clarena pipeline=CL_FULL_EXPR index=<main-experiment-index-config-name>

This effectively runs:

  • clarena pipeline=CL_MAIN_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_REF_JOINT_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_REF_INDEPENDENT_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_REF_RANDOM_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_FULL_EVAL index=<index-config-name>, where the config is constructed from the main and reference experiments as follows (see the source code; a hypothetical sketch follows this list):
    • Set eval_tasks to the trained tasks (train_task) of the main experiment.
    • Align main_acc_csv_path, refjoint_acc_csv_path, refindependent_acc_csv_path, and refrandom_acc_csv_path with the paths where the main and reference experiments output their accuracy metrics data.
    • Set output_dir and all the save_dir fields to the output_dir of the main experiment, so that all results are saved in the same folder.
    • Set bwt_csv_name, fwt_csv_name, fr_csv_name, bwt_plot_name, fwt_plot_name, and fr_plot_name to the default names.
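
Put together, the auto-constructed full evaluation config would resemble the following hypothetical sketch (the timestamped paths and task count mirror the example config shown earlier and are illustrative only, not the actual generated values):

pipeline: CL_FULL_EVAL
eval_tasks: 10  # taken from the main experiment's trained tasks
main_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/results/acc.csv
refjoint_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refjoint/results/acc.csv
refindependent_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refindependent/results/acc.csv
refrandom_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refrandom/results/acc.csv
output_dir: outputs/example_cl_main_expr/2023-10-01_12-00-00  # same folder as the main experiment
bwt_save_dir: ${output_dir}/results
fwt_save_dir: ${output_dir}/results
fr_save_dir: ${output_dir}/results
bwt_csv_name: bwt.csv  # default names
fwt_csv_name: fwt.csv
fr_csv_name: fr.csv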

We also allow running the reference experiments separately and providing their results to the full experiment (useful when multiple main experiments share the same reference runs, which can then be reused without redundant retraining). To do this, specify additional arguments:

clarena pipeline=CL_FULL_EXPR index=<main-experiment-index-config-name> +refjoint_acc_csv_path=... +refindependent_acc_csv_path=... +refrandom_acc_csv_path=...

For any reference experiment whose accuracy CSV path is provided, its execution is skipped and the provided results are used directly. You may specify any subset of these paths.
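
For example, to reuse a previously computed independent learning reference while still running the other reference experiments (the CSV path below is hypothetical):

clarena pipeline=CL_FULL_EXPR index=<main-experiment-index-config-name> +refindependent_acc_csv_path=outputs/shared_refs/refindependent/results/acc.csv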


Footnotes

  1. Task IDs are integers starting from 1 and ending at the number of tasks in the CL dataset. Each task ID corresponds to a task-specific dataset in the CL dataset.
