Continual Learning Full Experiment

Modified: October 6, 2025

The continual learning main experiment and evaluation produce only basic evaluation results. To fully evaluate a continual learning model, reference experiments are required in addition to the continual learning main experiment:

  • Joint Learning: A Multi-Task Learning (MTL) experiment in which all tasks are trained jointly on the mixed dataset (the CL dataset merged into one big dataset).
  • Independent Learning: Each task is trained independently on a separate copy of the model.
  • Random Learning: A randomly initialized model that is not trained on any task.

Their results are used to compute advanced metrics, namely Backward Transfer (BWT), Forward Transfer (FWT), and Forgetting Rate (FR); this step is called the continual learning full evaluation. The entire pipeline (main experiment, reference experiments, and full evaluation) is called the continual learning full experiment. Running and configuration instructions for each are given below.
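For orientation, below is one common way to define these metrics, chosen to be consistent with the reference experiments each metric draws on here; treat it as a hedged sketch, since the exact formulas CLArena implements are in its source code. In this sketch, a_{t,i} denotes the accuracy on task i after training up to task t, T is the number of tasks, and a_i^ind, a_i^joint, a_i^rand are the accuracies on task i from the independent, joint, and random reference experiments respectively.

$$ \mathrm{BWT} = \frac{1}{T-1} \sum_{i=1}^{T-1} \left( a_{T,i} - a_{i,i} \right) $$

$$ \mathrm{FWT} = \frac{1}{T} \sum_{i=1}^{T} \left( a_{i,i} - a_i^{\mathrm{ind}} \right) \quad \text{(one plausible form using the independent reference)} $$

$$ \mathrm{FR} = \frac{1}{T} \sum_{i=1}^{T} \left( \frac{a_{T,i} - a_i^{\mathrm{rand}}}{a_i^{\mathrm{joint}} - a_i^{\mathrm{rand}}} - 1 \right) \quad \text{(forgetting-ratio style, using the joint and random references)} $$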

Reference Joint Learning Experiment

Instead of constructing it manually, you can run the reference joint learning experiment by specifying the CL_REF_JOINT_EXPR pipeline indicator together with the main experiment's index config in the command:

clarena pipeline=CL_REF_JOINT_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into a joint learning config by:

  • Setting output_dir to the subfolder refjoint/ under the main experiment's output_dir.
  • Using /cl_dataset to construct the corresponding multi-task learning dataset clarena.mtl_datasets.MTLDatasetsFromCL.
  • Adding the field /mtl_algorithm and setting it to joint learning.
  • Removing fields related to continual learning, such as cl_paradigm, /cl_algorithm, and /cl_dataset.
  • Switching /metrics and /callbacks to their multi-task learning counterparts. (A hypothetical sketch of the result follows this list.)
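
For concreteness, the derived config might look like the following sketch. This is an illustration only: the field values below are assumptions, not the actual output of the preprocessing code.

# Hypothetical sketch of the derived joint learning config (illustrative only)
output_dir: outputs/example_cl_main_expr/2023-10-01_12-00-00/refjoint  # refjoint/ subfolder of the main output_dir
defaults:
  - /mtl_dataset: ...    # constructed from /cl_dataset via clarena.mtl_datasets.MTLDatasetsFromCL
  - /mtl_algorithm: ...  # added and set to joint learning
  - /metrics: ...        # switched to the MTL counterpart
  - /callbacks: ...      # switched to the MTL counterpart
# cl_paradigm, /cl_algorithm and /cl_dataset from the main config are removed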

For details, please check the source code.

Reference Independent Learning Experiment

Instead of constructing it manually, you can run the reference independent learning experiment by specifying the CL_REF_INDEPENDENT_EXPR pipeline indicator together with the main experiment's index config in the command:

clarena pipeline=CL_REF_INDEPENDENT_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into an independent learning config by:

  • Setting output_dir to the subfolder refindependent/ under the main experiment's output_dir.
  • Setting the field /cl_algorithm to independent learning.

For details, please check the source code.

Reference Random Learning Experiment

Instead of constructing it manually, you can run the reference random learning experiment by specifying the CL_REF_RANDOM_EXPR pipeline indicator together with the main experiment's index config in the command:

clarena pipeline=CL_REF_RANDOM_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into a random learning config by:

  • Setting output_dir to the subfolder refrandom/ under the main experiment's output_dir.
  • Setting the field /cl_algorithm to random learning.

For details, please check the source code.

Full Evaluation

The continual learning full evaluation pipeline computes the advanced metrics from the saved evaluation result files of the main and reference experiments. Its output results are summarized in Output Results (CL).

Evaluation Pipeline

  • Compute Backward Transfer (BWT) from the main experiment accuracy metrics in the saved CSV files. Save the data and figures.
  • Compute Forward Transfer (FWT) from the accuracy metrics of the main experiment and the reference independent learning experiment in the saved CSV files. Save the data and figures.
  • Compute Forgetting Rate (FR) from the accuracy metrics of the main experiment, the reference joint learning experiment, and the reference random learning experiment in the saved CSV files. Save the data and figures.

Running

To run a continual learning full evaluation, specify the CL_FULL_EVAL indicator in the command:

clarena pipeline=CL_FULL_EVAL index=<index-config-name>

Configuration

To run a custom continual learning full evaluation, create a YAML file in the index/ folder as the index config. Below is an example.

Example

example_configs/index/example_cl_full_eval.yaml
# @package _global_
# make sure to include the above commented global setting!

# evaluation info
pipeline: CL_FULL_EVAL
eval_tasks: 10

# evaluation target
main_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/results/acc.csv
refjoint_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refjoint/results/acc.csv
refindependent_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refindependent/results/acc.csv
refrandom_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refrandom/results/acc.csv

# components
defaults:
  - /hydra: default.yaml
  - /misc: default.yaml

# outputs
output_dir: outputs/example_cl_main_expr/2023-10-01_12-00-00  # output to the same folder as the experiment

bwt_save_dir: ${output_dir}/results
bwt_csv_name: bwt.csv
bwt_plot_name: bwt_plot.png

fwt_save_dir: ${output_dir}/results
fwt_csv_name: fwt.csv
fwt_plot_name: fwt_plot.png

fr_save_dir: ${output_dir}/results
fr_csv_name: fr.csv
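
With this file saved as example_configs/index/example_cl_full_eval.yaml, the evaluation could then be launched as follows (assuming example_configs/ is your config root, as the file path above suggests):

clarena pipeline=CL_FULL_EVAL index=example_cl_full_eval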

Required Config Fields

Below is the list of required config fields for the index config of a continual learning full evaluation. Each entry gives the field name, its description, and its allowed values.

pipeline
The pipeline that clarena runs with this config.
  • Choose from the supported pipeline indicators
  • Only CL_FULL_EVAL and CL_FULL_EVAL_ATTACHED are allowed

eval_tasks
The list of task IDs¹ to evaluate (see the example snippet after this table).
  • A list of integers: each task ID must be at least 1
  • An integer T: equivalent to the list [1, ⋯, T]; T must be at least 0 (0 means no task to evaluate)
  • These tasks must exist in the accuracy CSV files

main_acc_csv_path
The path to the main experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command

refjoint_acc_csv_path
The path to the reference joint learning experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command
  • Optional. If not specified, computing FR will be skipped

refindependent_acc_csv_path
The path to the reference independent learning experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command
  • Optional. If not specified, computing FWT will be skipped

refrandom_acc_csv_path
The path to the reference random learning experiment accuracy CSV file.
  • Path relative to the directory where you run the clarena command
  • Optional. If not specified, computing FR will be skipped

/hydra
Configuration for Hydra.
  • Choose from the sub-config YAML files in the hydra/ folder
  • Please refer to the Other Configs section

/misc
Miscellaneous configs that are less related to the experiment.
  • Choose from the sub-config YAML files in the misc/ folder
  • Please refer to the Other Configs section

output_dir
The folder storing the evaluation results.
  • Path relative to the directory where you run the clarena command

bwt_save_dir
The folder storing the BWT metric results.
  • Path relative to the directory where you run the clarena command

bwt_csv_name
The name of the CSV file storing the BWT metrics.
  • Path relative to bwt_save_dir

bwt_plot_name
The name of the figure file plotting the BWT metrics.
  • Path relative to bwt_save_dir
  • Optional. If not specified, the BWT plot figure will not be saved

fwt_save_dir
The folder storing the FWT metric results.
  • Path relative to the directory where you run the clarena command
  • Optional. Can be excluded when refindependent_acc_csv_path is not provided

fwt_csv_name
The name of the CSV file storing the FWT metrics.
  • Path relative to fwt_save_dir
  • Optional. Can be excluded when refindependent_acc_csv_path is not provided

fwt_plot_name
The name of the figure file plotting the FWT metrics.
  • Path relative to fwt_save_dir
  • Optional. If not specified, the FWT plot figure will not be saved; can be excluded when refindependent_acc_csv_path is not provided

fr_save_dir
The folder storing the FR metric results.
  • Path relative to the directory where you run the clarena command
  • Optional. Can be excluded when refjoint_acc_csv_path or refrandom_acc_csv_path is not provided

fr_csv_name
The name of the CSV file storing the FR metrics.
  • Path relative to fr_save_dir
  • Optional. Can be excluded when refjoint_acc_csv_path or refrandom_acc_csv_path is not provided
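
For instance, the two accepted forms of eval_tasks would look like this in the index config (the values are illustrative):

eval_tasks: 10  # an integer T, equivalent to [1, 2, ..., 10]
# or an explicit list of task IDs:
eval_tasks: [1, 3, 5]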
Note

The continual learning full evaluation is managed by a CLFullEvaluation class. To learn how these fields work, please refer to its source code.

Full Experiment

Instead of constructing and executing the above pipelines one by one, CLArena integrates them into a single command for the continual learning full experiment.

Running

To run a continual learning full experiment, specify the CL_FULL_EXPR indicator with the main experiment config in the command:

clarena pipeline=CL_FULL_EXPR index=<main-experiment-index-config-name>

This effectively runs:

  • clarena pipeline=CL_MAIN_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_REF_JOINT_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_REF_INDEPENDENT_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_REF_RANDOM_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CL_FULL_EVAL index=<index-config-name>, where the config is constructed from the main and reference experiments as follows (see the source code; a hypothetical sketch follows this list):
    • Set eval_tasks to the trained tasks (train_task) of the main experiment.
    • Align main_acc_csv_path, refjoint_acc_csv_path, refindependent_acc_csv_path, and refrandom_acc_csv_path with the paths where the main and reference experiments output their accuracy metrics data.
    • Set output_dir and all the save_dir fields to the output_dir of the main experiment, so that all results are saved in the same folder.
    • Set bwt_csv_name, fwt_csv_name, fr_csv_name, bwt_plot_name, fwt_plot_name, and fr_plot_name to the default names.
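
Put together, the auto-constructed full evaluation config would resemble the following hypothetical sketch (the timestamped paths and task count mirror the example config shown earlier and are illustrative only, not the actual generated values):

pipeline: CL_FULL_EVAL
eval_tasks: 10  # taken from the main experiment's trained tasks
main_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/results/acc.csv
refjoint_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refjoint/results/acc.csv
refindependent_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refindependent/results/acc.csv
refrandom_acc_csv_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/refrandom/results/acc.csv
output_dir: outputs/example_cl_main_expr/2023-10-01_12-00-00  # same folder as the main experiment
bwt_save_dir: ${output_dir}/results
fwt_save_dir: ${output_dir}/results
fr_save_dir: ${output_dir}/results
bwt_csv_name: bwt.csv  # default names
fwt_csv_name: fwt.csv
fr_csv_name: fr.csv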

We also allow running the reference experiments separately and providing their results to the full experiment (useful when multiple main experiments share the same reference runs, which can then be reused without redundant retraining). To do this, specify additional arguments:

clarena pipeline=CL_FULL_EXPR index=<main-experiment-index-config-name> +refjoint_acc_csv_path=... +refindependent_acc_csv_path=... +refrandom_acc_csv_path=...

For any reference experiment whose accuracy CSV path is provided, its execution is skipped and the provided results are used directly. You may specify any subset of these paths.
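
For example, to reuse a previously computed independent learning reference while still running the other reference experiments (the CSV path below is hypothetical):

clarena pipeline=CL_FULL_EXPR index=<main-experiment-index-config-name> +refindependent_acc_csv_path=outputs/shared_refs/refindependent/results/acc.csv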


Footnotes

  1. Task IDs are integers starting from 1 and ending at the number of tasks in the CL dataset. Each task ID corresponds to a task-specific dataset in the CL dataset.
