Continual Unlearning Full Experiment

Modified: October 6, 2025

The continual unlearning main experiment and its evaluation can only produce results for continual learning performance, not unlearning performance. To fully evaluate a continual unlearning model, reference experiments are required in addition to the main experiment:

  • Retrain: retrain continual learning from scratch, excluding the tasks requested to be unlearned.
  • Original: retrain continual learning from scratch, including all tasks.

Their results are used to compute the unlearning metrics Distribution Distance (DD) and Accuracy Difference (AD); this step is called the continual unlearning full evaluation. The entire pipeline (the main experiment, the reference experiments, and the full evaluation) is called the continual unlearning full experiment. Running and configuration instructions for each are given below.

Reference Retrain Experiment

Instead of constructing it manually, you can easily run the reference retrain experiment by specifying the CUL_REF_RETRAIN_EXPR indicator together with the main experiment config in the command:

clarena pipeline=CUL_REF_RETRAIN_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into a retrain experiment config by:

  • Setting output_dir to the subfolder refretrain/ under the output_dir of the main experiment.
  • Excluding the tasks listed in unlearning_requests from the train_tasks and eval_after_tasks fields.
  • Removing unlearning-related fields such as unlearning_requests and /cul_algorithm.
  • Switching /metrics and /callbacks to their continual learning counterparts.

For details, please check the source code.
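For intuition, below is a rough sketch of this rewriting on a toy main experiment config. The field values and task lists here are made up for illustration; the exact behavior is defined in the source code.

# main experiment config (before preprocessing), hypothetical values
output_dir: outputs/example_cul_main_expr
train_tasks: [1, 2, 3, 4, 5]
eval_after_tasks: [1, 2, 3, 4, 5]
unlearning_requests: ...          # suppose task 1 is requested to be unlearned

# derived retrain experiment config (after preprocessing)
output_dir: outputs/example_cul_main_expr/refretrain
train_tasks: [2, 3, 4, 5]         # unlearned task 1 excluded
eval_after_tasks: [2, 3, 4, 5]    # unlearned task 1 excluded here too
# unlearning_requests and /cul_algorithm are removed;
# /metrics and /callbacks are switched to CL counterparts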

Reference Original Experiment

Instead of constructing it manually, you can easily run the reference original experiment by specifying the CUL_REF_ORIGINAL_EXPR indicator together with the main experiment config in the command:

clarena pipeline=CUL_REF_ORIGINAL_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into an original experiment config by:

  • Setting output_dir to the subfolder reforiginal/ under the output_dir of the main experiment.
  • Removing unlearning-related fields such as unlearning_requests and /cul_algorithm.
  • Switching /metrics and /callbacks to their continual learning counterparts.

Unlike the retrain experiment, no tasks are excluded: the original experiment keeps all tasks.

For details, please check the source code.

Full Evaluation

The continual unlearning full evaluation pipeline computes the unlearning metrics from the saved evaluation result files of the main and reference experiments. Its output results are summarized in Output Results (CUL).

Evaluation Pipeline

  • Evaluate Distribution Distance (DD) between the model trained in the main experiment and the model trained in the reference retrain experiment, on the CL test dataset. Save the data and figures.
  • Evaluate Accuracy Difference (AD) between the model trained in the main experiment and the model trained in the reference original experiment, on the CL test dataset. Save the data and figures.

Running

To run a continual unlearning full evaluation, specify the CUL_FULL_EVAL indicator in the command:

clarena pipeline=CUL_FULL_EVAL index=<index-config-name>

Configuration

To run a custom continual unlearning full evaluation, create a YAML file in the index/ folder as the index config. Below is an example.

Example

example_configs/index/example_cul_full_eval.yaml
# @package _global_
# make sure to include the above commented global setting!

# evaluation info
pipeline: CUL_FULL_EVAL
dd_eval_tasks: 5
ad_eval_tasks: 5
global_seed: 1

# evaluation target
main_model_path: outputs/example_cul_main_expr/2023-10-01_12-00-00/saved_models/cl_model.pth
refretrain_model_path: outputs/example_cul_main_expr/2023-10-01_12-00-00/refretrain/results/cl_model.pth
reforiginal_model_path: outputs/example_cul_main_expr/2023-10-01_12-00-00/reforiginal/results/cl_model.pth

# paradigm settings
cl_paradigm: TIL

# components
defaults: 
  - /cl_dataset: split_mnist.yaml
  - /trainer: cpu_eval.yaml
  - /metrics: cul_full_eval_default.yaml
  - /callbacks: eval_default.yaml
  - /hydra: default.yaml
  - /misc: default.yaml

# outputs
output_dir: outputs/example_cul_main_expr/2023-10-01_12-00-00/eval # output to the same folder as the experiment
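Assuming example_configs/ is your config directory, the example above can then be launched with:

clarena pipeline=CUL_FULL_EVAL index=example_cul_full_eval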

Required Config Fields

Below are the required config fields for the index config of a continual unlearning full evaluation. Each field is listed with its description, followed by its allowed values.

pipeline: The pipeline that clarena runs with this config.
  • Choose from the supported pipeline indicators
  • Only CUL_FULL_EVAL and CUL_FULL_EVAL_ATTACHED are allowed

dd_eval_tasks: The list of task IDs [1] on which to evaluate Distribution Distance (DD); see the example after this table for the two accepted forms.
  • List of integers: each at least 1 and at most the number of available tasks in the CL dataset
  • Integer T: equivalent to the list [1, ⋯, T]; at least 0 (no tasks to evaluate) and at most the number of available tasks in the CL dataset

ad_eval_tasks: The list of task IDs [1] on which to evaluate Accuracy Difference (AD).
  • Same allowed values as dd_eval_tasks

global_seed: The global seed for the entire evaluation.
  • Same as the seed argument of lightning.seed_everything()

main_model_path: The file path of the model to evaluate.
  • Relative to where you run the clarena command for the continual unlearning main experiment

refretrain_model_path: The file path of the reference retrain model.
  • Relative to where you run the clarena command for the reference retrain experiment
  • Optional; if not specified, the DD evaluation is skipped

reforiginal_model_path: The file path of the reference original model.
  • Relative to where you run the clarena command for the reference original experiment
  • Optional; if not specified, the AD evaluation is skipped

cl_paradigm: The continual learning paradigm.
  • 'TIL': Task-Incremental Learning (TIL)
  • 'CIL': Class-Incremental Learning (CIL)

/cl_dataset: The continual learning dataset that the model is evaluated on.
  • Choose from the sub-config YAML files in the cl_dataset/ folder
  • See Configure CL Dataset

/trainer: The PyTorch Lightning Trainer object that holds all configs for the testing process.
  • Choose from the sub-config YAML files in the trainer/ folder
  • See Configure Trainer

/metrics: The metrics to be monitored, logged, or visualized.
  • Choose from the sub-config YAML files in the metrics/ folder
  • See Configure Metrics

/lightning_loggers: The Lightning Loggers used to log metrics and results.
  • Choose from the sub-config YAML files in the lightning_loggers/ folder
  • See Configure Lightning Loggers

/callbacks: The callbacks applied to this evaluation, other than metric callbacks. Callbacks are additional actions hooked into various points of the evaluation.
  • Choose from the sub-config YAML files in the callbacks/ folder
  • See Configure Callbacks

/hydra: Configuration for Hydra.
  • Choose from the sub-config YAML files in the hydra/ folder
  • See Other Configs

/misc: Miscellaneous configs that are less related to the experiment.
  • Choose from the sub-config YAML files in the misc/ folder
  • See Other Configs

output_dir: The folder storing the evaluation results.
  • Relative to where you run the clarena command
  • We recommend setting it to the output_dir of the continual unlearning main experiment being evaluated
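For example, with a CL dataset of 5 tasks, the integer and list forms of dd_eval_tasks (and likewise ad_eval_tasks) are interchangeable:

dd_eval_tasks: 5                  # integer form: evaluate DD on tasks 1 through 5
# or, equivalently:
dd_eval_tasks: [1, 2, 3, 4, 5]    # list form: the same tasks, listed explicitly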
Note

The continual unlearning full evaluation is managed by the CULFullEvaluation class. To learn how these fields work, please refer to its source code.

Full Experiment

Instead of constructing and executing the above pipelines one by one manually, CLArena integrates them into a single command: the continual unlearning full experiment.

Running

To run a continual unlearning full experiment, specify the CUL_FULL_EXPR indicator with the main experiment config in the command:

clarena pipeline=CUL_FULL_EXPR index=<main-experiment-index-config-name>

This effectively runs:

  • clarena pipeline=CUL_MAIN_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CUL_REF_RETRAIN_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CUL_REF_ORIGINAL_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CUL_FULL_EVAL index=<index-config-name>, where the config is constructed from the main and reference experiments as follows (see the source code, and the sketch after this list):
    • Sets eval_tasks to the train_tasks of the main experiment.
    • Aligns main_model_path, refretrain_model_path, and reforiginal_model_path with the paths where the main and reference experiments output their trained models.
    • Adds the unlearning metric callbacks CULDistributionDistance and CULAccuracyDifference.
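As a rough sketch (hypothetical values mirroring the example paths earlier on this page; the actual construction is defined in the source code), the auto-constructed full evaluation config resembles:

pipeline: CUL_FULL_EVAL
dd_eval_tasks: [1, 2, 3, 4, 5]    # set from train_tasks of the main experiment
ad_eval_tasks: [1, 2, 3, 4, 5]
main_model_path: <main output_dir>/saved_models/cl_model.pth
refretrain_model_path: <main output_dir>/refretrain/results/cl_model.pth
reforiginal_model_path: <main output_dir>/reforiginal/results/cl_model.pth
# plus the unlearning metric callbacks CULDistributionDistance and CULAccuracyDifference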

We also allow running the reference experiments separately and providing their models to the full experiment. This is useful when multiple main experiments share the same reference runs, allowing reuse without redundant retraining. To do this, specify additional arguments:

clarena pipeline=CUL_FULL_EXPR index=<main-experiment-index-config-name> +refretrain_model_path=... +reforiginal_model_path=...

For any reference experiment whose model path is provided, its execution is skipped and the provided results are used directly. You may specify either or both of the paths.


Footnotes

  1. Task IDs are integers starting from 1 and ending with the number of tasks in the CL dataset. Each corresponds to a task-specific dataset in the CL dataset.
