Continual Unlearning Full Experiment

Modified: October 6, 2025

The continual unlearning main experiment and its evaluation can only produce results for continual learning performance, not unlearning performance. To fully evaluate a continual unlearning model, reference experiments are required in addition to the main experiment:

  • Retrain: retrain continual learning from scratch, excluding the tasks requested to be unlearned.
  • Original: retrain continual learning from scratch, including all tasks.

Their results are used to compute the unlearning metrics Distribution Distance (DD) and Accuracy Difference (AD); this step is called the continual unlearning full evaluation. The entire pipeline (the main experiment, the reference experiments, and the full evaluation) is called the continual unlearning full experiment. Running and configuration instructions for each are given below.

Reference Retrain Experiment

Instead of constructing it manually, you can easily run the reference retrain experiment by specifying the CUL_REF_RETRAIN_EXPR indicator together with the main experiment config in the command:

clarena pipeline=CUL_REF_RETRAIN_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into a retrain experiment config by:

  • Setting output_dir to the subfolder refretrain/ under the output_dir of the main experiment.
  • Excluding the tasks listed in unlearning_requests from the train_tasks and eval_after_tasks fields.
  • Removing unlearning-related fields such as unlearning_requests and /cul_algorithm.
  • Switching /metrics and /callbacks to their continual learning counterparts.

For details, please check the source code.
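For intuition, below is a rough sketch of this rewriting on a toy main experiment config. The field values and task lists here are made up for illustration; the exact behavior is defined in the source code.

# main experiment config (before preprocessing), hypothetical values
output_dir: outputs/example_cul_main_expr
train_tasks: [1, 2, 3, 4, 5]
eval_after_tasks: [1, 2, 3, 4, 5]
unlearning_requests: ...          # suppose task 1 is requested to be unlearned

# derived retrain experiment config (after preprocessing)
output_dir: outputs/example_cul_main_expr/refretrain
train_tasks: [2, 3, 4, 5]         # unlearned task 1 excluded
eval_after_tasks: [2, 3, 4, 5]    # unlearned task 1 excluded here too
# unlearning_requests and /cul_algorithm are removed;
# /metrics and /callbacks are switched to CL counterparts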

Reference Original Experiment

Instead of constructing it manually, you can easily run the reference original experiment by specifying the CUL_REF_ORIGINAL_EXPR indicator together with the main experiment config in the command:

clarena pipeline=CUL_REF_ORIGINAL_EXPR index=<main-experiment-index-config-name>

It preprocesses the main experiment config into an original experiment config by:

  • Setting output_dir to the subfolder reforiginal/ under the output_dir of the main experiment.
  • Removing unlearning-related fields such as unlearning_requests and /cul_algorithm.
  • Switching /metrics and /callbacks to their continual learning counterparts.

Unlike the retrain experiment, no tasks are excluded: the original experiment keeps all tasks.

For details, please check the source code.

Full Evaluation

The continual unlearning full evaluation pipeline computes the unlearning metrics from the saved evaluation result files of the main and reference experiments. Its output results are summarized in Output Results (CUL).

Evaluation Pipeline

  • Evaluate Distribution Distance (DD) between the model trained in the main experiment and the model trained in the reference retrain experiment, on the CL test dataset. Save the data and figures.
  • Evaluate Accuracy Difference (AD) between the model trained in the main experiment and the model trained in the reference original experiment, on the CL test dataset. Save the data and figures.

Running

To run a continual unlearning full evaluation, specify the CUL_FULL_EVAL indicator in the command:

clarena pipeline=CUL_FULL_EVAL index=<index-config-name>

Configuration

To run a custom continual unlearning full evaluation, create a YAML file in the index/ folder as the index config. Below is an example.

Example

example_configs/index/example_cul_full_eval.yaml
# @package _global_
# make sure to include the above commented global setting!

# evaluation info
pipeline: CUL_FULL_EVAL
dd_eval_tasks: 5
ad_eval_tasks: 5
global_seed: 1

# evaluation target
main_model_path: outputs/example_cul_main_expr/2023-10-01_12-00-00/saved_models/cl_model.pth
refretrain_model_path: outputs/example_cul_main_expr/2023-10-01_12-00-00/refretrain/results/cl_model.pth
reforiginal_model_path: outputs/example_cul_main_expr/2023-10-01_12-00-00/reforiginal/results/cl_model.pth

# paradigm settings
cl_paradigm: TIL

# components
defaults: 
  - /cl_dataset: split_mnist.yaml
  - /trainer: cpu_eval.yaml
  - /metrics: cul_full_eval_default.yaml
  - /callbacks: eval_default.yaml
  - /hydra: default.yaml
  - /misc: default.yaml

# outputs
output_dir: outputs/example_cul_main_expr/2023-10-01_12-00-00/eval # output to the same folder as the experiment
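Assuming example_configs/ is your config directory, the example above can then be launched with:

clarena pipeline=CUL_FULL_EVAL index=example_cul_full_eval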

Required Config Fields

Below are the required config fields for the index config of a continual unlearning full evaluation. Each field is listed with its description, followed by its allowed values.

pipeline: The pipeline that clarena runs with this config.
  • Choose from the supported pipeline indicators
  • Only CUL_FULL_EVAL and CUL_FULL_EVAL_ATTACHED are allowed

dd_eval_tasks: The list of task IDs [1] on which to evaluate Distribution Distance (DD); see the example after this table for the two accepted forms.
  • List of integers: each at least 1 and at most the number of available tasks in the CL dataset
  • Integer T: equivalent to the list [1, ⋯, T]; at least 0 (no tasks to evaluate) and at most the number of available tasks in the CL dataset

ad_eval_tasks: The list of task IDs [1] on which to evaluate Accuracy Difference (AD).
  • Same allowed values as dd_eval_tasks

global_seed: The global seed for the entire evaluation.
  • Same as the seed argument of lightning.seed_everything()

main_model_path: The file path of the model to evaluate.
  • Relative to where you run the clarena command for the continual unlearning main experiment

refretrain_model_path: The file path of the reference retrain model.
  • Relative to where you run the clarena command for the reference retrain experiment
  • Optional; if not specified, the DD evaluation is skipped

reforiginal_model_path: The file path of the reference original model.
  • Relative to where you run the clarena command for the reference original experiment
  • Optional; if not specified, the AD evaluation is skipped

cl_paradigm: The continual learning paradigm.
  • 'TIL': Task-Incremental Learning (TIL)
  • 'CIL': Class-Incremental Learning (CIL)

/cl_dataset: The continual learning dataset that the model is evaluated on.
  • Choose from the sub-config YAML files in the cl_dataset/ folder
  • See Configure CL Dataset

/trainer: The PyTorch Lightning Trainer object that holds all configs for the testing process.
  • Choose from the sub-config YAML files in the trainer/ folder
  • See Configure Trainer

/metrics: The metrics to be monitored, logged, or visualized.
  • Choose from the sub-config YAML files in the metrics/ folder
  • See Configure Metrics

/lightning_loggers: The Lightning Loggers used to log metrics and results.
  • Choose from the sub-config YAML files in the lightning_loggers/ folder
  • See Configure Lightning Loggers

/callbacks: The callbacks applied to this evaluation, other than metric callbacks. Callbacks are additional actions hooked into various points of the evaluation.
  • Choose from the sub-config YAML files in the callbacks/ folder
  • See Configure Callbacks

/hydra: Configuration for Hydra.
  • Choose from the sub-config YAML files in the hydra/ folder
  • See Other Configs

/misc: Miscellaneous configs that are less related to the experiment.
  • Choose from the sub-config YAML files in the misc/ folder
  • See Other Configs

output_dir: The folder storing the evaluation results.
  • Relative to where you run the clarena command
  • We recommend setting it to the output_dir of the continual unlearning main experiment being evaluated
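For example, with a CL dataset of 5 tasks, the integer and list forms of dd_eval_tasks (and likewise ad_eval_tasks) are interchangeable:

dd_eval_tasks: 5                  # integer form: evaluate DD on tasks 1 through 5
# or, equivalently:
dd_eval_tasks: [1, 2, 3, 4, 5]    # list form: the same tasks, listed explicitly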
Note

The continual unlearning full evaluation is managed by the CULFullEvaluation class. To learn how these fields work, please refer to its source code.

Full Experiment

Instead of constructing and executing the above pipelines one by one manually, CLArena integrates them into a single command: the continual unlearning full experiment.

Running

To run a continual unlearning full experiment, specify the CUL_FULL_EXPR indicator with the main experiment config in the command:

clarena pipeline=CUL_FULL_EXPR index=<main-experiment-index-config-name>

This effectively runs:

  • clarena pipeline=CUL_MAIN_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CUL_REF_RETRAIN_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CUL_REF_ORIGINAL_EXPR index=<main-experiment-index-config-name>
  • clarena pipeline=CUL_FULL_EVAL index=<index-config-name>, where the config is constructed from the main and reference experiments as follows (see the source code, and the sketch after this list):
    • Sets eval_tasks to the train_tasks of the main experiment.
    • Aligns main_model_path, refretrain_model_path, and reforiginal_model_path with the paths where the main and reference experiments output their trained models.
    • Adds the unlearning metric callbacks CULDistributionDistance and CULAccuracyDifference.
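As a rough sketch (hypothetical values mirroring the example paths earlier on this page; the actual construction is defined in the source code), the auto-constructed full evaluation config resembles:

pipeline: CUL_FULL_EVAL
dd_eval_tasks: [1, 2, 3, 4, 5]    # set from train_tasks of the main experiment
ad_eval_tasks: [1, 2, 3, 4, 5]
main_model_path: <main output_dir>/saved_models/cl_model.pth
refretrain_model_path: <main output_dir>/refretrain/results/cl_model.pth
reforiginal_model_path: <main output_dir>/reforiginal/results/cl_model.pth
# plus the unlearning metric callbacks CULDistributionDistance and CULAccuracyDifference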

We also allow running the reference experiments separately and providing their models to the full experiment. This is useful when multiple main experiments share the same reference runs, allowing reuse without redundant retraining. To do this, specify additional arguments:

clarena pipeline=CUL_FULL_EXPR index=<main-experiment-index-config-name> +refretrain_model_path=... +reforiginal_model_path=...

For any reference experiment whose model path is provided, its execution is skipped and the provided results are used directly. You may specify either or both of the paths.


Footnotes

  1. Task IDs are integers starting from 1 and ending with the number of tasks in the CL dataset. Each corresponds to a task-specific dataset in the CL dataset.
