Save and Evaluate Model (CL Main)

Modified: October 6, 2025

For the continual learning main experiment, CLArena supports saving the model after each task is trained and evaluating the saved model separately.

1 Save Model

To save the model after training each task, or after all tasks are trained, enable the clarena.callbacks.SaveModels callback. Please refer to the Configure Callbacks section.

Warning

Lightning checkpointing is not used to save models for later evaluation in CLArena. Loading a checkpoint requires the model class, whereas evaluation should work regardless of the model's type and settings. clarena.callbacks.SaveModels instead uses torch.save() on the whole model object, so evaluation can later restore it with torch.load() without specifying the model class.
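
In plain PyTorch terms, the mechanism looks roughly like the sketch below. This is a minimal illustration, not CLArena's actual callback code; the model and file name are placeholders.

import torch
from torch import nn

# A placeholder model for illustration; CLArena saves whatever model object
# the experiment actually trained.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))

# Save the whole object (as clarena.callbacks.SaveModels does via
# torch.save()), not just a state_dict:
torch.save(model, "cl_model.pth")

# Evaluation can then restore the model without importing or even knowing
# its class. Note: PyTorch >= 2.6 defaults torch.load() to weights_only=True,
# so loading a fully pickled module requires weights_only=False.
restored = torch.load("cl_model.pth", weights_only=False)
restored.eval()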

2 Evaluate Model

The continual learning main evaluation pipeline evaluates a saved model trained by a continual learning main experiment. Its output results are summarized in Output Results (CL).

Running

To run a continual learning main evaluation, specify the CL_MAIN_EVAL pipeline indicator in the command:

clarena pipeline=CL_MAIN_EVAL index=<index-config-name>
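
For example, assuming the example index config shown in the Configuration section below is saved as index/example_cl_main_eval.yaml:

clarena pipeline=CL_MAIN_EVAL index=example_cl_main_eval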

Configuration

To run a custom continual learning main evaluation, create a YAML file in the index/ folder as the index config. Below is an example.

Example

example_configs/index/example_cl_main_eval.yaml
# @package _global_
# This Hydra package directive is required: it places the keys of this file
# at the top level of the composed config rather than nesting them under index.

# pipeline info
pipeline: CL_MAIN_EVAL
eval_tasks: 10
global_seed: 1

# evaluation target
main_model_path: outputs/example_cl_main_expr/2023-10-01_12-00-00/saved_models/cl_model.pth

# paradigm settings
cl_paradigm: TIL

# components
defaults:
  - /cl_dataset: permuted_mnist.yaml
  - /trainer: cpu_eval.yaml
  - /metrics: cl_main_eval_default.yaml
  - /callbacks: eval_default.yaml
  - /hydra: default.yaml
  - /misc: default.yaml

# outputs
output_dir: outputs/example_cl_main_expr/2023-10-01_12-00-00/eval # output to the same folder as the experiment

Required Config Fields

Below are the required config fields for the index config of a continual learning main evaluation, each listed with its description and allowed values.

pipeline
  The pipeline that clarena runs with this config.
  • Choose from the supported pipeline indicators
  • Only CL_MAIN_EVAL is allowed here
eval_tasks
  The task IDs¹ to evaluate (see the sketch after this list).
  • A list of integers: each task ID at least 1, no more than the number of available tasks in the CL dataset
  • An integer T: equivalent to the list [1, ⋯, T]. At least 0 (no task to evaluate), no more than the number of available tasks in the CL dataset
global_seed
  The global seed for the entire evaluation.
  • Same as the seed argument of lightning.seed_everything()
main_model_path
  The file path of the model to evaluate.
  • Relative to where you ran the clarena command for the continual learning main experiment
cl_paradigm
  The continual learning paradigm.
  • ‘TIL’: Task-Incremental Learning (TIL)
  • ‘CIL’: Class-Incremental Learning (CIL)
/cl_dataset
  The continual learning dataset that the model is evaluated on.
  • Choose from sub-config YAML files in the cl_dataset/ folder
  • See Configure CL Dataset
/trainer
  The PyTorch Lightning Trainer object that holds all configs for the testing process.
  • Choose from sub-config YAML files in the trainer/ folder
  • See Configure Trainer
/metrics
  The metrics to be monitored, logged, or visualized.
  • Choose from sub-config YAML files in the metrics/ folder
  • See Configure Metrics
/lightning_loggers
  The Lightning Loggers used to log metrics and results.
  • Choose from sub-config YAML files in the lightning_loggers/ folder
  • See Configure Lightning Loggers
/callbacks
  The callbacks applied to this evaluation (other than metric callbacks). Callbacks are additional actions integrated at various points of the evaluation.
  • Choose from sub-config YAML files in the callbacks/ folder
  • See Configure Callbacks
/hydra
  Configuration for Hydra.
  • Choose from sub-config YAML files in the hydra/ folder
  • See Other Configs
/misc
  Miscellaneous configs that are less related to the experiment.
  • Choose from sub-config YAML files in the misc/ folder
  • See Other Configs
output_dir
  The folder storing the evaluation results.
  • Relative to where you run the clarena command
  • We recommend setting it to the output_dir of the continual learning main experiment being evaluated
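
To make the two accepted forms of eval_tasks concrete, here is a sketch of the semantics described above. The helper name is hypothetical; CLArena's actual parsing lives in the CLMainEvaluation class.

def normalize_eval_tasks(eval_tasks, num_tasks):
    """Expand eval_tasks into an explicit list of task IDs."""
    if isinstance(eval_tasks, int):
        # An integer T stands for tasks 1..T; T = 0 means no task to evaluate.
        if not 0 <= eval_tasks <= num_tasks:
            raise ValueError(f"T must be in [0, {num_tasks}], got {eval_tasks}")
        return list(range(1, eval_tasks + 1))
    # Otherwise eval_tasks is a list of task IDs, each in [1, num_tasks].
    if any(not 1 <= t <= num_tasks for t in eval_tasks):
        raise ValueError(f"task IDs must be in [1, {num_tasks}], got {eval_tasks}")
    return list(eval_tasks)

print(normalize_eval_tasks(10, 10))      # [1, 2, ..., 10]
print(normalize_eval_tasks([1, 3], 10))  # [1, 3]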
Note

The continual learning main evaluation is managed by the CLMainEvaluation class. To learn how these fields work, please refer to its source code.


Footnotes

  1. Task IDs are integers starting from 1 and ending at the number of tasks in the CL dataset. Each corresponds to a task-specific dataset in the CL dataset.

©️ 2025 Pengxiang Wang. All rights reserved.