Configure Optimizer(s) (CL Main)

Modified: August 16, 2025

The optimizer is the component that manages the learning process by updating model parameters based on the computed gradients.

The optimizer is a sub-config under the experiment index config (CL Main). To configure a custom optimizer, create a YAML file in the optimizer/ folder. Because continual learning involves multiple tasks, each task is given an optimizer for training. We support both a uniform optimizer shared across all tasks and distinct optimizers for each task. Below are examples of the optimizer config for both cases.

Example

configs
β”œβ”€β”€ __init__.py
β”œβ”€β”€ entrance.yaml
β”œβ”€β”€ experiment
β”‚   β”œβ”€β”€ example_clmain_train.yaml
β”‚   └── ...
β”œβ”€β”€ optimizer
β”‚   β”œβ”€β”€ sgd_10_tasks.yaml
β”‚   └── sgd.yaml
...

Uniform Optimizer for All Tasks

configs/experiment/example_clmain_train.yaml
defaults:
  ...
  - /optimizer: sgd.yaml
  ...
configs/optimizer/sgd.yaml
_target_: torch.optim.SGD
_partial_: true # partially instantiate optimizer without 'params' argument. Make sure this is included in any case!
lr: 0.01
weight_decay: 0.0

Distinct Optimizer for Each Task

Distinct optimizers are specified as a list of optimizer configs. The length of the list must match the train_tasks field in the experiment index config (CL Main). Below is an example of distinct optimizers for 10 tasks.

configs/experiment/example_clmain_train.yaml
defaults:
  ...
  - /optimizer: sgd_10_tasks.yaml
  ...
configs/optimizer/sgd_10_tasks.yaml
- _target_: torch.optim.SGD
  _partial_: true # partially instantiate optimizer without 'params' argument. Make sure this is included in any case!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: true
  lr: 0.01
  weight_decay: 0.0
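
In the example above, all ten tasks happen to share the same hyperparameters, but the entries in the list may differ from one another. As a hypothetical sketch (the file name and hyperparameter values are only illustrative), a two-task experiment could train the first task with SGD and the second with Adam:

configs/optimizer/sgd_adam_2_tasks.yaml
- _target_: torch.optim.SGD
  _partial_: true # partial instantiation without 'params'; always required
  lr: 0.1
  momentum: 0.9
- _target_: torch.optim.Adam
  _partial_: true # partial instantiation without 'params'; always required
  lr: 0.001
  weight_decay: 0.0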

Supported Optimizers & Required Config Fields

In CLArena, we do not implement our own optimizers; instead, we use the built-in optimizers from PyTorch's torch.optim module.

Note

Optimization-based approaches in continual learning methodology focus on designing mechanisms that manipulate the optimization step. Typically, these approaches use different optimizers for different tasks. However, such changes to the optimization process can be integrated directly into the CL algorithm, so we do not design our own CL optimizers.

To choose an optimizer, set the _target_ field to the class name of the optimizer. For example, to use SGD, set _target_ to torch.optim.SGD. Also include _partial_: true (see the warning below). Each optimizer has its own hyperparameters and configuration, which means it has its own required fields. The required fields are the same as the arguments of the class specified by _target_; the arguments of each optimizer class can be found in the PyTorch documentation.

PyTorch Documentation (Built-In Optimizers)
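
For instance, to use Adam, the config sets _target_ to torch.optim.Adam and lists its constructor arguments as fields. Below is a sketch (the file name adam.yaml is hypothetical and the hyperparameter values are only illustrative; the field names mirror the arguments of torch.optim.Adam):

configs/optimizer/adam.yaml
_target_: torch.optim.Adam
_partial_: true # partially instantiate optimizer without 'params' argument. Make sure this is included in any case!
lr: 0.001
betas: [0.9, 0.999]
eps: 1.0e-08
weight_decay: 0.0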

Warning

Make sure to include the field _partial_: true to enable partial instantiation. A PyTorch optimizer needs the model parameters as an argument to be fully instantiated, but at configuration time those parameters are not yet available, so the optimizer can only be partially instantiated.
