Configure Optimizer
We use PyTorch Optimizer objects to train models within the PyTorch and Lightning framework.
As continual learning involves multiple tasks, each task is supposed to be given an optimizer for training. We can either use a uniform optimizer across all tasks or assign a distinct optimizer to each task.
Configure Uniform Optimizer For All Tasks
To configure a uniform optimizer for all tasks in your experiment, link the /optimizer
field in the experiment index config to a YAML file in the optimizer/ subfolder of your configs. That YAML file should use the _target_
field to point to a PyTorch optimizer class and specify its arguments in the following fields. Here is an example:
./clarena/example_configs
├── __init__.py
├── entrance.yaml
├── experiment
│   ├── example.yaml
│   └── ...
├── optimizer
│   ├── sgd_10_tasks.yaml
│   └── sgd.yaml
...
example_configs/experiment/example.yaml
defaults:
  ...
  - /optimizer: sgd.yaml
  ...
example_configs/optimizer/sgd.yaml
_target_: torch.optim.SGD
_partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
lr: 0.01
weight_decay: 0.0
Make sure to include the field _partial_: True
to enable partial instantiation. A PyTorch optimizer needs the model parameters as an argument to be fully instantiated, but at configuration time that argument is not yet available, so the optimizer can only be partially instantiated.
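For intuition, below is a minimal Python sketch (not the framework's actual training code; the model, config path, and keyword usage are placeholders) of how a partially instantiated optimizer is later completed with the model parameters:

import torch
from hydra.utils import instantiate
from omegaconf import OmegaConf

# Load the optimizer config shown above (path is illustrative).
optimizer_cfg = OmegaConf.load("example_configs/optimizer/sgd.yaml")

# Because of `_partial_: True`, instantiate() returns a functools.partial
# of torch.optim.SGD with lr and weight_decay already bound.
optimizer_factory = instantiate(optimizer_cfg)

# Later, once the model exists, the partial is completed with its parameters.
model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = optimizer_factory(params=model.parameters())  # torch.optim.SGD with lr=0.01, weight_decay=0.0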
Configure Distinct Optimizer For Each Task
To configure a distinct optimizer for each task in your experiment, the YAML file linked in the optimizer/ subfolder should be a list of PyTorch optimizer configs, one per task (assigned in task order). The length of the list must equal the num_tasks field
in the experiment index config.
example_configs/optimizer/sgd_10_tasks.yaml
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
- _target_: torch.optim.SGD
  _partial_: True # partially instantiate the optimizer without the 'params' argument. Always include this field!
  lr: 0.01
  weight_decay: 0.0
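For intuition only, here is a hedged Python sketch (not the framework's actual code; the 1-based task numbering, helper name, and model are assumptions) of how such a list could be resolved into the optimizer for the current task:

import torch
from hydra.utils import instantiate
from omegaconf import OmegaConf

# Load the list of per-task optimizer configs (path is illustrative).
optimizer_cfgs = OmegaConf.load("example_configs/optimizer/sgd_10_tasks.yaml")
assert len(optimizer_cfgs) == 10  # must equal num_tasks in the experiment index config

def optimizer_for_task(task_id: int, model: torch.nn.Module) -> torch.optim.Optimizer:
    """Instantiate the optimizer assigned to task `task_id` (assumed 1-based)."""
    factory = instantiate(optimizer_cfgs[task_id - 1])  # functools.partial, thanks to _partial_
    return factory(params=model.parameters())

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = optimizer_for_task(3, model)  # optimizer assigned to the third task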
Supported Optimizers
All built-in optimizers defined in PyTorch (the torch.optim module) are fully supported. Please refer to the PyTorch documentation for the full list, and to the documentation of each optimizer class for its required arguments.
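For example, switching to Adam (a hypothetical adam.yaml, not a config shipped with the examples) only requires changing _target_ and supplying Adam's own arguments; the Python dict below is equivalent to such a YAML file and instantiates the same way:

import torch
from hydra.utils import instantiate

# Equivalent of a hypothetical example_configs/optimizer/adam.yaml:
#   _target_: torch.optim.Adam
#   _partial_: True
#   lr: 0.001
#   betas: [0.9, 0.999]
adam_cfg = {
    "_target_": "torch.optim.Adam",
    "_partial_": True,
    "lr": 0.001,
    "betas": [0.9, 0.999],
}

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = instantiate(adam_cfg)(params=model.parameters())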
Optimisation-based approaches in continual learning methodology focus on designing mechanisms that manipulate the optimisation step. Typically, these approaches use different optimizers for different tasks. However, such optimisation logic can be integrated directly into the CL algorithm itself, so we do not design our own CL optimizers.