Configure Trainer(s)
Under the PyTorch Lightning framework, we use the Lightning Trainer object for all training-related configuration, such as the number of epochs, the training strategy, and the device.
Trainer is a sub-config under the index config of:
- Continual learning main experiment and evaluation
- Continual learning full experiment and the reference experiments
- Continual unlearning main experiment and evaluation
- Continual unlearning full experiment, the reference experiments and evaluation
- Multi-task learning experiment and evaluation
- Single-task learning experiment and evaluation
To configure a custom trainer, create a YAML file in the trainer/ folder. Since continual learning involves multiple tasks, each task can be assigned its own trainer. We support both a uniform trainer shared by all tasks and distinct trainers for each task. Below are examples of both configurations.
Example
configs
├── __init__.py
├── entrance.yaml
├── index
│ ├── example_cl_main_expr.yaml
│ └── ...
├── trainer
│ ├── 10_tasks.yaml
│ ├── cpu.yaml
...
Uniform Trainer
example_configs/index/example_cl_main_expr.yaml
defaults:
...
- /trainer: cpu.yaml
...
example_configs/trainer/cpu.yaml
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 50
accelerator: cpu
devices: 1
max_epochs: 2
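Any other argument accepted by lightning.Trainer can be set in the same file. For illustration, a few commonly used flags are shown below; the values are arbitrary examples, not recommendations, so consult the Lightning documentation before adopting them:

```yaml
# Illustrative extras -- any lightning.Trainer argument may be added here
precision: 32            # numerical precision, e.g. "16-mixed" for mixed precision
gradient_clip_val: 1.0   # clip gradient norm to this value
deterministic: true      # make runs reproducible where possible
```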
Distinct Trainer for Each Task (Continual Learning)
Distinct trainers are specified as a dictionary. The keys and the length of the dictionary must match the train_tasks
field in the index config of continual learning. Below is an example for 10 tasks, where tasks 2 and 3 use the GPU while the rest use the CPU.
defaults:
...
- /trainer: 10_tasks.yaml
...
example_configs/trainer/10_tasks.yaml
1:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
2:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: gpu
devices: 1
max_epochs: 2
3:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: gpu
devices: 1
max_epochs: 2
4:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
5:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
6:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
7:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
8:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
9:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
10:
_target_: lightning.Trainer # always link to the lightning.Trainer class
default_root_dir: ${output_dir}
log_every_n_steps: 20
accelerator: cpu
devices: 1
max_epochs: 2
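Because the task keys must line up with the train_tasks field, a mismatch is easy to introduce when editing a long file like the one above. A small sanity check along the following lines can catch this early; note that check_trainer_keys is a hypothetical helper written for illustration, not part of the framework:

```python
def check_trainer_keys(trainer_cfg: dict, num_train_tasks: int) -> None:
    """Verify a distinct-trainer dict has exactly the keys 1..num_train_tasks."""
    expected = set(range(1, num_train_tasks + 1))
    actual = set(trainer_cfg.keys())
    if actual != expected:
        missing = sorted(expected - actual)
        extra = sorted(actual - expected)
        raise ValueError(
            f"Trainer config does not match train_tasks: "
            f"missing task keys {missing}, unexpected keys {extra}"
        )

# Example: a 3-task experiment whose trainer dict is missing task 3
cfg = {1: {"accelerator": "cpu"}, 2: {"accelerator": "gpu"}}
try:
    check_trainer_keys(cfg, 3)
except ValueError as e:
    print(e)
```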
Required Config Fields
The _target_: lightning.Trainer field is required and must always be specified. Please refer to the PyTorch Lightning documentation for full information on the available arguments (trainer flags) of lightning.Trainer.
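The _target_ mechanism works by locating the named class and passing the remaining keys to its constructor. The sketch below is a much-simplified illustration of that idea, not Hydra's actual implementation; the stdlib datetime.timedelta stands in for lightning.Trainer so the example runs without extra dependencies:

```python
import importlib

def instantiate(cfg: dict):
    """Simplified Hydra-style instantiation: import the class named by
    _target_ and call it with the remaining keys as keyword arguments."""
    cfg = dict(cfg)  # copy so the caller's config is not mutated
    module_path, _, class_name = cfg.pop("_target_").rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(**cfg)

# Stand-in target so the sketch is self-contained
delta = instantiate({"_target_": "datetime.timedelta", "days": 2})
print(delta.days)  # → 2
```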