Configure CL Algorithm
The continual learning algorithm is the core of a continual learning experiment: it determines how sequential tasks are learned and manages the interaction between previous and new tasks. If you are not familiar with continual learning algorithms, feel free to learn about the baseline algorithms and CL methodology from my CL beginner's guide.
To configure the continual learning algorithm for your experiment, link the /cl_algorithm field in the experiment index config to a YAML file in the cl_algorithm/ subfolder of your configs. That YAML file should use the _target_ field to point to a CL algorithm class (as shown in the source code: clarena/cl_algorithms/) and specify that class's arguments in the following fields (except backbone and heads). Here is an example:
./clarena/example_configs
├── __init__.py
├── entrance.yaml
├── experiment
│   ├── example.yaml
│   └── ...
├── cl_algorithm
│   └── finetuning.yaml
...
example_configs/experiment/example.yaml
defaults:
  ...
  - /cl_algorithm: finetuning.yaml
  ...
example_configs/cl_algorithm/finetuning.yaml
_target_: clarena.cl_algorithms.Finetuning
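The finetuning.yaml above specifies nothing beyond _target_. If the algorithm class you choose takes additional arguments, list them below _target_ in the same file. The sketch below is hypothetical: the file name and the argument name are illustrative placeholders, not clarena's actual EWC signature; check the API reference of clarena.cl_algorithms.EWC for the real argument names.
example_configs/cl_algorithm/ewc.yaml (hypothetical sketch)
_target_: clarena.cl_algorithms.EWC
# hypothetical argument name: weight of the EWC regularisation penalty
reg_strength: 1.0
To run with such a config, point the /cl_algorithm entry in the experiment index config to ewc.yaml instead of finetuning.yaml.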
Supported CL Algorithm List
In this package, we have implemented many CL algorithm classes in the clarena.cl_algorithms module that you can use in your experiments. Below is the full list. Please refer to the API reference of each class to learn its required arguments.
| CL Algorithm | Description |
|---|---|
| Finetuning (SGD) | The most naive approach to task-incremental learning. It simply initialises the backbone from the previous task's backbone when training a new task. Check out my CL beginners' guide for details. |
| Fix | Another naive approach to task-incremental learning besides Finetuning. It serves as a kind of toy algorithm when discussing the stability-plasticity dilemma in continual learning: it simply fixes the backbone forever after training the first task. Check out my CL beginners' guide for details. |
| LwF [paper] | LwF (Learning without Forgetting, 2017) is a regularisation-based continual learning approach that constrains the feature outputs of the model to stay similar to those of the previous tasks. From the perspective of knowledge distillation, it distills the models of previous tasks into the training process of the new task through the regularisation term. It is a simple yet effective method for continual learning. |
| EWC [paper] | EWC (Elastic Weight Consolidation, 2017) is a regularisation-based continual learning approach that calculates the importance of parameters for previous tasks and penalises the current task's loss according to that importance (a sketch of the penalty term follows the table). |
| HAT [paper] [code] | HAT (Hard Attention to the Task, 2018) is an architecture-based continual learning approach that uses learnable hard attention masks to select task-specific parameters. |
| AdaHAT [paper] [code] | AdaHAT (Adaptive Hard Attention to the Task, 2024) is an architecture-based continual learning approach that improves HAT by introducing adaptive soft gradient clipping based on parameter importance and network sparsity. This is my work; check out Paper: AdaHAT for details. |
| CBP [paper] [code] | CBP (Continual Backpropagation, 2024) is a continual learning approach that reinitialises a small number of units during training, using a utility measure to determine which units to reinitialise. It aims to address the loss-of-plasticity problem when learning new tasks, but does not solve the catastrophic forgetting problem in continual learning very well. |
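To make the regularisation-based idea concrete, here is the EWC penalty referenced in the table, in the standard formulation from the 2017 paper (general notation, not tied to clarena's implementation):

$$\mathcal{L}(\theta) = \mathcal{L}_{t}(\theta) + \sum_{i} \frac{\lambda}{2} F_i \left(\theta_i - \theta^{*}_{t-1,i}\right)^2$$

where $\mathcal{L}_{t}$ is the loss of the current task $t$, $\theta^{*}_{t-1}$ are the parameters learned after the previous task, $F_i$ estimates the importance of parameter $i$ (the diagonal of the Fisher information in the original paper), and $\lambda$ controls the stability-plasticity trade-off.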
Make sure the algorithm is compatible with the CL dataset, backbone and paradigm.