October 6, 2025
This work aims to adapt batch normalization to be compatible with the HAT architecture, thereby improving the absolute performance of HAT-based continual learning approaches.
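One plausible way to make batch normalization HAT-compatible is to keep a separate set of running statistics (and affine parameters) per task, so that training a new task cannot corrupt the statistics that earlier, frozen tasks rely on. The sketch below is illustrative only; the module name `TaskBatchNorm2d` and its interface are assumptions, not the actual design of this work.

```python
import torch
import torch.nn as nn

class TaskBatchNorm2d(nn.Module):
    """Illustrative per-task batch norm: one set of running statistics
    (and affine parameters) per task, so statistics learned for a new
    task never overwrite those frozen for earlier tasks."""

    def __init__(self, num_features: int, num_tasks: int):
        super().__init__()
        self.bns = nn.ModuleList(
            nn.BatchNorm2d(num_features) for _ in range(num_tasks)
        )

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Dispatch to the normalizer belonging to the current task.
        return self.bns[task_id](x)

# Usage: select the normalizer for task 2.
bn = TaskBatchNorm2d(num_features=64, num_tasks=5)
x = torch.randn(8, 64, 32, 32)
out = bn(x, task_id=2)
```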
This work aims to leverage the task-wise separability of architecture-based continual learning to propose a novel architecture-based approach to the continual unlearning paradigm (parallel to the approaches proposed in CLPU, such as CLPU-DER++). The idea builds on HAT, an architecture-based CL approach, exploiting its task-wise separability and storing task-overlapping weights on top of it. When training a new task, the weights needed by both the new and previous tasks are duplicated and stored in a separate set, which the new task can then update independently (as sketched below). This not only equips an architecture-based CL approach with the ability to unlearn, but, more importantly, also recycles model capacity, addressing the problem of insufficient network capacity in architecture-based continual learning through unlearning.

A novel metric is proposed to measure the benefit of releasing model capacity: it compares performance on the remaining tasks with and without unlearning the requested tasks. We also argue that the unlearning metric in CLPU is not appropriate for the TIL setting, where each task has an independent head. Instead, we propose to measure the features extracted by the backbone, rather than the outputs, on the test set of the unlearned tasks.
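A minimal sketch of the overlap-duplication step, assuming binary per-task weight masks (as induced by HAT's attention). The function name `duplicate_overlap` and the dictionary-based weight store are illustrative, not a fixed design of this work.

```python
import torch

def duplicate_overlap(weight: torch.Tensor,
                      prev_mask: torch.Tensor,
                      new_mask: torch.Tensor,
                      store: dict, task_id: int) -> None:
    """Copy weights needed by both previous tasks and the new task into a
    per-task store, so the new task can update its own copy while the
    shared originals stay frozen for earlier tasks."""
    overlap = prev_mask.bool() & new_mask.bool()
    store[task_id] = {
        "indices": overlap.nonzero(as_tuple=False),
        "values": weight[overlap].clone(),  # independent copy for the new task
    }

# Usage with a toy 4x4 weight matrix and two binary masks.
w = torch.randn(4, 4)
prev = torch.rand(4, 4) > 0.5   # weights already claimed by old tasks
new = torch.rand(4, 4) > 0.5    # weights the new task wants to use
store = {}
duplicate_overlap(w, prev, new, store, task_id=3)
print(store[3]["values"].shape)
```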
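The two evaluation ideas could be computed roughly as follows. This is a sketch under the assumption that per-task accuracies on the remaining tasks are available, and that backbone features of the unlearned model are compared against those of a reference model that never saw the forgotten task; the function names and the choice of cosine similarity are assumptions, not the note's actual protocol.

```python
import torch
import torch.nn.functional as F

def capacity_release_benefit(acc_with_unlearning: dict,
                             acc_without_unlearning: dict) -> float:
    """Capacity-release metric: average accuracy gain on the remaining
    tasks when the requested tasks are unlearned versus when they are kept."""
    tasks = acc_with_unlearning.keys()
    gains = [acc_with_unlearning[t] - acc_without_unlearning[t] for t in tasks]
    return sum(gains) / len(gains)

def feature_forgetting_score(feats_unlearned: torch.Tensor,
                             feats_reference: torch.Tensor) -> float:
    """Feature-level unlearning check: compare backbone features on the
    unlearned task's test set against those of a reference model that
    never trained on that task (cosine similarity is one possible choice)."""
    a = F.normalize(feats_unlearned, dim=1)
    b = F.normalize(feats_reference, dim=1)
    return (a * b).sum(dim=1).mean().item()

# Toy usage.
print(capacity_release_benefit({"t1": 0.82, "t2": 0.74},
                               {"t1": 0.79, "t2": 0.73}))
print(feature_forgetting_score(torch.randn(16, 128), torch.randn(16, 128)))
```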