Configure MTL Dataset

Modified

January 16, 2026

A multi-task learning (MTL) dataset consists of multiple datasets, one per learning task, each with its own training, validation and test data. An MTL dataset can also be converted from a continual learning (CL) dataset.

The MTL dataset is a sub-config under the index config of:

Multi-task learning experiment and evaluation

To configure a custom MTL dataset, create a YAML file in the mtl_dataset/ folder. An example MTL dataset config is shown below.

Example

configs
├── __init__.py
├── entrance.yaml
├── index
│   ├── example_mtl_expr.yaml
│   └── ...
├── mtl_dataset
│   └── from_cl_split_mnist.yaml
...

example_configs/index/example_mtl_expr.yaml

defaults:
  ...
  - /mtl_dataset: from_cl_split_mnist.yaml
  ...

example_configs/mtl_dataset/from_cl_split_mnist.yaml

_target_: clarena.mtl_datasets.MTLDatasetFromCL
cl_dataset:
  _target_: clarena.cl_datasets.SplitMNIST
  root: data/MNIST
  class_split: 
    1: [0, 1]
    2: [2, 3]
    3: [4, 5]
    4: [6, 7]
    5: [8, 9]
  validation_percentage: 0.1
sampling_strategy: mixed
batch_size: 128
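
To make the class_split field in the config above concrete, the following minimal sketch shows how such a mapping partitions the ten MNIST digit classes into five two-class tasks. It is plain Python for illustration only, not CLArena's implementation: the helper names validate_class_split and task_of_class are hypothetical, and the check assumes that tasks use disjoint classes.

```python
# Illustrative only: mirrors the class_split mapping from the YAML config.
# The helper names below are hypothetical and are not part of CLArena.
class_split = {
    1: [0, 1],
    2: [2, 3],
    3: [4, 5],
    4: [6, 7],
    5: [8, 9],
}


def validate_class_split(split: dict[int, list[int]]) -> None:
    """Raise ValueError if a class is assigned to more than one task.

    Assumption: tasks are expected to use disjoint sets of classes.
    """
    seen: dict[int, int] = {}
    for task_id, classes in split.items():
        for c in classes:
            if c in seen:
                raise ValueError(
                    f"class {c} appears in tasks {seen[c]} and {task_id}"
                )
            seen[c] = task_id


def task_of_class(split: dict[int, list[int]]) -> dict[int, int]:
    """Build a reverse lookup from original class label to task ID."""
    return {c: task_id for task_id, classes in split.items() for c in classes}


validate_class_split(class_split)
lookup = task_of_class(class_split)
print(lookup[7])  # digit 7 belongs to task 4
```

A reverse lookup like this is one way to see which task a given sample would contribute to once the CL dataset is treated as an MTL dataset.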

Supported MTL Datasets & Required Config Fields

All CL datasets in CLArena can be converted into MTL datasets with MTLDatasetFromCL; for the available CL datasets, refer to Supported CL Datasets & Required Config Fields. Alternatively, raw single-task datasets can be assigned directly to individual tasks to form an MTL dataset with MTLCombinedDataset.

CLArena also implements many MTL datasets as Python classes in the clarena.mtl_datasets module, which you can use for your experiments and evaluations.

To choose an MTL dataset, set the _target_ field to the class name of the MTL dataset. For example, to build a multi-task learning dataset from a continual learning dataset, set _target_ to clarena.mtl_datasets.MTLDatasetFromCL. Each MTL dataset has its own hyperparameters and configurations, so each has its own required fields: they are the same as the arguments of the class specified by _target_. The arguments of each MTL dataset class are listed in the API documentation.

API Reference (MTL Datasets) Source Code (MTL Datasets)
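
The relationship between _target_ and the required fields can be sketched in a few lines: the dotted path names a class, and the remaining config fields are passed to it as keyword arguments, in the style of Hydra's instantiate. The sketch below imitates that idea with the standard library only, so it runs without CLArena installed. The helpers resolve_target, required_fields and instantiate are hypothetical, string.Template stands in for a dataset class, and nested _target_ entries (such as cl_dataset in the example config) are deliberately not handled.

```python
import importlib
import inspect


def resolve_target(path: str):
    """Import and return the object named by a dotted path like 'pkg.mod.Class'."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def required_fields(cls) -> list[str]:
    """List constructor parameters that have no default value."""
    return [
        name
        for name, p in inspect.signature(cls).parameters.items()
        if p.default is inspect.Parameter.empty
        and p.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    ]


def instantiate(config: dict):
    """Build an object from a flat config: _target_ picks the class,
    every other field becomes a keyword argument."""
    kwargs = {k: v for k, v in config.items() if k != "_target_"}
    cls = resolve_target(config["_target_"])
    missing = [f for f in required_fields(cls) if f not in kwargs]
    if missing:
        raise TypeError(f"{config['_target_']} is missing required fields: {missing}")
    return cls(**kwargs)


# Stand-in example: string.Template requires exactly one argument, `template`.
obj = instantiate({"_target_": "string.Template", "template": "task $task_id"})
print(obj.substitute(task_id=3))  # task 3
```

Omitting a required field, for example passing only _target_ here, raises a TypeError in this sketch, which is the same kind of failure to expect when a config leaves out an argument that the target class needs.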

Below is the full list of supported MTL datasets. Only image classification datasets are supported. Note that the name in the “MTL Dataset” column is exactly the class name to assign to the _target_ field.

MTL Dataset	Description	Required Config Fields
Combined	Combined MTL dataset. We currently support: CIFAR-10, CIFAR-100, MNIST, SVHN, Fashion-MNIST, TrafficSigns, FaceScrub, NotMNIST, EMNIST Digits, EMNIST Letters, Arabic Handwritten Digits, Kannada-MNIST, Sign Language MNIST, Kuzushiji-MNIST, Food-101, Linnaeus 5, Caltech 101, EuroSAT, DTD, Country 211	Same as Combined class arguments