In this tutorial we will learn how to do the following:
If you haven't already read our Getting started tutorial, we recommend starting there before reading this tutorial.
Creating a dataset for use / using an existing dataset in Classy Vision is as easy as it is in PyTorch, it only requires wrapping the dataset in our dataloading class, ClassyDataset.
First, specify a dataset with a __getitem__
and __len__
function, the same as required by torch.utils.data.Dataset
import torch.utils.data
import torch
class MyDataset(torch.utils.data.Dataset):
def __init__(self):
super().__init__()
self.length = 100
def __getitem__(self, idx):
assert idx >= 0 and idx < self.length, \
"Provided index {} must be in range [0, {}).".format(idx, self.length)
return torch.rand(3, 100, 100)
def __len__(self):
return self.length
Now for most training tasks we want to be able to configure the batchsize on the fly, transform samples, shuffle the dataset, maybe limit the number of samples to shorten a training run, and then construct an iterator for the training loop. ClassyDataset is a simple wrapper that provides this functionality.
from classy_vision.dataset import ClassyDataset
class MyClassyDataset(ClassyDataset):
def __init__(self, batchsize_per_replica, shuffle, transform, num_samples):
dataset = MyDataset()
super().__init__(dataset, batchsize_per_replica, shuffle, transform, num_samples)
It's that easy! Later in the tutorial we will see how to use the iterator, but before moving on, let's talk about what each of these arguments does.
To get started with a basic task just do:
from classy_vision.tasks import ClassificationTask
my_dataset = MyClassyDataset(
batchsize_per_replica=10,
shuffle=True,
transform=None,
num_samples=None,
)
# Note, the "train" here is the phase type.
# It tells the task to set the model in train mode / do a backwards pass, etc, using our new dataset
my_task = ClassificationTask().set_dataset(my_dataset, "train")
For more details on training a model, please see our Getting started tutorial.
Classy Vision is also able to read a configuration file and instantiate the dataset. This is useful to keep your experiments organized and reproducible. For that, you have to:
from classy_vision.dataset import ClassyDataset, register_dataset
from classy_vision.dataset.transforms import build_transforms
@register_dataset("my_dataset")
class MyClassyDataset(ClassyDataset):
def __init__(self, batchsize_per_replica, shuffle, transform, num_samples):
dataset = MyDataset()
super().__init__(dataset, batchsize_per_replica, shuffle, transform, num_samples)
@classmethod
def from_config(cls, config):
transform = build_transforms(config["transforms"])
return cls(
batchsize_per_replica=config["batchsize_per_replica"],
shuffle=config["shuffle"],
transform=transform,
num_samples=config["num_samples"],
)
Now we can start using this dataset in our configurations. The string argument passed to the register_dataset is a unique identifier for this model (if you try to register two models with the same name, it will throw an error):
from classy_vision.dataset import build_dataset
import torch
dataset_config = {
"name": "my_dataset",
"batchsize_per_replica": 10,
"shuffle": True,
"transforms": [{"name": "Normalize", "mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]}],
"num_samples": None,
}
my_dataset = build_dataset(dataset_config)
assert isinstance(my_dataset, MyClassyDataset)
sample = my_dataset[0]
print(sample.size())
As mentioned above, the ClassyDataset class adds several pieces of basic logic for constructing a torch.utils.data.Dataloader for your dataset. ClassyDataset supports local and distributed training out-of-box by internally using a PyTorch DistributedSampler for sampling the dataset along with the PyTorch Dataloader for batching and parallelizing sample retrieval. To get an iterable for epoch 0, do the following:
from classy_vision.dataset import build_dataset
import torch
dataset_config = {
"name": "my_dataset",
"batchsize_per_replica": 10,
"shuffle": True,
"transforms": [],
"num_samples": None,
}
my_dataset = build_dataset(dataset_config)
assert isinstance(my_dataset, MyClassyDataset)
# multiprocessing_context can be set to "spawn", "forkserver", "fork" or None.
# If None is used, then the dataloader inherits the context of the parent thread.
# If num_workers is 0, then multiprocessing is not used by the dataloader
#
# A warning, while fork is fast and simple to get started with, it
# is unsafe to use with threading and can lead to difficult to debug errors.
# Spawn / forkserver are threadsafe, but they come with additional startup costs.
iterator = my_dataset.iterator(
shuffle_seed=0,
epoch=0,
num_workers=0, # 0 indicates to do dataloading on the primary process
pin_memory=False,
multiprocessing_context=None,
)
assert isinstance(iterator, torch.utils.data.DataLoader)
# Iterate over all 100 samples.
for sample in iter(iterator):
# Do stuff with sample...
# Note that size now has an extra dimension representing the batchsize
assert sample.size() == torch.Size([10, 3, 100, 100])
You can also provide a custom iterator function if you would like to return a custom iterator or a custom sampler. Please see the ClassyDataset code for more details.
You may have noticed in the configuration section that we did something slightly more complicated with the transform configuration. In particular, just like our datasets / models etc, we have a registration / build mechanism for transforms so that transforms can be specified via config.
We also automatically register torchvision transforms, so let's start with an example of how to specify torchvision transforms and the synthetic image dataset we provide for testing / proto-typing.
import torchvision.transforms as transforms
from classy_vision.dataset import build_dataset
from classy_vision.dataset.classy_synthetic_image import SyntheticImageDataset
from classy_vision.dataset.transforms import build_transforms
# Declarative approach
# Transform to be applied to image
image_transform = transforms.Compose([
transforms.Resize(256),
transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
decl_dataset = SyntheticImageDataset(
batchsize_per_replica=10,
shuffle=True,
transform=image_transform,
num_samples=100,
crop_size=320,
class_ratio=4,
seed=0,
)
# FAILS!!!!
# decl_dataset[0]
This fails! Why?
It fails because most datasets don't return just an image, they return image or video content data, label data, and (potentially) sample metadata. In Classy Vision, the sample format is specified by the task and our classification_task expects a dict with input / target keys.
For example, the sample format for the SyntheticImageDataset looks like:
{"input": <PIL Image>, "target": <Target>}
For our transforms to work, we need to specify which key to apply the transform to.
import torchvision.transforms as transforms
from classy_vision.dataset import build_dataset
from classy_vision.dataset.classy_synthetic_image import SyntheticImageDataset
from classy_vision.dataset.transforms import build_transforms, ApplyTransformToKey
# Declarative approach
# Transform to be applied to image
image_transform = transforms.Compose([
transforms.Resize(256),
transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Transform wrapper that says which key to apply the transform to
transform = ApplyTransformToKey(
transform=image_transform,
key="input",
)
decl_dataset = SyntheticImageDataset(
batchsize_per_replica=10,
shuffle=True,
transform=transform,
num_samples=100,
crop_size=320,
class_ratio=4,
seed=0,
)
# Success!!!!
decl_dataset[0]
Now let's see how to do the same thing via a config.
# Note that this cell won't work until we fix the synthetic dataset from_config function
from classy_vision.dataset import build_dataset
# Configuration approach
config = {
"name": "synthetic_image",
"batchsize_per_replica": 10,
"use_shuffle": True,
"transforms": [
{
"name": "apply_transform_to_key",
"transforms": [
{"name": "Resize", "size": 256},
{"name": "CenterCrop", "size": 224},
{"name": "ToTensor"},
{"name": "Normalize", "mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]},
],
"key": "input",
},
],
"num_samples": 100,
"crop_size": 320,
"class_ratio": 4,
"seed": 0
}
config_dataset = build_dataset(config)
# Sample should be the same as that provided by the decl_dataset
assert torch.allclose(config_dataset[0]["input"], decl_dataset[0]["input"])
Torchvision has a different sample format using tuples for images:
(<PIL Image>, <Target>)
The ApplyTransformToKey will still work (the key in this case is '0'), but for our classification tasks, we also want a sample that is a dict with "input"/"target" keys.
Because this is a common dataset format, we provide a convenience transform called "GenericImageTransform" which applies a specified transform to the torchvision tuple image key and then maps the whole sample to a dict. This is just a convenience transform, we can also do this using raw composable blocks, but it makes things more verbose.
All of the transforms in the next cell have the same effect on an image:
import torchvision.transforms as transforms
from classy_vision.dataset.transforms import build_transforms
from classy_vision.dataset.transforms.util import GenericImageTransform
# Declarative
image_transform = transforms.Compose([
transforms.Resize(256),
transforms.CenterCrop(224),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
decl_transform = GenericImageTransform(transform=image_transform)
# Configuration with helper function
transform_config = [{
"name": "generic_image_transform",
"transforms": [
{"name": "Resize", "size": 256},
{"name": "CenterCrop", "size": 224},
{"name": "ToTensor"},
{"name": "Normalize", "mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]},
],
}]
config_helper_transform = build_transforms(transform_config)
# Configuration using raw, composable functions:
transform_config = [
{"name": "tuple_to_map", "list_of_map_keys": ["input", "target"]},
{
"name": "apply_transform_to_key",
"transforms": [
{"name": "Resize", "size": 256},
{"name": "CenterCrop", "size": 224},
{"name": "ToTensor"},
{"name": "Normalize", "mean": [0.485, 0.456, 0.406], "std": [0.229, 0.224, 0.225]},
],
"key": "input",
},
]
config_raw_transform = build_transforms(transform_config)
# These transforms are all functionally the same
Now, to complete this tutorial, we show our code for creating an ImageNet dataset in classy vision using the pre-existing torchvision dataset. Code very similar to this (+ some typing and helper functions) is in the datasets folder of the base Classy Vision repository.
Note, we do not distribute any of the underlying dataset data with Classy Vision. Before this will work, you will need to download a torchvision compatible copy of the Imagenet dataset yourself.
from classy_vision.dataset import ClassyDataset, register_dataset
from classy_vision.dataset.transforms import ClassyTransform, build_transforms
from torchvision.datasets.imagenet import ImageNet
@register_dataset("example_imagenet")
class ExampleImageNetDataset(ClassyDataset):
def __init__(
self,
split,
batchsize_per_replica,
shuffle,
transform,
num_samples,
root, # Root directory for your Imagenet dataset
):
# Create torchvision dataset
dataset = ImageNet(root=root, split=split)
super().__init__(
dataset, batchsize_per_replica, shuffle, transform, num_samples
)
@classmethod
def from_config(cls, config):
batchsize_per_replica = config.get("batchsize_per_replica")
shuffle = config.get("use_shuffle")
num_samples = config.get("num_samples")
transform_config = config.get("transforms")
split = config.get("split")
root = config.get("root")
download = config.get("download")
transform = build_transforms(transform_config)
return cls(
split=split,
batchsize_per_replica=batchsize_per_replica,
shuffle=shuffle,
transform=transform,
num_samples=num_samples,
root=root,
download=download,
)
In this tutorial we have seen how to create a custom dataset using ClassyDataset, how to integrate this dataset with the configuration system, how to iterate over samples / use multiple workers, how to use transforms in the configuration system and finally we showed an example of how to use a torchvision dataset in Classy Vision.
For more details on how to use the dataset for training, please see Getting started.