Tasks

class classy_vision.tasks.ClassyTask

An abstract base class for a training task.

A ClassyTask encapsulates all the components and steps needed to train using a classy_vision.trainer.ClassyTrainer.

__init__() → classy_vision.tasks.classy_task.ClassyTask

Constructs a ClassyTask.

abstract advance_phase() → None

Advances the task by one phase.

Called when one phase of reading from classy_vision.dataset.ClassyDataset is over.

abstract done_training() → bool

Tells whether we are done training.

Returns

A boolean indicating whether training is over.

abstract classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.classy_task.ClassyTask

Instantiates a ClassyTask from a configuration.

Parameters

config – A configuration for a ClassyTask.

Returns

A ClassyTask instance.

abstract get_classy_state(deep_copy: bool = False) → Dict[str, Any]

Get the state of the ClassyTask.

The returned state is used for checkpointing.

Parameters

deep_copy – If True, creates a deep copy of the state dict. Otherwise, the returned dict’s state will be tied to the object’s.

Returns

A state dictionary containing the state of the task.

abstract init_distributed_data_parallel_model() → None

Initialize torch.nn.parallel.distributed.DistributedDataParallel.

Needed for distributed training. This is where a model should be wrapped by DDP.

abstract prepare(num_dataloader_workers=0, pin_memory=False, use_gpu=False, dataloader_mp_context=None) → None

Prepares the task for training.

Will be called by the classy_vision.trainer.ClassyTrainer to prepare the task.

Parameters
  • num_dataloader_workers – Number of workers to create for the dataloaders

  • pin_memory – Whether the dataloaders should copy the Tensors into CUDA pinned memory (default False)

  • use_gpu – True if training on GPUs, False otherwise

  • dataloader_mp_context – Determines how processes are spawned. Value must be one of None, “spawn”, “fork”, “forkserver”. If None, then context is inherited from parent process

run_hooks(local_variables: Dict[str, Any], hook_function: str) → None

Helper function that runs a hook function for each of the task's classy_vision.hooks.ClassyHook instances.

Parameters
  • local_variables – Local variables created in train_step()

  • hook_function – One of the hook functions in the classy_vision.hooks.ClassyHookFunctions enum.

abstract set_classy_state(state)

Set the state of the ClassyTask.

Parameters

state – The state dictionary. Must be the output of a call to get_classy_state().

This is used to load the state of the task from a checkpoint.
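
Together with get_classy_state(), this supports a simple checkpoint round-trip. A minimal sketch (the use of torch.save / torch.load and the file path are illustrative assumptions, not part of this API):

import torch

# Detach the state from the live task and persist it.
state = task.get_classy_state(deep_copy=True)
torch.save(state, "checkpoint.pt")  # arbitrary path, for illustration

# Later, restore the task from the saved state.
task.set_classy_state(torch.load("checkpoint.pt"))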

abstract train_step(use_gpu, local_variables: Optional[Dict] = None) → None

Run a train step.

This corresponds to training over one batch of data from the dataloaders.

Parameters
  • use_gpu – True if training on GPUs, False otherwise

  • local_variables – Local variables created in the function. Can be passed to custom classy_vision.hooks.ClassyHook.
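
Together with prepare() and done_training(), this is the core loop that a trainer drives. A minimal sketch (normally classy_vision.trainer.ClassyTrainer does this for you; the flag values are illustrative):

# Bare-bones training loop; a ClassyTrainer normally drives this.
use_gpu = False  # illustrative: train on CPU

task.prepare(num_dataloader_workers=0, pin_memory=False, use_gpu=use_gpu)
while not task.done_training():
    task.train_step(use_gpu)  # trains over one batch from the dataloaders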

abstract property where

Tells how far along (where) we are during training.

Returns

A float in [0, 1) indicating the training progress.

class classy_vision.tasks.FineTuningTask(*args, **kwargs)

A task for fine tuning a pretrained model.

__init__(*args, **kwargs)

Constructs a FineTuningTask.

classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.fine_tuning_task.FineTuningTask

Instantiates a FineTuningTask from a configuration.

Parameters

config – A configuration for a FineTuningTask. See __init__() for parameters expected in the config.

Returns

A FineTuningTask instance.

prepare(num_dataloader_workers: int = 0, pin_memory: bool = False, use_gpu: bool = False, dataloader_mp_context=None) → None

Prepares task for training, populates all derived attributes

Parameters
  • num_dataloader_workers – Number of dataloading processes. If 0, dataloading is done on main process

  • pin_memory – if true pin memory on GPU

  • use_gpu – if true, load model, optimizer, loss, etc on GPU

  • dataloader_mp_context – Determines how processes are spawned. Value must be one of None, “spawn”, “fork”, “forkserver”. If None, then context is inherited from parent process

set_freeze_trunk(freeze_trunk: bool) → classy_vision.tasks.fine_tuning_task.FineTuningTask

If freeze_trunk is True, the model trunk is frozen during fine tuning and only the heads are trained.

set_pretrained_checkpoint(checkpoint: Dict[str, Any]) → classy_vision.tasks.fine_tuning_task.FineTuningTask

Sets the pretrained checkpoint whose weights are loaded before fine tuning.

set_reset_heads(reset_heads: bool) → classy_vision.tasks.fine_tuning_task.FineTuningTask

If reset_heads is True, the model heads are reset rather than loaded from the pretrained checkpoint.
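
Since each setter returns the task, the calls can be chained. A sketch (pretrained_state is assumed to be the output of get_classy_state() on a previously trained task):

# Hypothetical fine-tuning setup using the documented setters.
fine_tuning_task = (
    fine_tuning_task.set_pretrained_checkpoint(pretrained_state)
    .set_freeze_trunk(True)  # train only the heads
    .set_reset_heads(True)   # reinitialize heads instead of loading them
)
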
classy_vision.tasks.build_task(config)

Builds a ClassyTask from a config.

This assumes a ‘name’ key in the config which is used to determine what task class to instantiate. For instance, a config {“name”: “my_task”, “foo”: “bar”} will find a class that was registered as “my_task” (see register_task()) and call .from_config on it.
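
For example, using the config above (assuming “my_task” has been registered; see register_task() below):

from classy_vision.tasks import build_task

# Finds the class registered as "my_task" and calls .from_config() on it.
task = build_task({"name": "my_task", "foo": "bar"})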

classy_vision.tasks.register_task(name)

Registers a ClassyTask subclass.

This decorator allows Classy Vision to instantiate a subclass of ClassyTask from a configuration file, even if the class itself is not part of the Classy Vision framework. To use it, apply this decorator to a ClassyTask subclass, like this:

@register_task('my_task')
class MyTask(ClassyTask):
    ...

To instantiate a task from a configuration file, see build_task().

class classy_vision.tasks.ClassificationTask

Basic classification training task.

This task encapsulates all of the components and steps needed to train a classifier using a classy_vision.trainer.ClassyTrainer.

Assumes a train / test phase per epoch and that the datasets have the same API as the map-style Dataset class in torch.utils.data.dataset (in particular, this task makes use of len()). If you are using an IterableDataset, then a custom task may be appropriate.

Variables
  • loss – Loss (see classy_vision.losses.ClassyLoss) function used for computing the loss in each forward pass

  • datasets – Mapping from a phase_type in [“train”, “test”] to the dataset used for training (or testing)

  • meters – List of meters (see classy_vision.meters.ClassyMeter) to calculate during training

  • num_epochs – Number of epochs (passes over dataset) to train

  • test_only – Used to only run the test phase

  • base_model – Model to be trained, not wrapped in DDP or DP wrappers

  • optimizer – Optimizer used in train step

  • checkpoint – Serializable dict which represents state in training

  • phases – List of phase specific information, e.g. if phase is train / test.

  • hooks – List of hooks to apply during training

  • train – Phase type flag; True means the current phase is a train phase, False means it is a test phase

  • distributed_model – Base model, but wrapped in DDP (DistributedDataParallel)

  • phase_idx – Current phase index; the first phase is 0. Set to -1 if the task has not started training

  • train_phase_idx – Index of the current train phase, counting only train phases

  • num_updates – Number of total parameter updates applied to model by the optimizer

  • data_iterator – Iterator which can be used to obtain batches

  • num_samples_this_phase – Number of samples processed in the current phase

  • losses – Loss curve

__init__()

Constructs a ClassificationTask

advance_phase()

Performs bookkeeping / task updates between phases

Increments phase idx, resets meters, resets loss history, resets counters, shuffles dataset, rebuilds iterators, and sets the train / test state for phase.

build_dataloader(phase_type, num_workers, pin_memory, multiprocessing_context=None, **kwargs)

Builds a dataloader iterable for a particular phase type.

Parameters
  • phase_type – Phase type of the dataloader to build; one of “train” or “test”

  • num_workers – Number of dataloading processes. If 0, dataloading is done on main process. See PyTorch dataloader documentation for more details on num_workers and the usage of python multiprocessing in dataloaders

  • pin_memory – if true pin memory on GPU. See PyTorch dataloader documentation for details on pin_memory.

  • multiprocessing_context – Determines how processes are spawned. Value must be one of None, “spawn”, “fork”, “forkserver”. If None, then context is inherited from parent process

Returns

An iterable over the dataset.
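
For example, a sketch using only the documented parameters (the values are illustrative):

# Build the "train" dataloader with four worker processes.
train_loader = task.build_dataloader(
    phase_type="train",
    num_workers=4,
    pin_memory=True,                  # pin memory for faster GPU copies
    multiprocessing_context="spawn",  # or None / "fork" / "forkserver"
)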

build_dataloaders(num_workers, pin_memory, multiprocessing_context=None, **kwargs)

Build a dataloader for each phase type

Parameters
  • num_workers – Number of dataloading processes. If 0, dataloading is done on main process. See PyTorch dataloader documentation for more details on num_workers and the usage of python multiprocessing in dataloaders

  • pin_memory – if true pin memory on GPU. See PyTorch dataloader documentation for details on pin_memory.

  • multiprocessing_context – Determines how processes are spawned. Value must be one of None, “spawn”, “fork”, “forkserver”. If None, then context is inherited from parent process

Returns

An iterable over the dataset associated with each phase_type.

compute_loss(model_output, sample)

Computes the loss, given the model output and the sample.

create_data_iterator()

Creates data iterator for phase.

done_training()

Stop condition for training

property eval_phase_idx

Returns the current evaluation phase index

classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.classification_task.ClassificationTask

Instantiates a ClassificationTask from a configuration.

Parameters

config – A configuration for a ClassificationTask. See __init__() for parameters expected in the config.

Returns

A ClassificationTask instance.

get_batchsize_per_replica()

Return local replica’s batchsize for dataset (e.g. batchsize per GPU)

get_classy_state(deep_copy: bool = False)

Returns the serializable state of the task

Parameters

deep_copy – If true, does a deep copy of state before returning.

get_data_iterator()

Returns data iterator for current phase

get_global_batchsize()

Return global batchsize across all trainers

get_total_samples_trained_this_phase()

Returns the total number of samples processed in current phase

get_total_test_phases()

Returns the total number of “test” phases in the task

get_total_training_phases()

Returns the total number of “train” phases in the task

init_distributed_data_parallel_model()

Sets up distributed data parallel and wraps the model in DDP

property model

Returns model used in training (can be wrapped with DDP)

property num_batches_per_phase

Returns number of batches in current phase iterator

property phase_type

Returns current phase type. String with value “train” or “test”

prepare(num_dataloader_workers=0, pin_memory=False, use_gpu=False, dataloader_mp_context=None)

Prepares task for training, populates all derived attributes

Parameters
  • num_dataloader_workers – Number of dataloading processes. If 0, dataloading is done on main process

  • pin_memory – if true pin memory on GPU

  • use_gpu – if true, load model, optimizer, loss, etc on GPU

  • dataloader_mp_context – Determines how processes are spawned. Value must be one of None, “spawn”, “fork”, “forkserver”. If None, then context is inherited from parent process

set_amp_opt_level(opt_level: Optional[str])

Disable / enable apex.amp and set the automatic mixed precision level.

apex.amp can be utilized for mixed / half precision training.

Parameters

opt_level – Opt level used to initialize apex.amp. Set to None to disable amp. Supported modes are:

  • O0: FP32 training

  • O1: Mixed Precision

  • O2: “Almost FP16” Mixed Precision

  • O3: FP16 training

See https://nvidia.github.io/apex/amp.html#opt-levels for more info.

Raises

RuntimeError – If opt_level is not None and apex is not installed.

Warning: apex needs to be installed to utilize this feature.
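
For example, a sketch of toggling mixed precision (assumes apex is installed):

task.set_amp_opt_level("O1")  # enable mixed precision training
task.set_amp_opt_level(None)  # disable amp again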

set_checkpoint(checkpoint)

Sets checkpoint on task.

Parameters

checkpoint – A serializable dict representing current task state

set_classy_state(state)

Set task state

Parameters

state – Dict containing state of a task

set_dataset(dataset: classy_vision.dataset.classy_dataset.ClassyDataset, phase_type: str)

Set dataset for phase type on task

Parameters
  • dataset – ClassyDataset for returning samples.

  • phase_type – Must be one of “train” or “test”

set_distributed_options(broadcast_buffers_mode: classy_vision.tasks.classification_task.BroadcastBuffersMode)

Set distributed options.

Parameters

broadcast_buffers_mode – Broadcast buffers mode. See BroadcastBuffersMode for options.

set_hooks(hooks: List[ClassyHook])

Set hooks for task

Parameters

hooks – List of hooks to apply during training

set_loss(loss: classy_vision.losses.classy_loss.ClassyLoss)

Set loss function for task

Parameters

loss – loss for task

set_meters(meters: List[ClassyMeter])

Set meters for task

Parameters

meters – list of meters to compute during training

set_model(model: classy_vision.models.classy_model.ClassyModel)

Set model for task

Parameters

model – Model to be trained

set_num_epochs(num_epochs: Union[int, float])

Set number of epochs to be run.

Parameters

num_epochs – Number of epochs to run task

set_optimizer(optimizer: classy_vision.optim.classy_optimizer.ClassyOptimizer)

Set optimizer for task

Parameters

optimizer – optimizer for task

set_test_only(test_only: bool)

Set test only flag

Parameters

test_only – If true, only test phases will be run
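
Taken together, these setters let a task be assembled step by step. A sketch, assuming the my_* objects are pre-built ClassyDataset / ClassyLoss / ClassyModel / ClassyOptimizer / meter / hook instances, and that each setter returns the task for chaining:

from classy_vision.tasks import ClassificationTask

# Hypothetical assembly of a classification task from pre-built components.
task = (
    ClassificationTask()
    .set_dataset(my_train_dataset, "train")
    .set_dataset(my_test_dataset, "test")
    .set_loss(my_loss)
    .set_model(my_model)
    .set_optimizer(my_optimizer)
    .set_num_epochs(90)        # illustrative value
    .set_meters(my_meters)     # list of ClassyMeter
    .set_hooks(my_hooks)       # list of ClassyHook
)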

train_step(use_gpu, local_variables=None)

Train step to be executed in train loop

Parameters
  • use_gpu – if true, execute training on GPU

  • local_variables – Dict containing intermediate values in train_step for access by hooks

update_meters(model_output, sample)

Updates the task meters, given the model output and the sample.

property where

Returns the proportion of training completed; if in test-only mode, returns the proportion of testing completed.

The returned value is a float in the range [0, 1).
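
For example, a hook or logging utility might read this property directly:

progress = task.where  # float in [0, 1)
print(f"training is {progress:.0%} complete")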