Tasks

class classy_vision.tasks.ClassificationTask

Basic classification training task.

This task encapsulates all of the components and steps needed to train a classifier using a classy_vision.trainer.ClassyTrainer.

Assumes a train / test phase per epoch and that the datasets have the same API as the map-style Dataset class in torch.utils.data.dataset (in particular, this task makes use of len()). If you are using an IterableDataset, a custom task may be more appropriate.

Variables
  • loss – Loss (see classy_vision.losses.ClassyLoss) function used for computing the loss in each forward pass

  • datasets – Mapping from a phase_type in [“train”, “test”] to the dataset used for training (or testing)

  • meters – List of meters (see classy_vision.meters.ClassyMeter) to calculate during training

  • num_epochs – Number of epochs (passes over dataset) to train

  • test_only – Used to only run the test phase

  • base_model – Model to be trained, not wrapped in DDP or DP wrappers

  • optimizer – Optimizer used in train step

  • optimizer_schedulers – Dictionary mapping an optimizer option name (e.g. lr) to its ClassyParamScheduler

  • checkpoint – Serializable dict which represents state in training

  • phases – List of phase-specific information, e.g. whether the phase is train or test.

  • hooks – List of hooks to apply during training

  • train – Phase type; True means we are training, False means testing

  • distributed_model – Base model, but wrapped in DDP (DistributedDataParallel)

  • phase_idx – Current phase index; the first phase is 0. Returns -1 if the task has not started training.

  • train_phase_idx – Only counts train phases

  • num_updates – Number of total parameter updates applied to model by the optimizer

  • data_iterator – Iterator which can be used to obtain batches

  • losses – Loss curve

  • perf_log – list of training speed measurements, to be logged

  • clip_grad_norm – maximum gradient norm (default None)

  • simulated_global_batchsize – batch size simulated via gradient accumulation

  • optimizer_period – apply optimizer after this many steps; derived from simulated_global_batchsize, default 1.
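A minimal sketch of how these attributes are typically populated through the setter methods documented later in this section; the my_* objects are placeholders for ClassyDataset / ClassyModel / ClassyLoss / ClassyOptimizer / ClassyMeter instances you construct yourself, not part of the API.

from classy_vision.tasks import ClassificationTask

# Wire up a task with the setters documented below; each my_* placeholder
# must be built (or loaded from a registered config) before this runs.
task = ClassificationTask()
task.set_num_epochs(90)
task.set_loss(my_loss)
task.set_model(my_model)
task.set_optimizer(my_optimizer)
task.set_meters([my_meter])
task.set_dataset(my_train_dataset, "train")
task.set_dataset(my_test_dataset, "test")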

__init__()

Constructs a ClassificationTask

advance_phase()

Performs bookkeeping / task updates between phases

Increments phase idx, resets meters, resets loss history, resets counters, shuffles dataset, rebuilds iterators, and sets the train / test state for phase.

build_dataloader_from_dataset(dataset, **kwargs)

Builds a dataloader from the provided dataset

Parameters
  • dataset – A ClassyDataset

  • kwargs – Additional kwargs to pass during dataloader construction for derived classes

build_dataloaders_for_current_phase()

Builds dataloader(s) for the current phase.

Deriving classes can override this method to support custom behavior, like supporting multiple dataloaders in parallel.

create_data_iterators()

Creates data iterator(s) for the current phase.

done_training()

Stop condition for training

property eval_phase_idx

Returns current evaluation phase

eval_step()

Run an evaluation step.

This corresponds to evaluating the model over one batch of data.

classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.classification_task.ClassificationTask

Instantiates a ClassificationTask from a configuration.

Parameters

config – A configuration for a ClassificationTask. See __init__() for parameters expected in the config.

Returns

A ClassificationTask instance.

get_batchsize_per_replica()

Return local replica’s batchsize for dataset (e.g. batchsize per GPU)

get_classy_state(deep_copy: bool = False)

Returns serializable state of task

Parameters

deep_copy – If true, does a deep copy of state before returning.
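A common pattern is a checkpoint round trip together with set_classy_state(); this is only a sketch, and the file path is hypothetical:

import torch

# Snapshot the task state (deep copy so further training does not mutate it).
state = task.get_classy_state(deep_copy=True)
torch.save(state, "/tmp/task_checkpoint.pt")  # hypothetical path

# Later: restore the task from the snapshot.
task.set_classy_state(torch.load("/tmp/task_checkpoint.pt"))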

get_global_batchsize()

Return global batchsize across all trainers

get_total_test_phases()

Returns the total number of “test” phases in the task

get_total_training_phases()

Returns the total number of “train” phases in the task

init_distributed_data_parallel_model()

Initialize torch.nn.parallel.distributed.DistributedDataParallel.

Needed for distributed training. This is where a model should be wrapped by DDP.

property loss

Returns loss used in training (can be wrapped with DDP)

property model

Returns model used in training (can be wrapped with DDP)

property num_batches_per_phase

Returns number of batches in current phase iterator

on_end()

Training end.

Called by classy_vision.trainer.ClassyTrainer after training ends.

on_phase_end()

Epoch end.

Called by classy_vision.trainer.ClassyTrainer after each epoch ends.

on_phase_start()

Epoch start.

Called by classy_vision.trainer.ClassyTrainer before each epoch starts.

on_start()

Start training.

Called by classy_vision.trainer.ClassyTrainer before training starts.

property phase_type

Returns current phase type. String with value “train” or “test”

prepare()

Prepares task for training, populates all derived attributes

run_optimizer(loss)

Runs the backward pass and updates the optimizer

set_amp_args(amp_args: Optional[Dict[str, Any]])

Disable / enable apex.amp and set the automatic mixed precision parameters.

apex.amp can be utilized for mixed / half precision training.

Parameters

amp_args – Dictionary containing arguments to be passed to amp.initialize. Set to None to disable amp. To enable mixed precision training, pass amp_args={"opt_level": "O1"} here. See https://nvidia.github.io/apex/amp.html for more info.

Raises

RuntimeError – If opt_level is not None and apex is not installed.

Warning: apex needs to be installed to utilize this feature.
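For example, mixed precision training can be enabled as described above (this assumes apex is installed):

# Enable apex.amp with opt_level "O1"; pass None to disable amp again.
task.set_amp_args({"opt_level": "O1"})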

set_checkpoint(checkpoint_path: str)

Sets checkpoint on task.

Parameters

checkpoint_path – The path to load the checkpoint from. Can be a file or a directory. See load_checkpoint() for more information.

set_checkpoint_load_strict(checkpoint_load_strict: bool)

Sets checkpoint_load_strict on the task.

Parameters

checkpoint_load_strict – Whether to use load_strict when copying model weights

set_classy_state(state)

Set task state

Parameters

state – Dict containing state of a task

set_clip_grad_norm(clip_grad_norm: Optional[float])

Sets maximum gradient norm.

None means gradient clipping is disabled. Defaults to None.

set_dataloader_mp_context(dataloader_mp_context: Optional[str])

Set the multiprocessing context used by the dataloader.

The context can be either ‘spawn’, ‘fork’, ‘forkserver’ or None (uses the default context). See https://docs.python.org/3/library/multiprocessing.html#multiprocessing.get_context for more details.
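For example, to force the spawn start method for dataloader workers:

# Use the "spawn" multiprocessing start method for dataloader workers;
# passing None falls back to the default context.
task.set_dataloader_mp_context("spawn")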

set_dataset(dataset: classy_vision.dataset.classy_dataset.ClassyDataset, phase_type: str)

Set dataset for phase type on task

Parameters
  • dataset – ClassyDataset for returning samples.

  • phase_type – str must be one of “train” or “test”

set_distributed_options(broadcast_buffers_mode: classy_vision.tasks.classification_task.BroadcastBuffersMode = BroadcastBuffersMode.BEFORE_EVAL, batch_norm_sync_mode: classy_vision.tasks.classification_task.BatchNormSyncMode = BatchNormSyncMode.DISABLED, batch_norm_sync_group_size: int = 0, find_unused_parameters: bool = False, bucket_cap_mb: int = 25, fp16_grad_compress: bool = False)

Set distributed options.

Parameters
  • broadcast_buffers_mode – Broadcast buffers mode. See BroadcastBuffersMode for options.

  • batch_norm_sync_mode – Batch normalization synchronization mode. See BatchNormSyncMode for options.

  • batch_norm_sync_group_size – Group size to use for synchronized batch norm. 0 means that the stats are synchronized across all replicas. For efficient synchronization, set it to the number of GPUs in a node (usually 8).

  • find_unused_parameters – See torch.nn.parallel.DistributedDataParallel for information.

  • bucket_cap_mb – See torch.nn.parallel.DistributedDataParallel for information.

Raises

RuntimeError – If batch_norm_sync_mode is BatchNormSyncMode.APEX and apex is not installed.
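As an illustration, the sketch below enables apex synchronized batch norm within one node and broadcasts buffers before each eval phase, using only the options listed above; like the Raises note, it assumes apex is installed.

from classy_vision.tasks.classification_task import (
    BatchNormSyncMode,
    BroadcastBuffersMode,
)

# Sync batch norm stats within 8-GPU groups; broadcast buffers before eval.
task.set_distributed_options(
    broadcast_buffers_mode=BroadcastBuffersMode.BEFORE_EVAL,
    batch_norm_sync_mode=BatchNormSyncMode.APEX,
    batch_norm_sync_group_size=8,
)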

set_hooks(hooks: List[classy_vision.hooks.classy_hook.ClassyHook])

Set hooks for task

Parameters

hooks – List of hooks to apply during training

set_loss(loss: classy_vision.losses.classy_loss.ClassyLoss)

Set loss function for task

Parameters

loss – loss for task

set_meters(meters: List[classy_vision.meters.classy_meter.ClassyMeter])

Set meters for task

Parameters

meters – list of meters to compute during training

set_mixup_transform(mixup_transform: Optional[classy_vision.dataset.transforms.mixup.MixupTransform])

Disable / enable mixup transform for data augmentation

Parameters

mixup_transform – A callable object which performs mixup data augmentation

set_model(model: classy_vision.models.classy_model.ClassyModel)

Set model for task

Parameters

model – Model to be trained

set_num_epochs(num_epochs: Union[int, float])

Set number of epochs to be run.

Parameters

num_epochs – Number of epochs to run task

set_optimizer(optimizer: classy_vision.optim.classy_optimizer.ClassyOptimizer)

Set optimizer for task

Parameters

optimizer – optimizer for task

set_simulated_global_batchsize(simulated_global_batchsize: Optional[int])

Sets a simulated batch size by gradient accumulation.

Gradient accumulation adds up gradients from multiple minibatches and steps the optimizer every N train_steps, where N is optimizer_period. When enabled, the very last train_steps might end up not updating the model, depending on the number of total steps. None means gradient accumulation is disabled. Defaults to None.
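As a worked example (assuming optimizer_period is derived as the ratio of the simulated to the actual global batch size):

# Actual global batch size, e.g. 32 per replica across 8 replicas.
global_batchsize = 256
task.set_simulated_global_batchsize(1024)
# optimizer_period would then be 1024 // 256 = 4: gradients from 4
# consecutive train steps are accumulated before each optimizer update.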

set_test_only(test_only: bool)

Set test only flag

Parameters

test_only – If true, only test phases will be run

set_test_phase_period(test_phase_period: int)

Set the period of test phase.

Parameters

test_phase_period – The period of test phase

synchronize_losses()

Average the losses across the different replicas

train_step()

Train step to be executed in train loop.

property where

Returns the proportion of training that has completed. If in test only mode, returns proportion of testing completed

Returned value is a float in the range [0, 1)

class classy_vision.tasks.ClassyTask

An abstract base class for a training task.

A ClassyTask encapsulates all the components and steps needed to train using a classy_vision.trainer.ClassyTrainer.

__init__() → classy_vision.tasks.classy_task.ClassyTask

Constructs a ClassyTask.

abstract done_training() → bool

Tells if we are done training.

Returns

A boolean telling if training is over.

abstract eval_step() → None

Run an evaluation step.

This corresponds to evaluating the model over one batch of data.

abstract classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.classy_task.ClassyTask

Instantiates a ClassyTask from a configuration.

Parameters

config – A configuration for a ClassyTask.

Returns

A ClassyTask instance.

abstract get_classy_state(deep_copy: bool = False) → Dict[str, Any]

Get the state of the ClassyTask.

The returned state is used for checkpointing.

Parameters

deep_copy – If True, creates a deep copy of the state dict. Otherwise, the returned dict’s state will be tied to the object’s.

Returns

A state dictionary containing the state of the task.

abstract on_end()

Training end.

Called by classy_vision.trainer.ClassyTrainer after training ends.

abstract on_phase_end()

Epoch end.

Called by classy_vision.trainer.ClassyTrainer after each epoch ends.

abstract on_phase_start()

Epoch start.

Called by classy_vision.trainer.ClassyTrainer before each epoch starts.

abstract on_start()

Start training.

Called by classy_vision.trainer.ClassyTrainer before training starts.

abstract prepare(num_dataloader_workers=0, dataloader_mp_context=None) → None

Prepares the task for training.

Will be called by the classy_vision.trainer.ClassyTrainer to prepare the task, before on_start is called.

Parameters
  • num_dataloader_workers – Number of workers to create for the dataloaders

  • dataloader_mp_context – Multiprocessing context to use for the dataloaders; None uses the default context (default None)

run_hooks(local_variables: Dict[str, Any], hook_function: str) → None

Helper function that runs a hook function for all the classy_vision.hooks.ClassyHook.

Parameters
  • local_variables – Local variables created in train_step()

  • hook_function – One of the hook functions in the classy_vision.hooks.ClassyHookFunctions enum.

abstract set_classy_state(state)

Set the state of the ClassyTask.

Parameters

state – The state dictionary. Must be the output of a call to get_classy_state().

This is used to load the state of the task from a checkpoint.

abstract train_step() → None

Run a train step.

This corresponds to training over one batch of data from the dataloaders.

abstract property where: float

Tells how far along (where) we are during training.

Returns

A float in [0, 1) which tells the training progress.

class classy_vision.tasks.FineTuningTask(*args, **kwargs)

__init__(*args, **kwargs)

Constructs a FineTuningTask

classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.fine_tuning_task.FineTuningTask

Instantiates a FineTuningTask from a configuration.

Parameters

config – A configuration for a FineTuningTask. See __init__() for parameters expected in the config.

Returns

A FineTuningTask instance.

prepare() → None

Prepares task for training, populates all derived attributes

classy_vision.tasks.build_task(config)

Builds a ClassyTask from a config.

This assumes a ‘name’ key in the config which is used to determine what task class to instantiate. For instance, a config {"name": "my_task", "foo": "bar"} will find a class that was registered as “my_task” (see register_task()) and call .from_config on it.

classy_vision.tasks.register_task(name)

Registers a ClassyTask subclass.

This decorator allows Classy Vision to instantiate a subclass of ClassyTask from a configuration file, even if the class itself is not part of the Classy Vision framework. To use it, apply this decorator to a ClassyTask subclass, like this:

@register_task('my_task')
class MyTask(ClassyTask):
    ...

To instantiate a task from a configuration file, see build_task().
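A registered task can then be instantiated from a config dict via build_task():

from classy_vision.tasks import build_task

# "my_task" refers to the registration above; remaining keys are forwarded
# to MyTask.from_config through the config.
task = build_task({"name": "my_task", "foo": "bar"})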