Tasks¶
- class classy_vision.tasks.ClassificationTask¶
Basic classification training task.
This task encapsulates all of the components and steps needed to train a classifier using a
classy_vision.trainer.ClassyTrainer.
It assumes one train / test phase per epoch and that the datasets have the same API as the map-style Dataset class in torch.utils.data.dataset (in particular, this task makes use of len()). If you are using an IterableDataset then a custom task may be appropriate.
- Variables
loss – Loss function (see classy_vision.losses.ClassyLoss) used for computing the loss in each forward pass
datasets – Mapping from a phase_type in [“train”, “test”] to the dataset used for training (or testing)
meters – List of meters (see classy_vision.meters.ClassyMeter) to calculate during training
num_epochs – Number of epochs (passes over dataset) to train
test_only – Used to only run the test phase
base_model – Model to be trained, not wrapped in DDP or DP wrappers
optimizer – Optimizer used in train step
optimizer_schedulers – Dictionary. Key is the name of the optimizer option (e.g. lr), value is a ClassyParamScheduler
checkpoint – Serializable dict which represents state in training
phases – List of phase specific information, e.g. if phase is train / test.
hooks – List of hooks to apply during training
train – Phase type; True means we are training, False means testing
distributed_model – Base model, but wrapped in DDP (DistributedDataParallel)
phase_idx – Current phase id, first phase is 0, if task has not started training then returns -1
train_phase_idx – Only counts train phases
num_updates – Number of total parameter updates applied to model by the optimizer
data_iterator – Iterator which can be used to obtain batches
losses – Loss curve
perf_log – list of training speed measurements, to be logged
clip_grad_norm – maximum gradient norm (default None)
simulated_global_batchsize – batch size simulated via gradient accumulation
optimizer_period – apply optimizer after this many steps; derived from simulated_global_batchsize, default 1.
- __init__()¶
Constructs a ClassificationTask
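A minimal sketch of building a task imperatively with the setter methods documented below. The model, loss, optimizer, and dataset objects are placeholders you would construct yourself; the setters are assumed to return the task, which allows chaining:
    from classy_vision.tasks import ClassificationTask

    # my_model, my_loss, my_optimizer, my_train_dataset and my_test_dataset are
    # assumed to be ClassyModel / ClassyLoss / ClassyOptimizer / ClassyDataset
    # instances built elsewhere in your code.
    task = (
        ClassificationTask()
        .set_model(my_model)
        .set_loss(my_loss)
        .set_optimizer(my_optimizer)
        .set_num_epochs(90)
        .set_dataset(my_train_dataset, "train")
        .set_dataset(my_test_dataset, "test")
    )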
- advance_phase()¶
Performs bookkeeping / task updates between phases
Increments phase idx, resets meters, resets loss history, resets counters, shuffles dataset, rebuilds iterators, and sets the train / test state for phase.
- build_dataloader_from_dataset(dataset, **kwargs)¶
Builds a dataloader from the provided dataset
- Parameters
dataset – A ClassyDataset
kwargs – Additional kwargs to pass during dataloader construction for derived classes
- build_dataloaders_for_current_phase()¶
Builds dataloader(s) for the current phase.
Deriving classes can override this method to support custom behavior, like supporting multiple dataloaders in parallel.
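As a hedged sketch, a hypothetical subclass could override this hook and delegate the actual dataloader construction to the parent implementation:
    from classy_vision.tasks import ClassificationTask, register_task

    @register_task("my_classification_task")
    class MyClassificationTask(ClassificationTask):
        def build_dataloaders_for_current_phase(self):
            # Custom per-phase behavior could go here (e.g. logging or
            # selecting between several dataloaders); the default
            # construction is left to the parent class.
            super().build_dataloaders_for_current_phase()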
- create_data_iterators()¶
Creates data iterator(s) for the current phase.
- done_training()¶
Stop condition for training
- property eval_phase_idx¶
Returns current evaluation phase
- eval_step()¶
Run an evaluation step.
This corresponds to evaluating the model over one batch of data.
- classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.classification_task.ClassificationTask¶
Instantiates a ClassificationTask from a configuration.
- Parameters
config – A configuration for a ClassificationTask. See
__init__()
for parameters expected in the config.
- Returns
A ClassificationTask instance.
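An illustrative config of roughly the shape from_config expects; the component names here (“my_model”, “my_loss”, “my_dataset”) are hypothetical and must correspond to registered Classy Vision components in your code:
    from classy_vision.tasks import ClassificationTask

    config = {
        "name": "classification_task",
        "num_epochs": 90,
        "model": {"name": "my_model"},        # any registered ClassyModel config
        "loss": {"name": "my_loss"},          # any registered ClassyLoss config
        "optimizer": {"name": "sgd"},         # any registered ClassyOptimizer config
        "dataset": {
            "train": {"name": "my_dataset"},  # registered ClassyDataset configs
            "test": {"name": "my_dataset"},
        },
    }
    task = ClassificationTask.from_config(config)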
- get_batchsize_per_replica()¶
Return local replica’s batchsize for dataset (e.g. batchsize per GPU)
- get_classy_state(deep_copy: bool = False)¶
Returns serializable state of task
- Parameters
deep_copy – If true, does a deep copy of state before returning.
- get_global_batchsize()¶
Return global batchsize across all trainers
- get_total_test_phases()¶
Returns the total number of “test” phases in the task
- get_total_training_phases()¶
Returns the total number of “train” phases in the task
- init_distributed_data_parallel_model()¶
Initialize torch.nn.parallel.distributed.DistributedDataParallel.
Needed for distributed training. This is where a model should be wrapped by DDP.
- property loss¶
Returns loss used in training (can be wrapped with DDP)
- property model¶
Returns model used in training (can be wrapped with DDP)
- property num_batches_per_phase¶
Returns number of batches in current phase iterator
- on_end()¶
Training end.
Called by
classy_vision.trainer.ClassyTrainer
after training ends.
- on_phase_end()¶
Epoch end.
Called by
classy_vision.trainer.ClassyTrainer
after each epoch ends.
- on_phase_start()¶
Epoch start.
Called by
classy_vision.trainer.ClassyTrainer
before each epoch starts.
- on_start()¶
Start training.
Called by
classy_vision.trainer.ClassyTrainer
before training starts.
- property phase_type¶
Returns current phase type. String with value “train” or “test”
- prepare()¶
Prepares task for training, populates all derived attributes
- run_optimizer(loss)¶
Runs the backwards pass and updates the optimizer
- set_amp_args(amp_args: Optional[Dict[str, Any]])¶
Disable / enable apex.amp and set the automatic mixed precision parameters.
apex.amp can be utilized for mixed / half precision training.
- Parameters
amp_args – Dictionary containing arguments to be passed to amp.initialize. Set to None to disable amp. To enable mixed precision training, pass amp_args={"opt_level": "O1"} here. See https://nvidia.github.io/apex/amp.html for more info.
- Raises
RuntimeError – If opt_level is not None and apex is not installed.
Warning: apex needs to be installed to utilize this feature.
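For example, following the docstring above (apex must be installed for this to work):
    # Enable apex.amp mixed precision training at opt_level O1
    task.set_amp_args({"opt_level": "O1"})

    # Disable amp
    task.set_amp_args(None)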
- set_checkpoint(checkpoint_path: str)¶
Sets checkpoint on task.
- Parameters
checkpoint_path – The path to load the checkpoint from. Can be a file or a directory. See load_checkpoint() for more information.
- set_checkpoint_load_strict(checkpoint_load_strict: bool)¶
Sets the checkpoint_load_strict flag on the task.
- Parameters
checkpoint_load_strict – Whether to use load_strict when copying model weights
- set_classy_state(state)¶
Set task state
- Parameters
state – Dict containing state of a task
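Together with get_classy_state(), this enables a simple checkpoint round trip; a minimal sketch (torch.save / torch.load are used here only as one way to serialize the dict):
    import torch

    # Take a deep-copied snapshot of the task state and persist it
    state = task.get_classy_state(deep_copy=True)
    torch.save(state, "checkpoint.pt")

    # ... later, restore the state into an equivalent, prepared task
    task.set_classy_state(torch.load("checkpoint.pt"))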
- set_clip_grad_norm(clip_grad_norm: Optional[float])¶
Sets maximum gradient norm.
None means gradient clipping is disabled. Defaults to None.
- set_dataloader_mp_context(dataloader_mp_context: Optional[str])¶
Set the multiprocessing context used by the dataloader.
The context can be either ‘spawn’, ‘fork’, ‘forkserver’ or None (uses the default context). See https://docs.python.org/3/library/multiprocessing.html#multiprocessing.get_context for more details.
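For example:
    # Use the "spawn" start method for dataloader worker processes
    task.set_dataloader_mp_context("spawn")

    # Revert to the platform default context
    task.set_dataloader_mp_context(None)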
- set_dataset(dataset: classy_vision.dataset.classy_dataset.ClassyDataset, phase_type: str)¶
Set dataset for phase type on task
- Parameters
dataset – ClassyDataset for returning samples.
phase_type – Must be one of “train” or “test”
- set_distributed_options(broadcast_buffers_mode: classy_vision.tasks.classification_task.BroadcastBuffersMode = BroadcastBuffersMode.BEFORE_EVAL, batch_norm_sync_mode: classy_vision.tasks.classification_task.BatchNormSyncMode = BatchNormSyncMode.DISABLED, batch_norm_sync_group_size: int = 0, find_unused_parameters: bool = False, bucket_cap_mb: int = 25, fp16_grad_compress: bool = False)¶
Set distributed options.
- Parameters
broadcast_buffers_mode – Broadcast buffers mode. See BroadcastBuffersMode for options.
batch_norm_sync_mode – Batch normalization synchronization mode. See BatchNormSyncMode for options.
batch_norm_sync_group_size – Group size to use for synchronized batch norm. 0 means that the stats are synchronized across all replicas. For efficient synchronization, set it to the number of GPUs in a node (usually 8).
find_unused_parameters – See torch.nn.parallel.DistributedDataParallel for information.
bucket_cap_mb – See torch.nn.parallel.DistributedDataParallel for information.
- Raises
RuntimeError – If batch_norm_sync_mode is BatchNormSyncMode.APEX and apex is not installed.
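An illustrative call, importing the enums from the module shown in the signature (BatchNormSyncMode.PYTORCH is assumed here to be the native PyTorch sync mode):
    from classy_vision.tasks.classification_task import (
        BatchNormSyncMode,
        BroadcastBuffersMode,
    )

    # Sync batch norm stats within each 8-GPU node and broadcast buffers before eval
    task.set_distributed_options(
        broadcast_buffers_mode=BroadcastBuffersMode.BEFORE_EVAL,
        batch_norm_sync_mode=BatchNormSyncMode.PYTORCH,
        batch_norm_sync_group_size=8,
    )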
- set_hooks(hooks: List[classy_vision.hooks.classy_hook.ClassyHook])¶
Set hooks for task
- Parameters
hooks – List of hooks to apply during training
- set_loss(loss: classy_vision.losses.classy_loss.ClassyLoss)¶
Set loss function for task
- Parameters
loss – loss for task
- set_meters(meters: List[classy_vision.meters.classy_meter.ClassyMeter])¶
Set meters for task
- Parameters
meters – list of meters to compute during training
- set_mixup_transform(mixup_transform: Optional[classy_vision.dataset.transforms.mixup.MixupTransform])¶
Disable / enable mixup transform for data augmentation
- Parameters
mixup_transform – A callable object which performs mixup data augmentation
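A sketch, assuming MixupTransform takes a mixing parameter alpha and the number of classes (check the constructor signature in your version):
    from classy_vision.dataset.transforms.mixup import MixupTransform

    # alpha and num_classes are assumed constructor arguments here
    task.set_mixup_transform(MixupTransform(alpha=0.2, num_classes=1000))

    # Disable mixup
    task.set_mixup_transform(None)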
- set_model(model: classy_vision.models.classy_model.ClassyModel)¶
Set model for task
- Parameters
model – Model to be trained
- set_num_epochs(num_epochs: Union[int, float])¶
Set number of epochs to be run.
- Parameters
num_epochs – Number of epochs to run task
- set_optimizer(optimizer: classy_vision.optim.classy_optimizer.ClassyOptimizer)¶
Set optimizer for task
- Parameters
optimizer – optimizer for task
- set_simulated_global_batchsize(simulated_global_batchsize: Optional[int])¶
Sets a simulated batch size by gradient accumulation.
Gradient accumulation adds up gradients from multiple minibatches and steps the optimizer every N train_steps, where N is optimizer_period. When enabled, the very last train_steps might end up not updating the model, depending on the number of total steps. None means gradient accumulation is disabled. Defaults to None.
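For example, with a per-replica batch size of 32 on 8 replicas (global batch size 256), a simulated global batch size of 1024 gives an optimizer_period of 4, i.e. gradients from 4 consecutive train steps are accumulated before each optimizer update. A sketch of that arithmetic (not the library's internal code; the simulated batch size is assumed to be a multiple of the global batch size):
    global_batchsize = 32 * 8          # batchsize_per_replica * num_replicas = 256
    simulated_global_batchsize = 1024

    optimizer_period = simulated_global_batchsize // global_batchsize  # -> 4

    task.set_simulated_global_batchsize(simulated_global_batchsize)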
- set_test_only(test_only: bool)¶
Set test only flag
- Parameters
test_only – If true, only test phases will be run
- set_test_phase_period(test_phase_period: int)¶
Set the period of test phase.
- Parameters
test_phase_period – The period of test phase
- synchronize_losses()¶
Average the losses across the different replicas
- train_step()¶
Train step to be executed in train loop.
- property where¶
Returns the proportion of training that has completed. If in test-only mode, returns the proportion of testing that has completed.
Returned value is a float in the range [0, 1)
- class classy_vision.tasks.ClassyTask¶
An abstract base class for a training task.
A ClassyTask encapsulates all the components and steps needed to train using a
classy_vision.trainer.ClassyTrainer.
- __init__() → classy_vision.tasks.classy_task.ClassyTask¶
Constructs a ClassyTask.
- abstract done_training() → bool¶
Tells if we are done training.
- Returns
A boolean telling if training is over.
- abstract eval_step() → None¶
Run an evaluation step.
This corresponds to evaluating the model over one batch of data.
- abstract classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.classy_task.ClassyTask¶
Instantiates a ClassyTask from a configuration.
- Parameters
config – A configuration for a ClassyTask.
- Returns
A ClassyTask instance.
- abstract get_classy_state(deep_copy: bool = False) → Dict[str, Any]¶
Get the state of the ClassyTask.
The returned state is used for checkpointing.
- Parameters
deep_copy – If True, creates a deep copy of the state dict. Otherwise, the returned dict’s state will be tied to the object’s.
- Returns
A state dictionary containing the state of the task.
- abstract on_end()¶
Training end.
Called by
classy_vision.trainer.ClassyTrainer
after training ends.
- abstract on_phase_end()¶
Epoch end.
Called by
classy_vision.trainer.ClassyTrainer
after each epoch ends.
- abstract on_phase_start()¶
Epoch start.
Called by
classy_vision.trainer.ClassyTrainer
before each epoch starts.
- abstract on_start()¶
Start training.
Called by
classy_vision.trainer.ClassyTrainer
before training starts.
- abstract prepare(num_dataloader_workers=0, dataloader_mp_context=None) → None¶
Prepares the task for training.
Will be called by the
classy_vision.trainer.ClassyTrainer
to prepare the task, before on_start is called.
- Parameters
num_dataloader_workers – Number of workers to create for the dataloaders
pin_memory – Whether the dataloaders should copy the Tensors into CUDA pinned memory (default False)
- run_hooks(local_variables: Dict[str, Any], hook_function: str) → None¶
Helper function that runs a hook function for all the
classy_vision.hooks.ClassyHook
objects.
- Parameters
local_variables – Local variables created in train_step()
hook_function – One of the hook functions in the classy_vision.hooks.ClassyHookFunctions enum.
- abstract set_classy_state(state)¶
Set the state of the ClassyTask.
- Parameters
state – The state dictionary. Must be the output of a call to get_classy_state().
This is used to load the state of the task from a checkpoint.
- class classy_vision.tasks.FineTuningTask(*args, **kwargs)¶
- __init__(*args, **kwargs)¶
Constructs a ClassificationTask
- classmethod from_config(config: Dict[str, Any]) → classy_vision.tasks.fine_tuning_task.FineTuningTask¶
Instantiates a FineTuningTask from a configuration.
- Parameters
config – A configuration for a FineTuningTask. See
__init__()
for parameters expected in the config.
- Returns
A FineTuningTask instance.
- classy_vision.tasks.build_task(config)¶
Builds a ClassyTask from a config.
This assumes a ‘name’ key in the config which is used to determine what task class to instantiate. For instance, a config {“name”: “my_task”, “foo”: “bar”} will find a class that was registered as “my_task” (see
register_task()
) and call .from_config on it.
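Continuing the example from the docstring, a usage sketch (assuming a task was registered under the name “my_task”):
    from classy_vision.tasks import build_task

    task = build_task({"name": "my_task", "foo": "bar"})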
- classy_vision.tasks.register_task(name)¶
Registers a ClassyTask subclass.
This decorator allows Classy Vision to instantiate a subclass of ClassyTask from a configuration file, even if the class itself is not part of the Classy Vision framework. To use it, apply this decorator to a ClassyTask subclass, like this:
    @register_task('my_task')
    class MyTask(ClassyTask):
        ...
To instantiate a task from a configuration file, see
build_task()
.