Optimizers

class classy_vision.optim.Adam(lr: float = 0.1, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.0, amsgrad: bool = False)
__init__(lr: float = 0.1, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.0, amsgrad: bool = False) None

Constructor for ClassyOptimizer.

Variables
  • options_view – provides convenient access to current values of learning rate, momentum etc.

  • _param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.

classmethod from_config(config: Dict[str, Any]) classy_vision.optim.adam.Adam

Instantiates an Adam optimizer from a configuration.

Parameters

config – A configuration for an Adam optimizer. See __init__() for parameters expected in the config.

Returns

An Adam instance.
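
A hedged sketch of a config that from_config() could consume; the keys mirror the __init__() parameters above, and omitted keys fall back to their defaults:

from classy_vision.optim import Adam

adam_config = {"lr": 0.001, "weight_decay": 1e-4}  # illustrative values
optimizer = Adam.from_config(adam_config)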

prepare(param_groups) None

Prepares the optimizer for training.

Deriving classes should initialize the underlying PyTorch torch.optim.Optimizer in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).

Warning

This should be called only after the model has been moved to the correct device.

class classy_vision.optim.AdamW(lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.01, amsgrad: bool = False)
__init__(lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.01, amsgrad: bool = False) None

Constructor for ClassyOptimizer.

Variables
  • options_view – provides convenient access to current values of learning rate, momentum etc.

  • _param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.

classmethod from_config(config: Dict[str, Any]) classy_vision.optim.adamw.AdamW

Instantiates an AdamW optimizer from a configuration.

Parameters

config – A configuration for AdamW. See __init__() for parameters expected in the config.

Returns

An AdamW instance.

prepare(param_groups) None

Prepares the optimizer for training.

Deriving classes should initialize the underlying PyTorch torch.optim.Optimizer in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).

Warning

This should be called only after the model has been moved to the correct device.

class classy_vision.optim.ClassyOptimizer

Base class for optimizers.

This wraps a torch.optim.Optimizer instance and provides support for parameter scheduling. Typical PyTorch optimizers are used like this:

optim = SGD(model.parameters(), lr=0.1)

but the user is responsible for updating lr over the course of training. ClassyOptimizers extend PyTorch optimizers and allow specifying ParamSchedulers instead:

optim = SGD()
optim.set_param_groups(model.parameters(), lr=LinearParamScheduler(1, 2))

This means that as you step through the optimizer, the learning rate will automatically get updated with the given schedule. To access the current learning rate value (or any other optimizer option), you can read optim.options_view.lr. Similar to other Classy abstractions, you can also instantiate ClassyOptimizers from a configuration file.
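
For example, a minimal sketch, assuming model is an already-constructed module on the correct device and that LinearParamScheduler is importable from classy_vision.optim.param_scheduler:

from classy_vision.optim import SGD
from classy_vision.optim.param_scheduler import LinearParamScheduler

optim = SGD()
optim.set_param_groups(model.parameters(), lr=LinearParamScheduler(0.1, 0.01))
print(optim.options_view.lr)  # learning rate for the current point in training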

__init__() None

Constructor for ClassyOptimizer.

Variables
  • options_view – provides convenient access to current values of learning rate, momentum etc.

  • _param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.

classmethod from_config(config: Dict[str, Any]) classy_vision.optim.classy_optimizer.ClassyOptimizer

Instantiates a ClassyOptimizer from a configuration.

Parameters

config – A configuration for the ClassyOptimizer.

Returns

A ClassyOptimizer instance.

get_classy_state() Dict[str, Any]

Get the state of the ClassyOptimizer.

The returned state is used for checkpointing.

Returns

A state dictionary containing the state of the optimizer.

on_epoch(where: float) None

Called at the end of a phase.

Updates the param schedule at the end of a phase, while training is in progress. This should be called by the task at the end of every epoch to update the schedule of epoch-based param schedulers (see param_scheduler.ParamScheduler for more information).

Parameters

where – where we are in terms of training progress (output of tasks.ClassyTask.where())

abstract prepare(param_groups)

Prepares the optimizer for training.

Deriving classes should initialize the underlying PyTorch torch.optim.Optimizer in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).

Warning

This should be called only after the model has been moved to the correct device.

set_classy_state(state: Dict[str, Any]) None

Set the state of the ClassyOptimizer.

Parameters

state – The state dictionary. Must be the output of a call to get_classy_state().

This is used to load the state of the optimizer from a checkpoint.

set_param_groups(param_groups, **kwargs)

Specifies what parameters will be optimized.

This is the public API where users of ClassyOptimizer can specify what parameters will get optimized. Unlike PyTorch optimizers, we don’t require the list of param_groups in the constructor.

Parameters

param_groups – this is either a list of Tensors (e.g. model.parameters()) or a list of dictionaries. If a dictionary, must contain a key “params” having the same format and semantics as PyTorch.
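
A sketch of per-group options (the backbone and head submodule names are hypothetical; per-group overrides are assumed to follow the same semantics as PyTorch param groups):

optim = SGD()
optim.set_param_groups(
    [
        {"params": model.backbone.parameters()},
        {"params": model.head.parameters(), "lr": 0.01},  # per-group override
    ],
    lr=0.1,       # applied to groups that don't override it
    momentum=0.9,
)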

step(*args, closure: Optional[Callable] = None, where: Optional[float] = None) None

Perform the optimization updates for a given training step.

The optimization options (such as the learning rate) used during this step correspond to the where value given as an argument to this function. The exact values used can be read via the optimizer's options_view property.

Parameters

where – where we are in terms of training progress (output of ClassyTask.where()). Must be a float in the [0, 1) interval; this dictates parameter scheduling.
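
A minimal training-step sketch (model, criterion, inputs, targets and task are assumed to exist, with task.where() returning progress in [0, 1)):

optim.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optim.step(where=task.where())  # options are scheduled for this point in training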

zero_grad()

Clears the gradients of all optimized parameters.

See torch.optim.Optimizer.zero_grad for more information.

class classy_vision.optim.RMSProp(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False)
__init__(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False) None

Constructor for ClassyOptimizer.

Variables
  • options_view – provides convenient access to current values of learning rate, momentum etc.

  • _param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.

classmethod from_config(config: Dict[str, Any]) classy_vision.optim.rmsprop.RMSProp

Instantiates an RMSProp optimizer from a configuration.

Parameters

config – A configuration for an RMSProp optimizer. See __init__() for parameters expected in the config.

Returns

An RMSProp instance.

prepare(param_groups)

Prepares the optimizer for training.

Deriving classes should initialize the underlying PyTorch torch.optim.Optimizer in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).

Warning

This should be called only after the model has been moved to the correct device.

class classy_vision.optim.RMSPropTF(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False)
__init__(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False) None

Constructor for ClassyOptimizer.

Variables
  • options_view – provides convenient access to current values of learning rate, momentum etc.

  • _param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.

classmethod from_config(config: Dict[str, Any]) classy_vision.optim.rmsprop_tf.RMSPropTF

Instantiates an RMSPropTF optimizer from a configuration.

Parameters

config – A configuration for an RMSPropTF optimizer. See __init__() for parameters expected in the config.

Returns

An RMSPropTF instance.

prepare(param_groups)

Prepares the optimizer for training.

Deriving classes should initialize the underlying PyTorch torch.optim.Optimizer in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).

Warning

This should be called only after the model has been moved to the correct device.

class classy_vision.optim.SGD(larc_config: Optional[Dict[str, Any]] = None, lr: float = 0.1, momentum: float = 0.0, weight_decay: float = 0.0, nesterov: bool = False, use_larc: bool = False)
__init__(larc_config: Optional[Dict[str, Any]] = None, lr: float = 0.1, momentum: float = 0.0, weight_decay: float = 0.0, nesterov: bool = False, use_larc: bool = False)

Constructor for ClassyOptimizer.

Variables
  • options_view – provides convenient access to current values of learning rate, momentum etc.

  • _param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.

classmethod from_config(config: Dict[str, Any]) classy_vision.optim.sgd.SGD

Instantiates an SGD optimizer from a configuration.

Parameters

config – A configuration for an SGD optimizer. See __init__() for parameters expected in the config.

Returns

An SGD instance.

prepare(param_groups)

Prepares the optimizer for training.

Deriving classes should initialize the underlying PyTorch torch.optim.Optimizer in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).

Warning

This should be called only after the model has been moved to the correct device.

class classy_vision.optim.ZeRO(base_optimizer: classy_vision.optim.classy_optimizer.ClassyOptimizer)
__init__(base_optimizer: classy_vision.optim.classy_optimizer.ClassyOptimizer)

Wraps an arbitrary ClassyOptimizer and shards its optimizer state as described by ZeRO.

opt = OSS(params, optim=torch.optim.Adam, lr=0.01)

This instance holds all of the model's parameters (in the .param_groups attribute) but relies on a wrapped optimizer, which only processes its own shard of the parameters. At every step, all the parameters are synced across the replicas. The fairscale library is used: https://github.com/facebookresearch/fairscale
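
A hedged sketch of wrapping an optimizer (this assumes torch.distributed is already initialized, which the sharded wrapper requires):

from classy_vision.optim import SGD, ZeRO

base = SGD(lr=0.1, momentum=0.9)
optim = ZeRO(base_optimizer=base)
optim.set_param_groups(model.parameters())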

classmethod from_config(config)

Instantiates a ClassyOptimizer from a configuration.

Parameters

config – A configuration for the ClassyOptimizer.

Returns

A ClassyOptimizer instance.

on_epoch(where: float) None

Called at the end of a phase.

Updates the param schedule at the end of a phase, while training is in progress. This should be called by the task at the end of every epoch to update the schedule of epoch-based param schedulers (see param_scheduler.ParamScheduler for more information).

Parameters

where – where we are in terms of training progress (output of tasks.ClassyTask.where())

prepare(param_groups) None

Prepares the optimizer for training.

Deriving classes should initialize the underlying PyTorch torch.optim.Optimizer in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).

Warning

This should be called only after the model has been moved to the correct device.

classy_vision.optim.build_optimizer(config)

Builds a ClassyOptimizer from a config.

This assumes a ‘name’ key in the config which is used to determine what optimizer class to instantiate. For instance, a config {“name”: “my_optimizer”, “foo”: “bar”} will find a class that was registered as “my_optimizer” (see register_optimizer()) and call .from_config on it.

Also builds the param schedulers passed in the config and associates them with the optimizer. The config may contain an optional "param_schedulers" key: a dictionary of param scheduler configs, keyed by the parameter they control. "num_epochs" is added to each of the scheduler configs before build_param_scheduler() is called on each config in the dictionary.
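
A hedged config sketch; the registered names and scheduler keys below are illustrative, and "num_epochs" is assumed to be supplied in the config so it can be forwarded to the scheduler configs:

from classy_vision.optim import build_optimizer

config = {
    "name": "sgd",
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "num_epochs": 90,
    "param_schedulers": {
        "lr": {"name": "constant", "value": 0.1},
    },
}
optimizer = build_optimizer(config)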

classy_vision.optim.register_optimizer(name, bypass_checks=False)

Registers a ClassyOptimizer subclass.

This decorator allows Classy Vision to instantiate a subclass of ClassyOptimizer from a configuration file, even if the class itself is not part of the Classy Vision framework. To use it, apply this decorator to a ClassyOptimizer subclass, like this:

@register_optimizer('my_optimizer')
class MyOptimizer(ClassyOptimizer):
    ...

To instantiate an optimizer from a configuration file, see build_optimizer().
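
Once registered, the optimizer can be built by name from a config; a hedged sketch, assuming MyOptimizer.from_config() accepts the keys shown:

optim = build_optimizer({"name": "my_optimizer", "lr": 0.1})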