Optimizers¶
- class classy_vision.optim.Adam(lr: float = 0.1, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.0, amsgrad: bool = False)¶
- __init__(lr: float = 0.1, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.0, amsgrad: bool = False) None ¶
Constructor for ClassyOptimizer.
- Variables
options_view – provides convenient access to current values of learning rate, momentum etc.
_param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.
- classmethod from_config(config: Dict[str, Any]) classy_vision.optim.adam.Adam ¶
Instantiates an Adam optimizer from a configuration.
- Parameters
config – A configuration for an Adam optimizer. See
__init__()
for parameters expected in the config.
- Returns
An Adam instance.
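For illustration, a minimal (hypothetical) config whose keys mirror the __init__ parameters above; omitted keys fall back to the defaults:
from classy_vision.optim import Adam

config = {
    "lr": 0.001,
    "weight_decay": 1e-4,
    "amsgrad": False,
}
optimizer = Adam.from_config(config)  # roughly equivalent to Adam(lr=0.001, weight_decay=1e-4, amsgrad=False)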
- prepare(param_groups) None ¶
Prepares the optimizer for training.
Deriving classes should initialize the underlying PyTorch
torch.optim.Optimizer
in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).
Warning
This should be called only after the model has been moved to the correct device.
- class classy_vision.optim.AdamW(lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.01, amsgrad: bool = False)¶
- __init__(lr: float = 0.001, betas: Tuple[float, float] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.01, amsgrad: bool = False) None ¶
Constructor for ClassyOptimizer.
- Variables
options_view – provides convenient access to current values of learning rate, momentum etc.
_param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.
- classmethod from_config(config: Dict[str, Any]) classy_vision.optim.adamw.AdamW ¶
Instantiates an AdamW optimizer from a configuration.
- Parameters
config – A configuration for AdamW. See
__init__()
for parameters expected in the config.
- Returns
An AdamW instance.
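For illustration, a minimal (hypothetical) config; note that AdamW's decoupled weight_decay defaults to 0.01, unlike Adam's 0.0:
from classy_vision.optim import AdamW

optimizer = AdamW.from_config({"lr": 3e-4, "weight_decay": 0.05})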
- prepare(param_groups) None ¶
Prepares the optimizer for training.
Deriving classes should initialize the underlying PyTorch
torch.optim.Optimizer
in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).
Warning
This should be called only after the model has been moved to the correct device.
- class classy_vision.optim.ClassyOptimizer¶
Base class for optimizers.
This wraps a
torch.optim.Optimizer
instance and provides support for parameter scheduling. Typical PyTorch optimizers are used like this:
optim = SGD(model.parameters(), lr=0.1)
but the user is responsible for updating lr over the course of training. ClassyOptimizers extend PyTorch optimizers and allow specifying ParamSchedulers instead:
optim = SGD()
optim.set_param_groups(model.parameters(), lr=LinearParamScheduler(1, 2))
This means that as you step through the optimizer, the learning rate will automatically get updated with the given schedule. To access the current learning rate value (or any other optimizer option), you can read optim.options_view.lr. Similar to other Classy abstractions, you can also instantiate ClassyOptimizers from a configuration file.
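A minimal end-to-end sketch, assuming a toy torch.nn model (the model and import paths are illustrative, not part of this API reference):
import torch
from classy_vision.optim import SGD
from classy_vision.optim.param_scheduler import LinearParamScheduler

model = torch.nn.Linear(16, 4)
optim = SGD(momentum=0.9)
# ParamSchedulers can be passed wherever a constant option value is accepted
optim.set_param_groups(model.parameters(), lr=LinearParamScheduler(0.1, 0.01))
print(optim.options_view.lr)  # current learning rate under the schedule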
- __init__() None ¶
Constructor for ClassyOptimizer.
- Variables
options_view – provides convenient access to current values of learning rate, momentum etc.
_param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.
- classmethod from_config(config: Dict[str, Any]) classy_vision.optim.classy_optimizer.ClassyOptimizer ¶
Instantiates a ClassyOptimizer from a configuration.
- Parameters
config – A configuration for the ClassyOptimizer.
- Returns
A ClassyOptimizer instance.
- get_classy_state() Dict[str, Any] ¶
Get the state of the ClassyOptimizer.
The returned state is used for checkpointing.
- Returns
A state dictionary containing the state of the optimizer.
- on_epoch(where: float) None ¶
Called at the end of a phase.
Updates the param schedule at the end of a phase, while training is in progress. This should be called by the task at the end of every epoch to update the schedule of epoch-based param schedulers (see
param_scheduler.ParamScheduler
for more information).
- Parameters
where – where we are in terms of training progress (output of
tasks.ClassyTask.where()
)
- abstract prepare(param_groups)¶
Prepares the optimizer for training.
Deriving classes should initialize the underlying PyTorch
torch.optim.Optimizer
in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).
Warning
This should be called only after the model has been moved to the correct device.
- set_classy_state(state: Dict[str, Any]) None ¶
Set the state of the ClassyOptimizer.
- Parameters
state – The state dictionary. Must be the output of a call to
get_classy_state().
This is used to load the state of the optimizer from a checkpoint.
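A short sketch of checkpointing with these two methods, continuing the optim from the sketch above (the file path is illustrative):
import torch

state = optim.get_classy_state()
torch.save(state, "optim_checkpoint.pt")    # persist alongside the model state

restored_state = torch.load("optim_checkpoint.pt")
optim.set_classy_state(restored_state)      # resume the optimizer from the checkpoint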
- set_param_groups(param_groups, **kwargs)¶
Specifies what parameters will be optimized.
This is the public API where users of ClassyOptimizer can specify what parameters will get optimized. Unlike PyTorch optimizers, we don’t require the list of param_groups in the constructor.
- Parameters
param_groups – this is either a list of Tensors (e.g. model.parameters()) or a list of dictionaries. If a dictionary, must contain a key “params” having the same format and semantics as PyTorch.
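A hedged sketch of per-group options using the PyTorch param-group dict format (the two-part model below is illustrative):
import torch
from classy_vision.optim import SGD

model = torch.nn.ModuleDict({
    "backbone": torch.nn.Linear(16, 8),
    "head": torch.nn.Linear(8, 4),
})
optim = SGD(momentum=0.9)
optim.set_param_groups([
    {"params": model["backbone"].parameters(), "lr": 0.01},  # lower lr for the backbone
    {"params": model["head"].parameters(), "lr": 0.1},       # higher lr for the head
])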
- step(*args, closure: Optional[Callable] = None, where: Optional[float] = None) None ¶
Perform the optimization updates for a given training step.
The optimization options (such as learning rate) used during this step correspond to the where value given as an argument to this function. The exact values used can be read via the optimizer's options_view property.
- Parameters
where – where we are in terms of training progress (output of ClassyTask.where()). Must be a float in the [0, 1) interval; this dictates parameter scheduling.
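A hedged training-loop sketch; compute_loss is an assumed user-provided function, and where is the fraction of training completed so far:
total_steps = 10 * 100  # e.g. 10 epochs of 100 steps each
for step_idx in range(total_steps):
    optim.zero_grad()
    loss = compute_loss()                     # assumed: forward pass returning a scalar loss
    loss.backward()
    optim.step(where=step_idx / total_steps)  # param schedulers read this progress value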
- zero_grad()¶
Clears the gradients of all optimized parameters.
See torch.optim.Optimizer.zero_grad for more information.
- class classy_vision.optim.RMSProp(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False)¶
- __init__(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False) None ¶
Constructor for ClassyOptimizer.
- Variables
options_view – provides convenient access to current values of learning rate, momentum etc.
_param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.
- classmethod from_config(config: Dict[str, Any]) classy_vision.optim.rmsprop.RMSProp ¶
Instantiates an RMSProp optimizer from a configuration.
- Parameters
config – A configuration for an RMSProp optimizer. See
__init__()
for parameters expected in the config.
- Returns
An RMSProp instance.
- prepare(param_groups)¶
Prepares the optimizer for training.
Deriving classes should initialize the underlying PyTorch
torch.optim.Optimizer
in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).
Warning
This should be called only after the model has been moved to the correct device.
- class classy_vision.optim.RMSPropTF(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False)¶
- __init__(lr: float = 0.1, momentum: float = 0, weight_decay: float = 0, alpha: float = 0.99, eps: float = 1e-08, centered: bool = False) None ¶
Constructor for ClassyOptimizer.
- Variables
options_view – provides convenient access to current values of learning rate, momentum etc.
_param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.
- classmethod from_config(config: Dict[str, Any]) classy_vision.optim.rmsprop_tf.RMSPropTF ¶
Instantiates an RMSPropTF optimizer from a configuration.
- Parameters
config – A configuration for an RMSPropTF optimizer. See
__init__()
for parameters expected in the config.
- Returns
An RMSPropTF instance.
- prepare(param_groups)¶
Prepares the optimizer for training.
Deriving classes should initialize the underlying PyTorch
torch.optim.Optimizer
in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).
Warning
This should be called only after the model has been moved to the correct device.
- class classy_vision.optim.SGD(larc_config: Optional[Dict[str, Any]] = None, lr: float = 0.1, momentum: float = 0.0, weight_decay: float = 0.0, nesterov: bool = False, use_larc: bool = False)¶
- __init__(larc_config: Optional[Dict[str, Any]] = None, lr: float = 0.1, momentum: float = 0.0, weight_decay: float = 0.0, nesterov: bool = False, use_larc: bool = False)¶
Constructor for ClassyOptimizer.
- Variables
options_view – provides convenient access to current values of learning rate, momentum etc.
_param_group_schedulers – list of dictionaries in the param_groups format, containing all ParamScheduler instances needed. Constant values are converted to ConstantParamScheduler before being inserted here.
- classmethod from_config(config: Dict[str, Any]) classy_vision.optim.sgd.SGD ¶
Instantiates an SGD optimizer from a configuration.
- Parameters
config – A configuration for an SGD optimizer. See
__init__()
for parameters expected in the config.
- Returns
An SGD instance.
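For illustration, a minimal (hypothetical) config; keys mirror the __init__ parameters above, with the LARC options omitted:
from classy_vision.optim import SGD

optimizer = SGD.from_config({
    "lr": 0.1,
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "nesterov": True,
})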
- prepare(param_groups)¶
Prepares the optimizer for training.
Deriving classes should initialize the underlying PyTorch
torch.optim.Optimizer
in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).
Warning
This should be called only after the model has been moved to the correct device.
- class classy_vision.optim.ZeRO(base_optimizer: classy_vision.optim.classy_optimizer.ClassyOptimizer)¶
- __init__(base_optimizer: classy_vision.optim.classy_optimizer.ClassyOptimizer)¶
Wraps an arbitrary
ClassyOptimizer
optimizer and shards its state as described by ZeRO.
opt = OSS(params, optim=torch.optim.Adam, lr=0.01)
This instance holds all of the parameters for the model (in the .param_groups attribute) but relies on a wrapped optimizer, which only processes a shard of the parameters. At every step, all the parameters are synced across the replicas. The Fairscale library (https://github.com/facebookresearch/fairscale) is used.
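A hedged usage sketch, assuming fairscale is installed, a torch.distributed process group has already been initialized, and an illustrative model:
import torch
from classy_vision.optim import SGD, ZeRO

model = torch.nn.Linear(16, 4)
base = SGD(momentum=0.9, weight_decay=1e-4)
optimizer = ZeRO(base_optimizer=base)  # shards the base optimizer's state across replicas
optimizer.set_param_groups(model.parameters(), lr=0.1)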
- classmethod from_config(config)¶
Instantiates a ClassyOptimizer from a configuration.
- Parameters
config – A configuration for the ClassyOptimizer.
- Returns
A ClassyOptimizer instance.
- on_epoch(where: float) None ¶
Called at the end of a phase.
Updates the param schedule at the end of a phase, while training is in progress. This should be called by the task at the end of every epoch to update the schedule of epoch-based param schedulers (see
param_scheduler.ParamScheduler
for more information).
- Parameters
where – where we are in terms of training progress (output of
tasks.ClassyTask.where()
)
- prepare(param_groups) None ¶
Prepares the optimizer for training.
Deriving classes should initialize the underlying PyTorch
torch.optim.Optimizer
in this call. The param_groups argument follows the same format supported by PyTorch (list of parameters, or list of param group dictionaries).
Warning
This should be called only after the model has been moved to the correct device.
- classy_vision.optim.build_optimizer(config)¶
Builds a ClassyOptimizer from a config.
This assumes a ‘name’ key in the config which is used to determine what optimizer class to instantiate. For instance, a config {“name”: “my_optimizer”, “foo”: “bar”} will find a class that was registered as “my_optimizer” (see
register_optimizer()
) and call .from_config on it.
Also builds the param schedulers passed in the config and associates them with the optimizer. The config should contain an optional “param_schedulers” key containing a dictionary of param scheduler configs, keyed by the parameter they control. Adds “num_epochs” to each of the scheduler configs and then calls
build_param_scheduler()
on each config in the dictionary.
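An illustrative config combining an optimizer with an lr scheduler (the "cosine" scheduler name and its keys are assumptions about the registered param schedulers; "num_epochs" is normally filled in by the task):
from classy_vision.optim import build_optimizer

config = {
    "name": "sgd",
    "momentum": 0.9,
    "weight_decay": 1e-4,
    "num_epochs": 90,
    "param_schedulers": {
        "lr": {"name": "cosine", "start_value": 0.1, "end_value": 0.0},
    },
}
optimizer = build_optimizer(config)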
- classy_vision.optim.register_optimizer(name, bypass_checks=False)¶
Registers a ClassyOptimizer subclass.
This decorator allows Classy Vision to instantiate a subclass of ClassyOptimizer from a configuration file, even if the class itself is not part of the Classy Vision framework. To use it, apply this decorator to a ClassyOptimizer subclass, like this:
@register_optimizer('my_optimizer')
class MyOptimizer(ClassyOptimizer):
    ...
To instantiate an optimizer from a configuration file, see
build_optimizer()
.
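Once registered, the hypothetical optimizer above can be built from a config by its registered name:
from classy_vision.optim import build_optimizer

optimizer = build_optimizer({"name": "my_optimizer", "foo": "bar"})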