Class Adam
java.lang.Object
    org.tensorflow.framework.optimizers.Optimizer
        org.tensorflow.framework.optimizers.Adam
Optimizer that implements the Adam algorithm.
Adam optimization is a stochastic gradient descent method that is based on adaptive estimation of first-order and second-order moments.
According to Kingma et al., 2014, the method is "computationally efficient, has little memory requirement, invariant to diagonal rescaling of gradients, and is well suited for problems that are large in terms of data/parameters".
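Concretely, for a parameter theta with gradient g_t at step t, learning rate alpha, and decay rates betaOne and betaTwo, the per-step update the paper sketches (in the "epsilon hat" form used by this class) is:

m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t
v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2
\alpha_t = \alpha \sqrt{1 - \beta_2^t} / (1 - \beta_1^t)
\theta_t = \theta_{t-1} - \alpha_t\, m_t / (\sqrt{v_t} + \hat{\epsilon})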
@see Kingma et al., 2014, Adam: A Method for Stochastic Optimization.
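As a usage sketch (hedged: it assumes the tensorflow-java Graph/Ops construction API and the minimize overload inherited from Optimizer; the toy variable, loss, and class name here are illustrative, not part of this class):

import org.tensorflow.Graph;
import org.tensorflow.Operand;
import org.tensorflow.framework.optimizers.Adam;
import org.tensorflow.ndarray.Shape;
import org.tensorflow.op.Op;
import org.tensorflow.op.Ops;
import org.tensorflow.op.core.Variable;
import org.tensorflow.types.TFloat32;

public class AdamSketch {
  public static void main(String[] args) {
    try (Graph graph = new Graph()) {
      Ops tf = Ops.create(graph);

      // Toy scalar model: minimize (w - 3)^2 with respect to the variable w.
      Variable<TFloat32> w = tf.variable(Shape.scalar(), TFloat32.class);
      Operand<TFloat32> loss = tf.math.square(tf.math.sub(w, tf.constant(3f)));

      // Adam with a learning rate of 0.1f; the betas and epsilon keep the class defaults.
      Adam adam = new Adam(graph, 0.1f);

      // minimize (inherited from Optimizer) adds the gradient, slot, and update ops to the
      // graph and returns a single op that a Session would run repeatedly, after the
      // variables have been initialized, to perform one Adam step per run call.
      Op trainStep = adam.minimize(loss);
      System.out.println("Created Adam training op: " + trainStep);
    }
  }
}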
Nested Class Summary
Nested classes/interfaces inherited from class Optimizer
Optimizer.GradAndVar<T>, Optimizer.Options
Field Summary
Fields
static final float BETA_ONE_DEFAULT
static final float BETA_TWO_DEFAULT
static final float EPSILON_DEFAULT
static final String FIRST_MOMENT
static final float LEARNING_RATE_DEFAULT
static final String SECOND_MOMENT
Fields inherited from class Optimizer
globals, graph, tf, VARIABLE_V2
Constructor Summary
Constructors
Adam(Graph graph)
  Creates an Adam optimizer
Adam(Graph graph, float learningRate)
  Creates an Adam optimizer
Adam(Graph graph, float learningRate, float betaOne, float betaTwo, float epsilon)
  Creates an Adam optimizer
Adam(Graph graph, String name, float learningRate)
  Creates an Adam optimizer
Adam(Graph graph, String name, float learningRate, float betaOne, float betaTwo, float epsilon)
  Creates an Adam optimizer
Method Summary
applyDense(Ops deps, Output<T> gradient, Output<T> variable)
  Generates the gradient update operations for the specific variable and gradient.
static <T extends TType> Op createAdamMinimize(Scope scope, Operand<T> loss, float learningRate, float betaOne, float betaTwo, float epsilon, Optimizer.Options... options)
  Creates the Operation that minimizes the loss.
protected void createSlots(List<Output<? extends TType>> variables)
  Performs a No-op slot creation method.
protected Op finish
  Gathers up the update operations into a single op that can be used as a run target.
String getOptimizerName()
  Get the name of the optimizer.
prepare
  Returns a No-op prepare.
String toString()
Methods inherited from class Optimizer
applyGradients, computeGradients, createName, createSlot, getSlot, getTF, minimize, minimize
Field Details
FIRST_MOMENT
public static final String FIRST_MOMENT
See Also:
Constant Field Values

SECOND_MOMENT
public static final String SECOND_MOMENT
See Also:
Constant Field Values

LEARNING_RATE_DEFAULT
public static final float LEARNING_RATE_DEFAULT
See Also:
Constant Field Values

EPSILON_DEFAULT
public static final float EPSILON_DEFAULT
See Also:
Constant Field Values

BETA_ONE_DEFAULT
public static final float BETA_ONE_DEFAULT
See Also:
Constant Field Values

BETA_TWO_DEFAULT
public static final float BETA_TWO_DEFAULT
See Also:
Constant Field Values
Constructor Details
Adam
public Adam(Graph graph)
Creates an Adam optimizer
Parameters:
graph - the TensorFlow graph

Adam
public Adam(Graph graph, float learningRate)
Creates an Adam optimizer
Parameters:
graph - the TensorFlow graph
learningRate - the learning rate

Adam
public Adam(Graph graph, float learningRate, float betaOne, float betaTwo, float epsilon)
Creates an Adam optimizer
Parameters:
graph - the TensorFlow graph
learningRate - the learning rate
betaOne - The exponential decay rate for the 1st moment estimates. Defaults to 0.9.
betaTwo - The exponential decay rate for the 2nd moment estimates. Defaults to 0.999.
epsilon - A small constant for numerical stability. This epsilon is "epsilon hat" in the Kingma and Ba paper (in the formula just before Section 2.1), not the epsilon in Algorithm 1 of the paper. Defaults to 1e-8.

Adam
public Adam(Graph graph, String name, float learningRate)
Creates an Adam optimizer
Parameters:
graph - the TensorFlow graph
name - the Optimizer name, defaults to "Adam"
learningRate - the learning rate

Adam
public Adam(Graph graph, String name, float learningRate, float betaOne, float betaTwo, float epsilon)
Creates an Adam optimizer
Parameters:
graph - the TensorFlow graph
name - the Optimizer name, defaults to "Adam"
learningRate - the learning rate
betaOne - The exponential decay rate for the 1st moment estimates. Defaults to 0.9.
betaTwo - The exponential decay rate for the 2nd moment estimates. Defaults to 0.999.
epsilon - A small constant for numerical stability. This epsilon is "epsilon hat" in the Kingma and Ba paper (in the formula just before Section 2.1), not the epsilon in Algorithm 1 of the paper. Defaults to 1e-8.
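For example, a construction with every hyperparameter spelled out might look like the following sketch (graph is assumed to be an existing org.tensorflow.Graph; the betas and epsilon shown are the documented defaults, and 0.001f is the conventional Adam learning rate):

Adam adam = new Adam(
    graph,    // the TensorFlow graph the optimizer adds its ops to
    "Adam",   // optimizer name used to scope the created ops
    0.001f,   // learning rate
    0.9f,     // betaOne: decay rate for the 1st moment estimates
    0.999f,   // betaTwo: decay rate for the 2nd moment estimates
    1e-8f);   // epsilon ("epsilon hat" in Kingma and Ba)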
Method Details
createAdamMinimize
@Endpoint(name="adam_minimize")
public static <T extends TType> Op createAdamMinimize(Scope scope, Operand<T> loss, float learningRate, float betaOne, float betaTwo, float epsilon, Optimizer.Options... options)
Creates the Operation that minimizes the loss
Type Parameters:
T - the data type for the loss
Parameters:
scope - the TensorFlow scope
loss - the loss to minimize
learningRate - the learning rate
betaOne - The exponential decay rate for the 1st moment estimates.
betaTwo - The exponential decay rate for the 2nd moment estimates.
epsilon - A small constant for numerical stability. This epsilon is "epsilon hat" in the Kingma and Ba paper (in the formula just before Section 2.1), not the epsilon in Algorithm 1 of the paper.
options - Optional Optimizer attributes
Returns:
the Operation that minimizes the loss
Throws:
IllegalArgumentException - if scope does not represent a Graph
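Continuing the construction sketch from the class description, the same training op could also be produced through this static endpoint; this is a hedged sketch assuming Ops.scope() supplies a graph-backed Scope and that the loss Operand has already been built:

Op trainStep = Adam.createAdamMinimize(
    tf.scope(),  // must resolve to a Graph, otherwise IllegalArgumentException is thrown
    loss,        // the scalar loss Operand built earlier
    0.001f,      // learning rate
    0.9f,        // betaOne
    0.999f,      // betaTwo
    1e-8f);      // epsilon ("epsilon hat")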
createSlots
Performs a No-op slot creation method.
Overrides:
createSlots in class Optimizer
Parameters:
variables - The variables to create slots for.
prepare
Returns a No-op prepare.
applyDense
Generates the gradient update operations for the specific variable and gradient.
Specified by:
applyDense in class Optimizer
Type Parameters:
T - The type of the variable.
Parameters:
gradient - The gradient to use.
variable - The variable to update.
Returns:
An operand which applies the desired optimizer update to the variable.
finish
Gathers up the update operations into a single op that can be used as a run target. Adds the betaOne and betaTwo updates to the end of the updates list.
toString
getOptimizerName
Get the name of the optimizer.
Specified by:
getOptimizerName in class Optimizer
Returns:
The optimizer name.