Class NnOps

java.lang.Object
org.tensorflow.framework.op.NnOps

public class NnOps extends Object
Creates Framework neural network Operations

These are higher level ops that may invoke core ops. Higher level Ops may perform the operation solely in the TensorFlow framework or do preprocessing of the Operands before invoking a core level Op.

See Also: FrameworkOps

  • Method Details

    • sigmoidCrossEntropyWithLogits

      public <T extends TNumber> Operand<T> sigmoidCrossEntropyWithLogits(Operand<T> labels, Operand<T> logits)
      Computes sigmoid cross entropy given logits.

      Measures the probability error in discrete classification tasks in which each class is independent and not mutually exclusive. For instance, one could perform multilabel classification where a picture can contain both an elephant and a dog at the same time.

      For brevity, let x = logits, z = labels. The logistic loss in pseudo-code is

       z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
        = z * -log(1 / (1 + exp(-x))) + (1 - z) * -log(exp(-x) / (1 + exp(-x)))
        = z * log(1 + exp(-x)) + (1 - z) * (-log(exp(-x)) + log(1 + exp(-x)))
  = z * log(1 + exp(-x)) + (1 - z) * (x + log(1 + exp(-x)))
        = (1 - z) * x + log(1 + exp(-x))
        = x - x * z + log(1 + exp(-x))
       

      For x < 0, to avoid overflow in exp(-x), we reformulate the above

       x - x * z + log(1 + exp(-x))
        = log(exp(x)) - x * z + log(1 + exp(-x))
        = - x * z + log(1 + exp(x))
       

      Hence, to ensure stability and avoid overflow, the implementation uses this equivalent formulation

         max(x, 0) - x * z + log(1 + exp(-abs(x)))
       
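      As an illustrative check (plain Java arithmetic, not part of this API), the stable form agrees with the naive form where both are finite, while the naive form overflows for large negative x:

         // Illustrative only: naive vs. numerically stable logistic loss for one (x, z) pair
         double x = -800.0;  // large negative logit
         double z = 1.0;     // label
         double naive  = x - x * z + Math.log(1 + Math.exp(-x));                         // exp(800) overflows -> Infinity
         double stable = Math.max(x, 0) - x * z + Math.log(1 + Math.exp(-Math.abs(x)));  // 800.0
         // stable matches the expected loss -log(sigmoid(-800)) of about 800; naive does not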

      logits and labels must have the same type and shape.
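      A minimal usage sketch, following the conventions of the Usage example under softmaxCrossEntropyWithLogits below (tf denotes the framework ops; shapes and values are illustrative):

         Operand<TFloat32> logits =
             tf.constant(new float[][] {{2.0F, -1.0F}, {0.5F, 3.0F}});
         Operand<TFloat32> labels =
             tf.constant(new float[][] {{1.0F, 0.0F}, {0.0F, 1.0F}});
         // component-wise losses, same shape as logits and labels: [2, 2]
         Operand<TFloat32> losses =
             tf.nn.sigmoidCrossEntropyWithLogits(labels, logits);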

      Type Parameters:
      T - the type of labels and logits
      Parameters:
      labels - the labels
      logits - the logits of type float32 or float64
      Returns:
      the component-wise logistic losses.
      Throws:
      IllegalArgumentException - if logits and labels do not have the same shape
    • softmaxCrossEntropyWithLogits

      public <T extends TNumber, U extends TNumber> Operand<T> softmaxCrossEntropyWithLogits(Operand<U> labels, Operand<T> logits, int axis)
      Computes softmax cross entropy between logits and labels.

      Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.

      NOTE:

      While the classes are mutually exclusive, their probabilities need not be. All that is required is that each row of labels is a valid probability distribution. If they are not, the computation of the gradient will be incorrect.

      If using exclusive labels (wherein one and only one class is true at a time), see NnOps.sparseSoftmaxCrossEntropyWithLogits(Operand, Operand).

      Usage:

         Operand<TFloat32> logits =
             tf.constant(new float[][] {{4.0F, 2.0F, 1.0F}, {0.0F, 5.0F, 1.0F}} );
         Operand<TFloat32> labels =
             tf.constant(new float[][] {{1.0F, 0.0F, 0.0F}, {0.0F, 0.8F, 0.2F}} );
         Operand<TFloat32> output =
             tf.nn.softmaxCrossEntropyWithLogits(labels, logits, -1);
         // output Shape = [2]
         // dataType = FLOAT (1)
         // values { 0.169846, 0.824745 }
       

      Backpropagation will happen into both logits and labels. To disallow backpropagation into labels, pass label tensors through tf.stopGradient before feeding them to this function.
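      For example (a sketch; stopGradient is assumed here to be the core stop-gradient op available on the same tf instance used in the Usage example above):

         // Block gradient flow into the label tensor before computing the loss
         Operand<TFloat32> frozenLabels = tf.stopGradient(labels);
         Operand<TFloat32> output =
             tf.nn.softmaxCrossEntropyWithLogits(frozenLabels, logits, -1);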

      Type Parameters:
      T - the number type of the operands
      U - the data type for the labels.
      Parameters:
      labels - Each vector along the class dimension should hold a valid probability distribution, e.g. for the case in which labels are of shape [batch_size, num_classes], each row labels[i] must be a valid probability distribution.
      logits - Per-label activations, typically a linear output. These activation energies are interpreted as unnormalized log probabilities.
      axis - The class dimension. -1 is the last dimension.
      Returns:
      the softmax cross entropy loss. Its type is the same as logits and its shape is the same as labels except that it does not have the last dimension of labels.
    • sparseSoftmaxCrossEntropyWithLogits

      public <T extends TNumber, U extends TNumber> Operand<T> sparseSoftmaxCrossEntropyWithLogits(Operand<U> labels, Operand<T> logits)
      Computes sparse softmax cross entropy between logits and labels.

      Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both.

      NOTE:

      For this operation, the probability of a given label is considered exclusive. That is, soft classes are not allowed, and the labels vector must provide a single specific index for the true class for each row of logits (each minibatch entry). For soft softmax classification with a probability distribution for each entry, see NnOps.softmaxCrossEntropyWithLogits(Operand, Operand, int).

      WARNING:

      This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

      A common use case is to have logits of shape [batchSize, numClasses] and labels of shape [batchSize], but higher dimensions are supported, in which case the last dimension is assumed to be of size numClasses. logits must have a data type of TFloat16, TFloat32, or TFloat64, and labels must have a data type of TInt32 or TInt64.
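      A minimal usage sketch for the common [batchSize, numClasses] / [batchSize] case, following the conventions of the Usage example above (values are illustrative):

         Operand<TFloat32> logits =
             tf.constant(new float[][] {{4.0F, 2.0F, 1.0F}, {0.0F, 5.0F, 1.0F}});
         Operand<TInt32> labels =
             tf.constant(new int[] {0, 1});  // one class index per batch entry
         // loss shape = [2], one value per batch entry
         Operand<TFloat32> loss =
             tf.nn.sparseSoftmaxCrossEntropyWithLogits(labels, logits);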

      Type Parameters:
      T - the data type for the loss and logits.
      U - the data type for the labels
      Parameters:
      labels - Tensor of shape [d_0, d_1, ..., d_{r-1}] (where r is rank of labels and result) and the dataType is TInt32 or TInt64. Each entry in labels must be an index in [0, numClasses). Other values will raise an exception when this op is run on CPU, and return NaN for corresponding loss and gradient rows on GPU.
      logits - Per-label activations (typically a linear output) of shape [d_0, d_1, ..., d_{r-1}, numClasses] and dataType of TFloat16, TFloat32, or TFloat64. These activation energies are interpreted as unnormalized log probabilities.
      Returns:
      the loss
      Throws:
      IllegalArgumentException - If logits are scalars (need to have rank >= 1) or if the rank of the labels is not equal to the rank of the logits minus one.
    • gelu

      public <T extends TNumber> Operand<T> gelu(Operand<T> input)
      Compute the Gaussian Error Linear Unit (GELU) activation function without approximation.

      Gaussian error linear unit (GELU) computes x * P(X <= x), where X ~ N(0, 1). The GELU nonlinearity weights inputs by their value, rather than gating inputs by their sign as in ReLU.
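      Concretely, the exact form is gelu(x) = x * Phi(x) = 0.5 * x * (1 + erf(x / sqrt(2))), where Phi is the standard normal CDF. A minimal usage sketch, following the conventions of the examples above (values are illustrative and approximate):

         Operand<TFloat32> input = tf.constant(new float[] {-1.0F, 0.0F, 1.0F});
         // exact (non-approximate) GELU, applied element-wise
         Operand<TFloat32> result = tf.nn.gelu(input);
         // values approximately { -0.1587, 0.0, 0.8413 }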

      Type Parameters:
      T - the data type for the input and result
      Parameters:
      input - the input
      Returns:
      The Gaussian Error Linear Unit computation
    • gelu

      public <T extends TNumber> Operand<T> gelu(Operand<T> input, boolean approximate)
      Compute the Gaussian Error Linear Unit (GELU) activation function.

      Gaussian error linear unit (GELU) computes x * P(X <= x), where X ~ N(0, 1). The GELU nonlinearity weights inputs by their value, rather than gating inputs by their sign as in ReLU.
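      When approximate is true, a tanh-based approximation is used; the commonly used form (assumed here, not stated in this doc) is 0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3))). A minimal usage sketch, following the conventions of the examples above:

         Operand<TFloat32> input = tf.constant(new float[] {-1.0F, 0.0F, 1.0F});
         // tanh-based approximation of GELU, applied element-wise
         Operand<TFloat32> result = tf.nn.gelu(input, true);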

      Type Parameters:
      T - the data type for the input and result
      Parameters:
      input - the input
      approximate - Whether to enable approximation.
      Returns:
      The Gaussian Error Linear Unit computation