lingvo.tasks.car.calibration_processing module

Library for calculating calibration on a prediction.

lingvo.tasks.car.calibration_processing.ExpectedCalibrationError(confidence, empirical_accuracy, num_examples, min_confidence=None)[source]

Calculate the expected calibration error.

Parameters
  • confidence – 1-D np.array of float32 binned confidence scores, one value per bin.

  • empirical_accuracy – 1-D np.array of float32 binned empirical accuracies, one value per bin.

  • num_examples – 1-D np.array of int giving the number of examples in each bin.

  • min_confidence – float32 minimum confidence score to include in the calculation. If None, no filtering is applied.

Returns

float32 of expected calibration error
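
The arguments above are already binned, so the standard expected-calibration-error formula they suggest is an example-count-weighted average of the gap between confidence and accuracy per bin. Below is a minimal NumPy sketch of that formula; the helper name and the exact filtering behaviour for min_confidence are assumptions for illustration, not the library implementation.

import numpy as np

def expected_calibration_error_sketch(confidence, empirical_accuracy,
                                      num_examples, min_confidence=None):
  """ECE sketch: example-weighted mean of |empirical_accuracy - confidence|."""
  confidence = np.asarray(confidence, dtype=np.float32)
  empirical_accuracy = np.asarray(empirical_accuracy, dtype=np.float32)
  num_examples = np.asarray(num_examples)

  # Assumed behaviour: drop low-confidence bins before aggregating.
  if min_confidence is not None:
    keep = confidence >= min_confidence
    confidence = confidence[keep]
    empirical_accuracy = empirical_accuracy[keep]
    num_examples = num_examples[keep]

  total = num_examples.sum()
  if total == 0:
    return np.float32(0.0)
  weights = num_examples / total
  return np.float32(np.sum(weights * np.abs(empirical_accuracy - confidence)))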

lingvo.tasks.car.calibration_processing.CalibrationCurve(scores, hits, num_bins)[source]

Compute data for calibration reliability diagrams.

Parameters
  • scores – 1-D np.array of float32 confidence scores

  • hits – 1-D np.array of int32 (either 0 or 1) indicating whether the predicted label matches the ground truth label.

  • num_bins – int for the number of calibration bins

Returns

A tuple containing:

  • mean_predicted_accuracies: np.array of the mean predicted accuracy for each bin

  • mean_empirical_accuracies: np.array of the mean empirical accuracy for each bin

  • num_examples: np.array of the number of examples in each bin
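
For reference, here is a minimal sketch of how such a reliability-diagram curve can be computed, assuming num_bins equal-width bins over [0, 1]; the bin-edge convention and the handling of empty bins in this sketch are illustrative assumptions, not necessarily what CalibrationCurve does.

import numpy as np

def calibration_curve_sketch(scores, hits, num_bins):
  """Bins scores into equal-width bins over [0, 1] and aggregates per bin."""
  scores = np.asarray(scores, dtype=np.float32)
  hits = np.asarray(hits, dtype=np.float32)
  # Interior edges only; np.digitize then yields bin ids in [0, num_bins - 1].
  edges = np.linspace(0.0, 1.0, num_bins + 1)
  bin_ids = np.digitize(scores, edges[1:-1])

  mean_predicted, mean_empirical, counts = [], [], []
  for b in range(num_bins):
    mask = bin_ids == b
    n = int(mask.sum())
    counts.append(n)
    # Empty bins are reported as zeros in this sketch (an assumption).
    mean_predicted.append(scores[mask].mean() if n else 0.0)
    mean_empirical.append(hits[mask].mean() if n else 0.0)
  return (np.array(mean_predicted, dtype=np.float32),
          np.array(mean_empirical, dtype=np.float32),
          np.array(counts))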

class lingvo.tasks.car.calibration_processing.CalibrationCalculator(metadata)[source]

Bases: object

Base class for calculating calibration on a prediction.

Calculate(metrics)[source]

Calculate metrics for calibration.

Parameters
  • metrics – A dict. Each entry in the dict is a list of C (number of classes) dicts mapping metric names to individual results (an illustrative sketch follows this method's entry). Individual entries may be the following items:

    - scalars: A list of C (number of classes) dicts mapping metric names to scalar values.

    - curves: A list of C dicts mapping metric names to np.float32 arrays of shape [NumberOfPrecisionRecallPoints()+1, 2]. In the last dimension, 0 indexes precision and 1 indexes recall.

    - calibrations: A list of C dicts mapping metric names to np.float32 arrays of shape [number of predictions, 2]. The first column is the predicted probability and the second column is 0 or 1, indicating whether the prediction matched a ground truth item.

Returns

nothing
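
The sketch below illustrates the raw calibration data described above and how such arrays could be fed through the module-level functions; the example values are made up, and only CalibrationCurve and ExpectedCalibrationError from this module are used.

import numpy as np
from lingvo.tasks.car import calibration_processing as cp

# Hypothetical 'calibrations' entry: one [num_predictions, 2] array per class,
# column 0 = predicted probability, column 1 = 1 if the prediction matched a
# ground truth item, else 0.
calibration_per_class = [
    np.array([[0.9, 1.0], [0.7, 0.0], [0.4, 1.0], [0.2, 0.0]],
             dtype=np.float32),
]

for class_data in calibration_per_class:
  scores = class_data[:, 0].astype(np.float32)
  hits = class_data[:, 1].astype(np.int32)
  mean_pred, mean_emp, counts = cp.CalibrationCurve(scores, hits, num_bins=10)
  ece = cp.ExpectedCalibrationError(mean_pred, mean_emp, counts)
  print('ECE:', ece)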

Summary(name)[source]

Generate tf summaries for calibration.

Parameters

name – str, name of summary.

Returns

list of tf.Summary
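
As a usage note, the returned protos can be written with a standard TF1-style summary writer. A minimal sketch, assuming calculator is an already-constructed CalibrationCalculator subclass instance on which Calculate has been run (both the instance and the log directory are hypothetical):

import tensorflow as tf

# `calculator` is assumed to exist and to have had Calculate() run already.
summaries = calculator.Summary('calibration')  # list of tf.Summary protos
writer = tf.compat.v1.summary.FileWriter('/tmp/calibration_summaries')
for summary in summaries:
  writer.add_summary(summary, global_step=0)
writer.flush()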