lingvo.tasks.car.calibration_processing module

Library for calculating calibration on a prediction.

lingvo.tasks.car.calibration_processing.ExpectedCalibrationError(confidence, empirical_accuracy, num_examples, min_confidence=None)[source]

Calculate the expected calibration error.

Parameters
  • confidence – 1-D np.array of float32 binned confidence scores, one value per bin.

  • empirical_accuracy – 1-D np.array of float32 binned empirical accuracies, one value per bin.

  • num_examples – 1-D np.array of int giving the number of examples in each bin.

  • min_confidence – float32 minimum confidence score to include in the calculation. If None, no filtering is applied.

Returns

float32 of expected calibration error
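
The arguments above are already binned, so the standard expected-calibration-error formula they suggest is an example-count-weighted average of the gap between confidence and accuracy per bin. Below is a minimal NumPy sketch of that formula; the helper name and the exact filtering behaviour for min_confidence are assumptions for illustration, not the library implementation.

import numpy as np

def expected_calibration_error_sketch(confidence, empirical_accuracy,
                                      num_examples, min_confidence=None):
  """ECE sketch: example-weighted mean of |empirical_accuracy - confidence|."""
  confidence = np.asarray(confidence, dtype=np.float32)
  empirical_accuracy = np.asarray(empirical_accuracy, dtype=np.float32)
  num_examples = np.asarray(num_examples)

  # Assumed behaviour: drop low-confidence bins before aggregating.
  if min_confidence is not None:
    keep = confidence >= min_confidence
    confidence = confidence[keep]
    empirical_accuracy = empirical_accuracy[keep]
    num_examples = num_examples[keep]

  total = num_examples.sum()
  if total == 0:
    return np.float32(0.0)
  weights = num_examples / total
  return np.float32(np.sum(weights * np.abs(empirical_accuracy - confidence)))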

lingvo.tasks.car.calibration_processing.CalibrationCurve(scores, hits, num_bins)[source]

Compute data for calibration reliability diagrams.

Parameters
  • scores – 1-D np.array of float32 confidence scores

  • hits – 1-D np.array of int32 (either 0 or 1) indicating whether the predicted label matches the ground truth label.

  • num_bins – int for the number of calibration bins

Returns

A tuple containing:

  • mean_predicted_accuracies: np.array of the mean predicted accuracy for each bin

  • mean_empirical_accuracies: np.array of the mean empirical accuracy for each bin

  • num_examples: np.array of the number of examples in each bin
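
For reference, here is a minimal sketch of how such a reliability-diagram curve can be computed, assuming num_bins equal-width bins over [0, 1]; the bin-edge convention and the handling of empty bins in this sketch are illustrative assumptions, not necessarily what CalibrationCurve does.

import numpy as np

def calibration_curve_sketch(scores, hits, num_bins):
  """Bins scores into equal-width bins over [0, 1] and aggregates per bin."""
  scores = np.asarray(scores, dtype=np.float32)
  hits = np.asarray(hits, dtype=np.float32)
  # Interior edges only; np.digitize then yields bin ids in [0, num_bins - 1].
  edges = np.linspace(0.0, 1.0, num_bins + 1)
  bin_ids = np.digitize(scores, edges[1:-1])

  mean_predicted, mean_empirical, counts = [], [], []
  for b in range(num_bins):
    mask = bin_ids == b
    n = int(mask.sum())
    counts.append(n)
    # Empty bins are reported as zeros in this sketch (an assumption).
    mean_predicted.append(scores[mask].mean() if n else 0.0)
    mean_empirical.append(hits[mask].mean() if n else 0.0)
  return (np.array(mean_predicted, dtype=np.float32),
          np.array(mean_empirical, dtype=np.float32),
          np.array(counts))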

class lingvo.tasks.car.calibration_processing.CalibrationCalculator(metadata)[source]

Bases: object

Base class for calculating calibration on a prediction.

Calculate(metrics)[source]

Calculate metrics for calibration.

Parameters
  • metrics – A dict. Each entry in the dict is a list of C (number of classes) dicts mapping metric names to individual results (an illustrative sketch follows this method's entry). Individual entries may be the following items:

    - scalars: A list of C (number of classes) dicts mapping metric names to scalar values.

    - curves: A list of C dicts mapping metric names to np.float32 arrays of shape [NumberOfPrecisionRecallPoints()+1, 2]. In the last dimension, 0 indexes precision and 1 indexes recall.

    - calibrations: A list of C dicts mapping metric names to np.float32 arrays of shape [number of predictions, 2]. The first column is the predicted probability and the second column is 0 or 1, indicating whether the prediction matched a ground truth item.

Returns

nothing
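
The sketch below illustrates the raw calibration data described above and how such arrays could be fed through the module-level functions; the example values are made up, and only CalibrationCurve and ExpectedCalibrationError from this module are used.

import numpy as np
from lingvo.tasks.car import calibration_processing as cp

# Hypothetical 'calibrations' entry: one [num_predictions, 2] array per class,
# column 0 = predicted probability, column 1 = 1 if the prediction matched a
# ground truth item, else 0.
calibration_per_class = [
    np.array([[0.9, 1.0], [0.7, 0.0], [0.4, 1.0], [0.2, 0.0]],
             dtype=np.float32),
]

for class_data in calibration_per_class:
  scores = class_data[:, 0].astype(np.float32)
  hits = class_data[:, 1].astype(np.int32)
  mean_pred, mean_emp, counts = cp.CalibrationCurve(scores, hits, num_bins=10)
  ece = cp.ExpectedCalibrationError(mean_pred, mean_emp, counts)
  print('ECE:', ece)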

Summary(name)[source]

Generate tf summaries for calibration.

Parameters

name – str, name of summary.

Returns

list of tf.Summary
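
As a usage note, the returned protos can be written with a standard TF1-style summary writer. A minimal sketch, assuming calculator is an already-constructed CalibrationCalculator subclass instance on which Calculate has been run (both the instance and the log directory are hypothetical):

import tensorflow as tf

# `calculator` is assumed to exist and to have had Calculate() run already.
summaries = calculator.Summary('calibration')  # list of tf.Summary protos
writer = tf.compat.v1.summary.FileWriter('/tmp/calibration_summaries')
for summary in summaries:
  writer.add_summary(summary, global_step=0)
writer.flush()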