lingvo.tasks.car.ops package

Car Operations.

lingvo.tasks.car.ops.pairwise_iou3d(boxes_a, boxes_b, name=None)

Calculate pairwise IoUs between two sets of 3D bboxes. Every bbox is represented as [center_x, center_y, center_z, dim_x, dim_y, dim_z, heading].

Parameters
  • boxes_a – A Tensor of type float32, of shape [num_boxes_a, 7].

  • boxes_b – A Tensor of type float32, of shape [num_boxes_b, 7].

  • name – A name for the operation (optional).

Returns

A Tensor of type float32.
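
To make the 7-value box layout concrete, here is a minimal NumPy sketch of pairwise IoU for the special case where every heading is 0, so the boxes are axis-aligned. It is not the op's implementation: the real op handles rotated boxes, and the function name pairwise_iou3d_axis_aligned and the [num_boxes_a, num_boxes_b] result layout are assumptions made for illustration.

```python
import numpy as np


def pairwise_iou3d_axis_aligned(boxes_a, boxes_b):
    """Pairwise IoU for 7-value boxes; only valid when every heading is 0."""
    boxes_a = np.asarray(boxes_a, dtype=np.float32)
    boxes_b = np.asarray(boxes_b, dtype=np.float32)
    # Layout: [center_x, center_y, center_z, dim_x, dim_y, dim_z, heading].
    centers_a, dims_a = boxes_a[:, 0:3], boxes_a[:, 3:6]
    centers_b, dims_b = boxes_b[:, 0:3], boxes_b[:, 3:6]
    # Min/max corners, arranged to broadcast to [num_a, num_b, 3].
    min_a = (centers_a - dims_a / 2)[:, None, :]
    max_a = (centers_a + dims_a / 2)[:, None, :]
    min_b = (centers_b - dims_b / 2)[None, :, :]
    max_b = (centers_b + dims_b / 2)[None, :, :]
    # Per-axis overlap length, clipped at 0 for disjoint boxes.
    overlap = np.clip(np.minimum(max_a, max_b) - np.maximum(min_a, min_b), 0, None)
    intersection = overlap.prod(axis=-1)
    volume_a = dims_a.prod(axis=-1)[:, None]
    volume_b = dims_b.prod(axis=-1)[None, :]
    union = volume_a + volume_b - intersection
    return intersection / np.maximum(union, 1e-8)


boxes_a = [[0, 0, 0, 2, 2, 2, 0]]
boxes_b = [
    [0, 0, 0, 2, 2, 2, 0],  # identical to the box in boxes_a
    [1, 0, 0, 2, 2, 2, 0],  # shifted by half a box along x
    [5, 5, 5, 1, 1, 1, 0],  # far away, no overlap
]
iou = pairwise_iou3d_axis_aligned(boxes_a, boxes_b)
```

In this example the shifted box shares a volume of 4 with the reference box out of a union of 12, so its IoU is 1/3.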

lingvo.tasks.car.ops.point_to_grid(points, num_points_per_cell, x_intervals, y_intervals, z_intervals, x_range, y_range, z_range, name=None)

Re-organize input points into equally spaced grids. Points in each grid cell are shuffled. When a cell contains fewer than num_points_per_cell points, the center of the cell, with zeros in all feature dimensions, is used as padding.

The num_points output holds the number of points in each grid cell, i.e., output_points[i, j, k, num_points[i, j, k]:, :] are padded points.

Parameters
  • points – A Tensor of type float32. [n, d]. d >= 3 and the first 3 dimensions are treated as x,y,z.

  • num_points_per_cell – An int. Number of points to keep in each cell.

  • x_intervals – An int. Number of cells along the x-axis.

  • y_intervals – An int. Number of cells along the y-axis.

  • z_intervals – An int. Number of cells along the z-axis.

  • x_range – A list of floats. Two scalars (xmin, xmax) giving the spatial span of the grid along the x-axis.

  • y_range – A list of floats. Two scalars (ymin, ymax) giving the spatial span of the grid along the y-axis.

  • z_range – A list of floats. Two scalars (zmin, zmax) giving the spatial span of the grid along the z-axis.

  • name – A name for the operation (optional).

Returns

A tuple of Tensor objects (output_points, grid_centers, num_points).

output_points: A Tensor of type float32.

[x_intervals, y_intervals, z_intervals, num_points_per_cell, d].

grid_centers: A Tensor of type float32.

[x_intervals, y_intervals, z_intervals, 3]. Grid cell centers.

num_points: A Tensor of type int32.

[x_intervals, y_intervals, z_intervals].
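
The gridding and padding behavior described above can be sketched in plain NumPy. This is an illustrative reference under stated assumptions, not the op's implementation: point_to_grid_sketch is a hypothetical name, the seed argument is invented for reproducibility, visiting points in random order stands in for the per-cell shuffle, and points outside the ranges are assumed to be dropped.

```python
import numpy as np


def point_to_grid_sketch(points, num_points_per_cell, intervals, ranges, seed=0):
    """Bucket [n, d] points into a regular grid; pad short cells with cell centers."""
    points = np.asarray(points, dtype=np.float32)
    d = points.shape[1]
    intervals = np.asarray(intervals)  # (x_intervals, y_intervals, z_intervals)
    lo = np.asarray([r[0] for r in ranges], dtype=np.float32)
    hi = np.asarray([r[1] for r in ranges], dtype=np.float32)
    cell_size = (hi - lo) / intervals

    # Cell centers, shape [x_intervals, y_intervals, z_intervals, 3].
    axes = [lo[a] + (np.arange(intervals[a]) + 0.5) * cell_size[a] for a in range(3)]
    grid_centers = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).astype(np.float32)

    # Start fully padded: cell center in x, y, z and zeros in the feature dims.
    output_points = np.zeros(tuple(intervals) + (num_points_per_cell, d), np.float32)
    output_points[..., :3] = grid_centers[..., None, :]
    num_points = np.zeros(tuple(intervals), np.int32)

    rng = np.random.default_rng(seed)
    for idx in rng.permutation(len(points)):  # random visiting order
        p = points[idx]
        cell = np.floor((p[:3] - lo) / cell_size).astype(int)
        if np.any(cell < 0) or np.any(cell >= intervals):
            continue  # assumption: out-of-range points are dropped
        i, j, k = cell
        if num_points[i, j, k] < num_points_per_cell:
            output_points[i, j, k, num_points[i, j, k]] = p
            num_points[i, j, k] += 1
    return output_points, grid_centers, num_points


points = [
    [0.1, 0.1, 0.1, 7.0],
    [0.2, 0.2, 0.2, 8.0],
    [0.9, 0.9, 0.1, 9.0],
]
output_points, grid_centers, num_points = point_to_grid_sketch(
    points,
    num_points_per_cell=2,
    intervals=(2, 2, 1),
    ranges=((0.0, 1.0), (0.0, 1.0), (0.0, 1.0)),
)
```

Here cell [1, 1, 0] receives one real point, so its second slot is the padded entry [0.75, 0.75, 0.5, 0.0], and num_points[1, 1, 0] is 1.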

lingvo.tasks.car.ops.non_max_suppression_3d(bboxes, scores, nms_iou_threshold, score_threshold, max_boxes_per_class, name=None)

Greedily selects the top subset of 3D (7 DOF format) bounding boxes per class.

This implementation is rotation- and class-aware: for each class, it takes the best boxes that score above score_threshold and do not overlap any better-scoring box by more than nms_iou_threshold.

Parameters
  • bboxes – A Tensor of type float32, of shape [num_bboxes, 7], where each box has the format [center_x, center_y, center_z, dim_x, dim_y, dim_z, heading].

  • scores – A Tensor of type float32, of shape [num_bboxes, num_classes], with a score per box for each class.

  • nms_iou_threshold – A Tensor of type float32, of shape [num_classes], specifying for each class the maximum IoU two boxes may have before they are considered overlapping and the lower-scoring one is suppressed.

  • score_threshold – A Tensor of type float32, of shape [num_classes], specifying for each class the minimum score a box must have to be kept.

  • max_boxes_per_class – An int. The maximum number of boxes to return for each class.

  • name – A name for the operation (optional).

Returns

A tuple of Tensor objects (bbox_indices, bbox_scores, valid_mask).

bbox_indices: A Tensor of type int32.

[num_classes, max_boxes_per_class] with the indices of selected boxes for each class.

bbox_scores: A Tensor of type float32.

[num_classes, max_boxes_per_class] with the score of selected boxes for each class.

valid_mask: A Tensor of type float32.

[num_classes, max_boxes_per_class] with a 1 for a valid box and a 0 for invalid boxes for each class.
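
The greedy, per-class selection and the padded output layout can be illustrated with a short NumPy sketch. It assumes a precomputed [num_bboxes, num_bboxes] IoU matrix in place of the rotated-box IoU the op computes internally. The name nms_from_iou_sketch is hypothetical, and whether the real op treats the thresholds as inclusive or strict is not stated in this reference, so the comparisons below are assumptions.

```python
import numpy as np


def nms_from_iou_sketch(iou, scores, nms_iou_threshold, score_threshold, max_boxes_per_class):
    """Greedy per-class NMS given a precomputed pairwise IoU matrix."""
    iou = np.asarray(iou, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    num_classes = scores.shape[1]
    bbox_indices = np.zeros((num_classes, max_boxes_per_class), np.int32)
    bbox_scores = np.zeros((num_classes, max_boxes_per_class), np.float32)
    valid_mask = np.zeros((num_classes, max_boxes_per_class), np.float32)

    for c in range(num_classes):
        kept = []
        for i in np.argsort(-scores[:, c], kind="stable"):  # best score first
            if len(kept) == max_boxes_per_class:
                break
            if scores[i, c] < score_threshold[c]:
                break  # every remaining box scores lower still
            # Keep the box only if no better-scoring kept box overlaps it too much.
            if all(iou[i, j] <= nms_iou_threshold[c] for j in kept):
                kept.append(i)
        kept = np.asarray(kept, dtype=np.int64)
        n = len(kept)
        bbox_indices[c, :n] = kept
        bbox_scores[c, :n] = scores[kept, c]
        valid_mask[c, :n] = 1.0  # unused slots stay 0
    return bbox_indices, bbox_scores, valid_mask


# Boxes 0 and 1 overlap heavily; box 2 overlaps neither.
iou = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
scores = [
    [0.9, 0.1],
    [0.8, 0.6],
    [0.3, 0.7],
]
bbox_indices, bbox_scores, valid_mask = nms_from_iou_sketch(
    iou,
    scores,
    nms_iou_threshold=[0.5, 0.5],
    score_threshold=[0.2, 0.5],
    max_boxes_per_class=3,
)
```

For class 0, box 1 is suppressed by the better-scoring box 0. For class 1, box 0 falls below the score threshold. Both rows therefore end with one invalid slot, marked by a 0 in valid_mask.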

lingvo.tasks.car.ops.average_precision3d(iou_threshold, groundtruth_bbox, groundtruth_imageid, groundtruth_ignore, prediction_bbox, prediction_imageid, prediction_ignore, prediction_score, num_recall_points=1, algorithm='KITTI', name=None)

Computes average precision for 3D bounding boxes.

The precision_recall output is sorted by recall in descending order. When there is not enough data for num_recall_points + 1 sample points, this tensor is zero-padded.

Parameters
  • iou_threshold – A Tensor of type float32. IoU threshold.

  • groundtruth_bbox – A Tensor of type float32. [N, 7]. N ground truth bounding boxes.

  • groundtruth_imageid – A Tensor of type int32. [N]. N image ids for ground truth bounding boxes.

  • groundtruth_ignore – A Tensor of type int32. [N]. Valid values are 0 - Don’t ignore; 1 - Ignore the first match; 2 - Ignore all matches.

  • prediction_bbox – A Tensor of type float32. [M, 7]. M predicted bounding boxes.

  • prediction_imageid – A Tensor of type int32. [M]. M image ids for the predicted bounding boxes.

  • prediction_ignore – A Tensor of type int32. [M]. The ignore types for predictions. Currently only used by the KITTI AP. Valid values are 0 - Don’t ignore; 1 - Ignore the first match.

  • prediction_score – A Tensor of type float32. [M]. M scores for each predicted bounding box.

  • num_recall_points – An optional int that is >= 1. Defaults to 1.

  • algorithm – An optional string. Defaults to "KITTI". One of [“KITTI”, “VOC”]. See the paper “Supervised learning and evaluation of KITTI’s cars detector with DPM”, Section III.A, for the differences between KITTI AP and VOC AP.

  • name – A name for the operation (optional).

Returns

A tuple of Tensor objects (average_precision, precision_recall, score_and_hit).

average_precision: A Tensor of type float32.

A scalar. The AP metric.

precision_recall: A Tensor of type float32.

[num_recall_points, 2]. List of PR points.

score_and_hit: A Tensor of type float32.

[M, 2] Prediction score and corresponding binary indication whether a prediction detected a ground truth item.
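
As a rough illustration of how rows of (score, hit) relate to an AP value, the sketch below computes a generic textbook interpolated AP: rank predictions by score, accumulate precision and recall, then average the best precision reachable at evenly spaced recall levels. It is not the op's KITTI or VOC algorithm and ignores the ignore flags. The name interpolated_ap_sketch and the num_groundtruth and num_samples parameters are assumptions for this example.

```python
import numpy as np


def interpolated_ap_sketch(score_and_hit, num_groundtruth, num_samples=11):
    """Textbook interpolated AP from [M, 2] rows of (score, hit)."""
    score_and_hit = np.asarray(score_and_hit, dtype=np.float64)
    # Rank predictions from highest to lowest score.
    order = np.argsort(-score_and_hit[:, 0], kind="stable")
    hits = score_and_hit[order, 1]
    true_pos = np.cumsum(hits)
    false_pos = np.cumsum(1.0 - hits)
    recall = true_pos / num_groundtruth
    precision = true_pos / (true_pos + false_pos)
    # At each recall level, take the best precision achieved at that recall or higher.
    sampled = []
    for level in np.linspace(0.0, 1.0, num_samples):
        reachable = precision[recall >= level]
        sampled.append(reachable.max() if reachable.size else 0.0)
    return float(np.mean(sampled)), np.asarray(sampled)


# Four predictions; the first and third each detect one of two ground truth boxes.
score_and_hit = [[0.9, 1.0], [0.8, 0.0], [0.7, 1.0], [0.6, 0.0]]
ap, sampled_precision = interpolated_ap_sketch(
    score_and_hit, num_groundtruth=2, num_samples=3
)
```

With recall levels 0, 0.5 and 1, the sampled precisions are 1, 1 and 2/3, giving an AP of 8/9 for this toy input.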

lingvo.tasks.car.ops.sample_points(points, points_padding, num_seeded_points, center_selector, neighbor_sampler, num_centers, center_z_min, center_z_max, num_neighbors, max_distance, neighbor_algorithm='auto', random_seed=-1, name=None)

Sample points among ‘points’.

Parameters
  • points – A Tensor of type float32. [B, N, K]. B is the batch size; N is the number of points; K is the number of dimensions of each point.

  • points_padding – A Tensor of type float32. [B, N]. 0/1 padding of points. If points_padding[b, i] is 0., points[b, i, :] are valid point coordinates. Otherwise, points[b, i, :] are all zeros.

  • num_seeded_points – A Tensor of type int32. If num_seeded_points > 0, the first num_seeded_points entries in points are treated as seeds for farthest point sampling (FPS). These points are assumed not to be padded; their padding is not checked when seeding.

  • center_selector – A string. Valid options - ‘farthest’, ‘uniform’.

  • neighbor_sampler – A string. Valid options - ‘uniform’, ‘closest’.

  • num_centers – An int. The number of centers to sample for each batch example (M).

  • center_z_min – A float. Points with z less than center_z_min are not considered for center selection.

  • center_z_max – A float. Points with z greater than center_z_max are not considered for center selection.

  • num_neighbors – An int. The number of points to sample within each neighborhood (P).

  • max_distance – A float. Points with L2 distances from a center larger than this threshold are not considered to be in the neighborhood.

  • neighbor_algorithm – An optional string. Defaults to "auto". Valid options - ‘auto’, ‘hash’.

  • random_seed – An optional int. Defaults to -1. The random seed.

  • name – A name for the operation (optional).

Returns

A tuple of Tensor objects (center, center_padding, indices, indices_padding).

center: A Tensor of type int32.

[B, M]. The indices of selected centers.

center_padding: A Tensor of type float32.

[B, M]. If center_padding[b, i] is 0., then center[b, i], indices[b, i, :] and indices_padding[b, i, :] describe a valid sampled center. Otherwise, center_padding[b, i] is 1.0, center[b, i] and indices[b, i, :] are all zeros, and indices_padding[b, i, :] are all 1.0.

indices: A Tensor of type int32.

[B, M, P]. The indices of selected points.

indices_padding: A Tensor of type float32.

[B, M, P]. 0/1 padding of indices.
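
The ‘farthest’ center selector and the ‘closest’ neighbor sampler can be sketched in NumPy for a single, unpadded batch example. This is a simplified illustration, not the op's implementation: both function names are hypothetical, the z-range filter and random seeding are omitted, the first center is assumed to be point 0 when no seeds are given, and a center is assumed to count as its own neighbor.

```python
import numpy as np


def farthest_point_sampling_sketch(points, num_centers, num_seeded_points=0):
    """Pick num_centers indices from [N, K] points by farthest point sampling."""
    points = np.asarray(points, dtype=np.float32)
    # Squared distance from each point to its nearest already-chosen center.
    min_dist = np.full(len(points), np.inf, dtype=np.float32)
    centers = []
    for m in range(num_centers):
        # Seeds are taken in order; afterwards pick the point farthest from all centers.
        # With no seeds, argmax over the initial all-inf array yields index 0.
        idx = m if m < num_seeded_points else int(np.argmax(min_dist))
        centers.append(idx)
        dist = np.sum((points - points[idx]) ** 2, axis=-1)
        min_dist = np.minimum(min_dist, dist)
    return np.asarray(centers, dtype=np.int32)


def closest_neighbors_sketch(points, center_idx, num_neighbors, max_distance):
    """Return up to num_neighbors closest indices within max_distance, plus 0/1 padding."""
    points = np.asarray(points, dtype=np.float32)
    dist = np.linalg.norm(points - points[center_idx], axis=-1)
    order = np.argsort(dist, kind="stable")
    order = order[dist[order] <= max_distance][:num_neighbors]
    indices = np.zeros(num_neighbors, np.int32)
    padding = np.ones(num_neighbors, np.float32)  # 1.0 marks an unused slot
    indices[: len(order)] = order
    padding[: len(order)] = 0.0
    return indices, padding


# Five points on the x-axis: a cluster near 0 and a cluster near 10.
points = [[0, 0, 0], [1, 0, 0], [2, 0, 0], [10, 0, 0], [11, 0, 0]]
center = farthest_point_sampling_sketch(points, num_centers=3)
indices, indices_padding = closest_neighbors_sketch(
    points, center_idx=int(center[1]), num_neighbors=3, max_distance=2.5
)
```

In this example the second center is point 4, which has only two points within distance 2.5 (itself and point 3), so the third neighbor slot is padded.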