lingvo.tasks.car.kitti_input_generator module
Input generator for KITTI data.
-
lingvo.tasks.car.kitti_input_generator.ComputeKITTIDifficulties(box_image_height, occlusion, truncation)
Compute difficulties from box height, occlusion, and truncation.
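As a rough illustration of how the three inputs can combine into a difficulty, here is a minimal NumPy sketch based on the published KITTI benchmark thresholds (easy: height of at least 40 px, fully visible, truncation up to 15%; moderate: at least 25 px, partly occluded, up to 30%; hard: at least 25 px, difficult to see, up to 50%). The function name kitti_difficulty_sketch and the integer codes it returns are assumptions made for this example; the actual encoding used by ComputeKITTIDifficulties may differ.

```python
import numpy as np

# Thresholds follow the published KITTI object benchmark definitions.
# The integer codes (0 = ignored, 1 = hard, 2 = moderate, 3 = easy) are an
# assumption made for this sketch, not necessarily lingvo's encoding.
_LEVELS = (
    # (code, min_height_px, max_occlusion, max_truncation)
    (1, 25.0, 2, 0.50),  # hard
    (2, 25.0, 1, 0.30),  # moderate
    (3, 40.0, 0, 0.15),  # easy
)


def kitti_difficulty_sketch(box_image_height, occlusion, truncation):
    """Returns one difficulty code per box (hypothetical helper)."""
    height = np.asarray(box_image_height, dtype=np.float32)
    occ = np.asarray(occlusion, dtype=np.int32)
    trunc = np.asarray(truncation, dtype=np.float32)
    difficulty = np.zeros(height.shape, dtype=np.int32)
    # Apply levels from loosest to strictest so the easiest match wins.
    for code, min_height, max_occ, max_trunc in _LEVELS:
        ok = (height >= min_height) & (occ <= max_occ) & (trunc <= max_trunc)
        difficulty = np.where(ok, code, difficulty)
    return difficulty


codes = kitti_difficulty_sketch(
    box_image_height=[60.0, 30.0, 30.0, 10.0],
    occlusion=[0, 1, 2, 0],
    truncation=[0.0, 0.2, 0.4, 0.0])
print(codes)
```

Boxes that satisfy none of the levels keep the code 0, which this sketch treats as "ignored".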
-
class
lingvo.tasks.car.kitti_input_generator.KITTILaserExtractor(*args, **kwargs)
Bases: lingvo.tasks.car.input_extractor.LaserExtractor
Base extractor for the laser points from a KITTI tf.Example.
-
class
lingvo.tasks.car.kitti_input_generator.KITTIImageExtractor(*args, **kwargs)
Bases: lingvo.tasks.car.input_extractor.FieldsExtractor
Extracts the image information (left camera) from a KITTI tf.Example.
- Produces:
image: [512, 1382, 3] - Floating point Tensor containing image data. Note that image may not be produced if decode_image is set to False. During training, we may not want to decode the images.
width: [1] - integer scalar width of the original image.
height: [1] - integer scalar height of the original image.
velo_to_image_plane: [3, 4] - transformation matrix from velo xyz to image plane xy. After multiplication, you need to divide by last coordinate to recover 2D pixel locations.
velo_to_camera: [4, 4] - transformation matrix from velo xyz to camera xyz.
camera_to_velo: [4, 4] - transformation matrix from camera xyz to velo xyz.
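The projection described for velo_to_image_plane can be sketched as follows: append a 1 to each velodyne point, multiply by the [3, 4] matrix, then divide by the last coordinate. The helper name project_velo_to_pixels and the toy matrix are assumptions for illustration; real matrices come from the calibration stored with each example.

```python
import numpy as np


def project_velo_to_pixels(points_xyz, velo_to_image_plane):
    """Projects [N, 3] velodyne points to [N, 2] pixel (x, y) locations."""
    points_xyz = np.asarray(points_xyz, dtype=np.float64)
    ones = np.ones((points_xyz.shape[0], 1))
    points_h = np.concatenate([points_xyz, ones], axis=1)  # [N, 4]
    projected = points_h @ np.asarray(velo_to_image_plane).T  # [N, 3]
    # Divide by the last coordinate to recover 2D pixel locations.
    return projected[:, :2] / projected[:, 2:3]


# Made-up matrix for illustration only: depth is the velodyne x axis,
# image x grows as velodyne y decreases, image y as velodyne z decreases.
toy_matrix = np.array([
    [600.0, -700.0, 0.0, 0.0],
    [180.0, 0.0, -700.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
])
pixels = project_velo_to_pixels([[10.0, 1.0, -0.5]], toy_matrix)
print(pixels)
```

Points whose last projected coordinate is zero or negative lie on or behind the camera plane and would need to be filtered out before the division; the sketch omits that step.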
-
_KITTI_MAX_HEIGHT = 512
-
_KITTI_MAX_WIDTH = 1382
-
class
lingvo.tasks.car.kitti_input_generator.KITTILabelExtractor(*args, **kwargs)
Bases: lingvo.tasks.car.input_extractor.FieldsExtractor
Extracts the object labels from a KITTI tf.Example.
- Emits:
bboxes_count: Scalar number of 2D bounding boxes in the example.
bboxes: [p.max_num_objects, 4] - 2D bounding box data in [ymin, xmin, ymax, xmax] format.
bboxes_padding: [p.max_num_objects] - Padding for bboxes.
bboxes_3d: [p.max_num_objects, 7] - 3D bounding box data in [x, y, z, dx, dy, dz, phi] format. x, y, z are the object center; dx, dy, dz are the dimensions of the box, and phi is the rotation angle around the z-axis. 3D bboxes are defined in the velodyne coordinate frame.
bboxes_3d_mask: [p.max_num_objects] - Mask for bboxes (mask is the inversion of padding).
bboxes3d_proj_to_image_plane: [p.max_num_objects, 8, 2] - For each bounding box, the 8 corners of the bounding box in projected image coordinates (x, y).
bboxes_td: [p.max_num_objects, 4] - The 3D bounding box data in top down projected coordinates (ymin, xmin, ymax, xmax). This currently ignores rotation.
bboxes_td_mask: [p.max_num_objects] - Mask for bboxes_td.
bboxes_3d_num_points: [p.max_num_objects] - Number of points in each box.
labels: [p.max_num_objects] - Integer label for each bounding box object corresponding to the index in KITTI_CLASS_NAMES.
texts: [p.max_num_objects] - The class name for each label in labels.
source_id: Scalar string. The unique identifier for each example.
See ComputeKITTIDifficulties for more information on the following:
box_image_height: [p.max_num_objects] - The height in pixels of each box in the projected image plane.
occlusion: [p.max_num_objects] - The occlusion level of each bounding box.
truncation: [p.max_num_objects] - The truncation level of each bounding box.
difficulties: [p.max_num_objects] - The computed difficulty based on the above three factors.
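To make the box layouts above concrete, the sketch below builds a padded bboxes_3d array, derives a mask as the inversion of the padding, and computes an axis-aligned top-down footprint that ignores phi. The helper name top_down_boxes_sketch is hypothetical, and keeping the footprint in velodyne coordinates is an assumption; the extractor may express bboxes_td in a different top-down coordinate space.

```python
import numpy as np


def top_down_boxes_sketch(bboxes_3d):
    """Axis-aligned [ymin, xmin, ymax, xmax] footprint, ignoring phi."""
    bboxes_3d = np.asarray(bboxes_3d, dtype=np.float32)
    # Columns follow the [x, y, z, dx, dy, dz, phi] layout.
    x, y, dx, dy = (bboxes_3d[:, i] for i in (0, 1, 3, 4))
    return np.stack(
        [y - dy / 2, x - dx / 2, y + dy / 2, x + dx / 2], axis=1)


# Two real boxes padded to max_num_objects = 3 (toy values).
bboxes_3d = np.array([
    [10.0, 2.0, -1.0, 4.0, 2.0, 1.5, 0.3],
    [20.0, -3.0, -1.0, 1.0, 1.0, 1.8, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # padding row
], dtype=np.float32)
# Assumed convention: 1.0 marks a padded slot, 0.0 a real box.
bboxes_padding = np.array([0.0, 0.0, 1.0], dtype=np.float32)

bboxes_3d_mask = 1.0 - bboxes_padding  # mask is the inversion of padding
bboxes_td = top_down_boxes_sketch(bboxes_3d)
print(bboxes_3d_mask)
print(bboxes_td)
```

Because rotation is ignored, the footprint of the first box (phi = 0.3) is only an approximation of its true top-down extent.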
-
KITTI_CLASS_NAMES = ['Background', 'Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc', 'DontCare']
-
SUBCLASS_DICT = {'cyclist': [6], 'human': [4, 5], 'motor': [1, 2, 3, 7], 'pedestrian': [4]}
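The two constants are related by index: each list in SUBCLASS_DICT holds positions in KITTI_CLASS_NAMES. The short example below copies both values from this page and resolves the indices to names; the variables subclass_names, labels, and texts are introduced here only for illustration.

```python
KITTI_CLASS_NAMES = [
    'Background', 'Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting',
    'Cyclist', 'Tram', 'Misc', 'DontCare'
]
SUBCLASS_DICT = {
    'cyclist': [6],
    'human': [4, 5],
    'motor': [1, 2, 3, 7],
    'pedestrian': [4],
}

# Resolve each subclass group to the class names it covers.
subclass_names = {
    group: [KITTI_CLASS_NAMES[i] for i in ids]
    for group, ids in SUBCLASS_DICT.items()
}
for group, names in subclass_names.items():
    print(group, names)

# The same lookup turns integer labels into their class-name texts.
labels = [1, 6, 4]
texts = [KITTI_CLASS_NAMES[label] for label in labels]
print(texts)
```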
-
class
lingvo.tasks.car.kitti_input_generator.KITTIBase(*args, **kwargs)
Bases: lingvo.tasks.car.base_extractor._BaseExtractor
KITTI dataset base parameters.
-
classmethod
Params(*args, **kwargs)
Default params.
- Parameters
extractors – A hyperparams.Params mapping extractor names to Extractors. A few extractor types are required: 'labels': A LabelExtractor.Params().
- Returns
A base_layer Params object.
-
property
class_names
-
classmethod
-
class
lingvo.tasks.car.kitti_input_generator.KITTILaser(*args, **kwargs)
Bases: lingvo.tasks.car.kitti_input_generator.KITTIBase
KITTI object detection dataset.
This class emits KITTI images, labels, and the raw laser representation of the data. See KITTIGrid and KITTISparse for alternative laser representations.
- Input batch contains outputs from:
KITTIImageExtractor
KITTILabelExtractor
KITTILaserExtractor
-
property
class_names
-
class
lingvo.tasks.car.kitti_input_generator.KITTISparseLaser(*args, **kwargs)
Bases: lingvo.tasks.car.kitti_input_generator.KITTIBase
KITTI object detection dataset for sparse detection models.
This class emits KITTI images, labels, and the sparse laser representation of the data. See KITTIGrid and KITTISparse for alternative laser representations.
- Input batch contains outputs from:
KITTILabelExtractor
KITTILaserExtractor
- Transformed with:
Metadata annotation: CountNumberOfPointsInBoxes3D
Visualization: CreateDecoderCopy
Sparse gather of points for featurization: SparseCenterSelector, SparseCellGatherFeatures
Anchor creation for classification regression targets: TileAnchorBBoxes, AnchorAssignment
-
class
lingvo.tasks.car.kitti_input_generator.KITTIGrid(*args, **kwargs)
Bases: lingvo.tasks.car.kitti_input_generator.KITTIBase
KITTI object detection dataset.
This class emits KITTI images, labels, and the fixed grid laser representation of the data.
- Input batch contains outputs from:
KITTILabelExtractor
KITTILaserExtractor
- Transformed with:
Metadata annotation: CountNumberOfPointsInBoxes3D
Visualization: CreateDecoderCopy
Points to Pillars: PointsToGrid, GridToPillars
Anchor creation for classification regression targets: GridAnchorCenters, TileAnchorBBoxes, AnchorAssignment