lingvo.tasks.car.tools.kitti_data module
Library for parsing KITTI raw data.
-
lingvo.tasks.car.tools.kitti_data.LoadVeloBinFile(filepath)
Reads and parses a raw KITTI Velodyne binary file.
- Parameters
filepath – Path to a raw KITTI velodyne binary file.
- Returns
A dictionary with keys xyz and reflectance containing numpy arrays.
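As an illustration of the file layout, here is a minimal sketch. It assumes the raw Velodyne file is a flat sequence of float32 values with four values per point (x, y, z, reflectance). The name load_velo_bin_sketch is hypothetical and is not part of this module, and the reflectance shape of [N, 1] is a choice made for the sketch.

```python
import os
import tempfile

import numpy as np


def load_velo_bin_sketch(filepath):
    """Hypothetical stand-in for illustration, not the lingvo implementation."""
    # Assumption: flat float32 values, four per point: x, y, z, reflectance.
    points = np.fromfile(filepath, dtype=np.float32).reshape(-1, 4)
    return {
        'xyz': points[:, :3],          # shape [N, 3]
        'reflectance': points[:, 3:],  # shape [N, 1]
    }


# Round-trip two made-up points through a temporary file.
demo_points = np.array(
    [[1.0, 2.0, 3.0, 0.5], [4.0, 5.0, 6.0, 0.25]], dtype=np.float32)
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, 'demo.bin')
    demo_points.tofile(demo_path)
    scan = load_velo_bin_sketch(demo_path)
```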
-
lingvo.tasks.car.tools.kitti_data.LoadLabelFile(filepath)
Reads and parses a raw KITTI label file.
The ordering of the arrays for bbox, dimensions, and location follows the order in the table below. We refer to the length (dx), width (dy), height (dz) for clarity.
Each line in the label contains (per KITTI documentation):
| Values | Name | Description |
|---|---|---|
| 1 | type | Describes the type of object: ‘Car’, ‘Van’, ‘Truck’, ‘Pedestrian’, ‘Person_sitting’, ‘Cyclist’, ‘Tram’, ‘Misc’ or ‘DontCare’ |
| 1 | truncated | Float from 0 (non-truncated) to 1 (truncated), where truncated refers to the object leaving image boundaries. |
| 1 | occluded | Integer (0,1,2,3) indicating occlusion state: 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown |
| 1 | alpha | Observation angle of object, ranging [-pi..pi] |
| 4 | bbox | 2D bounding box of object in the image (0-based index): left, top, right, bottom pixel coordinates. |
| 3 | dimensions | 3D object dimensions: height, width, length (meters) |
| 3 | location | 3D object location x,y,z in camera coordinates (in meters) |
| 1 | rotation_y | Rotation ry around Y-axis in camera coordinates [-pi..pi] |
| 1 | score | Only for results: Float, indicating confidence in detection, needed for p/r curves, higher is better. |
- Parameters
filepath – Path to a raw KITTI label file.
- Returns
A list of dictionaries with keys corresponding to the Name column above: type, truncated, occluded, alpha, bbox, dimensions, location, rotation_y, score. Note that the order of the floats in bbox, dimensions, and location follows the order given in the table above.
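To make the field layout concrete, the following sketch splits one label line into the named fields. The function name parse_label_line_sketch and the sample line are illustrative assumptions, and treating score as present only when a 16th field exists is a choice made for this sketch, not a statement about how LoadLabelFile behaves.

```python
def parse_label_line_sketch(line):
    """Hypothetical parser for a single KITTI label line (illustration only)."""
    fields = line.split()
    obj = {
        'type': fields[0],
        'truncated': float(fields[1]),
        'occluded': int(fields[2]),
        'alpha': float(fields[3]),
        # left, top, right, bottom (pixels)
        'bbox': [float(v) for v in fields[4:8]],
        # height, width, length (meters)
        'dimensions': [float(v) for v in fields[8:11]],
        # x, y, z in camera coordinates (meters)
        'location': [float(v) for v in fields[11:14]],
        'rotation_y': float(fields[14]),
    }
    # Assumption: score appears as a 16th field only in result files.
    if len(fields) > 15:
        obj['score'] = float(fields[15])
    return obj


# Made-up sample line in the column order of the table.
sample_line = ('Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 '
               '1.65 1.67 3.64 -0.65 1.71 46.70 -1.59')
sample_obj = parse_label_line_sketch(sample_line)
```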
-
lingvo.tasks.car.tools.kitti_data._ValidateLabeledObject(obj)
Validate that obj has expected values.
-
lingvo.tasks.car.tools.kitti_data.ParseCalibrationDict(raw_calib)
Parse transformation matrices in a raw KITTI calibration dictionary.
Per the KITTI documentation:
All matrices are stored row-major, i.e., the first values correspond to the first row. R0_rect contains a 3x3 matrix which you need to extend to a 4x4 matrix by adding a 1 as the bottom-right element and 0’s elsewhere. Tr_xxx is a 3x4 matrix (R|t), which you need to extend to a 4x4 matrix in the same way.
IMPORTANT: The coordinates in the camera coordinate system can be projected in the image by using the 3x4 projection matrix in the calib folder, where for the left color camera for which the images are provided, P2 must be used.
- Parameters
raw_calib – A dictionary of raw KITTI calibration values with keys P0, P1, P2, P3, R0_rect, Tr_imu_to_velo, and Tr_velo_to_cam containing flattened matrices of appropriate size.
- Returns
A dictionary with keys P0, P1, P2, P3, R0_rect, Tr_imu_to_velo, and Tr_velo_to_cam containing reshaped and extended matrices.
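The reshape-and-extend step quoted from the KITTI documentation can be sketched as follows. The helper names are hypothetical, and leaving the P matrices as 3x4 while extending only R0_rect and the Tr_* matrices to 4x4 is an assumption made for this sketch.

```python
import numpy as np


def extend_to_4x4_sketch(matrix):
    """Embed a 3x3 or 3x4 matrix in a 4x4 identity.

    This puts a 1 in the bottom-right element and 0s in every other
    position that the input matrix does not cover.
    """
    extended = np.eye(4, dtype=np.float64)
    rows, cols = matrix.shape
    extended[:rows, :cols] = matrix
    return extended


def parse_calibration_sketch(raw_calib):
    """Hypothetical stand-in for illustration, not the lingvo implementation."""
    calib = {}
    # Assumption: projection matrices stay 3x4.
    for key in ('P0', 'P1', 'P2', 'P3'):
        calib[key] = np.asarray(raw_calib[key], dtype=np.float64).reshape(3, 4)
    r0 = np.asarray(raw_calib['R0_rect'], dtype=np.float64).reshape(3, 3)
    calib['R0_rect'] = extend_to_4x4_sketch(r0)
    for key in ('Tr_imu_to_velo', 'Tr_velo_to_cam'):
        tr = np.asarray(raw_calib[key], dtype=np.float64).reshape(3, 4)
        calib[key] = extend_to_4x4_sketch(tr)
    return calib


# Made-up flattened (row-major) values.
demo_raw_calib = {
    'P0': list(range(12)), 'P1': list(range(12)),
    'P2': list(range(12)), 'P3': list(range(12)),
    'R0_rect': [1, 0, 0, 0, 1, 0, 0, 0, 1],
    'Tr_imu_to_velo': [1, 0, 0, 0.5, 0, 1, 0, 0, 0, 0, 1, 0],
    'Tr_velo_to_cam': [0, -1, 0, 0.1, 0, 0, -1, -0.2, 1, 0, 0, 0.3],
}
demo_calib = parse_calibration_sketch(demo_raw_calib)
```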
-
lingvo.tasks.car.tools.kitti_data.LoadCalibrationFile(filepath)
Read and parse a raw KITTI calibration file.
- Parameters
filepath – Path to a raw KITTI calibration file.
- Returns
A dictionary with keys P0, P1, P2, P3, R0_rect, Tr_imu_to_velo, and Tr_velo_to_cam containing reshaped and extended transformation matrices.
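For orientation, a raw calibration file can be read into a dictionary of flattened values along these lines. The sketch assumes each non-empty line has the form "key: v1 v2 ...", works on a text string rather than a file path to stay self-contained, and uses a hypothetical function name. Its output corresponds to the raw dictionary that would then be reshaped and extended.

```python
import numpy as np


def read_raw_calibration_sketch(text):
    """Hypothetical reader for 'key: values' calibration text (illustration)."""
    raw_calib = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # Skip blank lines, e.g. a trailing newline.
        key, _, values = line.partition(':')
        raw_calib[key.strip()] = np.array(
            [float(v) for v in values.split()], dtype=np.float64)
    return raw_calib


# Made-up calibration text with two of the expected keys.
demo_text = (
    'R0_rect: 1 0 0 0 1 0 0 0 1\n'
    'Tr_velo_to_cam: 0 -1 0 1e-01 0 0 -1 -2e-01 1 0 0 3e-01\n'
    '\n'
)
demo_raw = read_raw_calibration_sketch(demo_text)
```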
-
lingvo.tasks.car.tools.kitti_data.VeloToImagePlaneTransformation(calib)
Compute the transformation matrix from velo xyz to image plane xy.
Per the KITTI documentation, to project a point from Velodyne coordinates into the left color image, you can use this formula:
x = P2 * R0_rect * Tr_velo_to_cam * y
After applying the transformation, divide by the last coordinate to recover the 2D pixel locations.
- Parameters
calib – A calibration dictionary returned by LoadCalibrationFile.
- Returns
A numpy 3x4 transformation matrix.
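A short sketch of how such a 3x4 matrix might be applied to a point cloud, including the division by the last coordinate. The function name and the calibration values are made up for illustration. The sketch also assumes all points lie in front of the camera, so points with non-positive depth would need to be filtered out first.

```python
import numpy as np


def project_velo_to_image_sketch(points_xyz, velo_to_image):
    """Project [N, 3] Velodyne points to [N, 2] pixels (illustration only)."""
    num_points = points_xyz.shape[0]
    # Append a 1 to each point to get homogeneous coordinates, shape [N, 4].
    homogeneous = np.concatenate(
        [points_xyz, np.ones((num_points, 1))], axis=1)
    projected = homogeneous @ velo_to_image.T  # shape [N, 3]
    # Divide by the last coordinate to recover pixel locations.
    return projected[:, :2] / projected[:, 2:3]


# Made-up calibration: focal length 700, principal point (600, 180).
p2 = np.array([[700.0, 0.0, 600.0, 0.0],
               [0.0, 700.0, 180.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])
r0_rect = np.eye(4)
# Axis swap only: camera x = -velo y, camera y = -velo z, camera z = velo x.
tr_velo_to_cam = np.array([[0.0, -1.0, 0.0, 0.0],
                           [0.0, 0.0, -1.0, 0.0],
                           [1.0, 0.0, 0.0, 0.0],
                           [0.0, 0.0, 0.0, 1.0]])
velo_to_image = p2 @ r0_rect @ tr_velo_to_cam  # shape [3, 4]

velo_points = np.array([[10.0, 0.0, 0.0], [10.0, -1.0, 0.0]])
pixels = project_velo_to_image_sketch(velo_points, velo_to_image)
```

With these made-up values, a point 10 m straight ahead lands on the principal point, and a point 1 m to the right (negative Velodyne y) lands 70 pixels to its right.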
-
lingvo.tasks.car.tools.kitti_data.VeloToCameraTransformation(calib)
Compute the transformation matrix from velo xyz to camera xyz.
Per the KITTI documentation, to project a point from Velodyne coordinates into the left color image, you can use this formula:
x = P2 * R0_rect * Tr_velo_to_cam * y
NOTE: The above formula further projects the xyz point to the image plane using P2, which we do not apply in this function since we are working with xyz (3D coordinates).
- Parameters
calib – A calibration dictionary returned by LoadCalibrationFile.
- Returns
A numpy 4x4 transformation matrix.
-
lingvo.tasks.car.tools.kitti_data.CameraToVeloTransformation(calib)
Compute the transformation matrix from camera to velo.
This is the inverse transformation of VeloToCameraTransformation.
- Parameters
calib – A calibration dictionary returned by LoadCalibrationFile.
- Returns
A numpy 4x4 transformation matrix.
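The relationship between the two 4x4 transformations can be sketched as below. It assumes the Velodyne-to-camera matrix is R0_rect * Tr_velo_to_cam (the documented formula without P2) and that the camera-to-Velodyne matrix is its matrix inverse. The function names and calibration values are hypothetical.

```python
import numpy as np


def velo_to_camera_sketch(calib):
    """4x4 matrix mapping homogeneous velo xyz to camera xyz (illustration)."""
    return calib['R0_rect'] @ calib['Tr_velo_to_cam']


def camera_to_velo_sketch(calib):
    """4x4 inverse of the velo-to-camera matrix (illustration)."""
    return np.linalg.inv(velo_to_camera_sketch(calib))


# Made-up calibration: an axis swap plus a small translation.
sketch_calib = {
    'R0_rect': np.eye(4),
    'Tr_velo_to_cam': np.array([[0.0, -1.0, 0.0, 0.1],
                                [0.0, 0.0, -1.0, -0.2],
                                [1.0, 0.0, 0.0, 0.3],
                                [0.0, 0.0, 0.0, 1.0]]),
}
to_camera = velo_to_camera_sketch(sketch_calib)
to_velo = camera_to_velo_sketch(sketch_calib)

# Map one homogeneous Velodyne point to the camera frame and back.
point_velo = np.array([10.0, 0.0, 0.0, 1.0])
point_camera = to_camera @ point_velo
point_back = to_velo @ point_camera
```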
-
lingvo.tasks.car.tools.kitti_data.AnnotateKITTIObjectsWithBBox3D(objects, calib)
Add 3D bounding boxes in our canonical format to KITTI objects.
The annotated 3D bounding boxes are in the Velodyne coordinate frame.
- Parameters
objects – A list of KITTI objects returned by LoadLabelFile.
calib – A calibration dictionary returned by LoadCalibrationFile.
- Returns
The original list of KITTI objects, where each object has two new keys: ‘has_3d_info’, indicating whether the object has valid 3D bounding box data, and ‘bbox3d’, the bounding box in our canonical 3D format.
-
lingvo.tasks.car.tools.kitti_data._KITTIObjectHas3DInfo(obj)
Check whether KITTI object has valid 3D bounding box information.