obs_utils
`create_video_writer(fpath, resolution, codec_name='libx264', rate=30, pix_fmt='yuv420p', stream_options=None, context_options=None)`

Creates a PyAV video writer to write video frames to when playing back the dataset.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `fpath` | str | Absolute path that the generated video writer will write to. Should end in .mp4 or .mkv | required |
| `resolution` | tuple | Resolution of the video frames to write (height, width) | required |
| `codec_name` | str | Codec to use for the video writer | `'libx264'` |
| `rate` | int | Frame rate of the video writer | `30` |
| `pix_fmt` | str | Pixel format to use for the video writer | `'yuv420p'` |
| `stream_options` | dict | Additional stream options to pass to the video writer | `None` |
| `context_options` | dict | Additional context options to pass to the video writer | `None` |
Returns:

- `av.Container`: PyAV container object that can be used to write video frames
- `av.Stream`: PyAV stream object that can be used to write video frames
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
`decode_depth_frame(frame)`

Decodes a depth frame by extracting the quantized depth from the Y plane and then dequantizing it back to float values.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `frame` | ndarray | Encoded depth tensor of shape (H, W) with uint16 values | required |

Returns:

- `np.ndarray`: Decoded depth tensor of shape (H, W) with float values
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
`depth_to_pcd(depth, rel_pose, K)`

Convert depth images to point clouds with batch processing support.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `depth` | tensor | (B, H, W) depth tensor | required |
| `rel_pose` | tensor | (B, 7) relative pose from camera to base tensor [pos, quat] | required |
| `K` | tensor | (3, 3) camera intrinsics tensor | required |

Returns:

- `pc`: (B, H, W, 3) point cloud tensor in base frame
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
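The unprojection behind `depth_to_pcd` can be sketched in plain NumPy for a single frame. This is an illustrative re-implementation, not the library's code: it assumes the standard pinhole camera model and an xyzw quaternion convention for the `[pos, quat]` layout of `rel_pose` (the actual convention may differ).

```python
import numpy as np

def quat_to_rot(q):
    """Convert an xyzw quaternion to a 3x3 rotation matrix (assumed convention)."""
    x, y, z, w = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def depth_to_pcd_single(depth, rel_pose, K):
    """Unproject an (H, W) depth map to an (H, W, 3) point cloud in the base frame."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    pc_cam = np.stack([x, y, depth], axis=-1)  # camera-frame points
    pos, quat = rel_pose[:3], rel_pose[3:]     # [pos, quat] layout from the docstring
    R = quat_to_rot(quat)
    return pc_cam @ R.T + pos                  # rotate, then translate into the base frame
```

With an identity pose this reduces to plain pinhole unprojection; the batched library version applies the same math per batch element.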
`dequantize_depth(quantized_depth, min_depth=MIN_DEPTH, max_depth=MAX_DEPTH, shift=DEPTH_SHIFT)`

Dequantizes a 12-bit depth tensor back to the original depth values.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `quantized_depth` | ndarray or tensor | Quantized depth tensor | required |
| `min_depth` | float | Minimum depth value | `MIN_DEPTH` |
| `max_depth` | float | Maximum depth value | `MAX_DEPTH` |
| `shift` | float | Small value to shift depth to avoid log(0) | `DEPTH_SHIFT` |

Returns:

- `np.ndarray` or `th.Tensor`: Dequantized depth tensor
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
`downsample_pcd(color_pcd, num_points, use_fps=True)`

Downsample point clouds with batched farthest point sampling (FPS) or random sampling.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `color_pcd` | tensor | (B, [T], N, 6) point cloud tensor [rgb, xyz] for each batch | required |
| `num_points` | int | Target number of points | required |
| `use_fps` | bool | Whether to use farthest point sampling instead of random sampling | `True` |

Returns:

- `color_pcd`: (B, num_points, 6) downsampled point cloud
- `sampled_idx`: (B, num_points) sampled indices
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
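The `use_fps=True` path refers to farthest point sampling, which can be sketched as follows. This is a hypothetical single-cloud NumPy version of the idea, not the library's batched implementation:

```python
import numpy as np

def farthest_point_sample(xyz, num_points):
    """Greedy FPS: repeatedly pick the point farthest from the already-chosen set.

    xyz: (N, 3) array; returns indices of the sampled points, shape (num_points,).
    """
    n = xyz.shape[0]
    sampled = np.zeros(num_points, dtype=np.int64)
    # Distance from each point to its nearest already-sampled point
    dist = np.full(n, np.inf)
    sampled[0] = 0  # deterministic start; a random start index is also common
    for i in range(1, num_points):
        d = np.linalg.norm(xyz - xyz[sampled[i - 1]], axis=1)
        dist = np.minimum(dist, d)
        sampled[i] = int(np.argmax(dist))
    return sampled
```

Unlike random sampling, FPS spreads the kept points evenly over the cloud, which is why it is usually preferred for geometry-sensitive downstream models despite its O(N · num_points) cost.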
`encode_depth_frame(depth)`

Encodes a depth frame by quantizing it to a 12-bit range and then packing it into YUV420p12le format.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `depth` | ndarray or tensor | Depth tensor of shape (H, W) with float values | required |

Returns:

- `av.VideoFrame`: Encoded depth frame in YUV420p12le format, where the Y plane contains the quantized depth values
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
`instance_id_to_instance(obs, instance_id_mapping, unique_ins_ids)`

Instance_id segmentation maps each unique visual mesh of an object (e.g. /World/scene_name/object_name/visual_mesh_0) to its own id. This function merges all visual meshes of the same object instance into a single instance id.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `obs` | Tensor | (N, H, W) instance_id segmentation | required |
| `instance_id_mapping` | Dict[int, str] | Dict mapping instance_id ids to instance names | required |
| `unique_ins_ids` | List[int] | List of unique instance_id ids | required |

Returns:

- `instance_seg` (th.Tensor): (N, H, W) instance segmentation
- `instance_mapping` (Dict[int, str]): Dict mapping instance ids to instance names
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
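The merging step can be sketched like this. It is an assumption-laden illustration: it supposes the per-object instance name is simply the mesh's prim path with the trailing mesh component dropped, and that every id appearing in `obs` is present in the mapping (the real implementation may group differently).

```python
import numpy as np

def merge_visual_meshes(obs, instance_id_mapping):
    """Merge per-mesh ids into per-object instance ids (illustrative sketch).

    obs: (N, H, W) int array of instance_id segmentation.
    instance_id_mapping: {id: '/World/scene/object/visual_mesh_k'}.
    Returns the remapped segmentation and {new_id: object_path}.
    """
    # Group mesh paths by their parent object path (drop the mesh component)
    obj_paths = sorted({path.rsplit("/", 1)[0] for path in instance_id_mapping.values()})
    obj_to_new = {p: i for i, p in enumerate(obj_paths)}
    old_to_new = {
        old: obj_to_new[path.rsplit("/", 1)[0]]
        for old, path in instance_id_mapping.items()
    }
    out = np.vectorize(old_to_new.get)(obs)  # remap every pixel id
    return out, {i: p for p, i in obj_to_new.items()}
```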
`instance_to_bbox(obs, instance_mapping, unique_ins_ids)`

Convert instance segmentation to bounding boxes.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `obs` | Tensor | (N, H, W) tensor of instance IDs | required |
| `instance_mapping` | Dict[int, str] | Dict mapping instance IDs to instance names. Note: this does not need to include all instance IDs, only the ones that we want to generate bounding boxes for | required |
| `unique_ins_ids` | List[int] | List of unique instance IDs | required |

Returns:

- List of N lists, each containing tuples (x_min, y_min, x_max, y_max, instance_id) for each instance
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
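The per-frame box extraction amounts to taking the min/max pixel coordinates of each instance mask. A minimal single-frame sketch (the function name and exact looping are ours, not the library's):

```python
import numpy as np

def instance_to_bbox_single(seg, instance_mapping):
    """Extract (x_min, y_min, x_max, y_max, instance_id) boxes from one (H, W) map."""
    boxes = []
    for ins_id in instance_mapping:  # only the ids we want boxes for
        ys, xs = np.nonzero(seg == ins_id)
        if len(xs) == 0:
            continue  # instance not visible in this frame
        boxes.append((int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()), ins_id))
    return boxes
```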
`instance_to_semantic(obs, instance_mapping, unique_ins_ids, is_instance_id=True)`

Convert instance / instance_id segmentation to semantic segmentation.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `obs` | Tensor | (N, H, W) instance / instance_id segmentation | required |
| `instance_mapping` | Dict[int, str] | Dict mapping instance IDs to instance names | required |
| `unique_ins_ids` | List[int] | List of unique instance IDs | required |
| `is_instance_id` | bool | Whether the input is instance_id segmentation | `True` |

Returns:

- `semantic_seg` (th.Tensor): (N, H, W) semantic segmentation
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
`process_fused_point_cloud(obs, camera_intrinsics, pcd_range, pcd_num_points=None, use_fps=True, verbose=False)`

Given a dictionary of observations, process the fused point cloud from all cameras and return the final point cloud tensor in the robot base frame.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `obs` | dict | Dictionary of observations containing point cloud data from different cameras | required |
| `camera_intrinsics` | Dict[str, th.Tensor] | Dictionary of camera intrinsics for each camera | required |
| `pcd_range` | Tuple[float, float, float, float, float, float] | Range of the point cloud to filter [x_min, x_max, y_min, y_max, z_min, z_max] | required |
| `pcd_num_points` | Optional[int] | Number of points to sample from the point cloud. If None, no downsampling is performed | `None` |
| `use_fps` | bool | Whether to use farthest point sampling for point cloud downsampling | `True` |
| `verbose` | bool | Whether to print verbose output during processing | `False` |
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
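The `pcd_range` filtering step is an axis-aligned crop of the fused cloud. A small sketch of that crop, assuming the (N, 6) [rgb, xyz] layout described for `downsample_pcd` (function name is ours):

```python
import numpy as np

def crop_pcd(color_pcd, pcd_range):
    """Keep points whose xyz falls inside [x_min, x_max, y_min, y_max, z_min, z_max].

    color_pcd: (N, 6) array [rgb, xyz]; xyz assumed to be the last three columns.
    """
    x_min, x_max, y_min, y_max, z_min, z_max = pcd_range
    xyz = color_pcd[:, 3:6]
    mask = (
        (xyz[:, 0] >= x_min) & (xyz[:, 0] <= x_max)
        & (xyz[:, 1] >= y_min) & (xyz[:, 1] <= y_max)
        & (xyz[:, 2] >= z_min) & (xyz[:, 2] <= z_max)
    )
    return color_pcd[mask]
```

Cropping before any downsampling keeps `pcd_num_points` from being wasted on background points outside the workspace.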
`quantize_depth(depth, min_depth=MIN_DEPTH, max_depth=MAX_DEPTH, shift=DEPTH_SHIFT)`

Quantizes depth values to a 12-bit range (4096 levels, 0 to 4095) based on the specified min and max depth.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `depth` | ndarray or tensor | Depth tensor | required |
| `min_depth` | float | Minimum depth value | `MIN_DEPTH` |
| `max_depth` | float | Maximum depth value | `MAX_DEPTH` |
| `shift` | float | Small value to shift depth to avoid log(0) | `DEPTH_SHIFT` |

Returns:

- `np.ndarray`: Quantized depth tensor
Source code in OmniGibson/omnigibson/learning/utils/obs_utils.py
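The `shift` parameter ("to avoid log(0)") suggests the quantization is done in log space. A plausible sketch of such a scheme, with made-up constants standing in for `MIN_DEPTH`, `MAX_DEPTH`, and `DEPTH_SHIFT` (the real formula and constants live in the source file):

```python
import numpy as np

MIN_DEPTH, MAX_DEPTH, DEPTH_SHIFT = 0.0, 10.0, 1e-3  # illustrative values only

def quantize_depth(depth, min_depth=MIN_DEPTH, max_depth=MAX_DEPTH, shift=DEPTH_SHIFT):
    """Map depth to 12-bit integers (0..4095) on a log scale (assumed scheme)."""
    depth = np.clip(depth, min_depth, max_depth)
    lo, hi = np.log(min_depth + shift), np.log(max_depth + shift)
    t = (np.log(depth + shift) - lo) / (hi - lo)  # normalize to [0, 1]
    return np.round(t * 4095).astype(np.uint16)

def dequantize_depth(q, min_depth=MIN_DEPTH, max_depth=MAX_DEPTH, shift=DEPTH_SHIFT):
    """Invert the quantization back to float depth values."""
    lo, hi = np.log(min_depth + shift), np.log(max_depth + shift)
    return np.exp(q.astype(np.float64) / 4095 * (hi - lo) + lo) - shift
```

A log-space mapping spends more of the 4096 levels on near depths, where sensor precision matters most; the round trip through `quantize_depth` then `dequantize_depth` loses at most half a quantization step in log space.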
`write_video(obs, video_writer, mode='rgb', batch_size=None, **kwargs)`

Writes videos to the specified video writers using the current trajectory history.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `obs` | ndarray | Observation data | required |
| `video_writer` | (container, stream) | PyAV container and stream objects to write video frames to | required |
| `mode` | str | Mode to write video frames in. Only "rgb", "depth" and "seg" are supported | `'rgb'` |
| `batch_size` | int | Number of frames to write at a time. If None, write all video frames at once | `None` |
| `kwargs` | dict | Additional keyword arguments to pass to the video writer | `{}` |