perceptionmetrics.datasets package¶
Submodules¶
perceptionmetrics.datasets.coco module¶
- class perceptionmetrics.datasets.coco.CocoDataset(annotation_file, image_dir, split='train')[source]¶
Bases: ImageDetectionDataset
Specific class for COCO-styled object detection datasets.
- Parameters:
annotation_file (str) – Path to the COCO-format JSON annotation file
image_dir (str) – Path to the directory containing image files
split (str) – Dataset split name (e.g., “train”, “val”, “test”)
- read_annotation(fname)[source]¶
Return bounding boxes and category indices for a given image ID.
This method uses COCO’s efficient indexing to load annotations on-demand. The COCO object maintains an internal index that allows for very fast annotation retrieval without needing a separate cache.
- Parameters:
fname (str) – Image ID in string form
- Return type:
Tuple[List[List[float]], List[int], List[int]]
- Returns:
Tuple of (boxes, category_indices)
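The on-demand lookup described above can be sketched in plain Python. This is not the actual implementation (which relies on the COCO object's internal index); `build_index` and this standalone `read_annotation` helper are hypothetical names that only illustrate how an image_id → annotations index makes per-image retrieval fast without a separate cache:

```python
from collections import defaultdict

def build_index(coco_dict):
    # Group annotations by image_id once, so each later lookup is O(1).
    index = defaultdict(list)
    for ann in coco_dict["annotations"]:
        index[ann["image_id"]].append(ann)
    return index

def read_annotation(index, image_id):
    # COCO boxes are [x, y, w, h]; category ids map into the ontology.
    anns = index.get(int(image_id), [])
    boxes = [ann["bbox"] for ann in anns]
    cats = [ann["category_id"] for ann in anns]
    return boxes, cats

coco_dict = {
    "annotations": [
        {"image_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0], "category_id": 2},
        {"image_id": 1, "bbox": [5.0, 5.0, 10.0, 10.0], "category_id": 1},
        {"image_id": 2, "bbox": [0.0, 0.0, 8.0, 8.0], "category_id": 2},
    ]
}
index = build_index(coco_dict)
boxes, cats = read_annotation(index, "1")
```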
- perceptionmetrics.datasets.coco.build_coco_dataset(annotation_file, image_dir, coco_obj=None, split='train')[source]¶
Build dataset and ontology dictionaries from COCO dataset structure
- Parameters:
annotation_file (str) – Path to the COCO-format JSON annotation file
image_dir (str) – Path to the directory containing image files
coco_obj (COCO) – Optional pre-loaded COCO object to reuse
split (str) – Dataset split name (e.g., “train”, “val”, “test”)
- Returns:
Dataset DataFrame and ontology dictionary
- Return type:
Tuple[pd.DataFrame, dict]
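As a rough sketch of the ontology side of this builder: COCO stores categories with non-contiguous ids, so a mapping to contiguous indices is typically derived. The exact ontology schema used by perceptionmetrics is not shown here; this hypothetical `build_ontology` assumes a simple name → index layout:

```python
def build_ontology(categories):
    # COCO category ids need not be contiguous, so remap them to 0..N-1
    # in id order (assumed convention; the real schema may differ).
    ordered = sorted(categories, key=lambda c: c["id"])
    return {c["name"]: i for i, c in enumerate(ordered)}

categories = [
    {"id": 3, "name": "car"},
    {"id": 1, "name": "person"},
]
ontology = build_ontology(categories)
```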
perceptionmetrics.datasets.detection module¶
- class perceptionmetrics.datasets.detection.DetectionDataset(dataset, dataset_dir, ontology)[source]¶
Bases: PerceptionDataset
Abstract perception detection dataset class.
- Parameters:
dataset (DataFrame)
dataset_dir (str)
ontology (dict)
- class perceptionmetrics.datasets.detection.ImageDetectionDataset(dataset, dataset_dir, ontology)[source]¶
Bases: DetectionDataset
Image detection dataset class.
- Parameters:
dataset (DataFrame)
dataset_dir (str)
ontology (dict)
- class perceptionmetrics.datasets.detection.LiDARDetectionDataset(dataset, dataset_dir, ontology, is_kitti_format=True)[source]¶
Bases: DetectionDataset
LiDAR detection dataset class.
- Parameters:
dataset (DataFrame)
dataset_dir (str)
ontology (dict)
is_kitti_format (bool)
perceptionmetrics.datasets.gaia module¶
- class perceptionmetrics.datasets.gaia.GaiaImageSegmentationDataset(dataset_fname)[source]¶
Bases: ImageSegmentationDataset
Specific class for GAIA-styled image segmentation datasets.
- Parameters:
dataset_fname (str) – Parquet dataset filename
- class perceptionmetrics.datasets.gaia.GaiaLiDARSegmentationDataset(dataset_fname)[source]¶
Bases: LiDARSegmentationDataset
Specific class for GAIA-styled LiDAR segmentation datasets.
- Parameters:
dataset_fname (str) – Parquet dataset filename
- perceptionmetrics.datasets.gaia.build_dataset(dataset_fname)[source]¶
Build dataset and ontology dictionaries from GAIA-like dataset structure
- Parameters:
dataset_fname (str) – Parquet dataset filename
- Returns:
Dataset DataFrame, dataset directory, and ontology
- Return type:
Tuple[pd.DataFrame, str, dict]
perceptionmetrics.datasets.generic module¶
- class perceptionmetrics.datasets.generic.GenericImageSegmentationDataset(data_suffix, label_suffix, ontology_fname, train_dataset_dir=None, val_dataset_dir=None, test_dataset_dir=None)[source]¶
Bases: ImageSegmentationDataset
Generic class for image segmentation datasets.
- Parameters:
data_suffix (str) – File suffix to be used to filter data
label_suffix (str) – File suffix to be used to filter labels
ontology_fname (str) – JSON file containing either a list of classes or a dictionary with class names as keys and class indexes + rgb as values
train_dataset_dir (str) – Directory containing training data
val_dataset_dir (str, optional) – Directory containing validation data, defaults to None
test_dataset_dir (str, optional) – Directory containing test data, defaults to None
- class perceptionmetrics.datasets.generic.GenericLiDARSegmentationDataset(data_suffix, label_suffix, ontology_fname, train_dataset_dir=None, val_dataset_dir=None, test_dataset_dir=None)[source]¶
Bases: LiDARSegmentationDataset
Generic class for LiDAR segmentation datasets.
- Parameters:
data_suffix (str) – File suffix to be used to filter data
label_suffix (str) – File suffix to be used to filter labels
ontology_fname (str) – JSON file containing either a list of classes or a dictionary with class names as keys and class indexes + rgb as values
train_dataset_dir (str) – Directory containing training data
val_dataset_dir (str, optional) – Directory containing validation data, defaults to None
test_dataset_dir (str, optional) – Directory containing test data, defaults to None
- perceptionmetrics.datasets.generic.build_dataset(data_suffix, label_suffix, ontology_fname, train_dataset_dir=None, val_dataset_dir=None, test_dataset_dir=None)[source]¶
Build dataset and ontology dictionaries
- Parameters:
train_dataset_dir (str) – Directory containing training data
data_suffix (str) – File suffix to be used to filter data
label_suffix (str) – File suffix to be used to filter labels
ontology_fname (str) – JSON file containing either a list of classes or a dictionary with class names as keys and class indexes + rgb as values
val_dataset_dir (str, optional) – Directory containing validation data, defaults to None
test_dataset_dir (str, optional) – Directory containing test data, defaults to None
- Returns:
Dataset and ontology
- Return type:
Tuple[dict, dict]
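The suffix-based filtering this builder performs can be sketched as follows. The pairing logic and the `pair_files` helper are assumptions for illustration (the real builder walks directories and returns DataFrame/ontology dictionaries); the sketch only shows how shared stems match a data file to its label file:

```python
def pair_files(filenames, data_suffix, label_suffix):
    # Strip each suffix to recover the stem shared by data and label files.
    data = {f[: -len(data_suffix)]: f for f in filenames if f.endswith(data_suffix)}
    labels = {f[: -len(label_suffix)]: f for f in filenames if f.endswith(label_suffix)}
    # Keep only stems present in both sets; unmatched files are dropped.
    return [(data[s], labels[s]) for s in sorted(data.keys() & labels.keys())]

files = ["0001_rgb.png", "0001_mask.png", "0002_rgb.png"]
pairs = pair_files(files, "_rgb.png", "_mask.png")
```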
perceptionmetrics.datasets.goose module¶
- class perceptionmetrics.datasets.goose.GOOSEImageSegmentationDataset(train_dataset_dir=None, val_dataset_dir=None, test_dataset_dir=None)[source]¶
Bases: ImageSegmentationDataset
Specific class for GOOSE-styled image segmentation datasets. All data can be downloaded from the official webpage (https://goose-dataset.de):
train -> https://goose-dataset.de/storage/goose_2d_train.zip
val -> https://goose-dataset.de/storage/goose_2d_val.zip
test -> https://goose-dataset.de/storage/goose_2d_test.zip
- Parameters:
train_dataset_dir (str) – Directory containing training data
val_dataset_dir (str, optional) – Directory containing validation data, defaults to None
test_dataset_dir (str, optional) – Directory containing test data, defaults to None
- class perceptionmetrics.datasets.goose.GOOSELiDARSegmentationDataset(train_dataset_dir=None, val_dataset_dir=None, test_dataset_dir=None, is_goose_ex=False)[source]¶
Bases: LiDARSegmentationDataset
Specific class for GOOSE-styled LiDAR segmentation datasets. All data can be downloaded from the official webpage (https://goose-dataset.de):
train -> https://goose-dataset.de/storage/gooseEx_3d_train.zip
val -> https://goose-dataset.de/storage/gooseEx_3d_val.zip
test -> https://goose-dataset.de/storage/gooseEx_3d_test.zip
- Parameters:
train_dataset_dir (str) – Directory containing training data
val_dataset_dir (str, optional) – Directory containing validation data, defaults to None
test_dataset_dir (str, optional) – Directory containing test data, defaults to None
is_goose_ex (bool, optional) – Whether the dataset is GOOSE Ex or GOOSE, defaults to False
- perceptionmetrics.datasets.goose.build_dataset(data_type, data_suffix, label_suffix, train_dataset_dir=None, val_dataset_dir=None, test_dataset_dir=None, is_goose_ex=False)[source]¶
Build dataset and ontology dictionaries from GOOSE dataset structure
- Parameters:
train_dataset_dir (str) – Directory containing training data
data_type (str) – Data to be read (e.g. images or lidar)
data_suffix (str) – File suffix to be used to filter data (e.g., windshield_vis.png or vls128.bin)
label_suffix (str) – File suffix to be used to filter labels (e.g., vis_labelids.png or goose.label)
val_dataset_dir (str, optional) – Directory containing validation data, defaults to None
test_dataset_dir (str, optional) – Directory containing test data, defaults to None
is_goose_ex (bool, optional) – Whether the dataset is GOOSE Ex or GOOSE, defaults to False
- Returns:
Dataset and ontology
- Return type:
Tuple[dict, dict]
perceptionmetrics.datasets.perception module¶
- class perceptionmetrics.datasets.perception.PerceptionDataset(dataset, dataset_dir, ontology)[source]¶
Bases: ABC
Abstract perception dataset class.
- Parameters:
dataset (pd.DataFrame) – Segmentation/Detection dataset as a pandas DataFrame
dataset_dir (str) – Dataset root directory
ontology (dict) – Dataset ontology definition
- append(new_dataset)[source]¶
Append another dataset with common ontology
- Parameters:
new_dataset (Self) – Dataset to be appended
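The contract behind append() can be sketched with a hypothetical stand-in class (the real class wraps a pandas DataFrame; `TinyDataset` and its list of rows are illustrative assumptions). The key point is that appending is only valid when both datasets share a common ontology:

```python
class TinyDataset:
    def __init__(self, rows, ontology):
        self.rows = list(rows)        # stands in for the pandas DataFrame
        self.ontology = dict(ontology)

    def append(self, other):
        # Refuse to merge datasets whose class definitions disagree.
        if other.ontology != self.ontology:
            raise ValueError("Datasets must share a common ontology")
        self.rows.extend(other.rows)

a = TinyDataset([{"image": "a.png"}], {"grass": 0, "tree": 1})
b = TinyDataset([{"image": "b.png"}], {"grass": 0, "tree": 1})
a.append(b)
```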
perceptionmetrics.datasets.rellis3d module¶
- class perceptionmetrics.datasets.rellis3d.Rellis3DImageSegmentationDataset(dataset_dir, split_dir, ontology_fname)[source]¶
Bases: ImageSegmentationDataset
Specific class for Rellis3D-styled image segmentation datasets. All data can be downloaded from the official repo (https://github.com/unmannedlab/RELLIS-3D):
images -> https://drive.google.com/file/d/1F3Leu0H_m6aPVpZITragfreO_SGtL2yV
labels -> https://drive.google.com/file/d/16URBUQn_VOGvUqfms-0I8HHKMtjPHsu5
split -> https://drive.google.com/file/d/1zHmnVaItcYJAWat3Yti1W_5Nfux194WQ
ontology -> https://drive.google.com/file/d/1K8Zf0ju_xI5lnx3NTDLJpVTs59wmGPI6
- Parameters:
dataset_dir (str) – Directory where both RGB images and annotations have been extracted to
split_dir (str) – Directory where train, val, and test files (.lst) have been extracted to
ontology_fname (str) – YAML file contained in the ontology compressed directory
- class perceptionmetrics.datasets.rellis3d.Rellis3DLiDARSegmentationDataset(dataset_dir, split_dir, ontology_fname)[source]¶
Bases: LiDARSegmentationDataset
Specific class for Rellis3D-styled LiDAR segmentation datasets. All data can be downloaded from the official repo (https://github.com/unmannedlab/RELLIS-3D):
points -> https://drive.google.com/file/d/1lDSVRf_kZrD0zHHMsKJ0V1GN9QATR4wH
labels -> https://drive.google.com/file/d/12bsblHXtob60KrjV7lGXUQTdC5PhV8Er
split -> https://drive.google.com/file/d/1raQJPySyqDaHpc53KPnJVl3Bln6HlcVS
ontology -> https://drive.google.com/file/d/1K8Zf0ju_xI5lnx3NTDLJpVTs59wmGPI6
- Parameters:
dataset_dir (str) – Directory where both points and labels have been extracted to
split_dir (str) – Directory where train, val, and test files (.lst) have been extracted to
ontology_fname (str) – YAML file contained in the ontology compressed directory
- perceptionmetrics.datasets.rellis3d.build_dataset(dataset_dir, split_fnames, ontology_fname)[source]¶
Build dataset and ontology dictionaries from Rellis3D dataset structure
- Parameters:
dataset_dir (str) – Directory where both RGB images and annotations have been extracted to
split_fnames (dict) – Dictionary that contains the paths where train, val, and test split files (.lst) have been extracted to
ontology_fname (str) – YAML file contained in the ontology compressed directory
- Returns:
Dataset and ontology
- Return type:
Tuple[dict, dict]
perceptionmetrics.datasets.rugd module¶
- class perceptionmetrics.datasets.rugd.RUGDImageSegmentationDataset(images_dir, labels_dir, ontology_fname, split_sequences={'creek': 'test', 'park-1': 'test', 'park-2': 'train', 'park-8': 'val', 'trail': 'train', 'trail-10': 'train', 'trail-11': 'train', 'trail-12': 'train', 'trail-13': 'test', 'trail-14': 'train', 'trail-15': 'train', 'trail-3': 'train', 'trail-4': 'train', 'trail-5': 'val', 'trail-6': 'train', 'trail-7': 'test', 'trail-9': 'train', 'village': 'train'})[source]¶
Bases: ImageSegmentationDataset
Specific class for RUGD-styled image segmentation dataset.
- Parameters:
images_dir (str) – Directory containing images
labels_dir (str) – Directory containing labels (in RGB format)
ontology_fname (str) – text file containing the dataset ontology (RUGD_annotation-colormap.txt)
split_sequences (dict, optional) – Dictionary containing the split sequences for train, val, and test, defaults to DEFAULT_SPLIT
- perceptionmetrics.datasets.rugd.build_dataset(data_dir, labels_dir, ontology_fname, split_sequences)[source]¶
Build dataset and ontology dictionaries
- Parameters:
data_dir (str) – Directory containing data
labels_dir (str) – Directory containing labels (in RGB format)
ontology_fname (str) – text file containing the dataset ontology (RUGD_annotation-colormap.txt)
split_sequences (dict) – Dictionary containing the split sequences for train, val, and test
- Returns:
Dataset and ontology
- Return type:
Tuple[dict, dict]
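Parsing the colormap file can be sketched as below, assuming each line of RUGD_annotation-colormap.txt follows "&lt;index&gt; &lt;name&gt; &lt;r&gt; &lt;g&gt; &lt;b&gt;" and that the ontology maps class names to an index and an RGB tuple; both the line format and the resulting dict layout are assumptions here, not the library's confirmed schema:

```python
def parse_colormap(text):
    # Each line is assumed to be: "<index> <name> <r> <g> <b>".
    ontology = {}
    for line in text.strip().splitlines():
        idx, name, r, g, b = line.split()
        ontology[name] = {"idx": int(idx), "rgb": (int(r), int(g), int(b))}
    return ontology

sample = """0 void 0 0 0
1 dirt 108 64 20"""
ontology = parse_colormap(sample)
```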
perceptionmetrics.datasets.segmentation module¶
- class perceptionmetrics.datasets.segmentation.ImageSegmentationDataset(dataset, dataset_dir, ontology, is_label_rgb=False)[source]¶
Bases: SegmentationDataset
Parent image segmentation dataset class.
- Parameters:
dataset (pd.DataFrame) – Image segmentation dataset as a pandas DataFrame
dataset_dir (str) – Dataset root directory
ontology (dict) – Dataset ontology definition
is_label_rgb (bool, optional) – Whether the labels are in RGB format or not, defaults to False
- export(outdir, new_ontology=None, ontology_translation=None, classes_to_remove=None, resize=None, include_label_count=True)[source]¶
Export dataset dataframe and image files in SemanticKITTI format. Optionally, modify ontology before exporting.
- Parameters:
outdir (str) – Directory where Parquet and images files will be stored
new_ontology (dict) – Target ontology definition
ontology_translation (Optional[dict], optional) – Ontology translation dictionary, defaults to None
classes_to_remove (Optional[List[str]], optional) – Classes to remove from the old ontology, defaults to None
resize (Optional[Tuple[int, int]], optional) – Resize images and labels to the given dimensions, defaults to None
include_label_count (bool, optional) – Whether to include class weights in the dataset, defaults to True
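Applying an ontology translation during export can be sketched as a simple index remap. The `remap_labels` helper, the old-index → new-index dict shape, and the fallback ignore index of 255 are all illustrative assumptions, not the library's confirmed behavior:

```python
def remap_labels(labels, translation, ignore_index=255):
    # Map every old class index to its new index; classes missing from
    # the translation (e.g. removed classes) fall back to ignore_index.
    return [translation.get(l, ignore_index) for l in labels]

# Hypothetical merge: old {0: sky, 1: grass, 2: bush} -> new {0: sky, 1: vegetation}
translation = {0: 0, 1: 1, 2: 1}
labels = [0, 1, 2, 2, 3]
new_labels = remap_labels(labels, translation)
```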
- class perceptionmetrics.datasets.segmentation.LiDARSegmentationDataset(dataset, dataset_dir, ontology, is_kitti_format=True, has_intensity=True)[source]¶
Bases: SegmentationDataset
Parent LiDAR segmentation dataset class.
- Parameters:
dataset (pd.DataFrame) – LiDAR segmentation dataset as a pandas DataFrame
dataset_dir (str) – Dataset root directory
ontology (dict) – Dataset ontology definition
is_kitti_format (bool, optional) – Whether the linked files in the dataset are stored in SemanticKITTI format or not, defaults to True
has_intensity (bool, optional) – Whether the point cloud files contain intensity values, defaults to True
- export(outdir, new_ontology=None, ontology_translation=None, classes_to_remove=[], include_label_count=True, remove_origin=False)[source]¶
Export dataset dataframe and LiDAR files in SemanticKITTI format. Optionally, modify ontology before exporting.
- Parameters:
outdir (str) – Directory where Parquet and LiDAR files will be stored
new_ontology (dict) – Target ontology definition
ontology_translation (Optional[dict], optional) – Ontology translation dictionary, defaults to None
classes_to_remove (Optional[List[str]], optional) – Classes to remove from the old ontology, defaults to []
include_label_count (bool, optional) – Whether to include class weights in the dataset, defaults to True
remove_origin (bool, optional) – Whether to remove the origin from the point cloud (mostly for removing RELLIS-3D spurious points), defaults to False
- class perceptionmetrics.datasets.segmentation.SegmentationDataset(dataset, dataset_dir, ontology)[source]¶
Bases: PerceptionDataset
Abstract segmentation dataset class.
- Parameters:
dataset (DataFrame)
dataset_dir (str)
ontology (dict)
perceptionmetrics.datasets.wildscenes module¶
- class perceptionmetrics.datasets.wildscenes.WildscenesImageSegmentationDataset(dataset_dir, split_dir)[source]¶
Bases: ImageSegmentationDataset
Specific class for Wildscenes-styled image segmentation datasets. All data can be downloaded from the official repo:
dataset -> https://data.csiro.au/collection/csiro:61541
split -> https://github.com/csiro-robotics/WildScenes/tree/main/data/splits/opt2d
- Parameters:
dataset_dir (str) – Directory where dataset images and labels are stored (Wildscenes2D)
split_dir (str) – Directory where train, val, and test files (.csv) have been extracted to
- class perceptionmetrics.datasets.wildscenes.WildscenesLiDARSegmentationDataset(dataset_dir, split_dir)[source]¶
Bases: LiDARSegmentationDataset
Specific class for Wildscenes-styled LiDAR segmentation datasets. All data can be downloaded from the official repo:
dataset -> https://data.csiro.au/collection/csiro:61541
split -> https://github.com/csiro-robotics/WildScenes/tree/main/data/splits/opt3d
- Parameters:
dataset_dir (str) – Directory where dataset images and labels are stored (Wildscenes3D)
split_dir (str) – Directory where train, val, and test files (.csv) have been extracted to
- perceptionmetrics.datasets.wildscenes.build_dataset(dataset_dir, split_fnames, ontology)[source]¶
Build dataset and ontology dictionaries from Wildscenes dataset structure
- Parameters:
dataset_dir (str) – Directory where both RGB images and annotations have been extracted to
split_fnames (dict) – Dictionary that contains the paths where train, val, and test split files (.csv) have been extracted to
ontology (dict) – Ontology definition as found in the official repo
- Returns:
Dataset and ontology
- Return type:
Tuple[dict, dict]
perceptionmetrics.datasets.yolo module¶
- class perceptionmetrics.datasets.yolo.YOLODataset(dataset_fname, dataset_dir, im_ext='jpg')[source]¶
Bases: ImageDetectionDataset
Specific class for YOLO-styled object detection datasets.
- Parameters:
dataset_fname (str) – Path to the YAML dataset configuration file
dataset_dir (Optional[str]) – Path to the directory containing images and annotations. If not provided, it will be inferred from the dataset file
im_ext (str) – Image file extension (default is “jpg”)
- read_annotation(fname, image_size=None)[source]¶
Return bounding boxes and category indices for a given image ID.
- Parameters:
fname (str) – Annotation path
image_size (Optional[Tuple[int, int]]) – Corresponding image size in (w, h) format for converting relative bbox sizes to absolute. If not provided, the image path will be assumed
- Return type:
Tuple[List[List[float]], List[int], List[int]]
- Returns:
Tuple of (boxes, category_indices)
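The relative-to-absolute conversion mentioned above can be sketched for a single YOLO label line ("&lt;class&gt; &lt;cx&gt; &lt;cy&gt; &lt;w&gt; &lt;h&gt;", all normalized to [0, 1]). The output convention here, [x_min, y_min, width, height] in pixels, is an assumption; the exact box format perceptionmetrics returns may differ:

```python
def yolo_line_to_box(line, image_size):
    w_img, h_img = image_size  # (w, h), matching the docstring above
    cls, cx, cy, w, h = line.split()
    cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
    # YOLO stores the box center; shift by half the size to get the corner.
    x_min = (cx - w / 2) * w_img
    y_min = (cy - h / 2) * h_img
    return int(cls), [x_min, y_min, w * w_img, h * h_img]

cls, box = yolo_line_to_box("3 0.5 0.5 0.2 0.4", (100, 200))
```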
- perceptionmetrics.datasets.yolo.build_dataset(dataset_fname, dataset_dir=None, im_ext='jpg')[source]¶
Build dataset and ontology dictionaries from YOLO dataset structure
- Parameters:
dataset_fname (str) – Path to the YAML dataset configuration file
dataset_dir (Optional[str]) – Path to the directory containing images and annotations. If not provided, it will be inferred from the dataset file
im_ext (str) – Image file extension (default is “jpg”)
- Returns:
Dataset DataFrame and ontology dictionary
- Return type:
Tuple[pd.DataFrame, dict]