ObjectDetection3DDataFilter Class
ml_debugger.data_filter.object_detection_3d.object_detection_3d_torch_data_filter.ObjectDetection3DTorchDataFilter
Bases: CommonObjectDetectionDataFilter, ObjectDetection3DTorchLogger
DataFilter for 3D object detection tasks using PyTorch models.
Combines CommonObjectDetectionDataFilter (query, filter, aggregation) with ObjectDetection3DTorchLogger (NMS, _parse_and_save_io_data, bbox hashing) to provide real-time filtering and Active Learning query support.
__init__(model, model_name, version_name, result_name=None, n_epoch='latest', filter_config=None, target_layers=None, additional_fields=None, auto_sync=False, force_table_recreate=False, api_endpoint=None, api_key=None, n_class=None, score_thresh=None, iou_thresh=None, max_detections_per_frame=None)
Initialize 3D object detection data filter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | PyTorch model to trace. | required |
| `model_name` | `str` | Name of the ML model. | required |
| `version_name` | `str` | Version identifier for the ML model. | required |
| `result_name` | `Optional[str]` | The name of the existing evaluation result to retrieve. | `None` |
| `n_epoch` | `Union[str, Optional[int]]` | Filter option for n_epoch value. | `'latest'` |
| `filter_config` | `Optional[Union[BBoxStrategy, Dict[str, Any]]]` | BBox-level aggregation strategy for real-time filtering. Controls bbox selection, aggregation, and sample-level threshold. Can be a BBoxStrategy instance, a dict, or None (defaults). | `None` |
| `target_layers` | `Optional[Dict[str, str]]` | Mapping of layer aliases to module paths. | `None` |
| `additional_fields` | `Optional[List[dict]]` | Extra fields for database schema. | `None` |
| `auto_sync` | `bool` | Enable background syncing of logged data. | `False` |
| `force_table_recreate` | `bool` | Whether to drop and recreate existing tables. | `False` |
| `api_endpoint` | `Optional[str]` | URL of the service API for data upload. | `None` |
| `api_key` | `Optional[str]` | API key for authenticating with the service. | `None` |
| `n_class` | `Optional[int]` | Number of classes. Auto-detected from model if None. | `None` |
| `score_thresh` | `Optional[float]` | Minimum score threshold for user-visible NMS decisions. | `None` |
| `iou_thresh` | `Optional[float]` | IoU threshold for user-visible NMS decisions. | `None` |
| `max_detections_per_frame` | `Optional[int]` | Maximum detections per frame for user-visible output. | `None` |
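To make the roles of `score_thresh`, `iou_thresh`, and `max_detections_per_frame` concrete, the following is a minimal sketch of greedy NMS. It is not the library's implementation: it assumes axis-aligned boxes given as `(x_min, y_min, z_min, x_max, y_max, z_max)`, whereas a real 3D detector may well use rotated boxes, and the helper names `iou_3d` and `greedy_nms` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: axis-aligned box as (x_min, y_min, z_min, x_max, y_max, z_max).
Box = Tuple[float, float, float, float, float, float]


def volume(box: Box) -> float:
    """Volume of an axis-aligned box."""
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for axis in range(3):
        overlap = min(a[axis + 3], b[axis + 3]) - max(a[axis], b[axis])
        if overlap <= 0:
            return 0.0  # no overlap along this axis, so no intersection
        inter *= overlap
    union = volume(a) + volume(b) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    score_thresh: float,
    iou_thresh: float,
    max_detections: Optional[int] = None,
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Drop low-score boxes, then visit the rest in descending score order.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou_3d(boxes[i], boxes[j]) <= iou_thresh for j in kept):
            kept.append(i)
            if max_detections is not None and len(kept) >= max_detections:
                break
    return kept


boxes = [(0, 0, 0, 2, 2, 2), (0.1, 0, 0, 2.1, 2, 2), (5, 5, 5, 6, 6, 6), (8, 8, 8, 9, 9, 9)]
scores = [0.9, 0.8, 0.7, 0.1]
print(greedy_nms(boxes, scores, score_thresh=0.3, iou_thresh=0.5))
```

In this sketch the second box is suppressed because it overlaps the first almost entirely, and the last box is dropped by the score threshold before NMS runs.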
__call__(model_input, input_ids, dataset_type='pool', **kwargs)
Invoke the data filter on a single inference, recording I/O data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_input` | `Any` | Input data for the model inference. | required |
| `input_ids` | `List[str]` | Identifiers of each input data. Must not contain duplicates. | required |
| `dataset_type` | `str` | Identifier of input dataset (e.g. 'pool'). | `'pool'` |
| `**kwargs` | `Any` | Additional keyword arguments for parsing and saving I/O data. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Tuple[Any, List[Optional[bool]]]` | Tuple of (model_output, filter_flags):<br>- model_output: Raw model output.<br>- filter_flags: List of booleans indicating whether each input matches the filter condition (True = matches / should be extracted, False = does not match, None = no filter configured or no bboxes). |
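Because `filter_flags` is tri-state, callers should avoid a plain truthiness check, which would merge `False` and `None`. The following is a minimal sketch of one way to consume the flags; `split_by_flag` is a hypothetical helper, not part of the library.

```python
from typing import List, Optional, Tuple


def split_by_flag(
    input_ids: List[str], filter_flags: List[Optional[bool]]
) -> Tuple[List[str], List[str], List[str]]:
    """Partition input_ids into (matched, unmatched, undecided)."""
    if len(input_ids) != len(filter_flags):
        raise ValueError("input_ids and filter_flags must have the same length")
    matched: List[str] = []
    unmatched: List[str] = []
    undecided: List[str] = []
    for input_id, flag in zip(input_ids, filter_flags):
        if flag is None:
            # No filter configured, or the input produced no bboxes.
            undecided.append(input_id)
        elif flag:
            matched.append(input_id)
        else:
            unmatched.append(input_id)
    return matched, unmatched, undecided


ids = ["frame_000", "frame_001", "frame_002"]
flags = [True, None, False]  # example values in the shape returned by __call__
matched, unmatched, undecided = split_by_flag(ids, flags)
print(matched, unmatched, undecided)
```

Keeping the `None` group separate makes it possible to distinguish "did not match" from "could not be evaluated" when deciding which inputs to extract.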
get_hooked_features(layer_name)
Retrieve the captured output for a given layer alias.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `layer_name` | `str` | Alias of the layer whose activation was captured. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | Activation data stored for the specified layer. |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If no activation has been captured for |
export(output_path=None)
Export extracted features into a ZIP archive.
Uses the internal n_epoch resolved during validator setup to
filter records, consistent with upload() and wait_for_save().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `output_path` | `Optional[str]` | Path or directory for saving the ZIP file. If no .zip extension, the default filename is appended. Defaults to cwd. | `None` |
Returns:
| Type | Description |
|---|---|
| `Optional[Path]` | Path to the created ZIP file, or None on non-primary distributed ranks. |
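The `output_path` rule described above (use the path as given when it ends in `.zip`, otherwise treat it as a directory and append a default filename) can be sketched as follows. The default filename is not stated in this document, so `DEFAULT_ZIP_NAME` is a placeholder assumption, and `resolve_zip_path` is a hypothetical helper rather than a library function.

```python
from pathlib import Path
from typing import Optional

# Assumption: the real default filename is not documented here.
DEFAULT_ZIP_NAME = "export.zip"


def resolve_zip_path(output_path: Optional[str] = None) -> Path:
    """Resolve the target ZIP path from a file path, a directory, or None."""
    if output_path is None:
        # No path given: fall back to the current working directory.
        return Path.cwd() / DEFAULT_ZIP_NAME
    path = Path(output_path)
    if path.suffix.lower() == ".zip":
        # Already a ZIP file path: use it unchanged.
        return path
    # Otherwise treat it as a directory and append the default filename.
    return path / DEFAULT_ZIP_NAME


print(resolve_zip_path("results/run1.zip"))
print(resolve_zip_path("results"))
```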
wait_for_save(interval=3, *, force=False)
upload(*, force=False)
query(n_data, strategy, dataset_type='pool', type_cast=None, aggregation_level='input_id', sequence_mapping=None, sampling=None)
Sort and query dataset based on strategy with input or sequence aggregation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_data` | `int` | Maximum number of images to query. | required |
| `strategy` | `Union[str, BBoxStrategy, Dict[str, Any]]` | Query strategy.<br>- 'high_error_proba': Per-image max of error_proba, sorted descending. Returns images where at least one bbox has high error.<br>- 'low_error_proba': Per-image min of error_proba, sorted ascending. Returns images where at least one bbox has low error.<br>- BBoxStrategy: Full control over bbox selection, aggregation, target column, and sort order.<br>- dict: Validated as BBoxStrategy. Unknown keys are rejected, unspecified fields use defaults. Example: `{'target_column': 'det_error_proba', 'aggregation': 'mean', 'top_n': 3}` | required |
| `dataset_type` | `str` | Filter of input dataset (e.g. 'pool'). | `'pool'` |
| `type_cast` | `Optional[type]` | Type for casting input_id (e.g. int). Ignored when aggregation_level="sequence". | `None` |
| `aggregation_level` | `str` | Result granularity.<br>- "input_id": returns input_ids.<br>- "sequence": returns sequence_ids. Requires sequence_mapping. | `'input_id'` |
| `sequence_mapping` | `Optional[SequenceMappingInput]` | Required when aggregation_level="sequence".<br>Pattern A: `{sequence_id: [input_id, ...]}`<br>Pattern B: `{input_id: sequence_id}` | `None` |
| `sampling` | `Optional[Union[str, SamplingConfig, Dict[str, Any]]]` | Sampling configuration for result selection.<br>- None: Existing sort+top-N behavior (default).<br>- "class_balanced": Per-class bbox pipeline with quota (string shorthand).<br>- dict: Full configuration with method and options.<br>Random requires dict form with a min_value/max_value range filter, e.g. `{"method": "random", "min_value": 0.5}` or `{"method": "random", "min_value": 0.3, "max_value": 0.8}`.<br>Class-balanced with options, e.g. `{"method": "class_balanced", "min_per_class": 5, "seed": 42}` | `None` |
Returns:
| Type | Description |
|---|---|
| `List[str]` | List of input_ids or sequence_ids. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If strategy is invalid or no data found. |
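The two string strategies can be illustrated with a small standalone sketch: group per-bbox `error_proba` values by input, reduce with max or min, sort, and keep the first `n_data` entries. This only mirrors the behavior described in the parameter table; it is not the library's code, and `rank_inputs` and the `(input_id, error_proba)` row format are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: one row per predicted bbox, as (input_id, error_proba).
BBoxRow = Tuple[str, float]


def rank_inputs(rows: List[BBoxRow], n_data: int, strategy: str) -> List[str]:
    """Return up to n_data input_ids ranked by aggregated error_proba."""
    if strategy == "high_error_proba":
        reduce_fn, descending = max, True   # per-input max, sorted descending
    elif strategy == "low_error_proba":
        reduce_fn, descending = min, False  # per-input min, sorted ascending
    else:
        raise ValueError(f"invalid strategy: {strategy!r}")

    per_input: Dict[str, List[float]] = {}
    for input_id, error_proba in rows:
        per_input.setdefault(input_id, []).append(error_proba)
    if not per_input:
        raise ValueError("no data found")

    scores = {input_id: reduce_fn(values) for input_id, values in per_input.items()}
    ranked = sorted(scores, key=lambda input_id: scores[input_id], reverse=descending)
    return ranked[:n_data]


rows = [("a", 0.2), ("a", 0.9), ("b", 0.5), ("c", 0.1), ("c", 0.3)]
print(rank_inputs(rows, n_data=2, strategy="high_error_proba"))
print(rank_inputs(rows, n_data=2, strategy="low_error_proba"))
```

Note that input "a" ranks first under 'high_error_proba' and second under 'low_error_proba': a single input can contain both a high-error and a low-error bbox, so the two strategies are not mirror images of each other.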
get_image_scores(strategy='high_error_proba', dataset_type='pool', aggregation_level='input_id', sequence_mapping=None)
Get input-level or sequence-level ranking values sorted by strategy.
Aggregates per-bbox target values to input-level ranking values using the specified strategy, and returns them as a sorted DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `strategy` | `Union[str, BBoxStrategy, Dict[str, Any]]` | Scoring strategy.<br>- 'high_error_proba': Sort by error_proba descending (default).<br>- 'low_error_proba': Sort by error_proba ascending.<br>- BBoxStrategy: Full control over bbox selection, aggregation, target column, and sort order.<br>- dict: Validated as BBoxStrategy, with optional sequence config.<br>- file path: YAML/JSON strategy config. | `'high_error_proba'` |
| `dataset_type` | `str` | Filter of input dataset (e.g. 'pool'). | `'pool'` |
| `aggregation_level` | `str` | Result granularity.<br>- "input_id": input-level results.<br>- "sequence": sequence-level results. Requires sequence_mapping. | `'input_id'` |
| `sequence_mapping` | `Optional[SequenceMappingInput]` | Required when aggregation_level="sequence".<br>Pattern A: `{sequence_id: [input_id, ...]}`<br>Pattern B: `{input_id: sequence_id}` | `None` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | aggregation_level="input_id": If target_column=="error_proba": [input_id, error_proba, pred_score]. Otherwise: [input_id, rank_value, pred_score]. |
| `DataFrame` | aggregation_level="sequence": [sequence_id, rank_value, n_inputs_available, n_inputs_used]. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If strategy is invalid, aggregation_level is invalid, or no data found. |
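The two accepted shapes of `sequence_mapping` carry the same information. As a rough illustration, both can be normalized to a single `input_id -> sequence_id` lookup; `to_input_lookup` and the `SequenceMapping` alias below are hypothetical and do not describe how the library handles `SequenceMappingInput` internally.

```python
from typing import Dict, List, Union

# Assumption: a stand-in alias for the two documented patterns.
SequenceMapping = Union[Dict[str, List[str]], Dict[str, str]]


def to_input_lookup(sequence_mapping: SequenceMapping) -> Dict[str, str]:
    """Normalize Pattern A or Pattern B to {input_id: sequence_id}."""
    lookup: Dict[str, str] = {}
    for key, value in sequence_mapping.items():
        if isinstance(value, (list, tuple)):
            # Pattern A: key is a sequence_id, value lists its input_ids.
            for input_id in value:
                lookup[input_id] = key
        else:
            # Pattern B: key is an input_id, value is its sequence_id.
            lookup[key] = value
    return lookup


pattern_a = {"seq_0": ["f0", "f1"], "seq_1": ["f2"]}
pattern_b = {"f0": "seq_0", "f1": "seq_0", "f2": "seq_1"}
print(to_input_lookup(pattern_a) == to_input_lookup(pattern_b))
```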