ObjectDetectionDataFilter Class
ml_debugger.data_filter.object_detection.object_detection_torch_data_filter.ObjectDetectionTorchDataFilter
Bases: CommonObjectDetectionDataFilter, ObjectDetectionTorchLogger
DataFilter for object detection tasks using PyTorch models.
Combines CommonObjectDetectionDataFilter (query, filter, aggregation) with ObjectDetectionTorchLogger (NMS, _parse_and_save_io_data, bbox hashing) to provide real-time filtering and Active Learning query support.
__init__(model, model_name, version_name, result_name=None, n_epoch='latest', filter_config=None, target_layers=None, additional_fields=None, auto_sync=False, force_table_recreate=False, api_endpoint=None, api_key=None, n_class=None, score_thresh=None, iou_thresh=None, max_detections_per_image=None)
Initialize object detection data filter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | PyTorch model to trace. | required |
| `model_name` | `str` | Name of the ML model. | required |
| `version_name` | `str` | Version identifier for the ML model. | required |
| `result_name` | `Optional[str]` | The name of the existing evaluation result to retrieve. | `None` |
| `n_epoch` | `Union[str, Optional[int]]` | Filter option for n_epoch value. | `'latest'` |
| `filter_config` | `Optional[Union[BBoxStrategy, Dict[str, Any]]]` | BBox-level aggregation strategy for real-time filtering. Controls bbox selection, aggregation, and image-level threshold. Can be a BBoxStrategy instance, a dict, or None (defaults). | `None` |
| `target_layers` | `Optional[Dict[str, str]]` | Mapping of layer aliases to module paths. | `None` |
| `additional_fields` | `Optional[List[dict]]` | Extra fields for database schema. | `None` |
| `auto_sync` | `bool` | Enable background syncing of logged data. | `False` |
| `force_table_recreate` | `bool` | Whether to drop and recreate existing tables. | `False` |
| `api_endpoint` | `Optional[str]` | URL of the service API for data upload. | `None` |
| `api_key` | `Optional[str]` | API key for authenticating with the service. | `None` |
| `n_class` | `Optional[int]` | Number of classes. Auto-detected from model if None. | `None` |
| `score_thresh` | `Optional[float]` | Minimum score threshold for user-visible NMS decisions. | `None` |
| `iou_thresh` | `Optional[float]` | IoU threshold for user-visible NMS decisions. | `None` |
| `max_detections_per_image` | `Optional[int]` | Maximum detections per image for user-visible output. | `None` |
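A minimal construction sketch, assuming the dotted path at the top of this page is importable as a module path followed by the class name. The model name, version name, layer alias, module path, and threshold values below are placeholders chosen for illustration; they are not defaults of the library.

```python
from typing import Any, Dict


def filter_kwargs(model: Any) -> Dict[str, Any]:
    """Collect constructor arguments for the data filter.

    Every value except n_epoch is a placeholder for illustration.
    """
    return {
        "model": model,
        "model_name": "my_detector",      # placeholder name
        "version_name": "v1",             # placeholder version
        "n_epoch": "latest",              # documented default
        # Hypothetical alias -> module path; adjust to your own model.
        "target_layers": {"backbone_out": "backbone"},
        "score_thresh": 0.5,              # illustrative value
        "iou_thresh": 0.5,                # illustrative value
        "max_detections_per_image": 100,  # illustrative value
    }


def build_filter(model: Any) -> Any:
    """Instantiate the filter; requires the ml_debugger package."""
    # Imported lazily so filter_kwargs() works without the package installed.
    # Assumption: the dotted path in this reference is module + class name.
    from ml_debugger.data_filter.object_detection.object_detection_torch_data_filter import (
        ObjectDetectionTorchDataFilter,
    )

    return ObjectDetectionTorchDataFilter(**filter_kwargs(model))
```

Keeping the arguments in a plain dict makes it easy to log or diff the configuration before the filter is created.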
__call__(model_input, input_ids, dataset_type='pool', **kwargs)
Invoke the data filter on a single inference, recording I/O data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_input` | `Any` | Input data for the model inference. | required |
| `input_ids` | `List[str]` | Identifier for each input item. Must not contain duplicates. | required |
| `dataset_type` | `str` | Identifier of the input dataset (e.g. 'pool'). | `'pool'` |
| `**kwargs` | `Any` | Additional keyword arguments for parsing and saving I/O data. | `{}` |
Returns:
| Type | Description |
|---|---|
| `Tuple[Any, List[Optional[bool]]]` | Tuple of `(model_output, filter_flags)`. `model_output` is the raw model output. `filter_flags` is a list of flags, one per input: `True` if the input matches the filter condition and should be extracted, `False` if it does not match, `None` if no filter is configured or there are no bboxes. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If input_ids contains duplicates. |
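The three-valued `filter_flags` list is easy to misread, because `False` and `None` are both falsy. The sketch below is a pure-Python illustration of how a caller might check for duplicate ids up front and then split ids by flag. The helper names `find_duplicates` and `split_by_flag` are hypothetical and not part of the library.

```python
from typing import Dict, List, Optional, Sequence


def find_duplicates(input_ids: Sequence[str]) -> List[str]:
    """Return ids that occur more than once, in first-repeat order."""
    seen, dupes = set(), []
    for input_id in input_ids:
        if input_id in seen and input_id not in dupes:
            dupes.append(input_id)
        seen.add(input_id)
    return dupes


def split_by_flag(
    input_ids: Sequence[str], filter_flags: Sequence[Optional[bool]]
) -> Dict[str, List[str]]:
    """Group ids into matched (True), unmatched (False), undecided (None)."""
    if len(input_ids) != len(filter_flags):
        raise ValueError("input_ids and filter_flags must have the same length")
    groups: Dict[str, List[str]] = {"matched": [], "unmatched": [], "undecided": []}
    for input_id, flag in zip(input_ids, filter_flags):
        if flag is None:  # test None first: it is falsy, like False
            groups["undecided"].append(input_id)
        elif flag:
            groups["matched"].append(input_id)
        else:
            groups["unmatched"].append(input_id)
    return groups
```

With a real filter instance, the flags would come from a call of the form `model_output, filter_flags = data_filter(model_input, input_ids)`.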
get_hooked_features(layer_name)
Retrieve the captured output for a given layer alias.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `layer_name` | `str` | Alias of the layer whose activation was captured. | required |
Returns:
| Type | Description |
|---|---|
| `Any` | Activation data stored for the specified layer. |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If no activation has been captured for the given layer alias. |
export(output_path=None)
Export extracted features into a ZIP archive.
Uses the internal n_epoch resolved during validator setup to filter records, consistent with upload() and wait_for_save().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `output_path` | `Optional[str]` | Path or directory for saving the ZIP file. If no .zip extension, the default filename is appended. Defaults to cwd. | `None` |
Returns:
| Type | Description |
|---|---|
| `Optional[Path]` | Path to the created ZIP file, or None on non-primary distributed ranks. |
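The `output_path` rule (explicit `.zip` path, directory, or current working directory) can be sketched with `pathlib`. This is an illustration of the documented behavior, not the library's code. The filename `export.zip` is a hypothetical stand-in, because the real default filename is not stated in this reference.

```python
from pathlib import Path
from typing import Optional

# Hypothetical default; the library's actual default filename is not documented here.
DEFAULT_NAME = "export.zip"


def resolve_zip_path(
    output_path: Optional[str] = None, default_name: str = DEFAULT_NAME
) -> Path:
    """Mirror the documented rule for choosing the ZIP destination."""
    if output_path is None:
        # No path given: fall back to the current working directory.
        return Path.cwd() / default_name
    path = Path(output_path)
    if path.suffix.lower() == ".zip":
        # Explicit ZIP file path: use it unchanged.
        return path
    # Anything else is treated as a directory; append the default filename.
    return path / default_name
```

Because `export()` returns `None` on non-primary distributed ranks, callers should check the return value before using it as a path.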
wait_for_save(interval=3)
upload()
query(n_data, strategy, dataset_type='pool', type_cast=None)
Sort and query dataset based on strategy with image-level aggregation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `n_data` | `int` | Maximum number of images to query. | required |
| `strategy` | `Union[str, BBoxStrategy, Dict[str, Any]]` | Query strategy. `'high_error_proba'`: per-image max of error_proba, sorted descending; returns images where at least one bbox has high error. `'low_error_proba'`: per-image min of error_proba, sorted ascending; returns images where at least one bbox has low error. `BBoxStrategy`: full control over bbox selection, aggregation, target column, and sort order. `dict`: validated as BBoxStrategy; unknown keys are rejected and unspecified fields use defaults. Example: `{'target_column': 'det_error_proba', 'aggregation': 'mean', 'top_n': 3}` | required |
| `dataset_type` | `str` | Filter of input dataset (e.g. 'pool'). | `'pool'` |
| `type_cast` | `Optional[type]` | Type for casting input_id (e.g. int). | `None` |
Returns:
| Type | Description |
|---|---|
| `List[str]` | List of input_ids of queried images. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If strategy is invalid or no data found. |
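The two string strategies reduce per-bbox values to one score per image and then rank the images. The sketch below reproduces that description in plain Python on made-up numbers. The function `query_sketch` and its row format are hypothetical, and the library's internal implementation may differ.

```python
from typing import Dict, List, Tuple

# One (input_id, error_proba) pair per detected bbox.
BBoxRow = Tuple[str, float]


def query_sketch(rows: List[BBoxRow], n_data: int, strategy: str) -> List[str]:
    """Rank images by aggregated error_proba and return the first n_data ids."""
    if strategy == "high_error_proba":
        aggregate, descending = max, True   # per-image max, highest first
    elif strategy == "low_error_proba":
        aggregate, descending = min, False  # per-image min, lowest first
    else:
        raise ValueError(f"unsupported strategy: {strategy!r}")
    if not rows:
        raise ValueError("no data found")

    # Collect the bbox-level values belonging to each image.
    per_image: Dict[str, List[float]] = {}
    for input_id, error_proba in rows:
        per_image.setdefault(input_id, []).append(error_proba)

    # Aggregate to one score per image, then sort and truncate.
    scores = {input_id: aggregate(values) for input_id, values in per_image.items()}
    ranked = sorted(scores, key=lambda input_id: scores[input_id], reverse=descending)
    return ranked[:n_data]


# Made-up example values.
rows = [("img_a", 0.9), ("img_a", 0.1), ("img_b", 0.05), ("img_c", 0.7), ("img_c", 0.2)]
high = query_sketch(rows, n_data=2, strategy="high_error_proba")
low = query_sketch(rows, n_data=2, strategy="low_error_proba")
```

Note that `img_a` appears in both results: it has one high-error bbox and one low-error bbox, which is exactly what max and min aggregation pick up.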
get_image_scores(strategy='high_error_proba', dataset_type='pool')
Get image-level error scores sorted by the given strategy.
Aggregates per-bbox error probabilities to image-level scores using the specified strategy, and returns them as a sorted DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `strategy` | `Union[str, BBoxStrategy, Dict[str, Any]]` | Scoring strategy. `'high_error_proba'`: sort by error_proba descending (default). `'low_error_proba'`: sort by error_proba ascending. `BBoxStrategy`: full control over bbox selection, aggregation, target column, and sort order. `dict`: validated as BBoxStrategy. | `'high_error_proba'` |
| `dataset_type` | `str` | Filter of input dataset (e.g. 'pool'). | `'pool'` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame with columns [input_id, error_proba, pred_score], sorted by error_proba according to the strategy's sort order. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If strategy is invalid or no data found. |
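When a dict is passed as `strategy`, it is validated as a BBoxStrategy: unknown keys are rejected and missing fields fall back to defaults. The sketch below shows one way such validation can work, using a dataclass stand-in. `StrategySketch` is hypothetical. Its `target_column`, `aggregation`, and `top_n` fields are taken from the example dict in this reference, while the `descending` field and every default value are invented for illustration.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass(frozen=True)
class StrategySketch:
    """Hypothetical stand-in for BBoxStrategy; all defaults are invented."""

    target_column: str = "error_proba"
    aggregation: str = "max"
    top_n: int = 1
    descending: bool = True  # hypothetical field modelling the sort order


def strategy_from_dict(config: Dict[str, Any]) -> StrategySketch:
    """Reject unknown keys, then fill unspecified fields from the defaults."""
    allowed = {f.name for f in fields(StrategySketch)}
    unknown = sorted(set(config) - allowed)
    if unknown:
        raise ValueError(f"unknown strategy keys: {unknown}")
    return StrategySketch(**config)


# Example dict from this reference; `descending` falls back to its default.
strategy = strategy_from_dict(
    {"target_column": "det_error_proba", "aggregation": "mean", "top_n": 3}
)
```

Rejecting unknown keys up front turns a misspelled option such as `'agregation'` into an immediate error instead of a silently ignored setting.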