
Overview

MLdebugger is a platform that provides end-to-end support for AI model performance evaluation, debugging, and monitoring.

By analyzing the relationship between a model's internal features and its error codes during inference, MLdebugger categorizes error patterns and recommends data-quality and model-improvement actions for each category.

Key Uses of MLdebugger

The MLdebugger SDK can be used in the following two phases of the ML model lifecycle.

Development Phase: Evaluate and Improve

Identify model weaknesses and drive improvement cycles

Collect inference logs and internal features using datasets with ground truth labels, and run model evaluations. The evaluation results classify each data point into an Issue Category.

| Issue Category | Meaning | Model State |
| --- | --- | --- |
| Coverage (Highly Stable / Stable) | Areas where reliable predictions are possible | No improvement needed |
| Hotspot (Unstable / Under-Confidence) | Areas with unstable predictions that need improvement | Improvable with additional data and retraining |
| Critical Hotspot (Over-Confidence) | Dangerous areas where the model errs with high confidence | Requires highest-priority action |
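The three categories hinge on the relationship between prediction confidence and correctness. As a conceptual sketch only (the SDK's real Evaluator works on internal features and error codes; the `classify_issue` function and the 0.8 cutoff below are illustrative assumptions, not MLdebugger's logic):

```python
def classify_issue(confidence: float, correct: bool, high_conf: float = 0.8) -> str:
    """Toy mapping of a single prediction to an Issue Category.

    Mirrors the table above: confident and correct -> Coverage,
    confident but wrong -> Critical Hotspot (over-confidence),
    low confidence -> Hotspot (unstable / under-confidence).
    """
    if confidence >= high_conf:
        return "Coverage" if correct else "Critical Hotspot"
    return "Hotspot"

print(classify_issue(0.95, True))   # confident and right: safe region
print(classify_issue(0.95, False))  # confident but wrong: most dangerous
print(classify_issue(0.40, True))   # unstable region, fixable with more data
```

This also shows why Critical Hotspots get the highest priority: a wrong answer delivered with high confidence is the hardest failure to catch downstream.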

Supported Tasks: Classification, Object Detection (2D/3D)

Classes used in this workflow:

  1. ClassificationTracer / ObjectDetectionTracer / ObjectDetection3DTracer — Collect inference log data and annotation information
  2. Evaluator — Run evaluations
  3. Result — Review evaluation results (metrics and Issue Category)

Detailed evaluation results can also be visually explored in the web app (app.adansons.ai) through Heatmaps and error code distributions.

What you can do after evaluation:

  • DataFiltering — Data selection and filtering based on error patterns (ClassificationDataFilter / ObjectDetectionDataFilter / ObjectDetection3DDataFilter)
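Filtering by error pattern amounts to selecting the evaluated samples whose Issue Category calls for action. A minimal self-contained sketch of the idea; the record layout and the `select_for_retraining` helper are assumptions for illustration, not the ClassificationDataFilter API:

```python
# Hypothetical evaluation output: one record per data point.
results = [
    {"id": 1, "category": "Coverage"},
    {"id": 2, "category": "Hotspot"},
    {"id": 3, "category": "Critical Hotspot"},
    {"id": 4, "category": "Hotspot"},
]

def select_for_retraining(records, categories=("Hotspot", "Critical Hotspot")):
    """Keep only the samples whose category indicates the model needs work."""
    return [r for r in records if r["category"] in categories]

ids = [r["id"] for r in select_for_retraining(results)]
print(ids)  # candidates for additional labeling and retraining
```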

Production Phase: Monitor and Detect

Monitor model inference behavior during operation

Collect inference results in real-time from production environments and monitor model inference behavior in the web app.

| Condition | Available Features |
| --- | --- |
| Tracing + Evaluation not completed | Basic metrics only (inference count, statistics, etc.) |
| Tracing + Evaluation completed | Basic metrics + Error Estimation features (error probability estimation, Issue Category classification) |

Classes used in this workflow:

  1. ClassificationLogger / ObjectDetectionLogger / ObjectDetection3DLogger — Collect inference logs (wraps the model for seamless use like normal inference)
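The wrapping behavior described above (the logger looks and behaves like the model it wraps) is a decorator pattern. A minimal sketch of the idea, assuming the real ClassificationLogger's constructor and log transport differ:

```python
class InferenceLogger:
    """Toy stand-in for ClassificationLogger: wraps a model callable,
    records every call, and returns the prediction unchanged, so callers
    use it exactly like the underlying model."""

    def __init__(self, model):
        self.model = model
        self.records = []  # the real SDK would ship these to the backend

    def __call__(self, x):
        y = self.model(x)
        self.records.append({"input": x, "output": y})
        return y

model = lambda x: "cat" if x > 0 else "dog"  # placeholder classifier
logged = InferenceLogger(model)
print(logged(1.5))          # behaves like a normal inference call
print(len(logged.records))  # ...while capturing the inference log
```

Because the wrapper preserves the model's call signature, it can be dropped into existing inference code without changing the call sites.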

Gradual Adoption

Logging can start with basic monitoring alone, without Tracing + Evaluation. You can run an Evaluation later to enable the Error Estimation features.

How to Get Started

  1. Installation — Install the Python SDK
  2. Authentication — Configure API Key and Endpoint
  3. Tracing + Evaluation — Classification / Object Detection / 3D Object Detection
  4. DataFiltering — Data selection and filtering based on error patterns
  5. Logging — Inference log monitoring during operation
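Steps 1 and 2 typically amount to a package install and two environment variables. The package name, variable names, and placeholder values below are assumptions for illustration only; check the Installation and Authentication pages for the actual ones:

```shell
# Hypothetical package and env var names -- verify against the docs.
pip install mldebugger
export MLDEBUGGER_API_KEY="your-api-key"
export MLDEBUGGER_ENDPOINT="https://<your-endpoint>"
```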