
Overview

MLdebugger is a platform that provides end-to-end support for AI model performance evaluation, debugging, and monitoring.

By analyzing the relationship between a model's internal features and its error codes during inference, MLdebugger categorizes error patterns and recommends data-quality and model-improvement actions for each category.

Key Uses of MLdebugger

The MLdebugger SDK can be used in the following two phases of the ML model lifecycle.

Development Phase: Evaluate and Improve

Identify model weaknesses and drive improvement cycles

Collect inference logs and internal features using datasets with ground truth labels, and run model evaluations. The evaluation results classify each data point into an Issue Category.

| Issue Category | Meaning | Model State |
| --- | --- | --- |
| Coverage (Highly Stable / Stable) | Areas where reliable predictions are possible | No improvement needed |
| Hotspot (Unstable / Under-Confidence) | Areas with unstable predictions that need improvement | Improvable with additional data and retraining |
| Critical Hotspot (Over-Confidence) | Dangerous areas where the model errs with high confidence | Requires highest-priority action |
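The three categories hinge on the relationship between prediction confidence and correctness. As a conceptual sketch only (the SDK's real Evaluator works on internal features and error codes; the `classify_issue` function and the 0.8 cutoff below are illustrative assumptions, not MLdebugger's logic):

```python
def classify_issue(confidence: float, correct: bool, high_conf: float = 0.8) -> str:
    """Toy mapping of a single prediction to an Issue Category.

    Mirrors the table above: confident and correct -> Coverage,
    confident but wrong -> Critical Hotspot (over-confidence),
    low confidence -> Hotspot (unstable / under-confidence).
    """
    if confidence >= high_conf:
        return "Coverage" if correct else "Critical Hotspot"
    return "Hotspot"

print(classify_issue(0.95, True))   # confident and right: safe region
print(classify_issue(0.95, False))  # confident but wrong: most dangerous
print(classify_issue(0.40, True))   # unstable region, fixable with more data
```

This also shows why Critical Hotspots get the highest priority: a wrong answer delivered with high confidence is the hardest failure to catch downstream.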

Supported Tasks: Classification, Object Detection (2D/3D)

Classes used in this workflow:

  1. ClassificationTracer / ObjectDetectionTracer / ObjectDetection3DTracer — Collect inference log data and annotation information
  2. Evaluator — Run evaluations
  3. Result — Review evaluation results (metrics and Issue Category)

Detailed evaluation results can also be visually explored in the web app (app.adansons.ai) through Heatmaps and error code distributions.

What you can do after evaluation:

  • DataFiltering — Data selection and filtering based on error patterns (ClassificationDataFilter / ObjectDetectionDataFilter / ObjectDetection3DDataFilter)
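Filtering by error pattern amounts to selecting the evaluated samples whose Issue Category calls for action. A minimal self-contained sketch of the idea; the record layout and the `select_for_retraining` helper are assumptions for illustration, not the ClassificationDataFilter API:

```python
# Hypothetical evaluation output: one record per data point.
results = [
    {"id": 1, "category": "Coverage"},
    {"id": 2, "category": "Hotspot"},
    {"id": 3, "category": "Critical Hotspot"},
    {"id": 4, "category": "Hotspot"},
]

def select_for_retraining(records, categories=("Hotspot", "Critical Hotspot")):
    """Keep only the samples whose category indicates the model needs work."""
    return [r for r in records if r["category"] in categories]

ids = [r["id"] for r in select_for_retraining(results)]
print(ids)  # candidates for additional labeling and retraining
```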

Production Phase: Monitor and Detect

Monitor model inference behavior during operation

Collect inference results in real-time from production environments and monitor model inference behavior in the web app.

| Condition | Available Features |
| --- | --- |
| Tracing + Evaluation not completed | Basic metrics only (inference count, statistics, etc.) |
| Tracing + Evaluation completed | Basic metrics + Error Estimation features (error probability estimation, Issue Category classification) |

Classes used in this workflow:

  1. ClassificationLogger / ObjectDetectionLogger / ObjectDetection3DLogger — Collect inference logs (wraps the model for seamless use like normal inference)
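The wrapping behavior described above (the logger looks and behaves like the model it wraps) is a decorator pattern. A minimal sketch of the idea, assuming the real ClassificationLogger's constructor and log transport differ:

```python
class InferenceLogger:
    """Toy stand-in for ClassificationLogger: wraps a model callable,
    records every call, and returns the prediction unchanged, so callers
    use it exactly like the underlying model."""

    def __init__(self, model):
        self.model = model
        self.records = []  # the real SDK would ship these to the backend

    def __call__(self, x):
        y = self.model(x)
        self.records.append({"input": x, "output": y})
        return y

model = lambda x: "cat" if x > 0 else "dog"  # placeholder classifier
logged = InferenceLogger(model)
print(logged(1.5))          # behaves like a normal inference call
print(len(logged.records))  # ...while capturing the inference log
```

Because the wrapper preserves the model's call signature, it can be dropped into existing inference code without changing the call sites.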

Gradual Adoption

Logging can start with basic monitoring alone, without Tracing + Evaluation. You can run an Evaluation later to enable the Error Estimation features.

How to Get Started

  1. Installation — Install the Python SDK
  2. Authentication — Configure API Key and Endpoint
  3. Tracing + Evaluation — Classification / Object Detection / 3D Object Detection
  4. DataFiltering — Data selection and filtering based on error patterns
  5. Logging — Inference log monitoring during operation
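Steps 1 and 2 typically amount to a package install and two environment variables. The package name, variable names, and placeholder values below are assumptions for illustration only; check the Installation and Authentication pages for the actual ones:

```shell
# Hypothetical package and env var names -- verify against the docs.
pip install mldebugger
export MLDEBUGGER_API_KEY="your-api-key"
export MLDEBUGGER_ENDPOINT="https://<your-endpoint>"
```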