Read Detection Boxes from Model Output

Field	Value
Difficulty	Intermediate
Estimated Read Time	15-20 minutes
Labels	`postprocessing`, `boxdecode`, `detection`

Concept

Decode raw model output into usable bounding boxes with SimaBoxDecode, which combines thresholding, NMS, and coordinate mapping in one postprocessing stage. Learn to read both the decoded tensors and the raw byte format so you can handle whichever output form the runtime returns.

BoxDecode is a highly optimized detection postprocessing path for vision workloads. It transforms inference tensors into final bounding-box results with thresholding and NMS in a single step.

Common box-decode controls in this tutorial:

decode_type (for example yolov8): selects model-family decode behavior.
score_threshold: drops low-confidence detections early.
nms_iou_threshold: controls overlap suppression aggressiveness.
top_k: limits final detection count for deterministic downstream cost.
original_width, original_height: maps decoded boxes to the source image coordinate space.
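
As a rough illustration of what score_threshold, top_k, and original_width/original_height control, the sketch below applies the same three ideas to plain NumPy arrays. It is not the SimaBoxDecode implementation: the function name filter_and_rescale, the (x1, y1, x2, y2) input layout, and the plain-resize (no letterbox padding) mapping are all assumptions made for this example.

```python
import numpy as np

def filter_and_rescale(boxes, scores, score_threshold, top_k,
                       model_size, original_size):
    """Illustrative only: threshold, cap the count, map to source pixels.

    boxes: (N, 4) array of x1, y1, x2, y2 in model-input pixels (assumed layout).
    model_size / original_size: (width, height) tuples.
    Assumes the frame was resized without letterbox padding.
    """
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)

    # score_threshold: drop low-confidence candidates first.
    keep = scores >= score_threshold
    boxes, scores = boxes[keep], scores[keep]

    # top_k: keep only the highest-scoring candidates.
    order = np.argsort(-scores, kind="stable")[:top_k]
    boxes, scores = boxes[order], scores[order]

    # original_width / original_height: scale into source-image coordinates.
    sx = original_size[0] / model_size[0]
    sy = original_size[1] / model_size[1]
    boxes = boxes * np.array([sx, sy, sx, sy], dtype=np.float32)

    # Clamp to the source frame so boxes never extend past the image.
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, original_size[0])
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, original_size[1])
    return boxes, scores
```

If the width and height passed as the source size do not match the real frame, every box is scaled by the wrong factor, which is the usual cause of boxes that look shifted or stretched.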

Use-case guidance

Too many noisy boxes: increase score_threshold and/or reduce top_k.
Duplicate overlapping boxes: lower nms_iou_threshold to make suppression stricter.
Missed true positives: decrease score_threshold cautiously.
Boxes appear scaled/offset incorrectly: verify original_width and original_height match real source frames.
Porting between detector variants: ensure decode_type matches the model family expected by the MPK.
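
To see why lowering nms_iou_threshold makes suppression stricter, the following is a minimal greedy NMS sketch in NumPy. It illustrates the general technique only and makes no claim about how BoxDecode implements it; the helper names iou and nms and the (x1, y1, x2, y2) box layout are assumptions of this example.

```python
import numpy as np

def iou(box, others):
    """IoU of one box against an (M, 4) array of x1, y1, x2, y2 boxes."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    boxes = np.asarray(boxes, dtype=np.float32)
    order = list(np.argsort(-np.asarray(scores), kind="stable"))
    kept = []
    while order:
        best = order.pop(0)
        kept.append(int(best))
        if not order:
            break
        overlaps = iou(boxes[best], boxes[order])
        # A candidate survives only if it overlaps the kept box
        # by no more than iou_threshold.
        order = [i for i, o in zip(order, overlaps) if o <= iou_threshold]
    return kept

# Two near-duplicates (IoU = 2/3) and one separate box.
boxes = [[0, 0, 100, 100], [20, 0, 120, 100], [200, 200, 300, 300]]
scores = [0.9, 0.8, 0.7]
loose = nms(boxes, scores, iou_threshold=0.7)   # duplicate survives
strict = nms(boxes, scores, iou_threshold=0.5)  # duplicate suppressed
```

With the threshold at 0.7 the second box is kept because its overlap of about 0.67 is below the limit; at 0.5 it is removed, which is the behavior the guidance above relies on.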

APIs introduced

pyneat.ModelOptions() with .decode_type, .score_threshold, .nms_iou_threshold, .top_k, .original_width/height.
pyneat.Tensor.from_numpy(array, image_format=...) — build the input tensor.
sample.tensor with dtype=UInt8 — the packed BBOX byte buffer (wire format documented below).
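
Reading a packed UInt8 buffer generally means reinterpreting its bytes as fixed-size records. The sketch below shows that technique with a NumPy structured dtype. The layout used here (a 4-byte little-endian count followed by 24-byte records) is HYPOTHETICAL and chosen only for illustration; the real BBOX layout is the one given in this tutorial's wire-format documentation, and the names RECORD and parse_packed_boxes are not part of pyneat.

```python
import numpy as np

# HYPOTHETICAL record layout, for illustration only.
# Replace the fields with the documented BBOX wire format before real use.
RECORD = np.dtype([
    ("x1", "<f4"), ("y1", "<f4"), ("x2", "<f4"), ("y2", "<f4"),
    ("score", "<f4"), ("class_id", "<i4"),
])

def parse_packed_boxes(buf):
    """Reinterpret a uint8 array as [uint32 count][count x RECORD]."""
    data = np.asarray(buf, dtype=np.uint8).tobytes()
    if len(data) < 4:
        raise ValueError("buffer too short for the count header")
    count = int.from_bytes(data[:4], "little")
    if count == 0:
        return np.empty(0, dtype=RECORD)
    needed = 4 + count * RECORD.itemsize
    if len(data) < needed:
        raise ValueError(f"expected at least {needed} bytes, got {len(data)}")
    # Trailing bytes past the declared count are ignored.
    return np.frombuffer(data, dtype=RECORD, count=count, offset=4)
```

Validating the declared count against the buffer length before reinterpreting avoids reading garbage when the buffer is larger or smaller than the number of valid detections.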

Prerequisites: Chapter 001, plus Chapter 004 for ModelOptions basics.

Learning Process

Configure model/postproc options for a detector-style pipeline.
Run deterministic preproc + inference + boxdecode flow.
Inspect decoded output signals (box count, output kind/fields).

Run

Python:

python3 share/sima-neat/tutorials/006_read_detection_boxes/read_detection_boxes.py \
  --mpk /tmp/yolo_v8s_mpk.tar.gz --width 640 --height 640

C++ (prebuilt):

./lib/sima-neat/tutorials/tutorial_006_read_detection_boxes \
  --mpk /tmp/yolo_v8s_mpk.tar.gz --image /path/to/frame.jpg

C++ (build from source):

./build.sh --target tutorial_006_read_detection_boxes
./build/tutorials-standalone/tutorial_006_read_detection_boxes \
  --mpk /tmp/yolo_v8s_mpk.tar.gz --image /path/to/frame.jpg

To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.

Code

tutorials/006_read_detection_boxes/read_detection_boxes.cpp
// Decompose model execution into stages: Preproc -> Infer -> BoxDecode.
//
// Usage:
//   tutorial_006_read_detection_boxes --mpk /path/to/yolo_v8s.tar.gz --image /path/to.jpg

#include "neat.h"

#include "pipeline/StageRun.h"

#include <opencv2/imgcodecs.hpp>

#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

namespace {

bool get_arg(int argc, char** argv, const std::string& key, std::string& out) {
  for (int i = 1; i + 1 < argc; ++i) {
    if (key == argv[i]) {
      out = argv[i + 1];
      return true;
    }
  }
  return false;
}

} // namespace

int main(int argc, char** argv) {
  try {
    std::string mpk, image;
    if (!get_arg(argc, argv, "--mpk", mpk) || !get_arg(argc, argv, "--image", image)) {
      std::cerr << "Usage: tutorial_006_read_detection_boxes --mpk <path> --image <path>\n";
      return 1;
    }

    cv::Mat bgr = cv::imread(image, cv::IMREAD_COLOR);
    if (bgr.empty())
      throw std::runtime_error("failed to load image: " + image);

    simaai::neat::Model::Options opt;
    opt.preprocess.color_convert.input_format = simaai::neat::PreprocessColorFormat::BGR;
    opt.preprocess.input_max_width = bgr.cols;
    opt.preprocess.input_max_height = bgr.rows;
    opt.preprocess.input_max_depth = bgr.channels();
    opt.decode_type = simaai::neat::BoxDecodeType::YoloV8;

    simaai::neat::Model model(mpk, opt);

    // CORE LOGIC
    // Stage-by-stage: each stages::* call runs one piece of the model pipeline.
    simaai::neat::TensorList pre = simaai::neat::stages::Preproc(std::vector<cv::Mat>{bgr}, model);
    simaai::neat::SampleList infer_samples = simaai::neat::stages::Infer(
        simaai::neat::SampleList{simaai::neat::sample_from_tensors(pre)}, model);
    if (infer_samples.empty())
      throw std::runtime_error("infer stage returned no samples");
    simaai::neat::Sample infer = infer_samples.front();

    simaai::neat::stages::BoxDecodeOptions box(simaai::neat::BoxDecodeType::YoloV8);
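    // The (void) casts are no-ops: they only mark these values as
    // intentionally unused at this point in the tutorial.
    (void)box.decode_type;
    (void)bgr.cols;
    (void)bgr.rows;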
    box.detection_threshold = 0.55;
    box.nms_iou_threshold = 0.5;
    box.top_k = 100;

    // BoxDecode parses the "BBOX" tensor into {x1, y1, x2, y2, score, class_id}
    // entries clamped to original_width x original_height source pixels.
    simaai::neat::BoxDecodeResult decoded = simaai::neat::stages::BoxDecode(infer, model, box);

    std::cout << "boxes=" << decoded.boxes.size() << "\n";
    std::cout << "[OK] 006_read_detection_boxes\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "[FAIL] " << e.what() << "\n";
    return 1;
  }
}
