Preprocess Images Before Inference

Difficulty: Intermediate
Estimated Read Time: 15-20 minutes
Labels: preprocessing, normalization, image

Concept

Configure the preprocessing stage (format, dimensions, and per-channel normalization) so that raw image input is converted into a tensor with the same layout, size, and value range your model was trained on. Accurate preprocessing is usually the difference between a model that works and a model that looks broken.

This chapter focuses on the preproc controls you use most in real deployments:

  • format: declares input image layout/color order (RGB/BGR/GRAY) expected at ingress.
  • input_max_width, input_max_height, input_max_depth: runtime bounds for accepted dynamic inputs.
  • preproc.input_width, preproc.input_height: expected source dimensions entering preproc.
  • preproc.output_width, preproc.output_height: tensor dimensions produced for inference.
  • preproc.normalize: enables value normalization before inference.
  • preproc.channel_mean, preproc.channel_stddev: per-channel normalization constants that should match model training assumptions.

Use-case guidance

  • Model output is unstable or low-confidence after deployment: verify format and channel_mean / channel_stddev first.
  • Multiple cameras/sources with different resolutions: set input_max_* and explicit preproc in/out dimensions for predictable behavior.
  • Porting a model from another framework: mirror the training-time normalization recipe with preproc.normalize + channel stats.
  • Isolating preprocessing issues: run and inspect the model.preprocess() path, and confirm tensor shape and dtype before debugging inference or postprocessing.
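The first item in this list (check format and channel statistics first) can be illustrated with a small sketch of what a channel-order mismatch does to the values a model sees. It uses the widely published ImageNet RGB statistics and the same assumed formula, (x / 255 - mean) / stddev. The function names are hypothetical and nothing here calls pyneat.

```python
# ImageNet statistics in RGB order (widely published values).
IMAGENET_MEAN_RGB = (0.485, 0.456, 0.406)
IMAGENET_STD_RGB = (0.229, 0.224, 0.225)


def normalize(pixel, mean, stddev):
    """Assumed recipe: scale 8-bit values to [0, 1], then (x - mean) / stddev."""
    return [(v / 255.0 - m) / s for v, m, s in zip(pixel, mean, stddev)]


def max_channel_error(pixel_rgb):
    """Largest per-channel error when BGR data is treated as if it were RGB."""
    correct = normalize(pixel_rgb, IMAGENET_MEAN_RGB, IMAGENET_STD_RGB)
    # Simulated mistake: the frame arrives in BGR order, but the format and
    # the statistics are still declared in RGB order.
    pixel_bgr = tuple(reversed(pixel_rgb))
    wrong = normalize(pixel_bgr, IMAGENET_MEAN_RGB, IMAGENET_STD_RGB)
    # The model reads channel i in both cases, so compare position by position.
    return max(abs(c - w) for c, w in zip(correct, wrong))


print("grey pixel error:   ", max_channel_error((80, 80, 80)))
print("colour pixel error: ", round(max_channel_error((120, 80, 40)), 3))
```

A grey pixel hides the problem because all channels are equal, while a saturated colour shifts by more than one standard deviation. This is why a mismatch tends to show up as degraded confidence on real images rather than as an outright failure.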

APIs introduced

  • pyneat.ModelOptions().preproc.* — the preprocessing sub-struct with the fields listed above.
  • model.preprocess() — retrieves the preprocessing node group so you can inspect it in isolation.

Prerequisites

Chapter 001. See Chapter 004 for the rest of ModelOptions.

Learning Process

  1. Configure Model::Options / ModelOptions with explicit preproc dimensions, format, and normalization policy.
  2. Build the model and inspect the preprocessing stage: which nodes the group contains and what tensor shape and dtype it is expected to produce.
  3. Run preprocessing on a fixed, deterministic input and verify the rank and dtype of the resulting tensor.
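Step 3 can be paired with a small reference computation so that there is something to compare the real output against. The sketch below is pure Python and does not call pyneat: it builds a constant-colour image, applies a nearest-neighbour resize and the assumed (x / 255 - mean) / stddev normalization, and reports the resulting rank and shape. The HWC layout, the resize method, and all function names are assumptions made for illustration; the actual preproc stage may use a different layout or add a batch dimension.

```python
def make_constant_image(height, width, pixel):
    """Build an H x W x C image (nested lists) filled with one pixel value."""
    return [[list(pixel) for _ in range(width)] for _ in range(height)]


def resize_nearest(image, out_height, out_width):
    """Nearest-neighbour resize; sufficient for a constant-colour test image."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[y * in_height // out_height][x * in_width // out_width]
         for x in range(out_width)]
        for y in range(out_height)
    ]


def reference_preprocess(image, out_height, out_width, mean, stddev):
    """Resize, then normalize each channel as (value / 255 - mean) / stddev."""
    resized = resize_nearest(image, out_height, out_width)
    return [
        [[(v / 255.0 - m) / s for v, m, s in zip(px, mean, stddev)] for px in row]
        for row in resized
    ]


def shape_of(tensor):
    """Return the nested-list shape, e.g. (224, 224, 3)."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]
    return tuple(shape)


# A fixed input makes the expected output easy to state: every output pixel
# must equal the normalized value of the single input colour.
image = make_constant_image(480, 640, (40, 80, 120))
tensor = reference_preprocess(image, 224, 224, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
print("rank =", len(shape_of(tensor)))
print("shape =", shape_of(tensor))
```

Because the input is a single colour, any deviation between pixels in the real preprocessed tensor points at the resize or normalization settings rather than at the image content.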

Run

Python:

python3 share/sima-neat/tutorials/005_preprocess_images/preprocess_images.py \
--mpk /tmp/resnet_50_mpk.tar.gz --size 224

C++ (prebuilt):

./lib/sima-neat/tutorials/tutorial_005_preprocess_images \
--mpk /tmp/resnet_50_mpk.tar.gz --size 224

C++ (build from source):

./build.sh --target tutorial_005_preprocess_images
./build/tutorials-standalone/tutorial_005_preprocess_images \
--mpk /tmp/resnet_50_mpk.tar.gz --size 224

To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.

Code

tutorials/005_preprocess_images/preprocess_images.cpp
// Run preprocessing standalone via stages::Preproc and inspect the resulting tensor.
//
// Usage:
//   tutorial_005_preprocess_images --mpk /path/to/resnet_50.tar.gz [--size 224]

#include "neat.h"

#include "pipeline/StageRun.h"

#include <opencv2/core.hpp>

#include <array>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

namespace {

// Looks up "--key value" on the command line; returns false if the key is absent.
bool get_arg(int argc, char** argv, const std::string& key, std::string& out) {
  for (int i = 1; i + 1 < argc; ++i) {
    if (key == argv[i]) {
      out = argv[i + 1];
      return true;
    }
  }
  return false;
}

// Parses an integer option, falling back to `def` when the key is absent.
// std::stoi throws on malformed input; main() catches and reports it.
int parse_int_arg(int argc, char** argv, const std::string& key, int def) {
  std::string value;
  if (!get_arg(argc, argv, key, value))
    return def;
  return std::stoi(value);
}

} // namespace

int main(int argc, char** argv) {
  try {
    std::string mpk;
    if (!get_arg(argc, argv, "--mpk", mpk)) {
      std::cerr << "Usage: tutorial_005_preprocess_images --mpk <path> [--size <n>]\n";
      return 1;
    }
    const int size = parse_int_arg(argc, argv, "--size", 224);

    simaai::neat::Model::Options opt;
    // Input color order: OpenCV images are BGR.
    opt.preprocess.color_convert.input_format = simaai::neat::PreprocessColorFormat::BGR;
    // Upper bounds on the inputs the model accepts at runtime.
    opt.preprocess.input_max_width = size;
    opt.preprocess.input_max_height = size;
    opt.preprocess.input_max_depth = 3;
    // Tensor dimensions produced for inference.
    opt.preprocess.resize.width = size;
    opt.preprocess.resize.height = size;
    // Per-channel normalization; the constants must match the training recipe.
    opt.preprocess.normalize.enable = simaai::neat::AutoFlag::On;
    opt.preprocess.normalize.mean = std::array<float, 3>{0.5f, 0.5f, 0.5f};
    opt.preprocess.normalize.stddev = std::array<float, 3>{0.5f, 0.5f, 0.5f};

    simaai::neat::Model model(mpk, opt);

    // Constant-color BGR test image, so the run is deterministic.
    cv::Mat bgr(size, size, CV_8UC3, cv::Scalar(40, 80, 120));
    if (!bgr.isContinuous())
      bgr = bgr.clone();

    // CORE LOGIC
    // stages::Preproc runs just the preprocessing step from the model's Options
    // and returns the preprocessed Tensor.
    simaai::neat::Tensor pre =
        simaai::neat::stages::Preproc(std::vector<cv::Mat>{bgr}, model).front();

    std::cout << "preproc_rank=" << pre.shape.size() << "\n";
    std::cout << "preproc_dtype=" << static_cast<int>(pre.dtype) << "\n";
    std::cout << "[OK] 005_preprocess_images\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "[FAIL] " << e.what() << "\n";
    return 1;
  }
}
