LatencyProfiler.h File
Per-kernel latency profiler: V1 surface.
Namespaces Index
| namespace | simaai |
| namespace | neat |
Classes Index
| struct | ProfilerKernelInvocation | One kernel-invocation telemetry event. |
| struct | ProfilerMemcpySite | Aggregate counters for one instrumented memcpy site. |
| struct | ProfilerKernelAggregate | Aggregated timings for one (backend, kernel, stage, slot) tuple. |
| struct | ProfilerReport | Snapshot bundle returned by LatencyProfiler::finalize(). |
| struct | LatencyProfilerOptions | Construction options for LatencyProfiler. |
| class | LatencyProfiler | Per-sample latency tracker; attach to a Run to capture timing telemetry. |
Description
Per-kernel latency profiler — V1 surface.
Attach a LatencyProfiler to a simaai::neat::Run (or to a Session that has produced one) before you start pushing frames. After the run loop, call finalize() to obtain a ProfilerReport, then pass it to to_text() for a human-readable summary or to to_chrome_trace() for a JSON trace file loadable in chrome://tracing or Perfetto.
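To give a feel for what a trace file loadable in chrome://tracing or Perfetto contains, the sketch below formats one timed interval as a Chrome Trace Event Format "complete" event (`"ph":"X"`, with `ts` and `dur` in microseconds). It is illustrative only: `sketch::to_trace_event` and its parameters are hypothetical names, not part of this header, and the actual output of to_chrome_trace() may carry different or additional fields. The sketch also assumes the names contain no characters that need JSON escaping.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical helper: formats one kernel interval as a Chrome trace
// "complete" event. Timestamps arrive in nanoseconds and are written
// in microseconds, which is the unit the trace format expects.
inline std::string to_trace_event(const std::string& name,
                                  const std::string& backend,
                                  std::uint64_t start_ns,
                                  std::uint64_t end_ns,
                                  int tid) {
    std::ostringstream os;
    os << std::fixed << std::setprecision(3);
    os << "{\"name\":\"" << name << "\""
       << ",\"cat\":\"" << backend << "\""
       << ",\"ph\":\"X\""
       << ",\"ts\":" << (start_ns / 1000.0)
       << ",\"dur\":" << ((end_ns - start_ns) / 1000.0)
       << ",\"pid\":0"
       << ",\"tid\":" << tid << "}";
    return os.str();
}

}  // namespace sketch
```

A full trace file would wrap a comma-separated list of such objects in a top-level JSON array or in a `{"traceEvents": [...]}` object.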
The profiler aggregates four classes of telemetry:
- Per-kernel-invocation events (MLA, A65, EV74, BoxDecode, Memcpy) drained from libsimaaineatprofiler.so's cross-shared-library ring. Each event carries (start_ns, end_ns, backend, phase, physical_input_index, output_slot, frame_id, request_id, kernel_name, stage_name, in/out segment names, bytes).
- Per-element aggregate timings (from the existing Run::diag_snapshot()).
- End-to-end per-frame stats (from the existing Run::stats()).
- Per-site memcpy totals (calls / total_ns / total_bytes) for the five hot copy sites the runtime instruments.
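As a rough illustration of how per-invocation events can roll up into one record per (backend, kernel, stage, slot) tuple, the sketch below groups events by that key and accumulates call count, total, minimum and maximum duration. The `sketch::Event` and `sketch::Aggregate` types are hypothetical stand-ins carrying only a subset of the fields listed above; they do not reflect the actual layout of ProfilerKernelInvocation or ProfilerKernelAggregate.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <tuple>
#include <vector>

namespace sketch {

// Hypothetical, reduced event record (not the real ProfilerKernelInvocation).
struct Event {
    std::uint64_t start_ns;
    std::uint64_t end_ns;
    std::string backend;
    std::string kernel_name;
    std::string stage_name;
    int output_slot;
};

// Hypothetical aggregate record (not the real ProfilerKernelAggregate).
struct Aggregate {
    std::uint64_t calls = 0;
    std::uint64_t total_ns = 0;
    std::uint64_t min_ns = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t max_ns = 0;
};

// Grouping key: (backend, kernel, stage, slot).
using Key = std::tuple<std::string, std::string, std::string, int>;

inline std::map<Key, Aggregate> aggregate(const std::vector<Event>& events) {
    std::map<Key, Aggregate> out;
    for (const Event& e : events) {
        // Guard against a malformed event whose end precedes its start.
        const std::uint64_t dur =
            e.end_ns >= e.start_ns ? e.end_ns - e.start_ns : 0;
        Aggregate& a =
            out[Key{e.backend, e.kernel_name, e.stage_name, e.output_slot}];
        a.calls += 1;
        a.total_ns += dur;
        a.min_ns = std::min(a.min_ns, dur);
        a.max_ns = std::max(a.max_ns, dur);
    }
    return out;
}

}  // namespace sketch
```

The same accumulate-by-key shape would apply to the per-site memcpy totals, with the site as the key and calls / total_ns / total_bytes as the counters.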
Overhead when profiling is off is gated by sima_neat_profiler_enabled(): when no profiler is attached, every emit site costs one atomic load plus a branch.
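The "one atomic load plus a branch" pattern can be sketched as follows. This is a generic illustration of the gating technique, not the library's implementation: `g_profiler_enabled`, `profiler_enabled()`, `emit_kernel_event()` and the event counter are hypothetical stand-ins for whatever sima_neat_profiler_enabled() and the real emit sites do internally.

```cpp
#include <atomic>
#include <cstdint>

namespace sketch {

// Hypothetical global flag, flipped when a profiler attaches or detaches.
std::atomic<bool> g_profiler_enabled{false};

// Hypothetical counter standing in for "an event was written to the ring".
std::atomic<std::uint64_t> g_events_emitted{0};

inline bool profiler_enabled() noexcept {
    // A relaxed load is enough for an on/off gate: only the flag value
    // matters here, not ordering relative to other memory.
    return g_profiler_enabled.load(std::memory_order_relaxed);
}

inline void emit_kernel_event(std::uint64_t start_ns, std::uint64_t end_ns) {
    // Fast path: with no profiler attached, the emit site costs this
    // single atomic load and the branch below.
    if (!profiler_enabled()) {
        return;
    }
    // Slow path: a real emit site would record the event; the sketch
    // only counts it.
    (void)start_ns;
    (void)end_ns;
    g_events_emitted.fetch_add(1, std::memory_order_relaxed);
}

}  // namespace sketch
```

Keeping the check inline at each emit site means the disabled path never pays for a function call into the profiler library or for building the event payload.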
See Also
Run.h for RunStats, InputStreamStats, RunDiagSnapshot.