ProfilerKernelAggregate Struct

Aggregated timings for one (backend, kernel, stage, slot) tuple.

Declaration

struct simaai::neat::ProfilerKernelAggregate { ... }

Included Headers

#include <LatencyProfiler.h>

Public Member Functions Index

double avg_ms () const

Mean latency per invocation, in milliseconds.

Public Member Attributes Index

std::string backend

Backend label ("MLA", "A65", ...).

std::string kernel_name

Kernel name within the backend.

std::string stage_name

Pipeline stage name.

std::int32_t physical_input_index = -1

Physical input index, -1 if N/A.

std::int32_t output_slot = -1

Output slot, -1 if N/A.

std::uint64_t count = 0

Number of invocations in the bucket.

double total_ms = 0.0

Total time across invocations (ms).

double min_ms = 0.0

Minimum single-invocation time (ms).

double max_ms = 0.0

Maximum single-invocation time (ms).

Description

Aggregated timings for one (backend, kernel, stage, slot) tuple.

Bucketed view over ProfilerKernelInvocation records: call count plus total/min/max latency in milliseconds. Use avg_ms() for the mean.
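
To make the bucket semantics concrete, here is a minimal sketch of how one latency sample might be folded into such an aggregate. It is not taken from LatencyProfiler.h: `KernelAggregateSketch` is a local mirror of the fields documented on this page so the snippet compiles on its own, and `add_sample` is a hypothetical helper. How the library actually populates its buckets is not shown in this reference.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>

// Local mirror of the documented fields (assumption: illustration only).
// Real code would include LatencyProfiler.h and use
// simaai::neat::ProfilerKernelAggregate instead.
struct KernelAggregateSketch {
    std::string backend;
    std::string kernel_name;
    std::string stage_name;
    std::int32_t physical_input_index = -1;
    std::int32_t output_slot = -1;
    std::uint64_t count = 0;
    double total_ms = 0.0;
    double min_ms = 0.0;
    double max_ms = 0.0;
};

// Hypothetical helper: fold one invocation latency into a bucket.
inline void add_sample(KernelAggregateSketch& agg, double latency_ms) {
    if (agg.count == 0) {
        // The first sample seeds min and max. Without this, the 0.0
        // default would pin min_ms at zero for any positive latency.
        agg.min_ms = latency_ms;
        agg.max_ms = latency_ms;
    } else {
        agg.min_ms = std::min(agg.min_ms, latency_ms);
        agg.max_ms = std::max(agg.max_ms, latency_ms);
    }
    agg.total_ms += latency_ms;
    ++agg.count;
}
```

Because min_ms and max_ms default to 0.0, a bucket with count equal to 0 carries no meaningful minimum or maximum. Check count before reading them.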

Definition at line 130 of file LatencyProfiler.h.

Public Member Functions

avg_ms()

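double simaai::neat::ProfilerKernelAggregate::avg_ms () const
inline

Mean latency per invocation, in milliseconds: total_ms divided by count. Returns 0.0 when count is 0, so an empty bucket never divides by zero.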

Definition at line 141 of file LatencyProfiler.h.

141 double avg_ms() const {
142 return count > 0 ? (total_ms / static_cast<double>(count)) : 0.0;
143 }
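
The zero-count guard means a returned 0.0 can stand for either "no samples" or a genuinely zero mean. The sketch below shows one way a caller might tell the two apart. `AggregateMirror` copies only the two fields and the avg_ms() body shown above so the snippet compiles without the header, and `has_samples` is a hypothetical helper, not part of the library.

```cpp
#include <cstdint>

// Minimal local mirror (assumption: illustration only); avg_ms() is the
// same expression as the definition shown above.
struct AggregateMirror {
    std::uint64_t count = 0;
    double total_ms = 0.0;

    double avg_ms() const {
        return count > 0 ? (total_ms / static_cast<double>(count)) : 0.0;
    }
};

// Hypothetical helper: true when the bucket holds at least one invocation,
// so that avg_ms() reflects real measurements.
inline bool has_samples(const AggregateMirror& agg) {
    return agg.count > 0;
}
```

A report generator could call has_samples() first and print a placeholder for empty buckets instead of a misleading 0.0 ms.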

Public Member Attributes

backend

std::string simaai::neat::ProfilerKernelAggregate::backend

Backend label ("MLA", "A65", ...).

Definition at line 131 of file LatencyProfiler.h.

131 std::string backend;

count

std::uint64_t simaai::neat::ProfilerKernelAggregate::count = 0

Number of invocations in the bucket.

Definition at line 136 of file LatencyProfiler.h.

136 std::uint64_t count = 0;

kernel_name

std::string simaai::neat::ProfilerKernelAggregate::kernel_name

Kernel name within the backend.

Definition at line 132 of file LatencyProfiler.h.

132 std::string kernel_name;

max_ms

double simaai::neat::ProfilerKernelAggregate::max_ms = 0.0

Maximum single-invocation time (ms).

Definition at line 139 of file LatencyProfiler.h.

139 double max_ms = 0.0;

min_ms

double simaai::neat::ProfilerKernelAggregate::min_ms = 0.0

Minimum single-invocation time (ms).

Definition at line 138 of file LatencyProfiler.h.

138 double min_ms = 0.0;

output_slot

std::int32_t simaai::neat::ProfilerKernelAggregate::output_slot = -1

Output slot, -1 if N/A.

Definition at line 135 of file LatencyProfiler.h.

135 std::int32_t output_slot = -1;

physical_input_index

std::int32_t simaai::neat::ProfilerKernelAggregate::physical_input_index = -1

Physical input index, -1 if N/A.

Definition at line 134 of file LatencyProfiler.h.

134 std::int32_t physical_input_index = -1;
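
Both physical_input_index and output_slot use -1 to mean "not applicable", so code that prints or keys on them may want to treat that value specially. The following is a small sketch of one way to do so. `index_label` and `bucket_label` are hypothetical helpers, and the label layout is an arbitrary choice for illustration, not a format used by the profiler.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: render an index, showing "n/a" for the -1 sentinel.
inline std::string index_label(std::int32_t index) {
    return index == -1 ? std::string("n/a") : std::to_string(index);
}

// Hypothetical helper: build a human-readable bucket label from the
// identifying fields of an aggregate. The format is an assumption.
inline std::string bucket_label(const std::string& backend,
                                const std::string& kernel_name,
                                const std::string& stage_name,
                                std::int32_t physical_input_index,
                                std::int32_t output_slot) {
    return backend + "/" + kernel_name + "@" + stage_name +
           " in=" + index_label(physical_input_index) +
           " out=" + index_label(output_slot);
}
```

Passing the fields individually keeps the sketch independent of the header. With the real struct, the same values come from agg.backend, agg.kernel_name, and so on.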

stage_name

std::string simaai::neat::ProfilerKernelAggregate::stage_name

Pipeline stage name.

Definition at line 133 of file LatencyProfiler.h.

133 std::string stage_name;

total_ms

double simaai::neat::ProfilerKernelAggregate::total_ms = 0.0

Total time across invocations (ms).

Definition at line 137 of file LatencyProfiler.h.

137 double total_ms = 0.0;

The documentation for this struct was generated from the following file:


Generated via doxygen2docusaurus 2.0.0 by Doxygen 1.9.8.