QuantSpec Struct
Quantization metadata for INT8/INT16 tensors.
Public Member Attributes Index
| Type | Member | Description |
| --- | --- | --- |
| float | scale = 1.0f | Per-tensor scale (x_real = (x_int - zero_point) * scale). |
| int32_t | zero_point = 0 | Per-tensor zero point. |
| int | axis = -1 | Channel axis for per-channel quantization (-1 = per-tensor). |
| std::vector< float > | scales | Per-channel scales (used when axis >= 0). |
| std::vector< int32_t > | zero_points | Per-channel zero points (used when axis >= 0). |
Description
Quantization metadata for INT8/INT16 tensors.
For per-tensor quantization, set scale and zero_point directly and leave axis at its default of -1. For per-channel quantization (typical for quantized weights), populate scales and zero_points and set axis to the index of the channel dimension.
Definition at line 226 of file TensorCore.h.
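The two configurations described above can be sketched as follows. This is a minimal illustration, not code from TensorCore.h: the struct below is a local mirror of the documented members so that the snippet compiles on its own, and the helpers `make_per_tensor` and `make_per_channel` are hypothetical names introduced only for this example. The size check between scales and zero_points is an assumption about sensible usage, not a documented requirement.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Local mirror of the documented members (assumption: in real code you would
// include TensorCore.h and use its QuantSpec instead of this copy).
struct QuantSpec {
    float scale = 1.0f;                     // per-tensor scale
    std::int32_t zero_point = 0;            // per-tensor zero point
    int axis = -1;                          // -1 = per-tensor
    std::vector<float> scales;              // per-channel scales
    std::vector<std::int32_t> zero_points;  // per-channel zero points
};

// Hypothetical helper: per-tensor spec. axis keeps its default of -1.
QuantSpec make_per_tensor(float scale, std::int32_t zero_point) {
    QuantSpec q;
    q.scale = scale;
    q.zero_point = zero_point;
    return q;
}

// Hypothetical helper: per-channel spec along `axis`.
QuantSpec make_per_channel(int axis,
                           std::vector<float> scales,
                           std::vector<std::int32_t> zero_points) {
    assert(axis >= 0);                            // per-channel needs a real axis
    assert(scales.size() == zero_points.size());  // assumed: one pair per channel
    QuantSpec q;
    q.axis = axis;
    q.scales = std::move(scales);
    q.zero_points = std::move(zero_points);
    return q;
}
```

For example, `make_per_tensor(0.05f, 128)` would describe an activation tensor with a single scale, while `make_per_channel(0, {0.1f, 0.2f}, {0, 0})` would describe a weight tensor with two channels along dimension 0.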
Public Member Attributes
axis
Channel axis for per-channel quantization (-1 = per-tensor).
Definition at line 229 of file TensorCore.h.
scale
Per-tensor scale (x_real = (x_int - zero_point) * scale).
Definition at line 227 of file TensorCore.h.
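As a worked illustration of the formula above, the following sketch applies x_real = (x_int - zero_point) * scale to a single value. The function name `dequantize` is hypothetical and is not claimed to exist in TensorCore.h; it takes the two per-tensor fields as plain parameters.

```cpp
#include <cstdint>

// Hypothetical helper: dequantize one value with per-tensor parameters,
// following the documented formula x_real = (x_int - zero_point) * scale.
// x_int is widened to 32 bits so the subtraction cannot overflow for
// INT8/INT16 inputs.
inline float dequantize(std::int32_t x_int, float scale, std::int32_t zero_point) {
    return static_cast<float>(x_int - zero_point) * scale;
}
```

With scale = 0.5 and zero_point = 128, a stored value of 130 maps to (130 - 128) * 0.5 = 1.0.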
scales
Per-channel scales (used when axis >= 0).
Definition at line 230 of file TensorCore.h.
zero_point
Per-tensor zero point.
zero_points
Per-channel zero points (used when axis >= 0).
Definition at line 231 of file TensorCore.h.
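One way the per-channel fields might be consumed is sketched below. It assumes a dense, row-major INT8 buffer and applies the same formula as the per-tensor case, but selects the scale and zero point by channel index. The function `dequantize_per_channel` and its `channels` and `inner` parameters are hypothetical; how the library itself lays out and iterates tensors is not stated in this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of TensorCore.h).
// Assumes a dense row-major tensor where:
//   channels = size of the dimension selected by QuantSpec::axis
//   inner    = product of the dimension sizes after that axis
std::vector<float> dequantize_per_channel(
        const std::vector<std::int8_t>& data,
        std::size_t channels,
        std::size_t inner,
        const std::vector<float>& scales,
        const std::vector<std::int32_t>& zero_points) {
    assert(channels > 0 && inner > 0);
    assert(scales.size() == channels);
    assert(zero_points.size() == channels);

    std::vector<float> out(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        // Channel index of flat element i in a row-major layout.
        const std::size_t c = (i / inner) % channels;
        const std::int32_t x_int = static_cast<std::int32_t>(data[i]);
        out[i] = static_cast<float>(x_int - zero_points[c]) * scales[c];
    }
    return out;
}
```

For a 2x2 weight tensor quantized along axis 0, `channels` is 2 and `inner` is 2, so the first row uses `scales[0]` and `zero_points[0]` and the second row uses `scales[1]` and `zero_points[1]`.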
The documentation for this struct was generated from the following file: TensorCore.h
Generated via doxygen2docusaurus 2.0.0 by Doxygen 1.9.8.