MAC Module
Description
The hs_npu_mac (Multiply-Accumulate) module is the fundamental computational block of the ScaleNPU. It takes two input values, multiplies them, and adds the product to a running sum. This multiply-accumulate (MAC) step is the basis for larger computations such as matrix multiplication.
Each instance of hs_npu_mac operates independently, supporting parallel processing and enabling high-throughput computation across multiple elements. For applications like matrix multiplication, the inputs a_in and b_in can represent values from two matrices. The module multiplies these inputs, adds the result to an incoming sum, and outputs the result while forwarding the input values for further processing.
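
As an illustration of how chained MAC operations build up a matrix product, the sketch below feeds each `result` back in as the next `sum`. It is a purely arithmetic model: the function names `mac`, `dot_via_mac_chain`, and `matmul_via_mac` are hypothetical, and it ignores clocking, registers, and bit widths.

```python
def mac(a_in, b_in, sum_in):
    """One multiply-accumulate step: product of the inputs plus the incoming sum."""
    return a_in * b_in + sum_in


def dot_via_mac_chain(row, col):
    """Compute a dot product by passing each result on as the next sum."""
    running = 0
    for a, b in zip(row, col):
        running = mac(a, b, running)
    return running


def matmul_via_mac(A, B):
    """Matrix product where every output element is one chain of MAC steps."""
    cols = list(zip(*B))  # columns of B
    return [[dot_via_mac_chain(row, col) for col in cols] for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul_via_mac(A, B)
print(C)  # [[19, 22], [43, 50]]
```

Each output element needs one MAC step per pair of input values, which is why many independent MAC instances working in parallel raise throughput.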
I/O Table
Input Table
| Input Name | Direction | Type | Description |
|---|---|---|---|
| clk | Input | logic | Clock signal for synchronization. |
| enable_in | Input | logic | Enable signal that controls when b_in is registered. |
| a_in | Input | short | First input value for the MAC operation. |
| b_in | Input | short | Second input value for the MAC operation. |
| sum | Input | word | Running sum to which the product of a_in and b_in is added. |
Output Table
| Output Name | Direction | Type | Description |
|---|---|---|---|
| a_out | Output | short | Forwarded output of a_in for use in subsequent operations. |
| b_out | Output | short | Forwarded output of b_in for use in subsequent operations. |
| result | Output | word | Output result of the multiply-accumulate operation \( a \times b + c \). |
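
The tables give the types `short` and `word` without stating their widths. The sketch below assumes, purely for illustration, that `short` is a 16-bit signed value and `word` a 32-bit signed value. Under that assumption a single product always fits in `word`, but accumulating several products can exceed it. The wrap-around behaviour shown is also an assumption; the source does not say whether the hardware wraps or saturates.

```python
SHORT_BITS = 16  # assumed width of `short` (not stated in the source)
WORD_BITS = 32   # assumed width of `word` (not stated in the source)


def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half


short_min = -(1 << (SHORT_BITS - 1))   # -32768
word_max = (1 << (WORD_BITS - 1)) - 1  # 2147483647

# Largest-magnitude product of two assumed 16-bit signed operands: 2**30.
largest_product = short_min * short_min
product_fits = largest_product <= word_max

# Two such products sum to 2**31, one past the assumed 32-bit signed maximum.
two_products = largest_product + largest_product
wrapped = wrap_signed(two_products, WORD_BITS)

print(product_fits)  # True
print(wrapped)       # -2147483648
```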
Operation
The hs_npu_mac module performs the multiply-accumulate operation as follows:
- Receives `a_in` and `b_in` as inputs for multiplication. These inputs may represent values from two matrices in a matrix multiplication scenario.
- When `enable_in` is active, registers the `b_in` input value, allowing selective updates to the second operand in the multiplication. In this design, that supports "fixed weight" inference.
- Performs the multiply-accumulate operation by calculating \( (a\_in \times b\_ff) + sum \), where `b_ff` stores the last registered `b_in` value.
- Outputs the result of the operation in `result`.
- Forwards `a_ff` and `b_ff` as `a_out` and `b_out` to propagate values to the next MAC unit.
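
The steps above can be sketched as a small behavioural model. This is not the RTL: the class name `MacModel`, the zero reset values, and the split into a combinational `result` method and a `clock` method are assumptions made for illustration. In particular, the source does not say whether `result` is registered.

```python
class MacModel:
    """Behavioural sketch of one hs_npu_mac cell (hypothetical, not the RTL)."""

    def __init__(self):
        self.a_ff = 0  # assumed reset value
        self.b_ff = 0  # assumed reset value

    def result(self, a_in, sum_in):
        """(a_in * b_ff) + sum, using the last registered b_in value."""
        return a_in * self.b_ff + sum_in

    def clock(self, a_in, b_in, enable_in):
        """Model one clock edge: a_ff always updates, b_ff only when enabled."""
        self.a_ff = a_in
        if enable_in:
            self.b_ff = b_in

    @property
    def a_out(self):
        return self.a_ff  # forwarded to the next MAC unit

    @property
    def b_out(self):
        return self.b_ff  # forwarded to the next MAC unit


mac = MacModel()
mac.clock(a_in=0, b_in=3, enable_in=True)    # load the fixed operand 3
first = mac.result(a_in=4, sum_in=10)        # 4 * 3 + 10
mac.clock(a_in=4, b_in=99, enable_in=False)  # b_in is ignored, b_ff stays 3
second = mac.result(a_in=5, sum_in=0)        # 5 * 3 + 0
print(first, second, mac.a_out, mac.b_out)   # 22 15 4 3
```

Because `b_ff` changes only when `enable_in` is active, the same operand is reused across many `a_in` values, which matches the "fixed weight" use described above.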
Internal Signals
- `a_ff`: Flip-flop register that stores the current value of `a_in`.
- `b_ff`: Flip-flop register that holds the value of `b_in` captured while `enable_in` is active.
Submodule Diagram
Related Files
| File Name | Type |
|---|---|
| hs_npu_mac | Top |