SMX
Configuration
This compute unit implements the Kepler architecture. It consists of a single block with the following execution units. Additionally, this compute unit has multiple issue ports. Instructions scheduled onto separate issue ports can execute in parallel, but they require some instruction-level parallelism in the input.
Data type | Issue port | Execution rate |
FP32 | 0, 1, 2, 3, 4, 5 | 32 lanes, executing one operation per cycle |
---|---|---|
INT32 | 0, 1, 2, 3, 4, 5 | 32 lanes, executing one operation per cycle |
FP64 | 7, 8, 9, 10 | 16 lanes, executing one operation per cycle |
Block diagram
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
0
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
1
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
2
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
3
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
4
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
ALU
5
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
7
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
8
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
9
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
DP
10