GPU database

Ampere SM

Configuration

This compute unit implements the Ampere (HPC) architecture. It consists of 4 blocks, each containing the execution units listed below. The compute unit also has multiple issue ports: instructions scheduled onto separate issue ports can execute in parallel, provided the instruction stream contains enough instruction-level parallelism to keep both ports busy.

Data type   Issue port   Execution rate
FP32        0            16 lanes, executing one operation per cycle
FP64        0            8 lanes, executing one operation per cycle
INT32       1            16 lanes, executing one operation per cycle
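
The figures in the table can be turned into rough per-unit throughput numbers. The following Python sketch is an idealized model, not a description of the real scheduler: it assumes a warp width of 32 threads (standard for CUDA, but not stated in this entry), assumes that instructions sharing an issue port run back to back while different ports overlap fully, and ignores latency, dependencies and memory. The names `Unit`, `ops_per_cycle_per_sm`, `cycles_per_warp_instruction` and `block_cycles` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

BLOCKS_PER_SM = 4  # from the configuration text above
WARP_SIZE = 32     # assumption: standard CUDA warp width, not stated in this entry


@dataclass(frozen=True)
class Unit:
    port: int   # issue port the unit is attached to
    lanes: int  # lanes per block, one operation per lane per cycle


# Values copied from the table above.
UNITS = {
    "FP32": Unit(port=0, lanes=16),
    "FP64": Unit(port=0, lanes=8),
    "INT32": Unit(port=1, lanes=16),
}


def ops_per_cycle_per_sm(dtype: str) -> int:
    """Peak operations per cycle for the whole compute unit (all 4 blocks)."""
    return BLOCKS_PER_SM * UNITS[dtype].lanes


def cycles_per_warp_instruction(dtype: str) -> int:
    """Cycles one block needs to push a full warp through the given lanes."""
    return WARP_SIZE // UNITS[dtype].lanes


def block_cycles(instructions: list[str]) -> int:
    """Idealized lower bound for one block running independent warp instructions.

    Instructions on the same port are summed; different ports overlap,
    so the busiest port determines the total.
    """
    busy: dict[int, int] = {}
    for dtype in instructions:
        port = UNITS[dtype].port
        busy[port] = busy.get(port, 0) + cycles_per_warp_instruction(dtype)
    return max(busy.values(), default=0)


if __name__ == "__main__":
    for name in UNITS:
        print(name, ops_per_cycle_per_sm(name), "ops/cycle per compute unit")
    # FP32 and INT32 use different ports, so they can overlap in this model.
    print("FP32 + INT32:", block_cycles(["FP32", "INT32"]), "cycles")
    # FP32 and FP64 share port 0, so they serialize in this model.
    print("FP32 + FP64:", block_cycles(["FP32", "FP64"]), "cycles")
```

Under these assumptions an independent FP32/INT32 pair costs no more than either instruction alone, while an FP32/FP64 pair costs the sum of both, which is one way to read the remark that parallel execution needs instruction-level parallelism across ports.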

Block diagram

[Block diagram: four identical blocks. Each block has 16 FP lanes and 8 DP lanes on issue port 0, and 16 INT lanes on issue port 1.]

ASICs

The following ASICs use this compute unit: