GPU database

SM

Configuration

This compute unit implements the Fermi 1.0 architecture. It consists of a single block with the following execution units. It also has multiple issue ports: instructions scheduled onto separate issue ports can execute in parallel, provided the instruction stream contains enough instruction-level parallelism.

Data type   Issue port   Execution rate
FP32        0, 1         16 lanes, executing one operation per cycle, running at 2× base frequency
INT32       0, 1         16 lanes, executing one operation per cycle, running at 2× base frequency
FP64        2            16 lanes, executing one operation per cycle, running at 2× base frequency
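
The table's figures can be combined into a rough peak rate per base clock cycle. The following is a minimal sketch, not an official formula. It assumes the "16 lanes" figure applies to each listed issue port (the block diagram shows 16 units per port) and that all listed ports can be kept busy at once. The names `ExecUnit`, `UNITS`, and `peak_ops_per_base_cycle` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecUnit:
    data_type: str
    ports: tuple           # issue ports that accept this data type
    lanes: int             # lanes per port (assumption: table figure is per port)
    ops_per_cycle: int     # operations per lane per cycle
    clock_multiplier: int  # execution clock relative to base frequency


# Values transcribed from the table above.
UNITS = [
    ExecUnit("FP32", (0, 1), 16, 1, 2),
    ExecUnit("INT32", (0, 1), 16, 1, 2),
    ExecUnit("FP64", (2,), 16, 1, 2),
]


def peak_ops_per_base_cycle(unit):
    """Upper bound on operations per base clock cycle for one data type."""
    return len(unit.ports) * unit.lanes * unit.ops_per_cycle * unit.clock_multiplier


PEAK = {u.data_type: peak_ops_per_base_cycle(u) for u in UNITS}
```

FP32 and INT32 share ports 0 and 1, so their peaks are alternatives and cannot be added together. Exclusive ports may lower what is achievable in practice.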

Block diagram

Port 0: 16 × ALU
Port 1: 16 × ALU
Port 2: 16 × DP

This compute unit has exclusive ports, depicted with a thick border. An instruction issued to an exclusive port blocks co-issue to other ports.
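
The co-issue rule can be modeled as a small predicate. This is a sketch under stated assumptions: the text does not say which ports are exclusive (the thick border is visible only in the diagram), so the exclusive set is passed in as a parameter. The value `HYPOTHETICAL_EXCLUSIVE` is purely illustrative and `can_co_issue` is a hypothetical helper name.

```python
def can_co_issue(port_a, port_b, exclusive_ports):
    """Return True if instructions on port_a and port_b may issue together."""
    if port_a == port_b:
        # Assumption: a single port accepts one instruction at a time.
        return False
    # An instruction issued to an exclusive port blocks co-issue to other ports.
    return port_a not in exclusive_ports and port_b not in exclusive_ports


# Illustrative only: the source does not state which ports are exclusive.
HYPOTHETICAL_EXCLUSIVE = frozenset({2})
```

Under this illustrative setting, `can_co_issue(0, 1, HYPOTHETICAL_EXCLUSIVE)` is true, while any pairing that involves port 2 is rejected.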

ASICs

The following ASICs use this compute unit:

Cards

The following cards are built using this compute unit: