GPU database

VLIW4 SIMD Engine

Configuration

This compute unit implements the TeraScale 3 architecture. It consists of a single block with the following execution units.

Data type   Execution rate
FP32        VLIW unit with 16 lanes, executing 4 operations/cycle
INT32       VLIW unit with 16 lanes, executing 4 operations/cycle
FP64        16 lanes, executing 1 operation/cycle
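
As a rough illustration, the execution rates above can be turned into per-cycle throughput figures. The sketch below assumes that the "4 operations/cycle" figure applies to each of the 16 lanes (the usual reading of a VLIW4 design), which the table does not state explicitly. The names ExecutionUnit, ops_per_lane_per_cycle and peak_ops_per_second are hypothetical, and the clock speed and engine count are placeholder inputs, not values taken from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionUnit:
    """One row of the execution-rate table (hypothetical helper type)."""
    data_type: str
    lanes: int
    # Assumption: the listed rate is per lane, not for the whole unit.
    ops_per_lane_per_cycle: int

    def ops_per_cycle(self) -> int:
        # Total operations issued by all lanes in one clock cycle.
        return self.lanes * self.ops_per_lane_per_cycle


# Values copied from the table; the per-lane interpretation is assumed.
UNITS = [
    ExecutionUnit("FP32", lanes=16, ops_per_lane_per_cycle=4),
    ExecutionUnit("INT32", lanes=16, ops_per_lane_per_cycle=4),
    ExecutionUnit("FP64", lanes=16, ops_per_lane_per_cycle=1),
]


def peak_ops_per_second(unit: ExecutionUnit, clock_hz: int, engine_count: int = 1) -> int:
    """Peak operations per second for a given clock and number of engines.

    clock_hz and engine_count are caller-supplied; they depend on the
    specific ASIC and card and are not listed on this page.
    """
    return unit.ops_per_cycle() * clock_hz * engine_count


if __name__ == "__main__":
    for unit in UNITS:
        print(f"{unit.data_type}: {unit.ops_per_cycle()} ops/cycle")
```

Under these assumptions, FP32 and INT32 each reach 64 operations per cycle per engine and FP64 reaches 16. Counting a fused multiply-add as two floating-point operations would double the FP figures; that convention is left out here.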

Block diagram

16 × ALU
16 × DP

ASICs

The following ASICs use this compute unit:

Cards

The following cards are built using this compute unit: