The Case for a Cray on a Chip

Proc. SAICSIT-2023, Muldersdrift, South Africa, July 2023, pp 14–30

Philip Machanick

Erratum: Figure 2 in the published version had the wrong caption; corrected here.

Moore’s Law is usually interpreted as a prediction of how many transistors you can buy for the same money at some future date. It can also be interpreted in reverse: how long you need to wait until a given number of transistors falls below a target price. Examples of this reverse application of Moore’s Law include transitions such as the emergence of microprocessors competitive with traditional larger-scale computers and the emergence of smartphones. Since the late 1990s, it has become increasingly common for growth in transistors to equate to more CPUs (cores) per die. Recent designs have over 50 billion transistors and far more potential parallelism than can be supported by memory. I argue the case for a rebalancing of design goals with a much larger, faster on-chip memory and a CPU that is designed around this memory system. The proposal: a Cray-class vector CPU on a die with 1 Gibyte of static RAM, or Crayon (for Cray on a chip). The kind of organization classically used by Cray vector supercomputers is feasible to achieve on a single chip. I argue that a design like this can use the available memory bandwidth, as opposed to over-CPU designs with a large number of cores and GPU threads that are memory limited, and I propose how such a design could be used.
