From 49ac8ad13782dccc02798f3ea35a1e85b42448aa Mon Sep 17 00:00:00 2001
From: Spencer Tipping
Date: Sat, 15 Mar 2025 09:51:26 -0700
Subject: [PATCH] Clarify reasoning behind reciprocal throughput as a unit

---
 content/english/hpc/pipelining/tables.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/english/hpc/pipelining/tables.md b/content/english/hpc/pipelining/tables.md
index ad90c400..5c6e9eba 100644
--- a/content/english/hpc/pipelining/tables.md
+++ b/content/english/hpc/pipelining/tables.md
@@ -30,7 +30,7 @@ You can get latency and throughput numbers for a specific architecture from spec
 
 Some comments:
 
-- Because our minds are so used to the cost model where "more" means "worse," people mostly use *reciprocals* of throughput instead of throughput.
+- Reciprocal throughput (a time, in cycles per instruction) is generally used instead of throughput (a rate, in instructions per cycle) because per-instruction times add up linearly when estimating the cost of an instruction sequence, while rates do not. Reciprocal throughput also matches the familiar intuition that a higher number means a slower instruction.
 - If a certain instruction is especially frequent, its execution unit could be duplicated to increase its throughput — possibly to even more than one, but not higher than the [decode width](/hpc/architecture/layout).
 - Some instructions have a latency of 0. This means that these instruction are used to control the scheduler and don't reach the execution stage. They still have non-zero reciprocal throughput because the [CPU front-end](/hpc/architecture/layout) still needs to process them.
 - Most instructions are pipelined, and if they have the reciprocal throughput of $n$, this usually means that their execution unit can take another instruction after $n$ cycles (and if it is below 1, this means that there are multiple execution units, all capable of taking another instruction on the next cycle). One notable exception is [integer division](/hpc/arithmetic/division): it is either very poorly pipelined or not pipelined at all.