algorithmica-org · spencertipping · Mar 15, 2025
diff --git a/content/english/hpc/pipelining/tables.md b/content/english/hpc/pipelining/tables.md
@@ -30,7 +30,7 @@ You can get latency and throughput numbers for a specific architecture from spec
 
 Some comments:
 
-- Because our minds are so used to the cost model where "more" means "worse," people mostly use *reciprocals* of throughput instead of throughput.
+- Reciprocal throughput (a unit of time, measured in cycles per instruction) is generally used instead of throughput (a rate, measured in instructions per cycle) because times add up linearly and rates do not. Reciprocal throughput also matches the natural intuition that higher values mean lower performance.
 - If a certain instruction is especially frequent, its execution unit could be duplicated to increase its throughput — possibly to even more than one, but not higher than the [decode width](/hpc/architecture/layout).
 - Some instructions have a latency of 0. This means that these instructions are used to control the scheduler and don't reach the execution stage. They still have non-zero reciprocal throughput because the [CPU front-end](/hpc/architecture/layout) still needs to process them.
 - Most instructions are pipelined, and if they have the reciprocal throughput of $n$, this usually means that their execution unit can take another instruction after $n$ cycles (and if it is below 1, this means that there are multiple execution units, all capable of taking another instruction on the next cycle). One notable exception is [integer division](/hpc/arithmetic/division): it is either very poorly pipelined or not pipelined at all.
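
To make the difference between latency and reciprocal throughput in the bullets above concrete, here is a minimal sketch of a cost model. It assumes a single, perfectly pipelined execution unit and ignores everything else (decode width, port contention, memory). The function names `dependent_chain_cycles` and `independent_cycles` are hypothetical helpers made up for illustration, not part of any library, and the numbers they return are rough estimates rather than measurements.

```cpp
#include <cstdint>

// Hypothetical helper: estimated cycles for n instructions where each one
// needs the result of the previous one. Nothing can overlap, so the total
// is simply n times the latency.
constexpr double dependent_chain_cycles(std::uint64_t n, double latency) {
    return static_cast<double>(n) * latency;
}

// Hypothetical helper: estimated cycles for n independent instructions.
// A new instruction is issued every `rthroughput` cycles, and the last one
// completes `latency` cycles after it was issued.
// Assumption: a value of rthroughput below 1 stands for several identical
// execution units working in parallel.
constexpr double independent_cycles(std::uint64_t n, double latency,
                                    double rthroughput) {
    return n == 0 ? 0.0
                  : static_cast<double>(n - 1) * rthroughput + latency;
}
```

Under this simplified model, 100 instructions with a latency of 4 and a reciprocal throughput of 1 take about 400 cycles as a dependency chain, but only about 103 cycles when they are independent. Because reciprocal throughput is a time, it enters the estimate as a plain multiplication and sum, which is the linearity the changed bullet refers to.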