Skip to content

Commit 5a239f6

Browse files
authored
[10.0.0]: x64: Remove recursion in to_amode helper (#6997)
* x64: Remove recursion in `to_amode` helper This commit removes the recursion present in the x64 backend's `to_amode` and `to_amode_add` helpers. The recursion is currently unbounded and controlled by user input meaning it's not too hard to craft user input which triggers stack overflow in the host. By removing recursion there's no need to worry about this since the stack depth will never be hit. The main concern with removing recursion is that code quality may not be quite as good any more. The purpose of the recursion here is to "hunt for constants" and update the immediate `Offset32`, and now without recursion only at most one constant is found and folded instead of an arbitrary number of constants as before. This should continue to produce the same code as before so long as optimizations are enabled, but without optimizations this will produce worse code than before. Note with a hypothetical mid-end optimization that does this constant folding for us the rules here could be further simplified to purely consider the shape of the input `Value` to amode computation without considering constants at all. * Perform a `--locked` install of `cargo vet`
1 parent a45abad commit 5a239f6

File tree

3 files changed

+103
-43
lines changed

3 files changed

+103
-43
lines changed

.github/workflows/main.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -354,7 +354,7 @@ jobs:
354354
- uses: ./.github/actions/install-rust
355355
with:
356356
toolchain: nightly-2023-03-20
357-
- run: cargo install cargo-fuzz --vers "^0.11"
357+
- run: cargo install cargo-fuzz --vers "^0.11" --locked
358358
# Install the OCaml packages necessary for fuzz targets that use the
359359
# `wasm-spec-interpreter`.
360360
- run: sudo apt install -y ocaml-nox ocamlbuild ocaml-findlib libzarith-ocaml-dev

cranelift/codegen/src/isa/x64/inst.isle

Lines changed: 79 additions & 42 deletions
Original file line numberDiff line numberDiff line change
@@ -1025,56 +1025,93 @@
10251025
;; Converts a `Value` and a static offset into an `Amode` for x64, attempting
10261026
;; to be as fancy as possible with offsets/registers/shifts/etc to make maximal
10271027
;; use of the x64 addressing modes.
1028+
;;
1029+
;; This is a bit subtle unfortunately due to a few constraints. This function
1030+
;; was originally written recursively but that can lead to stack overflow
1031+
;; for certain inputs due to the recursion being defined by user-controlled
1032+
;; input. This means that nowadays this function is not recursive and has a
1033+
;; specific structure to handle that.
1034+
;;
1035+
;; Additionally currently in CLIF all loads/stores have an `Offset32` immediate
1036+
;; to go with them, but the wasm lowering to CLIF doesn't use this meaning that
1037+
;; it's frequently 0. Additionally mid-end optimizations do not fold `iconst`
1038+
;; values into this `Offset32`, meaning that it's left up to backends to hunt
1039+
;; for constants for good codegen. That means that one important aspect of this
1040+
;; function is that it searches for constants to fold into the `Offset32` to
1041+
;; avoid unnecessary instructions.
1042+
;;
1043+
;; Note, though, that the "optimal addressing modes" are only guaranteed to be
1044+
;; generated if egraph-based optimizations have run. For example this will only
1045+
;; attempt to find one constant as opposed to many, and that'll only happen
1046+
;; with constant folding from optimizations.
1047+
;;
1048+
;; Finally there's two primary entry points for this function. One is this
1049+
;; function here, `to_amode,` and another is `to_amode_add`. The latter is used
1050+
;; by the lowering of `iadd` in the x64 backend to use the `lea` instruction
1051+
;; where the input is two `Value` operands instead of just one. Most of the
1052+
;; logic here is then deferred through `to_amode_add`.
1053+
;;
1054+
;; In the future if mid-end optimizations fold constants into `Offset32` then
1055+
;; this in theory can "simply" delegate to the `amode_imm_reg` helper, and
1056+
;; below can delegate to `amode_imm_reg_reg_shift`, or something like that.
10281057
(decl to_amode (MemFlags Value Offset32) Amode)
1029-
1030-
;; Base case, "just put it in a register"
1031-
(rule (to_amode flags base offset)
1032-
(Amode.ImmReg offset base flags))
1033-
1034-
;; Slightly-more-fancy case, if the address is the addition of two things then
1035-
;; delegate to the `to_amode_add` helper.
1058+
(rule 0 (to_amode flags base offset)
1059+
(amode_imm_reg flags base offset))
10361060
(rule 1 (to_amode flags (iadd x y) offset)
1037-
(to_amode_add flags x y offset))
1061+
(to_amode_add flags x y offset))
10381062

10391063
;; Same as `to_amode`, except that the base address is computed via the addition
10401064
;; of the two `Value` arguments provided.
1041-
(decl to_amode_add (MemFlags Value Value Offset32) Amode)
1042-
1043-
;; Base case, "just put things in registers". Note that the shift value of 0
1044-
;; here means `x + (y << 0)` which is the same as `x + y`.
1045-
(rule (to_amode_add flags x y offset)
1046-
(Amode.ImmRegRegShift offset x y 0 flags))
1047-
1048-
;; If the one of the arguments being added is itself a constant shift then
1049-
;; that can be modeled directly so long as the shift is a modestly small amount.
1050-
(rule 1 (to_amode_add flags x (ishl y (iconst (uimm8 shift))) offset)
1051-
(if (u32_lteq (u8_as_u32 shift) 3))
1052-
(Amode.ImmRegRegShift offset x y shift flags))
1053-
(rule 2 (to_amode_add flags (ishl y (iconst (uimm8 shift))) x offset)
1054-
(if (u32_lteq (u8_as_u32 shift) 3))
1055-
(Amode.ImmRegRegShift offset x y shift flags))
1056-
1057-
;; Constant extraction rules.
10581065
;;
1059-
;; These rules attempt to find a constant within one of `x` or `y`, or deeper
1060-
;; within them if they have their own adds. These only succeed if the constant
1061-
;; itself can be represented with 32-bits and can be infallibly added to the
1062-
;; offset that we already have.
1066+
;; The primary purpose of this is to hunt for constants within the two `Value`
1067+
;; operands provided. Failing that this will defer to `amode_imm_reg` or
1068+
;; `amode_imm_reg_reg_shift` which is the final step in amode lowering and
1069+
;; performs final pattern matches related to shifts to see if that can be
1070+
;; peeled out into the amode.
10631071
;;
1064-
;; Note the recursion here where this rule is defined in terms of itself to
1065-
;; "peel" layers of constants.
1072+
;; In other words this function's job is to find constants and then defer to
1073+
;; `amode_imm_reg*`.
1074+
(decl to_amode_add (MemFlags Value Value Offset32) Amode)
1075+
1076+
(rule 0 (to_amode_add flags x y offset)
1077+
(amode_imm_reg_reg_shift flags x y offset))
1078+
(rule 1 (to_amode_add flags x (iconst (simm32 c)) offset)
1079+
(if-let sum (s32_add_fallible offset c))
1080+
(amode_imm_reg flags x sum))
1081+
(rule 2 (to_amode_add flags (iconst (simm32 c)) x offset)
1082+
(if-let sum (s32_add_fallible offset c))
1083+
(amode_imm_reg flags x sum))
10661084
(rule 3 (to_amode_add flags (iadd x (iconst (simm32 c))) y offset)
1067-
(if-let sum (s32_add_fallible offset c))
1068-
(to_amode_add flags x y sum))
1069-
(rule 4 (to_amode_add flags x (iadd y (iconst (simm32 c))) offset)
1070-
(if-let sum (s32_add_fallible offset c))
1071-
(to_amode_add flags x y sum))
1072-
(rule 5 (to_amode_add flags x (iconst (simm32 c)) offset)
1073-
(if-let sum (s32_add_fallible offset c))
1074-
(to_amode flags x sum))
1075-
(rule 6 (to_amode_add flags (iconst (simm32 c)) x offset)
1076-
(if-let sum (s32_add_fallible offset c))
1077-
(to_amode flags x sum))
1085+
(if-let sum (s32_add_fallible offset c))
1086+
(amode_imm_reg_reg_shift flags x y sum))
1087+
(rule 4 (to_amode_add flags (iadd (iconst (simm32 c)) x) y offset)
1088+
(if-let sum (s32_add_fallible offset c))
1089+
(amode_imm_reg_reg_shift flags x y sum))
1090+
(rule 5 (to_amode_add flags x (iadd y (iconst (simm32 c))) offset)
1091+
(if-let sum (s32_add_fallible offset c))
1092+
(amode_imm_reg_reg_shift flags x y sum))
1093+
(rule 6 (to_amode_add flags x (iadd (iconst (simm32 c)) y) offset)
1094+
(if-let sum (s32_add_fallible offset c))
1095+
(amode_imm_reg_reg_shift flags x y sum))
1096+
1097+
;; Final cases of amode lowering. Does not hunt for constants and only attempts
1098+
;; to pattern match add-of-shifts to generate fancier `ImmRegRegShift` modes,
1099+
;; otherwise falls back on `ImmReg`.
1100+
(decl amode_imm_reg (MemFlags Value Offset32) Amode)
1101+
(rule 0 (amode_imm_reg flags base offset)
1102+
(Amode.ImmReg offset base flags))
1103+
(rule 1 (amode_imm_reg flags (iadd x y) offset)
1104+
(amode_imm_reg_reg_shift flags x y offset))
1105+
1106+
(decl amode_imm_reg_reg_shift (MemFlags Value Value Offset32) Amode)
1107+
(rule 0 (amode_imm_reg_reg_shift flags x y offset)
1108+
(Amode.ImmRegRegShift offset x y 0 flags)) ;; 0 == y<<0 == "no shift"
1109+
(rule 1 (amode_imm_reg_reg_shift flags x (ishl y (iconst (uimm8 shift))) offset)
1110+
(if (u32_lteq (u8_as_u32 shift) 3))
1111+
(Amode.ImmRegRegShift offset x y shift flags))
1112+
(rule 2 (amode_imm_reg_reg_shift flags (ishl y (iconst (uimm8 shift))) x offset)
1113+
(if (u32_lteq (u8_as_u32 shift) 3))
1114+
(Amode.ImmRegRegShift offset x y shift flags))
10781115

10791116
;; Offsetting an Amode. Used when we need to do consecutive
10801117
;; loads/stores to adjacent addresses.

tests/all/module.rs

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -216,3 +216,26 @@ fn missing_sse_and_floats_still_works() -> Result<()> {
216216

217217
Ok(())
218218
}
219+
220+
#[test]
221+
#[cfg_attr(miri, ignore)]
222+
fn large_add_chain_no_stack_overflow() -> Result<()> {
223+
let mut config = Config::new();
224+
config.cranelift_opt_level(OptLevel::None);
225+
let engine = Engine::new(&config)?;
226+
let mut wat = String::from(
227+
"
228+
(module
229+
(func (result i64)
230+
(i64.const 1)
231+
",
232+
);
233+
for _ in 0..20_000 {
234+
wat.push_str("(i64.add (i64.const 1))\n");
235+
}
236+
237+
wat.push_str(")\n)");
238+
Module::new(&engine, &wat)?;
239+
240+
Ok(())
241+
}

0 commit comments

Comments
 (0)