Description
Running circt-synth on an 8-bit adder with and without the carry-out, we see an unexpected improvement in the longest path when the carry-out is computed:
hw.module @add_two_no_carry(in %a : i8, in %b : i8, out sum : i8) {
%0 = comb.add %a, %b : i8
hw.output %0 : i8
}
hw.module @add_two_with_carry(in %a : i8, in %b : i8, out sum : i9) {
%false = hw.constant false
%0 = comb.concat %false, %a : i1, i8
%1 = comb.concat %false, %b : i1, i8
%2 = comb.add %0, %1 : i9
hw.output %2 : i9
}

add_two_no_carry - Maximum Path Delay = 16
add_two_with_carry - Maximum Path Delay = 14
It's reasonable to expect add_two_no_carry to have better or equal delay, since we are computing a subset of the bits. However, inspecting the designs, we see that the comb canonicalizers greedily fold Xor operations whose operands have very different arrival times, which leads to poor lowering choices. With the tryFlatteningOperands pattern removed from XorOp::canonicalize, add_two_no_carry matches the delay-14 path.
I guess this is a challenge of mixing timing-aware and timing-unaware passes, but are there any suggestions on how we might resolve it in general?
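To illustrate why flattening can hurt, here is a toy model in Python, not CIRCT code. It assumes a unit delay per two-input XOR and made-up arrival times. The function names `balanced_depth` and `timing_aware_depth` are hypothetical, and the numbers are unrelated to the 16 and 14 reported above. The sketch compares reducing a flattened operand list pairwise in operand order against always combining the two earliest-arriving signals first.

```python
import heapq


def balanced_depth(arrivals):
    """Reduce operands pairwise in the given order, ignoring arrival times.

    Toy model: each two-input XOR adds one unit of delay.
    """
    level = list(arrivals)
    while len(level) > 1:
        nxt = [max(level[i], level[i + 1]) + 1
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An odd operand passes through to the next level unchanged.
            nxt.append(level[-1])
        level = nxt
    return level[0]


def timing_aware_depth(arrivals):
    """Always combine the two earliest-arriving signals first."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        heapq.heappush(heap, max(a, b) + 1)
    return heap[0]


# Hypothetical arrival times: one late operand and three early ones.
arrivals = [6, 0, 0, 0]
print(balanced_depth(arrivals))      # 8
print(timing_aware_depth(arrivals))  # 7
```

In this model the late operand is paired immediately in the order-based reduction and then passes through a second XOR, while the timing-aware reduction merges it last. A nested, unflattened chain that already places the late operand at the outermost XOR would also reach 7, which is the structure that flattening discards.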