Skip to content

Conversation

@uenoku
Copy link
Member

@uenoku uenoku commented Oct 2, 2025

This pass performs two main optimizations on mux chains: enhanced mux
chain folding that converts chains of muxes with index comparisons into
balanced mux tree, and priority encoder rebalancing that transforms linear
chains into balanced tree structures reducing depth from O(n) to O(log n).
1.

# Before: Linear chain (O(n) depth)
result = cond0 ? val0 : (cond1 ? val1 : (cond2 ? val2 : default))

# After: Balanced tree (O(log n) depth)  
left = cond0 ? val0 : val1
or_cond = cond0 | cond1
result = or_cond ? left : right
# Before: Mux chain with comparisons
result = (index == 0) ? val0 : ((index == 1) ? val1 : ...)

# After: Balanced mux tree
result = index[n-1] ? (index[n-2] ? (index[n-3]? ...:...))

Unlike the FIRRTL pipeline which avoids mux balancing to preserve design
intent (chipsalliance/chisel#1199), the synthesis
pipeline implements these optimizations since it's focused on circuit
optimization rather than maintaining the original design structure.

The pass handles both false-side and true-side chain patterns and includes
comprehensive test cases covering priority encoding, duplicate conditions,
and index comparison folding.

@uenoku uenoku requested a review from darthscsi as a code owner October 2, 2025 06:29
@uenoku uenoku force-pushed the dev/hidetou/mux-balance branch 3 times, most recently from b8a6f8a to caeb6fb Compare October 2, 2025 09:07
This pass performs two main optimizations on mux chains: enhanced mux chain
folding that converts chains of muxes with index comparisons into array
operations, and priority encoder rebalancing that transforms linear chains
into balanced tree structures reducing depth from O(n) to O(log n).

The pass handles both false-side and true-side chain patterns and includes
comprehensive test cases covering priority encoding, duplicate conditions,
and index comparison folding.
@uenoku uenoku force-pushed the dev/hidetou/mux-balance branch from caeb6fb to c7011fb Compare October 2, 2025 09:15
Copy link
Contributor

@fabianschuiki fabianschuiki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Very useful pass to have 💯

Comment on lines +89 to +99
/// Enum for mux chain folding styles.
enum MuxChainWithComparisonFoldingStyle { None, BalancedMuxTree, ArrayGet };
/// Mux chain folding that converts chains of muxes with index
/// comparisons into array operations or balanced mux trees. `styleFn` is a
/// callback that returns the desired folding style based on the index
/// width and number of entries.
bool foldMuxChainWithComparison(
PatternRewriter &rewriter, MuxOp rootMux, bool isFalseSide,
llvm::function_ref<MuxChainWithComparisonFoldingStyle(size_t indexWidth,
size_t numEntries)>
styleFn);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea to have this as a utility on the Comb dialect 💯

Comment on lines +126 to +128
LDBG() << "Rebalanced priority mux with " << conditions.size()
<< " conditions, using " << (useFalseSideChain ? "false" : "true")
<< "-side chain.\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never seen LDBG() before. Very handy! 😄

@cowardsa
Copy link
Contributor

cowardsa commented Oct 2, 2025

@uenoku - didn't see any timing guidance after a brief scan - did you plan to integrate the timing analysis into this pass? As the discussion you pointed to highlighted the dependence on arrival times for an optimal mux tree structure?

@uenoku
Copy link
Member Author

uenoku commented Oct 2, 2025

Yes the current code structure would not be too hard to integrate timing analysis. Will do that in a follow-up. For priority mux we can choose ideal separation points based on timing, for index comparison it's a bit trickier since it's necessary to know don't care bits of index.

Copy link
Contributor

@cowardsa cowardsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - nice work and certainly seems like a valuable addition to the synthesis pipeline.

Only potential more major suggestion - could add circt-lec tests and longest-path-analysis to check the reduction in levels is as expected i.e. O(n) -> O(log n)?

Comment on lines +1977 to +1979
/// `array_get (array_create(A, B, C), VAL)` or a balanced mux tree which is far
/// more compact and allows synthesis tools to do more interesting
/// optimizations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of interest - Synopsys documentation favours priority encoders: see slide 88

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yeah. I wish there is a SV builtin for priority mux so that we can capture design intents. Maybe eventually we also want to introduce an operation for priority mux.

// a tremendous number of replicated entries in the array. Some sparsity is
// ok though, so we require the table to be at least 5/8 utilized.
uint64_t tableSize = 1ULL << indexWidth;
if (numEntries >= tableSize * 5 / 8)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Interesting - is there any data to support this 5/8 ths? If not perhaps worth adding a todo to benchmark - someone might pick it up?

Copy link
Member Author

@uenoku uenoku Oct 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter was introduced here df37fa1 a while ago. Not sure the rational behind this and I think it's fine to change the behaivor. This is general Comb folder where creating a balance tree may not be always ideal (especially when emitting SystemVerilog, it would be easier for commertial tools to pattern match as well).

Comment on lines +67 to +70
// Balance mux chains. For area oriented flow, we want to keep the mux chains
// unless they are very deep.
comb::BalanceMuxOptions balanceOptions{OptimizationStrategyTiming ? 16 : 64};
pm.addPass(comb::createBalanceMux(balanceOptions));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@uenoku uenoku force-pushed the dev/hidetou/mux-balance branch from 52dad7a to bf531be Compare October 3, 2025 17:38
@uenoku uenoku merged commit e589add into llvm:main Oct 3, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants