UPSTREAM PR #18106: model : add ASR support for LFM2-Audio-1.5B (conformer) #592
loci-dev wants to merge 9 commits into
Explore the complete analysis inside the Version Insights

Performance Analysis Summary - PR #592

Overview
PR #592 adds LFM2-Audio-1.5B Conformer architecture support for ASR. The changes introduce 640 additions across 17 files, primarily in the MTMD (multimodal) module. Performance impact is isolated to

Key Findings

Impacted Functions in Performance-Critical Areas
The analysis reveals degradation concentrated in STL container operations within the MTMD module:

Most-Impacted Functions (Absolute Change):
Root Cause: The

Impact on Inference Performance (Tokens per Second)
Core inference functions remain unaffected. Analysis of critical inference paths shows:
Tokens per second impact: 0%
The degradation is confined to MTMD module initialization (model loading phase), not the inference loop. Using the reference that 2 ms slower

Power Consumption Analysis
Impacted Binary:
Unaffected Binaries:
The power increase stems from cumulative throughput time increases in STL operations during MTMD model initialization. The 0.902% increase represents the energy cost of initializing larger data structures and loading additional tensors (21 per layer) for Conformer models.

Code Changes Context
The performance changes reflect intentional feature additions:
The degradation is proportional to the structural complexity added: each Conformer layer requires 21 additional tensor pointers, directly explaining the +48 ns constructor overhead and cascading STL operation costs.
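To make the link between extra tensor pointers and constructor cost concrete, here is a minimal sketch. It is not the PR's actual code: `tensor_stub`, `layer_base`, `layer_conformer`, the count of 8 baseline pointers, and both helper functions are hypothetical names and values chosen for illustration. Only the figure of 21 additional pointers per layer comes from the analysis above. The sketch shows how much extra memory has to be zero-initialized per layer when a container of layers is constructed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor type; only pointers to it are stored.
struct tensor_stub;

// Hypothetical baseline layer holding a few weight pointers (8 is an assumed count).
struct layer_base {
    tensor_stub * weights[8] = {};
};

// Hypothetical Conformer-capable layer: the same members plus 21 extra pointers,
// matching the "21 additional tensor pointers" figure quoted in the analysis.
struct layer_conformer {
    tensor_stub * weights[8] = {};
    tensor_stub * conformer_weights[21] = {};
};

// Extra bytes that must be zero-initialized for every constructed layer.
constexpr std::size_t extra_bytes_per_layer() {
    return sizeof(layer_conformer) - sizeof(layer_base);
}

// Extra bytes touched when a vector of n_layers layers is constructed.
inline std::size_t extra_bytes_total(std::size_t n_layers) {
    std::vector<layer_conformer> layers(n_layers); // value-initializes every pointer
    return layers.size() * extra_bytes_per_layer();
}
```

Under these assumptions the per-layer growth is 21 pointer-sized slots, and the total scales linearly with the number of layers. This is consistent with overhead that appears once at model load and not in the inference loop, though it does not by itself account for the measured +48 ns.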
a014a6b to eda9f43
15838f1 to 006b713
Mirrored from ggml-org/llama.cpp#18106
Supersedes ggml-org/llama.cpp#17694