Description
We have started hitting the following issue with the CI coverage job when we recently upgraded to GCC 15:
```
lcov: ERROR: (inconsistent) "/__w/mrdocs/third-party/llvm-project/install/include/clang/AST/DeclCXX.h":2795: function '_ZN5clang18CXXConstructorDecl16getCanonicalDeclEv' is not hit but line 2796 is.
	To skip consistency checks, see the 'check_data_consistency' section in man lcovrc(5).
	(use "lcov --ignore-errors inconsistent ..." to bypass this error)
```
The upgrade happened in #933; the problem started after that PR was merged.
We worked around this problem in #938 by simply disabling the consistency check, as the error message suggests.
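For reference, the workaround amounts to relaxing lcov's consistency checking, either per invocation or project-wide via lcovrc. This is a sketch, not the exact wiring used in #938; the capture directory and output file names are placeholders:

```shell
# Option 1: per invocation, as the error message suggests
lcov --capture --directory build --output-file coverage.info \
     --ignore-errors inconsistent

# Option 2: project-wide, in an .lcovrc file (see man lcovrc(5))
# check_data_consistency = 0
```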
After a preliminary investigation, we ruled out a GCC 14 / GCC 15 interop problem by cleaning the caches and building everything fresh with GCC 15.
We would like to remove this workaround: even though the inconsistency is happening in the LLVM coverage data, which is not very important, the workaround may mask future problems in the mrdocs coverage data itself.
According to the GCC docs, the optimization level does affect the quality of the coverage data, and this job does use an optimized build. It may be worth exploring what happens with an unoptimized build. If optimization is indeed the cause, this may point towards an issue worth filing on the GCC bug tracker, given how user-hostile the error is for what is a supported feature (coverage on optimized builds).
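A minimal experiment along these lines would be to rebuild with coverage instrumentation but without optimization; with GCC that is the `--coverage` flag (equivalent to `-fprofile-arcs -ftest-coverage`) combined with `-O0`. The file names below are placeholders, not the actual CI setup:

```shell
# Optimized coverage build (roughly what the CI job does today)
g++ -O2 --coverage -c example.cpp -o example.o

# Unoptimized coverage build, for comparison
g++ -O0 --coverage -c example.cpp -o example.o
```

If the lcov inconsistency disappears at `-O0`, that would be strong evidence that the optimizer is producing coverage records lcov considers contradictory.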
On the other hand, it may be worth exploring moving this job to Clang instead. Clang takes a different approach, where the coverage data is
not affected by optimization levels: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html
This likely implies that Clang does not optimize coverage-enabled compilations as well as GCC does, since GCC compromises coverage quality in favor of optimization.
But this seems like the right tradeoff: the coverage-enabled build is for measuring coverage, not normal use, and the weaker optimization
would only be a problem if it made the job take many times longer to execute. Worse coverage data quality, on the other hand, may waste engineering time.
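For comparison, Clang's source-based coverage uses its own instrumentation and tooling rather than gcov/lcov; the workflow documented at the link above looks roughly like this (binary and file names here are placeholders):

```shell
# Build with Clang's source-based coverage instrumentation
clang++ -fprofile-instr-generate -fcoverage-mapping example.cpp -o example

# Run the instrumented binary; the raw profile path is set via LLVM_PROFILE_FILE
LLVM_PROFILE_FILE="example.profraw" ./example

# Index the raw profile, then produce a report
llvm-profdata merge -sparse example.profraw -o example.profdata
llvm-cov report ./example -instr-profile=example.profdata
```

Note that switching would also mean replacing the lcov-based reporting step, although `llvm-cov export -format=lcov` can emit lcov-compatible tracefiles if we want to keep the downstream tooling.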
Interestingly though, the premerge pipeline for the first PR did pass, which is a problem on its own.
We need to make sure this failure reproduces in premerge as well, so that any fix can be properly tested.