Overlay: check query packs for compatibility #2993

Open: cklin wants to merge 12 commits into main from cklin/overlay-pack-check

Conversation

cklin
Contributor

@cklin cklin commented Jul 25, 2025

This PR updates the init action to prevent overlay analysis (or overlay-base database construction) when one of the query packs involved does not support overlay analysis with the installed CodeQL CLI.

  • @alexet Please review the aspects concerning interaction with the CodeQL bundle
  • @mbg Please review all other aspects of the change

Merge / deployment checklist

  • Confirm this change is backwards compatible with existing workflows.
  • Confirm the readme has been updated if necessary.
  • Confirm the changelog has been updated if necessary.

@cklin cklin force-pushed the cklin/overlay-pack-check branch 4 times, most recently from 230fc32 to d0d4112 Compare July 29, 2025 18:58
@cklin cklin marked this pull request as ready for review July 29, 2025 19:20
@Copilot Copilot AI review requested due to automatic review settings July 29, 2025 19:20
@cklin cklin requested a review from a team as a code owner July 29, 2025 19:20
@cklin cklin requested review from mbg and alexet July 29, 2025 19:20

Contributor

@alexet alexet left a comment

It doesn't feel to me that resolve packs is quite the right command (although it looks like it will work). It feels like we should run resolve queries with --format startingpacks to find the query packs we care about, and then just use those.

But I am also not fully sure of all the subtleties of either choice.

@cklin
Contributor Author

cklin commented Jul 31, 2025

> It doesn't feel to me that resolve packs is quite the right command (although it looks like it will work). It feels like we should run resolve queries with --format startingpacks to find the query packs we care about, and then just use those.
>
> But I am also not fully sure of all the subtleties of either choice.

Thanks for the suggestion!

When I first looked into the problem, I considered this approach but was unable to get it to work. While codeql resolve queries with --format startingpacks can map queries to their containing query packs, there does not seem to be a way to get the list of queries that would be used in the analysis and feed that into codeql resolve queries. I could see that codeql database init already computes that information (through a deep-plumbing subcommand) for internal use, but there is no easy way to obtain that information from outside the CodeQL CLI. Re-interpreting the Code Scanning config file in the action seems like a lot of work for relatively little gain.

Prompted by your suggestion, I looked again. This time I noticed that codeql database init actually generates a config-queries.qls suite file with the list of queries, so all I need to do is feed that suite file into codeql resolve queries, and I can get the list of query packs to check for overlay compatibility.

(There is the slight complication that codeql resolve queries --format startingpacks would also return query packs that have not been compiled. I will have to make the action ignore those query packs when checking for overlay compatibility.)
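
A rough sketch of the two steps described above, for illustration only. The function names are hypothetical, and `isCompiledPack` is a stand-in for whatever check the action actually uses to recognise a compiled pack; only the `resolve queries` subcommand and the `--format startingpacks` option come from the discussion.

```typescript
// Sketch only: these helpers are hypothetical and are not the action's code.

// Build the argument list for `codeql resolve queries`, feeding it the
// config-queries.qls suite file generated by `codeql database init`.
function startingPacksArgs(suitePath: string): string[] {
  return ["resolve", "queries", "--format", "startingpacks", suitePath];
}

// Drop the query packs that have not been compiled, since only compiled
// packs are checked for overlay compatibility. The predicate is injected
// because the real check is not shown in this thread.
function selectCompiledPacks(
  packDirs: string[],
  isCompiledPack: (packDir: string) => boolean,
): string[] {
  return packDirs.filter(isCompiledPack);
}
```

Keeping the filter separate from the CLI call means the "ignore uncompiled packs" rule can be tested without invoking the CodeQL CLI.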

@cklin cklin marked this pull request as draft July 31, 2025 16:06
@cklin cklin force-pushed the cklin/overlay-pack-check branch from d0d4112 to f1a0360 Compare July 31, 2025 18:31
@cklin
Contributor Author

cklin commented Jul 31, 2025

> It doesn't feel to me that resolve packs is quite the right command (although it looks like it will work). It feels like we should run resolve queries with --format startingpacks to find the query packs we care about, and then just use those.

I updated the PR to use this approach to identify the query packs to check for overlay compatibility. Related changes:

  • I removed the commit that adds resolvePacks() to the CodeQL interface
  • Instead, there is now a commit to add resolveQueriesStartingPacks()
  • To avoid confusion, I removed the unused resolveQueries() from CodeQL
  • While I am at it, I also removed the unused packDownload() from CodeQL

The other commits remain the same as before. PTAL.

@cklin cklin marked this pull request as ready for review July 31, 2025 20:38
@cklin cklin requested review from alexet and Copilot July 31, 2025 20:38
@cklin cklin force-pushed the cklin/overlay-pack-check branch from f1a0360 to 119d629 Compare August 5, 2025 18:03

@cklin
Contributor Author

cklin commented Aug 5, 2025

The latest force push updated CODEQL_OVERLAY_MINIMUM_VERSION to 2.22.3 at Alex's request.

Member

@mbg mbg left a comment

As always, thanks for taking on this work! I have added a few suggestions, questions, and points for discussion.

@@ -218,6 +218,7 @@ export interface CodeQL {
export interface VersionInfo {
  version: string;
  features?: { [name: string]: boolean };
  overlayVersion?: number;
Member

Minor: I know the other fields here don't have documentation, but perhaps it would be useful to add a comment for overlayVersion to document what purpose it serves so that it's easy to look up when working on the action code.
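
One possible shape for such a comment, as a sketch. The wording of the doc comment is an assumption based on how the property is described elsewhere in this thread, and `cliReportsOverlayVersion` is a hypothetical helper added only to show the field in use.

```typescript
interface VersionInfo {
  version: string;
  features?: { [name: string]: boolean };
  /**
   * The overlay version supported by this CodeQL CLI, if it reports one.
   * A query pack is treated as compatible with overlay analysis only when
   * the overlay version it was compiled with equals this value.
   * (Hypothetical wording, not the comment that was actually added.)
   */
  overlayVersion?: number;
}

// Hypothetical helper: true when the CLI reports an overlay version at all.
function cliReportsOverlayVersion(info: VersionInfo): boolean {
  return typeof info.overlayVersion === "number";
}
```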

Contributor Author

Done.

const output = await runCli(cmd, codeqlArgs, { noStreamStdout: true });

try {
  return JSON.parse(output) as string[];
Member

Not specific to this PR: I see that we follow the same JSON.parse(output) as type pattern throughout the implementation here, but that doesn't actually give us any guarantees that the result of JSON.parse is compatible with the type here. This may effectively delay an error until a later point. Where possible, we should probably check that the result is actually what we expect (and not just valid JSON) as done in e.g. #2956.

I don't think this has to be addressed here, but we might want to look into improving this throughout in the future.
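
As an illustration of the kind of check meant here, a minimal sketch that validates the parsed value with a type guard instead of a bare `as string[]` cast. The names `isStringArray` and `parseStringArray` are hypothetical, and this is not necessarily the approach taken in #2956.

```typescript
// Type guard: narrows an unknown value to string[] only after checking it.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Parse CLI output and fail immediately if it is not an array of strings,
// rather than letting a mistyped value surface as an error later on.
function parseStringArray(output: string, description: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(output);
  } catch (e) {
    throw new Error(`Invalid JSON output from ${description}: ${String(e)}`);
  }
  if (!isStringArray(parsed)) {
    throw new Error(
      `Unexpected output from ${description}: expected an array of strings.`,
    );
  }
  return parsed;
}
```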

Contributor Author

Acknowledged. I will leave it to you to decide if and when you want to perform that overall cleanup.

src/init.ts Outdated
Comment on lines 130 to 121
const suitePath = path.join(
  config.dbLocation,
  language,
  "temp",
  "config-queries.qls",
);
Member

I think this is now at least the second place where we construct this path (after my changes in #2935) so we may want to add a utility function that constructs this path for a given config+language somewhere.
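
Such a utility could look roughly like the sketch below. The path segments are taken from the snippet under review; the function signature and the `HasDbLocation` type are assumptions made to keep the example self-contained.

```typescript
import * as path from "path";

// Minimal stand-in for the part of the action's Config used here (assumed).
interface HasDbLocation {
  dbLocation: string;
}

// Path of the query suite file that `codeql database init` generates for
// the given language.
function getGeneratedSuitePath(config: HasDbLocation, language: string): string {
  return path.join(config.dbLocation, language, "temp", "config-queries.qls");
}
```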

Contributor Author

Done.

src/init.ts Outdated
Comment on lines 137 to 129
for (const packDir of packDirs) {
  if (
    !checkPackForOverlayCompatibility(packDir, codeQlOverlayVersion, logger)
  ) {
    return false;
  }
}
Member

Minor: perhaps this could be nicer as something like:

if (packDirs.some((packDir) => !checkPackForOverlayCompatibility(packDir, codeQlOverlayVersion, logger))) {
  return false;
}

You could potentially also chain something together with the outer loop.

Contributor Author

Done.

src/init.ts Outdated
const qlpackPath = path.join(packDir, "qlpack.yml");
const qlpackContents = yaml.load(
  fs.readFileSync(qlpackPath, "utf8"),
) as any;
Member

I agree with Copilot here - it would be good to have typings for (at least) the parts of the format that are used here.

This might also make the tests a little nicer since you can then have test objects of this type that you can serialise, rather than hard-coded strings.


const packInfoFileContents = JSON.parse(
  fs.readFileSync(packInfoPath, "utf8"),
);
Member

Agreed with Copilot. It would be good to distinguish errors while reading/parsing JSON from other errors.
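
One way to keep the two failure modes apart is to report them as separate cases of a result type. The sketch below is illustrative only: `PackInfoResult` and `loadPackInfo` are hypothetical names, and the file read is injected as a callback so the example does not depend on the file system.

```typescript
// The three possible outcomes of loading a pack info file.
type PackInfoResult =
  | { kind: "ok"; contents: unknown }
  | { kind: "read-error"; message: string }
  | { kind: "parse-error"; message: string };

function loadPackInfo(readFile: () => string): PackInfoResult {
  let text: string;
  try {
    text = readFile();
  } catch (e) {
    // The file is missing or unreadable.
    return { kind: "read-error", message: String(e) };
  }
  try {
    return { kind: "ok", contents: JSON.parse(text) };
  } catch (e) {
    // The file was read, but its contents are not valid JSON.
    return { kind: "parse-error", message: String(e) };
  }
}
```

In the action, the callback would presumably be something like `() => fs.readFileSync(packInfoPath, "utf8")`, and the caller could log a different warning for each `kind`.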

Comment on lines +190 to +183
logger.warning(
  `The query pack at ${packDir} is not compatible with overlay analysis.`,
);
Member

Minor: This is very non-specific. How about "The query pack at ${packDir} does not contain the required overlayVersion property."?

Contributor Author

Mentioning the overlayVersion property would make the message more specific, but it would not add any useful information.

Code Scanning users do not know what the overlayVersion property is or what it does. It is an implementation detail used to indicate presence and compatibility of overlay support. If a customer files a support ticket asking about the overlayVersion property being absent, our answer would be "that means your pack is not compatible with overlay analysis".

So I don't see how adding that detail would be useful.

logger.warning(
  `The query pack at ${packDir} was compiled with ` +
    `overlay version ${packOverlayVersion}, but the CodeQL CLI ` +
    `supports only overlay version ${codeQlOverlayVersion}. The ` +
Member

The wording here suggests that codeQlOverlayVersion is smaller than packOverlayVersion, but the check just tests that they are not equal. Should the condition be different or the wording of the warning?

Contributor Author

The !== comparison is intended. If the pack overlay version is lower than the CLI overlay version, overlay analysis is not guaranteed to work. If the pack overlay version is higher than the CLI overlay version, overlay analysis is also not guaranteed to work.

We don't plan to ever reduce overlay versions with a new CodeQL CLI release, so the most common case would be the CLI having a higher overlay version than the pack. But I think the error message works both ways:

The query pack at ${packDir} was compiled with overlay version 6, but the CodeQL CLI supports only overlay version 4. The query pack needs to be recompiled to support overlay analysis.

and

The query pack at ${packDir} was compiled with overlay version 4, but the CodeQL CLI supports only overlay version 6. The query pack needs to be recompiled to support overlay analysis.
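
The exact-match rule can be sketched as a small pure function. The function name and the choice to return the warning text (or `undefined` when the pack is compatible) are assumptions for illustration; the message strings are the ones quoted in this thread.

```typescript
// Returns a warning message if the pack cannot be used for overlay analysis,
// or undefined if it is compatible. Any mismatch counts, in either direction.
function overlayIncompatibilityWarning(
  packDir: string,
  packOverlayVersion: unknown,
  codeQlOverlayVersion: number,
): string | undefined {
  if (typeof packOverlayVersion !== "number") {
    return `The query pack at ${packDir} is not compatible with overlay analysis.`;
  }
  if (packOverlayVersion !== codeQlOverlayVersion) {
    return (
      `The query pack at ${packDir} was compiled with ` +
      `overlay version ${packOverlayVersion}, but the CodeQL CLI ` +
      `supports only overlay version ${codeQlOverlayVersion}. The ` +
      `query pack needs to be recompiled to support overlay analysis.`
    );
  }
  return undefined;
}
```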

@@ -78,21 +78,40 @@ export async function runInit(
  apiDetails: GitHubApiCombinedDetails,
  logger: Logger,
): Promise<TracerConfig | undefined> {
  fs.mkdirSync(config.dbLocation, { recursive: true });
Member

Just noting this for future reference: it doesn't look like dbLocation is used by generateRegistries, so moving this seems fine.

Contributor Author

Furthermore, codeql database init generally expects the database path to be non-existent and reports an error when the database path already exists. (There are some exceptions, such as when initializing an overlay database.) So I would be very surprised if generateRegistries() writes anything into dbLocation.

Comment on lines +737 to +777
// To check custom query packs for compatibility with overlay analysis, we
// need to first initialize the database cluster, which downloads the
// user-specified custom query packs. But we also want to check custom query
// pack compatibility first, because database cluster initialization depends
// on the overlay database mode. The solution is to initialize the database
// cluster first, check custom query pack compatibility, and if we need to
// revert to `OverlayDatabaseMode.None`, re-initialize the database cluster
// with the new overlay database mode.
Member

It seems like a good bit of effort in the last couple of commits here is spent on enabling this. I am sure you've already thought about this, but is there any way that we could change the overlay database mode for the database cluster after it has been initialised? Or alternatively, shuffle things around so that we can download the user-specified custom query packs?

I am trying to understand the design space here a bit better. I am also thinking about concurrent efforts we'll have to look into related to code quality, which might mean that we have to break up what happens when a database is initialised anyway. It might be that we can find a solution for both in one go depending on what timelines permit.

Contributor Author

I responded in more detail in the internal tracking issue, but the short answer is that neither of the suggested approaches would be practical. Nonetheless, thank you for the suggestions!
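
For readers following the design discussion, the initialize, check, re-initialize flow described in the code comment under review can be sketched as below. Everything here is hypothetical: `OverlayMode` is a local stand-in for the action's `OverlayDatabaseMode`, the steps are injected callbacks, and they are synchronous only to keep the sketch short. The cleanup step is an assumption based on the remark that `codeql database init` generally expects the database path not to exist.

```typescript
// Local stand-in for the action's OverlayDatabaseMode (assumed values).
type OverlayMode = "none" | "overlay" | "overlay-base";

interface InitSteps {
  // Initializes the database cluster, which also downloads custom packs.
  initCluster: (mode: OverlayMode) => void;
  // Checks the downloaded query packs for overlay compatibility.
  packsAreCompatible: () => boolean;
  // Removes the initialized cluster so that it can be created again.
  cleanupCluster: () => void;
}

// Returns the overlay mode that the cluster ends up initialized with.
function initWithOverlayFallback(
  requested: OverlayMode,
  steps: InitSteps,
): OverlayMode {
  steps.initCluster(requested);
  if (requested === "none" || steps.packsAreCompatible()) {
    return requested;
  }
  // Incompatible pack: revert to "none" and re-initialize the cluster.
  steps.cleanupCluster();
  steps.initCluster("none");
  return "none";
}
```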

@cklin cklin force-pushed the cklin/overlay-pack-check branch from 119d629 to 7ba8b26 Compare August 5, 2025 21:08
@Copilot Copilot AI review requested due to automatic review settings August 5, 2025 21:44
@cklin cklin force-pushed the cklin/overlay-pack-check branch from 7ba8b26 to 210aa6a Compare August 5, 2025 21:44
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds query pack compatibility checking for overlay analysis in the CodeQL Action. The main purpose is to ensure that when overlay analysis is enabled, all query packs involved are compatible with the installed CodeQL CLI's overlay analysis support before proceeding.

  • Introduces a new checkPacksForOverlayCompatibility function that validates query pack overlay compatibility
  • Refactors the runInit function into runDatabaseInitCluster and updates the overlay compatibility checking workflow
  • Updates the minimum CodeQL overlay version requirement from "2.20.5" to "2.22.3"

Reviewed Changes

Copilot reviewed 20 out of 29 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/util.ts | Adds getGeneratedSuitePath helper function for query suite paths |
| src/testing-utils.ts | Updates mock functions to support overlayVersion parameter |
| src/overlay-database-utils.ts | Updates minimum CodeQL overlay version constant |
| src/init.ts | Adds overlay compatibility checking logic and refactors database initialization |
| src/init.test.ts | Adds comprehensive tests for overlay compatibility checking |
| src/init-action.ts | Updates main initialization flow to handle overlay compatibility checks |
| src/config-utils.test.ts | Updates test configuration to use new minimum overlay version |
| src/codeql.ts | Adds overlayVersion to VersionInfo and resolveQueriesStartingPacks method |
| src/analyze.ts | Uses new getGeneratedSuitePath helper function |
| src/analyze.test.ts | Removes unused packDownload mock |
Comments suppressed due to low confidence (2)

src/init.ts:79

  • [nitpick] The parameter order in runDatabaseInitCluster is inconsistent with the original runInit function. Consider maintaining the same parameter order (codeql, config, sourceRoot, processName, then additional parameters) for better consistency and easier migration.
export async function runDatabaseInitCluster(
  databaseInitEnvironment: Record<string, string | undefined>,
  codeql: CodeQL,
  config: configUtils.Config,
  sourceRoot: string,
  processName: string | undefined,
  qlconfigFile: string | undefined,
  logger: Logger,
): Promise<void> {

src/init.ts:138

  • [nitpick] The interface QlPack is too generic and incomplete for representing a qlpack.yml file structure. Consider renaming it to QlPackMetadata or QlPackYaml to be more specific about its purpose.
interface QlPack {
  buildMetadata?: string;
}

const packOverlayVersion = packInfoFileContents.overlayVersion;
if (typeof packOverlayVersion !== "number") {
  logger.warning(
    `The query pack at ${packDir} is not compatible with overlay analysis.`,
Copilot AI Aug 5, 2025

The error message 'is not compatible with overlay analysis' is vague. Consider making it more specific, such as 'is missing the overlayVersion field in .packinfo' or 'has an invalid overlayVersion field in .packinfo' to help users understand exactly what's wrong.

Suggested change
- `The query pack at ${packDir} is not compatible with overlay analysis.`,
+ `The query pack at ${packDir} has an invalid or missing overlayVersion field in .packinfo and is not compatible with overlay analysis.`,

const qlpackContents = yaml.load(
  fs.readFileSync(qlpackPath, "utf8"),
) as QlPack;
if (!qlpackContents.buildMetadata) {
Copilot AI Aug 5, 2025

The code assumes qlpack.yml exists and can be parsed as YAML, but doesn't handle the case where the file doesn't exist or contains invalid YAML. This could cause the function to throw an unhandled exception instead of logging a warning and returning false.

@cklin
Contributor Author

cklin commented Aug 5, 2025

I responded to some of the comments, and will follow up on the remaining ones tomorrow.

alexet
alexet previously approved these changes Aug 6, 2025
@cklin
Contributor Author

cklin commented Aug 6, 2025

I responded to all comments (responses to copilot-initiated comments might be visible only under the original copilot reviews).

Also rebased the PR against current main to resolve merge conflicts.
