Skip to content

feat(middleware): replace recursive jq walk_schema with native Go implementation#4750

Merged
lpcox merged 3 commits into
mainfrom
copilot/go-fan-review-itchyny-gojq
Apr 28, 2026
Merged

feat(middleware): replace recursive jq walk_schema with native Go implementation#4750
lpcox merged 3 commits into
mainfrom
copilot/go-fan-review-itchyny-gojq

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 28, 2026

The itchyny/gojq module review identified three actionable improvements to internal/middleware/jqschema.go: replace the pure-jq recursive schema walk with a native Go function, harden the type switch against json.Number/nil inputs, and clarify benchmark intent.

Native Go walk_schema via gojq.WithFunction

Adds inferSchema(v interface{}) interface{} — a pure-Go recursive schema walker — and registers it as a native gojq function. The jq filter collapses from 12 lines of recursive jq bytecode to a single call:

// Before: 12-line recursive jq def
const jqSchemaFilter = `
def walk_schema:
  . as $in |
  if type == "object" then
    reduce keys[] as $k ({}; . + {($k): ($in[$k] | walk_schema)})
  ...
walk_schema
`

// After: native Go function, trivial jq entry point
const jqSchemaFilter = `walk_schema`

// Registered at compile time:
gojq.Compile(query,
    gojq.WithFunction("walk_schema", 0, 0, func(v interface{}, _ []interface{}) interface{} {
        return inferSchema(v)
    }),
)

inferSchema handles all jq type names ("null", "boolean", "number", "string") and preserves the array-collapsing behaviour (first element only, [] for empty).

Defensive type switch in WrapToolHandler

Adds case nil (early return, nil schema) and case json.Number (convert via .Float64()) to the type switch that decides whether to skip the json.Unmarshal round-trip. Guards against decoders using UseNumber().

Tests

  • TestInferSchema — exhaustive unit tests for every type case of inferSchema
  • TestInferSchema_MatchesJqOutput — cross-checks native Go output against applyJqSchema to prevent drift
  • TestWrapToolHandler_JsonNumberData — end-to-end coverage of the new json.Number path

Benchmark clarification

Adds an explicit comment to BenchmarkApplyJqSchema_ParseEveryTime noting it intentionally simulates the degraded parse-every-call path for comparison purposes only.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • example.com
    • Triggering command: /tmp/go-build2697671537/b513/launcher.test /tmp/go-build2697671537/b513/launcher.test -test.testlogfile=/tmp/go-build2697671537/b513/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build2697671537/b410/vet.cfg se 690953/b038/vet.cfg x_amd64/vet . pb (dns block)
    • Triggering command: /tmp/go-build2068934155/b509/launcher.test /tmp/go-build2068934155/b509/launcher.test -test.testlogfile=/tmp/go-build2068934155/b509/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true 83096d3e13560ff0 64/pkg/tool/linu/opt/hostedtoolcjson 7671537/b486/vet.cfg OuzHqgGgH cfg 64/pkg/tool/linu--log /opt/hostedtoolc/var/lib/docker/buildkit/executor/runc-log.json -ato�� -bool ettings.Ports}} 3874a3be3dd1a781b6hh82lckewk8vbuoyc6zqssf -errorsas -ifaceassert -nilfunc ache/go/1.25.9/x-buildtags (dns block)
  • invalid-host-that-does-not-exist-12345.com
    • Triggering command: /tmp/go-build2697671537/b495/config.test /tmp/go-build2697671537/b495/config.test -test.testlogfile=/tmp/go-build2697671537/b495/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build2697671537/b395/vet.cfg se t.go x_amd64/compile . --gdwarf2 --64 x_amd64/compile (dns block)
    • Triggering command: /tmp/go-build2068934155/b491/config.test /tmp/go-build2068934155/b491/config.test -test.testlogfile=/tmp/go-build2068934155/b491/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true 5272951ab22a7972c140e0b7a44abee6634cf399ad4a01e54bb x_amd64/compile ateway_with_pipe.sh (dns block)
  • nonexistent.local
    • Triggering command: /tmp/go-build2697671537/b513/launcher.test /tmp/go-build2697671537/b513/launcher.test -test.testlogfile=/tmp/go-build2697671537/b513/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build2697671537/b410/vet.cfg se 690953/b038/vet.cfg x_amd64/vet . pb (dns block)
    • Triggering command: /tmp/go-build2068934155/b509/launcher.test /tmp/go-build2068934155/b509/launcher.test -test.testlogfile=/tmp/go-build2068934155/b509/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true 83096d3e13560ff0 64/pkg/tool/linu/opt/hostedtoolcjson 7671537/b486/vet.cfg OuzHqgGgH cfg 64/pkg/tool/linu--log /opt/hostedtoolc/var/lib/docker/buildkit/executor/runc-log.json -ato�� -bool ettings.Ports}} 3874a3be3dd1a781b6hh82lckewk8vbuoyc6zqssf -errorsas -ifaceassert -nilfunc ache/go/1.25.9/x-buildtags (dns block)
  • slow.example.com
    • Triggering command: /tmp/go-build2697671537/b513/launcher.test /tmp/go-build2697671537/b513/launcher.test -test.testlogfile=/tmp/go-build2697671537/b513/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build2697671537/b410/vet.cfg se 690953/b038/vet.cfg x_amd64/vet . pb (dns block)
    • Triggering command: /tmp/go-build2068934155/b509/launcher.test /tmp/go-build2068934155/b509/launcher.test -test.testlogfile=/tmp/go-build2068934155/b509/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true 83096d3e13560ff0 64/pkg/tool/linu/opt/hostedtoolcjson 7671537/b486/vet.cfg OuzHqgGgH cfg 64/pkg/tool/linu--log /opt/hostedtoolc/var/lib/docker/buildkit/executor/runc-log.json -ato�� -bool ettings.Ports}} 3874a3be3dd1a781b6hh82lckewk8vbuoyc6zqssf -errorsas -ifaceassert -nilfunc ache/go/1.25.9/x-buildtags (dns block)
  • this-host-does-not-exist-12345.com
    • Triggering command: /tmp/go-build2697671537/b522/mcp.test /tmp/go-build2697671537/b522/mcp.test -test.testlogfile=/tmp/go-build2697671537/b522/testlog.txt -test.paniconexit0 -test.timeout=10m0s -o g_.a elemetry.io/otel-ifaceassert x_amd64/vet -p net/http/interna/usr/bin/runc -lang=go1.25 x_amd64/vet -uns�� .cfg elemetry.io/otel@v1.43.0/propagation/doc.go x_amd64/vet go1.25.9 .io/otel/baggage/usr/bin/runc -nolocalimports x_amd64/vet (dns block)
    • Triggering command: /tmp/go-build2068934155/b518/mcp.test /tmp/go-build2068934155/b518/mcp.test -test.testlogfile=/tmp/go-build2068934155/b518/testlog.txt -test.paniconexit0 -test.timeout=10m0s -test.v=true 93bff293c3594aa8 -buildtags .d/chrony-onoffline by/edae42a48ad93bash -ifaceassert -nilfunc .d/chrony-onoffline -tes�� cf399ad4a01e54bb -test.timeout=10m0s "CURL_CA_BUNDLE=/-id by/17dafc0aef64ebash t.go x_amd64/compile 152/log.json (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI linked an issue Apr 28, 2026 that may be closed by this pull request
4 tasks
… improved benchmarks

- Replace pure-jq recursive walk_schema with a native Go inferSchema function
  registered via gojq.WithFunction; jq filter simplifies to a single `walk_schema` call
- Add inferSchema(v interface{}) implementing the same schema-walk logic in Go:
  objects traversed key-by-key, arrays collapsed to first element, leaves replaced
  by jq type names ("null", "boolean", "number", "string")
- Add nil and json.Number cases to the type switch in WrapToolHandler for defensive
  robustness against decoders using UseNumber() or nil interface values
- Add clarifying comment to BenchmarkApplyJqSchema_ParseEveryTime noting it
  intentionally simulates the degraded (parse-every-call) path
- Add TestInferSchema, TestInferSchema_MatchesJqOutput, and
  TestWrapToolHandler_JsonNumberData tests
- Update README to document the native Go schema-walk approach

Agent-Logs-Url: https://github.com/github/gh-aw-mcpg/sessions/b5412910-106f-47de-ae48-8a2d7037d2d7

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Copilot AI changed the title [WIP] Review Go module itchyny/gojq feat(middleware): replace recursive jq walk_schema with native Go implementation Apr 28, 2026
Copilot AI requested a review from lpcox April 28, 2026 17:05
@lpcox lpcox marked this pull request as ready for review April 28, 2026 17:37
Copilot AI review requested due to automatic review settings April 28, 2026 17:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces the recursive jq-based schema walker with a native Go implementation registered as a custom gojq function, and hardens schema inference when tool payloads include json.Number/nil.

Changes:

  • Add inferSchema (pure Go recursive schema walker) and register it as walk_schema via gojq.WithFunction.
  • Update WrapToolHandler schema-prep type switch to handle nil and json.Number.
  • Add/extend tests and update middleware README and benchmark commentary.
Show a summary per file
File Description
internal/middleware/jqschema.go Introduces inferSchema and registers it as native walk_schema; adjusts schema-prep type switch.
internal/middleware/jqschema_test.go Adds unit/integration-style tests for inferSchema and new json.Number handling.
internal/middleware/jqschema_bench_test.go Clarifies benchmark intent for parse-every-time path (but benchmark logic needs updating for native function).
internal/middleware/README.md Updates documentation to describe native Go schema walk and type mapping.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 4/4 changed files
  • Comments generated: 4

Comment thread internal/middleware/jqschema_test.go Outdated
Comment on lines +1537 to +1541
// Wrap data in a map that contains a json.Number value so the type switch exercises
// the json.Number case via the json.Unmarshal fallback path after marshaling.
//
// To exercise the direct json.Number case in the type switch we pass json.Number as
// the top-level data value (the switch is on `data`, not on individual map values).
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above TestWrapToolHandler_JsonNumberData says it wraps data in a map to exercise the json.Unmarshal fallback path, but the mock handler actually returns json.Number("42") as the top-level data value, which hits the direct case json.Number branch. Please adjust/remove the misleading comment so future readers understand what scenario is being tested.

Suggested change
// Wrap data in a map that contains a json.Number value so the type switch exercises
// the json.Number case via the json.Unmarshal fallback path after marshaling.
//
// To exercise the direct json.Number case in the type switch we pass json.Number as
// the top-level data value (the switch is on `data`, not on individual map values).
// Return json.Number as the top-level data value so the type switch hits the
// direct json.Number case (the switch is on `data`, not on nested map values).

Copilot uses AI. Check for mistakes.
Comment thread internal/middleware/jqschema.go Outdated
Comment on lines +94 to +98
case float64, int, json.Number:
return "number"
case string:
return "string"
default:
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inferSchema currently only treats float64, int, and json.Number as "number". This contradicts the function comment ("other numerics") and will misclassify common numeric types like int64, uint64, float32, etc. as "string" via the default case. Consider handling all integer/unsigned/float kinds (e.g., via reflect.Kind) and adding a unit test that exercises at least one non-int numeric type (like int64).

Copilot uses AI. Check for mistakes.
Comment thread internal/middleware/jqschema_test.go Outdated
Comment on lines +1500 to +1508
// TestInferSchema_MatchesJqOutput verifies that inferSchema produces the same JSON as
// applyJqSchema (which invokes the gojq runtime). This guards against drift between the
// native Go implementation and the jq-backed path.
func TestInferSchema_MatchesJqOutput(t *testing.T) {
inputs := []interface{}{
map[string]interface{}{"name": "test", "count": 42},
map[string]interface{}{"user": map[string]interface{}{"id": 123, "active": true}},
map[string]interface{}{"items": []interface{}{map[string]interface{}{"id": 1, "name": "test"}}},
map[string]interface{}{"items": []interface{}{}},
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestInferSchema_MatchesJqOutput is described as guarding against drift between the native Go implementation and the jq-backed path, but applyJqSchema now calls the same inferSchema implementation via gojq.WithFunction("walk_schema"). As written, this test is effectively comparing inferSchema to itself (just through a different call path). Either update the test name/comment to reflect that it validates the gojq function wiring, or keep a legacy jq-only filter (in tests) to provide an independent cross-check.

Copilot uses AI. Check for mistakes.
Comment on lines 88 to 92
// NOTE: This benchmark intentionally simulates the degraded path where the
// query is parsed (but not compiled) on every call, bypassing gojq.Compile.
// It exists solely to quantify the speedup delivered by the compile-once
// approach used in production. Do not switch production code to this pattern.
func BenchmarkApplyJqSchema_ParseEveryTime(b *testing.B) {
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new NOTE says this benchmark bypasses gojq.Compile, but the benchmark body still runs jqSchemaFilter (now walk_schema) via query.RunWithContext without registering the native walk_schema function. As a result, the benchmark will error at runtime (undefined function) and no longer measures anything meaningful. Either (a) compile each iteration with gojq.WithFunction("walk_schema", ...) (measuring parse+compile-every-time), or (b) benchmark a legacy jq-only filter that doesn't require native function registration.

Suggested change
// NOTE: This benchmark intentionally simulates the degraded path where the
// query is parsed (but not compiled) on every call, bypassing gojq.Compile.
// It exists solely to quantify the speedup delivered by the compile-once
// approach used in production. Do not switch production code to this pattern.
func BenchmarkApplyJqSchema_ParseEveryTime(b *testing.B) {
// NOTE: This benchmark used to simulate a degraded path that parsed the jq
// query without compiling it. That is no longer a valid benchmark for the
// current filter because `walk_schema` is a native gojq function and must be
// registered during compilation. Running the old path now fails at runtime
// with an undefined-function error and does not produce meaningful results.
func BenchmarkApplyJqSchema_ParseEveryTime(b *testing.B) {
b.Skip("invalid benchmark: walk_schema requires gojq.Compile with function registration")

Copilot uses AI. Check for mistakes.
@lpcox
Copy link
Copy Markdown
Collaborator

lpcox commented Apr 28, 2026

@copilot address the review feedback

…nd test comments

- Expand inferSchema to handle all integer/unsigned/float types via explicit cases
  (float32, int8/16/32/64, uint/8/16/32/64) plus a reflect.Kind fallback; add
  int64 and float32 test cases to TestInferSchema
- Skip BenchmarkApplyJqSchema_ParseEveryTime: walk_schema now requires
  gojq.WithFunction registration so the parse-only path is no longer runnable
- Fix TestInferSchema_MatchesJqOutput comment: clarifies it validates gojq wiring,
  not an independent cross-check (applyJqSchema calls inferSchema via gojq)
- Fix misleading comment in TestWrapToolHandler_JsonNumberData: clarify it tests
  the direct json.Number case in the type switch

Agent-Logs-Url: https://github.com/github/gh-aw-mcpg/sessions/3d0e81f6-55cb-4418-b0fa-9cc14439da91

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 28, 2026

@copilot address the review feedback

Addressed all four review points in fb4a917:

  1. inferSchema numeric coverage — expanded the type switch to explicitly cover float32, int8/16/32/64, uint/8/16/32/64 alongside float64, int, and json.Number; added a reflect.Kind fallback in default for any remaining numeric types. Added int64 and float32 test cases to TestInferSchema.

  2. BenchmarkApplyJqSchema_ParseEveryTime — skipped the benchmark with an explanatory message: walk_schema now requires gojq.WithFunction registration, so the parse-only path fails at runtime.

  3. TestInferSchema_MatchesJqOutput comment — updated to clarify it validates the gojq.WithFunction wiring (that the compiled jq code correctly dispatches to the native Go implementation), not an independent cross-check.

  4. TestWrapToolHandler_JsonNumberData comment — replaced the contradictory two-paragraph comment with the suggested single clear sentence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[go-fan] Go Module Review: itchyny/gojq

3 participants