Implementing secrets env egress schema #430
secrets env egress schema: 52ceb74 to 09cdc52
Ajv does not support automatically removing additional properties from a composed schema. There is an open issue proposing a way to do so (see ajv-validator/ajv#1346), but at the time of writing it has not been implemented. Instead, Ajv will be used without
import $RefParser from '@apidevtools/json-schema-ref-parser';

async function loadEnvSchema(schemaPath) {
  // Bundle external references into a single schema object.
  // Note: bundle() keeps internal $ref pointers, which extract() does not follow.
  const schema = await $RefParser.bundle(schemaPath);
  const props = new Set();
  const required = new Set();
  const defaults = {};
  (function extract(s) {
    if (!s || typeof s !== 'object') return;
    // Collect properties and their defaults
    if (s.properties) {
      for (const [k, p] of Object.entries(s.properties)) {
        props.add(k);
        if (p.default != null) {
          defaults[k] = p.default;
        }
      }
    }
    // Collect required properties
    if (s.required) s.required.forEach((r) => required.add(r));
    // Process composition keywords
    ['allOf', 'anyOf', 'oneOf'].forEach(
      (k) => Array.isArray(s[k]) && s[k].forEach(extract),
    );
  })(schema);
  return {
    allKeys: [...props],
    requiredKeys: [...required],
    defaults: defaults,
  };
}
Finally, this can be processed as such to yield a populated
const { allKeys, requiredKeys, defaults } = await loadEnvSchema('path/to/schema.json');
const env = {};
for (const key of allKeys) {
  let value = process.env[key];
  if (value == null && defaults[key] !== undefined) {
    value = defaults[key];
  }
  // Note: An explicit empty string in process.env (value === '') will take precedence over a default.
  if (requiredKeys.includes(key) && (value == null || value === '')) {
    throw new Error(
      `Required environment variable ${key} is missing, empty, or its default is invalid.`,
    );
  }
  if (value !== undefined) {
    // Boundary env values must be strings, otherwise it can cause type errors
    env[key] = value.toString();
  }
}
// Example usage
const someSecretValue = env['SECRET_VALUE'];
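As a rough illustration of what the key-collection step produces, here is a self-contained sketch that applies the same traversal to an inline composed schema. The name `collectKeys` and the sample schema are hypothetical, and file loading through `$RefParser` is left out so that the snippet runs on its own.

```javascript
// Sketch: same traversal idea as loadEnvSchema, but synchronous and
// operating on an in-memory schema. `collectKeys` is a hypothetical name.
function collectKeys(schema) {
  const props = new Set();
  const required = new Set();
  const defaults = {};
  (function extract(s) {
    if (!s || typeof s !== 'object') return;
    // Record every declared property and any non-null default.
    for (const [k, p] of Object.entries(s.properties ?? {})) {
      props.add(k);
      if (p != null && p.default != null) defaults[k] = p.default;
    }
    (s.required ?? []).forEach((r) => required.add(r));
    // Recurse into composition keywords.
    for (const k of ['allOf', 'anyOf', 'oneOf']) {
      if (Array.isArray(s[k])) s[k].forEach(extract);
    }
  })(schema);
  return { allKeys: [...props], requiredKeys: [...required], defaults };
}

// Hypothetical schema: one required key at the root, one defaulted key in allOf.
const sampleSchema = {
  type: 'object',
  properties: { API_KEY: { type: 'string' } },
  required: ['API_KEY'],
  allOf: [
    { properties: { REGION: { type: 'string', default: 'us-east-1' } } },
  ],
};

const keys = collectKeys(sampleSchema);
console.log(keys);
// allKeys: ['API_KEY', 'REGION'], requiredKeys: ['API_KEY'],
// defaults: { REGION: 'us-east-1' }
```

Keys declared only inside a composition branch are still collected, which is what lets the later loop request `REGION` and fall back to its default.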
One might consider whether we want to do strictly what the schema says. But remember there's a schema option called additionalProperties (https://json-schema.org/understanding-json-schema/reference/object#additionalproperties). Without it, we should understand schemas as "gradual schemas": any properties that are not mentioned should be allowed to just pass through.
Oh, and you should also check unevaluatedProperties: https://json-schema.org/understanding-json-schema/reference/object#unevaluatedproperties
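The "gradual schema" reading can be sketched as follows. This is a minimal illustration under stated assumptions: `selectSecrets` is a hypothetical helper, it only looks at a flat `properties` map, it treats `additionalProperties: false` as "drop undeclared secrets" rather than "reject", and it ignores the case where `additionalProperties` is itself a schema.

```javascript
// Sketch: choose which secrets pass through for a flat object schema.
// `selectSecrets` is hypothetical and not part of Polykey or Ajv.
function selectSecrets(schema, secrets) {
  const declared = new Set(Object.keys(schema.properties ?? {}));
  // In JSON Schema, undeclared properties are allowed unless the schema
  // explicitly sets `additionalProperties: false`.
  const closed = schema.additionalProperties === false;
  const result = {};
  for (const [name, value] of Object.entries(secrets)) {
    if (declared.has(name) || !closed) result[name] = value;
  }
  return result;
}

const secrets = { API_KEY: 'abc', DEBUG_TOKEN: 'xyz' };
const openSchema = {
  type: 'object',
  properties: { API_KEY: { type: 'string' } },
};
const closedSchema = { ...openSchema, additionalProperties: false };

console.log(selectSecrets(openSchema, secrets)); // both secrets pass through
console.log(selectSecrets(closedSchema, secrets)); // only API_KEY remains
```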
Ah, so setting So far, I have manually unwrapped the schema to extract the relevant keys, to avoid requesting unnecessary ones, then validated the final result while applying defaults before returning it as the final set of environment variables. Currently, if a schema is provided, all secrets are filtered against it, and all other secrets are trimmed. It should be a fairly trivial change to check additional properties and either allow all secrets or strictly only the requested ones. I feel the behaviour of I will look into them both, however, and get my progress reviewed to make sure I'm on the right track.
The default expected behaviour is to allow any additional properties and only apply the rules to what is specified. However, this is why you want to use a pre-existing JSON Schema parser that fully understands ALL the rules and is up to date with the JSON Schema spec (they update the spec sometimes). Do not reinvent the wheel here; you would just be wasting time. It's far more efficient to use a well-known JSON Schema tool.
Just remember that ajv doesn't understand how to "resolve" references. It requires additional tooling to do so, which is why I used the simplified library in my scripts. However, for production tooling, I believe you're supposed to use an ajv resolver. Look, I asked ChatGPT this question, and it gives a perfect answer:
## Context & Assumptions
Assumptions:
- You want to use AJV (Another JSON Schema Validator) for validating JSON objects against JSON Schemas, potentially including remote/local `$ref` references.
- You want fully dynamic, production-grade resolution: nested, cyclic, local and remote refs, all up-to-date with the latest JSON Schema spec (2020-12).
- You are not interested in writing your own resolver logic or patching together partial solutions.
- You require a solution that's "SOTA" (State of the Art) as of July 2025, with maximal ecosystem support, stability, and compliance.
## 1. AJV Core Capabilities and Limitations (as of 2025)
- AJV by itself supports local references (`$ref`) within the same schema or within pre-loaded schemas.
- For dynamic, on-demand remote/local reference resolution (HTTP, file, etc.), AJV does not include a built-in universal resolver.
- AJV users must supplement with a schema loader/resolver library, especially for recursive, cyclic, or distributed schema graphs.
## 2. Current SOTA AJV Reference Resolution Ecosystem (July 2025)
Key Packages:
A. [@hyperjump/json-schema](https://www.npmjs.com/package/@hyperjump/json-schema)
- Widely adopted, full JSON Schema reference resolver and evaluator, supports all official drafts up to the latest (2020-12).
- Handles `$ref` resolution for:
  - Local files (filesystem URIs)
  - HTTP(S) resources
  - In-memory schemas
  - Cyclic and recursive references (both local and remote)
- Acts as both a resolver and validator, but can also act as a resolver only to plug into AJV or other engines.
B. [json-schema-ref-resolver](https://github.com/APIDevTools/json-schema-ref-resolver) (a.k.a. `@apidevtools/json-schema-ref-parser`)
- Longstanding, stable tool for dereferencing JSON Schemas, supports local and remote refs.
- Used by many codegen and validation tools.
- Sometimes falls short for the very latest JSON Schema drafts and complex edge cases, but often "just works".
C. [ajv-formats](https://github.com/ajv-validator/ajv-formats)
- Not a resolver, but important to support advanced format validation (e.g., `uri`, `date-time`, etc.) up to the latest spec.
D. [ajv-draft-2020](https://github.com/ajv-validator/ajv-draft-2020) or AJV built-in draft support
- Ensures your AJV version is using the latest JSON Schema draft logic.
E. AJV v8+ built-in `$refs` loader hooks
- AJV 8+ supports custom loaders via the `loadSchema` option (async schema loading), allowing you to plug in resolvers.
## 3. Recommended SOTA Combination (2025)
For a production-grade, fully dynamic, spec-compliant AJV-based validation stack:
Option 1: AJV + @hyperjump/json-schema
- Use AJV for validation, but delegate all reference resolution to `@hyperjump/json-schema`.
- `@hyperjump/json-schema` is now widely recognized as the most spec-compliant resolver, and can be run in "resolver only" mode.
- In AJV, set `loadSchema` to a wrapper that calls `@hyperjump/json-schema` for schema loading.
import Ajv from "ajv/dist/2020"; // or latest
import { get } from "@hyperjump/json-schema";
const ajv = new Ajv({ loadSchema: async (uri) => {
return get(uri); // this uses @hyperjump/json-schema to fetch and resolve
}});
// Use ajv.compileAsync for schema compilation
Option 2: AJV + json-schema-ref-resolver
- Use `json-schema-ref-resolver` (or `@apidevtools/json-schema-ref-parser`) to fully dereference your schemas before passing them to AJV.
- This flattens the schema graph, so AJV doesn't have to resolve any `$ref` at validation time.
import $RefParser from '@apidevtools/json-schema-ref-parser';
import Ajv from 'ajv/dist/2020';
const parser = new $RefParser();
const schema = await parser.dereference(schemaPathOrObj);
const ajv = new Ajv();
const validate = ajv.compile(schema);
Note: This approach is not as dynamic at runtime for streaming validation of arbitrarily-referenced schemas, but is great for most use-cases where you have a bounded schema set.
## 4. Which Should You Use?
- For runtime, dynamic reference resolution (arbitrary, possibly untrusted sources, e.g., API gateways, extensible pipelines): use AJV + @hyperjump/json-schema as your SOTA combo.
- For static or semi-static schema sets (microservices, codegen, build-time validation): use AJV + json-schema-ref-resolver/@apidevtools/json-schema-ref-parser to flatten the schema tree ahead of time.

In both cases:
- Always use `ajv-formats` to cover full JSON Schema format validation.
- Pin your AJV version to 8.x or later, and explicitly set the draft version.
## 5. Summary Table
| Use-Case | SOTA Combo | Notes |
|---|---|---|
| Arbitrary/dynamic refs (runtime) | AJV + @hyperjump/json-schema | Best for dynamic/unknown schema graphs, latest spec |
| Flatten refs (static/codegen/CI) | AJV + @apidevtools/json-schema-ref-parser | Best for known schema sets, simpler but less dynamic |
| Complex "format" validation | + ajv-formats | Always include |
## 6. Caveats & Final Thoughts
- Do not write your own resolver. All major SOTA tools are robust, fast, and actively maintained (as of July 2025).
- AJV’s core maintainers and the JSON Schema community have converged on using `@hyperjump/json-schema` for serious resolver needs.
- For edge-case enterprise requirements (private schema registries, custom URI schemes, caching), you can extend `@hyperjump/json-schema` with your own loaders, but this is rare.
- For full spec-compliance, ensure all packages are at their latest major versions and test against your target draft.
## References / Sources
- [AJV documentation](https://ajv.js.org/)
- [@hyperjump/json-schema](https://www.npmjs.com/package/@hyperjump/json-schema)
- [@apidevtools/json-schema-ref-parser](https://www.npmjs.com/package/@apidevtools/json-schema-ref-parser)
- [ajv-formats](https://ajv.js.org/packages/ajv-formats.html)
- [JSON Schema latest draft status](https://json-schema.org/)
If you want example code for a real-world AJV + resolver stack, or want to compare perf/compat between the two above, specify your scenario.
Previously, I was going out-of-scope a little bit, but now I've realigned this PR to better match the expectations.
I have tested this on the schema from zeta house, and it seems to work as intended. The schemas do not supply a type other than a string, so for the time being, that has not been tested within a repo, but the tests actually handle this case.
If this behaviour is fine, then after a quick cleanup, this is ready for merging @CMCDragonkai.
The command can now take a flag for

The schema validation is applied after the secrets have been collected, so behaviour controls such as duplicate-name handling are unmodified. By default, a failing schema validation throws an error and prints no secrets. Another possible behaviour would be to print the secrets but warn the user about each validation error. However, these options have not been incorporated in this iteration of the command.

We need to figure out the default behaviour of a failing validation. A no-op makes no sense here, as you would simply not specify a schema in that case. The two real options are to fail the command, printing only the error and no secrets, or to print the secrets followed by the validation errors, so that the schema acts more as a guideline than a strict rule. I feel strict validation makes more sense as the default than printing both the secrets and the errors, but this needs more discussion before going ahead.
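The two candidate behaviours for a failing validation can be sketched as follows. The function name `finalizeEnv`, the `mode` parameter, and the string-based error list are hypothetical and only illustrate the trade-off; they are not the command's actual interface.

```javascript
// Sketch of the two failure policies under discussion.
// `finalizeEnv`, `mode`, and the error shape are assumptions for illustration.
function finalizeEnv(env, validationErrors, mode = 'strict') {
  if (validationErrors.length === 0) return { env, warnings: [] };
  if (mode === 'strict') {
    // Strict: fail the command, so no secrets are returned, only the error.
    const details = validationErrors.map((e) => `- ${e}`).join('\n');
    throw new Error(`Schema validation failed:\n${details}`);
  }
  // Lenient: keep the secrets and surface the errors as warnings,
  // treating the schema as a guideline rather than a strict rule.
  return { env, warnings: validationErrors };
}

const collected = { API_KEY: 'abc' };
const errors = ['REGION: required property is missing'];

// Lenient mode returns both the secrets and the warnings.
const lenient = finalizeEnv(collected, errors, 'lenient');
console.log(lenient.env, lenient.warnings);

// Strict mode (the default) throws instead of returning any secrets.
try {
  finalizeEnv(collected, errors);
} catch (e) {
  console.error(e.message);
}
```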
I feel like the error report shouldn't be stuffed into the data... There should be more rich reporting. But it's something we have to deal with at a meta level.
Ok start with this behaviour you've specified for now. And just merge.
My first instinct was to use an A similar issue plagues Polykey when it fails to connect to any seednode. In Polykey, the details of all the failing connections are included as a part of the error message itself, which I felt was also kinda lacking. We need a way to extend
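One standard JavaScript option for carrying several failures in a single thrown value is `AggregateError`, which keeps each failure as its own `Error` instead of concatenating everything into the message. The sketch below is an assumption about how that could look for schema validation; `reportValidationFailures` and the failure shape are hypothetical, and this is not the approach adopted in this PR.

```javascript
// Sketch only: `reportValidationFailures` is a hypothetical helper.
function reportValidationFailures(failures) {
  if (failures.length === 0) return;
  // AggregateError exposes the individual failures through `.errors`,
  // so callers can format or inspect them separately from the summary.
  throw new AggregateError(
    failures.map((f) => new Error(`${f.key}: ${f.reason}`)),
    `Schema validation failed for ${failures.length} secret(s)`,
  );
}

let caught;
try {
  reportValidationFailures([
    { key: 'API_KEY', reason: 'must be a string' },
    { key: 'REGION', reason: 'is required' },
  ]);
} catch (e) {
  caught = e;
}

// The summary and the per-secret details stay separate.
console.error(caught.message);
for (const err of caught.errors) console.error(`  - ${err.message}`);
```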
Description
A schema will dictate which secrets need to be exported and which ones do not, better enforcing POLP and making secrets management easier.
Issues Fixed
polykey secrets env #429 (FIX ENG-638)
Tasks
ajv to validate a schema
Final checklist