feat(telegram): add Telegram bot deployment, provider deploy to Hermes, and Ollama/Custom provider support#5
Conversation
|
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
WalkthroughAdds provider baseUrl support (Ollama/custom), Telegram deploy/pairing/test flows wired over SSH with compose generation, SSH utility refactor, dashboard records/summaries/metrics, react-hook-form/zod migrations and UI components, TypeScript incremental builds with CI cache, extensive tests, and updated docs/ADRs. ChangesProvider, Telegram, SSH, and Deployment
Forms, UI, and UX
Build, tooling, and docs
Estimated code review effort 🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Code Review
This pull request introduces support for Ollama and Custom AI providers by adding a base_url column to the database and updating the provider settings UI. It also adds features to deploy and test Telegram bots on a VPS running Hermes. The review feedback highlights several critical and high-severity issues: a missing database migration for the new telegram_configs columns, potential key corruption in decryptApiKey when parsing keys starting with {, hardcoded Docker image names instead of using the defaultHermesImage constant, and logical bugs where old deployment status is incorrectly preserved when connecting a new bot. Additionally, the Telegram bot testing endpoint needs more robust error handling for API and parsing failures.
| deployedServerId: text("deployed_server_id"), | ||
| deployedServerHost: text("deployed_server_host"), | ||
| apiServerKey: text("api_server_key"), |
There was a problem hiding this comment.
The database migration for the new columns added to telegram_configs (deployed_server_id, deployed_server_host, and api_server_key) is missing. Only the migration for ai_providers.base_url was added in drizzle/0001_ai_providers_base_url.sql. Please run drizzle-kit generate to generate the missing migration, or manually add the statements to a new migration file.
| function decryptApiKey(encryptedStr: string): string { | ||
| try { | ||
| const decrypted = decryptSecret(encryptedStr); | ||
| // Backward-compatibility: keys saved before the explicit baseUrl column | ||
| // may have been stored as JSON {apiKey, baseUrl}. Unwrap if so. | ||
| if (decrypted.startsWith("{")) { | ||
| const parsed = JSON.parse(decrypted) as Record<string, unknown>; | ||
| if (typeof parsed.apiKey === "string") { | ||
| return parsed.apiKey; | ||
| } | ||
| } | ||
| return decrypted; | ||
| } catch { | ||
| return ""; | ||
| } | ||
| } |
There was a problem hiding this comment.
If JSON.parse fails because the decrypted key starts with { but is not valid JSON, the entire function will throw an error and return an empty string "". This will corrupt valid API keys that happen to start with {. Wrap the JSON.parse call in its own try-catch block so that if parsing fails, the function falls back to returning the raw decrypted string.
function decryptApiKey(encryptedStr: string): string {
try {
const decrypted = decryptSecret(encryptedStr);
// Backward-compatibility: keys saved before the explicit baseUrl column
// may have been stored as JSON {apiKey, baseUrl}. Unwrap if so.
if (decrypted.startsWith("{")) {
try {
const parsed = JSON.parse(decrypted) as Record<string, unknown>;
if (typeof parsed.apiKey === "string") {
return parsed.apiKey;
}
} catch {
// If parsing fails, treat it as a raw key starting with '{'
}
}
return decrypted;
} catch {
return "";
}
}| import { normalizeAuthMethod, resolveServerCredential } from "./server-records"; | ||
| import { withSshConnection } from "./ssh"; |
There was a problem hiding this comment.
Import defaultHermesImage from ./constants to avoid hardcoding the Docker image name in the deployment step.
| import { normalizeAuthMethod, resolveServerCredential } from "./server-records"; | |
| import { withSshConnection } from "./ssh"; | |
| import { defaultHermesImage } from "./constants"; | |
| import { normalizeAuthMethod, resolveServerCredential } from "./server-records"; | |
| import { withSshConnection } from "./ssh"; |
| deployedServerId: existing?.deployedServerId ?? null, | ||
| deployedServerHost: existing?.deployedServerHost ?? null, | ||
| apiServerKey: existing?.apiServerKey ?? null, |
There was a problem hiding this comment.
When connecting a new Telegram bot, copying the active deployment status (deployedServerId, deployedServerHost, apiServerKey) from the existing config creates a split-brain scenario. The UI/database will claim the new bot is deployed, but the VPS is actually still running the old bot token. These fields should be reset to null when a new bot token is connected, forcing the user to redeploy the new token.
| deployedServerId: existing?.deployedServerId ?? null, | |
| deployedServerHost: existing?.deployedServerHost ?? null, | |
| apiServerKey: existing?.apiServerKey ?? null, | |
| deployedServerId: null, | |
| deployedServerHost: null, | |
| apiServerKey: null, |
| botUsername: bot.username, | ||
| botTokenLast4: getTokenLast4(botToken), | ||
| isActive: true, | ||
| deployedServerHost: existing?.deployedServerHost ?? null, |
There was a problem hiding this comment.
| const composeContent = [ | ||
| "services:", | ||
| " hermes:", | ||
| " image: nousresearch/hermes-agent:latest", | ||
| " container_name: hermes", | ||
| " restart: unless-stopped", | ||
| " command: gateway run", | ||
| " ports:", | ||
| ' - "8642:8642"', | ||
| " volumes:", | ||
| " - ~/.hermes:/opt/data", | ||
| " environment:", | ||
| " - API_SERVER_ENABLED=true", | ||
| " - API_SERVER_HOST=0.0.0.0", | ||
| ` - API_SERVER_KEY=${apiServerKey}`, | ||
| ` - TELEGRAM_BOT_TOKEN=${decryptedToken}`, | ||
| ].join("\n"); |
There was a problem hiding this comment.
The Docker image name nousresearch/hermes-agent:latest is hardcoded here, whereas server/install.ts uses the defaultHermesImage constant imported from ./constants. Use defaultHermesImage here as well to maintain consistency.
| const composeContent = [ | |
| "services:", | |
| " hermes:", | |
| " image: nousresearch/hermes-agent:latest", | |
| " container_name: hermes", | |
| " restart: unless-stopped", | |
| " command: gateway run", | |
| " ports:", | |
| ' - "8642:8642"', | |
| " volumes:", | |
| " - ~/.hermes:/opt/data", | |
| " environment:", | |
| " - API_SERVER_ENABLED=true", | |
| " - API_SERVER_HOST=0.0.0.0", | |
| ` - API_SERVER_KEY=${apiServerKey}`, | |
| ` - TELEGRAM_BOT_TOKEN=${decryptedToken}`, | |
| ].join("\n"); | |
| const composeContent = [ | |
| "services:", | |
| " hermes:", | |
| ` image: ${defaultHermesImage}`, | |
| " container_name: hermes", | |
| " restart: unless-stopped", | |
| " command: gateway run", | |
| " ports:", | |
| ' - "8642:8642"', | |
| " volumes:", | |
| " - ~/.hermes:/opt/data", | |
| " environment:", | |
| " - API_SERVER_ENABLED=true", | |
| " - API_SERVER_HOST=0.0.0.0", | |
| ` - API_SERVER_KEY=${apiServerKey}`, | |
| ` - TELEGRAM_BOT_TOKEN=${decryptedToken}`, | |
| ].join("\n"); |
| let parsed: { choices?: Array<{ message?: { content?: string } }> }; | ||
| try { | ||
| parsed = JSON.parse(stdout) as { | ||
| choices?: Array<{ message?: { content?: string } }>; | ||
| }; | ||
| } catch { | ||
| throw new Error( | ||
| `Invalid JSON from Hermes API: ${stdout.slice(0, 200)}`, | ||
| ); | ||
| } | ||
|
|
||
| const content = parsed.choices?.[0]?.message?.content; | ||
| if (!content) { | ||
| throw new Error("No response content from Hermes"); | ||
| } |
There was a problem hiding this comment.
If the Hermes API returns an error response (e.g., 401 Unauthorized), curl will still exit with code 0 (since --fail is not used). The response JSON will be parsed successfully, but parsed.choices will be undefined, resulting in a generic 'No response content from Hermes' error. Additionally, if JSON.parse returns null, accessing parsed.choices will throw a TypeError. Check if parsed is a valid object, and check for any error returned by the API to propagate a more helpful error message.
let parsed: any;
try {
parsed = JSON.parse(stdout);
} catch {
throw new Error(
`Invalid JSON from Hermes API: ${stdout.slice(0, 200)}`,
);
}
if (parsed && typeof parsed === "object" && parsed.error) {
const errMsg = typeof parsed.error === "object" && parsed.error.message
? parsed.error.message
: JSON.stringify(parsed.error);
throw new Error(`Hermes API returned error: ${errMsg}`);
}
const content = parsed && typeof parsed === "object" ? parsed.choices?.[0]?.message?.content : null;
if (!content) {
throw new Error("No response content from Hermes");
}| setSavedConfig({ | ||
| ...payload.telegram, | ||
| deployedServerHost: | ||
| payload.telegram.deployedServerHost ?? | ||
| savedConfig?.deployedServerHost ?? | ||
| null, | ||
| }); |
There was a problem hiding this comment.
If a user connects a new bot token, the old deployment is no longer valid for this new token. Preserving the old deployedServerHost here will make the UI falsely claim that the new bot is already deployed, and enable the test section even though the VPS is still running the old bot token. Directly use the payload.telegram returned by the backend (which should have deployedServerHost reset to null for a new connection).
setSavedConfig(payload.telegram);
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/telegram.ts (1)
129-143:⚠️ Potential issue | 🟠 Major | ⚡ Quick winClear deploy metadata when the bot token changes.
This keeps the new token marked as deployed even though the remote
docker-compose.ymlstill has the previousTELEGRAM_BOT_TOKENuntil/api/telegram/deployruns again. The result is a false deployed state and misleading test behavior for the newly connected bot.Suggested fix
- const existing = await getLatestTelegramRecord(session.user.id); - await db .update(telegramConfigs) .set({ isActive: false }) .where(eq(telegramConfigs.userId, session.user.id)); await db.insert(telegramConfigs).values({ userId: session.user.id, botToken: encryptSecret(botToken), botUsername: bot.username, isActive: true, - deployedServerId: existing?.deployedServerId ?? null, - deployedServerHost: existing?.deployedServerHost ?? null, - apiServerKey: existing?.apiServerKey ?? null, + deployedServerId: null, + deployedServerHost: null, + apiServerKey: null, }); ... return context.json({ telegram: { botUsername: bot.username, botTokenLast4: getTokenLast4(botToken), isActive: true, - deployedServerHost: existing?.deployedServerHost ?? null, + deployedServerHost: null, } satisfies TelegramConfigSummary, });Also applies to: 156-162
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/telegram.ts` around lines 129 - 143, When inserting the new telegramConfigs row, clear deployedServerId/deployedServerHost/apiServerKey whenever the bot token changed so we don't preserve a "deployed" state for a different token; use getLatestTelegramRecord and compare existing?.botToken to encryptSecret(botToken) (or decrypt then compare) to decide whether to carry over those fields, and set them to null when they differ. Update the same logic in the second insert block that covers lines 156-162 to match.
🧹 Nitpick comments (1)
server/telegram.test.ts (1)
43-159: ⚡ Quick winAdd coverage for the new deploy/test handlers and reconnect state.
This suite still only exercises connect/disconnect. The new server-side behavior needs regression coverage for deploy success/failure, undeployed test rejection, and reconnect clearing deployment metadata, otherwise the stale-deployment path can slip through unchanged.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/telegram.test.ts` around lines 43 - 159, Add tests in telegram.test.ts for the new deploy/test and reconnect flows: write unit tests that call the exported handlers (e.g., deployTelegram, testTelegram, reconnectTelegram) similar to connectTelegram/disconnectTelegram; mock fetch responses to simulate deploy success and failure and assert response status/body and that updateSet/insertValues are called with expected fields (deployedServerHost, isActive, botToken encryption via encryptSecret), add a test that ensures testTelegram rejects when the bot is not deployed (selectLimit returning empty) and returns the proper error, and add a reconnectTelegram test that verifies reconnect clears stale deployment metadata (deployedServerHost -> null) and sets isActive appropriately by inspecting updateSet/insertValues/selectLimit/mock calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@drizzle/0001_ai_providers_base_url.sql`:
- Line 1: The migration currently only adds the "base_url" column to the
"ai_providers" table but must also add the Telegram-related columns declared in
server/db/schema.ts: add nullable text columns "deployed_server_id",
"deployed_server_host", and "api_server_key" to the same ALTER TABLE statement
so upgraded DBs include those fields that server/telegram.ts reads/writes;
update drizzle/0001_ai_providers_base_url.sql to ALTER TABLE "ai_providers" ADD
COLUMN for each of those three columns (nullable text) alongside "base_url".
In `@server/providers.ts`:
- Around line 183-187: The code currently passes user-supplied
resolvedApiKey.baseUrl into verifyProviderConnection (and similar call sites
around the other ranges) allowing SSRF; add a pre-check that parses and
constrains the baseUrl before any outbound fetch: create/ reuse a helper like
validateOutboundUrl(url, provider) and call it where verifyProviderConnection is
invoked (e.g., before verifyProviderConnection in the block using
parsed.provider and resolvedApiKey.baseUrl and the other call sites at ~278-288
and ~304-368); the helper must (1) require http(s) scheme, (2) normalize/resolve
host and reject localhost/127.0.0.0/8, ::1, link-local and RFC1918/private IP
ranges, and (3) enforce provider-specific allowlists/denylists (for 'custom'
only allow pre-configured domains or explicitly whitelisted hosts), and
throw/return an error if validation fails so the code never fetches an arbitrary
user-supplied URL.
In `@server/telegram.ts`:
- Around line 283-306: The SSH exec calls inside the withSshConnection callback
(the ssh.execCommand invocations that produce writeResult and restartResult when
using writeCmd and "cd ~/hermes && sudo docker compose up -d") need to be
bounded with a timeout like testTelegramBot does; update both ssh.execCommand
calls to pass execOptions.timeout (or an object with timeout set) so the remote
write and restart won't hang the HTTP request, and ensure errors still check
result.code as before.
- Around line 263-281: The inline composeContent and writeCmd in
server/telegram.ts overwrite ~/hermes/docker-compose.yml with a hardcoded
nousresearch/hermes-agent:latest which can drift from the install-time compose
built from defaultHermesImage in server/install.ts; instead, reuse the canonical
compose template or image value by importing/reading the same source (the
defaultHermesImage or the compose-builder used in server/install.ts) and
generate the composeContent dynamically from that shared value so the deploy
path does not hardcode the image/version or duplicate the template logic (update
references in composeContent and writeCmd to consume the shared
defaultHermesImage/compose builder).
- Around line 261-315: Persisting apiServerKey in plaintext is unsafe; before
saving the generated apiServerKey (the variable apiServerKey used in the
db.update(telegramConfigs) call) call encryptSecret() and store the encrypted
blob in the DB field (e.g., apiServerKey) instead of raw text, and when you
later need to send Authorization headers decrypt it with decryptSecret()
immediately before constructing the header (replace any direct reads of
apiServerKey used to build Authorization with a decryptSecret call); ensure all
other places that persist or read apiServerKey follow the same pattern (use
encryptSecret() on save and decryptSecret() on use).
In `@src/features/providers/provider-settings.tsx`:
- Around line 57-69: The updateProvider function always resets model to
getDefaultAiModel(provider), which drops a previously saved model when toggling
providers back to the active one; change the model field in setForm inside
updateProvider to use savedConfig.model when the selected provider matches
savedConfig.provider and savedConfig.model exists (falling back to
getDefaultAiModel(provider) otherwise) so that providers like "openrouter",
"ollama", and "custom" preserve their saved model; locate updateProvider (and
helper getAiProviderOption) and update the model assignment logic accordingly.
In `@src/features/telegram/telegram-settings.tsx`:
- Around line 67-75: The code currently preserves deployedServerHost across
reconnects by falling back to savedConfig?.deployedServerHost when
payload.telegram.deployedServerHost is absent; change this so deployedServerHost
is taken only from payload.telegram (or null) to force redeploy on token
replace. Update the setSavedConfig call in Telegram settings (where
setSavedConfig({...payload.telegram, deployedServerHost: ...})) to remove the
savedConfig fallback and set deployedServerHost to
payload.telegram.deployedServerHost ?? null so a replaced token does not inherit
an old host.
- Around line 111-143: The handleDeploy function currently only handles non-2xx
responses but doesn't catch network/fetch rejections, so wrap the fetch call in
a try/catch (or catch the fetch promise) and on any thrown error call
setDeployError(error.message ?? String(error)) before returning; ensure
setIsDeploying is still cleared in the existing finally block. Do the same
defensive pattern for the equivalent test flow (the block handling test API
calls around lines 146-176) so both fetch rejections set testError/deployError
and prevent unhandled promise rejections.
---
Outside diff comments:
In `@server/telegram.ts`:
- Around line 129-143: When inserting the new telegramConfigs row, clear
deployedServerId/deployedServerHost/apiServerKey whenever the bot token changed
so we don't preserve a "deployed" state for a different token; use
getLatestTelegramRecord and compare existing?.botToken to
encryptSecret(botToken) (or decrypt then compare) to decide whether to carry
over those fields, and set them to null when they differ. Update the same logic
in the second insert block that covers lines 156-162 to match.
---
Nitpick comments:
In `@server/telegram.test.ts`:
- Around line 43-159: Add tests in telegram.test.ts for the new deploy/test and
reconnect flows: write unit tests that call the exported handlers (e.g.,
deployTelegram, testTelegram, reconnectTelegram) similar to
connectTelegram/disconnectTelegram; mock fetch responses to simulate deploy
success and failure and assert response status/body and that
updateSet/insertValues are called with expected fields (deployedServerHost,
isActive, botToken encryption via encryptSecret), add a test that ensures
testTelegram rejects when the bot is not deployed (selectLimit returning empty)
and returns the proper error, and add a reconnectTelegram test that verifies
reconnect clears stale deployment metadata (deployedServerHost -> null) and sets
isActive appropriately by inspecting updateSet/insertValues/selectLimit/mock
calls.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 62f36388-112a-4baf-a661-11ac0d36b00c
📒 Files selected for processing (15)
drizzle/0001_ai_providers_base_url.sqldrizzle/meta/_journal.jsonserver/app.tsserver/db/schema.tsserver/install.tsserver/providers.test.tsserver/providers.tsserver/telegram.test.tsserver/telegram.tssrc/features/providers/provider-settings.test.tsxsrc/features/providers/provider-settings.tsxsrc/features/telegram/telegram-settings.test.tsxsrc/features/telegram/telegram-settings.tsxsrc/lib/ai-providers.tssrc/routes/telegram.tsx
| @@ -0,0 +1 @@ | |||
| ALTER TABLE "ai_providers" ADD COLUMN "base_url" text; | |||
There was a problem hiding this comment.
Add the telegram_configs columns in this migration too.
server/db/schema.ts now declares deployed_server_id, deployed_server_host, and api_server_key, and server/telegram.ts already reads/writes them. With only this ALTER TABLE "ai_providers" statement, upgraded databases will be missing those Telegram columns and the new deploy/test flows will fail at runtime.
Suggested migration addition
ALTER TABLE "ai_providers" ADD COLUMN "base_url" text;
+ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_id" text;
+ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_host" text;
+ALTER TABLE "telegram_configs" ADD COLUMN "api_server_key" text;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ALTER TABLE "ai_providers" ADD COLUMN "base_url" text; | |
| ALTER TABLE "ai_providers" ADD COLUMN "base_url" text; | |
| ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_id" text; | |
| ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_host" text; | |
| ALTER TABLE "telegram_configs" ADD COLUMN "api_server_key" text; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@drizzle/0001_ai_providers_base_url.sql` at line 1, The migration currently
only adds the "base_url" column to the "ai_providers" table but must also add
the Telegram-related columns declared in server/db/schema.ts: add nullable text
columns "deployed_server_id", "deployed_server_host", and "api_server_key" to
the same ALTER TABLE statement so upgraded DBs include those fields that
server/telegram.ts reads/writes; update drizzle/0001_ai_providers_base_url.sql
to ALTER TABLE "ai_providers" ADD COLUMN for each of those three columns
(nullable text) alongside "base_url".
| await verifyProviderConnection({ | ||
| provider: parsed.provider, | ||
| apiKey: resolvedApiKey.apiKey, | ||
| baseUrl: resolvedApiKey.baseUrl, | ||
| }); |
There was a problem hiding this comment.
Constrain baseUrl before using it for backend fetches.
/api/providers/test now lets the caller choose the outbound URL that the server fetches. For the new custom path in particular, this becomes an SSRF primitive against internal hosts/services. Please validate the URL up front and enforce provider-specific allowlists/denylists instead of fetching any user-supplied target.
Also applies to: 278-288, 304-368
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/providers.ts` around lines 183 - 187, The code currently passes
user-supplied resolvedApiKey.baseUrl into verifyProviderConnection (and similar
call sites around the other ranges) allowing SSRF; add a pre-check that parses
and constrains the baseUrl before any outbound fetch: create/ reuse a helper
like validateOutboundUrl(url, provider) and call it where
verifyProviderConnection is invoked (e.g., before verifyProviderConnection in
the block using parsed.provider and resolvedApiKey.baseUrl and the other call
sites at ~278-288 and ~304-368); the helper must (1) require http(s) scheme, (2)
normalize/resolve host and reject localhost/127.0.0.0/8, ::1, link-local and
RFC1918/private IP ranges, and (3) enforce provider-specific
allowlists/denylists (for 'custom' only allow pre-configured domains or
explicitly whitelisted hosts), and throw/return an error if validation fails so
the code never fetches an arbitrary user-supplied URL.
| function updateProvider(provider: AiProviderId) { | ||
| const option = getAiProviderOption(provider); | ||
| setForm({ | ||
| provider, | ||
| model: getDefaultAiModel(provider), | ||
| apiKey: "", | ||
| baseUrl: | ||
| option?.id === savedConfig?.provider && savedConfig?.baseUrl | ||
| ? savedConfig.baseUrl | ||
| : provider === "ollama" | ||
| ? "http://localhost:11434/v1" | ||
| : "", | ||
| }); |
There was a problem hiding this comment.
Preserve the saved model when switching back to the active provider.
You already restore savedConfig.baseUrl here, but model always resets to the provider default. For openrouter, ollama, and custom, a quick provider toggle drops the saved custom model from the form and makes the next save easy to get wrong.
Suggested state fix
function updateProvider(provider: AiProviderId) {
const option = getAiProviderOption(provider);
+ const isSavedProvider = option?.id === savedConfig?.provider;
setForm({
provider,
- model: getDefaultAiModel(provider),
+ model: isSavedProvider
+ ? savedConfig.model
+ : getDefaultAiModel(provider),
apiKey: "",
baseUrl:
- option?.id === savedConfig?.provider && savedConfig?.baseUrl
+ isSavedProvider && savedConfig?.baseUrl
? savedConfig.baseUrl
: provider === "ollama"
? "http://localhost:11434/v1"
: "",
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function updateProvider(provider: AiProviderId) { | |
| const option = getAiProviderOption(provider); | |
| setForm({ | |
| provider, | |
| model: getDefaultAiModel(provider), | |
| apiKey: "", | |
| baseUrl: | |
| option?.id === savedConfig?.provider && savedConfig?.baseUrl | |
| ? savedConfig.baseUrl | |
| : provider === "ollama" | |
| ? "http://localhost:11434/v1" | |
| : "", | |
| }); | |
| function updateProvider(provider: AiProviderId) { | |
| const option = getAiProviderOption(provider); | |
| const isSavedProvider = option?.id === savedConfig?.provider; | |
| setForm({ | |
| provider, | |
| model: isSavedProvider | |
| ? savedConfig.model | |
| : getDefaultAiModel(provider), | |
| apiKey: "", | |
| baseUrl: | |
| isSavedProvider && savedConfig?.baseUrl | |
| ? savedConfig.baseUrl | |
| : provider === "ollama" | |
| ? "http://localhost:11434/v1" | |
| : "", | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/features/providers/provider-settings.tsx` around lines 57 - 69, The
updateProvider function always resets model to getDefaultAiModel(provider),
which drops a previously saved model when toggling providers back to the active
one; change the model field in setForm inside updateProvider to use
savedConfig.model when the selected provider matches savedConfig.provider and
savedConfig.model exists (falling back to getDefaultAiModel(provider) otherwise)
so that providers like "openrouter", "ollama", and "custom" preserve their saved
model; locate updateProvider (and helper getAiProviderOption) and update the
model assignment logic accordingly.
| async function handleDeploy() { | ||
| setIsDeploying(true); | ||
| setDeployError(null); | ||
| setDeploySuccess(null); | ||
|
|
||
| try { | ||
| const response = await fetch("/api/telegram/deploy", { | ||
| method: "POST", | ||
| }); | ||
|
|
||
| const payload = (await response.json().catch(() => null)) as { | ||
| error?: string; | ||
| status?: string; | ||
| serverHost?: string; | ||
| } | null; | ||
|
|
||
| if (!response.ok) { | ||
| setDeployError(payload?.error ?? "Deploy failed"); | ||
| return; | ||
| } | ||
|
|
||
| const serverHost = payload?.serverHost ?? null; | ||
| // Update savedConfig so isDeployed / deployedHost derive from a single | ||
| // source of truth rather than a separate currentDeployStatus atom. | ||
| setSavedConfig((prev) => | ||
| prev ? { ...prev, deployedServerHost: serverHost } : prev, | ||
| ); | ||
| setDeploySuccess( | ||
| `Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`, | ||
| ); | ||
| } finally { | ||
| setIsDeploying(false); | ||
| } |
There was a problem hiding this comment.
Handle thrown fetch() failures in the new deploy/test flows.
These handlers only set UI errors for non-2xx responses. If fetch rejects, the promise bubbles out and the user gets no deployError/testError feedback.
Suggested pattern
try {
const response = await fetch("/api/telegram/deploy", {
method: "POST",
});
...
`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
);
+ } catch (error) {
+ setDeployError(error instanceof Error ? error.message : "Deploy failed");
} finally {
setIsDeploying(false);
} try {
const response = await fetch("/api/telegram/test", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ message: testMessage }),
});
...
setTestResponse(payload?.response ?? "(empty response)");
+ } catch (error) {
+ setTestError(error instanceof Error ? error.message : "Test failed");
} finally {
setIsTesting(false);
}Also applies to: 146-176
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/features/telegram/telegram-settings.tsx` around lines 111 - 143, The
handleDeploy function currently only handles non-2xx responses but doesn't catch
network/fetch rejections, so wrap the fetch call in a try/catch (or catch the
fetch promise) and on any thrown error call setDeployError(error.message ??
String(error)) before returning; ensure setIsDeploying is still cleared in the
existing finally block. Do the same defensive pattern for the equivalent test
flow (the block handling test API calls around lines 146-176) so both fetch
rejections set testError/deployError and prevent unhandled promise rejections.
- Add buildHermesComposeContent() to server/compose.ts for the duplicated docker-compose template between install and telegram - Add resolveServerSshConfig() to server/server-records.ts to DRY up the normalizeAuthMethod + resolveServerCredential pattern - Apply both helpers across install, server-actions, servers, and dashboard; remove local duplicate functions (getInstallCredential, getServerCredential, normalizeAuthMethod)
…n, handle API errors - Reset deployedServerId/Host/apiServerKey to null when connecting a new bot token to prevent split-brain (old token still on VPS) - Return deployedServerHost: null in connect response for consistency - Wrap telegramConfigs update + auditLog insert in db.transaction() - Handle Hermes API error responses in testTelegramBot curl results (curl does not use --fail, so error JSON was silently swallowed) - Add null safety to JSON.parse result in testTelegramBot - Remove frontend code that preserved stale deployedServerHost on reconnect - Add DB migration for telegram_configs columns
… ternaries
- Merge identical ollama and custom provider test branches using the
requiresBaseUrl flag from AiProviderOption
- Add defaultBaseUrl field to AiProviderOption metadata
- Replace 7 scattered form.provider === 'ollama' conditionals in
provider-settings.tsx with provider metadata lookups
- Wrap JSON.parse in decryptApiKey with its own try-catch so keys
starting with '{' that are not valid JSON fall back to raw text
globalThis.process guard prevents tree-shaking of dev-only early return in requireHttps and the Resend API key check when Dockerfile bakes NODE_ENV=production at build time.
Swap NODE_ENV to development so hot-reload works in local dev. Mailpit provides a catch-all SMTP server with a web UI on port 8025, replacing the need for a real Resend API key during development.
Covers TanStack Start + Hono dispatch, PostgreSQL + Drizzle ORM, Better Auth with magic links, Docker multi-stage build, AES-256-GCM credential encryption, TanStack Router file-based routing, and Tailwind v4 with island components.
Covers four scenarios: selecting custom provider from empty state, loading saved config with model/baseUrl, empty model/url, and undefined baseUrl edge case.
feat(ui): migrate ConnectionWizard and login to react-hook-form + zod
Result: {"status":"keep","test_time_ms":7200}
Result: {"status":"keep","test_time_ms":7300}
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
Result: {"status":"keep","test_time_ms":7900}
Result: {"status":"keep","test_time_ms":9200}
…art plugins in test environment
Result: {"status":"keep","test_time_ms":7300}
…hecking
Result: {"status":"keep","test_time_ms":7500}
Result: {"status":"keep","test_time_ms":4800}
Result: {"status":"keep","test_time_ms":4600}
…pilation
Result: {"status":"keep","test_time_ms":4600}
…time
Result: {"status":"keep","test_time_ms":5100}
… for component tests
Result: {"status":"keep","metric":5200}
Result: {"status":"keep","metric":4800}
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/telegram.ts (1)
295-308:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd timeouts to SSH exec commands to prevent request hangs.
Both
ssh.execCommandcalls here run without explicit timeouts, whiletestTelegramBot(line 441) andrunHermesPairingJsonCommand(line 725) correctly useexecOptions: { timeout: ... }. A hung SSH session or slow container restart can pin the HTTP request indefinitely.🛡️ Proposed fix to add timeouts
async (ssh) => { - const writeResult = await ssh.execCommand(writeCmd); + const writeResult = await ssh.execCommand(writeCmd, { + execOptions: { timeout: 30_000 }, + }); if (writeResult.code !== 0) { throw new Error( writeResult.stderr || "Failed to write docker-compose.yml", ); } const restartResult = await ssh.execCommand( "cd ~/hermes && sudo docker compose up -d --force-recreate", + { execOptions: { timeout: 120_000 } }, ); if (restartResult.code !== 0) { throw new Error(restartResult.stderr || "Failed to restart Hermes"); } },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/telegram.ts` around lines 295 - 308, The two ssh.execCommand calls inside the async callback (the one executing writeCmd and the subsequent restart command) lack execOptions timeouts and can hang; update both calls (the execCommand invoking writeCmd and the execCommand running "cd ~/hermes && sudo docker compose up -d --force-recreate") to pass execOptions: { timeout: <ms> } (use the same timeout constant used by testTelegramBot/runHermesPairingJsonCommand or a sensible value like 120000ms), propagate/handle a timeout by including the stderr/timeout info in the thrown Error (e.g., throw new Error(restartResult.stderr || '... timed out')). Ensure you reference these exact execCommand calls so both are updated.
♻️ Duplicate comments (1)
server/providers.ts (1)
616-623:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse
decryptApiKey()for backward-compatible JSON unwrapping.Same issue as
getProviderDeployConfig: this usesdecryptSecret()directly rather thandecryptApiKey(), so legacy JSON-wrapped keys will be written verbatim to the Hermes compose file instead of the actual API key.🐛 Proposed fix
let decryptedApiKey = ""; if (providerRecord.encryptedApiKey) { - try { - decryptedApiKey = decryptSecret(providerRecord.encryptedApiKey); - } catch { - return context.json({ error: "Failed to decrypt API key." }, 500); - } + decryptedApiKey = decryptApiKey(providerRecord.encryptedApiKey); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/providers.ts` around lines 616 - 623, The code is calling decryptSecret() directly which fails to unwrap legacy JSON-wrapped API keys; replace the decryptSecret() usage with decryptApiKey() when populating decryptedApiKey (i.e. where providerRecord.encryptedApiKey is handled) so legacy JSON format is normalized the same way as getProviderDeployConfig; ensure you import/keep reference to decryptApiKey() and use it in place of decryptSecret() before returning or writing the Hermes compose file.
🧹 Nitpick comments (1)
server/app.test.ts (1)
19-27: ⚡ Quick winConsider adding test cases for the new mocked routes.
The file adds mock declarations for five new route handlers but does not include test cases to verify that the routes are wired correctly. All existing routes in this file have corresponding tests (lines 86–464), so the new deployment and testing routes should follow the same pattern for consistency and coverage.
Missing route tests:
- POST /api/providers/deploy
- POST /api/telegram/deploy
- POST /api/telegram/test
- GET /api/telegram/pairings
- POST /api/telegram/pairings/approve
Example test skeleton for POST /api/providers/deploy
+ it("routes provider deploy requests through the deploy handler", async () => { + deployProviderToHermes.mockResolvedValueOnce( + new Response(JSON.stringify({ deployedModel: "gpt-4o-mini" }), { + status: 200, + headers: { "content-type": "application/json" }, + }), + ); + + const { apiApp } = await import("./app"); + const response = await apiApp.request( + "http://localhost/api/providers/deploy", + { + method: "POST", + }, + ); + + expect(response.status).toBe(200); + expect(deployProviderToHermes).toHaveBeenCalledTimes(1); + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/app.test.ts` around lines 19 - 27, Add unit tests that exercise and assert the five new mocked route handlers are wired like the existing tests: create tests for POST /api/providers/deploy (assert deployProviderToHermes is called with expected body and responds with correct status/body), POST /api/telegram/deploy (assert deployTelegramToServer called), POST /api/telegram/test (assert testTelegramBot called), GET /api/telegram/pairings (assert listTelegramPairings called and response forwarded), and POST /api/telegram/pairings/approve (assert approveTelegramPairing called with correct payload); use the same test structure and fixtures as the other route tests, stub the mock functions deployProviderToHermes, deployTelegramToServer, testTelegramBot, listTelegramPairings, and approveTelegramPairing, and verify both invocation and HTTP response handling for each route.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.planning/codebase/INTEGRATIONS.md:
- Around line 7-26: The four duplicated section headers in INTEGRATIONS.md
should be replaced with unique provider-specific headers so each block is
clearly identifiable; update the repeated "**AI Providers:**" headings to e.g.
"**OpenAI Models API:**", "**Anthropic Models API:**", "**OpenRouter API:**" and
"**Ollama / OpenAI-compatible endpoint:**" respectively, leaving the body text
intact and keeping existing references to server/providers.ts,
src/lib/ai-providers.ts, server/crypto.ts, and server/db/schema.ts so links
remain accurate.
In `@server/crypto.ts`:
- Around line 56-65: The decryptApiServerKey function currently swallows
decryption errors; update it to log a warning (including the caught error) when
decryptSecret throws and we fall back to returning plaintext, add a short
comment above decryptApiServerKey describing the backward-compatibility contract
(plaintext fallback for legacy or unencrypted values and that ENCRYPTION_KEY
misconfiguration will trigger warnings), and emit or increment an observability
metric/audit event when the fallback occurs (call your existing metric/audit
helper such as recordMetric/incrementCounter/auditLog or create one if missing)
so fallback usage is tracked.
In `@server/providers.ts`:
- Around line 544-551: In getProviderDeployConfig, replace the direct call to
decryptSecret(record.encryptedApiKey) with decryptApiKey(record.encryptedApiKey)
so legacy JSON-wrapped keys are unwrapped (matching behavior in
getCurrentProviderConfig and resolveProviderApiKey); preserve the existing
try/catch and error message behavior so decryptedApiKey remains set for
subsequent calls to buildProviderEnvMap and avoid writing raw JSON to Hermes env
vars.
---
Outside diff comments:
In `@server/telegram.ts`:
- Around line 295-308: The two ssh.execCommand calls inside the async callback
(the one executing writeCmd and the subsequent restart command) lack execOptions
timeouts and can hang; update both calls (the execCommand invoking writeCmd and
the execCommand running "cd ~/hermes && sudo docker compose up -d
--force-recreate") to pass execOptions: { timeout: <ms> } (use the same timeout
constant used by testTelegramBot/runHermesPairingJsonCommand or a sensible value
like 120000ms), propagate/handle a timeout by including the stderr/timeout info
in the thrown Error (e.g., throw new Error(restartResult.stderr || '... timed
out')). Ensure you reference these exact execCommand calls so both are updated.
---
Duplicate comments:
In `@server/providers.ts`:
- Around line 616-623: The code is calling decryptSecret() directly which fails
to unwrap legacy JSON-wrapped API keys; replace the decryptSecret() usage with
decryptApiKey() when populating decryptedApiKey (i.e. where
providerRecord.encryptedApiKey is handled) so legacy JSON format is normalized
the same way as getProviderDeployConfig; ensure you import/keep reference to
decryptApiKey() and use it in place of decryptSecret() before returning or
writing the Hermes compose file.
---
Nitpick comments:
In `@server/app.test.ts`:
- Around line 19-27: Add unit tests that exercise and assert the five new mocked
route handlers are wired like the existing tests: create tests for POST
/api/providers/deploy (assert deployProviderToHermes is called with expected
body and responds with correct status/body), POST /api/telegram/deploy (assert
deployTelegramToServer called), POST /api/telegram/test (assert testTelegramBot
called), GET /api/telegram/pairings (assert listTelegramPairings called and
response forwarded), and POST /api/telegram/pairings/approve (assert
approveTelegramPairing called with correct payload); use the same test structure
and fixtures as the other route tests, stub the mock functions
deployProviderToHermes, deployTelegramToServer, testTelegramBot,
listTelegramPairings, and approveTelegramPairing, and verify both invocation and
HTTP response handling for each route.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: debea91c-f7dd-433d-af21-9dac719df63c
📒 Files selected for processing (20)
.planning/codebase/ARCHITECTURE.md.planning/codebase/CONCERNS.md.planning/codebase/CONVENTIONS.md.planning/codebase/INTEGRATIONS.md.planning/codebase/STACK.md.planning/codebase/STRUCTURE.md.planning/codebase/TESTING.mdREADME.mddrizzle/0001_overjoyed_annihilus.sqldrizzle/meta/0001_snapshot.jsondrizzle/meta/_journal.jsonserver/app.test.tsserver/app.tsserver/compose.tsserver/crypto.tsserver/providers.test.tsserver/providers.tsserver/telegram.tssrc/features/telegram/telegram-settings.test.tsxsrc/features/telegram/telegram-settings.tsx
✅ Files skipped from review due to trivial changes (5)
- drizzle/0001_overjoyed_annihilus.sql
- drizzle/meta/0001_snapshot.json
- README.md
- .planning/codebase/STRUCTURE.md
- .planning/codebase/ARCHITECTURE.md
🚧 Files skipped from review as they are similar to previous changes (3)
- server/compose.ts
- server/app.ts
- src/features/telegram/telegram-settings.tsx
| export function decryptApiServerKey(payload: string): string { | ||
| if (!payload) { | ||
| return ""; | ||
| } | ||
| try { | ||
| return decryptSecret(payload); | ||
| } catch { | ||
| return payload; | ||
| } | ||
| } |
There was a problem hiding this comment.
Add logging and documentation for decryption fallback.
The silent catch block makes it impossible to detect when decryption fails. If the ENCRYPTION_KEY is incorrect or if data is corrupted, all encrypted keys will silently fall back to being treated as plaintext, causing downstream authentication failures with no debugging breadcrumbs.
Consider:
- Log a warning when falling back to plaintext (helps track migration progress and detect misconfiguration)
- Document the backward-compatibility contract in a comment
- Add a metric or audit log entry to track fallback usage
🔍 Proposed fix to add observability
export function decryptApiServerKey(payload: string): string {
if (!payload) {
return "";
}
try {
return decryptSecret(payload);
- } catch {
+ } catch (error) {
+ console.warn(
+ "[crypto] Failed to decrypt apiServerKey, treating as plaintext (legacy format). " +
+ "Consider re-encrypting this key.",
+ { error: error instanceof Error ? error.message : String(error) }
+ );
return payload;
}
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/crypto.ts` around lines 56 - 65, The decryptApiServerKey function
currently swallows decryption errors; update it to log a warning (including the
caught error) when decryptSecret throws and we fall back to returning plaintext,
add a short comment above decryptApiServerKey describing the
backward-compatibility contract (plaintext fallback for legacy or unencrypted
values and that ENCRYPTION_KEY misconfiguration will trigger warnings), and emit
or increment an observability metric/audit event when the fallback occurs (call
your existing metric/audit helper such as recordMetric/incrementCounter/auditLog
or create one if missing) so fallback usage is tracked.
Harden provider/telegram deployment paths by validating model strings,\nquoting shell-sensitive values, and sharing token-last4 logic.\n\nAlso clear dashboard cache after mutating server/provider/telegram actions\nand expand tests around deploy, quoting, and model validation paths.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Server actions now execute DB writes through transactions.\n\nUpdate the test harness to expose db.transaction and run handlers\nwith a transaction-scoped DB object so tests exercise the same path.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Decompose the large telegram settings component into dedicated\nconnect, deploy, pairing, test, and sidebar sections with shared\ninput styling.\n\nThis reduces coupling and keeps each workflow isolated without\nchanging API behavior.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Document the Telegram pairing management APIs and troubleshooting\nflow, and record architecture decisions for single-instance state\nand runtime management from the Telegram page.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop the dead updateLatestInstallVersion helper now that version updates\nare handled transactionally inside runServerAction.\n\nThis keeps server-actions focused on active execution paths and\navoids stale helper drift.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Capture when to use DB transactions for coupled multi-write flows\nand when sequential primary+audit writes are intentionally acceptable.\n\nThis makes consistency expectations explicit for future changes in\nTelegram deploy and server action paths.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The 177-line ssh.ts mixed connection lifecycle, error normalization, OS detection, and shell quoting. Split into four focused modules under server/ssh/ so each concern lives in its own file: - connection.ts owns withSshConnection, verifyServerConnection, and the SshConnectionInput type. It owns the NodeSSH lifecycle and the execStrict helper. - errors.ts owns UnsupportedOsError, SshConnectError, and normalizeSshError which translates raw node-ssh errors into the two user-facing categories (invalid credentials, host unreachable). - os.ts owns parseAndValidateOs and VerifiedServerInfo. It reads /etc/os-release, classifies Ubuntu/Debian support, and returns supportLevel for the dashboard. - quoting.ts owns shellQuote, a 3-line helper that doesn't deserve to live in the connection module. server/ssh.ts now re-exports the same public surface (SshAuthMethod, SshConnectionInput, verifyServerConnection, withSshConnection, VerifiedServerInfo, parseAndValidateOs, UnsupportedOsError, SshConnectError, normalizeSshError, shellQuote) so existing imports keep working unchanged.
…elper Two helpers are needed by every code path that SSHes into a server without first verifying ownership: telegram deploy, telegram test, telegram pairings, and provider deploy. They used to reimplement the same try/catch around resolveServerSshConfig and re-query the servers table to find the deployed server. - ServerConnectionRecord is the minimal projection (no ownership, no status) used when the server is already trusted via a previous deployment record. - getServerById looks up a server by ID without an ownership check. This is safe for telegram/provider deploys because the caller already validated the deployment state via telegramConfigs rows. - resolveServerSshConfigOrError returns a discriminated result instead of throwing. Callers map the failure to a 400 response, which removes four near-identical try/catch blocks across the telegram and server-actions handlers. This commit only adds the helpers. Callers will adopt them in follow-up commits so each commit stays reviewable.
…d modules The 437-line servers.ts mixed HTTP handlers (connect/update/delete/ list) with the read-side query logic that powers the dashboard list view. The query logic — including the SQL that joins audit_logs, servers, and installs to surface the latest server action per row — was hard to test in isolation and pulled in drizzle-orm symbols that the HTTP layer doesn't need. - servers/records.ts owns the three read helpers (getOwnedServerListRecords, getLatestInstallRecords, getLatestServerActionRecords), the audit-log name allowlist, and the record types. The SQL filter that pulls only the relevant server.* audit actions is colocated with the helper that uses it. - servers/list.ts owns getServerListSnapshot, the orchestration that combines the three record helpers, walks the latest install/action per server, and computes lastActivityAt. - servers.ts keeps the HTTP handlers and re-exports getServerListSnapshot so the server-fn in src/routes/servers.index.tsx keeps importing it from ./servers.
…o focused modules The 434-line dashboard.ts mixed three concerns: SQL lookups for the latest server/install/provider/telegram rows, pure summary transformation (toAgentSummary, toProviderSummary, etc.), and the live VPS metrics pipeline that shells out via SSH. That made it impossible to unit-test the transformations without mocking drizzle, and pulled a 15-second metrics cache into a file that mostly deals with static snapshot composition. - dashboard/records.ts owns the four getLatest* DB helpers and getServerCount. Pure SQL, no transformation. - dashboard/summaries.ts owns ServerRecord/InstallRecord/Provider- Record/TelegramRecord types, the to*Summary mappers, getHealthTone, and parsePercentValue. Pure functions, no DB, no SSH. - dashboard/metrics.ts owns the 15-second metrics cache and readServerMetrics, the only function in this slice that talks to SSH. clearMetricsCache is the explicit invalidation hook. - dashboard.ts becomes the composition layer: cache the static snapshot, compose the summaries, call getVpsSummary. It also re-exports getHealthTone and the to*Summary functions so existing test imports keep working.
…odules The 260-line install.ts mixed the Hono HTTP/SSE handler with two pieces of business logic: the install step list (six apt-get/ docker compose commands with progress percentages) and the workflow runner that shells out, persists audit logs, and emits SSE events. This made the install pipeline hard to read end-to- end and forced the SSE handler to import a long list of types that the HTTP layer doesn't care about. - install/workflow.ts owns InstallStep, the installSteps array, runInstallWorkflow, the SSH-driven command loop, and the audit-log writes for succeeded/failed states. Pure business logic, no Hono context. - install/records.ts owns upsertInstallRecord (idempotent create-or-reset of the install row) and getServerForInstall (the minimal server projection needed to SSH, with the ownership filter). - install.ts becomes the HTTP shell: startServerInstall claims the SSE slot, upserts the install row, then fires runInstallWorkflow in the background. streamServerInstallEvents and getLatestServerInstallLog stay here because they own the Hono/streamSSE context. The startServerInstall handler also adopts resolveServerSshConfigOrError to drop the inline try/catch around the credential resolve.
…o focused modules The 510-line providers.ts mixed four concerns: provider env-var config (PROVIDER_ENV_CONFIGS, buildProviderEnvMap, custom-vendor API key derivation), the verify-the-key live API call (createProviderTestRequest, verifyProviderConnection, the ProviderConnectionError class), DB lookups (getLatestProvider- Record, getTelegramDeployInfo, decryptApiKey, getApiKeyLast4), and the HTTP handlers + the deploy handler. Each helper pulls in its own set of imports and is meaningful on its own — extract-and-test for the env-var derivation in isolation is a long-standing gap. - providers/config.ts owns ProviderRequest, StoredProviderRecord, ProviderConfigSummary, the PROVIDER_ENV_CONFIGS table, buildProviderEnvMap, isApiKeyRequired, and deriveCustomProviderApiKeyEnvVar. Pure logic, no DB, no HTTP. - providers/connection.ts owns ProviderConnectionError, verifyProviderConnection, and createProviderTestRequest. The per-provider URL/header builder is private to the module since no caller needs to override it. - providers/records.ts owns decryptApiKey (with the legacy JSON unwrap), getApiKeyLast4, getLatestProviderRecord, and getTelegramDeployInfo. The deploy function still reads from here — the SQL naturally lives next to the records it queries. providers.ts becomes a thin HTTP layer: getCurrentProviderConfig, saveProviderConfig, testProviderConfig, parseProviderRequest, resolveProviderApiKey, getProviderDeployConfig, and deployProviderToHermes. The deploy handler stays in providers.ts for now; the follow-up commit moves it (and the shared docker-compose deploy helper) into server/deploy.ts so the provider and telegram deploy paths can share a single SSH pipeline.
…bmodules The provider and telegram deploy handlers both wrote a docker-compose.yml to the remote server and ran `docker compose up -d --force-recreate`. The telegram version also ran `hermes config set model` after a 2-second sleep so the new container would be ready. The duplication drifted: the provider version re-declared the writeCmd template inline, the telegram version kept the shellQuote import only because of a single command, and audit-log writes were copy-pasted. - server/deploy.ts owns deployComposeViaSsh (the SSH pipeline: write compose file, force-recreate the stack, optionally run extra commands inside the same connection) and deployProviderToHermes (moved verbatim from providers.ts so the two deploy paths share one SSH codepath). - server/telegram.ts now calls deployComposeViaSsh with no extraSshCommands and writes the audit log inside a transaction, exactly as before. - server/providers.ts no longer owns deployProviderToHermes. It keeps the public surface (saveProviderConfig, testProviderConfig, getCurrentProviderConfig, getProviderDeployConfig) that the dashboard/settings routes depend on. - server/app.ts imports deployProviderToHermes from ./deploy. The 547-line telegram.ts also split into focused submodules: - telegram/config.ts owns TelegramConnectRequest, TelegramConfigSummary, TelegramPairingSummary, the TelegramConnectionError class, verifyTelegramToken, and getTokenLast4. The Telegram getMe response shape stays private to the verifier. - telegram/pairings.ts owns listTelegramPairings and approveTelegramPairing. Both share a getDeployedTelegramServer resolver that returns either an HTTP response or the (serverRecord, authMethod, credential) triple, eliminating the repeated 401/404/400 branches. The pairing JSON parsing lives here too — Hermes returns snake_case keys, we project to camelCase before crossing the boundary. - telegram/records.ts owns getCurrentTelegramConfig, getLatestTelegramRecord, and findServerForDeploy (the join between installs and servers that returns a server whose most recent install succeeded). decryptSecret is used locally; the botToken is decrypted in this module so the HTTP layer never touches the encrypted form. - telegram.ts becomes the HTTP shell: connectTelegram, disconnectTelegram, deployTelegramToServer, testTelegramBot, and parseChatCompletion. The two SSH handlers adopt resolveServerSshConfigOrError and getServerById from server-records so the duplicated try/catch around resolveServerSshConfig disappears.
The runServerAction handler declared mutable authMethod/credential variables and wrapped resolveServerSshConfig in a try/catch to map any thrown error to a 400 response. The new resolveServerSshConfigOrError returns a discriminated result that removes the try/catch and the temporary let bindings. The import switch and the call site are the only changes — the SshAuthMethod type import stays because it appears later in resolveServerSshConfigOrError's return type for the rollback target.
decryptApiServerKey used to swallow every decryption failure and return the raw payload. The intent was to support legacy keys that were never encrypted, but the same path also masked real failures: a corrupted ciphertext or wrong key was indistinguishable from "this key was stored in plaintext". AES-GCM uses iv:tag:cipher and we join the three parts with a dot. An unencrypted key has no dots, so the presence of a dot is a reliable signal that the payload was encrypted. If decrypt fails AND the payload contains a dot, surface the error so the caller can prompt for a new key. If decrypt fails AND the payload has no dot, fall back to returning the raw value as before.
The provider tests now mock the new server-records helpers (getServerById, resolveServerSshConfigOrError) and import deployProviderToHermes from ./deploy instead of ./providers. The three 'returns 404 when deployed server is not found', 'returns 500 when bot token decryption fails', and 'returns 502 when the SSH pipeline fails' cases drop one selectLimit chain link each: the third .limit() call used to stand in for the servers lookup that getServerById now owns. The beforeEach setup installs the getServerById mock as the default and the 404 test overrides it to null. The shellQuote mock is now set up at module load time rather than lazily so the deploy compose tests can run before any handler is invoked.
The telegram tests mock the new server-records helpers
(getServerById, resolveServerSshConfigOrError) so the deploy
and test-bot paths can run without a live database or real
credential resolution.
The 'deploy writes compose + restarts containers' and
'deploys when API key contains shell metacharacters' cases
lose their second selectLimit chain link each: the second
.limit() used to mock findServerById, which is now
getServerById and resolves from the new server-records
mock rather than a chained select. The two chain links collapse
to one (getLatestTelegramRecord) and getServerById is set up
explicitly alongside it.
resolveServerSshConfigOrError is mocked to return
{ok: true, ...} in both happy-path cases and the SSH-error
case, matching the new discriminated return shape the
handlers expect.
TelegramSettings mounts TelegramPairingSection with a key bound to savedConfig.deployedServerHost. The key remounts the section when the deploy target changes, so a section that started in "not deployed" state cleanly re-initializes when a new server is selected. TelegramPairingSection now polls /api/telegram/pairings every 10 seconds via useMountEffect. The 10s interval matches the Hermes bot's typical pairing-code lifetime; users no longer have to manually click "Refresh" while a code is being typed in Telegram. loadPairings takes a quiet flag so the polling path can update the list without flashing the loading spinner or overwriting an in-flight user-facing error. The approve handler now takes an optional selectedCode so the new list-row approve buttons can pass the code directly. The text input flow still passes the trimmed input value, preserving the existing manual-entry path. The settings test adds a 'loads pending Telegram pairing requests and approves one from the list' case that mounts the component with a deployed config, waits for the polling fetch to land, and clicks the new row-level approve button.
The split-into-submodules commits landed with single-line imports
where biome prefers the type/value grouped form, and the type
imports in dashboard/summaries.ts and ssh/connection.ts were
not sorted. This is a pure formatting pass — no behavioral
changes.
- server/ssh.ts: type group stays at the top of each export
statement, re-exports are wrapped to the multi-line form biome
picks.
- server/dashboard/metrics.ts and summaries.ts: imports are
split into one-symbol-per-line groups.
- server/ssh/connection.ts: type-only imports move to the top.
- server/servers.ts: the servers/list import is reordered so
the new-module imports sit together with server-records.
- server/dashboard.ts: type imports precede the value imports
inside the { } group, matching the import-sort Biome config.
There was a problem hiding this comment.
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
server/servers.ts (1)
145-160:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't let audit-log failures flip successful server mutations into 500s.
These handlers commit the primary
serverschange first, then await a historical audit insert. If the audit write fails, the request can return failure even though the server mutation already succeeded, which is especially risky forconnectServer()because retries can create duplicates.As per coding guidelines, "Sequential writes are fine when primary data → audit log sequences are purely historical and the audit log's absence doesn't affect correctness."🩹 Suggested fix
- await db.insert(auditLogs).values({ - userId: session.user.id, - action: "server.connect.succeeded", - details: { - serverId: serverRecord.id, - host: parsed.host, - osName: verified.osName, - osVersion: verified.osVersion, - architecture: verified.architecture, - supportLevel: verified.supportLevel, - }, - ipAddress, - }); - - clearDashboardCache(); + try { + await db.insert(auditLogs).values({ + userId: session.user.id, + action: "server.connect.succeeded", + details: { + serverId: serverRecord.id, + host: parsed.host, + osName: verified.osName, + osVersion: verified.osVersion, + architecture: verified.architecture, + supportLevel: verified.supportLevel, + }, + ipAddress, + }); + } finally { + clearDashboardCache(); + }Apply the same pattern to
updateServer()anddeleteServer(): treat audit logging as best-effort and keep cache invalidation on the successful primary path.Also applies to: 287-299, 327-338
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/servers.ts` around lines 145 - 160, The audit-log insert (db.insert(auditLogs).values(...)) is making successful mutations into 500s when it fails; wrap the audit insert in a try/catch so it becomes best-effort: in connectServer(), updateServer(), and deleteServer() catch any errors from the audit write, log the error (don’t rethrow) and continue; ensure clearDashboardCache() and the primary server mutation response happen on the successful path (not inside the catch) so cache invalidation and client success are preserved even if the audit write fails.server/telegram.ts (4)
360-365:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winHandle
apiServerKeydecryption failures explicitly.
decryptApiServerKey(record.apiServerKey)runs before the endpoint's main error path. If that ciphertext is corrupted or no longer decryptable, this throws out of the handler and skips the normal JSON error response shape. Wrap it and return a clear failure response here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/telegram.ts` around lines 360 - 365, The call to decryptApiServerKey(record.apiServerKey) can throw and currently escapes the handler; wrap that call in a try/catch within the same handler (before calling getProviderDeployConfig) and on failure return the same JSON error response shape used elsewhere (e.g., status and message fields) so corrupted/undecryptable apiServerKey yields a clear HTTP error instead of crashing the request; reference decryptApiServerKey, record.apiServerKey, and the surrounding handler so the catch returns the consistent error payload and stops further processing.
159-175:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't let an audit-log failure turn a successful disconnect into a 500.
Lines 160-175 flip
isActivebefore writing the audit row. IfauditLogsfails, the bot is already disconnected but the handler reports failure and skips the cache clear, which leaves the client and dashboard stale. Make the primary update authoritative and handle the audit insert separately.As per coding guidelines, "Sequential writes are fine when primary data → audit log sequences are purely historical and the audit log's absence doesn't affect correctness. If the audit insert fails, the primary operation is still valid — no data divergence."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/telegram.ts` around lines 159 - 175, Perform the primary update (the db.update(...).set({ isActive: false }).where(eq(telegramConfigs.userId, session.user.id)) call) as the authoritative operation and then run the audit insert in a separate try/catch so an audit failure cannot turn a successful disconnect into an error; specifically, keep the update as-is, then wrap the db.insert(auditLogs).values({... userId: session.user.id, action: "telegram.disconnected", details: { botUsername: record.botUsername }, ipAddress }) call in its own try/catch, log the error but do not rethrow, and ensure clearDashboardCache() is always called after the primary update regardless of audit insert outcome.
99-125:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMake the config swap atomic and treat the audit insert as best-effort.
Lines 100-125 can already save the new bot config and still return
500if the audit insert fails, and the preceding deactivate+insert sequence is not atomic. Put the primary config mutation indb.transaction()and keeptelegram.connectedfrom changing the outcome of an otherwise successful connect.As per coding guidelines, "Use
db.transaction()when a write path touches multiple Drizzle statements that must commit or roll back together to maintain consistency" and "If the audit insert fails, the primary operation is still valid — no data divergence."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/telegram.ts` around lines 99 - 125, Wrap the primary config mutation (the deactivate-then-insert sequence that calls db.update(...).set(...) on telegramConfigs and db.insert(telegramConfigs).values(...)) inside a db.transaction(...) so the deactivation and new config insert commit or roll back together; keep clearDashboardCache() tied to the successful transaction commit. Move the audit insert (db.insert(auditLogs).values(...)) outside the transaction and make it best-effort by surrounding it with its own try/catch so any failure only logs the error and does not throw or change the successful outcome returned to the caller.
262-300:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftThe post-SSH transaction does not keep local and remote deploy state consistent.
Lines 263-300 mutate Hermes first and persist
deployedServerId/apiServerKeyafterward. If the DB transaction fails after SSH succeeds, the server is already reconfigured with a key that was never saved locally, so the app reports a failed deploy but cannot talk to the deployed instance anymore. This needs a compensating rollback or a staged/persisted deploy record before the remote change.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/telegram.ts` around lines 262 - 300, Persist the intended deploy state before making the remote SSH change instead of applying remote changes first: create a staged/pending deploy record in the DB inside the existing transaction (e.g., add or set pendingServerId/pendingServerHost/pendingApiServerKey via the telegramConfigs row and insert the auditLogs entry) so you atomically record the planned key and target (use encryptSecret when storing the pending key), then call deployComposeViaSsh; on SSH success, run a second transaction to promote pendingServerId/pendingServerHost/pendingApiServerKey to deployedServerId/deployedServerHost/apiServerKey (and clear pending fields), and on SSH failure clear the pending fields or perform a compensating SSH rollback using the previous deployedServerHost/apiServerKey; update code around deployComposeViaSsh, db.transaction, telegramConfigs, auditLogs, encryptSecret, and the session.user.id usage to implement this staged-deploy flow.
🧹 Nitpick comments (1)
server/providers/connection.ts (1)
23-27: ⚡ Quick winAdd a timeout to prevent hung requests.
The
fetchcall has no timeout, so requests to slow or unresponsive endpoints can hang indefinitely. Consider usingAbortSignal.timeout()to bound the request duration.♻️ Proposed fix
try { - response = await fetch(request.url, request.init); + response = await fetch(request.url, { + ...request.init, + signal: AbortSignal.timeout(10_000), + }); } catch { throw new ProviderConnectionError("Connection failed", "connection_failed"); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/providers/connection.ts` around lines 23 - 27, The fetch call can hang because it has no timeout; wrap the request with an AbortSignal.timeout (or create an AbortController and Timer) and attach the resulting signal to request.init before calling fetch in the same block where fetch is invoked (replace the call in connection.ts that uses fetch(request.url, request.init)); choose a sensible timeout (e.g. 10000 ms), handle the abort case in the catch to throw a ProviderConnectionError with a distinct code like "connection_timeout" while keeping the existing "connection_failed" for non-timeout errors, and ensure any previously provided request.init.signal is respected/merged or overridden explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/dashboard/metrics.ts`:
- Around line 61-77: The cached branch currently sets updatedAt to the request
time `now`, which hides the real sample time; change the cached return in the
metrics routine to use the cached sample timestamp (use
`cachedMetrics.timestamp`) for `updatedAt` (formatted with `new
Date(...).toISOString()`), keep computing `status` with `getHealthTone(metrics)`
and all other fields as-is, and ensure `cachedMetrics.data` and
`cachedMetrics.timestamp` are used together so stale-but-cached metrics show
their original sample time rather than the current `now`.
In `@server/dashboard/summaries.ts`:
- Around line 50-53: The offline-fallback branch in toAgentSummary (and the
similar branch around lines 95-99) is synthesizing a new updatedAt on every
request; instead, stop generating a fresh timestamp and preserve the existing
updatedAt from installRecord (or leave it undefined/null) when serverRecord is
null. Locate the updatedAt assignment in toAgentSummary (and the analogous code
block later) and replace any fallback like Date.now() / new Date() with
installRecord?.updatedAt (or null) so we only reflect true server-side updates.
In `@server/deploy.ts`:
- Around line 159-192: The audit log inserts around deployComposeViaSsh() must
be best-effort and must not change the HTTP outcome: wrap the
db.insert(auditLogs).values(...) calls (both the success-path insert after
deployComposeViaSsh() and the failure-path insert inside the catch) in their own
try/catch so any insert error is caught and logged (e.g., via processLogger or
console) but does not alter the response returned by context.json; ensure the
main success return (status: "deployed") is returned even if the success audit
insert fails, and ensure when handling a deploy error you record the original
deploy error message to the response and still attempt the failed-audit insert
but suppress/log any errors from that insert so they never overwrite the
original error handling.
- Line 113: The call to decryptApiServerKey(telegramInfo.apiServerKey) can throw
outside the main deploy try block, so wrap this call in a try/catch (or move it
into the existing deploy try) and handle failures the same way other credential
errors are handled: catch the error thrown by decryptApiServerKey, log the
error, and return the JSON error response (with the same status code and shape
used elsewhere in this file) instead of allowing the exception to escape; update
references to decryptedApiServerKey only after successful decryption.
- Around line 148-149: The deploy flow in server/deploy.ts calls ssh.execCommand
to run `docker exec hermes hermes config set model ...` without sudo, causing
privilege mismatch versus the earlier `sudo docker compose ...`; update the
ssh.execCommand invocation that runs `docker exec hermes hermes config set model
${shellQuote(providerRecord.model)}` to be consistent with the other commands
(e.g., prefix with `sudo` like `sudo docker exec ...`) or remove/soften this
post-restart `config set model` step if it's redundant because the compose
restart already sets API_SERVER_MODEL_NAME; locate the call using the
`ssh.execCommand` invocation and adjust it to use `sudo` or eliminate the extra
command.
In `@server/install/workflow.ts`:
- Around line 122-160: The success audit-log write using
getDb().insert(auditLogs).values({...}) in workflow.ts must be made best-effort
so a failing audit insert cannot trigger the outer catch and flip a completed
install to "failed"; wrap that success insert in its own try/catch, log the
insert error (but do not rethrow) and continue, leaving the existing error path
(the outer catch that emits "failed" and writes the failure audit) unchanged.
In `@server/providers/records.ts`:
- Around line 65-75: The current return narrows record to include
deployedServerHost as a string but only checks apiServerKey and
deployedServerId; update the function to either (A) add a null/undefined check
for record.deployedServerHost and return null if it's missing so the cast to {
botToken, apiServerKey, deployedServerId, deployedServerHost: string } is safe,
or (B) change the asserted/written return type to allow deployedServerHost:
string | null (i.e., widen the return type) so callers handle a nullable
deployedServerHost; reference the existing record variable and the returned
fields apiServerKey, deployedServerId, and deployedServerHost when making the
change.
In `@server/server-actions.ts`:
- Around line 146-182: The transaction that writes auditLogs and updates
installs (db.transaction block) can fail after the remote command already
succeeded, but currently any error bubbles up and the outer catch records
server.action.${action}.failed; to fix, make the DB writes best-effort: wrap the
db.transaction(...) in its own try/catch inside executeServerAction (or the
surrounding function), catch any error from the transaction and log it (e.g.,
processLogger.warn or similar) including context (action, serverId,
versionTarget, error) but do not rethrow or return a failure response; keep the
auditLogs/installs logic (auditLogs, installs, actionSuccessMessage,
versionTarget) unchanged inside the transaction so it still runs when possible.
In `@server/servers/records.ts`:
- Around line 94-112: The current query applies a global .limit(100) before
collectLatestActions(), which can omit the newest row for some serverIds;
replace the global limit with a per-server newest-row filter by using a window
function or DISTINCT ON: build a subquery or use sql`ROW_NUMBER() OVER
(PARTITION BY (auditLogs.details->>'serverId') ORDER BY auditLogs.createdAt
DESC)` (or DISTINCT ON (auditLogs.details->>'serverId') with ORDER BY createdAt
DESC) to mark the latest row per serverId, then select only rows where that
row_number = 1; keep the same selected columns (auditLogs.action,
auditLogs.details, auditLogs.createdAt) and the same predicates
(eq(auditLogs.userId, userId), inArray(...), serverId IN ...), and remove the
global .limit(100); reference getDb(), auditLogs, and the surrounding query
block/collectLatestActions().
In `@server/ssh/os.ts`:
- Around line 34-39: The code currently only throws for missing or "unknown"
osId, allowing non-Linux IDs (e.g. "freebsd") to pass; update the check around
osId (the same block that throws UnsupportedOsError using prettyName) to reject
explicit non-Linux IDs by verifying osId is a Linux identifier (e.g., require
osId === "linux" or check against a small allowlist of known Linux distro IDs)
and throw UnsupportedOsError for any osId that is present but not a Linux value;
reference the osId, prettyName and UnsupportedOsError symbols when making the
change so the parser rejects non-Linux hosts before returning
VerifiedServerInfo.
In `@src/features/telegram/telegram-connect-section.tsx`:
- Around line 81-99: The disconnect flow currently only handles non-2xx
responses and ignores fetch rejections; wrap the await
fetch("/api/telegram/disconnect", ...) in a try/catch inside handleDisconnect
(or the enclosing async function) and in the catch call setError with a helpful
message (e.g., payload from the thrown error or a fallback like "Unable to
disconnect Telegram") so the user sees the failure; keep the existing finally
block that calls setIsDisconnecting(false) and preserve onDisconnect() and
setSuccessMessage("Telegram bot disconnected") on success.
- Around line 51-73: The fetch call in the connect handler can reject and
currently escapes the try/finally; wrap the await fetch("/api/telegram/connect",
...) (or the whole request+response.json sequence) with a .catch or an inner
try/catch so network failures are mapped to setError(...) before the finally;
specifically, catch errors from the fetch/response.json steps, call setError
with a user-friendly message (e.g. error.message or "Network error"), and return
early so onConfigChange, setValue, and setSuccessMessage are not invoked; keep
the existing finally that calls setIsConnecting(false).
In `@src/features/telegram/telegram-deploy-section.tsx`:
- Around line 29-55: The try/finally currently lets fetch() throw without
reporting an error to the user; wrap the await fetch and subsequent JSON parsing
in a try/catch (or add a catch block) so any thrown error sets setError(...)
with a clear message (include error.message) and returns before calling
onConfigChange and setSuccessMessage; ensure setIsDeploying(false) remains in
the finally. Update the block that uses fetch("/api/telegram/deploy") and the
JSON parsing so exceptions trigger setError instead of silently falling through.
In `@src/features/telegram/telegram-input-class.ts`:
- Around line 1-2: The class string exported as inputClassName currently uses
"outline-none" which removes native focus outlines even under forced-colors;
update the string to use "outline-hidden" instead of "outline-none" so the
custom focus ring (focus:ring-*) is preserved while allowing forced-colors focus
indication to remain; modify the inputClassName constant value to swap
"outline-none" for "outline-hidden".
In `@src/features/telegram/telegram-pairing-section.tsx`:
- Around line 56-76: The loadPairings() function currently only handles non-OK
HTTP responses but doesn't catch network/fetch rejections; wrap the fetch and
response.json() logic in a try/catch (or add a catch around the await fetch)
inside loadPairings() so any thrown errors are caught, call setError(...) with a
useful message (and include error.message if available), ensure
setIsLoadingPairings(false) runs in the finally block, and return early on
error; reference the existing loadPairings, setError, setIsLoadingPairings,
setPairings, and the fetch("/api/telegram/pairings")/response.json() calls when
applying the fix.
- Around line 90-113: The approve flow currently only handles non-2xx responses
but not network/fetch rejections; update handleApprovePairing to catch thrown
errors from fetch and call setError(...) with a helpful message (and optionally
the error.message) so rejected requests aren't silent. Specifically wrap the
await fetch/response.json logic in a try/catch (or add a catch after the
promise) and in the catch call setError(...) and return; keep the existing
finally block that calls setIsApprovingPairing(false) and ensure loadPairings,
setPairingCode, and setSuccessMessage are only called on successful approvals.
In `@src/features/telegram/telegram-test-section.tsx`:
- Around line 27-47: The try/finally around the fetch leaves network/fetch
rejections unreported; add a catch block around the fetch/response parsing to
call setTestError with the network error message (e.g., error?.message ||
String(error)) so broken tunnels/offline browsers surface an error to the user,
keep the existing response.ok handling and setTestResponse logic, and retain the
finally block that calls setIsTesting(false); update the block that performs
fetch(...) in telegram-test-section.tsx (the async block using setTestError,
setTestResponse, setIsTesting) to implement this catch.
---
Outside diff comments:
In `@server/servers.ts`:
- Around line 145-160: The audit-log insert (db.insert(auditLogs).values(...))
is making successful mutations into 500s when it fails; wrap the audit insert in
a try/catch so it becomes best-effort: in connectServer(), updateServer(), and
deleteServer() catch any errors from the audit write, log the error (don’t
rethrow) and continue; ensure clearDashboardCache() and the primary server
mutation response happen on the successful path (not inside the catch) so cache
invalidation and client success are preserved even if the audit write fails.
In `@server/telegram.ts`:
- Around line 360-365: The call to decryptApiServerKey(record.apiServerKey) can
throw and currently escapes the handler; wrap that call in a try/catch within
the same handler (before calling getProviderDeployConfig) and on failure return
the same JSON error response shape used elsewhere (e.g., status and message
fields) so corrupted/undecryptable apiServerKey yields a clear HTTP error
instead of crashing the request; reference decryptApiServerKey,
record.apiServerKey, and the surrounding handler so the catch returns the
consistent error payload and stops further processing.
- Around line 159-175: Perform the primary update (the db.update(...).set({
isActive: false }).where(eq(telegramConfigs.userId, session.user.id)) call) as
the authoritative operation and then run the audit insert in a separate
try/catch so an audit failure cannot turn a successful disconnect into an error;
specifically, keep the update as-is, then wrap the
db.insert(auditLogs).values({... userId: session.user.id, action:
"telegram.disconnected", details: { botUsername: record.botUsername }, ipAddress
}) call in its own try/catch, log the error but do not rethrow, and ensure
clearDashboardCache() is always called after the primary update regardless of
audit insert outcome.
- Around line 99-125: Wrap the primary config mutation (the
deactivate-then-insert sequence that calls db.update(...).set(...) on
telegramConfigs and db.insert(telegramConfigs).values(...)) inside a
db.transaction(...) so the deactivation and new config insert commit or roll
back together; keep clearDashboardCache() tied to the successful transaction
commit. Move the audit insert (db.insert(auditLogs).values(...)) outside the
transaction and make it best-effort by surrounding it with its own try/catch so
any failure only logs the error and does not throw or change the successful
outcome returned to the caller.
- Around line 262-300: Persist the intended deploy state before making the
remote SSH change instead of applying remote changes first: create a
staged/pending deploy record in the DB inside the existing transaction (e.g.,
add or set pendingServerId/pendingServerHost/pendingApiServerKey via the
telegramConfigs row and insert the auditLogs entry) so you atomically record the
planned key and target (use encryptSecret when storing the pending key), then
call deployComposeViaSsh; on SSH success, run a second transaction to promote
pendingServerId/pendingServerHost/pendingApiServerKey to
deployedServerId/deployedServerHost/apiServerKey (and clear pending fields), and
on SSH failure clear the pending fields or perform a compensating SSH rollback
using the previous deployedServerHost/apiServerKey; update code around
deployComposeViaSsh, db.transaction, telegramConfigs, auditLogs, encryptSecret,
and the session.user.id usage to implement this staged-deploy flow.
---
Nitpick comments:
In `@server/providers/connection.ts`:
- Around line 23-27: The fetch call can hang because it has no timeout; wrap the
request with an AbortSignal.timeout (or create an AbortController and Timer) and
attach the resulting signal to request.init before calling fetch in the same
block where fetch is invoked (replace the call in connection.ts that uses
fetch(request.url, request.init)); choose a sensible timeout (e.g. 10000 ms),
handle the abort case in the catch to throw a ProviderConnectionError with a
distinct code like "connection_timeout" while keeping the existing
"connection_failed" for non-timeout errors, and ensure any previously provided
request.init.signal is respected/merged or overridden explicitly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c57a8c05-2c09-4f45-a4ac-1dec0983f31d
📒 Files selected for processing (50)
AGENTS.mdCONTEXT.mdREADME.mddocs/adr/0009-single-instance-boundary-for-operational-state.mddocs/adr/0010-hermes-runtime-management-from-telegram-page.mddocs/api-reference.mdserver/app.tsserver/compose.tsserver/crypto.tsserver/dashboard.tsserver/dashboard/metrics.tsserver/dashboard/records.tsserver/dashboard/summaries.tsserver/deploy.tsserver/install.tsserver/install/records.tsserver/install/workflow.tsserver/lib/get-last-4.tsserver/providers.test.tsserver/providers.tsserver/providers/config.tsserver/providers/connection.tsserver/providers/records.tsserver/server-actions.test.tsserver/server-actions.tsserver/server-records.tsserver/servers.tsserver/servers/list.tsserver/servers/records.tsserver/ssh.test.tsserver/ssh.tsserver/ssh/connection.tsserver/ssh/errors.tsserver/ssh/os.tsserver/ssh/quoting.tsserver/telegram.test.tsserver/telegram.tsserver/telegram/config.tsserver/telegram/pairings.tsserver/telegram/records.tssrc/features/telegram/telegram-connect-section.tsxsrc/features/telegram/telegram-deploy-section.tsxsrc/features/telegram/telegram-input-class.tssrc/features/telegram/telegram-pairing-section.tsxsrc/features/telegram/telegram-settings.test.tsxsrc/features/telegram/telegram-settings.tsxsrc/features/telegram/telegram-sidebar.tsxsrc/features/telegram/telegram-test-section.tsxsrc/lib/ai-providers.test.tssrc/lib/ai-providers.ts
✅ Files skipped from review due to trivial changes (7)
- server/lib/get-last-4.ts
- docs/adr/0009-single-instance-boundary-for-operational-state.md
- CONTEXT.md
- AGENTS.md
- server/ssh/errors.ts
- docs/adr/0010-hermes-runtime-management-from-telegram-page.md
- README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- server/app.ts
| if (cachedMetrics && now - cachedMetrics.timestamp < METRICS_CACHE_TTL_MS) { | ||
| const metrics = cachedMetrics.data; | ||
| const status = getHealthTone(metrics); | ||
|
|
||
| return { | ||
| status, | ||
| updatedAt: new Date(now).toISOString(), | ||
| cpu: metrics.cpu, | ||
| memory: metrics.memory, | ||
| disk: metrics.disk, | ||
| uptime: metrics.uptime, | ||
| detail: | ||
| status === "warning" | ||
| ? "One or more VPS resources are running hot." | ||
| : "The connected VPS is responding to live health checks.", | ||
| error: null, | ||
| }; |
There was a problem hiding this comment.
Preserve the real sample time on cached metrics.
Using now here makes stale-but-cached metrics look freshly collected on every request. The dashboard should report when the cached sample was actually read from the VPS.
🩹 Suggested fix
return {
status,
- updatedAt: new Date(now).toISOString(),
+ updatedAt: new Date(cachedMetrics.timestamp).toISOString(),
cpu: metrics.cpu,
memory: metrics.memory,
disk: metrics.disk,📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (cachedMetrics && now - cachedMetrics.timestamp < METRICS_CACHE_TTL_MS) { | |
| const metrics = cachedMetrics.data; | |
| const status = getHealthTone(metrics); | |
| return { | |
| status, | |
| updatedAt: new Date(now).toISOString(), | |
| cpu: metrics.cpu, | |
| memory: metrics.memory, | |
| disk: metrics.disk, | |
| uptime: metrics.uptime, | |
| detail: | |
| status === "warning" | |
| ? "One or more VPS resources are running hot." | |
| : "The connected VPS is responding to live health checks.", | |
| error: null, | |
| }; | |
| if (cachedMetrics && now - cachedMetrics.timestamp < METRICS_CACHE_TTL_MS) { | |
| const metrics = cachedMetrics.data; | |
| const status = getHealthTone(metrics); | |
| return { | |
| status, | |
| updatedAt: new Date(cachedMetrics.timestamp).toISOString(), | |
| cpu: metrics.cpu, | |
| memory: metrics.memory, | |
| disk: metrics.disk, | |
| uptime: metrics.uptime, | |
| detail: | |
| status === "warning" | |
| ? "One or more VPS resources are running hot." | |
| : "The connected VPS is responding to live health checks.", | |
| error: null, | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/dashboard/metrics.ts` around lines 61 - 77, The cached branch
currently sets updatedAt to the request time `now`, which hides the real sample
time; change the cached return in the metrics routine to use the cached sample
timestamp (use `cachedMetrics.timestamp`) for `updatedAt` (formatted with `new
Date(...).toISOString()`), keep computing `status` with `getHealthTone(metrics)`
and all other fields as-is, and ensure `cachedMetrics.data` and
`cachedMetrics.timestamp` are used together so stale-but-cached metrics show
their original sample time rather than the current `now`.
| export function toAgentSummary( | ||
| serverRecord: Pick<ServerRecord, "status"> | null, | ||
| installRecord: InstallRecord | null, | ||
| ): DashboardAgentSummary { |
There was a problem hiding this comment.
Don't synthesize a fresh updatedAt for the offline fallback.
This branch changes updatedAt on every request even when nothing changed server-side. That makes the agent card look recently updated forever.
🩹 Suggested fix
export function toAgentSummary(
- serverRecord: Pick<ServerRecord, "status"> | null,
+ serverRecord: Pick<ServerRecord, "status" | "updatedAt"> | null,
installRecord: InstallRecord | null,
): DashboardAgentSummary {
@@
return {
status: "offline",
- updatedAt: serverRecord.status ? new Date().toISOString() : null,
+ updatedAt: serverRecord.updatedAt.toISOString(),
detail: "The VPS is connected, but Hermes has not finished installing yet.",
};
}Also applies to: 95-99
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/dashboard/summaries.ts` around lines 50 - 53, The offline-fallback
branch in toAgentSummary (and the similar branch around lines 95-99) is
synthesizing a new updatedAt on every request; instead, stop generating a fresh
timestamp and preserve the existing updatedAt from installRecord (or leave it
undefined/null) when serverRecord is null. Locate the updatedAt assignment in
toAgentSummary (and the analogous code block later) and replace any fallback
like Date.now() / new Date() with installRecord?.updatedAt (or null) so we only
reflect true server-side updates.
| return context.json({ error: "API key is required." }, 500); | ||
| } | ||
|
|
||
| const decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey); |
There was a problem hiding this comment.
Catch decryptApiServerKey() failures here.
Unlike the bot token and provider key paths, this decryption can throw before the main deploy try, so a bad apiServerKey escapes the handler without the normal JSON error response.
🛡️ Proposed fix
- const decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey);
+ let decryptedApiServerKey: string;
+ try {
+ decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey);
+ } catch {
+ return context.json({ error: "Failed to decrypt API server key." }, 500);
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey); | |
| let decryptedApiServerKey: string; | |
| try { | |
| decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey); | |
| } catch { | |
| return context.json({ error: "Failed to decrypt API server key." }, 500); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/deploy.ts` at line 113, The call to
decryptApiServerKey(telegramInfo.apiServerKey) can throw outside the main deploy
try block, so wrap this call in a try/catch (or move it into the existing deploy
try) and handle failures the same way other credential errors are handled: catch
the error thrown by decryptApiServerKey, log the error, and return the JSON
error response (with the same status code and shape used elsewhere in this file)
instead of allowing the exception to escape; update references to
decryptedApiServerKey only after successful decryption.
| const configResult = await ssh.execCommand( | ||
| `docker exec hermes hermes config set model ${shellQuote(providerRecord.model)}`, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '--- surrounding deploy commands ---\n'
sed -n '144,152p' server/deploy.ts | cat -n
printf '\n--- docker command usage under server/ ---\n'
rg -n --type=ts 'sudo docker|docker exec|docker compose' serverRepository: jellydn/hermes-hub
Length of output: 158
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- surrounding deploy commands ---'
sed -n '130,175p' server/deploy.ts | cat -n
echo
echo '--- docker command usage under server/ ---'
rg -n --type=ts -S 'sudo docker exec|sudo docker compose|docker exec|docker compose' serverRepository: jellydn/hermes-hub
Length of output: 2922
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- deployComposeViaSsh definition ---'
rg -n --type=ts 'function deployComposeViaSsh|deployComposeViaSsh' server/deploy.ts server -SRepository: jellydn/hermes-hub
Length of output: 508
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- deployComposeViaSsh definition/usages ---'
rg -n --type=ts 'deployComposeViaSsh' server -SRepository: jellydn/hermes-hub
Length of output: 373
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- server/deploy.ts: deployComposeViaSsh body ---'
sed -n '1,120p' server/deploy.ts | cat -n
echo
echo '--- server/deploy.ts: around extraSshCommands usage ---'
sed -n '120,190p' server/deploy.ts | cat -n
echo
echo '--- any sudo docker exec usage under server/ ---'
rg -n --type=ts 'sudo docker exec' serverRepository: jellydn/hermes-hub
Length of output: 6872
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- locate ssh helpers ---'
rg -n --type=ts 'withSshConnection|execCommand|shellQuote|docker group|usermod -aG docker|sudo ' server/ssh.ts server -S
echo
echo '--- server/ssh.ts (head) ---'
sed -n '1,220p' server/ssh.ts | cat -nRepository: jellydn/hermes-hub
Length of output: 9906
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- server/ssh/connection.ts ---'
sed -n '1,200p' server/ssh/connection.ts | cat -n
echo
echo '--- any mention of sudo in server/ssh/connection.ts ---'
rg -n --type=ts 'sudo ' server/ssh/connection.tsRepository: jellydn/hermes-hub
Length of output: 2282
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- find buildHermesComposeContent definition ---'
rg -n --type=ts 'function buildHermesComposeContent|buildHermesComposeContent' server/compose.ts server -SRepository: jellydn/hermes-hub
Length of output: 1373
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- server/compose.ts ---'
sed -n '1,220p' server/compose.ts | cat -nRepository: jellydn/hermes-hub
Length of output: 1925
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- search for API_SERVER_MODEL_NAME usage in repo ---'
rg -n --type=ts 'API_SERVER_MODEL_NAME|config set model|hermes config set model' server src -SRepository: jellydn/hermes-hub
Length of output: 415
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo '--- server/providers.test.ts around sudo docker compose and docker exec expectations ---'
sed -n '540,640p' server/providers.test.ts | cat -nRepository: jellydn/hermes-hub
Length of output: 3699
Use sudo (or ensure Docker access) for docker exec in the deploy SSH flow
server/deploy.tsrestarts Hermes withsudo docker compose ...but then runsdocker exec hermes hermes config set model ...withoutsudo.- The compose file already sets
API_SERVER_MODEL_NAMEon container recreate, so a privilege-mismatch failure here can make the deploy endpoint throw and logprovider.deploy.failedeven though Hermes likely already restarted with the correct model. - Fix: make the
docker execconsistent with thesudo docker composeapproach (e.g.,sudo docker exec ...) or remove/soften this extraconfig set modelstep if it’s redundant.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/deploy.ts` around lines 148 - 149, The deploy flow in server/deploy.ts
calls ssh.execCommand to run `docker exec hermes hermes config set model ...`
without sudo, causing privilege mismatch versus the earlier `sudo docker compose
...`; update the ssh.execCommand invocation that runs `docker exec hermes hermes
config set model ${shellQuote(providerRecord.model)}` to be consistent with the
other commands (e.g., prefix with `sudo` like `sudo docker exec ...`) or
remove/soften this post-restart `config set model` step if it's redundant
because the compose restart already sets API_SERVER_MODEL_NAME; locate the call
using the `ssh.execCommand` invocation and adjust it to use `sudo` or eliminate
the extra command.
| await db.insert(auditLogs).values({ | ||
| userId: session.user.id, | ||
| action: "provider.deploy.succeeded", | ||
| details: { | ||
| provider: providerRecord.provider, | ||
| model: providerRecord.model, | ||
| serverId: serverRecord.id, | ||
| serverHost: serverRecord.host, | ||
| }, | ||
| ipAddress, | ||
| }); | ||
|
|
||
| return context.json({ | ||
| status: "deployed", | ||
| provider: providerRecord.provider, | ||
| model: providerRecord.model, | ||
| serverHost: serverRecord.host, | ||
| }); | ||
| } catch (error) { | ||
| const message = error instanceof Error ? error.message : "Deploy failed"; | ||
|
|
||
| await db.insert(auditLogs).values({ | ||
| userId: session.user.id, | ||
| action: "provider.deploy.failed", | ||
| details: { | ||
| provider: providerRecord.provider, | ||
| model: providerRecord.model, | ||
| serverId: serverRecord.id, | ||
| error: message, | ||
| }, | ||
| ipAddress, | ||
| }); | ||
|
|
||
| return context.json({ error: `Deploy failed: ${message}` }, 502); |
There was a problem hiding this comment.
Don't let audit inserts change the deploy outcome.
After deployComposeViaSsh() succeeds, the primary operation is already done. If either auditLogs insert throws, this handler returns 502 even though Hermes was updated, and the failure-log insert can also mask the original deploy error. Keep audit persistence best-effort and independent from the HTTP result.
As per coding guidelines, "Sequential writes are fine when primary data → audit log sequences are purely historical and the audit log's absence doesn't affect correctness. If the audit insert fails, the primary operation is still valid — no data divergence."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/deploy.ts` around lines 159 - 192, The audit log inserts around
deployComposeViaSsh() must be best-effort and must not change the HTTP outcome:
wrap the db.insert(auditLogs).values(...) calls (both the success-path insert
after deployComposeViaSsh() and the failure-path insert inside the catch) in
their own try/catch so any insert error is caught and logged (e.g., via
processLogger or console) but does not alter the response returned by
context.json; ensure the main success return (status: "deployed") is returned
even if the success audit insert fails, and ensure when handling a deploy error
you record the original deploy error message to the response and still attempt
the failed-audit insert but suppress/log any errors from that insert so they
never overwrite the original error handling.
| try { | ||
| const response = await fetch("/api/telegram/deploy", { | ||
| method: "POST", | ||
| }); | ||
|
|
||
| const payload = (await response.json().catch(() => null)) as { | ||
| error?: string; | ||
| status?: string; | ||
| serverHost?: string; | ||
| } | null; | ||
|
|
||
| if (!response.ok) { | ||
| setError(payload?.error ?? "Deploy failed"); | ||
| return; | ||
| } | ||
|
|
||
| const serverHost = payload?.serverHost ?? null; | ||
| onConfigChange({ | ||
| ...savedConfig, | ||
| deployedServerHost: serverHost, | ||
| }); | ||
| setSuccessMessage( | ||
| `Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`, | ||
| ); | ||
| } finally { | ||
| setIsDeploying(false); | ||
| } |
There was a problem hiding this comment.
Handle rejected deploy requests explicitly.
If fetch() throws here, the spinner clears but the user never gets an error message. This makes transient network failures look like a no-op.
🩹 Suggested fix
try {
const response = await fetch("/api/telegram/deploy", {
method: "POST",
});
@@
setSuccessMessage(
`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
);
+ } catch (error) {
+ setError(error instanceof Error ? error.message : "Deploy failed");
} finally {
setIsDeploying(false);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try { | |
| const response = await fetch("/api/telegram/deploy", { | |
| method: "POST", | |
| }); | |
| const payload = (await response.json().catch(() => null)) as { | |
| error?: string; | |
| status?: string; | |
| serverHost?: string; | |
| } | null; | |
| if (!response.ok) { | |
| setError(payload?.error ?? "Deploy failed"); | |
| return; | |
| } | |
| const serverHost = payload?.serverHost ?? null; | |
| onConfigChange({ | |
| ...savedConfig, | |
| deployedServerHost: serverHost, | |
| }); | |
| setSuccessMessage( | |
| `Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`, | |
| ); | |
| } finally { | |
| setIsDeploying(false); | |
| } | |
| try { | |
| const response = await fetch("/api/telegram/deploy", { | |
| method: "POST", | |
| }); | |
| const payload = (await response.json().catch(() => null)) as { | |
| error?: string; | |
| status?: string; | |
| serverHost?: string; | |
| } | null; | |
| if (!response.ok) { | |
| setError(payload?.error ?? "Deploy failed"); | |
| return; | |
| } | |
| const serverHost = payload?.serverHost ?? null; | |
| onConfigChange({ | |
| ...savedConfig, | |
| deployedServerHost: serverHost, | |
| }); | |
| setSuccessMessage( | |
| `Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`, | |
| ); | |
| } catch (error) { | |
| setError(error instanceof Error ? error.message : "Deploy failed"); | |
| } finally { | |
| setIsDeploying(false); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/features/telegram/telegram-deploy-section.tsx` around lines 29 - 55, The
try/finally currently lets fetch() throw without reporting an error to the user;
wrap the await fetch and subsequent JSON parsing in a try/catch (or add a catch
block) so any thrown error sets setError(...) with a clear message (include
error.message) and returns before calling onConfigChange and setSuccessMessage;
ensure setIsDeploying(false) remains in the finally. Update the block that uses
fetch("/api/telegram/deploy") and the JSON parsing so exceptions trigger
setError instead of silently falling through.
| export const inputClassName = | ||
| "w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-none focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]"; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find Tailwind version and relevant config
rg -n --hidden --no-ignore -S '"tailwindcss"|tailwindcss' package.json pnpm-lock.yaml yarn.lock package-lock.json 2>/dev/null || true
ls -la
# Show tailwind config files if present
fd tailwind.config.* -t f . || true
fd postcss.config.* -t f . || true
# Inspect the target file around the class definition
fd telegram-input-class.ts -t f src || true
if [ -f src/features/telegram/telegram-input-class.ts ]; then
sed -n '1,120p' src/features/telegram/telegram-input-class.ts
fi
# Find other outline usage in the repo to gauge conventions
rg -n --hidden -S 'outline-none|outline-hidden|focus:ring' src || trueRepository: jellydn/hermes-hub
Length of output: 5571
🌐 Web query:
Tailwind CSS v4 outline-hidden outline-none focus ring double outline
💡 Result:
In Tailwind CSS v4, the utilities for managing outlines and focus indicators have been refined to clarify their behavior and improve accessibility [1][2]. Key Changes and Utilities - outline-none: In v4, this utility simply sets outline-style: none [3][1]. This is a departure from v3, where it was a more complex class that also set a 2px transparent outline for accessibility [1][2]. - outline-hidden: This is the v4 replacement for the "old" v3 behavior of outline-none [1][2]. It hides the default browser outline while preserving the outline in forced colors mode (e.g., Windows High Contrast Mode), making it the recommended choice for removing focus rings without sacrificing accessibility [3][2][4]. - outline-double: This is a standard utility class that sets the CSS outline-style to double [3]. Focus Rings and Accessibility Tailwind CSS provides two primary ways to create focus indicators: 1. Outline Utilities (e.g., outline-2, outline-blue-500, outline-offset-2): These use the native CSS outline property [3][5][6]. They are generally preferred for accessibility because they automatically respect forced-colors modes and user-defined system styles [5]. 2. Ring Utilities (e.g., ring-2, ring-offset-2): These are implemented using CSS box-shadows [7][5][6]. While highly customizable (supporting opacity, complex offsets, and inset shadows), they do not appear in forced-colors modes because box-shadows are typically ignored in those environments [5][2]. Recommendation For focus indicators, the official recommendation is to use the outline utilities if accessibility in forced-colors mode is a priority [5]. If you need to remove the default browser outline, use outline-hidden [2]. If you require a custom, highly styled focus indicator, the ring utilities are appropriate, provided you are aware of their limitations regarding forced-colors modes [5][2].
Citations:
- 1: [v4] :focus-visible no longer overrides default outline for some reason tailwindlabs/tailwindcss#15152
- 2: Replace
outline-nonewithoutline-hidden, add newoutline-nonetailwindlabs/tailwindcss#14926 - 3: https://tailwindcss.com/docs/outline-style
- 4:
outline-hiddenonly applies transparent outline whenforced-colorsis active? tailwindlabs/tailwindcss#16926 - 5: https://github.com/stevekinney/stevekinney.net/blob/main/courses/tailwind/border-outline-ring.md
- 6: https://stackoverflow.com/questions/75649221/whats-the-difference-between-outline-and-ring-in-tailwind
- 7: https://medium.com/@nagarajvela/tailwind-v4-rounded-corners-the-complete-utility-guide-5fcc9d761808
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find Tailwind version and relevant config
rg -n --hidden --no-ignore -S '"tailwindcss"|tailwindcss' package.json pnpm-lock.yaml yarn.lock package-lock.json 2>/dev/null || true
ls -la
# Show tailwind config files if present
fd tailwind.config.* -t f . || true
fd postcss.config.* -t f . || true
# Inspect the target file around the class definition
fd telegram-input-class.ts -t f src || true
if [ -f src/features/telegram/telegram-input-class.ts ]; then
sed -n '1,120p' src/features/telegram/telegram-input-class.ts
fi
# Find other outline usage in the repo to gauge conventions
rg -n --hidden -S 'outline-none|outline-hidden|focus:ring' src || trueRepository: jellydn/hermes-hub
Length of output: 5571
Use outline-hidden instead of outline-none for the custom focus ring
In Tailwind v4, outline-none sets outline-style: none and can hide the native outline even in forced-colors; outline-hidden hides the default outline while preserving forced-colors focus indication alongside your focus:ring-*.
🩹 Suggested fix
export const inputClassName =
- "w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-none focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]";
+ "w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-hidden focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]";📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export const inputClassName = | |
| "w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-none focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]"; | |
| export const inputClassName = | |
| "w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-hidden focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]"; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/features/telegram/telegram-input-class.ts` around lines 1 - 2, The class
string exported as inputClassName currently uses "outline-none" which removes
native focus outlines even under forced-colors; update the string to use
"outline-hidden" instead of "outline-none" so the custom focus ring
(focus:ring-*) is preserved while allowing forced-colors focus indication to
remain; modify the inputClassName constant value to swap "outline-none" for
"outline-hidden".
| try { | ||
| const response = await fetch("/api/telegram/pairings"); | ||
| const payload = (await response.json().catch(() => null)) as { | ||
| error?: string; | ||
| pairings?: TelegramPairingSummary; | ||
| } | null; | ||
|
|
||
| if (!response.ok || !payload?.pairings) { | ||
| if (!quiet) { | ||
| setError(payload?.error ?? "Unable to load pairings"); | ||
| } | ||
| return; | ||
| } | ||
|
|
||
| setPairings(payload.pairings); | ||
| setLastLoadedAt(new Date()); | ||
| } finally { | ||
| if (!quiet) { | ||
| setIsLoadingPairings(false); | ||
| } | ||
| } |
There was a problem hiding this comment.
Catch polling and manual-refresh fetch failures.
loadPairings() only handles HTTP responses. If fetch("/api/telegram/pairings") rejects, that bubbles out of both the refresh button path and the interval callback, so users get no feedback and the poller throws unhandled rejections.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/features/telegram/telegram-pairing-section.tsx` around lines 56 - 76, The
loadPairings() function currently only handles non-OK HTTP responses but doesn't
catch network/fetch rejections; wrap the fetch and response.json() logic in a
try/catch (or add a catch around the await fetch) inside loadPairings() so any
thrown errors are caught, call setError(...) with a useful message (and include
error.message if available), ensure setIsLoadingPairings(false) runs in the
finally block, and return early on error; reference the existing loadPairings,
setError, setIsLoadingPairings, setPairings, and the
fetch("/api/telegram/pairings")/response.json() calls when applying the fix.
| try { | ||
| const response = await fetch("/api/telegram/pairings/approve", { | ||
| method: "POST", | ||
| headers: { "content-type": "application/json" }, | ||
| body: JSON.stringify({ code }), | ||
| }); | ||
|
|
||
| const payload = (await response.json().catch(() => null)) as { | ||
| error?: string; | ||
| approved?: { userId: string; userName: string }; | ||
| } | null; | ||
|
|
||
| if (!response.ok || !payload?.approved) { | ||
| setError(payload?.error ?? "Unable to approve pairing"); | ||
| return; | ||
| } | ||
|
|
||
| const displayName = payload.approved.userName || payload.approved.userId; | ||
| setPairingCode(""); | ||
| setSuccessMessage(`Approved ${displayName || "Telegram user"}.`); | ||
| void loadPairings({ quiet: true }); | ||
| } finally { | ||
| setIsApprovingPairing(false); | ||
| } |
There was a problem hiding this comment.
Catch rejected approve requests.
handleApprovePairing() has the same gap: rejected fetch("/api/telegram/pairings/approve") calls never hit setError(...), so approval failures can look like no-op network glitches. Handle rejected requests the same way as non-2xx responses.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/features/telegram/telegram-pairing-section.tsx` around lines 90 - 113,
The approve flow currently only handles non-2xx responses but not network/fetch
rejections; update handleApprovePairing to catch thrown errors from fetch and
call setError(...) with a helpful message (and optionally the error.message) so
rejected requests aren't silent. Specifically wrap the await fetch/response.json
logic in a try/catch (or add a catch after the promise) and in the catch call
setError(...) and return; keep the existing finally block that calls
setIsApprovingPairing(false) and ensure loadPairings, setPairingCode, and
setSuccessMessage are only called on successful approvals.
| try { | ||
| const response = await fetch("/api/telegram/test", { | ||
| method: "POST", | ||
| headers: { "content-type": "application/json" }, | ||
| body: JSON.stringify({ message: testMessage }), | ||
| }); | ||
|
|
||
| const payload = (await response.json().catch(() => null)) as { | ||
| error?: string; | ||
| response?: string; | ||
| } | null; | ||
|
|
||
| if (!response.ok) { | ||
| setTestError(payload?.error ?? "Test failed"); | ||
| return; | ||
| } | ||
|
|
||
| setTestResponse(payload?.response ?? "(empty response)"); | ||
| } finally { | ||
| setIsTesting(false); | ||
| } |
There was a problem hiding this comment.
Surface network failures in the test flow.
fetch() rejections skip testError, so a broken tunnel or offline browser just clears the loading state without telling the user what happened.
🩹 Suggested fix
try {
const response = await fetch("/api/telegram/test", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ message: testMessage }),
});
@@
setTestResponse(payload?.response ?? "(empty response)");
+ } catch (error) {
+ setTestError(error instanceof Error ? error.message : "Test failed");
} finally {
setIsTesting(false);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try { | |
| const response = await fetch("/api/telegram/test", { | |
| method: "POST", | |
| headers: { "content-type": "application/json" }, | |
| body: JSON.stringify({ message: testMessage }), | |
| }); | |
| const payload = (await response.json().catch(() => null)) as { | |
| error?: string; | |
| response?: string; | |
| } | null; | |
| if (!response.ok) { | |
| setTestError(payload?.error ?? "Test failed"); | |
| return; | |
| } | |
| setTestResponse(payload?.response ?? "(empty response)"); | |
| } finally { | |
| setIsTesting(false); | |
| } | |
| try { | |
| const response = await fetch("/api/telegram/test", { | |
| method: "POST", | |
| headers: { "content-type": "application/json" }, | |
| body: JSON.stringify({ message: testMessage }), | |
| }); | |
| const payload = (await response.json().catch(() => null)) as { | |
| error?: string; | |
| response?: string; | |
| } | null; | |
| if (!response.ok) { | |
| setTestError(payload?.error ?? "Test failed"); | |
| return; | |
| } | |
| setTestResponse(payload?.response ?? "(empty response)"); | |
| } catch (error) { | |
| setTestError(error instanceof Error ? error.message : "Test failed"); | |
| } finally { | |
| setIsTesting(false); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/features/telegram/telegram-test-section.tsx` around lines 27 - 47, The
try/finally around the fetch leaves network/fetch rejections unreported; add a
catch block around the fetch/response parsing to call setTestError with the
network error message (e.g., error?.message || String(error)) so broken
tunnels/offline browsers surface an error to the user, keep the existing
response.ok handling and setTestResponse logic, and retain the finally block
that calls setIsTesting(false); update the block that performs fetch(...) in
telegram-test-section.tsx (the async block using setTestError, setTestResponse,
setIsTesting) to implement this catch.
Older approval runs left pairing files owned by root inside the Hermes container, making them unreadable by Telegram polling which runs as the hermes user. Add a chown repair step before the Python pairing command and run the pairing command with --user hermes to ensure approved users stay readable.
…ship and ensure accessibility for Telegram gateway
…onnect Invert the requireHostKeyPin flag to allowMissingHostKeyPin. The secure behavior (fail closed when no fingerprint is stored) is now the default. Only verifyServerConnection (first-connect) opts out by passing allowMissingHostKeyPin: true. This eliminates requireHostKeyPin: true from 15 call sites and removes the field from DeployComposeInput, ManagedComposeDeployInput, and SshConnectionConfig intermediate types. A future caller that forgets the flag is safe by default. Also extracts parseOptionId helper and generic keepLatestBy to eliminate copy-paste. Uses composition for DeployComposeInput (extends SshConnectionInput instead of duplicating its fields). Review fixes #1, #2, #4, #5.
* feat: add Telegram model/provider switch endpoint
Add POST /api/telegram/model-switch endpoint that allows switching the
AI model and/or provider on a deployed Hermes server via SSH.
What:
- New switchModelProvider handler in server/telegram.ts
- Accepts { model?, provider? } in request body
- Validates provider against known ApiProviderId values
- Validates model string format and provider-model compatibility
- Uses SSH to execute hermes config set commands on the remote server
- Writes audit logs for success and failure cases
Why:
- Closes #34
- Users currently need to use the web UI to change model/provider
- This enables programmatic switching (e.g., from Telegram bot commands)
How:
- Reuses existing setProviderModel/setProviderInferenceProvider from runtime.ts
- Reuses PROVIDER_ENV_CONFIGS for provider-to-hermes-provider mapping
- Follows same auth + SSH + audit log pattern as other Telegram endpoints
- 8 new tests covering auth, validation, SSH execution, and error handling
* fix: address thermo-nuclear review findings for model-switch endpoint
Fixes:
- CRITICAL: Silent no-op when provider mapping missing — now validates
hermesProvider exists and returns 400 if not
- CRITICAL: Unsafe provider as ApiProviderId cast — now uses typed
validatedProvider variable throughout
- MAJOR: Extract resolveTelegramSshContext helper to eliminate 30-line
SSH boilerplate duplication
- MAJOR: Trim model before validation (not after) to handle whitespace
* fix: biome formatting in server/telegram.ts
* fix: biome formatting in renovate.json and server/telegram.test.ts
* fix: TS errors in switchModelProvider — SshAuthMethod type and status code cast
* fix: biome formatting — import order and multi-line context.json call
* fix: add switchModelProvider to telegram mock in app.test.ts
* chore: trigger CI
* fix: address PR review feedback for switchModelProvider
* feat(telegram): rewrite compose and restart container on model/provider switch
- Extract getModelValidationError helper for cleaner validation
- Validate model against subscription backends in addition to API providers
- Rewrite managed compose file and restart hermes container after SSH deploy
so new provider env vars take effect without manual intervention
- Use cached pre-switch backend for DB persistence instead of re-fetching,
ensuring DB is never updated if remote configuration fails
- Split subscription model persistence into credential vs oauth cases
* feat(telegram): add model-access-options API and rewrite model-switch to use optionId
- Add GET /api/telegram/model-access-options that returns the latest saved
deployable option per backend (API providers, credential subscriptions,
OAuth subscriptions) with opaque optionId: api-provider:<uuid>,
credential-subscription:<uuid>, oauth-subscription:<uuid>
- Rewrite POST /api/telegram/model-switch to accept { optionId, model }
payload, resolve the saved row, validate model against the backend, deploy
via SSH compose, and activate the selected row while deactivating others
in a DB transaction
- Add shared contracts for ModelAccessOption, ModelAccessOptionsResponse,
and ModelSwitchPayload
- Add query helpers for listing and resolving saved deployable options
- Update backend tests for the new option-based switch API
Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
* feat(telegram): add model-access switcher to Telegram page UI
- Add TelegramModelAccessSection component that fetches model access options,
shows active provider/model, lets users pick a saved backend and model
(fixed dropdown or custom input), and switches via the API
- Wire section into TelegramSettings, shown only when Telegram is deployed
- Empty state links users to /ai-provider to save a config first
- Update frontend tests to provide model-access-options fetch mock
Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
* refactor(telegram): fix model-access queries, type safety, and extract model-switch service
- Fix ResolvedOption.provider type lie: change from ApiProviderId to string,
remove unsafe as unknown as ApiProviderId casts
- Replace N+1 per-type DB queries with batch inArray queries in
getModelAccessOptions (4+ queries → 3 total)
- Remove isDeployable pre-filter; fold decrypt into builders via
decryptAndGetLast4 helper to avoid double decryption
- Change findActiveOptionIds to return typed { providerIds, subscriptionIds }
so deactivation uses single UPDATE WHERE id IN (...) per table
- Extract switchModelProvider orchestration (SSH + compose + DB transaction)
into server/telegram/model-switch.ts; handler is now a thin wrapper
- Refactor testTelegramBot to reuse resolveTelegramSshContext helper
- Replace dynamic import("../providers/config") with static import
- Remove unused ModelSwitchPayload export from shared contract
* refactor(telegram): replace useState with useReducer and fix react-doctor warnings
Convert TelegramModelAccessSection from 6 useState calls to a single
useReducer, which batches related state updates into one React render
instead of 6. The reducer groups form state (options, selection) and
UI state (loading, switching, messages) into typed actions.
Also switches from useEffect to useMountEffect for the initial data
fetch, matching the established pattern in telegram-pairing-section
and other components. This eliminates the react-doctor/no-event-handler
false positive — isDeployed is set by the parent based on external
server state, not a user event.
React Doctor score: 79 → 100/100.
* test(telegram): update switchModelProvider tests for extracted service
- Add executeModelSwitch mock for ./telegram/model-switch module
- Update activeOptionIds format from flat array to { providerIds, subscriptionIds }
- Update success assertions to verify executeModelSwitchMock args instead
of checking SSH runtime calls directly (those are now tested in the
service layer)
- Update SSH failure test to mock executeModelSwitch rejection
* refactor(dashboard): extract polling logic into useDashboardPolling hook
DashboardStatusOverview was 313 lines — just over the react-doctor
no-giant-component threshold of 300. The component already had good
subcomponent extraction (ServerInventoryCard, StatusCard, VpsHealthCard,
etc.), but ~125 lines of polling state machine logic (7 refs, 4 useState,
5 functions, 1 useMountEffect) was mixed into the component body.
Extract all polling machinery into a useDashboardPolling custom hook
that returns { snapshot, fetchState, fetchError, pollingPaused,
refreshStatus, handleManualRetry }. The main component drops to ~190
lines and focuses purely on rendering.
No behavior change — the polling logic is identical, just relocated.
React Doctor: no-giant-component warning resolved.
* refactor(providers): replace boolean props with explicit status enum in AccessSelectionActions
AccessSelectionActions took isSaving and isTesting as separate booleans,
creating an implicit 4-state system (idle, saving, testing, both) when
only 3 are valid. The \"both\" state is impossible in practice but the
type system allowed it.
Replace with status: \"idle\" | \"saving\" | \"testing\" — a single prop
that makes the valid states explicit and eliminates the dead 4th state
at the type level. Call sites derive the status from the existing
booleans with a ternary.
React Doctor: prefer-explicit-variants warning resolved.
Also: added react-doctor GitHub Actions CI workflow (npx react-doctor install).
* test(crypto): add unit tests for AES-256-GCM secret helpers
Pins round-trip, tamper rejection, malformed payload, missing key, and
all three decryptApiServerKey branches (empty, legacy plaintext,
encrypted round-trip, malformed-with-dot). Uses real node:crypto — no
mocking — since testing the actual implementation is the point.
Plan 001 from audit at 29c3b46.
* fix(agent-skills): fail skill install when remote download fails
The curl | sudo tee pipeline masked curl non-zero exit code because
in a shell pipeline only the last command exit status propagates. A
failed download (404, DNS, network drop) would write an empty SKILL.md
and report success.
Replace with download-to-temp + test -s + mv: curl writes to a .download
temp file, test -s verifies it is non-empty, then mv atomically moves it
to the final path. Any failure in the && chain aborts the install.
Plan 002 from audit at 29c3b46.
* fix(ssh): fail closed when host-key pin missing on credential ops
Add requireHostKeyPin flag to SshConnectionInput. When set, the
hostVerifier throws host_key_missing if no expectedFingerprint is
stored, instead of silently accepting any key (fail-open).
Every post-registration withSshConnection caller now passes
requireHostKeyPin: true. The first-connect flow (verifyServerConnection)
is unchanged — it still tolerates a missing fingerprint so the pin can
be stored on initial connection.
Covers 15 call sites across 13 files. Plan 003 from audit at 29c3b46.
* test(request-guards): cover auth and ownership boundary
Pins status codes for requireAuthSession (401), requireOwnedServer
(400 missing id, 401 unauthenticated, 404 not owned), and
requireOwnedServerSsh (400 SSH failure, 404 not owned). Asserts that
getOwnedServerRecord is called with the session user.id — the IDOR
guard. Safety net for the later guard-consolidation refactor (DEBT-01).
Plan 004 from audit at 29c3b46.
* test(server-records): cover credential resolution branches
Pins resolveServerCredential branches: stored-present, stored-missing,
session-missing-id, session-expired, session-valid. Covers
normalizeAuthMethod (password, ssh-key, unsupported) and both
resolveServerSshConfigOrError branches (ok + error). Mocks crypto and
credentials — no DB or real crypto needed.
Plan 005 from audit at 29c3b46.
* chore: mark all audit plans as DONE
* docs: add audit plan files
* refactor(ssh): default to requiring host-key pin, opt out for first-connect
Invert the requireHostKeyPin flag to allowMissingHostKeyPin. The secure
behavior (fail closed when no fingerprint is stored) is now the default.
Only verifyServerConnection (first-connect) opts out by passing
allowMissingHostKeyPin: true.
This eliminates requireHostKeyPin: true from 15 call sites and removes
the field from DeployComposeInput, ManagedComposeDeployInput, and
SshConnectionConfig intermediate types. A future caller that forgets
the flag is safe by default.
Also extracts parseOptionId helper and generic keepLatestBy to eliminate
copy-paste. Uses composition for DeployComposeInput (extends
SshConnectionInput instead of duplicating its fields).
Review fixes #1, #2, #4, #5.
* chore(plans): remove completed audit plan documents
All five audit plans (001-005) were fully implemented and verified:
- 001: crypto helper tests in server/crypto.test.ts
- 002: pipefail fix for skill install downloads
- 003: host-key pinning fail-closed on credential SSH ops
- 004: request guard tests in server/request-guards.test.ts
- 005: credential resolution tests in server/server-records.test.ts
These planning artifacts served their purpose and are now cleaned up.
* feat(server): handle SSH host-key errors on provider deploy endpoints
Catch SshConnectError with host_key_missing and host_key_mismatch
codes during deploy flow and return a structured 409 response with
observed fingerprint, algorithm, and (on mismatch) expected fingerprint.
Update test mocks to use importOriginal so SshConnectError is available
for new test assertions.
* feat(server): handle SSH host-key errors on Telegram model switch
Catch SshConnectError with host_key_missing and host_key_mismatch
during switchModelProvider and return a structured 409 response.
Add test coverage for both host-key error codes and the generic
SSH error fallback (502).
* feat(providers): add host-key trust UI for provider deploy
Surface host-key errors from the deploy endpoint as a warning panel
showing observed and expected fingerprints. Add "Trust host key and
retry" flow that POSTs to /api/servers/:id/host-key/accept then
re-attempts the deploy. Wire the new state, reducer actions, and
controller method into ProviderSettings.
* feat(server): extract shared host-key error response utility
Deduplicate the repetitive SSH host-key error handling that was inlined
across deploy endpoints. The new isRecoverableHostKeyError type guard and
hostKeyErrorResponse builder produce a consistent 409 JSON payload with
structured hostKey (observed fingerprint/algorithm, expected on mismatch).
* feat(client): add shared host-key recovery types and trust panel UI
Extract HostKeyErrorPayload type and parseHostKeyErrorPayload parser to a
shared module under features/servers. The HostKeyTrustPanel component
renders the host-key trust prompt (fingerprint display, trust/retry and
cancel buttons) so it can be reused across provider deploy and Telegram
model-switch views.
* refactor(server): deduplicate host-key error handling in deploy endpoints
Replace the duplicated host_key_missing / host_key_mismatch inline blocks
in deployProviderToHermes, deployToHermesAgent, and switchModelProvider
with the shared isRecoverableHostKeyError / hostKeyErrorResponse utilities.
On the client side, swap the inline host-key error parsing in
provider-access-actions for parseHostKeyErrorPayload and replace the
inline host-key alert UI in provider-settings-aside with the reusable
HostKeyTrustPanel component.
* refactor(telegram): extract ModelAccessForm sub-component
Pull the inline option selector, model picker, and switch/refresh buttons
into a self-contained ModelAccessForm component to reduce clutter in the
TelegramModelAccessSection render method.
* refactor(contracts,providers): extract shared host-key types and acceptHostKey action
Create shared/contracts/host-key-error.ts with canonical HostKeyErrorCode and HostKeyErrorResponsePayload types, replacing duplicate definitions on both sides of the boundary.\n\nAdd acceptHostKey() to provider-access-actions.ts as a reusable action for trusting server host keys, using the existing readJsonBody helper.\n\nRestore try/catch in handleTrustAndRetryDeploy to prevent unhandled rejections on network errors.\n\nUpdate server/lib/host-key-error-response.ts, src/features/servers/host-key-recovery.ts, and src/features/providers/provider-access-actions.ts to import from the shared contract.
* refactor(server): extract resolveSwitchOption branch resolvers
Split the large multi-branch resolveSwitchOption function in server/telegram/model-access.ts into three self-contained resolver functions (resolveApiProviderOption, resolveCredentialSubscriptionOption, resolveOAuthSubscriptionOption) with a switch-based dispatcher. Each resolver handles its own DB query, validation, and ResolvedOption construction. No behavioral changes.
* feat(telegram): add host-key trust-and-retry to model switch
Extract state machine and data fetching from telegram-model-access-section.tsx into use-model-access-controller.ts with host-key recovery support.\n\nThe hook now parses 409 host-key error responses via parseHostKeyErrorPayload and exposes a HostKeyTrustPanel flow (handleTrustAndRetrySwitch) instead of showing a generic error. This brings the Telegram model switch UX to parity with the AI provider deploy flow for host-key recovery.\n\nThe section component delegates to the hook and renders HostKeyTrustPanel when a host-key error is detected during model switching.
* chore(pre-commit): move typecheck hook to push stage
Move TypeScript typecheck from commit stage to push stage so partial staging is not blocked by type errors in unstaged files. Full typecheck still runs on every git push.
* fix(server): add host key error recovery to deploy and test endpoints
Add isRecoverableHostKeyError checks before generic 502 fallbacks in three server paths that were missing host key recovery:\n\n- deployTelegramToServer: initial Telegram bot deploy to a server\n- testTelegramBot: test-bot flow over SSH\n- deployToTelegramLinkedHermes: skills/agent deploy to Telegram-linked server\n\nWhen a server has hostKeyFingerprint NULL in the DB, SSH connections throw 'host key pin required but not stored'. These paths previously returned a generic 502; they now return a 409 with a hostKeyErrorResponse payload so the client can surface the trust-and-retry panel.\n\nPattern matches the existing handlers: switchModelProvider, deployToHermesAgent, deployProviderToHermes.
* feat(telegram): add host key trust panel to deploy and test sections
Add HostKeyTrustPanel with trust-and-retry flow to the Telegram deploy section (initial bot deploy) and test section (test-bot flow).\n\nBoth sections now parse 409 responses via parseHostKeyErrorPayload and surface a HostKeyTrustPanel when a host key error is detected. On trust, the host key is accepted via POST /api/servers/:id/host-key/accept and the operation retries automatically.\n\nState management consolidated into useReducer (with useRef for stale-closure avoidance) to match the pattern in use-model-access-controller.ts.\n\nBrings the Telegram deploy and test UIs to parity with the model switch and provider deploy flows for host key recovery.
* test(telegram): add host key error recovery integration tests
Add integration tests for host key recovery in deployTelegramToServer and testTelegramBot endpoints.\n\nVerifies that both endpoints return 409 with hostKey recovery payload when SSH fails with host_key_missing, instead of the previous generic 502. Confirms no audit log is written for recoverable errors and no deploy state is persisted.
* docs(api): document host key recovery flow and new endpoints
Add Host Key Recovery section to the API reference explaining the 409 response format, error codes (host_key_missing/host_key_mismatch), recovery flow, and list of affected endpoints.\n\nDocument new POST /api/servers/:id/host-key/accept endpoint.\n\nAdd previously undocumented Telegram endpoints: POST /api/telegram/deploy, POST /api/telegram/test, POST /api/telegram/model-switch.\n\nUpdate error response tables on all 8 deploy endpoints (telegram deploy/test/model-switch, providers deploy, persona/mcp/agent-skills deploy, web-ui deploy) to include 409 for host key errors.
* docs(glossary): add host key pin and recovery terminology
Add two glossary entries to CONTEXT.md:\n\n- Host Key Pin: explains the stored SHA256 fingerprint, its role in SSH verification, and the host_key_missing/host_key_mismatch error contract\n- Host Key Recovery: explains the 409 -> trust panel -> accept -> retry flow and lists all 8 endpoints that support it
---------
Co-authored-by: Hermes Agent <agent@nousresearch.com>
Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
What
Introduced Telegram bot deployment orchestration, testing functionality,
AI provider config deployment to Hermes, and expanded provider support
for Ollama (Local) and Custom (BYO) OpenAI-compatible endpoints.
Why
To allow users to:
How
ai_providerswithbase_urlcolumn;telegram_configswithdeployed_server_id,deployed_server_host,api_server_key./api/telegram/deploy— SSHs into VPS, writes customizeddocker-compose.ymlwith bot token + provider env vars, restarts Hermes gateway./api/providers/deploy— SSHs into the same VPS,rewrites compose with current provider env vars + model (
HERMES_DEFAULT_MODEL),restarts Hermes, runs
docker exec hermes hermes config set model <model>./api/telegram/test— SSH-tunneledcurlchat completions requestto verify bot-to-Hermes responsiveness.
TelegramSettingsandProviderSettingsfor deploy/test/custom modeloptions. New "Hermes deployment" aside section with deploy button (visible only when a
Telegram bot is on a VPS).
Checklist
Summary by CodeRabbit
New Features
UI
Tests
Documentation
Chores