Skip to content

feat(telegram): add Telegram bot deployment, provider deploy to Hermes, and Ollama/Custom provider support#5

Merged
jellydn merged 82 commits into
mainfrom
feat/telegram-provider
Jun 1, 2026
Merged

feat(telegram): add Telegram bot deployment, provider deploy to Hermes, and Ollama/Custom provider support#5
jellydn merged 82 commits into
mainfrom
feat/telegram-provider

Conversation

@jellydn

@jellydn jellydn commented May 31, 2026

Copy link
Copy Markdown
Owner

What

Introduced Telegram bot deployment orchestration, testing functionality,
AI provider config deployment to Hermes, and expanded provider support
for Ollama (Local) and Custom (BYO) OpenAI-compatible endpoints.

Why

To allow users to:

  1. Deploy their Telegram bot directly to their Hermes VPS instances
  2. Verify bot responsiveness using SSH-channeled testing endpoints
  3. Push saved AI provider configs (model + API key) to their Hermes VPS
  4. Hook Hermes up to local models (Ollama) or custom endpoints (DeepSeek, Together AI, etc.)

How

  • DB Schema: Extended ai_providers with base_url column; telegram_configs with
    deployed_server_id, deployed_server_host, api_server_key.
  • Telegram Deploy: /api/telegram/deploy — SSHs into VPS, writes customized
    docker-compose.yml with bot token + provider env vars, restarts Hermes gateway.
  • Provider Deploy to Hermes: /api/providers/deploy — SSHs into the same VPS,
    rewrites compose with current provider env vars + model (HERMES_DEFAULT_MODEL),
    restarts Hermes, runs docker exec hermes hermes config set model <model>.
  • Telegram Test: /api/telegram/test — SSH-tunneled curl chat completions request
    to verify bot-to-Hermes responsiveness.
  • Ollama & Custom Providers: Custom model ID and base URL form fields on the frontend.
  • UI: Refactored TelegramSettings and ProviderSettings for deploy/test/custom model
    options. New "Hermes deployment" aside section with deploy button (visible only when a
    Telegram bot is on a VPS).

Checklist

  • Tests added or updated
  • Documentation updated
  • No breaking changes (or breaking changes documented above)

Summary by CodeRabbit

  • New Features

    • Ollama & custom BYO AI providers (base URL support) and provider deploy-to-Hermes over SSH; remote model set.
    • Telegram: connect, deploy to a server, test messages, and list/approve pairings via the app.
  • UI

    • Provider settings: Base URL field, deploy button, updated model handling.
    • Telegram settings: separate connect/deploy/pair/test sections and sidebar deployment status.
  • Tests

    • Expanded provider/telegram/deploy test coverage.
  • Documentation

    • New ADRs and README Docker Compose full‑stack guidance.
  • Chores

    • CI TypeScript caching and .gitignore updates.

@changeset-bot

changeset-bot Bot commented May 31, 2026

Copy link
Copy Markdown

⚠️ No Changeset found

Latest commit: 2121751

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai

coderabbitai Bot commented May 31, 2026

Copy link
Copy Markdown

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3b228e39-0a31-4fb7-a43a-41ffe517f2d9

📥 Commits

Reviewing files that changed from the base of the PR and between f132c61 and 2121751.

📒 Files selected for processing (4)
  • README.md
  • docs/adr/0010-hermes-runtime-management-from-telegram-page.md
  • server/telegram.test.ts
  • server/telegram/pairings.ts

Walkthrough

Adds provider baseUrl support (Ollama/custom), Telegram deploy/pairing/test flows wired over SSH with compose generation, SSH utility refactor, dashboard records/summaries/metrics, react-hook-form/zod migrations and UI components, TypeScript incremental builds with CI cache, extensive tests, and updated docs/ADRs.

Changes

Provider, Telegram, SSH, and Deployment

Layer / File(s) Summary
Database schema and migrations
server/db/schema.ts, drizzle/0001_overjoyed_annihilus.sql, drizzle/meta/*
Adds ai_providers.base_url and telegram_configs.deployed_server_id, deployed_server_host, api_server_key; updates Drizzle snapshot and journal.
Provider config and connection
server/providers/config.ts, server/providers/connection.ts, server/providers/records.ts, server/providers.ts
New provider types and env mapping, decrypt/get-last4 helpers, verifyProviderConnection supporting baseUrl, persist/return baseUrl in provider flows.
Provider Hermes deploy support
server/deploy.ts, server/compose.ts
Implements compose content builder and deployProviderToHermes handler that builds env vars (including derived custom API var), deploys via SSH, sets model, and writes audit logs.
Provider frontend
src/features/providers/provider-settings.tsx, src/features/providers/*
Adds baseUrl field, validation, UI, deploy button posting to /api/providers/deploy, and tests for Ollama/custom providers.
Telegram config, deploy, test, pairing
server/telegram/*.ts, server/telegram/*, src/features/telegram/*
Reorgs Telegram modules, adds deployTelegramToServer and testTelegramBot handlers (SSH deploy, apiServerKey), pairing list/approve endpoints executing Hermes Python inside container, and corresponding frontend components and tests.
SSH utilities and quoting
server/ssh/*
Splits SSH code into submodules: connection wrapper, errors, OS parsing/validation, shellQuote utility; re-exports via barrel.
Audit/records/helpers & dashboard
server/*/records.ts, server/dashboard/*
Adds record helpers for dashboard, servers, installs, providers, telegram, plus summaries and getVpsSummary with short in-memory caching for VPS health.
Tests
server/*.test.ts, src/**/*.test.tsx
Expands tests for provider deploys, Telegram pairing/deploy/test, SSH quoting, model validation regex, and adapts existing tests to new shapes and mocks.

Forms, UI, and UX

Layer / File(s) Summary
Form libraries and recipes
package.json, justfile
Adds react-hook-form, zod, @hookform/resolvers; updates just recipes (parallel check, lint, format, ci).
Login and connection wizard
src/routes/login.tsx, src/features/servers/connection-wizard.tsx
Migrates login and ConnectionWizard to react-hook-form + zod (login) and useForm step validation (connection wizard).
Telegram UI
src/features/telegram/*
Adds modular Telegram UI sections: connect/deploy/pairing/test/sidebar; TelegramSettings delegates to these components and surfaces deployed-host state; tests added/updated.

Build, tooling, and docs

Layer / File(s) Summary
TypeScript incremental and CI cache
tsconfig.json, .github/workflows/*, .gitignore
Enables incremental builds (.tsbuildinfo), caches .tsbuildinfo in CI, and ignores .tsbuildinfo.
Vite async config and tests
vite.config.ts
Converts to async config to skip plugins during Vitest, sets Vitest env to node, expands test include globs.
Docker/compose/CSS/documentation
compose.yaml, Dockerfile, src/styles.css, README.md, docs/adr/*, .planning/*
Adds Mailpit, tightens compose env checks, removes runtime tsconfig copy and a footer CSS rule, updates README, API reference, ADRs, and planning docs.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • jellydn/hermes-hub#3: Overlaps with HTTPS middleware and new server endpoints that previously wrapped handlers with requireHttps.

"🐰 In the burrow I stitched a deploy and a test,
Base-URLs, pairings, and SSH done best.
Forms hop to order, builds cached and spry,
Hermes wakes up — the bots say hi! 🥕"

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/telegram-provider

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for Ollama and Custom AI providers by adding a base_url column to the database and updating the provider settings UI. It also adds features to deploy and test Telegram bots on a VPS running Hermes. The review feedback highlights several critical and high-severity issues: a missing database migration for the new telegram_configs columns, potential key corruption in decryptApiKey when parsing keys starting with {, hardcoded Docker image names instead of using the defaultHermesImage constant, and logical bugs where old deployment status is incorrectly preserved when connecting a new bot. Additionally, the Telegram bot testing endpoint needs more robust error handling for API and parsing failures.

Comment thread server/db/schema.ts
Comment on lines +184 to +186
deployedServerId: text("deployed_server_id"),
deployedServerHost: text("deployed_server_host"),
apiServerKey: text("api_server_key"),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The database migration for the new columns added to telegram_configs (deployed_server_id, deployed_server_host, and api_server_key) is missing. Only the migration for ai_providers.base_url was added in drizzle/0001_ai_providers_base_url.sql. Please run drizzle-kit generate to generate the missing migration, or manually add the statements to a new migration file.

Comment thread server/providers.ts Outdated
Comment on lines +40 to +55
function decryptApiKey(encryptedStr: string): string {
try {
const decrypted = decryptSecret(encryptedStr);
// Backward-compatibility: keys saved before the explicit baseUrl column
// may have been stored as JSON {apiKey, baseUrl}. Unwrap if so.
if (decrypted.startsWith("{")) {
const parsed = JSON.parse(decrypted) as Record<string, unknown>;
if (typeof parsed.apiKey === "string") {
return parsed.apiKey;
}
}
return decrypted;
} catch {
return "";
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If JSON.parse fails because the decrypted key starts with { but is not valid JSON, the entire function will throw an error and return an empty string "". This will corrupt valid API keys that happen to start with {. Wrap the JSON.parse call in its own try-catch block so that if parsing fails, the function falls back to returning the raw decrypted string.

function decryptApiKey(encryptedStr: string): string {
	try {
		const decrypted = decryptSecret(encryptedStr);
		// Backward-compatibility: keys saved before the explicit baseUrl column
		// may have been stored as JSON {apiKey, baseUrl}. Unwrap if so.
		if (decrypted.startsWith("{")) {
			try {
				const parsed = JSON.parse(decrypted) as Record<string, unknown>;
				if (typeof parsed.apiKey === "string") {
					return parsed.apiKey;
				}
			} catch {
				// If parsing fails, treat it as a raw key starting with '{'
			}
		}
		return decrypted;
	} catch {
		return "";
	}
}

Comment thread server/telegram.ts Outdated
Comment on lines +11 to +12
import { normalizeAuthMethod, resolveServerCredential } from "./server-records";
import { withSshConnection } from "./ssh";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Import defaultHermesImage from ./constants to avoid hardcoding the Docker image name in the deployment step.

Suggested change
import { normalizeAuthMethod, resolveServerCredential } from "./server-records";
import { withSshConnection } from "./ssh";
import { defaultHermesImage } from "./constants";
import { normalizeAuthMethod, resolveServerCredential } from "./server-records";
import { withSshConnection } from "./ssh";

Comment thread server/telegram.ts Outdated
Comment on lines +141 to +143
deployedServerId: existing?.deployedServerId ?? null,
deployedServerHost: existing?.deployedServerHost ?? null,
apiServerKey: existing?.apiServerKey ?? null,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When connecting a new Telegram bot, copying the active deployment status (deployedServerId, deployedServerHost, apiServerKey) from the existing config creates a split-brain scenario. The UI/database will claim the new bot is deployed, but the VPS is actually still running the old bot token. These fields should be reset to null when a new bot token is connected, forcing the user to redeploy the new token.

Suggested change
deployedServerId: existing?.deployedServerId ?? null,
deployedServerHost: existing?.deployedServerHost ?? null,
apiServerKey: existing?.apiServerKey ?? null,
deployedServerId: null,
deployedServerHost: null,
apiServerKey: null,

Comment thread server/telegram.ts Outdated
botUsername: bot.username,
botTokenLast4: getTokenLast4(botToken),
isActive: true,
deployedServerHost: existing?.deployedServerHost ?? null,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Reset the deployedServerHost to null in the returned summary when a new bot is connected to ensure the frontend is aware that the new bot token has not been deployed yet.

Suggested change
deployedServerHost: existing?.deployedServerHost ?? null,
deployedServerHost: null,

Comment thread server/telegram.ts Outdated
Comment on lines +263 to +279
const composeContent = [
"services:",
" hermes:",
" image: nousresearch/hermes-agent:latest",
" container_name: hermes",
" restart: unless-stopped",
" command: gateway run",
" ports:",
' - "8642:8642"',
" volumes:",
" - ~/.hermes:/opt/data",
" environment:",
" - API_SERVER_ENABLED=true",
" - API_SERVER_HOST=0.0.0.0",
` - API_SERVER_KEY=${apiServerKey}`,
` - TELEGRAM_BOT_TOKEN=${decryptedToken}`,
].join("\n");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The Docker image name nousresearch/hermes-agent:latest is hardcoded here, whereas server/install.ts uses the defaultHermesImage constant imported from ./constants. Use defaultHermesImage here as well to maintain consistency.

Suggested change
const composeContent = [
"services:",
" hermes:",
" image: nousresearch/hermes-agent:latest",
" container_name: hermes",
" restart: unless-stopped",
" command: gateway run",
" ports:",
' - "8642:8642"',
" volumes:",
" - ~/.hermes:/opt/data",
" environment:",
" - API_SERVER_ENABLED=true",
" - API_SERVER_HOST=0.0.0.0",
` - API_SERVER_KEY=${apiServerKey}`,
` - TELEGRAM_BOT_TOKEN=${decryptedToken}`,
].join("\n");
const composeContent = [
"services:",
" hermes:",
` image: ${defaultHermesImage}`,
" container_name: hermes",
" restart: unless-stopped",
" command: gateway run",
" ports:",
' - "8642:8642"',
" volumes:",
" - ~/.hermes:/opt/data",
" environment:",
" - API_SERVER_ENABLED=true",
" - API_SERVER_HOST=0.0.0.0",
` - API_SERVER_KEY=${apiServerKey}`,
` - TELEGRAM_BOT_TOKEN=${decryptedToken}`,
].join("\n");

Comment thread server/telegram.ts Outdated
Comment on lines +450 to +464
let parsed: { choices?: Array<{ message?: { content?: string } }> };
try {
parsed = JSON.parse(stdout) as {
choices?: Array<{ message?: { content?: string } }>;
};
} catch {
throw new Error(
`Invalid JSON from Hermes API: ${stdout.slice(0, 200)}`,
);
}

const content = parsed.choices?.[0]?.message?.content;
if (!content) {
throw new Error("No response content from Hermes");
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the Hermes API returns an error response (e.g., 401 Unauthorized), curl will still exit with code 0 (since --fail is not used). The response JSON will be parsed successfully, but parsed.choices will be undefined, resulting in a generic 'No response content from Hermes' error. Additionally, if JSON.parse returns null, accessing parsed.choices will throw a TypeError. Check if parsed is a valid object, and check for any error returned by the API to propagate a more helpful error message.

				let parsed: any;
				try {
					parsed = JSON.parse(stdout);
				} catch {
					throw new Error(
						`Invalid JSON from Hermes API: ${stdout.slice(0, 200)}`,
					);
				}

				if (parsed && typeof parsed === "object" && parsed.error) {
					const errMsg = typeof parsed.error === "object" && parsed.error.message
						? parsed.error.message
						: JSON.stringify(parsed.error);
					throw new Error(`Hermes API returned error: ${errMsg}`);
				}

				const content = parsed && typeof parsed === "object" ? parsed.choices?.[0]?.message?.content : null;
				if (!content) {
					throw new Error("No response content from Hermes");
				}

Comment on lines +69 to +75
setSavedConfig({
...payload.telegram,
deployedServerHost:
payload.telegram.deployedServerHost ??
savedConfig?.deployedServerHost ??
null,
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If a user connects a new bot token, the old deployment is no longer valid for this new token. Preserving the old deployedServerHost here will make the UI falsely claim that the new bot is already deployed, and enable the test section even though the VPS is still running the old bot token. Directly use the payload.telegram returned by the backend (which should have deployedServerHost reset to null for a new connection).

			setSavedConfig(payload.telegram);

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/telegram.ts (1)

129-143: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clear deploy metadata when the bot token changes.

This keeps the new token marked as deployed even though the remote docker-compose.yml still has the previous TELEGRAM_BOT_TOKEN until /api/telegram/deploy runs again. The result is a false deployed state and misleading test behavior for the newly connected bot.

Suggested fix
-		const existing = await getLatestTelegramRecord(session.user.id);
-
 		await db
 			.update(telegramConfigs)
 			.set({ isActive: false })
 			.where(eq(telegramConfigs.userId, session.user.id));

 		await db.insert(telegramConfigs).values({
 			userId: session.user.id,
 			botToken: encryptSecret(botToken),
 			botUsername: bot.username,
 			isActive: true,
-			deployedServerId: existing?.deployedServerId ?? null,
-			deployedServerHost: existing?.deployedServerHost ?? null,
-			apiServerKey: existing?.apiServerKey ?? null,
+			deployedServerId: null,
+			deployedServerHost: null,
+			apiServerKey: null,
 		});
...
 		return context.json({
 			telegram: {
 				botUsername: bot.username,
 				botTokenLast4: getTokenLast4(botToken),
 				isActive: true,
-				deployedServerHost: existing?.deployedServerHost ?? null,
+				deployedServerHost: null,
 			} satisfies TelegramConfigSummary,
 		});

Also applies to: 156-162

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/telegram.ts` around lines 129 - 143, When inserting the new
telegramConfigs row, clear deployedServerId/deployedServerHost/apiServerKey
whenever the bot token changed so we don't preserve a "deployed" state for a
different token; use getLatestTelegramRecord and compare existing?.botToken to
encryptSecret(botToken) (or decrypt then compare) to decide whether to carry
over those fields, and set them to null when they differ. Update the same logic
in the second insert block that covers lines 156-162 to match.
🧹 Nitpick comments (1)
server/telegram.test.ts (1)

43-159: ⚡ Quick win

Add coverage for the new deploy/test handlers and reconnect state.

This suite still only exercises connect/disconnect. The new server-side behavior needs regression coverage for deploy success/failure, undeployed test rejection, and reconnect clearing deployment metadata, otherwise the stale-deployment path can slip through unchanged.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/telegram.test.ts` around lines 43 - 159, Add tests in telegram.test.ts
for the new deploy/test and reconnect flows: write unit tests that call the
exported handlers (e.g., deployTelegram, testTelegram, reconnectTelegram)
similar to connectTelegram/disconnectTelegram; mock fetch responses to simulate
deploy success and failure and assert response status/body and that
updateSet/insertValues are called with expected fields (deployedServerHost,
isActive, botToken encryption via encryptSecret), add a test that ensures
testTelegram rejects when the bot is not deployed (selectLimit returning empty)
and returns the proper error, and add a reconnectTelegram test that verifies
reconnect clears stale deployment metadata (deployedServerHost -> null) and sets
isActive appropriately by inspecting updateSet/insertValues/selectLimit/mock
calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@drizzle/0001_ai_providers_base_url.sql`:
- Line 1: The migration currently only adds the "base_url" column to the
"ai_providers" table but must also add the Telegram-related columns declared in
server/db/schema.ts: add nullable text columns "deployed_server_id",
"deployed_server_host", and "api_server_key" to the same ALTER TABLE statement
so upgraded DBs include those fields that server/telegram.ts reads/writes;
update drizzle/0001_ai_providers_base_url.sql to ALTER TABLE "ai_providers" ADD
COLUMN for each of those three columns (nullable text) alongside "base_url".

In `@server/providers.ts`:
- Around line 183-187: The code currently passes user-supplied
resolvedApiKey.baseUrl into verifyProviderConnection (and similar call sites
around the other ranges) allowing SSRF; add a pre-check that parses and
constrains the baseUrl before any outbound fetch: create/ reuse a helper like
validateOutboundUrl(url, provider) and call it where verifyProviderConnection is
invoked (e.g., before verifyProviderConnection in the block using
parsed.provider and resolvedApiKey.baseUrl and the other call sites at ~278-288
and ~304-368); the helper must (1) require http(s) scheme, (2) normalize/resolve
host and reject localhost/127.0.0.0/8, ::1, link-local and RFC1918/private IP
ranges, and (3) enforce provider-specific allowlists/denylists (for 'custom'
only allow pre-configured domains or explicitly whitelisted hosts), and
throw/return an error if validation fails so the code never fetches an arbitrary
user-supplied URL.

In `@server/telegram.ts`:
- Around line 283-306: The SSH exec calls inside the withSshConnection callback
(the ssh.execCommand invocations that produce writeResult and restartResult when
using writeCmd and "cd ~/hermes && sudo docker compose up -d") need to be
bounded with a timeout like testTelegramBot does; update both ssh.execCommand
calls to pass execOptions.timeout (or an object with timeout set) so the remote
write and restart won't hang the HTTP request, and ensure errors still check
result.code as before.
- Around line 263-281: The inline composeContent and writeCmd in
server/telegram.ts overwrite ~/hermes/docker-compose.yml with a hardcoded
nousresearch/hermes-agent:latest which can drift from the install-time compose
built from defaultHermesImage in server/install.ts; instead, reuse the canonical
compose template or image value by importing/reading the same source (the
defaultHermesImage or the compose-builder used in server/install.ts) and
generate the composeContent dynamically from that shared value so the deploy
path does not hardcode the image/version or duplicate the template logic (update
references in composeContent and writeCmd to consume the shared
defaultHermesImage/compose builder).
- Around line 261-315: Persisting apiServerKey in plaintext is unsafe; before
saving the generated apiServerKey (the variable apiServerKey used in the
db.update(telegramConfigs) call) call encryptSecret() and store the encrypted
blob in the DB field (e.g., apiServerKey) instead of raw text, and when you
later need to send Authorization headers decrypt it with decryptSecret()
immediately before constructing the header (replace any direct reads of
apiServerKey used to build Authorization with a decryptSecret call); ensure all
other places that persist or read apiServerKey follow the same pattern (use
encryptSecret() on save and decryptSecret() on use).

In `@src/features/providers/provider-settings.tsx`:
- Around line 57-69: The updateProvider function always resets model to
getDefaultAiModel(provider), which drops a previously saved model when toggling
providers back to the active one; change the model field in setForm inside
updateProvider to use savedConfig.model when the selected provider matches
savedConfig.provider and savedConfig.model exists (falling back to
getDefaultAiModel(provider) otherwise) so that providers like "openrouter",
"ollama", and "custom" preserve their saved model; locate updateProvider (and
helper getAiProviderOption) and update the model assignment logic accordingly.

In `@src/features/telegram/telegram-settings.tsx`:
- Around line 67-75: The code currently preserves deployedServerHost across
reconnects by falling back to savedConfig?.deployedServerHost when
payload.telegram.deployedServerHost is absent; change this so deployedServerHost
is taken only from payload.telegram (or null) to force redeploy on token
replace. Update the setSavedConfig call in Telegram settings (where
setSavedConfig({...payload.telegram, deployedServerHost: ...})) to remove the
savedConfig fallback and set deployedServerHost to
payload.telegram.deployedServerHost ?? null so a replaced token does not inherit
an old host.
- Around line 111-143: The handleDeploy function currently only handles non-2xx
responses but doesn't catch network/fetch rejections, so wrap the fetch call in
a try/catch (or catch the fetch promise) and on any thrown error call
setDeployError(error.message ?? String(error)) before returning; ensure
setIsDeploying is still cleared in the existing finally block. Do the same
defensive pattern for the equivalent test flow (the block handling test API
calls around lines 146-176) so both fetch rejections set testError/deployError
and prevent unhandled promise rejections.

---

Outside diff comments:
In `@server/telegram.ts`:
- Around line 129-143: When inserting the new telegramConfigs row, clear
deployedServerId/deployedServerHost/apiServerKey whenever the bot token changed
so we don't preserve a "deployed" state for a different token; use
getLatestTelegramRecord and compare existing?.botToken to
encryptSecret(botToken) (or decrypt then compare) to decide whether to carry
over those fields, and set them to null when they differ. Update the same logic
in the second insert block that covers lines 156-162 to match.

---

Nitpick comments:
In `@server/telegram.test.ts`:
- Around line 43-159: Add tests in telegram.test.ts for the new deploy/test and
reconnect flows: write unit tests that call the exported handlers (e.g.,
deployTelegram, testTelegram, reconnectTelegram) similar to
connectTelegram/disconnectTelegram; mock fetch responses to simulate deploy
success and failure and assert response status/body and that
updateSet/insertValues are called with expected fields (deployedServerHost,
isActive, botToken encryption via encryptSecret), add a test that ensures
testTelegram rejects when the bot is not deployed (selectLimit returning empty)
and returns the proper error, and add a reconnectTelegram test that verifies
reconnect clears stale deployment metadata (deployedServerHost -> null) and sets
isActive appropriately by inspecting updateSet/insertValues/selectLimit/mock
calls.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 62f36388-112a-4baf-a661-11ac0d36b00c

📥 Commits

Reviewing files that changed from the base of the PR and between c5c5bad and 15ff553.

📒 Files selected for processing (15)
  • drizzle/0001_ai_providers_base_url.sql
  • drizzle/meta/_journal.json
  • server/app.ts
  • server/db/schema.ts
  • server/install.ts
  • server/providers.test.ts
  • server/providers.ts
  • server/telegram.test.ts
  • server/telegram.ts
  • src/features/providers/provider-settings.test.tsx
  • src/features/providers/provider-settings.tsx
  • src/features/telegram/telegram-settings.test.tsx
  • src/features/telegram/telegram-settings.tsx
  • src/lib/ai-providers.ts
  • src/routes/telegram.tsx

Comment thread drizzle/0001_ai_providers_base_url.sql Outdated
@@ -0,0 +1 @@
ALTER TABLE "ai_providers" ADD COLUMN "base_url" text;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Add the telegram_configs columns in this migration too.

server/db/schema.ts now declares deployed_server_id, deployed_server_host, and api_server_key, and server/telegram.ts already reads/writes them. With only this ALTER TABLE "ai_providers" statement, upgraded databases will be missing those Telegram columns and the new deploy/test flows will fail at runtime.

Suggested migration addition
 ALTER TABLE "ai_providers" ADD COLUMN "base_url" text;
+ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_id" text;
+ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_host" text;
+ALTER TABLE "telegram_configs" ADD COLUMN "api_server_key" text;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ALTER TABLE "ai_providers" ADD COLUMN "base_url" text;
ALTER TABLE "ai_providers" ADD COLUMN "base_url" text;
ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_id" text;
ALTER TABLE "telegram_configs" ADD COLUMN "deployed_server_host" text;
ALTER TABLE "telegram_configs" ADD COLUMN "api_server_key" text;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@drizzle/0001_ai_providers_base_url.sql` at line 1, The migration currently
only adds the "base_url" column to the "ai_providers" table but must also add
the Telegram-related columns declared in server/db/schema.ts: add nullable text
columns "deployed_server_id", "deployed_server_host", and "api_server_key" to
the same ALTER TABLE statement so upgraded DBs include those fields that
server/telegram.ts reads/writes; update drizzle/0001_ai_providers_base_url.sql
to ALTER TABLE "ai_providers" ADD COLUMN for each of those three columns
(nullable text) alongside "base_url".

Comment thread server/providers.ts
Comment on lines 183 to 187
await verifyProviderConnection({
provider: parsed.provider,
apiKey: resolvedApiKey.apiKey,
baseUrl: resolvedApiKey.baseUrl,
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Constrain baseUrl before using it for backend fetches.

/api/providers/test now lets the caller choose the outbound URL that the server fetches. For the new custom path in particular, this becomes an SSRF primitive against internal hosts/services. Please validate the URL up front and enforce provider-specific allowlists/denylists instead of fetching any user-supplied target.

Also applies to: 278-288, 304-368

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/providers.ts` around lines 183 - 187, The code currently passes
user-supplied resolvedApiKey.baseUrl into verifyProviderConnection (and similar
call sites around the other ranges) allowing SSRF; add a pre-check that parses
and constrains the baseUrl before any outbound fetch: create/ reuse a helper
like validateOutboundUrl(url, provider) and call it where
verifyProviderConnection is invoked (e.g., before verifyProviderConnection in
the block using parsed.provider and resolvedApiKey.baseUrl and the other call
sites at ~278-288 and ~304-368); the helper must (1) require http(s) scheme, (2)
normalize/resolve host and reject localhost/127.0.0.0/8, ::1, link-local and
RFC1918/private IP ranges, and (3) enforce provider-specific
allowlists/denylists (for 'custom' only allow pre-configured domains or
explicitly whitelisted hosts), and throw/return an error if validation fails so
the code never fetches an arbitrary user-supplied URL.

Comment thread server/telegram.ts Outdated
Comment thread server/telegram.ts Outdated
Comment thread server/telegram.ts Outdated
Comment on lines 57 to 69
function updateProvider(provider: AiProviderId) {
const option = getAiProviderOption(provider);
setForm({
provider,
model: getDefaultAiModel(provider),
apiKey: "",
baseUrl:
option?.id === savedConfig?.provider && savedConfig?.baseUrl
? savedConfig.baseUrl
: provider === "ollama"
? "http://localhost:11434/v1"
: "",
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve the saved model when switching back to the active provider.

You already restore savedConfig.baseUrl here, but model always resets to the provider default. For openrouter, ollama, and custom, a quick provider toggle drops the saved custom model from the form and makes the next save easy to get wrong.

Suggested state fix
 	function updateProvider(provider: AiProviderId) {
 		const option = getAiProviderOption(provider);
+		const isSavedProvider = option?.id === savedConfig?.provider;
 		setForm({
 			provider,
-			model: getDefaultAiModel(provider),
+			model: isSavedProvider
+				? savedConfig.model
+				: getDefaultAiModel(provider),
 			apiKey: "",
 			baseUrl:
-				option?.id === savedConfig?.provider && savedConfig?.baseUrl
+				isSavedProvider && savedConfig?.baseUrl
 					? savedConfig.baseUrl
 					: provider === "ollama"
 						? "http://localhost:11434/v1"
 						: "",
 		});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function updateProvider(provider: AiProviderId) {
const option = getAiProviderOption(provider);
setForm({
provider,
model: getDefaultAiModel(provider),
apiKey: "",
baseUrl:
option?.id === savedConfig?.provider && savedConfig?.baseUrl
? savedConfig.baseUrl
: provider === "ollama"
? "http://localhost:11434/v1"
: "",
});
function updateProvider(provider: AiProviderId) {
const option = getAiProviderOption(provider);
const isSavedProvider = option?.id === savedConfig?.provider;
setForm({
provider,
model: isSavedProvider
? savedConfig.model
: getDefaultAiModel(provider),
apiKey: "",
baseUrl:
isSavedProvider && savedConfig?.baseUrl
? savedConfig.baseUrl
: provider === "ollama"
? "http://localhost:11434/v1"
: "",
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/features/providers/provider-settings.tsx` around lines 57 - 69, The
updateProvider function always resets model to getDefaultAiModel(provider),
which drops a previously saved model when toggling providers back to the active
one; change the model field in setForm inside updateProvider to use
savedConfig.model when the selected provider matches savedConfig.provider and
savedConfig.model exists (falling back to getDefaultAiModel(provider) otherwise)
so that providers like "openrouter", "ollama", and "custom" preserve their saved
model; locate updateProvider (and helper getAiProviderOption) and update the
model assignment logic accordingly.

Comment thread src/features/telegram/telegram-settings.tsx Outdated
Comment on lines +111 to +143
async function handleDeploy() {
setIsDeploying(true);
setDeployError(null);
setDeploySuccess(null);

try {
const response = await fetch("/api/telegram/deploy", {
method: "POST",
});

const payload = (await response.json().catch(() => null)) as {
error?: string;
status?: string;
serverHost?: string;
} | null;

if (!response.ok) {
setDeployError(payload?.error ?? "Deploy failed");
return;
}

const serverHost = payload?.serverHost ?? null;
// Update savedConfig so isDeployed / deployedHost derive from a single
// source of truth rather than a separate currentDeployStatus atom.
setSavedConfig((prev) =>
prev ? { ...prev, deployedServerHost: serverHost } : prev,
);
setDeploySuccess(
`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
);
} finally {
setIsDeploying(false);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle thrown fetch() failures in the new deploy/test flows.

These handlers only set UI errors for non-2xx responses. If fetch rejects, the promise bubbles out and the user gets no deployError/testError feedback.

Suggested pattern
 	try {
 		const response = await fetch("/api/telegram/deploy", {
 			method: "POST",
 		});
 ...
 			`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
 		);
+	} catch (error) {
+		setDeployError(error instanceof Error ? error.message : "Deploy failed");
 	} finally {
 		setIsDeploying(false);
 	}
 	try {
 		const response = await fetch("/api/telegram/test", {
 			method: "POST",
 			headers: { "content-type": "application/json" },
 			body: JSON.stringify({ message: testMessage }),
 		});
 ...
 
 			setTestResponse(payload?.response ?? "(empty response)");
+	} catch (error) {
+		setTestError(error instanceof Error ? error.message : "Test failed");
 	} finally {
 		setIsTesting(false);
 	}

Also applies to: 146-176

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/features/telegram/telegram-settings.tsx` around lines 111 - 143, The
handleDeploy function currently only handles non-2xx responses but doesn't catch
network/fetch rejections, so wrap the fetch call in a try/catch (or catch the
fetch promise) and on any thrown error call setDeployError(error.message ??
String(error)) before returning; ensure setIsDeploying is still cleared in the
existing finally block. Do the same defensive pattern for the equivalent test
flow (the block handling test API calls around lines 146-176) so both fetch
rejections set testError/deployError and prevent unhandled promise rejections.

jellydn added 12 commits May 31, 2026 10:13
- Add buildHermesComposeContent() to server/compose.ts for the
  duplicated docker-compose template between install and telegram
- Add resolveServerSshConfig() to server/server-records.ts to DRY
  up the normalizeAuthMethod + resolveServerCredential pattern
- Apply both helpers across install, server-actions, servers, and
  dashboard; remove local duplicate functions (getInstallCredential,
  getServerCredential, normalizeAuthMethod)
…n, handle API errors

- Reset deployedServerId/Host/apiServerKey to null when connecting a
  new bot token to prevent split-brain (old token still on VPS)
- Return deployedServerHost: null in connect response for consistency
- Wrap telegramConfigs update + auditLog insert in db.transaction()
- Handle Hermes API error responses in testTelegramBot curl results
  (curl does not use --fail, so error JSON was silently swallowed)
- Add null safety to JSON.parse result in testTelegramBot
- Remove frontend code that preserved stale deployedServerHost on reconnect
- Add DB migration for telegram_configs columns
… ternaries

- Merge identical ollama and custom provider test branches using the
  requiresBaseUrl flag from AiProviderOption
- Add defaultBaseUrl field to AiProviderOption metadata
- Replace 7 scattered form.provider === 'ollama' conditionals in
  provider-settings.tsx with provider metadata lookups
- Wrap JSON.parse in decryptApiKey with its own try-catch so keys
  starting with '{' that are not valid JSON fall back to raw text
globalThis.process guard prevents tree-shaking of dev-only early
return in requireHttps and the Resend API key check when Dockerfile
bakes NODE_ENV=production at build time.
Swap NODE_ENV to development so hot-reload works in local dev.
Mailpit provides a catch-all SMTP server with a web UI on port 8025,
replacing the need for a real Resend API key during development.
Covers TanStack Start + Hono dispatch, PostgreSQL + Drizzle ORM, Better
Auth with magic links, Docker multi-stage build, AES-256-GCM credential
encryption, TanStack Router file-based routing, and Tailwind v4 with
island components.
Covers four scenarios: selecting custom provider from empty state,
loading saved config with model/baseUrl, empty model/url, and undefined
baseUrl edge case.
feat(ui): migrate ConnectionWizard and login to react-hook-form + zod
Result: {"status":"keep","test_time_ms":7200}
Result: {"status":"keep","test_time_ms":7300}
@socket-security

socket-security Bot commented May 31, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Added@​hookform/​resolvers@​5.4.09910010090100
Addedreact-hook-form@​7.76.19710010095100

View full report

jellydn added 12 commits May 31, 2026 12:32
Result: {"status":"keep","test_time_ms":7900}
Result: {"status":"keep","test_time_ms":9200}
…art plugins in test environment

Result: {"status":"keep","test_time_ms":7300}
…hecking

Result: {"status":"keep","test_time_ms":7500}
Result: {"status":"keep","test_time_ms":4800}
Result: {"status":"keep","test_time_ms":4600}
…pilation

Result: {"status":"keep","test_time_ms":4600}
…time

Result: {"status":"keep","test_time_ms":5100}
… for component tests

Result: {"status":"keep","metric":5200}

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/telegram.ts (1)

295-308: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add timeouts to SSH exec commands to prevent request hangs.

Both ssh.execCommand calls here run without explicit timeouts, while testTelegramBot (line 441) and runHermesPairingJsonCommand (line 725) correctly use execOptions: { timeout: ... }. A hung SSH session or slow container restart can pin the HTTP request indefinitely.

🛡️ Proposed fix to add timeouts
 			async (ssh) => {
-				const writeResult = await ssh.execCommand(writeCmd);
+				const writeResult = await ssh.execCommand(writeCmd, {
+					execOptions: { timeout: 30_000 },
+				});
 				if (writeResult.code !== 0) {
 					throw new Error(
 						writeResult.stderr || "Failed to write docker-compose.yml",
 					);
 				}

 				const restartResult = await ssh.execCommand(
 					"cd ~/hermes && sudo docker compose up -d --force-recreate",
+					{ execOptions: { timeout: 120_000 } },
 				);
 				if (restartResult.code !== 0) {
 					throw new Error(restartResult.stderr || "Failed to restart Hermes");
 				}
 			},
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/telegram.ts` around lines 295 - 308, The two ssh.execCommand calls
inside the async callback (the one executing writeCmd and the subsequent restart
command) lack execOptions timeouts and can hang; update both calls (the
execCommand invoking writeCmd and the execCommand running "cd ~/hermes && sudo
docker compose up -d --force-recreate") to pass execOptions: { timeout: <ms> }
(use the same timeout constant used by
testTelegramBot/runHermesPairingJsonCommand or a sensible value like 120000ms),
propagate/handle a timeout by including the stderr/timeout info in the thrown
Error (e.g., throw new Error(restartResult.stderr || '... timed out')). Ensure
you reference these exact execCommand calls so both are updated.
♻️ Duplicate comments (1)
server/providers.ts (1)

616-623: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use decryptApiKey() for backward-compatible JSON unwrapping.

Same issue as getProviderDeployConfig: this uses decryptSecret() directly rather than decryptApiKey(), so legacy JSON-wrapped keys will be written verbatim to the Hermes compose file instead of the actual API key.

🐛 Proposed fix
 	let decryptedApiKey = "";
 	if (providerRecord.encryptedApiKey) {
-		try {
-			decryptedApiKey = decryptSecret(providerRecord.encryptedApiKey);
-		} catch {
-			return context.json({ error: "Failed to decrypt API key." }, 500);
-		}
+		decryptedApiKey = decryptApiKey(providerRecord.encryptedApiKey);
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/providers.ts` around lines 616 - 623, The code is calling
decryptSecret() directly which fails to unwrap legacy JSON-wrapped API keys;
replace the decryptSecret() usage with decryptApiKey() when populating
decryptedApiKey (i.e. where providerRecord.encryptedApiKey is handled) so legacy
JSON format is normalized the same way as getProviderDeployConfig; ensure you
import/keep reference to decryptApiKey() and use it in place of decryptSecret()
before returning or writing the Hermes compose file.
🧹 Nitpick comments (1)
server/app.test.ts (1)

19-27: ⚡ Quick win

Consider adding test cases for the new mocked routes.

The file adds mock declarations for five new route handlers but does not include test cases to verify that the routes are wired correctly. All existing routes in this file have corresponding tests (lines 86–464), so the new deployment and testing routes should follow the same pattern for consistency and coverage.

Missing route tests:

  • POST /api/providers/deploy
  • POST /api/telegram/deploy
  • POST /api/telegram/test
  • GET /api/telegram/pairings
  • POST /api/telegram/pairings/approve
Example test skeleton for POST /api/providers/deploy
+	it("routes provider deploy requests through the deploy handler", async () => {
+		deployProviderToHermes.mockResolvedValueOnce(
+			new Response(JSON.stringify({ deployedModel: "gpt-4o-mini" }), {
+				status: 200,
+				headers: { "content-type": "application/json" },
+			}),
+		);
+
+		const { apiApp } = await import("./app");
+		const response = await apiApp.request(
+			"http://localhost/api/providers/deploy",
+			{
+				method: "POST",
+			},
+		);
+
+		expect(response.status).toBe(200);
+		expect(deployProviderToHermes).toHaveBeenCalledTimes(1);
+	});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/app.test.ts` around lines 19 - 27, Add unit tests that exercise and
assert the five new mocked route handlers are wired like the existing tests:
create tests for POST /api/providers/deploy (assert deployProviderToHermes is
called with expected body and responds with correct status/body), POST
/api/telegram/deploy (assert deployTelegramToServer called), POST
/api/telegram/test (assert testTelegramBot called), GET /api/telegram/pairings
(assert listTelegramPairings called and response forwarded), and POST
/api/telegram/pairings/approve (assert approveTelegramPairing called with
correct payload); use the same test structure and fixtures as the other route
tests, stub the mock functions deployProviderToHermes, deployTelegramToServer,
testTelegramBot, listTelegramPairings, and approveTelegramPairing, and verify
both invocation and HTTP response handling for each route.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.planning/codebase/INTEGRATIONS.md:
- Around line 7-26: The four duplicated section headers in INTEGRATIONS.md
should be replaced with unique provider-specific headers so each block is
clearly identifiable; update the repeated "**AI Providers:**" headings to e.g.
"**OpenAI Models API:**", "**Anthropic Models API:**", "**OpenRouter API:**" and
"**Ollama / OpenAI-compatible endpoint:**" respectively, leaving the body text
intact and keeping existing references to server/providers.ts,
src/lib/ai-providers.ts, server/crypto.ts, and server/db/schema.ts so links
remain accurate.

In `@server/crypto.ts`:
- Around line 56-65: The decryptApiServerKey function currently swallows
decryption errors; update it to log a warning (including the caught error) when
decryptSecret throws and we fall back to returning plaintext, add a short
comment above decryptApiServerKey describing the backward-compatibility contract
(plaintext fallback for legacy or unencrypted values and that ENCRYPTION_KEY
misconfiguration will trigger warnings), and emit or increment an observability
metric/audit event when the fallback occurs (call your existing metric/audit
helper such as recordMetric/incrementCounter/auditLog or create one if missing)
so fallback usage is tracked.

In `@server/providers.ts`:
- Around line 544-551: In getProviderDeployConfig, replace the direct call to
decryptSecret(record.encryptedApiKey) with decryptApiKey(record.encryptedApiKey)
so legacy JSON-wrapped keys are unwrapped (matching behavior in
getCurrentProviderConfig and resolveProviderApiKey); preserve the existing
try/catch and error message behavior so decryptedApiKey remains set for
subsequent calls to buildProviderEnvMap and avoid writing raw JSON to Hermes env
vars.

---

Outside diff comments:
In `@server/telegram.ts`:
- Around line 295-308: The two ssh.execCommand calls inside the async callback
(the one executing writeCmd and the subsequent restart command) lack execOptions
timeouts and can hang; update both calls (the execCommand invoking writeCmd and
the execCommand running "cd ~/hermes && sudo docker compose up -d
--force-recreate") to pass execOptions: { timeout: <ms> } (use the same timeout
constant used by testTelegramBot/runHermesPairingJsonCommand or a sensible value
like 120000ms), propagate/handle a timeout by including the stderr/timeout info
in the thrown Error (e.g., throw new Error(restartResult.stderr || '... timed
out')). Ensure you reference these exact execCommand calls so both are updated.

---

Duplicate comments:
In `@server/providers.ts`:
- Around line 616-623: The code is calling decryptSecret() directly which fails
to unwrap legacy JSON-wrapped API keys; replace the decryptSecret() usage with
decryptApiKey() when populating decryptedApiKey (i.e. where
providerRecord.encryptedApiKey is handled) so legacy JSON format is normalized
the same way as getProviderDeployConfig; ensure you import/keep reference to
decryptApiKey() and use it in place of decryptSecret() before returning or
writing the Hermes compose file.

---

Nitpick comments:
In `@server/app.test.ts`:
- Around line 19-27: Add unit tests that exercise and assert the five new mocked
route handlers are wired like the existing tests: create tests for POST
/api/providers/deploy (assert deployProviderToHermes is called with expected
body and responds with correct status/body), POST /api/telegram/deploy (assert
deployTelegramToServer called), POST /api/telegram/test (assert testTelegramBot
called), GET /api/telegram/pairings (assert listTelegramPairings called and
response forwarded), and POST /api/telegram/pairings/approve (assert
approveTelegramPairing called with correct payload); use the same test structure
and fixtures as the other route tests, stub the mock functions
deployProviderToHermes, deployTelegramToServer, testTelegramBot,
listTelegramPairings, and approveTelegramPairing, and verify both invocation and
HTTP response handling for each route.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: debea91c-f7dd-433d-af21-9dac719df63c

📥 Commits

Reviewing files that changed from the base of the PR and between 4eef659 and 2bb2873.

📒 Files selected for processing (20)
  • .planning/codebase/ARCHITECTURE.md
  • .planning/codebase/CONCERNS.md
  • .planning/codebase/CONVENTIONS.md
  • .planning/codebase/INTEGRATIONS.md
  • .planning/codebase/STACK.md
  • .planning/codebase/STRUCTURE.md
  • .planning/codebase/TESTING.md
  • README.md
  • drizzle/0001_overjoyed_annihilus.sql
  • drizzle/meta/0001_snapshot.json
  • drizzle/meta/_journal.json
  • server/app.test.ts
  • server/app.ts
  • server/compose.ts
  • server/crypto.ts
  • server/providers.test.ts
  • server/providers.ts
  • server/telegram.ts
  • src/features/telegram/telegram-settings.test.tsx
  • src/features/telegram/telegram-settings.tsx
✅ Files skipped from review due to trivial changes (5)
  • drizzle/0001_overjoyed_annihilus.sql
  • drizzle/meta/0001_snapshot.json
  • README.md
  • .planning/codebase/STRUCTURE.md
  • .planning/codebase/ARCHITECTURE.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • server/compose.ts
  • server/app.ts
  • src/features/telegram/telegram-settings.tsx

Comment thread .planning/codebase/INTEGRATIONS.md
Comment thread server/crypto.ts
Comment on lines +56 to +65
export function decryptApiServerKey(payload: string): string {
if (!payload) {
return "";
}
try {
return decryptSecret(payload);
} catch {
return payload;
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add logging and documentation for decryption fallback.

The silent catch block makes it impossible to detect when decryption fails. If the ENCRYPTION_KEY is incorrect or if data is corrupted, all encrypted keys will silently fall back to being treated as plaintext, causing downstream authentication failures with no debugging breadcrumbs.

Consider:

  1. Log a warning when falling back to plaintext (helps track migration progress and detect misconfiguration)
  2. Document the backward-compatibility contract in a comment
  3. Add a metric or audit log entry to track fallback usage
🔍 Proposed fix to add observability
 export function decryptApiServerKey(payload: string): string {
 	if (!payload) {
 		return "";
 	}
 	try {
 		return decryptSecret(payload);
-	} catch {
+	} catch (error) {
+		console.warn(
+			"[crypto] Failed to decrypt apiServerKey, treating as plaintext (legacy format). " +
+			"Consider re-encrypting this key.",
+			{ error: error instanceof Error ? error.message : String(error) }
+		);
 		return payload;
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/crypto.ts` around lines 56 - 65, The decryptApiServerKey function
currently swallows decryption errors; update it to log a warning (including the
caught error) when decryptSecret throws and we fall back to returning plaintext,
add a short comment above decryptApiServerKey describing the
backward-compatibility contract (plaintext fallback for legacy or unencrypted
values and that ENCRYPTION_KEY misconfiguration will trigger warnings), and emit
or increment an observability metric/audit event when the fallback occurs (call
your existing metric/audit helper such as recordMetric/incrementCounter/auditLog
or create one if missing) so fallback usage is tracked.

Comment thread server/providers.ts
jellydn added 19 commits June 1, 2026 01:57
Harden provider/telegram deployment paths by validating model strings,\nquoting shell-sensitive values, and sharing token-last4 logic.\n\nAlso clear dashboard cache after mutating server/provider/telegram actions\nand expand tests around deploy, quoting, and model validation paths.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Server actions now execute DB writes through transactions.\n\nUpdate the test harness to expose db.transaction and run handlers\nwith a transaction-scoped DB object so tests exercise the same path.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Decompose the large telegram settings component into dedicated\nconnect, deploy, pairing, test, and sidebar sections with shared\ninput styling.\n\nThis reduces coupling and keeps each workflow isolated without\nchanging API behavior.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Document the Telegram pairing management APIs and troubleshooting\nflow, and record architecture decisions for single-instance state\nand runtime management from the Telegram page.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop the dead updateLatestInstallVersion helper now that version updates\nare handled transactionally inside runServerAction.\n\nThis keeps server-actions focused on active execution paths and\navoids stale helper drift.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Capture when to use DB transactions for coupled multi-write flows\nand when sequential primary+audit writes are intentionally acceptable.\n\nThis makes consistency expectations explicit for future changes in\nTelegram deploy and server action paths.\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The 177-line ssh.ts mixed connection lifecycle, error normalization,
OS detection, and shell quoting. Split into four focused modules
under server/ssh/ so each concern lives in its own file:

- connection.ts owns withSshConnection, verifyServerConnection,
  and the SshConnectionInput type. It owns the NodeSSH lifecycle
  and the execStrict helper.
- errors.ts owns UnsupportedOsError, SshConnectError, and
  normalizeSshError which translates raw node-ssh errors into the
  two user-facing categories (invalid credentials, host unreachable).
- os.ts owns parseAndValidateOs and VerifiedServerInfo. It reads
  /etc/os-release, classifies Ubuntu/Debian support, and returns
  supportLevel for the dashboard.
- quoting.ts owns shellQuote, a 3-line helper that doesn't deserve
  to live in the connection module.

server/ssh.ts now re-exports the same public surface (SshAuthMethod,
SshConnectionInput, verifyServerConnection, withSshConnection,
VerifiedServerInfo, parseAndValidateOs, UnsupportedOsError,
SshConnectError, normalizeSshError, shellQuote) so existing
imports keep working unchanged.
…elper

Two helpers are needed by every code path that SSHes into a server
without first verifying ownership: telegram deploy, telegram test,
telegram pairings, and provider deploy. They used to reimplement
the same try/catch around resolveServerSshConfig and re-query the
servers table to find the deployed server.

- ServerConnectionRecord is the minimal projection (no ownership,
  no status) used when the server is already trusted via a previous
  deployment record.
- getServerById looks up a server by ID without an ownership check.
  This is safe for telegram/provider deploys because the caller
  already validated the deployment state via telegramConfigs rows.
- resolveServerSshConfigOrError returns a discriminated result
  instead of throwing. Callers map the failure to a 400 response,
  which removes four near-identical try/catch blocks across the
  telegram and server-actions handlers.

This commit only adds the helpers. Callers will adopt them in
follow-up commits so each commit stays reviewable.
…d modules

The 437-line servers.ts mixed HTTP handlers (connect/update/delete/
list) with the read-side query logic that powers the dashboard list
view. The query logic — including the SQL that joins audit_logs,
servers, and installs to surface the latest server action per row —
was hard to test in isolation and pulled in drizzle-orm symbols
that the HTTP layer doesn't need.

- servers/records.ts owns the three read helpers
  (getOwnedServerListRecords, getLatestInstallRecords,
  getLatestServerActionRecords), the audit-log name allowlist,
  and the record types. The SQL filter that pulls only the
  relevant server.* audit actions is colocated with the helper
  that uses it.
- servers/list.ts owns getServerListSnapshot, the orchestration
  that combines the three record helpers, walks the latest
  install/action per server, and computes lastActivityAt.
- servers.ts keeps the HTTP handlers and re-exports
  getServerListSnapshot so the server-fn in
  src/routes/servers.index.tsx keeps importing it from
  ./servers.
…o focused modules

The 434-line dashboard.ts mixed three concerns: SQL lookups for
the latest server/install/provider/telegram rows, pure summary
transformation (toAgentSummary, toProviderSummary, etc.), and
the live VPS metrics pipeline that shells out via SSH. That
made it impossible to unit-test the transformations without
mocking drizzle, and pulled a 15-second metrics cache into a
file that mostly deals with static snapshot composition.

- dashboard/records.ts owns the four getLatest* DB helpers and
  getServerCount. Pure SQL, no transformation.
- dashboard/summaries.ts owns ServerRecord/InstallRecord/Provider-
  Record/TelegramRecord types, the to*Summary mappers, getHealthTone,
  and parsePercentValue. Pure functions, no DB, no SSH.
- dashboard/metrics.ts owns the 15-second metrics cache and
  readServerMetrics, the only function in this slice that talks
  to SSH. clearMetricsCache is the explicit invalidation hook.
- dashboard.ts becomes the composition layer: cache the static
  snapshot, compose the summaries, call getVpsSummary. It also
  re-exports getHealthTone and the to*Summary functions so
  existing test imports keep working.
…odules

The 260-line install.ts mixed the Hono HTTP/SSE handler with two
pieces of business logic: the install step list (six apt-get/
docker compose commands with progress percentages) and the
workflow runner that shells out, persists audit logs, and emits
SSE events. This made the install pipeline hard to read end-to-
end and forced the SSE handler to import a long list of types
that the HTTP layer doesn't care about.

- install/workflow.ts owns InstallStep, the installSteps
  array, runInstallWorkflow, the SSH-driven command loop, and
  the audit-log writes for succeeded/failed states. Pure
  business logic, no Hono context.
- install/records.ts owns upsertInstallRecord (idempotent
  create-or-reset of the install row) and getServerForInstall
  (the minimal server projection needed to SSH, with the
  ownership filter).
- install.ts becomes the HTTP shell: startServerInstall claims
  the SSE slot, upserts the install row, then fires
  runInstallWorkflow in the background. streamServerInstallEvents
  and getLatestServerInstallLog stay here because they own the
  Hono/streamSSE context.

The startServerInstall handler also adopts
resolveServerSshConfigOrError to drop the inline try/catch
around the credential resolve.
…o focused modules

The 510-line providers.ts mixed four concerns: provider env-var
config (PROVIDER_ENV_CONFIGS, buildProviderEnvMap, custom-vendor
API key derivation), the verify-the-key live API call
(createProviderTestRequest, verifyProviderConnection, the
ProviderConnectionError class), DB lookups (getLatestProvider-
Record, getTelegramDeployInfo, decryptApiKey, getApiKeyLast4), and
the HTTP handlers + the deploy handler.

Each helper pulls in its own set of imports and is meaningful on
its own — extract-and-test for the env-var derivation in
isolation is a long-standing gap.

- providers/config.ts owns ProviderRequest, StoredProviderRecord,
  ProviderConfigSummary, the PROVIDER_ENV_CONFIGS table,
  buildProviderEnvMap, isApiKeyRequired, and
  deriveCustomProviderApiKeyEnvVar. Pure logic, no DB, no HTTP.
- providers/connection.ts owns ProviderConnectionError,
  verifyProviderConnection, and createProviderTestRequest. The
  per-provider URL/header builder is private to the module since
  no caller needs to override it.
- providers/records.ts owns decryptApiKey (with the legacy JSON
  unwrap), getApiKeyLast4, getLatestProviderRecord, and
  getTelegramDeployInfo. The deploy function still reads from
  here — the SQL naturally lives next to the records it queries.

providers.ts becomes a thin HTTP layer: getCurrentProviderConfig,
saveProviderConfig, testProviderConfig, parseProviderRequest,
resolveProviderApiKey, getProviderDeployConfig, and
deployProviderToHermes. The deploy handler stays in providers.ts
for now; the follow-up commit moves it (and the shared
docker-compose deploy helper) into server/deploy.ts so the
provider and telegram deploy paths can share a single SSH
pipeline.
…bmodules

The provider and telegram deploy handlers both wrote a
docker-compose.yml to the remote server and ran
`docker compose up -d --force-recreate`. The telegram version
also ran `hermes config set model` after a 2-second sleep so the
new container would be ready. The duplication drifted: the
provider version re-declared the writeCmd template inline, the
telegram version kept the shellQuote import only because of a
single command, and audit-log writes were copy-pasted.

- server/deploy.ts owns deployComposeViaSsh (the SSH pipeline:
  write compose file, force-recreate the stack, optionally run
  extra commands inside the same connection) and
  deployProviderToHermes (moved verbatim from providers.ts so
  the two deploy paths share one SSH codepath).
- server/telegram.ts now calls deployComposeViaSsh with no
  extraSshCommands and writes the audit log inside a
  transaction, exactly as before.
- server/providers.ts no longer owns deployProviderToHermes.
  It keeps the public surface (saveProviderConfig,
  testProviderConfig, getCurrentProviderConfig,
  getProviderDeployConfig) that the dashboard/settings routes
  depend on.
- server/app.ts imports deployProviderToHermes from ./deploy.

The 547-line telegram.ts also split into focused submodules:

- telegram/config.ts owns TelegramConnectRequest,
  TelegramConfigSummary, TelegramPairingSummary, the
  TelegramConnectionError class, verifyTelegramToken, and
  getTokenLast4. The Telegram getMe response shape stays
  private to the verifier.
- telegram/pairings.ts owns listTelegramPairings and
  approveTelegramPairing. Both share a getDeployedTelegramServer
  resolver that returns either an HTTP response or the
  (serverRecord, authMethod, credential) triple, eliminating the
  repeated 401/404/400 branches. The pairing JSON parsing lives
  here too — Hermes returns snake_case keys, we project to
  camelCase before crossing the boundary.
- telegram/records.ts owns getCurrentTelegramConfig,
  getLatestTelegramRecord, and findServerForDeploy (the join
  between installs and servers that returns a server whose most
  recent install succeeded). decryptSecret is used locally; the
  botToken is decrypted in this module so the HTTP layer never
  touches the encrypted form.
- telegram.ts becomes the HTTP shell: connectTelegram,
  disconnectTelegram, deployTelegramToServer, testTelegramBot,
  and parseChatCompletion. The two SSH handlers adopt
  resolveServerSshConfigOrError and getServerById from
  server-records so the duplicated try/catch around
  resolveServerSshConfig disappears.
The runServerAction handler declared mutable authMethod/credential
variables and wrapped resolveServerSshConfig in a try/catch to map
any thrown error to a 400 response. The new
resolveServerSshConfigOrError returns a discriminated result that
removes the try/catch and the temporary let bindings.

The import switch and the call site are the only changes — the
SshAuthMethod type import stays because it appears later in
resolveServerSshConfigOrError's return type for the rollback
target.
decryptApiServerKey used to swallow every decryption failure and
return the raw payload. The intent was to support legacy keys
that were never encrypted, but the same path also masked real
failures: a corrupted ciphertext or wrong key was indistinguishable
from "this key was stored in plaintext".

AES-GCM uses iv:tag:cipher and we join the three parts with a
dot. An unencrypted key has no dots, so the presence of a dot is
a reliable signal that the payload was encrypted. If decrypt
fails AND the payload contains a dot, surface the error so the
caller can prompt for a new key. If decrypt fails AND the payload
has no dot, fall back to returning the raw value as before.
The provider tests now mock the new server-records helpers
(getServerById, resolveServerSshConfigOrError) and import
deployProviderToHermes from ./deploy instead of ./providers.

The three 'returns 404 when deployed server is not found',
'returns 500 when bot token decryption fails', and 'returns 502
when the SSH pipeline fails' cases drop one selectLimit chain
link each: the third .limit() call used to stand in for the
servers lookup that getServerById now owns. The
beforeEach setup installs the getServerById mock as the default
and the 404 test overrides it to null.

The shellQuote mock is now set up at module load time rather
than lazily so the deploy compose tests can run before any
handler is invoked.
The telegram tests mock the new server-records helpers
(getServerById, resolveServerSshConfigOrError) so the deploy
and test-bot paths can run without a live database or real
credential resolution.

The 'deploy writes compose + restarts containers' and
'deploys when API key contains shell metacharacters' cases
lose their second selectLimit chain link each: the second
.limit() used to mock findServerById, which is now
getServerById and resolves from the new server-records
mock rather than a chained select. The two chain links collapse
to one (getLatestTelegramRecord) and getServerById is set up
explicitly alongside it.

resolveServerSshConfigOrError is mocked to return
{ok: true, ...} in both happy-path cases and the SSH-error
case, matching the new discriminated return shape the
handlers expect.
TelegramSettings mounts TelegramPairingSection with a key bound
to savedConfig.deployedServerHost. The key remounts the section
when the deploy target changes, so a section that started in
"not deployed" state cleanly re-initializes when a new server
is selected.

TelegramPairingSection now polls /api/telegram/pairings every
10 seconds via useMountEffect. The 10s interval matches the
Hermes bot's typical pairing-code lifetime; users no longer have
to manually click "Refresh" while a code is being typed in
Telegram. loadPairings takes a quiet flag so the polling path
can update the list without flashing the loading spinner or
overwriting an in-flight user-facing error.

The approve handler now takes an optional selectedCode so the
new list-row approve buttons can pass the code directly. The
text input flow still passes the trimmed input value, preserving
the existing manual-entry path.

The settings test adds a 'loads pending Telegram pairing
requests and approves one from the list' case that mounts the
component with a deployed config, waits for the polling fetch
to land, and clicks the new row-level approve button.
The split-into-submodules commits landed with single-line imports
where biome prefers the type/value grouped form, and the type
imports in dashboard/summaries.ts and ssh/connection.ts were
not sorted. This is a pure formatting pass — no behavioral
changes.

- server/ssh.ts: type group stays at the top of each export
  statement, re-exports are wrapped to the multi-line form biome
  picks.
- server/dashboard/metrics.ts and summaries.ts: imports are
  split into one-symbol-per-line groups.
- server/ssh/connection.ts: type-only imports move to the top.
- server/servers.ts: the servers/list import is reordered so
  the new-module imports sit together with server-records.
- server/dashboard.ts: type imports precede the value imports
  inside the { } group, matching the import-sort Biome config.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
server/servers.ts (1)

145-160: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't let audit-log failures flip successful server mutations into 500s.

These handlers commit the primary servers change first, then await a historical audit insert. If the audit write fails, the request can return failure even though the server mutation already succeeded, which is especially risky for connectServer() because retries can create duplicates.

🩹 Suggested fix
-		await db.insert(auditLogs).values({
-			userId: session.user.id,
-			action: "server.connect.succeeded",
-			details: {
-				serverId: serverRecord.id,
-				host: parsed.host,
-				osName: verified.osName,
-				osVersion: verified.osVersion,
-				architecture: verified.architecture,
-				supportLevel: verified.supportLevel,
-			},
-			ipAddress,
-		});
-
-		clearDashboardCache();
+		try {
+			await db.insert(auditLogs).values({
+				userId: session.user.id,
+				action: "server.connect.succeeded",
+				details: {
+					serverId: serverRecord.id,
+					host: parsed.host,
+					osName: verified.osName,
+					osVersion: verified.osVersion,
+					architecture: verified.architecture,
+					supportLevel: verified.supportLevel,
+				},
+				ipAddress,
+			});
+		} finally {
+			clearDashboardCache();
+		}

Apply the same pattern to updateServer() and deleteServer(): treat audit logging as best-effort and keep cache invalidation on the successful primary path.

As per coding guidelines, "Sequential writes are fine when primary data → audit log sequences are purely historical and the audit log's absence doesn't affect correctness."

Also applies to: 287-299, 327-338

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/servers.ts` around lines 145 - 160, The audit-log insert
(db.insert(auditLogs).values(...)) is making successful mutations into 500s when
it fails; wrap the audit insert in a try/catch so it becomes best-effort: in
connectServer(), updateServer(), and deleteServer() catch any errors from the
audit write, log the error (don’t rethrow) and continue; ensure
clearDashboardCache() and the primary server mutation response happen on the
successful path (not inside the catch) so cache invalidation and client success
are preserved even if the audit write fails.
server/telegram.ts (4)

360-365: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle apiServerKey decryption failures explicitly.

decryptApiServerKey(record.apiServerKey) runs before the endpoint's main error path. If that ciphertext is corrupted or no longer decryptable, this throws out of the handler and skips the normal JSON error response shape. Wrap it and return a clear failure response here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/telegram.ts` around lines 360 - 365, The call to
decryptApiServerKey(record.apiServerKey) can throw and currently escapes the
handler; wrap that call in a try/catch within the same handler (before calling
getProviderDeployConfig) and on failure return the same JSON error response
shape used elsewhere (e.g., status and message fields) so
corrupted/undecryptable apiServerKey yields a clear HTTP error instead of
crashing the request; reference decryptApiServerKey, record.apiServerKey, and
the surrounding handler so the catch returns the consistent error payload and
stops further processing.

159-175: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't let an audit-log failure turn a successful disconnect into a 500.

Lines 160-175 flip isActive before writing the audit row. If auditLogs fails, the bot is already disconnected but the handler reports failure and skips the cache clear, which leaves the client and dashboard stale. Make the primary update authoritative and handle the audit insert separately.

As per coding guidelines, "Sequential writes are fine when primary data → audit log sequences are purely historical and the audit log's absence doesn't affect correctness. If the audit insert fails, the primary operation is still valid — no data divergence."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/telegram.ts` around lines 159 - 175, Perform the primary update (the
db.update(...).set({ isActive: false }).where(eq(telegramConfigs.userId,
session.user.id)) call) as the authoritative operation and then run the audit
insert in a separate try/catch so an audit failure cannot turn a successful
disconnect into an error; specifically, keep the update as-is, then wrap the
db.insert(auditLogs).values({... userId: session.user.id, action:
"telegram.disconnected", details: { botUsername: record.botUsername }, ipAddress
}) call in its own try/catch, log the error but do not rethrow, and ensure
clearDashboardCache() is always called after the primary update regardless of
audit insert outcome.

99-125: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make the config swap atomic and treat the audit insert as best-effort.

Lines 100-125 can already save the new bot config and still return 500 if the audit insert fails, and the preceding deactivate+insert sequence is not atomic. Put the primary config mutation in db.transaction() and keep telegram.connected from changing the outcome of an otherwise successful connect.

As per coding guidelines, "Use db.transaction() when a write path touches multiple Drizzle statements that must commit or roll back together to maintain consistency" and "If the audit insert fails, the primary operation is still valid — no data divergence."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/telegram.ts` around lines 99 - 125, Wrap the primary config mutation
(the deactivate-then-insert sequence that calls db.update(...).set(...) on
telegramConfigs and db.insert(telegramConfigs).values(...)) inside a
db.transaction(...) so the deactivation and new config insert commit or roll
back together; keep clearDashboardCache() tied to the successful transaction
commit. Move the audit insert (db.insert(auditLogs).values(...)) outside the
transaction and make it best-effort by surrounding it with its own try/catch so
any failure only logs the error and does not throw or change the successful
outcome returned to the caller.

262-300: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The post-SSH transaction does not keep local and remote deploy state consistent.

Lines 263-300 mutate Hermes first and persist deployedServerId / apiServerKey afterward. If the DB transaction fails after SSH succeeds, the server is already reconfigured with a key that was never saved locally, so the app reports a failed deploy but cannot talk to the deployed instance anymore. This needs a compensating rollback or a staged/persisted deploy record before the remote change.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/telegram.ts` around lines 262 - 300, Persist the intended deploy state
before making the remote SSH change instead of applying remote changes first:
create a staged/pending deploy record in the DB inside the existing transaction
(e.g., add or set pendingServerId/pendingServerHost/pendingApiServerKey via the
telegramConfigs row and insert the auditLogs entry) so you atomically record the
planned key and target (use encryptSecret when storing the pending key), then
call deployComposeViaSsh; on SSH success, run a second transaction to promote
pendingServerId/pendingServerHost/pendingApiServerKey to
deployedServerId/deployedServerHost/apiServerKey (and clear pending fields), and
on SSH failure clear the pending fields or perform a compensating SSH rollback
using the previous deployedServerHost/apiServerKey; update code around
deployComposeViaSsh, db.transaction, telegramConfigs, auditLogs, encryptSecret,
and the session.user.id usage to implement this staged-deploy flow.
🧹 Nitpick comments (1)
server/providers/connection.ts (1)

23-27: ⚡ Quick win

Add a timeout to prevent hung requests.

The fetch call has no timeout, so requests to slow or unresponsive endpoints can hang indefinitely. Consider using AbortSignal.timeout() to bound the request duration.

♻️ Proposed fix
 	try {
-		response = await fetch(request.url, request.init);
+		response = await fetch(request.url, {
+			...request.init,
+			signal: AbortSignal.timeout(10_000),
+		});
 	} catch {
 		throw new ProviderConnectionError("Connection failed", "connection_failed");
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/providers/connection.ts` around lines 23 - 27, The fetch call can hang
because it has no timeout; wrap the request with an AbortSignal.timeout (or
create an AbortController and Timer) and attach the resulting signal to
request.init before calling fetch in the same block where fetch is invoked
(replace the call in connection.ts that uses fetch(request.url, request.init));
choose a sensible timeout (e.g. 10000 ms), handle the abort case in the catch to
throw a ProviderConnectionError with a distinct code like "connection_timeout"
while keeping the existing "connection_failed" for non-timeout errors, and
ensure any previously provided request.init.signal is respected/merged or
overridden explicitly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/dashboard/metrics.ts`:
- Around line 61-77: The cached branch currently sets updatedAt to the request
time `now`, which hides the real sample time; change the cached return in the
metrics routine to use the cached sample timestamp (use
`cachedMetrics.timestamp`) for `updatedAt` (formatted with `new
Date(...).toISOString()`), keep computing `status` with `getHealthTone(metrics)`
and all other fields as-is, and ensure `cachedMetrics.data` and
`cachedMetrics.timestamp` are used together so stale-but-cached metrics show
their original sample time rather than the current `now`.

In `@server/dashboard/summaries.ts`:
- Around line 50-53: The offline-fallback branch in toAgentSummary (and the
similar branch around lines 95-99) is synthesizing a new updatedAt on every
request; instead, stop generating a fresh timestamp and preserve the existing
updatedAt from installRecord (or leave it undefined/null) when serverRecord is
null. Locate the updatedAt assignment in toAgentSummary (and the analogous code
block later) and replace any fallback like Date.now() / new Date() with
installRecord?.updatedAt (or null) so we only reflect true server-side updates.

In `@server/deploy.ts`:
- Around line 159-192: The audit log inserts around deployComposeViaSsh() must
be best-effort and must not change the HTTP outcome: wrap the
db.insert(auditLogs).values(...) calls (both the success-path insert after
deployComposeViaSsh() and the failure-path insert inside the catch) in their own
try/catch so any insert error is caught and logged (e.g., via processLogger or
console) but does not alter the response returned by context.json; ensure the
main success return (status: "deployed") is returned even if the success audit
insert fails, and ensure when handling a deploy error you record the original
deploy error message to the response and still attempt the failed-audit insert
but suppress/log any errors from that insert so they never overwrite the
original error handling.
- Line 113: The call to decryptApiServerKey(telegramInfo.apiServerKey) can throw
outside the main deploy try block, so wrap this call in a try/catch (or move it
into the existing deploy try) and handle failures the same way other credential
errors are handled: catch the error thrown by decryptApiServerKey, log the
error, and return the JSON error response (with the same status code and shape
used elsewhere in this file) instead of allowing the exception to escape; update
references to decryptedApiServerKey only after successful decryption.
- Around line 148-149: The deploy flow in server/deploy.ts calls ssh.execCommand
to run `docker exec hermes hermes config set model ...` without sudo, causing
privilege mismatch versus the earlier `sudo docker compose ...`; update the
ssh.execCommand invocation that runs `docker exec hermes hermes config set model
${shellQuote(providerRecord.model)}` to be consistent with the other commands
(e.g., prefix with `sudo` like `sudo docker exec ...`) or remove/soften this
post-restart `config set model` step if it's redundant because the compose
restart already sets API_SERVER_MODEL_NAME; locate the call using the
`ssh.execCommand` invocation and adjust it to use `sudo` or eliminate the extra
command.

In `@server/install/workflow.ts`:
- Around line 122-160: The success audit-log write using
getDb().insert(auditLogs).values({...}) in workflow.ts must be made best-effort
so a failing audit insert cannot trigger the outer catch and flip a completed
install to "failed"; wrap that success insert in its own try/catch, log the
insert error (but do not rethrow) and continue, leaving the existing error path
(the outer catch that emits "failed" and writes the failure audit) unchanged.

In `@server/providers/records.ts`:
- Around line 65-75: The current return narrows record to include
deployedServerHost as a string but only checks apiServerKey and
deployedServerId; update the function to either (A) add a null/undefined check
for record.deployedServerHost and return null if it's missing so the cast to {
botToken, apiServerKey, deployedServerId, deployedServerHost: string } is safe,
or (B) change the asserted/written return type to allow deployedServerHost:
string | null (i.e., widen the return type) so callers handle a nullable
deployedServerHost; reference the existing record variable and the returned
fields apiServerKey, deployedServerId, and deployedServerHost when making the
change.

In `@server/server-actions.ts`:
- Around line 146-182: The transaction that writes auditLogs and updates
installs (db.transaction block) can fail after the remote command already
succeeded, but currently any error bubbles up and the outer catch records
server.action.${action}.failed; to fix, make the DB writes best-effort: wrap the
db.transaction(...) in its own try/catch inside executeServerAction (or the
surrounding function), catch any error from the transaction and log it (e.g.,
processLogger.warn or similar) including context (action, serverId,
versionTarget, error) but do not rethrow or return a failure response; keep the
auditLogs/installs logic (auditLogs, installs, actionSuccessMessage,
versionTarget) unchanged inside the transaction so it still runs when possible.

In `@server/servers/records.ts`:
- Around line 94-112: The current query applies a global .limit(100) before
collectLatestActions(), which can omit the newest row for some serverIds;
replace the global limit with a per-server newest-row filter by using a window
function or DISTINCT ON: build a subquery or use sql`ROW_NUMBER() OVER
(PARTITION BY (auditLogs.details->>'serverId') ORDER BY auditLogs.createdAt
DESC)` (or DISTINCT ON (auditLogs.details->>'serverId') with ORDER BY createdAt
DESC) to mark the latest row per serverId, then select only rows where that
row_number = 1; keep the same selected columns (auditLogs.action,
auditLogs.details, auditLogs.createdAt) and the same predicates
(eq(auditLogs.userId, userId), inArray(...), serverId IN ...), and remove the
global .limit(100); reference getDb(), auditLogs, and the surrounding query
block/collectLatestActions().

In `@server/ssh/os.ts`:
- Around line 34-39: The code currently only throws for missing or "unknown"
osId, allowing non-Linux IDs (e.g. "freebsd") to pass; update the check around
osId (the same block that throws UnsupportedOsError using prettyName) to reject
explicit non-Linux IDs by verifying osId is a Linux identifier (e.g., require
osId === "linux" or check against a small allowlist of known Linux distro IDs)
and throw UnsupportedOsError for any osId that is present but not a Linux value;
reference the osId, prettyName and UnsupportedOsError symbols when making the
change so the parser rejects non-Linux hosts before returning
VerifiedServerInfo.

In `@src/features/telegram/telegram-connect-section.tsx`:
- Around line 81-99: The disconnect flow currently only handles non-2xx
responses and ignores fetch rejections; wrap the await
fetch("/api/telegram/disconnect", ...) in a try/catch inside handleDisconnect
(or the enclosing async function) and in the catch call setError with a helpful
message (e.g., payload from the thrown error or a fallback like "Unable to
disconnect Telegram") so the user sees the failure; keep the existing finally
block that calls setIsDisconnecting(false) and preserve onDisconnect() and
setSuccessMessage("Telegram bot disconnected") on success.
- Around line 51-73: The fetch call in the connect handler can reject and
currently escapes the try/finally; wrap the await fetch("/api/telegram/connect",
...) (or the whole request+response.json sequence) with a .catch or an inner
try/catch so network failures are mapped to setError(...) before the finally;
specifically, catch errors from the fetch/response.json steps, call setError
with a user-friendly message (e.g. error.message or "Network error"), and return
early so onConfigChange, setValue, and setSuccessMessage are not invoked; keep
the existing finally that calls setIsConnecting(false).

In `@src/features/telegram/telegram-deploy-section.tsx`:
- Around line 29-55: The try/finally currently lets fetch() throw without
reporting an error to the user; wrap the await fetch and subsequent JSON parsing
in a try/catch (or add a catch block) so any thrown error sets setError(...)
with a clear message (include error.message) and returns before calling
onConfigChange and setSuccessMessage; ensure setIsDeploying(false) remains in
the finally. Update the block that uses fetch("/api/telegram/deploy") and the
JSON parsing so exceptions trigger setError instead of silently falling through.

In `@src/features/telegram/telegram-input-class.ts`:
- Around line 1-2: The class string exported as inputClassName currently uses
"outline-none" which removes native focus outlines even under forced-colors;
update the string to use "outline-hidden" instead of "outline-none" so the
custom focus ring (focus:ring-*) is preserved while allowing forced-colors focus
indication to remain; modify the inputClassName constant value to swap
"outline-none" for "outline-hidden".

In `@src/features/telegram/telegram-pairing-section.tsx`:
- Around line 56-76: The loadPairings() function currently only handles non-OK
HTTP responses but doesn't catch network/fetch rejections; wrap the fetch and
response.json() logic in a try/catch (or add a catch around the await fetch)
inside loadPairings() so any thrown errors are caught, call setError(...) with a
useful message (and include error.message if available), ensure
setIsLoadingPairings(false) runs in the finally block, and return early on
error; reference the existing loadPairings, setError, setIsLoadingPairings,
setPairings, and the fetch("/api/telegram/pairings")/response.json() calls when
applying the fix.
- Around line 90-113: The approve flow currently only handles non-2xx responses
but not network/fetch rejections; update handleApprovePairing to catch thrown
errors from fetch and call setError(...) with a helpful message (and optionally
the error.message) so rejected requests aren't silent. Specifically wrap the
await fetch/response.json logic in a try/catch (or add a catch after the
promise) and in the catch call setError(...) and return; keep the existing
finally block that calls setIsApprovingPairing(false) and ensure loadPairings,
setPairingCode, and setSuccessMessage are only called on successful approvals.

In `@src/features/telegram/telegram-test-section.tsx`:
- Around line 27-47: The try/finally around the fetch leaves network/fetch
rejections unreported; add a catch block around the fetch/response parsing to
call setTestError with the network error message (e.g., error?.message ||
String(error)) so broken tunnels/offline browsers surface an error to the user,
keep the existing response.ok handling and setTestResponse logic, and retain the
finally block that calls setIsTesting(false); update the block that performs
fetch(...) in telegram-test-section.tsx (the async block using setTestError,
setTestResponse, setIsTesting) to implement this catch.

---

Outside diff comments:
In `@server/servers.ts`:
- Around line 145-160: The audit-log insert (db.insert(auditLogs).values(...))
is making successful mutations into 500s when it fails; wrap the audit insert in
a try/catch so it becomes best-effort: in connectServer(), updateServer(), and
deleteServer() catch any errors from the audit write, log the error (don’t
rethrow) and continue; ensure clearDashboardCache() and the primary server
mutation response happen on the successful path (not inside the catch) so cache
invalidation and client success are preserved even if the audit write fails.

In `@server/telegram.ts`:
- Around line 360-365: The call to decryptApiServerKey(record.apiServerKey) can
throw and currently escapes the handler; wrap that call in a try/catch within
the same handler (before calling getProviderDeployConfig) and on failure return
the same JSON error response shape used elsewhere (e.g., status and message
fields) so corrupted/undecryptable apiServerKey yields a clear HTTP error
instead of crashing the request; reference decryptApiServerKey,
record.apiServerKey, and the surrounding handler so the catch returns the
consistent error payload and stops further processing.
- Around line 159-175: Perform the primary update (the db.update(...).set({
isActive: false }).where(eq(telegramConfigs.userId, session.user.id)) call) as
the authoritative operation and then run the audit insert in a separate
try/catch so an audit failure cannot turn a successful disconnect into an error;
specifically, keep the update as-is, then wrap the
db.insert(auditLogs).values({... userId: session.user.id, action:
"telegram.disconnected", details: { botUsername: record.botUsername }, ipAddress
}) call in its own try/catch, log the error but do not rethrow, and ensure
clearDashboardCache() is always called after the primary update regardless of
audit insert outcome.
- Around line 99-125: Wrap the primary config mutation (the
deactivate-then-insert sequence that calls db.update(...).set(...) on
telegramConfigs and db.insert(telegramConfigs).values(...)) inside a
db.transaction(...) so the deactivation and new config insert commit or roll
back together; keep clearDashboardCache() tied to the successful transaction
commit. Move the audit insert (db.insert(auditLogs).values(...)) outside the
transaction and make it best-effort by surrounding it with its own try/catch so
any failure only logs the error and does not throw or change the successful
outcome returned to the caller.
- Around line 262-300: Persist the intended deploy state before making the
remote SSH change instead of applying remote changes first: create a
staged/pending deploy record in the DB inside the existing transaction (e.g.,
add or set pendingServerId/pendingServerHost/pendingApiServerKey via the
telegramConfigs row and insert the auditLogs entry) so you atomically record the
planned key and target (use encryptSecret when storing the pending key), then
call deployComposeViaSsh; on SSH success, run a second transaction to promote
pendingServerId/pendingServerHost/pendingApiServerKey to
deployedServerId/deployedServerHost/apiServerKey (and clear pending fields), and
on SSH failure clear the pending fields or perform a compensating SSH rollback
using the previous deployedServerHost/apiServerKey; update code around
deployComposeViaSsh, db.transaction, telegramConfigs, auditLogs, encryptSecret,
and the session.user.id usage to implement this staged-deploy flow.

---

Nitpick comments:
In `@server/providers/connection.ts`:
- Around line 23-27: The fetch call can hang because it has no timeout; wrap the
request with an AbortSignal.timeout (or create an AbortController and Timer) and
attach the resulting signal to request.init before calling fetch in the same
block where fetch is invoked (replace the call in connection.ts that uses
fetch(request.url, request.init)); choose a sensible timeout (e.g. 10000 ms),
handle the abort case in the catch to throw a ProviderConnectionError with a
distinct code like "connection_timeout" while keeping the existing
"connection_failed" for non-timeout errors, and ensure any previously provided
request.init.signal is respected/merged or overridden explicitly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c57a8c05-2c09-4f45-a4ac-1dec0983f31d

📥 Commits

Reviewing files that changed from the base of the PR and between 2bb2873 and f132c61.

📒 Files selected for processing (50)
  • AGENTS.md
  • CONTEXT.md
  • README.md
  • docs/adr/0009-single-instance-boundary-for-operational-state.md
  • docs/adr/0010-hermes-runtime-management-from-telegram-page.md
  • docs/api-reference.md
  • server/app.ts
  • server/compose.ts
  • server/crypto.ts
  • server/dashboard.ts
  • server/dashboard/metrics.ts
  • server/dashboard/records.ts
  • server/dashboard/summaries.ts
  • server/deploy.ts
  • server/install.ts
  • server/install/records.ts
  • server/install/workflow.ts
  • server/lib/get-last-4.ts
  • server/providers.test.ts
  • server/providers.ts
  • server/providers/config.ts
  • server/providers/connection.ts
  • server/providers/records.ts
  • server/server-actions.test.ts
  • server/server-actions.ts
  • server/server-records.ts
  • server/servers.ts
  • server/servers/list.ts
  • server/servers/records.ts
  • server/ssh.test.ts
  • server/ssh.ts
  • server/ssh/connection.ts
  • server/ssh/errors.ts
  • server/ssh/os.ts
  • server/ssh/quoting.ts
  • server/telegram.test.ts
  • server/telegram.ts
  • server/telegram/config.ts
  • server/telegram/pairings.ts
  • server/telegram/records.ts
  • src/features/telegram/telegram-connect-section.tsx
  • src/features/telegram/telegram-deploy-section.tsx
  • src/features/telegram/telegram-input-class.ts
  • src/features/telegram/telegram-pairing-section.tsx
  • src/features/telegram/telegram-settings.test.tsx
  • src/features/telegram/telegram-settings.tsx
  • src/features/telegram/telegram-sidebar.tsx
  • src/features/telegram/telegram-test-section.tsx
  • src/lib/ai-providers.test.ts
  • src/lib/ai-providers.ts
✅ Files skipped from review due to trivial changes (7)
  • server/lib/get-last-4.ts
  • docs/adr/0009-single-instance-boundary-for-operational-state.md
  • CONTEXT.md
  • AGENTS.md
  • server/ssh/errors.ts
  • docs/adr/0010-hermes-runtime-management-from-telegram-page.md
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/app.ts

Comment on lines +61 to +77
if (cachedMetrics && now - cachedMetrics.timestamp < METRICS_CACHE_TTL_MS) {
const metrics = cachedMetrics.data;
const status = getHealthTone(metrics);

return {
status,
updatedAt: new Date(now).toISOString(),
cpu: metrics.cpu,
memory: metrics.memory,
disk: metrics.disk,
uptime: metrics.uptime,
detail:
status === "warning"
? "One or more VPS resources are running hot."
: "The connected VPS is responding to live health checks.",
error: null,
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Preserve the real sample time on cached metrics.

Using now here makes stale-but-cached metrics look freshly collected on every request. The dashboard should report when the cached sample was actually read from the VPS.

🩹 Suggested fix
 	return {
 		status,
-		updatedAt: new Date(now).toISOString(),
+		updatedAt: new Date(cachedMetrics.timestamp).toISOString(),
 		cpu: metrics.cpu,
 		memory: metrics.memory,
 		disk: metrics.disk,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (cachedMetrics && now - cachedMetrics.timestamp < METRICS_CACHE_TTL_MS) {
const metrics = cachedMetrics.data;
const status = getHealthTone(metrics);
return {
status,
updatedAt: new Date(now).toISOString(),
cpu: metrics.cpu,
memory: metrics.memory,
disk: metrics.disk,
uptime: metrics.uptime,
detail:
status === "warning"
? "One or more VPS resources are running hot."
: "The connected VPS is responding to live health checks.",
error: null,
};
if (cachedMetrics && now - cachedMetrics.timestamp < METRICS_CACHE_TTL_MS) {
const metrics = cachedMetrics.data;
const status = getHealthTone(metrics);
return {
status,
updatedAt: new Date(cachedMetrics.timestamp).toISOString(),
cpu: metrics.cpu,
memory: metrics.memory,
disk: metrics.disk,
uptime: metrics.uptime,
detail:
status === "warning"
? "One or more VPS resources are running hot."
: "The connected VPS is responding to live health checks.",
error: null,
};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/dashboard/metrics.ts` around lines 61 - 77, The cached branch
currently sets updatedAt to the request time `now`, which hides the real sample
time; change the cached return in the metrics routine to use the cached sample
timestamp (use `cachedMetrics.timestamp`) for `updatedAt` (formatted with `new
Date(...).toISOString()`), keep computing `status` with `getHealthTone(metrics)`
and all other fields as-is, and ensure `cachedMetrics.data` and
`cachedMetrics.timestamp` are used together so stale-but-cached metrics show
their original sample time rather than the current `now`.

Comment on lines +50 to +53
export function toAgentSummary(
serverRecord: Pick<ServerRecord, "status"> | null,
installRecord: InstallRecord | null,
): DashboardAgentSummary {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don't synthesize a fresh updatedAt for the offline fallback.

This branch changes updatedAt on every request even when nothing changed server-side. That makes the agent card look recently updated forever.

🩹 Suggested fix
 export function toAgentSummary(
-	serverRecord: Pick<ServerRecord, "status"> | null,
+	serverRecord: Pick<ServerRecord, "status" | "updatedAt"> | null,
 	installRecord: InstallRecord | null,
 ): DashboardAgentSummary {
@@
 	return {
 		status: "offline",
-		updatedAt: serverRecord.status ? new Date().toISOString() : null,
+		updatedAt: serverRecord.updatedAt.toISOString(),
 		detail: "The VPS is connected, but Hermes has not finished installing yet.",
 	};
 }

Also applies to: 95-99

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/dashboard/summaries.ts` around lines 50 - 53, The offline-fallback
branch in toAgentSummary (and the similar branch around lines 95-99) is
synthesizing a new updatedAt on every request; instead, stop generating a fresh
timestamp and preserve the existing updatedAt from installRecord (or leave it
undefined/null) when serverRecord is null. Locate the updatedAt assignment in
toAgentSummary (and the analogous code block later) and replace any fallback
like Date.now() / new Date() with installRecord?.updatedAt (or null) so we only
reflect true server-side updates.

Comment thread server/deploy.ts
return context.json({ error: "API key is required." }, 500);
}

const decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Catch decryptApiServerKey() failures here.

Unlike the bot token and provider key paths, this decryption can throw before the main deploy try, so a bad apiServerKey escapes the handler without the normal JSON error response.

🛡️ Proposed fix
-	const decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey);
+	let decryptedApiServerKey: string;
+	try {
+		decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey);
+	} catch {
+		return context.json({ error: "Failed to decrypt API server key." }, 500);
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey);
let decryptedApiServerKey: string;
try {
decryptedApiServerKey = decryptApiServerKey(telegramInfo.apiServerKey);
} catch {
return context.json({ error: "Failed to decrypt API server key." }, 500);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/deploy.ts` at line 113, The call to
decryptApiServerKey(telegramInfo.apiServerKey) can throw outside the main deploy
try block, so wrap this call in a try/catch (or move it into the existing deploy
try) and handle failures the same way other credential errors are handled: catch
the error thrown by decryptApiServerKey, log the error, and return the JSON
error response (with the same status code and shape used elsewhere in this file)
instead of allowing the exception to escape; update references to
decryptedApiServerKey only after successful decryption.

Comment thread server/deploy.ts
Comment on lines +148 to +149
const configResult = await ssh.execCommand(
`docker exec hermes hermes config set model ${shellQuote(providerRecord.model)}`,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '--- surrounding deploy commands ---\n'
sed -n '144,152p' server/deploy.ts | cat -n

printf '\n--- docker command usage under server/ ---\n'
rg -n --type=ts 'sudo docker|docker exec|docker compose' server

Repository: jellydn/hermes-hub

Length of output: 158


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- surrounding deploy commands ---'
sed -n '130,175p' server/deploy.ts | cat -n

echo
echo '--- docker command usage under server/ ---'
rg -n --type=ts -S 'sudo docker exec|sudo docker compose|docker exec|docker compose' server

Repository: jellydn/hermes-hub

Length of output: 2922


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- deployComposeViaSsh definition ---'
rg -n --type=ts 'function deployComposeViaSsh|deployComposeViaSsh' server/deploy.ts server -S

Repository: jellydn/hermes-hub

Length of output: 508


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- deployComposeViaSsh definition/usages ---'
rg -n --type=ts 'deployComposeViaSsh' server -S

Repository: jellydn/hermes-hub

Length of output: 373


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- server/deploy.ts: deployComposeViaSsh body ---'
sed -n '1,120p' server/deploy.ts | cat -n

echo
echo '--- server/deploy.ts: around extraSshCommands usage ---'
sed -n '120,190p' server/deploy.ts | cat -n

echo
echo '--- any sudo docker exec usage under server/ ---'
rg -n --type=ts 'sudo docker exec' server

Repository: jellydn/hermes-hub

Length of output: 6872


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- locate ssh helpers ---'
rg -n --type=ts 'withSshConnection|execCommand|shellQuote|docker group|usermod -aG docker|sudo ' server/ssh.ts server -S

echo
echo '--- server/ssh.ts (head) ---'
sed -n '1,220p' server/ssh.ts | cat -n

Repository: jellydn/hermes-hub

Length of output: 9906


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- server/ssh/connection.ts ---'
sed -n '1,200p' server/ssh/connection.ts | cat -n

echo
echo '--- any mention of sudo in server/ssh/connection.ts ---'
rg -n --type=ts 'sudo ' server/ssh/connection.ts

Repository: jellydn/hermes-hub

Length of output: 2282


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- find buildHermesComposeContent definition ---'
rg -n --type=ts 'function buildHermesComposeContent|buildHermesComposeContent' server/compose.ts server -S

Repository: jellydn/hermes-hub

Length of output: 1373


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- server/compose.ts ---'
sed -n '1,220p' server/compose.ts | cat -n

Repository: jellydn/hermes-hub

Length of output: 1925


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- search for API_SERVER_MODEL_NAME usage in repo ---'
rg -n --type=ts 'API_SERVER_MODEL_NAME|config set model|hermes config set model' server src -S

Repository: jellydn/hermes-hub

Length of output: 415


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo '--- server/providers.test.ts around sudo docker compose and docker exec expectations ---'
sed -n '540,640p' server/providers.test.ts | cat -n

Repository: jellydn/hermes-hub

Length of output: 3699


Use sudo (or ensure Docker access) for docker exec in the deploy SSH flow

  • server/deploy.ts restarts Hermes with sudo docker compose ... but then runs docker exec hermes hermes config set model ... without sudo.
  • The compose file already sets API_SERVER_MODEL_NAME on container recreate, so a privilege-mismatch failure here can make the deploy endpoint throw and log provider.deploy.failed even though Hermes likely already restarted with the correct model.
  • Fix: make the docker exec consistent with the sudo docker compose approach (e.g., sudo docker exec ...) or remove/soften this extra config set model step if it’s redundant.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/deploy.ts` around lines 148 - 149, The deploy flow in server/deploy.ts
calls ssh.execCommand to run `docker exec hermes hermes config set model ...`
without sudo, causing privilege mismatch versus the earlier `sudo docker compose
...`; update the ssh.execCommand invocation that runs `docker exec hermes hermes
config set model ${shellQuote(providerRecord.model)}` to be consistent with the
other commands (e.g., prefix with `sudo` like `sudo docker exec ...`) or
remove/soften this post-restart `config set model` step if it's redundant
because the compose restart already sets API_SERVER_MODEL_NAME; locate the call
using the `ssh.execCommand` invocation and adjust it to use `sudo` or eliminate
the extra command.

Comment thread server/deploy.ts
Comment on lines +159 to +192
await db.insert(auditLogs).values({
userId: session.user.id,
action: "provider.deploy.succeeded",
details: {
provider: providerRecord.provider,
model: providerRecord.model,
serverId: serverRecord.id,
serverHost: serverRecord.host,
},
ipAddress,
});

return context.json({
status: "deployed",
provider: providerRecord.provider,
model: providerRecord.model,
serverHost: serverRecord.host,
});
} catch (error) {
const message = error instanceof Error ? error.message : "Deploy failed";

await db.insert(auditLogs).values({
userId: session.user.id,
action: "provider.deploy.failed",
details: {
provider: providerRecord.provider,
model: providerRecord.model,
serverId: serverRecord.id,
error: message,
},
ipAddress,
});

return context.json({ error: `Deploy failed: ${message}` }, 502);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't let audit inserts change the deploy outcome.

After deployComposeViaSsh() succeeds, the primary operation is already done. If either auditLogs insert throws, this handler returns 502 even though Hermes was updated, and the failure-log insert can also mask the original deploy error. Keep audit persistence best-effort and independent from the HTTP result.

As per coding guidelines, "Sequential writes are fine when primary data → audit log sequences are purely historical and the audit log's absence doesn't affect correctness. If the audit insert fails, the primary operation is still valid — no data divergence."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@server/deploy.ts` around lines 159 - 192, The audit log inserts around
deployComposeViaSsh() must be best-effort and must not change the HTTP outcome:
wrap the db.insert(auditLogs).values(...) calls (both the success-path insert
after deployComposeViaSsh() and the failure-path insert inside the catch) in
their own try/catch so any insert error is caught and logged (e.g., via
processLogger or console) but does not alter the response returned by
context.json; ensure the main success return (status: "deployed") is returned
even if the success audit insert fails, and ensure when handling a deploy error
you record the original deploy error message to the response and still attempt
the failed-audit insert but suppress/log any errors from that insert so they
never overwrite the original error handling.

Comment on lines +29 to +55
try {
const response = await fetch("/api/telegram/deploy", {
method: "POST",
});

const payload = (await response.json().catch(() => null)) as {
error?: string;
status?: string;
serverHost?: string;
} | null;

if (!response.ok) {
setError(payload?.error ?? "Deploy failed");
return;
}

const serverHost = payload?.serverHost ?? null;
onConfigChange({
...savedConfig,
deployedServerHost: serverHost,
});
setSuccessMessage(
`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
);
} finally {
setIsDeploying(false);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle rejected deploy requests explicitly.

If fetch() throws here, the spinner clears but the user never gets an error message. This makes transient network failures look like a no-op.

🩹 Suggested fix
 		try {
 			const response = await fetch("/api/telegram/deploy", {
 				method: "POST",
 			});
@@
 			setSuccessMessage(
 				`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
 			);
+		} catch (error) {
+			setError(error instanceof Error ? error.message : "Deploy failed");
 		} finally {
 			setIsDeploying(false);
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try {
const response = await fetch("/api/telegram/deploy", {
method: "POST",
});
const payload = (await response.json().catch(() => null)) as {
error?: string;
status?: string;
serverHost?: string;
} | null;
if (!response.ok) {
setError(payload?.error ?? "Deploy failed");
return;
}
const serverHost = payload?.serverHost ?? null;
onConfigChange({
...savedConfig,
deployedServerHost: serverHost,
});
setSuccessMessage(
`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
);
} finally {
setIsDeploying(false);
}
try {
const response = await fetch("/api/telegram/deploy", {
method: "POST",
});
const payload = (await response.json().catch(() => null)) as {
error?: string;
status?: string;
serverHost?: string;
} | null;
if (!response.ok) {
setError(payload?.error ?? "Deploy failed");
return;
}
const serverHost = payload?.serverHost ?? null;
onConfigChange({
...savedConfig,
deployedServerHost: serverHost,
});
setSuccessMessage(
`Bot token deployed to ${serverHost ?? "server"}. Hermes is restarting...`,
);
} catch (error) {
setError(error instanceof Error ? error.message : "Deploy failed");
} finally {
setIsDeploying(false);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/features/telegram/telegram-deploy-section.tsx` around lines 29 - 55, The
try/finally currently lets fetch() throw without reporting an error to the user;
wrap the await fetch and subsequent JSON parsing in a try/catch (or add a catch
block) so any thrown error sets setError(...) with a clear message (include
error.message) and returns before calling onConfigChange and setSuccessMessage;
ensure setIsDeploying(false) remains in the finally. Update the block that uses
fetch("/api/telegram/deploy") and the JSON parsing so exceptions trigger
setError instead of silently falling through.

Comment on lines +1 to +2
export const inputClassName =
"w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-none focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Find Tailwind version and relevant config
rg -n --hidden --no-ignore -S '"tailwindcss"|tailwindcss' package.json pnpm-lock.yaml yarn.lock package-lock.json 2>/dev/null || true
ls -la
# Show tailwind config files if present
fd tailwind.config.* -t f . || true
fd postcss.config.* -t f . || true
# Inspect the target file around the class definition
fd telegram-input-class.ts -t f src || true
if [ -f src/features/telegram/telegram-input-class.ts ]; then
  sed -n '1,120p' src/features/telegram/telegram-input-class.ts
fi
# Find other outline usage in the repo to gauge conventions
rg -n --hidden -S 'outline-none|outline-hidden|focus:ring' src || true

Repository: jellydn/hermes-hub

Length of output: 5571


🌐 Web query:

Tailwind CSS v4 outline-hidden outline-none focus ring double outline

💡 Result:

In Tailwind CSS v4, the utilities for managing outlines and focus indicators have been refined to clarify their behavior and improve accessibility [1][2]. Key Changes and Utilities - outline-none: In v4, this utility simply sets outline-style: none [3][1]. This is a departure from v3, where it was a more complex class that also set a 2px transparent outline for accessibility [1][2]. - outline-hidden: This is the v4 replacement for the "old" v3 behavior of outline-none [1][2]. It hides the default browser outline while preserving the outline in forced colors mode (e.g., Windows High Contrast Mode), making it the recommended choice for removing focus rings without sacrificing accessibility [3][2][4]. - outline-double: This is a standard utility class that sets the CSS outline-style to double [3]. Focus Rings and Accessibility Tailwind CSS provides two primary ways to create focus indicators: 1. Outline Utilities (e.g., outline-2, outline-blue-500, outline-offset-2): These use the native CSS outline property [3][5][6]. They are generally preferred for accessibility because they automatically respect forced-colors modes and user-defined system styles [5]. 2. Ring Utilities (e.g., ring-2, ring-offset-2): These are implemented using CSS box-shadows [7][5][6]. While highly customizable (supporting opacity, complex offsets, and inset shadows), they do not appear in forced-colors modes because box-shadows are typically ignored in those environments [5][2]. Recommendation For focus indicators, the official recommendation is to use the outline utilities if accessibility in forced-colors mode is a priority [5]. If you need to remove the default browser outline, use outline-hidden [2]. If you require a custom, highly styled focus indicator, the ring utilities are appropriate, provided you are aware of their limitations regarding forced-colors modes [5][2].

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Find Tailwind version and relevant config
rg -n --hidden --no-ignore -S '"tailwindcss"|tailwindcss' package.json pnpm-lock.yaml yarn.lock package-lock.json 2>/dev/null || true
ls -la
# Show tailwind config files if present
fd tailwind.config.* -t f . || true
fd postcss.config.* -t f . || true
# Inspect the target file around the class definition
fd telegram-input-class.ts -t f src || true
if [ -f src/features/telegram/telegram-input-class.ts ]; then
  sed -n '1,120p' src/features/telegram/telegram-input-class.ts
fi
# Find other outline usage in the repo to gauge conventions
rg -n --hidden -S 'outline-none|outline-hidden|focus:ring' src || true

Repository: jellydn/hermes-hub

Length of output: 5571


Use outline-hidden instead of outline-none for the custom focus ring
In Tailwind v4, outline-none sets outline-style: none and can hide the native outline even in forced-colors; outline-hidden hides the default outline while preserving forced-colors focus indication alongside your focus:ring-*.

🩹 Suggested fix
 export const inputClassName =
-	"w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-none focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]";
+	"w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-hidden focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]";
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export const inputClassName =
"w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-none focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]";
export const inputClassName =
"w-full rounded-full border border-[var(--chip-line)] bg-white/80 px-4 py-3 text-sm text-[var(--sea-ink)] outline-hidden focus:border-[color:var(--lagoon)] focus:ring-2 focus:ring-[rgba(79,184,178,0.18)]";
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/features/telegram/telegram-input-class.ts` around lines 1 - 2, The class
string exported as inputClassName currently uses "outline-none" which removes
native focus outlines even under forced-colors; update the string to use
"outline-hidden" instead of "outline-none" so the custom focus ring
(focus:ring-*) is preserved while allowing forced-colors focus indication to
remain; modify the inputClassName constant value to swap "outline-none" for
"outline-hidden".

Comment on lines +56 to +76
try {
const response = await fetch("/api/telegram/pairings");
const payload = (await response.json().catch(() => null)) as {
error?: string;
pairings?: TelegramPairingSummary;
} | null;

if (!response.ok || !payload?.pairings) {
if (!quiet) {
setError(payload?.error ?? "Unable to load pairings");
}
return;
}

setPairings(payload.pairings);
setLastLoadedAt(new Date());
} finally {
if (!quiet) {
setIsLoadingPairings(false);
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Catch polling and manual-refresh fetch failures.

loadPairings() only handles HTTP responses. If fetch("/api/telegram/pairings") rejects, that bubbles out of both the refresh button path and the interval callback, so users get no feedback and the poller throws unhandled rejections.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/features/telegram/telegram-pairing-section.tsx` around lines 56 - 76, The
loadPairings() function currently only handles non-OK HTTP responses but doesn't
catch network/fetch rejections; wrap the fetch and response.json() logic in a
try/catch (or add a catch around the await fetch) inside loadPairings() so any
thrown errors are caught, call setError(...) with a useful message (and include
error.message if available), ensure setIsLoadingPairings(false) runs in the
finally block, and return early on error; reference the existing loadPairings,
setError, setIsLoadingPairings, setPairings, and the
fetch("/api/telegram/pairings")/response.json() calls when applying the fix.

Comment on lines +90 to +113
try {
const response = await fetch("/api/telegram/pairings/approve", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ code }),
});

const payload = (await response.json().catch(() => null)) as {
error?: string;
approved?: { userId: string; userName: string };
} | null;

if (!response.ok || !payload?.approved) {
setError(payload?.error ?? "Unable to approve pairing");
return;
}

const displayName = payload.approved.userName || payload.approved.userId;
setPairingCode("");
setSuccessMessage(`Approved ${displayName || "Telegram user"}.`);
void loadPairings({ quiet: true });
} finally {
setIsApprovingPairing(false);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Catch rejected approve requests.

handleApprovePairing() has the same gap: rejected fetch("/api/telegram/pairings/approve") calls never hit setError(...), so approval failures can look like no-op network glitches. Handle rejected requests the same way as non-2xx responses.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/features/telegram/telegram-pairing-section.tsx` around lines 90 - 113,
The approve flow currently only handles non-2xx responses but not network/fetch
rejections; update handleApprovePairing to catch thrown errors from fetch and
call setError(...) with a helpful message (and optionally the error.message) so
rejected requests aren't silent. Specifically wrap the await fetch/response.json
logic in a try/catch (or add a catch after the promise) and in the catch call
setError(...) and return; keep the existing finally block that calls
setIsApprovingPairing(false) and ensure loadPairings, setPairingCode, and
setSuccessMessage are only called on successful approvals.

Comment on lines +27 to +47
try {
const response = await fetch("/api/telegram/test", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ message: testMessage }),
});

const payload = (await response.json().catch(() => null)) as {
error?: string;
response?: string;
} | null;

if (!response.ok) {
setTestError(payload?.error ?? "Test failed");
return;
}

setTestResponse(payload?.response ?? "(empty response)");
} finally {
setIsTesting(false);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Surface network failures in the test flow.

fetch() rejections skip testError, so a broken tunnel or offline browser just clears the loading state without telling the user what happened.

🩹 Suggested fix
 		try {
 			const response = await fetch("/api/telegram/test", {
 				method: "POST",
 				headers: { "content-type": "application/json" },
 				body: JSON.stringify({ message: testMessage }),
 			});
@@
 
 			setTestResponse(payload?.response ?? "(empty response)");
+		} catch (error) {
+			setTestError(error instanceof Error ? error.message : "Test failed");
 		} finally {
 			setIsTesting(false);
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try {
const response = await fetch("/api/telegram/test", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ message: testMessage }),
});
const payload = (await response.json().catch(() => null)) as {
error?: string;
response?: string;
} | null;
if (!response.ok) {
setTestError(payload?.error ?? "Test failed");
return;
}
setTestResponse(payload?.response ?? "(empty response)");
} finally {
setIsTesting(false);
}
try {
const response = await fetch("/api/telegram/test", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ message: testMessage }),
});
const payload = (await response.json().catch(() => null)) as {
error?: string;
response?: string;
} | null;
if (!response.ok) {
setTestError(payload?.error ?? "Test failed");
return;
}
setTestResponse(payload?.response ?? "(empty response)");
} catch (error) {
setTestError(error instanceof Error ? error.message : "Test failed");
} finally {
setIsTesting(false);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/features/telegram/telegram-test-section.tsx` around lines 27 - 47, The
try/finally around the fetch leaves network/fetch rejections unreported; add a
catch block around the fetch/response parsing to call setTestError with the
network error message (e.g., error?.message || String(error)) so broken
tunnels/offline browsers surface an error to the user, keep the existing
response.ok handling and setTestResponse logic, and retain the finally block
that calls setIsTesting(false); update the block that performs fetch(...) in
telegram-test-section.tsx (the async block using setTestError, setTestResponse,
setIsTesting) to implement this catch.

jellydn added 2 commits June 1, 2026 14:42
Older approval runs left pairing files owned by root inside the Hermes
container, making them unreadable by Telegram polling which runs as the
hermes user. Add a chown repair step before the Python pairing command
and run the pairing command with --user hermes to ensure approved users
stay readable.
…ship and ensure accessibility for Telegram gateway
@jellydn jellydn merged commit 2765fa1 into main Jun 1, 2026
6 of 7 checks passed
jellydn added a commit that referenced this pull request Jun 14, 2026
…onnect

Invert the requireHostKeyPin flag to allowMissingHostKeyPin. The secure
behavior (fail closed when no fingerprint is stored) is now the default.
Only verifyServerConnection (first-connect) opts out by passing
allowMissingHostKeyPin: true.

This eliminates requireHostKeyPin: true from 15 call sites and removes
the field from DeployComposeInput, ManagedComposeDeployInput, and
SshConnectionConfig intermediate types. A future caller that forgets
the flag is safe by default.

Also extracts parseOptionId helper and generic keepLatestBy to eliminate
copy-paste. Uses composition for DeployComposeInput (extends
SshConnectionInput instead of duplicating its fields).

Review fixes #1, #2, #4, #5.
jellydn added a commit that referenced this pull request Jun 16, 2026
* feat: add Telegram model/provider switch endpoint

Add POST /api/telegram/model-switch endpoint that allows switching the
AI model and/or provider on a deployed Hermes server via SSH.

What:
- New switchModelProvider handler in server/telegram.ts
- Accepts { model?, provider? } in request body
- Validates provider against known ApiProviderId values
- Validates model string format and provider-model compatibility
- Uses SSH to execute hermes config set commands on the remote server
- Writes audit logs for success and failure cases

Why:
- Closes #34
- Users currently need to use the web UI to change model/provider
- This enables programmatic switching (e.g., from Telegram bot commands)

How:
- Reuses existing setProviderModel/setProviderInferenceProvider from runtime.ts
- Reuses PROVIDER_ENV_CONFIGS for provider-to-hermes-provider mapping
- Follows same auth + SSH + audit log pattern as other Telegram endpoints
- 8 new tests covering auth, validation, SSH execution, and error handling

* fix: address thermo-nuclear review findings for model-switch endpoint

Fixes:
- CRITICAL: Silent no-op when provider mapping missing — now validates
  hermesProvider exists and returns 400 if not
- CRITICAL: Unsafe provider as ApiProviderId cast — now uses typed
  validatedProvider variable throughout
- MAJOR: Extract resolveTelegramSshContext helper to eliminate 30-line
  SSH boilerplate duplication
- MAJOR: Trim model before validation (not after) to handle whitespace

* fix: biome formatting in server/telegram.ts

* fix: biome formatting in renovate.json and server/telegram.test.ts

* fix: TS errors in switchModelProvider — SshAuthMethod type and status code cast

* fix: biome formatting — import order and multi-line context.json call

* fix: add switchModelProvider to telegram mock in app.test.ts

* chore: trigger CI

* fix: address PR review feedback for switchModelProvider

* feat(telegram): rewrite compose and restart container on model/provider switch

- Extract getModelValidationError helper for cleaner validation
- Validate model against subscription backends in addition to API providers
- Rewrite managed compose file and restart hermes container after SSH deploy
  so new provider env vars take effect without manual intervention
- Use cached pre-switch backend for DB persistence instead of re-fetching,
  ensuring DB is never updated if remote configuration fails
- Split subscription model persistence into credential vs oauth cases

* feat(telegram): add model-access-options API and rewrite model-switch to use optionId

- Add GET /api/telegram/model-access-options that returns the latest saved
  deployable option per backend (API providers, credential subscriptions,
  OAuth subscriptions) with opaque optionId: api-provider:<uuid>,
  credential-subscription:<uuid>, oauth-subscription:<uuid>
- Rewrite POST /api/telegram/model-switch to accept { optionId, model }
  payload, resolve the saved row, validate model against the backend, deploy
  via SSH compose, and activate the selected row while deactivating others
  in a DB transaction
- Add shared contracts for ModelAccessOption, ModelAccessOptionsResponse,
  and ModelSwitchPayload
- Add query helpers for listing and resolving saved deployable options
- Update backend tests for the new option-based switch API

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>

* feat(telegram): add model-access switcher to Telegram page UI

- Add TelegramModelAccessSection component that fetches model access options,
  shows active provider/model, lets users pick a saved backend and model
  (fixed dropdown or custom input), and switches via the API
- Wire section into TelegramSettings, shown only when Telegram is deployed
- Empty state links users to /ai-provider to save a config first
- Update frontend tests to provide model-access-options fetch mock

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>

* refactor(telegram): fix model-access queries, type safety, and extract model-switch service

- Fix ResolvedOption.provider type lie: change from ApiProviderId to string,
  remove unsafe as unknown as ApiProviderId casts
- Replace N+1 per-type DB queries with batch inArray queries in
  getModelAccessOptions (4+ queries → 3 total)
- Remove isDeployable pre-filter; fold decrypt into builders via
  decryptAndGetLast4 helper to avoid double decryption
- Change findActiveOptionIds to return typed { providerIds, subscriptionIds }
  so deactivation uses single UPDATE WHERE id IN (...) per table
- Extract switchModelProvider orchestration (SSH + compose + DB transaction)
  into server/telegram/model-switch.ts; handler is now a thin wrapper
- Refactor testTelegramBot to reuse resolveTelegramSshContext helper
- Replace dynamic import("../providers/config") with static import
- Remove unused ModelSwitchPayload export from shared contract

* refactor(telegram): replace useState with useReducer and fix react-doctor warnings

Convert TelegramModelAccessSection from 6 useState calls to a single
useReducer, which batches related state updates into one React render
instead of 6. The reducer groups form state (options, selection) and
UI state (loading, switching, messages) into typed actions.

Also switches from useEffect to useMountEffect for the initial data
fetch, matching the established pattern in telegram-pairing-section
and other components. This eliminates the react-doctor/no-event-handler
false positive — isDeployed is set by the parent based on external
server state, not a user event.

React Doctor score: 79 → 100/100.

* test(telegram): update switchModelProvider tests for extracted service

- Add executeModelSwitch mock for ./telegram/model-switch module
- Update activeOptionIds format from flat array to { providerIds, subscriptionIds }
- Update success assertions to verify executeModelSwitchMock args instead
  of checking SSH runtime calls directly (those are now tested in the
  service layer)
- Update SSH failure test to mock executeModelSwitch rejection

* refactor(dashboard): extract polling logic into useDashboardPolling hook

DashboardStatusOverview was 313 lines — just over the react-doctor
no-giant-component threshold of 300. The component already had good
subcomponent extraction (ServerInventoryCard, StatusCard, VpsHealthCard,
etc.), but ~125 lines of polling state machine logic (7 refs, 4 useState,
5 functions, 1 useMountEffect) was mixed into the component body.

Extract all polling machinery into a useDashboardPolling custom hook
that returns { snapshot, fetchState, fetchError, pollingPaused,
refreshStatus, handleManualRetry }. The main component drops to ~190
lines and focuses purely on rendering.

No behavior change — the polling logic is identical, just relocated.

React Doctor: no-giant-component warning resolved.

* refactor(providers): replace boolean props with explicit status enum in AccessSelectionActions

AccessSelectionActions took isSaving and isTesting as separate booleans,
creating an implicit 4-state system (idle, saving, testing, both) when
only 3 are valid. The \"both\" state is impossible in practice but the
type system allowed it.

Replace with status: \"idle\" | \"saving\" | \"testing\" — a single prop
that makes the valid states explicit and eliminates the dead 4th state
at the type level. Call sites derive the status from the existing
booleans with a ternary.

React Doctor: prefer-explicit-variants warning resolved.

Also: added react-doctor GitHub Actions CI workflow (npx react-doctor install).

* test(crypto): add unit tests for AES-256-GCM secret helpers

Pins round-trip, tamper rejection, malformed payload, missing key, and
all three decryptApiServerKey branches (empty, legacy plaintext,
encrypted round-trip, malformed-with-dot). Uses real node:crypto — no
mocking — since testing the actual implementation is the point.

Plan 001 from audit at 29c3b46.

* fix(agent-skills): fail skill install when remote download fails

The curl | sudo tee pipeline masked curl non-zero exit code because
in a shell pipeline only the last command exit status propagates. A
failed download (404, DNS, network drop) would write an empty SKILL.md
and report success.

Replace with download-to-temp + test -s + mv: curl writes to a .download
temp file, test -s verifies it is non-empty, then mv atomically moves it
to the final path. Any failure in the && chain aborts the install.

Plan 002 from audit at 29c3b46.

* fix(ssh): fail closed when host-key pin missing on credential ops

Add requireHostKeyPin flag to SshConnectionInput. When set, the
hostVerifier throws host_key_missing if no expectedFingerprint is
stored, instead of silently accepting any key (fail-open).

Every post-registration withSshConnection caller now passes
requireHostKeyPin: true. The first-connect flow (verifyServerConnection)
is unchanged — it still tolerates a missing fingerprint so the pin can
be stored on initial connection.

Covers 15 call sites across 13 files. Plan 003 from audit at 29c3b46.

* test(request-guards): cover auth and ownership boundary

Pins status codes for requireAuthSession (401), requireOwnedServer
(400 missing id, 401 unauthenticated, 404 not owned), and
requireOwnedServerSsh (400 SSH failure, 404 not owned). Asserts that
getOwnedServerRecord is called with the session user.id — the IDOR
guard. Safety net for the later guard-consolidation refactor (DEBT-01).

Plan 004 from audit at 29c3b46.

* test(server-records): cover credential resolution branches

Pins resolveServerCredential branches: stored-present, stored-missing,
session-missing-id, session-expired, session-valid. Covers
normalizeAuthMethod (password, ssh-key, unsupported) and both
resolveServerSshConfigOrError branches (ok + error). Mocks crypto and
credentials — no DB or real crypto needed.

Plan 005 from audit at 29c3b46.

* chore: mark all audit plans as DONE

* docs: add audit plan files

* refactor(ssh): default to requiring host-key pin, opt out for first-connect

Invert the requireHostKeyPin flag to allowMissingHostKeyPin. The secure
behavior (fail closed when no fingerprint is stored) is now the default.
Only verifyServerConnection (first-connect) opts out by passing
allowMissingHostKeyPin: true.

This eliminates requireHostKeyPin: true from 15 call sites and removes
the field from DeployComposeInput, ManagedComposeDeployInput, and
SshConnectionConfig intermediate types. A future caller that forgets
the flag is safe by default.

Also extracts parseOptionId helper and generic keepLatestBy to eliminate
copy-paste. Uses composition for DeployComposeInput (extends
SshConnectionInput instead of duplicating its fields).

Review fixes #1, #2, #4, #5.

* chore(plans): remove completed audit plan documents

All five audit plans (001-005) were fully implemented and verified:
- 001: crypto helper tests in server/crypto.test.ts
- 002: pipefail fix for skill install downloads
- 003: host-key pinning fail-closed on credential SSH ops
- 004: request guard tests in server/request-guards.test.ts
- 005: credential resolution tests in server/server-records.test.ts

These planning artifacts served their purpose and are now cleaned up.

* feat(server): handle SSH host-key errors on provider deploy endpoints

Catch SshConnectError with host_key_missing and host_key_mismatch
codes during deploy flow and return a structured 409 response with
observed fingerprint, algorithm, and (on mismatch) expected fingerprint.
Update test mocks to use importOriginal so SshConnectError is available
for new test assertions.

* feat(server): handle SSH host-key errors on Telegram model switch

Catch SshConnectError with host_key_missing and host_key_mismatch
during switchModelProvider and return a structured 409 response.
Add test coverage for both host-key error codes and the generic
SSH error fallback (502).

* feat(providers): add host-key trust UI for provider deploy

Surface host-key errors from the deploy endpoint as a warning panel
showing observed and expected fingerprints. Add "Trust host key and
retry" flow that POSTs to /api/servers/:id/host-key/accept then
re-attempts the deploy. Wire the new state, reducer actions, and
controller method into ProviderSettings.

* feat(server): extract shared host-key error response utility

Deduplicate the repetitive SSH host-key error handling that was inlined
across deploy endpoints. The new isRecoverableHostKeyError type guard and
hostKeyErrorResponse builder produce a consistent 409 JSON payload with
structured hostKey (observed fingerprint/algorithm, expected on mismatch).

* feat(client): add shared host-key recovery types and trust panel UI

Extract HostKeyErrorPayload type and parseHostKeyErrorPayload parser to a
shared module under features/servers. The HostKeyTrustPanel component
renders the host-key trust prompt (fingerprint display, trust/retry and
cancel buttons) so it can be reused across provider deploy and Telegram
model-switch views.

* refactor(server): deduplicate host-key error handling in deploy endpoints

Replace the duplicated host_key_missing / host_key_mismatch inline blocks
in deployProviderToHermes, deployToHermesAgent, and switchModelProvider
with the shared isRecoverableHostKeyError / hostKeyErrorResponse utilities.

On the client side, swap the inline host-key error parsing in
provider-access-actions for parseHostKeyErrorPayload and replace the
inline host-key alert UI in provider-settings-aside with the reusable
HostKeyTrustPanel component.

* refactor(telegram): extract ModelAccessForm sub-component

Pull the inline option selector, model picker, and switch/refresh buttons
into a self-contained ModelAccessForm component to reduce clutter in the
TelegramModelAccessSection render method.

* refactor(contracts,providers): extract shared host-key types and acceptHostKey action

Create shared/contracts/host-key-error.ts with canonical HostKeyErrorCode and HostKeyErrorResponsePayload types, replacing duplicate definitions on both sides of the boundary.\n\nAdd acceptHostKey() to provider-access-actions.ts as a reusable action for trusting server host keys, using the existing readJsonBody helper.\n\nRestore try/catch in handleTrustAndRetryDeploy to prevent unhandled rejections on network errors.\n\nUpdate server/lib/host-key-error-response.ts, src/features/servers/host-key-recovery.ts, and src/features/providers/provider-access-actions.ts to import from the shared contract.

* refactor(server): extract resolveSwitchOption branch resolvers

Split the large multi-branch resolveSwitchOption function in server/telegram/model-access.ts into three self-contained resolver functions (resolveApiProviderOption, resolveCredentialSubscriptionOption, resolveOAuthSubscriptionOption) with a switch-based dispatcher. Each resolver handles its own DB query, validation, and ResolvedOption construction. No behavioral changes.

* feat(telegram): add host-key trust-and-retry to model switch

Extract state machine and data fetching from telegram-model-access-section.tsx into use-model-access-controller.ts with host-key recovery support.\n\nThe hook now parses 409 host-key error responses via parseHostKeyErrorPayload and exposes a HostKeyTrustPanel flow (handleTrustAndRetrySwitch) instead of showing a generic error. This brings the Telegram model switch UX to parity with the AI provider deploy flow for host-key recovery.\n\nThe section component delegates to the hook and renders HostKeyTrustPanel when a host-key error is detected during model switching.

* chore(pre-commit): move typecheck hook to push stage

Move TypeScript typecheck from commit stage to push stage so partial staging is not blocked by type errors in unstaged files. Full typecheck still runs on every git push.

* fix(server): add host key error recovery to deploy and test endpoints

Add isRecoverableHostKeyError checks before generic 502 fallbacks in three server paths that were missing host key recovery:\n\n- deployTelegramToServer: initial Telegram bot deploy to a server\n- testTelegramBot: test-bot flow over SSH\n- deployToTelegramLinkedHermes: skills/agent deploy to Telegram-linked server\n\nWhen a server has hostKeyFingerprint NULL in the DB, SSH connections throw 'host key pin required but not stored'. These paths previously returned a generic 502; they now return a 409 with a hostKeyErrorResponse payload so the client can surface the trust-and-retry panel.\n\nPattern matches the existing handlers: switchModelProvider, deployToHermesAgent, deployProviderToHermes.

* feat(telegram): add host key trust panel to deploy and test sections

Add HostKeyTrustPanel with trust-and-retry flow to the Telegram deploy section (initial bot deploy) and test section (test-bot flow).\n\nBoth sections now parse 409 responses via parseHostKeyErrorPayload and surface a HostKeyTrustPanel when a host key error is detected. On trust, the host key is accepted via POST /api/servers/:id/host-key/accept and the operation retries automatically.\n\nState management consolidated into useReducer (with useRef for stale-closure avoidance) to match the pattern in use-model-access-controller.ts.\n\nBrings the Telegram deploy and test UIs to parity with the model switch and provider deploy flows for host key recovery.

* test(telegram): add host key error recovery integration tests

Add integration tests for host key recovery in deployTelegramToServer and testTelegramBot endpoints.\n\nVerifies that both endpoints return 409 with hostKey recovery payload when SSH fails with host_key_missing, instead of the previous generic 502. Confirms no audit log is written for recoverable errors and no deploy state is persisted.

* docs(api): document host key recovery flow and new endpoints

Add Host Key Recovery section to the API reference explaining the 409 response format, error codes (host_key_missing/host_key_mismatch), recovery flow, and list of affected endpoints.\n\nDocument new POST /api/servers/:id/host-key/accept endpoint.\n\nAdd previously undocumented Telegram endpoints: POST /api/telegram/deploy, POST /api/telegram/test, POST /api/telegram/model-switch.\n\nUpdate error response tables on all 8 deploy endpoints (telegram deploy/test/model-switch, providers deploy, persona/mcp/agent-skills deploy, web-ui deploy) to include 409 for host key errors.

* docs(glossary): add host key pin and recovery terminology

Add two glossary entries to CONTEXT.md:\n\n- Host Key Pin: explains the stored SHA256 fingerprint, its role in SSH verification, and the host_key_missing/host_key_mismatch error contract\n- Host Key Recovery: explains the 409 -> trust panel -> accept -> retry flow and lists all 8 endpoints that support it

---------

Co-authored-by: Hermes Agent <agent@nousresearch.com>
Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
@jellydn jellydn deleted the feat/telegram-provider branch June 16, 2026 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant