Skip to content

Commit 6332c79

Browse files
authored
fix: disable standalone SSE stream in streamable HTTP transport to prevent HTTP MCP server startup failures (#4428)
HTTP MCP servers (e.g. Unwrap) show `status: error` at gateway startup and advertise no tools to Copilot CLI, despite the endpoint being reachable. ## Root cause The SDK's `StreamableClientTransport` automatically issues a **GET request for a persistent SSE stream** synchronously inside `client.Connect()`, immediately after the `initialize` POST succeeds: 1. POST `initialize` → server responds ✓ 2. `sessionUpdated()` → `connectStandaloneSSE()` → GET request 3. Server returns 5xx (or any non-4xx/non-405) → `c.fail()` marks connection as broken 4. `client.Connect()` tries to send `initialized` notification → `c.Write()` sees failure → returns error 5. All three transports fail → `recordError()` → `/health` reports `status: error` Additionally, `MaxRetries: 0` in the SDK defaults to **5 retries**, not 0, contrary to the existing comment. ## Changes - **`internal/mcp/http_transport.go` / `connection.go`**: Set `DisableStandaloneSSE: true` and `MaxRetries: -1` on `StreamableClientTransport` in both `tryStreamableHTTPTransport` and `reconnectSDKTransport`. The gateway is a request-response proxy and has no use for server-initiated SSE notifications from backends. ```go return &sdk.StreamableClientTransport{ Endpoint: url, HTTPClient: httpClient, MaxRetries: -1, // Disable retries (-1 = 0 retries; SDK treats 0 as "use default: 5") DisableStandaloneSSE: true, } ``` - **`internal/mcp/http_connection_test.go`**: Adds regression test `TestNewHTTPConnection_StreamableTransport_BadSSEEndpoint` — server returns valid initialize result for POST but 500 for GET; connection must succeed via streamable transport with zero GET requests issued. > [!WARNING] > > <details> > <summary>Firewall rules blocked me from connecting to one or more addresses (expand for details)</summary> > > #### I tried to connect to the following addresses, but was blocked by firewall rules: > > - `example.com` > - Triggering command: `/tmp/go-build900524329/b513/launcher.test /tmp/go-build900524329/b513/launcher.test -test.testlogfile=/tmp/go-build900524329/b513/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build900524329/b410/vet.cfg g_.a o_.o x_amd64/vet 01.o roundrobin 03.o x_amd64/vet -W wkO_kTNLT .cfg x_amd64/vet . --gdwarf2 --64 x_amd64/vet` (dns block) > - `invalid-host-that-does-not-exist-12345.com` > - Triggering command: `/tmp/go-build900524329/b495/config.test /tmp/go-build900524329/b495/config.test -test.testlogfile=/tmp/go-build900524329/b495/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build900524329/b399/vet.cfg g_.a 9742532/b151/ x_amd64/vet --gdwarf-5 grpcsync p=/opt/hostedtoolcache/go/1.25.9/tmp/go-build511290287/b107/vet.cfg x_amd64/vet -W ABSZSCNGR .cfg x_amd64/vet . --gdwarf2 --64 x_amd64/vet` (dns block) > - `nonexistent.local` > - Triggering command: `/tmp/go-build900524329/b513/launcher.test /tmp/go-build900524329/b513/launcher.test -test.testlogfile=/tmp/go-build900524329/b513/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build900524329/b410/vet.cfg g_.a o_.o x_amd64/vet 01.o roundrobin 03.o x_amd64/vet -W wkO_kTNLT .cfg x_amd64/vet . --gdwarf2 --64 x_amd64/vet` (dns block) > - `slow.example.com` > - Triggering command: `/tmp/go-build900524329/b513/launcher.test /tmp/go-build900524329/b513/launcher.test -test.testlogfile=/tmp/go-build900524329/b513/testlog.txt -test.paniconexit0 -test.timeout=10m0s /tmp/go-build900524329/b410/vet.cfg g_.a o_.o x_amd64/vet 01.o roundrobin 03.o x_amd64/vet -W wkO_kTNLT .cfg x_amd64/vet . --gdwarf2 --64 x_amd64/vet` (dns block) > - `this-host-does-not-exist-12345.com` > - Triggering command: `/tmp/go-build3852249779/b001/mcp.test /tmp/go-build3852249779/b001/mcp.test -test.testlogfile=/tmp/go-build3852249779/b001/testlog.txt -test.paniconexit0 -test.run=TestNewHTTPConnection|TestHTTPConnection -test.v=true -test.timeout=1m0s 0/internal/tracetransform/instrumentation.go x_amd64/compile OUTPUT 9742532/b015/ 168.63.129.16 x_amd64/compile -w p/go-build security 64=/_/GOROOT OUTPUT 64/src/runtime/c-unsafeptr=false 168.63.129.16 git` (dns block) > - Triggering command: `/tmp/go-build900524329/b522/mcp.test /tmp/go-build900524329/b522/mcp.test -test.testlogfile=/tmp/go-build900524329/b522/testlog.txt -test.paniconexit0 -test.timeout=10m0s 9742�� olang.org/protob-errorsas .cfg x_amd64/vet --gdwarf-5 --64 -o x_amd64/vet -o 9742532/b408/_pk-s -trimpath x_amd64/vet -p g/grpc/binarylog/usr/bin/runc -lang=go1.25 x_amd64/vet` (dns block) > - Triggering command: `/tmp/go-build3289280713/b001/mcp.test /tmp/go-build3289280713/b001/mcp.test -test.testlogfile=/tmp/go-build3289280713/b001/testlog.txt -test.paniconexit0 -test.run=TestNewHTTPConnection -test.v=true -test.timeout=1m0s y bin/bash ntime.v2.task/mogit .cfg x_amd64/vet /usr/bin/runc.original --ve�� 30977974af8f49f3088e082efdf88695 runtime-runc/moby /usr/local/bin/iptables io.containerd.ru/bin/sh .cfg x_amd64/vet b83/log.json` (dns block) > > If you need me to access, download, or install something from one of these locations, you can either: > > - Configure [Actions setup steps](https://gh.io/copilot/actions-setup-steps) to set up my environment, which run before the firewall is enabled > - Add the appropriate URLs or hosts to the custom allowlist in this repository's [Copilot coding agent settings](https://github.com/github/gh-aw-mcpg/settings/copilot/coding_agent) (admins only) > > </details>
2 parents e9c3673 + fc18b1e commit 6332c79

3 files changed

Lines changed: 72 additions & 4 deletions

File tree

internal/mcp/connection.go

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -377,9 +377,10 @@ func (c *Connection) reconnectSDKTransport() error {
377377
switch c.httpTransportType {
378378
case HTTPTransportStreamable:
379379
transport = &sdk.StreamableClientTransport{
380-
Endpoint: c.httpURL,
381-
HTTPClient: headerClient,
382-
MaxRetries: 0,
380+
Endpoint: c.httpURL,
381+
HTTPClient: headerClient,
382+
MaxRetries: -1, // Disable retries; reconnect logic is handled by the gateway.
383+
DisableStandaloneSSE: true,
383384
}
384385
case HTTPTransportSSE:
385386
transport = &sdk.SSEClientTransport{

internal/mcp/http_connection_test.go

Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@ import (
55
"encoding/json"
66
"net/http"
77
"net/http/httptest"
8+
"sync/atomic"
89
"testing"
910
"time"
1011

@@ -582,3 +583,61 @@ func TestNewHTTPConnection_GettersAfterCreation(t *testing.T) {
582583
assert.Equal(t, expectedValue, returnedHeaders[key], "Header %s should match", key)
583584
}
584585
}
586+
587+
// TestNewHTTPConnection_StreamableTransport_BadSSEEndpoint verifies that the streamable
588+
// HTTP transport succeeds even when the server's SSE endpoint (GET) returns errors.
589+
//
590+
// This tests the fix for the Unwrap MCP server issue: some cloud API MCP servers
591+
// correctly respond to POST (initialize) but return 5xx or unexpected responses to GET
592+
// requests. Before the fix, the SDK's standalone SSE stream would call c.fail() on the
593+
// connection, causing the initialized notification to fail and the connection to be
594+
// reported as "error". With DisableStandaloneSSE: true, the GET is never issued and
595+
// the connection succeeds on the POST-only path.
596+
func TestNewHTTPConnection_StreamableTransport_BadSSEEndpoint(t *testing.T) {
597+
require := require.New(t)
598+
assert := assert.New(t)
599+
600+
var getMethodCount atomic.Int32
601+
602+
testServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
603+
if r.Method == http.MethodGet {
604+
// Simulate a server that returns 500 for the SSE GET stream.
605+
// Before the fix this would call c.fail() and break the connection;
606+
// after the fix the GET is never issued.
607+
getMethodCount.Add(1)
608+
w.WriteHeader(http.StatusInternalServerError)
609+
w.Write([]byte(`{"error":"internal server error"}`)) //nolint:errcheck
610+
return
611+
}
612+
613+
// POST: respond with a valid JSON-RPC initialize result.
614+
response := map[string]interface{}{
615+
"jsonrpc": "2.0",
616+
"id": 1,
617+
"result": map[string]interface{}{
618+
"protocolVersion": "2024-11-05",
619+
"serverInfo": map[string]interface{}{
620+
"name": "unwrap-mcp",
621+
"version": "1.0.0",
622+
},
623+
},
624+
}
625+
w.Header().Set("Content-Type", "application/json")
626+
w.Header().Set("Mcp-Session-Id", "unwrap-session-1")
627+
w.WriteHeader(http.StatusOK)
628+
json.NewEncoder(w).Encode(response) //nolint:errcheck
629+
}))
630+
defer testServer.Close()
631+
632+
conn, err := NewHTTPConnection(context.Background(), "unwrap", testServer.URL, map[string]string{
633+
"Authorization": "Bearer secret-token",
634+
}, nil, "", 0, 0)
635+
636+
require.NoError(err, "Connection must succeed even when the server's GET SSE endpoint returns 500")
637+
require.NotNil(conn)
638+
defer conn.Close()
639+
640+
assert.Equal(HTTPTransportStreamable, conn.httpTransportType, "Should use streamable transport")
641+
assert.Equal("unwrap-session-1", conn.httpSessionID, "Session ID should be captured from POST response")
642+
assert.Equal(int32(0), getMethodCount.Load(), "No GET requests should be issued (standalone SSE is disabled)")
643+
}

internal/mcp/http_transport.go

Lines changed: 9 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -445,7 +445,15 @@ func tryStreamableHTTPTransport(ctx context.Context, cancel context.CancelFunc,
445445
return &sdk.StreamableClientTransport{
446446
Endpoint: url,
447447
HTTPClient: httpClient,
448-
MaxRetries: 0, // Don't retry on failure - we'll try other transports
448+
MaxRetries: -1, // Disable retries (-1 = 0 retries; SDK treats 0 as "use default: 5"). We try other transports on failure.
449+
// DisableStandaloneSSE prevents the SDK from issuing a GET request for a
450+
// persistent server-sent events stream immediately after initialization.
451+
// Some HTTP MCP servers (e.g. cloud APIs) return 5xx or keep the GET
452+
// request open indefinitely, which causes the SDK to call c.fail() and
453+
// break the connection before the gateway can send the initialized
454+
// notification. The gateway operates in request-response mode only and
455+
// does not need server-initiated messages, so this stream is unnecessary.
456+
DisableStandaloneSSE: true,
449457
}
450458
},
451459
keepAlive,

0 commit comments

Comments
 (0)