
Add support for multiple ClickHouse configurations #48


Open
ZLBillShaw wants to merge 10 commits into main from feature/remonxiao_add_multi_clickHouses_config

Conversation

ZLBillShaw

This PR adds the ability to support multiple ClickHouse configurations in the system. It introduces changes to config parsing and adjusts connection logic accordingly.

CLAassistant commented May 20, 2025

CLA assistant check
All committers have signed the CLA.

serprex (Member) commented Jun 5, 2025

This PR seems to contain changes outside the scope of its description.

Multiconfig seems like a nonstandard convention which we would prefer not to go forward with. Could you describe your use case & the challenge you have adjusting environment variables based on your deployment environment?

serprex self-requested a review on June 5, 2025 17:55
ZLBillShaw (Author)

Thanks for the feedback!

Let me clarify the use case first:

In our deployment, we have multiple ClickHouse clusters across different environments and regions (e.g., per-tenant or per-datacenter deployments). We frequently need to query different clusters depending on the context, often in parallel or without restarting the collector.

The existing single-config model forces us to restart or redeploy with updated environment variables every time we want to switch ClickHouse targets, which is operationally inconvenient and not scalable for larger multi-tenant environments.

That's why we proposed a multi-config approach: support multiple configured ClickHouse targets that can be selected dynamically at runtime. This lets a single collector instance route data to the appropriate ClickHouse cluster without restarts or redeployments.
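Concretely, the shape could look something like the sketch below (the CLICKHOUSE_CLUSTERS variable name and schema are illustrative, not the PR's final format): one setting maps cluster aliases to connection details, and the target is picked by alias at runtime.

# multi_config_sketch.py (hypothetical, for illustration only)
import json
import os

# Hypothetical: one env var maps cluster aliases to connection settings,
# instead of a single fixed CLICKHOUSE_HOST/PORT/USER set.
os.environ["CLICKHOUSE_CLUSTERS"] = json.dumps({
    "tenant-a": {"host": "ch-a.internal", "port": 8443, "user": "reader"},
    "dc-eu": {"host": "ch-eu.internal", "port": 8443, "user": "reader"},
})

clusters = json.loads(os.environ["CLICKHOUSE_CLUSTERS"])
print(clusters["tenant-a"]["host"])  # -> ch-a.internal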

serprex requested a review from iskakaushik on June 6, 2025 02:12
ddobrinskiy commented Jun 11, 2025

I second the need for supporting multiple ClickHouse clusters as MCP servers.

Probably a better way would be to declare different ClickHouse clusters as separate MCP servers (to keep things simple: one MCP server per ClickHouse instance).

Here's what I tried in my cursor config:

// mcp.json
{
  "mcpServers": {
    "mcp-clickhouse-A": {
      "comment": "Clickhouse MCP server for the A region",
      "command": "uv",
      "args": [ "run", "--with", "mcp-clickhouse", "--python", "3.13", "mcp-clickhouse"
      ],
      "env": {
        "CLICKHOUSE_HOST": "<HOST_A>",
        "CLICKHOUSE_PORT": "<PORT_A>",
        "CLICKHOUSE_USER": "<USER_A>",
        "CLICKHOUSE_PASSWORD": "<PASSWORD_A>"
      }
    },
    "mcp-clickhouse-B": {
      "comment": "Clickhouse MCP server for the B region",
      "command": "uv",
      "args": [ "run", "--with", "mcp-clickhouse", "--python", "3.13", "mcp-clickhouse"
      ],
      "env": {
        "CLICKHOUSE_HOST": "<HOST_B>",
        "CLICKHOUSE_PORT": "<PORT_B>",
        "CLICKHOUSE_USER": "<USER_B>",
        "CLICKHOUSE_PASSWORD": "<PASSWORD_B>"
      }
    }
  }
}

The issue is that, since both server configs have the same args, both servers end up connected to instance A in the example above.

I'm not yet sure what the correct approach (or fix) is to enable this use case, but I imagine it would cover the case for this PR and be worth adding to the docs as an example.
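For context, this setup assumes each launched server process reads its connection settings from its own environment at startup; a minimal sketch of that pattern (not mcp-clickhouse's actual code) would be:

# env_config_sketch.py (illustrative, not mcp-clickhouse's actual code)
import os

def load_clickhouse_config() -> dict:
    # Each server process builds its connection settings from its own
    # environment, so two processes launched with different "env" blocks
    # should end up targeting different hosts.
    return {
        "host": os.environ["CLICKHOUSE_HOST"],
        "port": int(os.environ.get("CLICKHOUSE_PORT", "8443")),
        "user": os.environ.get("CLICKHOUSE_USER", "default"),
        "password": os.environ.get("CLICKHOUSE_PASSWORD", ""),
    }

print(load_clickhouse_config()["host"])

If both servers still report instance A despite distinct env blocks, that suggests the env values are not reaching the spawned processes, rather than a problem with the config file itself.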

ZLBillShaw (Author)

Thank you so much for your thoughtful feedback!

I completely understand the simplicity of “one MCP server per ClickHouse instance” and agree it makes configuration clear and easy in basic single-cluster setups.

However, when operating in environments that require access to multiple logical clusters or replicas from a single MCP server, spinning up a separate server for each cluster becomes cumbersome and less efficient. Some key drawbacks:

  1. Agent performance degradation: multiple MCP servers mean multiple tools, and the agent must decide which to call, introducing complexity and potential latency.
  2. Resource inefficiency: launching many MCP server processes consumes more memory, CPU, and operational overhead (logging, monitoring, orchestration).

My PR aims to support multiple cluster connections in a single MCP server by letting the server hold a set of named clusters internally and route each request based on a cluster_alias (or similar) parameter in the agent's call; a rough sketch follows the list below. This approach:

  • Keeps config minimal: just one MCP server definition plus a mapping of names → connections.
  • Allows the Agent to choose a cluster at runtime, eliminating redundant tooling.
  • Saves resources and keeps observability centralized in one process.
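
Here is that sketch (cluster_alias, CLUSTERS, and run_query are illustrative names, and clickhouse_connect is assumed as the driver; none of this is the PR's exact code):

# cluster_routing_sketch.py (illustrative, assuming the clickhouse-connect driver)
import clickhouse_connect

# Hypothetical registry of named cluster configs held by one MCP server.
CLUSTERS = {
    "region-a": {"host": "ch-a.internal", "port": 8443, "username": "reader"},
    "region-b": {"host": "ch-b.internal", "port": 8443, "username": "reader"},
}

_clients = {}

def get_client(cluster_alias: str):
    # Lazily create and cache one client per named cluster.
    if cluster_alias not in _clients:
        _clients[cluster_alias] = clickhouse_connect.get_client(**CLUSTERS[cluster_alias])
    return _clients[cluster_alias]

def run_query(sql: str, cluster_alias: str = "region-a") -> list:
    # The agent selects the target cluster per call via cluster_alias.
    return get_client(cluster_alias).query(sql).result_rows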

Happy to tweak any details (e.g., param names) based on review feedback.

ZLBillShaw force-pushed the feature/remonxiao_add_multi_clickHouses_config branch from 42c7a13 to db49c69 on July 1, 2025 04:25