
Architecture

Vladimir Alexiev edited this page Aug 11, 2025 · 4 revisions

This page describes the high-level architecture and infrastructure for Statnett's Talk2PowerSystem project.

Overview

Talk2PowerSystem simplifies power system analysis by enabling users to construct complex queries from natural language input. These queries retrieve data and models based on the Common Information Model (CIM), stored in RDF format.

The system provides a chatbot API with a user interface where power system engineers and other stakeholders can converse with the system by asking questions and analyzing the responses. To accomplish this, the chatbot is powered by a Large Language Model integrated via LangChain. This orchestration connects the LLM with the other components of the system and with external systems that can enrich the responses, such as Cognite timeseries.

The chatbot and the LLM are responsible for interpreting user input and for generating and executing queries, making them a crucial part of the Talk2PowerSystem.

The high-level goals of the system are:

  • Generate complex queries for complex power system models via natural language inputs
  • Provide reliable and transparent answers by leveraging RDF
  • Integrate with additional systems to enrich the capabilities of the LLM and the quality of the answers

The key technologies used in the project are:

  • OpenAI GPT - Large language model used to generate natural language responses based on RDF and other structured or unstructured data
  • LangChain - Framework for orchestration of LLM services, tools, memory and other external systems
  • Azure AKS - Managed Kubernetes by Azure, providing infrastructure and automation for the Talk2PowerSystem services

The following GitHub repositories are part of the Talk2PowerSystem project:

  • Talk2PowerSystem - Main project repository that outlines the purpose of Talk2PowerSystem, the architecture and used datasets
  • Talk2PowerSystem_LLM - Repository for the chatbot's backend service and its LLM integrations via LangChain
  • Talk2PowerSystem_UI - Frontend application enabling users to interact with the Talk2PowerSystem through a conversational interface
  • Talk2PowerSystem_PM - Project management repository containing tasks and the project board

Context

The following C4 context diagram provides a high-level view of the Talk2PowerSystem, illustrating its purpose, key actors, and how the system interacts with external systems.

C4 Context

Breakdown

Core Systems

  • Talk2PowerSystem - The central software system that enables power system analytics over CIM-based RDF data, enhanced by Large Language Models (LLMs). Users can interact with the system through a chatbot interface to ask questions and retrieve insights.

Primary Actors and Key Interactions

  • Talk2PowerSystem Developer - A developer from Graphwise responsible for the full software development lifecycle (SDLC), including development and deployment of the LLM-based services.
  • Talk2PowerSystem Stakeholder - Users from Statnett or Graphwise who interact with the system to perform evaluations or conduct analysis using public data through the chatbot interface
  • Statnett Administrator - An administrator from Statnett who manages the infrastructure supporting the deployment and operation of Talk2PowerSystem

External Systems

  • OpenAI - Provides Generative Pre-trained Transformer (GPT) large language models used to interpret natural language questions, generate corresponding SPARQL queries, and produce meaningful, context-aware responses.
  • Statnett Timeseries - Supporting system that provides timeseries data for more comprehensive power system analytics

Components

The next C4 Container Diagram represents the internal architecture of the Talk2PowerSystem, showing how various software containers and external systems interact to deliver power system analytics via natural language interfaces. It breaks down the system into deployable units (containers), highlighting their responsibilities, communication paths, and the key actors involved.

C4 Container

Breakdown

Primary Actors

  • Talk2PowerSystem Developer - A developer from Graphwise responsible for the full software development lifecycle (SDLC), including development and deployment of the LLM-based services.
  • Talk2PowerSystem Stakeholder - Users from Statnett or Graphwise who interact with the system to perform evaluations or conduct analysis using public data through the chatbot interface
  • Statnett Administrator - An administrator from Statnett who manages the infrastructure supporting the deployment and operation of Talk2PowerSystem

Core Containers

  • Chatbot UI

    • Container: NGINX
    • Purpose: A user-facing web interface that enables stakeholders to engage with the LLM system via natural language queries. Communicates with the Chatbot Agent over HTTP.
    • See the Talk2PowerSystem_UI repository for more information
  • Chatbot Agent

    • Container: Python
    • Purpose: Core component of the Talk2PowerSystem, built using LangChain, which processes incoming requests from the UI via a REST API. It integrates with:
      • OpenAI for LLM interactions
      • RDF Data sources via SPARQL/GraphQL
      • Statnett Timeseries for enriched data
      • Chatbot Memory (Redis) for storing temporary session data
    • See the Talk2PowerSystem-Chat repository for more information
  • Python Environment

    • Container: JupyterHub
    • Purpose: Used by developers for local experimentation, development, and automation of deployment tasks by creating Python Notebooks. It accesses code from GitHub and interacts with other containers via HTTPS.
  • RDF Data

    • Container: GraphDB 11
    • Purpose: A triplestore that holds structured datasets (e.g., Nordic44 KG) in RDF format. Exposes SPARQL and GraphQL endpoints for querying. The Chatbot Agent pulls data from here to formulate accurate answers.
  • Chatbot Memory

    • Container: Redis
    • Purpose: Stores active and temporary chatbot session data to maintain conversational context.
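The session-memory scheme described for the Chatbot Memory container can be sketched in plain Python. This is a minimal sketch, not the actual Talk2PowerSystem implementation: the key layout (`chat:{session_id}:messages`) and class names are assumptions, and an in-memory stand-in replaces a real Redis client so the example runs without a server. A real deployment would pass in a `redis.Redis` client exposing the same `rpush`/`lrange` calls.

```python
import json

# Hypothetical key schema; the actual key layout used by the agent may differ.
def _key(session_id: str) -> str:
    return f"chat:{session_id}:messages"

class SessionMemory:
    """Conversation memory backed by any Redis-like client (rpush/lrange)."""

    def __init__(self, client):
        self.client = client  # e.g. redis.Redis(decode_responses=True)

    def append(self, session_id: str, role: str, content: str) -> None:
        message = json.dumps({"role": role, "content": content})
        self.client.rpush(_key(session_id), message)

    def history(self, session_id: str) -> list:
        return [json.loads(m) for m in self.client.lrange(_key(session_id), 0, -1)]

class FakeRedis:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self):
        self.store = {}

    def rpush(self, key, value):
        self.store.setdefault(key, []).append(value)

    def lrange(self, key, start, end):
        items = self.store.get(key, [])
        return items[start:] if end == -1 else items[start:end + 1]

memory = SessionMemory(FakeRedis())
memory.append("s1", "user", "Which substations exist?")
memory.append("s1", "assistant", "Here are the substations in the model ...")
print(memory.history("s1"))
```

Keeping each session under its own list key makes it cheap to expire whole conversations, which matches the "active and temporary" nature of the data described above.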

External Systems

  • GitHub - Source control system storing public data and deployment manifests. Developers push/pull code here, and use it in the Python Environment
  • OpenAI - Provides Generative Pre-trained Transformer (GPT) large language models used to interpret natural language questions, generate corresponding SPARQL queries, and produce meaningful, context-aware responses.
  • Statnett Timeseries - Provides power system timeseries data, which is queried by the Chatbot Agent to enrich analytical responses
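The Chatbot Agent's SPARQL access to the RDF Data container is plain HTTP against GraphDB's repository endpoint. The sketch below builds (but does not send) such a request; the endpoint URL, repository name (`nordic44`), and the CIM namespace version are assumptions for illustration, though the `/repositories/{id}` path follows GraphDB's standard endpoint convention.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical deployment values; the real GraphDB host and repository
# name are configuration-specific.
GRAPHDB_ENDPOINT = "http://graphdb:7200/repositories/nordic44"

# Example query over CIM data: list substations by name. The namespace
# shown follows the CIM100 convention; the dataset may use another version.
SPARQL = """\
PREFIX cim: <http://iec.ch/TC57/CIM100#>
SELECT ?substation ?name WHERE {
  ?substation a cim:Substation ;
              cim:IdentifiedObject.name ?name .
}
LIMIT 10
"""

def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL-over-HTTP GET request."""
    url = f"{endpoint}?{urlencode({'query': query})}"
    return Request(url, headers={"Accept": "application/sparql-results+json"})

req = build_sparql_request(GRAPHDB_ENDPOINT, SPARQL)
print(req.full_url)
```

Requesting `application/sparql-results+json` keeps the result machine-readable, so the agent can pass bindings straight back to the LLM for response composition.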

Data Flow

User Queries

The next sequence diagram illustrates the full flow of a user query through the Talk2PowerSystem — from natural language input to a data-enriched natural language response.

Sequence Diagram

Key Behaviours and Clarifications

The Chatbot Agent orchestrates:

  • Conversation memory management, either in-memory or via Redis
  • Context handling - the agent is preconfigured with instructions that are provided to OpenAI as context
  • Interpreting natural language questions via OpenAI, which decides which tools to use and what queries to pass to them
    • The Chatbot Agent does not itself generate any of the queries for GraphDB or Cognite; it relies on OpenAI to do so based on the context and the user's questions
  • Executing queries via its tools - GraphDB, Cognite timeseries, and others
  • Response composition via OpenAI
  • Querying timeseries data from Cognite only when the user's question calls for it
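The orchestration loop above can be sketched as a tool-dispatch cycle. All names here are hypothetical and the LLM is stubbed with hard-coded rules; in the real agent, OpenAI (via LangChain) makes the tool and query decisions, and composes the final reply.

```python
def sparql_tool(query: str) -> str:
    """Placeholder for executing a SPARQL query against GraphDB."""
    return f"results for: {query}"

def timeseries_tool(query: str) -> str:
    """Placeholder for fetching Cognite timeseries data."""
    return f"timeseries for: {query}"

TOOLS = {"sparql": sparql_tool, "timeseries": timeseries_tool}

def fake_llm(question: str, context: str) -> dict:
    """Stand-in for OpenAI: picks a tool and writes its query."""
    if "load" in question.lower():
        return {"tool": "timeseries", "query": "load measurements"}
    return {"tool": "sparql", "query": "SELECT ?s WHERE { ... }"}

def answer(question: str, context: str = "You are a power system assistant.") -> str:
    decision = fake_llm(question, context)               # 1. LLM picks tool + query
    result = TOOLS[decision["tool"]](decision["query"])  # 2. agent executes the tool
    return f"Answer based on: {result}"                  # 3. LLM composes the reply

print(answer("Which substations exist?"))
print(answer("Show the load for the northern region"))
```

Note that the agent code itself never writes SPARQL or timeseries queries; it only routes the LLM's decisions to the matching tool, mirroring the behaviour described above.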

See Talk2PowerSystem-Chat#tools for the full set of available tooling

Infrastructure & Deployment

The Talk2PowerSystem is deployed on Microsoft Azure, leveraging its robust cloud-native services to ensure scalability, security, and operational efficiency. The system is hosted on Azure Kubernetes Service (AKS), which orchestrates containerized workloads such as GraphDB, the Chatbot Agent together with its user interface, and supporting APIs. This architecture enables rapid deployment via Helm charts, simplifies infrastructure management, and provides a resilient foundation for delivering power system analytics using RDF data and advanced language models.

Azure Infrastructure

The following diagram presents a high-level architecture of the Talk2PowerSystem infrastructure deployed on Microsoft Azure:

Azure Overview

Breakdown

  • Azure Kubernetes Service - A managed Kubernetes platform that orchestrates and scales containerized applications across a cluster of virtual machines
    • VM Scale Sets - Compute resources in AKS that automatically scale and manage a group of identical virtual machines used to run Kubernetes node pools
    • Helm releases - Package deployments that simplify the installation and management of Kubernetes applications within AKS through reusable configuration charts
  • Azure Application Gateway - Web traffic load balancer that provides application-level routing, SSL termination, and Web Application Firewall (WAF) protection
  • Azure NAT Gateway - Network resource that enables outbound internet connectivity for Azure virtual networks while keeping internal resources private
  • Azure Managed Disks - High-performance, durable block storage volumes automatically managed by Azure, used for persistent data storage in virtual machines or Kubernetes pods
  • Azure Storage Account - Unified storage resource that provides scalable and secure storage for blobs, files, queues, and tables within Azure-based applications

RNDP Overview

The next diagram illustrates the deployment model of the Research and Development Platform (RNDP) hosted on Azure Kubernetes Service (AKS). The RNDP serves as a shared, secure, and scalable foundation for multiple research projects, such as Talk2PowerSystem.

Azure Overview

Key actors are as follows:

  • Statnett Administrators - Manage shared infrastructure, platform services, and RBAC (Role-Based Access Control) across all research projects.
  • Graphwise Developers - Deploy and configure workloads within dedicated namespaces of the RNDP environment.

GraphDB Deployment

This diagram illustrates the deployment architecture of GraphDB within the Research and Development Platform (RNDP) using a Helm chart on Azure Kubernetes Service (AKS). The deployment consists of core Kubernetes resources for configuration, persistent storage, secure communication, and scalable service management. Integration with Azure components, such as Application Gateway, NAT Gateway, and Managed Disks, provides production-grade ingress, networking, and durable storage. This setup enables projects such as Talk2PowerSystem to query and manage RDF data efficiently within an isolated, self-contained Kubernetes namespace.

Azure Overview
