Spring GenAI

A Spring AI chat agent built for Amazon Bedrock AgentCore. It calls Amazon Nova Lite through the Amazon Bedrock Converse API and exports traces and logs through the ADOT Java agent to X-Ray and CloudWatch Logs.

Prerequisites

  • Java 25 (Amazon Corretto recommended)
  • Docker
  • AWS CLI configured with valid credentials
  • Terraform >= 1.14.4 (for AgentCore deployment)
  • AWS account with Bedrock model access enabled for us.amazon.nova-lite-v1:0

Project Structure

src/main/java/com/gpazevedo/spring_genai/
  SpringGenaiApplication.java  # Spring Boot entry point
  InvocationsController.java   # POST /invocations — SSE streaming chat (AgentCore contract)
  PingController.java          # GET /ping — health check (AgentCore contract)
  InvocationRequest.java       # Request model: {"input": {"prompt": "..."}}
  JwtUtil.java                 # Extracts user ID from JWT (signature already validated by AgentCore)
  AwsCredentialsCheck.java     # Validates AWS credentials at startup
  JacksonCompatConfig.java     # Jackson ObjectMapper configuration
terraform/                     # Infrastructure as Code for AgentCore deployment
docker-compose.yml             # Local observability stack (Jaeger, Prometheus, Loki, Grafana)

Running Locally

1. Build

./gradlew build

2. Start the observability stack (optional)

docker compose up -d

This starts:

| Service | URL | Purpose |
|---|---|---|
| OTel Collector | localhost:4318 (HTTP) | Receives telemetry |
| Jaeger | http://localhost:16686 | Traces |
| Prometheus | http://localhost:9090 | Metrics |
| Loki | http://localhost:3100 | Logs |
| Grafana | http://localhost:3000 | Dashboards (admin/admin) |

3. Run the application

export AWS_ACCESS_KEY_ID=<your-key>
export AWS_SECRET_ACCESS_KEY=<your-secret>
export AWS_REGION=us-east-1
export SPRING_PROFILES_ACTIVE=local   # activates application-local.properties (OTLP → localhost:4318)

./gradlew bootRun

The app starts on port 8080.

4. Test locally

Health check:

curl http://localhost:8080/ping

Expected response:

{"status":"Healthy","time_of_last_update":1738972800}

Chat (SSE streaming):

curl -N -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "What is Spring AI?"}}'

Chat with session ID (conversation memory):

SESSION_ID=my-local-session-1

# First message
curl -N -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: $SESSION_ID" \
  -d '{"input": {"prompt": "My name is Alice."}}'

# Follow-up — the agent remembers the previous message
curl -N -X POST http://localhost:8080/invocations \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: $SESSION_ID" \
  -d '{"input": {"prompt": "What is my name?"}}'
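The same two-step exchange can be scripted. The following is a minimal sketch using only the Python standard library; the function names `build_invocation` and `stream` are hypothetical, and the base URL assumes the app is running locally on port 8080 as described above.

```python
import json
import urllib.request

SESSION_HEADER = "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id"


def build_invocation(prompt, session_id=None, base_url="http://localhost:8080"):
    """Build (but do not send) a POST /invocations request."""
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if session_id:
        # Reusing the same session ID across calls is what links the messages.
        headers[SESSION_HEADER] = session_id
    return urllib.request.Request(
        f"{base_url}/invocations", data=body, headers=headers, method="POST"
    )


def stream(request):
    """Send the request and yield response lines as they arrive."""
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            yield raw_line.decode("utf-8").rstrip("\r\n")
```

To reproduce the curl example, call `stream(build_invocation("My name is Alice.", "my-local-session-1"))`, then repeat with the follow-up prompt and the same session ID, printing each yielded line.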

Deploying to AgentCore

1. Build and push the container image to ECR

./build-and-push.sh

This script:

  • Builds the Spring Boot jar
  • Builds an ARM64 Docker image (cross-compiles with QEMU if needed)
  • Creates an ECR repository and pushes the image

2. Deploy infrastructure with Terraform

./deploy.sh           # normal deploy (update runtime in-place)
./deploy.sh --clean   # delete and recreate the runtime (kills all active sessions); automatically syncs runtime_id_suffix afterward

This provisions:

  • AgentCore Runtime — runs the container in an isolated microVM
  • Cognito User Pool — JWT/OAuth 2.0 authentication for inbound requests
  • IAM Role — permissions for Bedrock model invocation, ECR pull, CloudWatch, X-Ray
  • AgentCore Memory — conversation state (30-day expiry)
  • X-Ray — sampling rules, trace groups with anomaly detection, Transaction Search
  • CloudWatch log deliveries — APPLICATION_LOGS and TRACES vended to CloudWatch/X-Ray

After deployment, note the Terraform outputs:

cognito_user_pool_id     = "us-east-1_xxxxxxx"
cognito_client_id        = "xxxxxxxxxxxxxxxxxxxxxxxxxx"
memory_id                = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
runtime_name             = "spring_genai_123456789012"
runtime_id               = "spring_genai_123456789012-xxxxxxxxxx"
runtime_arn              = "arn:aws:bedrock-agentcore:us-east-1:123456789012:runtime/spring_genai_123456789012-xxxxxxxxxx"
container_uri            = "123456789012.dkr.ecr.us-east-1.amazonaws.com/spring-genai-123456789012:latest"
xray_group_arn           = "arn:aws:xray:us-east-1:123456789012:group/spring-genai-production"
xray_sampling_rule_name  = "spring-genai-production"
xray_indexing_percentage = 100

3. Test the deployed agent

Get a Cognito JWT token:

USER_POOL_ID=$(cd terraform && terraform output -raw cognito_user_pool_id)
CLIENT_ID=$(cd terraform && terraform output -raw cognito_client_id)
RUNTIME_ARN=$(cd terraform && terraform output -raw runtime_arn)
REGION=us-east-1

# Create a test user (one-time)
aws cognito-idp admin-create-user \
  --user-pool-id $USER_POOL_ID \
  --username testuser \
  --temporary-password 'TempPass1!' \
  --message-action SUPPRESS \
  --region $REGION

aws cognito-idp admin-set-user-password \
  --user-pool-id $USER_POOL_ID \
  --username testuser \
  --password 'TestPass1!' \
  --permanent \
  --region $REGION

# Get JWT token (must use AccessToken — it contains the client_id claim that AgentCore validates)
TOKEN=$(aws cognito-idp initiate-auth \
  --client-id $CLIENT_ID \
  --auth-flow USER_PASSWORD_AUTH \
  --auth-parameters USERNAME=testuser,PASSWORD='TestPass1!' \
  --region $REGION \
  --query 'AuthenticationResult.AccessToken' \
  --output text)
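Before calling the runtime, it can help to confirm that the token really is an access token. The sketch below decodes the JWT payload without verifying the signature, so it is a local sanity check only. The helper names are hypothetical; the `token_use` and `client_id` claims are the ones Cognito places in access tokens.

```python
import base64
import json


def decode_claims(jwt_token):
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = jwt_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def is_access_token_for(jwt_token, client_id):
    """True if the token is a Cognito access token issued to client_id."""
    claims = decode_claims(jwt_token)
    return (
        claims.get("token_use") == "access"
        and claims.get("client_id") == client_id
    )
```

An ID token fails this check because its `token_use` claim is `id` and it carries `aud` instead of `client_id`.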

Invoke the agent:

The runtime uses JWT/OAuth authentication (Cognito), so the AWS CLI command invoke-agent-runtime cannot be used directly: it signs requests with SigV4, which results in an AccessDeniedException. Use curl with the Bearer token instead.

The AgentCore data plane API is:

POST https://bedrock-agentcore.{region}.amazonaws.com/runtimes/{agentRuntimeArn}/invocations

# URL-encode the ARN for use in the path (passed as an argument so the value is never interpolated into Python source)
ENCODED_ARN=$(python3 -c "import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=''))" "$RUNTIME_ARN")
AGENTCORE_URL="https://bedrock-agentcore.${REGION}.amazonaws.com/runtimes/${ENCODED_ARN}/invocations"

# Simple invocation
curl -N -X POST "$AGENTCORE_URL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "Hello, what can you do?"}}'

# Generate a session ID (must be >= 33 chars; a UUID is 36)
SESSION_ID=$(uuidgen || cat /proc/sys/kernel/random/uuid)

# Invoke with a session ID for conversation memory
curl -N -X POST "$AGENTCORE_URL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: $SESSION_ID" \
  -d '{"input": {"prompt": "My name is Alice."}}'

# Follow-up in the same session — the agent remembers context
curl -N -X POST "$AGENTCORE_URL" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -H "X-Amzn-Bedrock-AgentCore-Runtime-Session-Id: $SESSION_ID" \
  -d '{"input": {"prompt": "What is my name?"}}'
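The URL encoding and the session ID rule used in the commands above can be sketched in Python as follows. The function names are hypothetical, and the 33-character minimum is taken from the comment in the shell snippet.

```python
import urllib.parse
import uuid

MIN_SESSION_ID_LENGTH = 33  # minimum length stated in the shell comment above


def agentcore_url(region, runtime_arn):
    """Build the data plane invocation URL with the ARN percent-encoded."""
    # safe="" also encodes "/" and ":", which the ARN contains.
    encoded_arn = urllib.parse.quote(runtime_arn, safe="")
    return (
        f"https://bedrock-agentcore.{region}.amazonaws.com"
        f"/runtimes/{encoded_arn}/invocations"
    )


def new_session_id():
    """Return a random UUID string (36 characters)."""
    return str(uuid.uuid4())


def check_session_id(session_id):
    """Raise ValueError if the session ID is too short for the deployed runtime."""
    if len(session_id) < MIN_SESSION_ID_LENGTH:
        raise ValueError(
            f"session ID must be at least {MIN_SESSION_ID_LENGTH} characters"
        )
    return session_id
```

The short ID used in the local example (`my-local-session-1`) would be rejected by `check_session_id`, which is why the deployed example generates a UUID.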

4. Monitor

Check runtime status:

RUNTIME_ID=$(cd terraform && terraform output -raw runtime_id)
REGION=us-east-1

aws bedrock-agentcore-control get-agent-runtime \
  --agent-runtime-id $RUNTIME_ID \
  --region $REGION

Tail application logs:

RUNTIME_ID=$(cd terraform && terraform output -raw runtime_id)

# Container stdout/stderr
aws logs tail "/aws/bedrock-agentcore/runtimes/${RUNTIME_ID}-DEFAULT" --follow

# OTEL structured logs (JSON, stream written by ADOT agent)
aws logs tail "/aws/bedrock-agentcore/runtimes/${RUNTIME_ID}-DEFAULT" \
  --log-stream-names otel-rt-logs --follow

AWS Console — Observability URLs:

RUNTIME_ID=$(cd terraform && terraform output -raw runtime_id)
REGION=us-east-1

LOG_GROUP=$(python3 -c "import urllib.parse; print(urllib.parse.quote('/aws/bedrock-agentcore/runtimes/${RUNTIME_ID}-DEFAULT', safe=''))")

echo "GenAI Observability (AgentCore agents):"
echo "  https://${REGION}.console.aws.amazon.com/cloudwatch/home?region=${REGION}#/gen-ai-observability/agent-core/agents"

echo "X-Ray Traces (spring-genai):"
echo "  https://${REGION}.console.aws.amazon.com/xray/home?region=${REGION}#/traces?filter=service%28%22spring-genai%22%29"

echo "CloudWatch Logs (container + OTEL):"
echo "  https://${REGION}.console.aws.amazon.com/cloudwatch/home?region=${REGION}#logsV2:log-groups/log-group/${LOG_GROUP}"

API Contract

| Endpoint | Method | Purpose |
|---|---|---|
| /invocations | POST | Chat interaction (SSE streaming) |
| /ping | GET | Health check (delegates to Spring Actuator) |

POST /invocations

Request:

{
  "input": {
    "prompt": "Your message here"
  }
}
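For clients or tests that need to check a payload against this shape, a minimal validation sketch follows. The function name `extract_prompt` and the non-empty check are assumptions for illustration; this is not the validation performed by InvocationRequest.java.

```python
import json


def extract_prompt(raw_body):
    """Parse a request body and return the prompt, or raise ValueError."""
    payload = json.loads(raw_body)
    try:
        prompt = payload["input"]["prompt"]
    except (KeyError, TypeError):
        # Covers missing keys and bodies that are not nested JSON objects.
        raise ValueError('expected {"input": {"prompt": "..."}}') from None
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    return prompt
```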

Headers injected by AgentCore:

| Header | Purpose |
|---|---|
| Authorization | Bearer <JWT>; the user identity is extracted for per-user conversation memory |
| X-Amzn-Bedrock-AgentCore-Runtime-Session-Id | Session ID for conversation memory |
| X-Amzn-Trace-Id | X-Ray trace propagation |
| traceparent / tracestate | W3C trace context |

Response: Content-Type: text/event-stream — Server-Sent Events with streamed text chunks.
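A client reading this stream needs to split it into events. The sketch below handles the standard Server-Sent Events framing (`data:` fields, blank-line separators, `:` comments). The function name is hypothetical, and the content of each chunk (plain text versus JSON) is not specified here, so the payload is returned as-is.

```python
def iter_sse_data(lines):
    """Yield the data payload of each Server-Sent Event in an iterable of lines."""
    buffered = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffered:
                yield "\n".join(buffered)
                buffered = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one optional space after the colon is dropped
        if field == "data":
            buffered.append(value)
    # Lenient choice: flush a trailing event even without a final blank line
    # (a strict SSE parser would discard it).
    if buffered:
        yield "\n".join(buffered)
```

Any iterable of decoded lines works as input, for example the lines read from an open HTTP response.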

GET /ping

Response:

{
  "status": "Healthy",
  "time_of_last_update": 1738972800
}

Returns Healthy when Spring Actuator health is UP, Unhealthy otherwise.
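The mapping can be sketched as below. This is an illustration in Python, not the code in PingController.java; the function name is hypothetical, and using the current time for `time_of_last_update` is an assumption.

```python
import time


def ping_payload(actuator_status, now=None):
    """Map a Spring Actuator health status string to the /ping response body."""
    status = "Healthy" if actuator_status == "UP" else "Unhealthy"
    # Assumption: the timestamp is the current time in epoch seconds.
    timestamp = int(time.time() if now is None else now)
    return {"status": status, "time_of_last_update": timestamp}
```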

Configuration

Key properties in application.properties:

| Setting | Default | Property to override |
|---|---|---|
| OTLP endpoint (local) | http://localhost:4318 | management.opentelemetry.tracing.export.otlp.endpoint in application-local.properties |
| Bedrock region | us-east-1 | spring.ai.bedrock.aws.region |
| Model | Amazon Nova Lite | spring.ai.bedrock.converse.chat.options.model |
| Temperature | 0.7 | spring.ai.bedrock.converse.chat.options.temperature |
| Max tokens | 1024 | spring.ai.bedrock.converse.chat.options.max-tokens |
| Trace sampling | 5% (local) / 100% (production) | management.tracing.sampling.probability (100% is suitable only for low-traffic testing; reduce it for production at scale) |

Teardown

cd terraform
terraform destroy

This removes the Terraform-managed AWS resources (AgentCore runtime, Cognito, IAM roles, X-Ray config). The ECR repository is not managed by Terraform, so delete it manually if needed:

ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
aws ecr delete-repository \
  --repository-name "spring-genai-${ACCOUNT_ID}" \
  --force \
  --region us-east-1

About

A base Spring Boot 4 and Spring AI 2 application with full observability, deployed on AWS Bedrock AgentCore.
