lleverage-ai/lleverage-active-monitoring

Lleverage Active Monitoring

Automated monitoring solution for the Lleverage platform that runs health checks every 15 minutes and sends Slack notifications when components fail or the platform is unreachable.

Features

  • ✅ Runs active monitoring checks every 15 minutes via Cloud Scheduler
  • 🔔 Sends formatted Slack notifications on failures
  • 📊 Monitors both test components and AI providers
  • 🚨 Alerts on platform outages or component failures
  • 📝 Beautiful Slack message formatting with emojis and structured blocks

Setup

Prerequisites

  1. Google Cloud Platform account with billing enabled
  2. Slack workspace with Incoming Webhooks enabled
  3. Node.js 14+ (for local testing)

1. Create a Slack Incoming Webhook

  1. Go to https://api.slack.com/apps
  2. Create a new app or select an existing one
  3. Navigate to "Incoming Webhooks"
  4. Activate Incoming Webhooks
  5. Click "Add New Webhook to Workspace"
  6. Select the channel where you want notifications
  7. Copy the webhook URL

2. Install Dependencies

npm install

3. Configure Environment Variables

Set the following environment variables:

export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
export SMOKE_TEST_API_URL="https://lqnc85.lleverage.run/li8yb6pr"  # Optional, defaults to this
export TEST_TEXT="Test"  # Optional: text to send in the API request
export TEST_FILE_PATH="/path/to/file"  # Optional: file to upload

4. Deploy to Google Cloud Functions

Option A: Using the Deployment Script

Make the script executable and run it:

chmod +x deploy.sh
export SLACK_WEBHOOK_URL="your-slack-webhook-url"
export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
./deploy.sh

Option B: Manual Deployment

# Set your project
gcloud config set project YOUR_PROJECT_ID

# Deploy the function
gcloud functions deploy activeMonitoring \
  --gen2 \
  --runtime=nodejs18 \
  --region=us-central1 \
  --source=. \
  --entry-point=activeMonitoring \
  --trigger-http \
  --allow-unauthenticated \
  --timeout=540s \
  --memory=256MB \
  --set-env-vars="SMOKE_TEST_API_URL=https://lqnc85.lleverage.run/li8yb6pr,SLACK_WEBHOOK_URL=your-slack-webhook-url"

# Create the scheduler job
gcloud scheduler jobs create http active-monitoring-scheduler \
  --location=us-central1 \
  --schedule="*/15 * * * *" \
  --uri="$(gcloud functions describe activeMonitoring --gen2 --region=us-central1 --format='value(serviceConfig.uri)')" \
  --http-method=POST \
  --time-zone="UTC" \
  --description="Runs Lleverage platform active monitoring every 15 minutes"

5. Test Locally

You can test the function locally before deploying:

export SLACK_WEBHOOK_URL="your-slack-webhook-url"
node index.js

How It Works

  1. Cloud Scheduler triggers the function every 15 minutes
  2. Cloud Function makes a POST request to the monitoring API
  3. Response Analysis:
    • If the request fails → Sends critical failure notification
    • If any component returns false → Sends component failure notification
    • If all tests pass → No notification (silent success)
  4. Slack Notifications are formatted with:
    • Status indicators (🚨 for critical, ⚠️ for warnings)
    • Failed components list
    • Passing components summary
    • Timestamp and error details
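
The response analysis in step 3 can be sketched as a pure function. This is a minimal sketch, not the repository's code: the name `classifyResult` and the response shape (an object with `tests` and `providers` maps from component name to boolean) are assumptions, since the actual API payload is not documented here.

```javascript
// Hypothetical sketch: decide which notification (if any) to send.
// Assumed response shape: { tests: { name: boolean }, providers: { name: boolean } }
function classifyResult(error, body) {
  // A failed request means the platform could not be reached.
  if (error) {
    return { level: "critical", details: String(error.message || error) };
  }

  const failed = { tests: [], providers: [] };
  const passing = { tests: [], providers: [] };

  for (const group of ["tests", "providers"]) {
    for (const [name, ok] of Object.entries((body && body[group]) || {})) {
      // Only a literal false counts as a failure; any other value is treated as passing.
      (ok === false ? failed : passing)[group].push(name);
    }
  }

  const anyFailed = failed.tests.length > 0 || failed.providers.length > 0;
  // Silent success: level "ok" means no notification is sent.
  return anyFailed ? { level: "warning", failed, passing } : { level: "ok" };
}
```

Keeping the decision separate from the HTTP call and the Slack request makes each of the three outcomes easy to check locally without touching the network.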

Slack Notification Format

Critical Failure (Platform Down)

🚨 Lleverage Platform Active Monitoring - CRITICAL FAILURE
Status: ❌ Platform Unreachable
Error Details: [error message]

Component Failure

⚠️ Lleverage Platform Active Monitoring - Component Failure
Status: ⚠️ Some components failed
Failed Components:
  Tests: [component1], [component2]
  Providers: [provider1]
Passing Tests: ✅ [list]
Passing Providers: ✅ [list]
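
As an illustration of how the component failure message above could map onto Slack's Block Kit, here is a hedged sketch of a payload builder. The function name `buildComponentFailureMessage` and the `failed`/`passing` argument shapes are hypothetical; the block layout the project actually sends may differ.

```javascript
// Hypothetical sketch: build an Incoming Webhook payload for a component failure.
// `failed` and `passing` are assumed to look like { tests: string[], providers: string[] }.
function buildComponentFailureMessage(failed, passing, now = new Date()) {
  const list = (names) => (names.length > 0 ? names.join(", ") : "none");
  const title = "⚠️ Lleverage Platform Active Monitoring - Component Failure";

  return {
    text: title, // plain-text fallback for clients that do not render blocks
    blocks: [
      { type: "header", text: { type: "plain_text", text: title, emoji: true } },
      {
        type: "section",
        text: {
          type: "mrkdwn",
          text:
            "*Status:* ⚠️ Some components failed\n" +
            "*Failed Components:*\n" +
            `  Tests: ${list(failed.tests)}\n` +
            `  Providers: ${list(failed.providers)}`,
        },
      },
      {
        type: "section",
        text: {
          type: "mrkdwn",
          text:
            `*Passing Tests:* ✅ ${list(passing.tests)}\n` +
            `*Passing Providers:* ✅ ${list(passing.providers)}`,
        },
      },
      {
        type: "context",
        elements: [{ type: "mrkdwn", text: `Checked at ${now.toISOString()}` }],
      },
    ],
  };
}
```

The returned object can be serialized with `JSON.stringify` and sent as the body of a POST request to the webhook URL.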

Monitoring

View Logs

gcloud functions logs read activeMonitoring --gen2 --region=us-central1 --limit=50

Test the Function Manually

curl -X POST "https://[YOUR-REGION]-[YOUR-PROJECT].cloudfunctions.net/activeMonitoring"

Check Scheduler Status

gcloud scheduler jobs describe active-monitoring-scheduler --location=us-central1

Configuration

Custom Schedule

Edit the schedule in deploy.sh or update the scheduler job:

gcloud scheduler jobs update http active-monitoring-scheduler \
  --location=us-central1 \
  --schedule="*/30 * * * *"  # Every 30 minutes instead

Common cron schedules:

  • Every 15 minutes: */15 * * * *
  • Every hour: 0 * * * *
  • Every day at 9 AM: 0 9 * * *

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| SLACK_WEBHOOK_URL | Yes | - | Slack incoming webhook URL |
| SMOKE_TEST_API_URL | No | https://lqnc85.lleverage.run/li8yb6pr | API endpoint to test |
| TEST_TEXT | No | Test | Text parameter for API request |
| TEST_FILE_PATH | No | - | Optional file path to upload |
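
A minimal sketch of how these variables and their defaults could be read in Node.js. The helper name `loadConfig` and the returned property names are hypothetical and not taken from the repository's `index.js`.

```javascript
// Hypothetical sketch: read configuration using the documented defaults.
function loadConfig(env = process.env) {
  const slackWebhookUrl = env.SLACK_WEBHOOK_URL;
  if (!slackWebhookUrl) {
    // The only required variable; fail fast rather than silently skip alerts.
    throw new Error("SLACK_WEBHOOK_URL is required");
  }
  return {
    slackWebhookUrl,
    apiUrl: env.SMOKE_TEST_API_URL || "https://lqnc85.lleverage.run/li8yb6pr",
    testText: env.TEST_TEXT || "Test",
    testFilePath: env.TEST_FILE_PATH || null, // optional, no default
  };
}
```

Passing the environment as a parameter keeps the function easy to exercise with a plain object instead of the real process environment.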

Troubleshooting

Function Not Running

  1. Check Cloud Scheduler job status:
    gcloud scheduler jobs describe active-monitoring-scheduler --location=us-central1
  2. Check function logs for errors
  3. Verify environment variables are set correctly

Notifications Not Sending

  1. Verify SLACK_WEBHOOK_URL is correct
  2. Test the webhook URL manually:
    curl -X POST -H 'Content-type: application/json' \
      --data '{"text":"Test message"}' \
      YOUR_SLACK_WEBHOOK_URL

API Request Failing

  1. Check if the API endpoint is accessible
  2. Verify network connectivity from Cloud Functions
  3. Check API endpoint authentication if required

Costs

This solution uses:

  • Cloud Functions Gen 2: Pay per invocation (~$0.40 per million requests)
  • Cloud Scheduler: Free for up to 3 jobs per month, then $0.10 per job per month
  • Networking: Minimal egress costs

Estimated monthly cost: ~$0.50 - $2.00 for 15-minute intervals (2,880 invocations/month)
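
The invocation count follows directly from the schedule interval. The short sketch below reproduces that arithmetic and, using only the per-request rate quoted above, estimates the request fee before any free tier or compute-time charges. Both function names are illustrative.

```javascript
// Number of scheduler-triggered invocations in a month of `days` days.
function invocationsPerMonth(intervalMinutes, days = 30) {
  const perDay = (24 * 60) / intervalMinutes;
  return perDay * days;
}

// Request fee in USD at a given rate per million requests (rate taken from the list above).
function requestFeeUsd(invocations, ratePerMillion = 0.4) {
  return (invocations / 1e6) * ratePerMillion;
}

console.log(invocationsPerMonth(15)); // 2880
console.log(invocationsPerMonth(30)); // 1440
console.log(requestFeeUsd(invocationsPerMonth(15)).toFixed(4)); // 0.0012
```

At this volume the per-request fee is a small fraction of a cent, so under these figures the rest of the estimate would have to come from compute time, networking, and Cloud Scheduler.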

License

MIT
