This project provides a Mastra workflow system for database introspection and natural language to SQL conversion. It includes tools for analyzing database schemas, generating SQL queries from natural language descriptions, and executing queries safely.
- Database Introspection: Automatically analyzes PostgreSQL database schemas including tables, columns, relationships, and indexes
- Natural Language to SQL: Converts natural language queries into SQL using OpenAI's GPT models
- Schema Presentation: Generates human-readable documentation of database schemas
- Safe Query Execution: Only allows SELECT queries for security
- Workflow Integration: Built using Mastra workflows for orchestration and management
src/
βββ mastra/
β βββ agents/
β β βββ sql-agent.ts # SQL agent for query generation
β βββ tools/
β β βββ database-introspection-tool.ts # Database schema analysis
β β βββ database-seeding-tool.ts # Database seeding
β β βββ sql-generation-tool.ts # Natural language to SQL conversion
β β βββ sql-execution-tool.ts # Safe SQL query execution
β βββ workflows/
β β βββ database-query-workflow.ts # Main workflow orchestration
β βββ index.ts # Mastra instance configuration
Analyzes a PostgreSQL database to extract:
- Table structure and metadata
- Column definitions with types and constraints
- Primary key and foreign key relationships
- Index definitions
- Row counts for each table
Input: Database connection string Output: Complete schema information with summary statistics
Seeds databases with sample data for testing:
- Creates cities table with proper schema
- Imports data from CSV or generates sample data
- Handles batch insertions efficiently
- Returns seeding statistics and metadata
Input: Database connection string Output: Seeding results with record counts and success status
Converts natural language queries to SQL using OpenAI's GPT-4:
- Analyzes database schema context
- Generates optimized SELECT queries
- Provides confidence scores and explanations
- Lists assumptions and tables used
Input: Natural language query + database schema Output: SQL query with metadata and explanations
Safely executes SQL queries:
- Restricts to SELECT queries only
- Manages connection pooling
- Provides detailed error handling
- Returns structured results
Input: Connection string + SQL query Output: Query results or error information
The SQL Agent (sqlAgent
) now has the same capabilities as the workflow, providing a conversational interface for database operations:
const sqlAgent = mastra.getAgent('sqlAgent');
const result = await sqlAgent.generate(
[
{
role: 'user',
content: 'Connect to postgresql://user:password@localhost:5432/database and analyze the schema',
},
],
{ maxSteps: 5 },
);
const result = await sqlAgent.generate(
[
{
role: 'user',
content:
'Seed the database with comprehensive business data including companies, employees, projects, and skills',
},
],
{ maxSteps: 3 },
);
const result = await sqlAgent.generate(
[
{
role: 'user',
content: 'Show me the top 10 most populous cities in Europe',
},
],
{ maxSteps: 5 },
);
β Multi-tool Orchestration - Automatically uses the right tools for each task β Schema-Aware Queries - Understands database structure for accurate SQL generation β Safe Execution - Only allows SELECT queries with proper error handling β Conversational Interface - Natural language interaction with detailed explanations β Complete Workflow - Handles connection β seeding β introspection β querying β execution
The main workflow (databaseQueryWorkflow
) is a multi-step interactive workflow that performs:
- Suspends to collect database connection string from user
- Validates connection to ensure database is accessible
- Suspends to ask if user wants to seed database with sample data
- Creates cities table with sample data if requested
- Provides immediate data for testing and demonstration
- Automatically introspects database schema (tables, columns, relationships, indexes)
- Generates human-readable schema presentation
- Analyzes database structure and relationships
- Suspends to collect natural language query from user
- Shows database schema information to help user formulate queries
- Generates SQL query using AI with confidence scores and explanations
- Suspends to show generated SQL and get user approval
- Allows user to modify the SQL query if needed
- Executes the approved/modified query against the database
- Returns query results with metadata
Usage:
const workflow = mastra.getWorkflow('databaseQueryWorkflow');
const run = await workflow.createRunAsync();
// Start workflow (will suspend for connection string)
let result = await run.start({ inputData: {} });
// Step 1: Provide connection string
result = await run.resume({
step: 'get-connection',
resumeData: { connectionString: 'postgresql://...' },
});
// Step 2: Choose whether to seed database
result = await run.resume({
step: 'seed-database',
resumeData: { seedDatabase: true },
});
// Step 3: Database introspection happens automatically
// Step 4: Provide natural language query
result = await run.resume({
step: 'generate-sql',
resumeData: { naturalLanguageQuery: 'Show me top 10 cities by population' },
});
// Step 5: Review and approve SQL
result = await run.resume({
step: 'review-and-execute',
resumeData: {
approved: true,
modifiedSQL: 'optional modified query',
},
});
- Install Dependencies:
pnpm install
- Environment Setup:
Create a
.env
file with your database connection:
OPENAI_API_KEY=your-openai-api-key
- Only SELECT queries are allowed for security
- Connection strings should be securely managed
- The system uses connection pooling for efficiency
- All database operations are logged for audit trails
β Database Schema Introspection - Automatically analyzes database structure β Database Seeding - Optional sample data creation for testing and demos β Human-readable Documentation - Generates beautiful schema presentations β Natural Language to SQL - AI-powered query generation with explanations β Interactive Workflows - Multi-step suspend/resume for human-in-the-loop β Conversational Agent - Enhanced SQL agent with full workflow capabilities β SQL Review & Editing - User can approve or modify generated queries β Safe Query Execution - Only allows SELECT queries with result display β Multi-tool Orchestration - Agent automatically uses appropriate tools β Type Safety - Full TypeScript support with Zod validation β Error Handling - Comprehensive error management throughout workflow
The seeding tool now provides a comprehensive business dataset with realistic relationships:
- 5 Companies across different industries (Technology, Finance, Healthcare, etc.)
- 7 Office Locations with geographic distribution
- 14 Departments with budgets and head counts
- 20 Job Titles with career levels (Junior, Mid, Senior, Staff, Management)
- 20 Skills across programming languages, frameworks, and tools
- ~100-150 Employees with realistic salary distributions
- ~40-60 Projects with various statuses and budgets
- Relationships: Employee-skill mappings, project assignments, salary history
The enhanced dataset supports queries about:
- Employee hierarchies and reporting structures
- Skill distributions and proficiency levels
- Project team compositions and allocations
- Salary analysis and career progression
- Cross-company comparisons and analytics
- Geographic workforce distribution
- Department budgets and performance
- Employee-skill matching for projects
- Compensation history and trends
- Multi-table joins with complex relationships
Key dependencies:
@mastra/core
: Workflow orchestration@ai-sdk/openai
: AI integrationai
: AI SDK for structured generationpg
: PostgreSQL clientzod
: Schema validation
This project is part of the Mastra ecosystem and follows the same licensing terms.