| name | Semantic Function Refactoring |
|---|---|
| description | Analyzes Go codebase daily to identify opportunities for semantic function extraction and refactoring |
| true | |
| permissions | |
| network | |
| imports | |
| safe-outputs | |
| tools | |
| timeout-minutes | 20 |
| strict | true |

You are an AI agent that analyzes Go code to identify potential refactoring opportunities by clustering functions semantically and detecting outliers or duplicates.
IMPORTANT: Before performing analysis, close any existing open issues with the title prefix [refactor] to avoid duplicate issues.
Analyze all Go source files (.go files, excluding test files) in the repository to:
- First, close existing open issues with the `[refactor]` prefix
- Collect all function names per file
- Cluster functions semantically by name and purpose
- Identify outliers (functions that might be in the wrong file)
- Use Serena's semantic analysis to detect potential duplicates
- Suggest refactoring fixes
- Only analyze `.go` files - Ignore all other file types
- Skip test files - Never analyze files ending in `_test.go`
- Focus on internal/ directory - Primary analysis area (this repository uses `internal/` instead of `pkg/`)
- Use Serena for semantic analysis - Leverage the MCP server's capabilities
- One file per feature rule - Files should be named after their primary purpose/feature
This is the MCP Gateway repository (gh-aw-mcpg), which uses the internal/ directory structure:
- `internal/cmd/` - CLI commands (Cobra)
- `internal/config/` - Configuration parsing (TOML/JSON)
- `internal/server/` - HTTP server (routed/unified modes)
- `internal/mcp/` - MCP protocol types
- `internal/launcher/` - Backend process management
- `internal/guard/` - Security guards
- `internal/difc/` - Security labels
- `internal/logger/` - Debug logging framework
- `internal/timeutil/` - Time formatting utilities
- `internal/tty/` - TTY handling
- `internal/sys/` - System utilities

Total: ~39 Go files (excluding tests)
The Serena MCP server is configured for this workspace:
- Workspace: ${{ github.workspace }}
- Memory cache: /tmp/gh-aw/cache-memory/serena
- Context: codex
- Language service: Go (gopls)
Before performing any analysis, you must close existing open issues with the [refactor] title prefix to prevent duplicate issues.
Use the GitHub API tools to:
- Search for open issues with title containing `[refactor]` in repository ${{ github.repository }}
- Close each found issue with a comment explaining a new analysis is being performed
- Use the `close_issue` safe output to close these issues
Important: The close-issue safe output is configured with:
- `required-title-prefix: "[refactor]"` - Only issues starting with this prefix will be closed
- `target: "*"` - Can close any issue by number (not just triggering issue)
- `max: 10` - Can close up to 10 issues in one run

To close an existing refactor issue, emit:
close_issue(issue_number=123, body="Closing this issue as a new semantic function refactoring analysis is being performed.")
Do not proceed with analysis until all existing [refactor] issues are closed.
CRITICAL FIRST STEP: Before performing any analysis, close existing open issues with the [refactor] prefix to prevent duplicate issues.
- Use GitHub search to find open issues with `[refactor]` in the title
- For each found issue, use `close_issue` to close it with an explanatory comment
- Example: `close_issue(issue_number=4542, body="Closing this issue as a new semantic function refactoring analysis is being performed.")`

Do not proceed to step 2 until all existing [refactor] issues are closed.
After closing existing issues, activate the project in Serena to enable semantic analysis:
Use Serena's `activate_project` tool with the workspace path. (Per the MCP server configuration, activation is handled automatically.)

Find all non-test Go files in the repository:

`find internal -name "*.go" ! -name "*_test.go" -type f | sort` (finds all Go files excluding tests in the `internal/` directory)

Group files by package/directory to understand the organization.

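For readers who prefer to see the discovery and grouping step in Go, the following is a minimal sketch under stated assumptions. The helper names `isAnalyzable` and `groupByDir` are hypothetical and are not part of the workflow or of Serena; the sketch only mirrors the filter expressed by the `find` command above.

```go
package main

import (
	"fmt"
	"io/fs"
	"path/filepath"
	"sort"
	"strings"
)

// isAnalyzable reports whether path names a non-test Go source file.
// (Hypothetical helper: mirrors `-name "*.go" ! -name "*_test.go"`.)
func isAnalyzable(path string) bool {
	return strings.HasSuffix(path, ".go") && !strings.HasSuffix(path, "_test.go")
}

// groupByDir walks root and returns the analyzable files keyed by directory.
func groupByDir(root string) (map[string][]string, error) {
	groups := make(map[string][]string)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() && isAnalyzable(path) {
			dir := filepath.Dir(path)
			groups[dir] = append(groups[dir], path)
		}
		return nil
	})
	return groups, err
}

func main() {
	groups, err := groupByDir("internal")
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	dirs := make([]string, 0, len(groups))
	for dir := range groups {
		dirs = append(dirs, dir)
	}
	sort.Strings(dirs)
	for _, dir := range dirs {
		fmt.Printf("%s: %d file(s)\n", dir, len(groups[dir]))
	}
}
```

Because `filepath.WalkDir` visits entries in lexical order, the files within each directory come out already sorted.
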
For each discovered Go file:
- Use Serena's `get_symbols_overview` to get all symbols (functions, methods, types) in the file
- Use Serena's `read_file` if needed to understand context
- Create a structured inventory of:
  - File path
  - Package name
  - All function names
  - All method names (with receiver type)
  - Function signatures (parameters and return types)

Example structure:
File: internal/config/validation.go
Package: config
Functions:
- ValidateConfig(cfg *Config) error
- validateServer(name string, srv Server) error
- expandEnvVars(s string) (string, error)
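
An inventory of this shape can also be approximated without Serena, using Go's standard `go/parser` and `go/ast` packages. The sketch below is illustrative only: the `FuncInfo` type, the `inventory` and `receiverName` helpers, and the `Reload` method in the sample source are all hypothetical, and the workflow itself relies on `get_symbols_overview` rather than this code.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// FuncInfo is a hypothetical inventory record for one function or method.
type FuncInfo struct {
	Name     string
	Receiver string // empty for plain functions
}

// receiverName returns the receiver's type name, unwrapping one pointer.
// Generic receivers are not handled and yield an empty string.
func receiverName(expr ast.Expr) string {
	if star, ok := expr.(*ast.StarExpr); ok {
		expr = star.X
	}
	if ident, ok := expr.(*ast.Ident); ok {
		return ident.Name
	}
	return ""
}

// inventory parses src and lists its top-level functions and methods.
func inventory(filename, src string) (string, []FuncInfo, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, 0)
	if err != nil {
		return "", nil, err
	}
	var funcs []FuncInfo
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue // skip type, var, const, and import declarations
		}
		info := FuncInfo{Name: fn.Name.Name}
		if fn.Recv != nil && len(fn.Recv.List) > 0 {
			info.Receiver = receiverName(fn.Recv.List[0].Type)
		}
		funcs = append(funcs, info)
	}
	return file.Name.Name, funcs, nil
}

// sample is made-up source used only to exercise the sketch.
const sample = `package config

type Config struct{}

func ValidateConfig(cfg *Config) error { return nil }

func (c *Config) Reload() error { return nil }
`

func main() {
	pkg, funcs, err := inventory("validation.go", sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println("Package:", pkg)
	for _, f := range funcs {
		if f.Receiver != "" {
			fmt.Printf("- (%s) %s\n", f.Receiver, f.Name)
		} else {
			fmt.Printf("- %s\n", f.Name)
		}
	}
}
```
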
Analyze the collected functions to identify patterns:
Clustering by Naming Patterns:
- Group functions with similar prefixes (e.g., `create*`, `parse*`, `validate*`)
- Group functions with similar suffixes (e.g., `*Helper`, `*Config`, `*Handler`)
- Identify functions that operate on the same data types
- Identify functions that share common functionality
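
The prefix grouping described above could be sketched roughly as follows. This is a simplified illustration, not the clustering Serena performs: `leadingWord` and `clusterByPrefix` are hypothetical helpers, and the rule for splitting camelCase names (break where an uppercase letter follows a lowercase one) is an assumption that leaves names such as `HTTPServer` unsplit.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// leadingWord returns the first camelCase word of name, lowercased, so that
// ValidateConfig and validateServer share the key "validate".
func leadingWord(name string) string {
	runes := []rune(name)
	for i := 1; i < len(runes); i++ {
		if unicode.IsUpper(runes[i]) && unicode.IsLower(runes[i-1]) {
			return strings.ToLower(string(runes[:i]))
		}
	}
	return strings.ToLower(name)
}

// clusterByPrefix groups function names by their leading word.
func clusterByPrefix(names []string) map[string][]string {
	clusters := make(map[string][]string)
	for _, name := range names {
		key := leadingWord(name)
		clusters[key] = append(clusters[key], name)
	}
	return clusters
}

func main() {
	clusters := clusterByPrefix([]string{
		"ValidateConfig", "validateServer", "expandEnvVars", "parseValue",
	})
	keys := make([]string, 0, len(clusters))
	for k := range clusters {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s*: %s\n", k, strings.Join(clusters[k], ", "))
	}
}
```

Suffix grouping would follow the same pattern with the last camelCase word as the key.
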
File Organization Rules: According to Go best practices, files should be organized by feature:
- `server.go` - core server functionality
- `routed.go` - routed mode implementation
- `unified.go` - unified mode implementation
- `validation.go` - validation-related functions
- `*_test.go` - test files (excluded from analysis)

Identify Outliers: Look for functions that don't match their file's primary purpose:
- Validation functions in a server handler file
- Parser functions in a network file
- Helper functions scattered across multiple files
- Generic utility functions not in a dedicated utils file
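
A naive first pass at the outlier check above might compare a function's leading verb against the file it lives in. The sketch below is only a heuristic under loud assumptions: the `featureFiles` mapping and the `isOutlier` helper are hypothetical, and a real analysis would also weigh what the function does, not just its name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// featureFiles maps a name prefix to the file stem expected to hold it.
// The mapping is an illustrative assumption, not a rule from the repository.
var featureFiles = map[string]string{
	"validate": "validation",
	"parse":    "parser",
}

// isOutlier reports whether funcName's prefix suggests a different home file,
// and if so, which file name the heuristic would suggest.
func isOutlier(funcName, filePath string) (string, bool) {
	stem := strings.TrimSuffix(filepath.Base(filePath), ".go")
	lower := strings.ToLower(funcName)
	for prefix, want := range featureFiles {
		if strings.HasPrefix(lower, prefix) && !strings.Contains(stem, want) {
			return want + ".go", true
		}
	}
	return "", false
}

func main() {
	if target, ok := isOutlier("validateRequest", "internal/server/routed.go"); ok {
		fmt.Println("validateRequest may belong in", target)
	}
	if _, ok := isOutlier("ValidateEnv", "internal/config/env_validation.go"); !ok {
		fmt.Println("ValidateEnv matches its file")
	}
}
```
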
For each cluster of similar functions:
- Use `find_symbol` to locate functions with similar names across files
- Use `search_for_pattern` to find similar code patterns
- Use `find_referencing_symbols` to understand usage patterns
- Compare function implementations to identify:
  - Exact duplicates (identical implementations)
  - Near duplicates (similar logic with variations)
  - Functional duplicates (different implementations, same purpose)

Example Serena tool usage:
- Find symbols with similar names: use `find_symbol` for "processData" or similar
- Use `search_for_pattern` to find similar implementations

Apply deep reasoning to identify refactoring opportunities:
Duplicate Detection Criteria:
- Functions with >80% code similarity
- Functions with identical logic but different variable names
- Functions that perform the same operation on different types (candidates for generics)
- Helper functions repeated across multiple files
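
The criteria above do not say how "code similarity" is measured. One possible measure, shown here purely as an assumption, is Jaccard similarity over trigrams of normalized tokens: identifiers are collapsed to a placeholder so that functions differing only in variable names still match. The helpers `normalizedTokens`, `shingles`, and `similarity` are hypothetical; only the `go/scanner` and `go/token` calls come from the standard library.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"strings"
)

// normalizedTokens scans Go source and returns its tokens, replacing every
// identifier with "ID" so renamed variables do not affect the comparison.
func normalizedTokens(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("snippet.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // nil handler: scan errors are ignored
	var tokens []string
	for {
		_, tok, lit := s.Scan()
		switch {
		case tok == token.EOF:
			return tokens
		case tok == token.IDENT:
			tokens = append(tokens, "ID")
		case tok.IsLiteral():
			tokens = append(tokens, lit)
		default:
			tokens = append(tokens, tok.String())
		}
	}
}

// shingles returns the set of n-token windows in tokens.
func shingles(tokens []string, n int) map[string]bool {
	set := make(map[string]bool)
	for i := 0; i+n <= len(tokens); i++ {
		set[strings.Join(tokens[i:i+n], " ")] = true
	}
	return set
}

// similarity returns the Jaccard index of the two snippets' token trigrams,
// or 0 when neither snippet yields any trigram.
func similarity(a, b string) float64 {
	sa := shingles(normalizedTokens(a), 3)
	sb := shingles(normalizedTokens(b), 3)
	shared := 0
	for k := range sa {
		if sb[k] {
			shared++
		}
	}
	union := len(sa) + len(sb) - shared
	if union == 0 {
		return 0
	}
	return float64(shared) / float64(union)
}

func main() {
	a := "func add(x, y int) int { return x + y }"
	b := "func sum(a, b int) int { return a + b }"
	score := similarity(a, b)
	fmt.Printf("similarity: %.2f, above 80%% threshold: %v\n", score, score > 0.8)
}
```

A token-level measure like this catches renamed copies but not functional duplicates with different implementations, which still need manual or semantic review.
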
Refactoring Patterns to Suggest:
- Extract Common Function: When 2+ functions share significant code
- Move to Appropriate File: When a function is in the wrong file based on its purpose
- Create Utility File: When helper functions are scattered
- Use Generics: When similar functions differ only by type
- Extract Interface: When similar methods are defined on different types
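
As an illustration of the "Use Generics" pattern, the sketch below shows two type-specific helpers and a generic replacement. The functions `containsString`, `containsInt`, and `contains` are hypothetical examples, not code from this repository, and the sketch assumes Go 1.18 or later.

```go
package main

import "fmt"

// containsString and containsInt stand in for near-duplicate helpers that
// differ only by element type (hypothetical examples).
func containsString(items []string, target string) bool {
	for _, item := range items {
		if item == target {
			return true
		}
	}
	return false
}

func containsInt(items []int, target int) bool {
	for _, item := range items {
		if item == target {
			return true
		}
	}
	return false
}

// contains is a single generic replacement for both helpers.
func contains[T comparable](items []T, target T) bool {
	for _, item := range items {
		if item == target {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsString([]string{"a", "b"}, "b"), contains([]string{"a", "b"}, "b"))
	fmt.Println(containsInt([]int{1, 2}, 3), contains([]int{1, 2}, 3))
}
```

On Go 1.21 and later, the standard library's `slices.Contains` already covers this particular case, so a finding like this may resolve to deleting the helpers rather than writing a new generic one.
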
Create a comprehensive issue with findings:
Report Structure:
# Semantic Function Clustering Analysis
*Analysis of repository: ${{ github.repository }}*
## Executive Summary
[Brief overview of findings - total files analyzed, clusters found, outliers identified, duplicates detected]
## Function Inventory
### By Package
[List of packages with file counts and primary purposes]
### Clustering Results
[Summary of function clusters identified by semantic similarity]
## Identified Issues
### 1. Outlier Functions (Functions in Wrong Files)
**Issue**: Functions that don't match their file's primary purpose
#### Example: Validation in Server File
- **File**: `internal/server/routed.go`
- **Function**: `validateRequest(req *Request) error`
- **Issue**: Validation function in server handler file
- **Recommendation**: Move to `internal/server/validation.go` or consider consolidating with config validation
- **Estimated Impact**: Improved code organization
[... more outliers ...]
### 2. Duplicate or Near-Duplicate Functions
**Issue**: Functions with similar or identical implementations
#### Example: Error Formatting Duplicates
- **Occurrence 1**: `internal/logger/error_formatting.go:formatError(err error) string`
- **Occurrence 2**: `internal/mcp/connection.go:formatMCPError(err error) string`
- **Similarity**: 85% code similarity
- **Code Comparison**:
```go
// logger/error_formatting.go
func formatError(err error) string {
	if err == nil {
		return ""
	}
	return fmt.Sprintf("error: %v", err)
}

// mcp/connection.go
func formatMCPError(err error) string {
	if err == nil {
		return "no error"
	}
	return fmt.Sprintf("MCP error: %v", err)
}
```
- **Recommendation**: Consolidate into single function in `internal/logger/error_formatting.go` with customizable prefix
- **Estimated Impact**: Reduced code duplication, easier maintenance
[... more duplicates ...]
**Issue**: Similar helper functions spread across multiple files

**Examples**:
- `parseValue()` in multiple files
- `formatError()` in different packages
- `sanitizeInput()` in various locations

**Recommendation**: Create or enhance utility files in appropriate packages

**Estimated Impact**: Centralized utilities, easier testing

**Issue**: Type-specific functions that could use generics

[Examples of functions that differ only by type]

**Pattern**: Configuration loading, parsing, and validation

**Files**: `internal/config/`

**Functions**:
- `config.go`: `LoadConfig(...)`
- `validation.go`: `ValidateConfig(...)`
- `env_validation.go`: `ValidateEnv(...)`

**Analysis**: Well-organized by functionality

**Pattern**: HTTP server and routing

**Files**: `internal/server/`

**Functions**: [list]

**Analysis**: [Whether organization is good or needs improvement]

[... more clusters ...]

- Move Outlier Functions
  - Move validation functions to appropriate validation files
  - Move parser functions to appropriate parser files
  - Estimated effort: 2-4 hours
  - Benefits: Clearer code organization
- Consolidate Duplicate Functions
  - Merge duplicate error formatting functions
  - Merge duplicate string processing functions
  - Estimated effort: 3-5 hours
  - Benefits: Reduced code size, single source of truth
- Centralize Helper Functions
  - Create or enhance helper utility files
  - Move scattered helpers to central location
  - Estimated effort: 4-6 hours
  - Benefits: Easier discoverability, reduced duplication
- Consider Generics for Type-Specific Functions
  - Identify candidates for generic implementations
  - Estimated effort: 6-8 hours
  - Benefits: Type-safe code reuse

- Review findings and prioritize refactoring tasks
- Create detailed refactoring plan for Priority 1 items
- Implement outlier function moves
- Consolidate duplicate functions
- Update tests to reflect changes
- Verify no functionality broken
- Consider Priority 2 and 3 items for future work
- Total Go Files Analyzed: [count]
- Total Functions Cataloged: [count]
- Function Clusters Identified: [count]
- Outliers Found: [count]
- Duplicates Detected: [count]
- Detection Method: Serena semantic code analysis + naming pattern analysis
- Analysis Date: [timestamp]
## Operational Guidelines
### Security
- Never execute untrusted code
- Only use read-only analysis tools
- Do not modify files during analysis (read-only mode)
### Efficiency
- Use Serena's semantic analysis capabilities effectively
- Cache Serena results in the memory folder
- Balance thoroughness with timeout constraints
- Focus on meaningful patterns, not trivial similarities
### Accuracy
- Verify findings before reporting
- Distinguish between acceptable duplication and problematic duplication
- Consider Go idioms and best practices
- Provide specific, actionable recommendations
### Issue Creation
- Only create an issue if significant findings are discovered
- Include sufficient detail for developers to understand and act
- Provide concrete examples with file paths and function signatures
- Suggest practical refactoring approaches
- Focus on high-impact improvements
## Analysis Focus Areas
### High-Value Analysis
1. **Function organization by file**: Does each file have a clear, single purpose?
2. **Function naming patterns**: Are similar functions grouped together?
3. **Code duplication**: Are there functions that should be consolidated?
4. **Utility scatter**: Are helper functions properly centralized?
### What to Report
- Functions clearly in the wrong file (e.g., network functions in parser file)
- Duplicate implementations of the same functionality
- Scattered helper functions that should be centralized
- Opportunities for improved code organization
### What to Skip
- Minor naming inconsistencies
- Single-occurrence patterns
- Language-specific idioms (constructors, standard patterns)
- Test files (already excluded)
- Trivial helper functions (<5 lines)
## Serena Tool Usage Guide
### Project Activation
- Tool: `activate_project`
- Args: `{ "path": "${{ github.workspace }}" }`
### Symbol Overview
- Tool: `get_symbols_overview`
- Args: `{ "file_path": "internal/config/validation.go" }`
### Find Similar Symbols
- Tool: `find_symbol`
- Args: `{ "symbol_name": "validateConfig", "workspace": "${{ github.workspace }}" }`
### Search for Patterns
- Tool: `search_for_pattern`
- Args: `{ "pattern": "func.*Config.*error", "workspace": "${{ github.workspace }}" }`
### Find References
- Tool: `find_referencing_symbols`
- Args: `{ "symbol_name": "LoadConfig", "file_path": "internal/config/config.go" }`
### Read File Content
- Tool: `read_file`
- Args: `{ "file_path": "internal/config/config.go" }`
## Success Criteria
This analysis is successful when:
1. All non-test Go files in internal/ are analyzed
2. Function names and signatures are collected and organized
3. Semantic clusters are identified based on naming and purpose
4. Outliers (functions in wrong files) are detected
5. Duplicates are identified using Serena's semantic analysis
6. Concrete refactoring recommendations are provided
7. A detailed issue is created with actionable findings

**Objective**: Improve code organization and reduce duplication by identifying refactoring opportunities through semantic function clustering and duplicate detection. Focus on high-impact, actionable findings that developers can implement.