Go implementation of git-of-theseus, a tool for analyzing how a Git repository has evolved over time.
- Analyze Git repository evolution with cohort analysis
- Track code ownership and authorship over time
- Visualize code survival rates
- Support for file extension and directory analysis
- Domain-based contributor analysis
- Parallel processing for improved performance
- Clean Architecture design
go install github.com/shiv3/git-of-theseus-go@latest
Or build from source:
git clone https://github.com/shiv3/git-of-theseus-go.git
cd git-of-theseus-go
go build
# Analyze current repository
git-of-theseus-go
# Analyze specific repository
git-of-theseus-go /path/to/repo
# Analyze specific branch
git-of-theseus-go /path/to/repo --branch develop
- `[repo-path]`: Path to the Git repository (default: current directory)
- `--branch, -b string`: Branch to analyze
- `--interval string`: Minimum time between commits to analyze
  - Human-readable formats: `7d`, `2w`, `1m`, `1y`, `336h`
  - Raw seconds: `604800` (backward compatible)
  - Default: `7d` (7 days)
- `--since string`: Analyze commits since this date (YYYY-MM-DD)
- `--until string`: Analyze commits until this date (YYYY-MM-DD)
- `--max-commits int`: Maximum number of commits to analyze
- `--only string`: Only analyze files matching patterns (comma-separated)
  - Example: `"*.go,*.js"`
- `--ignore string`: Ignore files matching patterns (comma-separated)
  - Example: `"*_test.go,vendor/*"`
- `--procs int`: Number of parallel processes (default: 1)
- `--quiet, -q`: Suppress progress output
The `--interval` flag accepts various time formats:
# Days
git-of-theseus-go --interval 7d # 7 days
git-of-theseus-go --interval "14 days" # 14 days
# Weeks
git-of-theseus-go --interval 2w # 2 weeks
git-of-theseus-go --interval "1 week" # 1 week
# Months (approximated as 30 days)
git-of-theseus-go --interval 1m # 1 month
git-of-theseus-go --interval "2 months" # 2 months
# Years (approximated as 365 days)
git-of-theseus-go --interval 1y # 1 year
# Hours
git-of-theseus-go --interval 336h # 336 hours (14 days)
# Raw seconds (for compatibility)
git-of-theseus-go --interval 604800 # 604800 seconds (7 days)
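The interval formats above can be illustrated with a small parser. This is a minimal sketch, not the tool's actual code: `parseInterval`, `unitDurations`, and the accepted singular/plural spellings are hypothetical names and assumptions. Only the approximations (1 month = 30 days, 1 year = 365 days) and the raw-seconds fallback come from the examples above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

const (
	day   = 24 * time.Hour
	week  = 7 * day
	month = 30 * day  // months approximated as 30 days
	year  = 365 * day // years approximated as 365 days
)

// unitDurations lists the unit spellings this sketch accepts (an assumption).
var unitDurations = map[string]time.Duration{
	"h": time.Hour, "hour": time.Hour, "hours": time.Hour,
	"d": day, "day": day, "days": day,
	"w": week, "week": week, "weeks": week,
	"m": month, "month": month, "months": month,
	"y": year, "year": year, "years": year,
}

// parseInterval (hypothetical helper) converts strings such as "7d",
// "14 days", or "604800" into a time.Duration.
func parseInterval(s string) (time.Duration, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	// Find where the leading run of ASCII digits ends.
	i := strings.IndexFunc(s, func(r rune) bool { return r < '0' || r > '9' })
	if s == "" || i == 0 {
		return 0, fmt.Errorf("interval %q must start with a number", s)
	}
	if i < 0 {
		// Digits only: treat the value as raw seconds.
		n, err := strconv.ParseInt(s, 10, 64)
		if err != nil {
			return 0, err
		}
		return time.Duration(n) * time.Second, nil
	}
	n, err := strconv.ParseInt(s[:i], 10, 64)
	if err != nil {
		return 0, err
	}
	unit := strings.TrimSpace(s[i:])
	d, ok := unitDurations[unit]
	if !ok {
		return 0, fmt.Errorf("unknown interval unit %q", unit)
	}
	return time.Duration(n) * d, nil
}

func main() {
	for _, in := range []string{"7d", "14 days", "2w", "1m", "1y", "336h", "604800"} {
		d, err := parseInterval(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-10q -> %v\n", in, d)
	}
}
```

Note that under these approximations `1m` means 30 days, not a calendar month, so monthly samples drift relative to month boundaries.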
# Analyze with default 7-day intervals
git-of-theseus-go /path/to/repo
# Analyze with 2-week intervals
git-of-theseus-go /path/to/repo --interval 2w
# Analyze develop branch with monthly intervals
git-of-theseus-go /path/to/repo --branch develop --interval 1m
# Analyze commits from last year only
git-of-theseus-go --since 2024-01-01 --until 2024-12-31
# Analyze commits from the last 6 months
git-of-theseus-go --since 2024-07-01
# Analyze commits up to a specific date
git-of-theseus-go --until 2024-06-30
# Combine date range with custom interval
git-of-theseus-go --since 2024-01-01 --until 2024-12-31 --interval 2w
# Use 8 parallel workers for faster processing
git-of-theseus-go /path/to/repo --procs 8
# Limit to 100 commits for quick analysis
git-of-theseus-go /path/to/repo --max-commits 100
# Only analyze Go source files
git-of-theseus-go --only "*.go"
# Analyze JavaScript/TypeScript, ignore tests
git-of-theseus-go --only "*.js,*.ts,*.tsx" --ignore "*test*,*.spec.*"
# Ignore vendor and node_modules directories
git-of-theseus-go --ignore "vendor/*,node_modules/*"
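To make the `--only` and `--ignore` semantics concrete, here is a rough sketch of comma-separated pattern filtering. How the tool actually matches patterns is not documented here, so everything below is an assumption: `splitPatterns`, `matchAny`, and `keep` are hypothetical helpers, and matching uses the standard library's `path.Match` against both the full path and the base name.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// splitPatterns turns a comma-separated flag value into a list of patterns.
func splitPatterns(csv string) []string {
	var out []string
	for _, p := range strings.Split(csv, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// matchAny reports whether file matches any pattern. Assumption: each
// pattern is tried against the full slash-separated path and the base name.
// Malformed patterns are treated as non-matching.
func matchAny(patterns []string, file string) bool {
	for _, p := range patterns {
		if ok, _ := path.Match(p, file); ok {
			return true
		}
		if ok, _ := path.Match(p, path.Base(file)); ok {
			return true
		}
	}
	return false
}

// keep applies the "only" list first (when non-empty), then the "ignore" list.
func keep(file string, only, ignore []string) bool {
	if len(only) > 0 && !matchAny(only, file) {
		return false
	}
	return !matchAny(ignore, file)
}

func main() {
	only := splitPatterns("*.go,*.js")
	ignore := splitPatterns("*_test.go,vendor/*")
	for _, f := range []string{"cmd/main.go", "cmd/main_test.go", "vendor/lib.go", "README.md"} {
		fmt.Printf("%-18s keep=%v\n", f, keep(f, only, ignore))
	}
}
```

One caveat of this sketch: `*` in `path.Match` does not cross `/`, so `vendor/*` would not match `vendor/a/b.go` here. The real tool may behave differently, so check its behavior on nested directories before relying on a pattern.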
The tool generates JSON files with analysis results:
| File | Description |
|---|---|
| `authors.json` | Lines of code per author over time |
| `cohorts.json` | Code survival by year of creation |
| `exts.json` | Distribution by file extensions |
| `dirs.json` | Distribution by directories |
| `domains.json` | Distribution by email domains |
| `survival.json` | Code survival statistics |
Example `authors.json`:
{
"y": [[100, 150, 200], [50, 75, 100]],
"ts": ["2024-01-01", "2024-02-01", "2024-03-01"],
"labels": ["Alice", "Bob"]
}
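For downstream processing, the sample above can be decoded with `encoding/json`. This is a sketch under assumptions: the `Series` type and both helpers are hypothetical, and it assumes that row `i` of `y` belongs to `labels[i]` with one value per entry in `ts`, which is what the sample suggests. Values are read as `float64` in case a file contains non-integer numbers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Series mirrors the keys of the sample authors.json (type name is hypothetical).
type Series struct {
	Y      [][]float64 `json:"y"`      // one row per label, one column per timestamp
	TS     []string    `json:"ts"`     // sample timestamps
	Labels []string    `json:"labels"` // series names, e.g. authors
}

// decodeSeries parses raw JSON into a Series.
func decodeSeries(raw []byte) (Series, error) {
	var s Series
	err := json.Unmarshal(raw, &s)
	return s, err
}

// latestByLabel returns the value at the most recent timestamp for each label,
// after checking that the dimensions agree.
func latestByLabel(s Series) (map[string]float64, error) {
	if len(s.Y) != len(s.Labels) {
		return nil, fmt.Errorf("got %d rows for %d labels", len(s.Y), len(s.Labels))
	}
	out := make(map[string]float64, len(s.Labels))
	for i, label := range s.Labels {
		row := s.Y[i]
		if len(row) != len(s.TS) {
			return nil, fmt.Errorf("row %q has %d values for %d timestamps", label, len(row), len(s.TS))
		}
		if len(row) > 0 {
			out[label] = row[len(row)-1]
		}
	}
	return out, nil
}

func main() {
	raw := []byte(`{
		"y": [[100, 150, 200], [50, 75, 100]],
		"ts": ["2024-01-01", "2024-02-01", "2024-03-01"],
		"labels": ["Alice", "Bob"]
	}`)
	s, err := decodeSeries(raw)
	if err != nil {
		log.Fatal(err)
	}
	latest, err := latestByLabel(s)
	if err != nil {
		log.Fatal(err)
	}
	for _, label := range s.Labels {
		fmt.Printf("%s: %v lines at %s\n", label, latest[label], s.TS[len(s.TS)-1])
	}
}
```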
The project follows Clean Architecture principles:
git-of-theseus-go/
├── domain/              # Core business logic
│   ├── entity/          # Domain entities
│   └── repository/      # Repository interfaces
├── usecase/             # Application business rules
├── infrastructure/      # External interfaces
│   ├── git/             # Git operations
│   └── filesystem/      # File operations
└── presentation/        # UI/CLI layer
    └── cli/             # Command line interface
- Native Git Integration: Uses native `git blame` for better performance
- Incremental Analysis: Caches unchanged files between commits
- Parallel Processing: Concurrent analysis with configurable workers
- Smart Sampling: Time-based commit sampling for large repositories
- Optimized for Large Repos: Automatic file sampling for massive codebases
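The time-based commit sampling mentioned above, together with the `--interval` flag's "minimum time between commits", can be sketched as a single pass over commit timestamps. This is an illustration of the idea rather than the project's implementation: `sampleByInterval` is a hypothetical function, and it assumes commits arrive as Unix timestamps sorted oldest first.

```go
package main

import "fmt"

// sampleByInterval keeps the first commit, then every later commit that is at
// least minGap seconds after the previously kept one. unixTimes must be
// sorted in ascending order.
func sampleByInterval(unixTimes []int64, minGap int64) []int64 {
	var kept []int64
	for _, t := range unixTimes {
		if len(kept) == 0 || t-kept[len(kept)-1] >= minGap {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	const day = int64(24 * 60 * 60)
	// Commits at day 0, 2, 7, 8, and 15, sampled with a 7-day minimum gap.
	commits := []int64{0, 2 * day, 7 * day, 8 * day, 15 * day}
	sampled := sampleByInterval(commits, 7*day)
	fmt.Println(len(sampled), "of", len(commits), "commits kept")
}
```

Because each kept commit resets the reference point, samples are spaced by at least the interval but not aligned to fixed calendar buckets. A bucket-based scheme is an equally plausible design.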
- Go 1.25 or later
- Git installed and accessible in PATH
- Read access to the target repository
This Go implementation maintains compatibility with the original Python version while offering:
- Significantly faster performance through parallelization
- Native Git command integration
- Better memory management for large repositories
- Clean Architecture for maintainability
- Human-readable interval formats
- Date range filtering with `--since` and `--until` options
Apache License 2.0
This is a Go reimplementation of git-of-theseus by Erik Bernhardsson.