An intelligent Python tool that automatically resolves merge conflicts between English and translated versions of documentation, using OpenAI's GPT models to analyze and generate translations. Supports 12+ languages.
- Automatic Conflict Detection: Scans your entire codebase for merge conflicts
- AI-Powered Translation: Uses OpenAI GPT to compare and translate content
- Multi-Language Support: Supports 12+ languages including French, Spanish, German, Japanese, Chinese, and more
- Parallel Processing: Process multiple conflicts concurrently with configurable workers and rate limiting
- Colored Diff Output: Real-time visualization of translations with before/after diffs
- Smart Resolution: Only translates when necessary, preserving code and technical terms
- Translation-First Approach: Keeps existing translations when they're close enough
- Safety First: Flags problematic translations for manual review instead of making mistakes
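The parallel processing and rate limiting described above could be sketched roughly as follows. This is an illustrative sketch only: `RateLimiter`, `process_all`, and `resolve_conflict` are hypothetical names, not the tool's actual API.

```python
import time
import threading
from concurrent.futures import ThreadPoolExecutor

class RateLimiter:
    """Enforce a minimum interval between API calls across all threads."""
    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.lock = threading.Lock()
        self.last_call = 0.0

    def wait(self):
        with self.lock:
            elapsed = time.monotonic() - self.last_call
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)
            self.last_call = time.monotonic()

def process_all(conflicts, resolve_conflict, workers=5, rate_limit=0.2):
    """Resolve conflicts concurrently, throttling calls to the API."""
    limiter = RateLimiter(rate_limit)

    def worker(conflict):
        limiter.wait()  # shared throttle: at most one call per rate_limit seconds
        return resolve_conflict(conflict)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(worker, conflicts))
```

The shared lock is what makes the rate limit global rather than per-worker, which is the behavior you want when the bottleneck is a single API quota.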
| Language | Code | Native Name |
|---|---|---|
| French | fr | Français |
| Spanish | es | Español |
| German | de | Deutsch |
| Portuguese | pt | Português |
| Italian | it | Italiano |
| Japanese | ja | 日本語 |
| Chinese | zh | 中文 |
| Korean | ko | 한국어 |
| Russian | ru | Русский |
| Arabic | ar | العربية |
| Dutch | nl | Nederlands |
| Polish | pl | Polski |
Run `python3 src/main.py --list-languages` to see all supported languages.
- Detection: Scans for merge conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`)
- Analysis: For each conflict:
- Compares the English (incoming) and translated (current) versions
- Asks GPT if the translated version is an acceptable translation
- If close enough → keeps the existing translation
- If different → translates the English to your target language
- Validation: Ensures translations are in the correct language and don't contain obvious errors
- Resolution: Replaces conflict blocks with the appropriate translated text
- Safety: Leaves conflicts unresolved if translation fails or seems incorrect
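The detection step above can be sketched with a regular expression over the standard Git markers. This is a minimal illustration, not the tool's actual parser; the function name and return shape are assumptions.

```python
import re

# Match a Git conflict block: the "current" section sits between
# <<<<<<< and =======, the "incoming" section between ======= and >>>>>>>.
CONFLICT_RE = re.compile(
    r"<<<<<<<[^\n]*\n(.*?)\n=======\n(.*?)\n>>>>>>>[^\n]*",
    re.DOTALL,
)

def find_conflicts(text):
    """Return (current, incoming) text pairs for each conflict block."""
    return [(m.group(1), m.group(2)) for m in CONFLICT_RE.finditer(text)]
```

In a translation repository, "current" is typically the existing translated text and "incoming" is the updated English text, which is why the tool compares the two before deciding whether to translate.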
# Navigate to the project directory
cd translation-manager
# Create a virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

The script looks for an .env.local file with your OpenAI API key. It searches in:
- The codebase directory you specify
- The translation-manager directory
Your .env.local should contain:
OPENAI_API_KEY='your-api-key-here'

# Activate virtual environment
source .venv/bin/activate
# Translate to French (default language)
python3 src/main.py --codebase-path /path/to/fr.react.dev
# Translate to Spanish
python3 src/main.py --codebase-path /path/to/es.react.dev --language spanish
# Translate to Japanese
python3 src/main.py --codebase-path /path/to/ja.react.dev -l ja

# Required: Specify the codebase path
python3 src/main.py --codebase-path /path/to/your/repo
# Specify target language (default: french)
python3 src/main.py -p /path/to/repo --language german
python3 src/main.py -p /path/to/repo -l de
# List all supported languages
python3 src/main.py --list-languages
# Dry run (analyze but don't modify files)
python3 src/main.py -p /path/to/repo --dry-run
# Specify a custom .env.local location
python3 src/main.py -p /path/to/repo --env-file /path/to/.env.local
# Process only the first N files (useful for testing)
python3 src/main.py -p /path/to/repo --max-files 5
# Parallel processing (enabled by default)
python3 src/main.py -p /path/to/repo --workers 10 # Use 10 parallel workers
python3 src/main.py -p /path/to/repo --rate-limit 0.1 # 0.1s between API calls
# Force sequential processing (disables parallel)
python3 src/main.py -p /path/to/repo --sequential
# Combine options
python3 src/main.py -p ../es.react.dev -l spanish --dry-run --max-files 10 -w 5

Translation Manager
============================================================
Codebase: /Users/you/es.react.dev
Target language: Spanish (es)
Processing: PARALLEL (5 workers, 0.2s rate limit)
Mode: DRY RUN (no files will be modified)
============================================================
Found 115 file(s) with merge conflicts.
Processing 234 conflicts across 115 files...
Using 5 parallel workers with 0.2s rate limit
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[1/115] learn/keeping-components-pure.md • Conflict 1/3
✓ KEPT EXISTING - Translation is close enough
Current:
// ¡Copia el array!
const storiesToDisplay = stories.slice();
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[1/115] learn/keeping-components-pure.md • Conflict 2/3
⟳ TRANSLATED
── Incoming (English) ──
- // Copy the array!
- const storiesToDisplay = stories.slice();
++ Resolved (Translated) ++
+ // ¡Copia el array!
+ const storiesToDisplay = stories.slice();
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[1/115] learn/keeping-components-pure.md • Conflict 3/3
✗ FAILED - Marked for manual review
Incoming (EN):
<SomeComplexComponent />
══════════════════════════════════════════════════════════════════════
PARALLEL PROCESSING COMPLETE
══════════════════════════════════════════════════════════════════════
Total conflicts: 234
✓ Kept existing: 150
⟳ Translated: 48
✗ Failed: 36
Time elapsed: 180.5s
Processing rate: 1.30 conflicts/sec
============================================================
SUMMARY
============================================================
Target language: Spanish (es)
Total files processed: 115
Total conflicts found: 234
Resolved automatically: 198
Need manual review: 36
Files modified: 98
The script is designed to be conservative and safe:
- Never outputs English: If translation fails, it keeps the conflict for manual review
- Never translates code: Only natural language text is translated; code stays in English
- Preserves formatting: Markdown, code blocks, and structure are maintained
- Validates output: Checks that translations appear to be in the target language
- Dry-run mode: Test without modifying any files
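The "validates output" safeguard could be implemented as a cheap heuristic like the one below, using the `accented_chars`, `char_ranges`, and `common_words` fields described under "Adding a New Language". The function name, field access, and fallback order are assumptions, not the tool's actual validation logic.

```python
def looks_like_language(text, lang_config):
    """Heuristic check that text plausibly belongs to the target language."""
    # Latin-script languages: look for language-specific accented characters.
    if any(c in text for c in lang_config.get("accented_chars", [])):
        return True
    # Non-Latin scripts: look for any character in the configured Unicode ranges.
    for lo, hi in lang_config.get("char_ranges", []):
        if any(lo <= ord(ch) <= hi for ch in text):
            return True
    # Fall back to common function words, padded with spaces to avoid substrings.
    words = lang_config.get("common_words", [])
    return any(w in f" {text} " for w in words)
```

A check like this cannot prove a translation is correct, but it can catch the worst failure mode: GPT returning English (or the wrong language) for the target text.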
translation-manager/
├── src/
│ ├── main.py # Entry point and orchestration
│ ├── conflict_detector.py # Finds merge conflicts in files
│ ├── translation_checker.py # Wraps OpenAI translation checks
│ ├── openai_client.py # OpenAI API integration
│ ├── file_processor.py # Resolves conflicts in files
│ ├── parallel_processor.py # Parallel processing with colored diffs
│ └── utils/
│ ├── file_utils.py # File operations
│ └── diff_utils.py # Diff utilities
├── config/
│ └── rules.py # Language configs and rules
├── tests/
│ ├── test_conflict_detector.py
│ ├── test_translation_checker.py
│ └── test_openai_client.py
├── requirements.txt # Python dependencies
└── README.md # This file
- Python 3.8+
- OpenAI API key with GPT access
- Dependencies:
- openai>=2.0.0
- python-dotenv==0.17.1
- requests==2.25.1
Make sure you have a .env.local file with your API key in either the codebase directory or the translation-manager directory, or specify the path with --env-file.
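The lookup order just described might look like this sketch. It is illustrative only: the function name and parameters are assumptions, and the real logic in src/main.py may differ.

```python
from pathlib import Path

def find_env_file(codebase_path, script_dir=None, explicit=None):
    """Return the first .env.local found, honoring an explicit --env-file path."""
    if explicit:
        return Path(explicit)  # --env-file always wins
    search_dirs = [Path(codebase_path)]       # 1. the codebase you specified
    if script_dir is not None:
        search_dirs.append(Path(script_dir))  # 2. the translation-manager directory
    for directory in search_dirs:
        candidate = directory / ".env.local"
        if candidate.is_file():
            return candidate
    return None
```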
Run `python3 src/main.py --list-languages` to see all supported languages. Use either the language name (e.g., "french") or the language code (e.g., "fr").
The script detected that GPT didn't properly translate to the target language. The conflict will be left for manual review.
If you hit rate limits:
- Use `--max-files` to process fewer files at once
- Reduce parallel workers with `--workers 2`
- Increase delay between calls with `--rate-limit 1.0`
- Or use `--sequential` to disable parallel processing entirely
This script uses OpenAI's GPT API:
- Each conflict requires 1-2 API calls (comparison + translation if needed)
- With 115 files and ~200 conflicts, expect ~300-400 API calls
- Estimated cost: $2-5 per full run
- Parallel processing makes it faster but doesn't change the number of API calls
Use --dry-run and --max-files to control costs while testing.
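The call-count estimate above follows from the 1-2 calls per conflict: one comparison call always, plus one translation call when the existing translation isn't kept. A back-of-the-envelope helper (the 50% translate fraction is an assumption for illustration):

```python
def estimate_calls(num_conflicts, translate_fraction=0.5):
    """Estimate total API calls: one comparison per conflict, plus one
    translation call for the fraction of conflicts that need retranslation."""
    comparisons = num_conflicts
    translations = int(num_conflicts * translate_fraction)
    return comparisons + translations

estimate_calls(234, translate_fraction=0.5)  # → 351, within the ~300-400 range
```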
To add support for a new language, edit config/rules.py and add an entry to SUPPORTED_LANGUAGES:
'your_language': {
'code': 'xx', # ISO 639-1 code
'name': 'Your Language',
'native_name': 'Native Name',
'accented_chars': ['á', 'é', ...], # Unique characters
'common_words': [' word1 ', ' word2 ', ...], # Common words with spaces
'string_indicators': ['indicator1', 'indicator2'], # Words in code strings
'keywords': ['kw1', 'kw2', ...], # Keywords for quick detection
}For non-Latin scripts, add char_ranges with Unicode code point ranges:
'char_ranges': [(0x0400, 0x04FF)], # Example: Cyrillic rangeMIT License