A speech-to-text application that lets you dictate text instead of typing, built using Python, CustomTkinter, and OpenAI's Whisper API.
Sometimes I prefer dictating text rather than typing it. Windows has a built-in speech-to-text feature, but I have to change my system language before I can dictate in a different language, which is inconvenient. After searching for alternatives, I came across WhisperTyping. It's free for now, but according to their FAQ, they plan to charge users in the future. Since I already have an OpenAI account, I decided to create my own program instead. This way, I can keep using it without worrying about switching apps later on.
- Voice-to-Text Conversion: Record your voice and convert it to text using OpenAI's Whisper API
- Customizable Hotkeys: Set your own keyboard shortcut for recording
- Multiple Recording Modes: Choose between "hold" (record while pressing) or "toggle" (press once to start/stop)
- History Management: Access your previous transcriptions with one click
If you're not familiar with coding, you can simply download the .exe file here. Just make sure you have an OpenAI API key.
But if you are comfortable with code, here's how to use the application:
- Python 3.10+
- OpenAI API key (for the Whisper speech-to-text service). Visit OpenAI API Keys for more information.
- Internet connection (for API access)
- Clone this repository:
  git clone https://github.com/rivalarya/too-lazy-to-type.git
  cd too-lazy-to-type
- Install dependencies:
  pip install -r requirements.txt
- Run the application:
  python main.py
On first run, you'll need to:
- Enter your OpenAI API key in the main window
- Set your preferred hotkey combination
- Choose your recording mode (hold or toggle)
All settings are automatically saved for future use.
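As a rough illustration of how settings could be persisted in config.json, here is a minimal sketch. The key names (`api_key`, `hotkey`, `mode`), the `"hold"` default, and the function names are assumptions made for this example; the actual utils/config_manager.py may store things differently.

```python
import json
from pathlib import Path

# Hypothetical defaults; the real config.json keys may differ.
DEFAULTS = {"api_key": "", "hotkey": "ctrl+shift", "mode": "hold"}


def load_config(path):
    """Return saved settings merged over the defaults."""
    path = Path(path)
    if not path.exists():
        # First run: nothing saved yet, so start from the defaults.
        return dict(DEFAULTS)
    with path.open(encoding="utf-8") as f:
        saved = json.load(f)
    return {**DEFAULTS, **saved}


def save_config(path, config):
    """Write settings as JSON so they survive a restart."""
    with Path(path).open("w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
```

Merging the saved values over the defaults means a config file written by an older version still loads if a new setting is added later.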
/too-lazy-to-type/
├── main.py # Main entry point
├── config.json # Configuration file
├── recording.wav # Temporary recording file
├── requirements.txt # Project dependencies
├── README.md # This documentation
├── .gitignore # Git ignore file
├── services/ # Service modules
│ ├── __init__.py
│ └── transcription_service.py # Transcription handling with OpenAI
├── ui/ # UI related modules
│ ├── __init__.py
│ ├── main_window.py # Main application window
│ ├── minimized_main_window.py # A window for when the application is minimized
│ └── ui_helper.py # Helper functions for UI
└── utils/ # Utility modules
    ├── __init__.py
    ├── audio_recorder.py # Audio recording functionality
    ├── config_manager.py # Configuration handling
    ├── history_manager.py # History management
    ├── hotkey_manager.py # Keyboard shortcut management
    └── paste_text_manager.py # Text pasting functionality
- Launch the application
- Enter your OpenAI API key and save it
- Press your configured hotkey to start recording
- Speak clearly into your microphone
- Release the hotkey (or press it again in toggle mode) to stop recording
- The transcribed text will be automatically inserted at your cursor position
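The transcription step in the flow above could look roughly like the sketch below. It assumes the official `openai` Python package, where a client is created with `OpenAI(api_key=...)` and audio is sent through `client.audio.transcriptions.create`. The function name `transcribe` is hypothetical, and the real services/transcription_service.py may be organized differently.

```python
def transcribe(client, wav_path, model="whisper-1"):
    """Send a recorded WAV file for transcription and return the text.

    `client` is assumed to be an `openai.OpenAI` instance, for example:
        from openai import OpenAI
        client = OpenAI(api_key="your-api-key")
    """
    # The API expects a binary file object, so open the recording in "rb" mode.
    with open(wav_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

Passing the client in as an argument keeps the function easy to try with a stub object, without a network connection or API credits.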
The default hotkey is ctrl+shift, but you can change it to any combination:
- Enter your desired key combination in the "Record Hotkey" field
- Click "Apply Hotkey"
Examples:
- ctrl+alt
- alt+f
- ctrl+shift+r
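Before a combination like the ones above is registered, it helps to normalize what the user typed. The sketch below is one possible approach, not necessarily what utils/hotkey_manager.py does; `parse_hotkey` and `MODIFIERS` are hypothetical names.

```python
MODIFIERS = ("ctrl", "alt", "shift")


def parse_hotkey(combo):
    """Normalize a combination such as 'Shift + Ctrl + R' to 'ctrl+shift+r'."""
    keys = [part.strip().lower() for part in combo.split("+")]
    if any(not key for key in keys):
        # Catches empty input and stray plus signs such as 'ctrl++r'.
        raise ValueError(f"invalid hotkey: {combo!r}")
    if len(set(keys)) != len(keys):
        raise ValueError(f"duplicate key in hotkey: {combo!r}")
    # List modifiers first, in a fixed order, so equivalent inputs compare equal.
    mods = [m for m in MODIFIERS if m in keys]
    others = [k for k in keys if k not in MODIFIERS]
    return "+".join(mods + others)
```

The normalized string uses the same `key+key` form that the `keyboard` library accepts, for example in `keyboard.add_hotkey("ctrl+shift+r", callback)`.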
- Hold Mode: Records while you're holding down the hotkey
- Toggle Mode: Press once to start recording, press again to stop
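The difference between the two modes comes down to how key press and release events change the recording state. Here is a minimal sketch of that logic; the class and method names are hypothetical and the app's own implementation may differ.

```python
class RecordingController:
    """Track whether recording is active in 'hold' or 'toggle' mode."""

    def __init__(self, mode="hold"):
        if mode not in ("hold", "toggle"):
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode
        self.recording = False
        self._pressed = False  # lets us ignore auto-repeat while the key is held

    def on_press(self):
        """Handle a hotkey press and return the new recording state."""
        if self._pressed:
            return self.recording  # repeated press event from a held key
        self._pressed = True
        if self.mode == "hold":
            self.recording = True
        else:
            self.recording = not self.recording
        return self.recording

    def on_release(self):
        """Handle a hotkey release and return the new recording state."""
        self._pressed = False
        if self.mode == "hold":
            self.recording = False
        return self.recording
```

In hold mode the release event stops the recording, while in toggle mode the release is ignored and only the next press stops it.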
All your transcriptions are saved automatically. To reuse a previous transcription:
- Click on any entry in the history panel to view the full text
- Use the "Copy" button to copy it to clipboard
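One simple way to keep such a history is a JSON file with the newest entry first. The sketch below assumes that storage format, a 50-entry limit, and the function names `add_entry` and `preview`; none of these are confirmed by the project, and utils/history_manager.py may work differently.

```python
import json
from datetime import datetime
from pathlib import Path


def add_entry(path, text, limit=50):
    """Prepend a transcription to a JSON history file and return all entries."""
    path = Path(path)
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    stamp = datetime.now().isoformat(timespec="seconds")
    entries.insert(0, {"text": text, "time": stamp})
    entries = entries[:limit]  # keep only the newest `limit` entries
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries


def preview(text, width=40):
    """Shorten long text for a list row; the full text stays in the file."""
    return text if len(text) <= width else text[: width - 3] + "..."
```

The history panel can then show `preview(entry["text"])` for each row and display the full `entry["text"]` when a row is clicked.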
If text doesn't paste correctly:
- Try adjusting your system's keyboard repeat rate
- Ensure no other application is capturing your keyboard input
- Check if the application you're pasting into has any input restrictions
If you receive an API key error:
- Verify your OpenAI API key is correct
- Check that your account has access to the Whisper API
- Ensure you have sufficient credits in your OpenAI account
If recording isn't working:
- Check your microphone settings in your OS
- Ensure your microphone is set as the default input device
- Try restarting the application
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper API for speech recognition
- CustomTkinter for the modern UI
- PyAudio for audio recording functionality
- keyboard for global hotkey support
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request