🔄 Translation and Transcription System

A powerful web application that converts audio to text and translates it into multiple languages using advanced AI models.

📋 Table of Contents

Overview
Application Interface
Features
Technologies
Installation
Usage
Supported Languages
How It Works
Links

🎯 Overview

This application provides an intuitive interface for real-time audio transcription and translation. Upload an audio file or record audio, select your target language, and get both the transcribed text and its translation as soon as processing finishes. The app features a modern dark theme and works on both desktop and mobile devices.

📱 Application Interface

Laptop Mode

Application interface in laptop mode showing the full desktop layout

Mobile View

Mobile interface showing the responsive design for smaller screens

✨ Features

Audio Transcription: Convert speech to text using OpenAI's Whisper model
Multi-language Translation: Translate transcribed text into 7 different languages
Real-time Processing: Live transcription and translation
Multiple Input Methods: Upload audio files or record directly through the interface
Responsive Design: Layout adapts to desktop and mobile screens
Dark Theme Interface: Modern, eye-friendly design
Format Support: WAV, MP3, M4A, FLAC audio formats
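
As an illustration of the format support listed above, here is a minimal sketch of an extension check. The names `SUPPORTED_EXTENSIONS` and `is_supported_audio` are hypothetical and are not taken from `app.py`; the app may validate input differently or leave it to the audio loader.

```python
from pathlib import Path

# Formats named in the feature list; the constant name is illustrative only.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_audio("interview.MP3"))
print(is_supported_audio("notes.txt"))
```

A check like this only looks at the file name, so it cannot tell whether the file actually decodes.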

🛠️ Technologies

OpenAI Whisper: State-of-the-art speech recognition model
Gradio: Web interface framework for machine learning applications
Deep Translator: Translation library using Google Translate API
PyTorch: Deep learning framework for model execution
Python 3.7+: Core programming language

🚀 Installation

Prerequisites

Python 3.7 or higher
Git

Setup Instructions

Clone the repository

git clone https://huggingface.co/spaces/malimalikayesha/Transcription_and_Translation_App
cd Transcription_and_Translation_App

Create virtual environment

python -m venv venv
venv\Scripts\activate  # On Windows
# source venv/bin/activate  # On macOS/Linux

Install dependencies
```
pip install -r requirements.txt
```
Run the application
```
python app.py
```
Access the application
- Click on the link that appears in the terminal (typically http://127.0.0.1:7860)
- The interface will load in your browser
- Note: Transcription and translation may take a while when running on local CPU
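
If the app fails to start, a quick environment check can help narrow things down. The sketch below is not part of the repository. It assumes the usual import names (`whisper` for openai-whisper, `deep_translator` for deep-translator), and the helper `missing_modules` is hypothetical.

```python
import importlib.util
import sys

# Import names are assumptions: openai-whisper installs as "whisper",
# deep-translator installs as "deep_translator".
REQUIRED_MODULES = ["whisper", "gradio", "deep_translator", "torch"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The README lists Python 3.7 as the minimum version.
print("Python version OK:", sys.version_info >= (3, 7))
print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

Run it inside the activated virtual environment. An empty list means every module was found.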

🎮 Usage

Step-by-Step Guide

Upload/Record Audio
- Click "Drop Audio Here" to browse for a file, or drag and drop your audio file onto it
- Supported formats: WAV, MP3, M4A, FLAC
- Or use the microphone icon to record directly
Select Target Language
- Choose your desired translation language from the dropdown
- Available options: English, Spanish, French, German, Chinese, Japanese, Urdu
Process & View Results
- The system automatically processes your audio
- View the original transcribed text in the left panel
- See the translated text in the right panel
- Use the "Clear" button to reset and start over
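
The steps above map naturally onto a Gradio `Interface` with one audio input, one dropdown, and two text outputs. The following is a rough sketch of that wiring, not the code from `app.py`. `handle_request`, `build_interface`, the labels, and the default dropdown value are all assumptions, and `run_pipeline` stands in for whatever function performs the transcription and translation.

```python
# Dropdown options as listed in the usage guide.
LANGUAGE_CHOICES = [
    "English", "Spanish", "French", "German", "Chinese", "Japanese", "Urdu",
]


def handle_request(audio_path, language, run_pipeline):
    """Return (transcript, translation); empty strings when no audio was provided."""
    if not audio_path:
        return "", ""
    return run_pipeline(audio_path, language)


def build_interface(run_pipeline):
    """Build a Gradio Interface around a (audio_path, language) -> (text, text) callable."""
    import gradio as gr  # imported lazily so handle_request works without Gradio

    return gr.Interface(
        fn=lambda audio, lang: handle_request(audio, lang, run_pipeline),
        inputs=[
            gr.Audio(type="filepath", label="Audio"),
            gr.Dropdown(choices=LANGUAGE_CHOICES, value="English",
                        label="Target language"),
        ],
        outputs=[
            gr.Textbox(label="Transcription"),
            gr.Textbox(label="Translation"),
        ],
    )


# Hypothetical usage: build_interface(my_pipeline).launch()
```

`gr.Interface` supplies its own Clear button. Whether the microphone is offered by default depends on the installed Gradio version.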

🌍 Supported Languages

| Language | Code | Language | Code |
|---|---|---|---|
| English 🇺🇸 | en | German 🇩🇪 | de |
| Spanish 🇪🇸 | es | Chinese (Simplified) 🇨🇳 | zh-cn |
| French 🇫🇷 | fr | Japanese 🇯🇵 | ja |
| Urdu 🇵🇰 | ur | | |
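
In code, the table above amounts to a lookup from display name to language code. A minimal sketch follows. The dictionary name `LANGUAGE_CODES` and the helper `code_for` are hypothetical, and the codes are copied from the table as written; some translation backends may expect different casing for the Chinese code.

```python
# Display name -> language code, copied from the table above.
LANGUAGE_CODES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Chinese (Simplified)": "zh-cn",
    "Japanese": "ja",
    "Urdu": "ur",
}


def code_for(language: str) -> str:
    """Look up the code for a display name, raising ValueError for unknown names."""
    try:
        return LANGUAGE_CODES[language]
    except KeyError:
        raise ValueError(f"Unsupported language: {language!r}") from None


print(code_for("Urdu"))
```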

⚙️ How It Works

Technical Process

Audio Processing
- Audio files are loaded and split or padded into the 30-second windows that Whisper expects
- Converted to log-Mel spectrogram format for processing
Speech Recognition
- OpenAI Whisper "base" model processes the audio
- Generates high-accuracy text transcription
Language Translation
- Google Translator automatically detects the source language
- Translates the transcribed text to the selected target language
Real-time Display
- Both transcribed and translated texts are displayed simultaneously
- Results appear as soon as processing completes
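
The four stages above can be sketched as follows. This is an approximation under stated assumptions, not the code in `app.py`. It uses Whisper's low-level API (`load_audio`, `pad_or_trim`, `log_mel_spectrogram`, `decode`), which handles a single 30-second window, and deep-translator's `GoogleTranslator` with `source="auto"`. The function names and the pure-Python `pad_or_trim` are illustrative only.

```python
SAMPLE_RATE = 16_000                             # Whisper loads audio at 16 kHz
WINDOW_SECONDS = 30
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS    # samples in one 30-second window


def pad_or_trim(samples, length=WINDOW_SAMPLES):
    """Pure-Python illustration: cut or zero-pad a list of samples to a fixed length."""
    if len(samples) >= length:
        return samples[:length]
    return samples + [0.0] * (length - len(samples))


def whisper_transcribe(audio_path, model_name="base"):
    """Transcribe the first 30 seconds of a file with Whisper's low-level API."""
    import whisper  # openai-whisper, imported lazily

    # Loading the model on every call is slow; a real app would load it once.
    model = whisper.load_model(model_name)
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
    mel = whisper.log_mel_spectrogram(audio).to(model.device)
    # fp16=False keeps decoding in 32-bit floats, which also works on CPU.
    result = whisper.decode(model, mel, whisper.DecodingOptions(fp16=False))
    return result.text


def google_translate(text, target_code):
    """Translate text, letting the backend detect the source language."""
    from deep_translator import GoogleTranslator  # imported lazily

    return GoogleTranslator(source="auto", target=target_code).translate(text)


def transcribe_and_translate(audio_path, target_code,
                             transcribe=whisper_transcribe,
                             translate=google_translate):
    """Return (transcript, translation); skip translation for blank transcripts."""
    text = transcribe(audio_path)
    translation = translate(text, target_code) if text.strip() else ""
    return text, translation
```

The `transcribe` and `translate` parameters are injectable so the flow can be exercised without downloading a model. For recordings longer than 30 seconds, Whisper's higher-level `model.transcribe(path)["text"]` processes the whole file window by window.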

🔗 Links

Live Demo: Hugging Face Space
Whisper Documentation: OpenAI Whisper
Gradio Documentation: Gradio Docs

Made with ❤️ by malimalikayesha
