Real-Time Voice Translation Application for macOS

A desktop application that captures live audio from video calls or other system audio, transcribes speech with OpenAI Whisper, and translates the text in real time into your chosen language.

🎯 Features

Live Audio Capture: Capture from microphone or system audio (video calls, music, etc.)
Offline Speech Recognition: Uses OpenAI Whisper for accurate transcription
Real-Time Translation: Instant translation using Google Translate
Multi-Language Support: 18+ languages including English, Spanish, French, German, Persian, Arabic, Chinese, Japanese, and more
User-Friendly Interface: Clean Tkinter-based GUI with real-time text display
macOS Optimized: Compatible with both Intel and Apple Silicon (M1/M2) Macs

📋 Prerequisites

System Requirements

macOS 10.15 (Catalina) or later
Python 3.8 or later
At least 4GB RAM (8GB recommended for better performance)
Internet connection (for translation service)

Required Software

Python 3.8+: Download from python.org
Homebrew: Install from brew.sh
Xcode Command Line Tools: Run xcode-select --install

🛠️ Installation Guide

Step 1: Clone or Download the Project

# If using git
git clone <repository-url>
cd voice-translator

# Or download and extract the files to a folder

Step 2: Install System Dependencies

# Install Homebrew (if not already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install PortAudio for audio processing
brew install portaudio

# Install FFmpeg (required by Whisper)
brew install ffmpeg

Step 3: Set Up Python Environment

# Create a virtual environment
python3 -m venv venv

# Activate the virtual environment
source venv/bin/activate

# Upgrade pip
pip install --upgrade pip

Step 4: Install Python Dependencies

# Install all required packages
pip install -r requirements.txt

# If you encounter issues with torch on Apple Silicon:
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu

Step 5: Install BlackHole (Optional - for System Audio Capture)

To capture audio from video calls (Zoom, Skype, FaceTime, etc.):

Download BlackHole from existential.audio/blackhole
Install the .pkg file
Configure Audio MIDI Setup:
- Open "Audio MIDI Setup" (Applications > Utilities)
- Create a "Multi-Output Device"
- Select both your speakers and BlackHole
- Set this as your default output device
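
Once BlackHole is installed, the application still has to select it as an input device. A minimal sketch of how that lookup could work, assuming the device list comes from `sounddevice.query_devices()` (sounddevice is the capture library named under Technical Details). The helper names are hypothetical and are not taken from voice_translator.py:

```python
from typing import Iterable, Mapping, Optional


def find_input_device(devices: Iterable[Mapping], name_fragment: str = "BlackHole") -> Optional[int]:
    """Return the index of the first input-capable device whose name contains name_fragment.

    `devices` is expected to look like the result of sounddevice.query_devices():
    a sequence of dicts with at least 'name' and 'max_input_channels' keys.
    """
    for index, device in enumerate(devices):
        has_input = device.get("max_input_channels", 0) > 0
        if has_input and name_fragment.lower() in device.get("name", "").lower():
            return index
    return None


def find_blackhole_on_this_mac() -> Optional[int]:
    """Query the real device list; requires `pip install sounddevice`."""
    import sounddevice as sd  # imported lazily so the helper above works without it

    return find_input_device(sd.query_devices())
```

If the lookup returns None, BlackHole is either not installed or not visible to the audio backend; restarting the application after installation is worth trying first.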

🚀 Usage

Running the Application

# Make sure you're in the project directory and virtual environment is activated
source venv/bin/activate

# Run the application
python voice_translator.py

Using the Interface

Language Selection:
- Choose source language (or "Auto Detect")
- Select target language for translation
Audio Input:
- Microphone: Captures your voice directly
- System Audio: Captures all system audio (requires BlackHole setup)
Controls:
- Start Listening: Begin audio capture and processing
- Stop Listening: Stop the process
- Clear Text: Clear both text areas
Real-Time Display:
- Left panel shows original transcribed text
- Right panel shows translated text
- Timestamps are included for each entry

Tips for Best Results

Audio Quality: Ensure clear audio input with minimal background noise
Speaking Pace: Speak clearly and at a moderate pace
Language Detection: Use "Auto Detect" for mixed-language conversations
System Audio: Make sure BlackHole is properly configured for video call capture

🔧 Configuration Options

Supported Languages

Source Languages: Auto-detect, English, Spanish, French, German, Italian, Portuguese, Russian, Chinese, Japanese, Korean, Arabic, Persian, Turkish, Dutch, Swedish, Norwegian, Danish, Finnish
Target Languages: All of the above except Auto-detect
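
Both Whisper and Google Translate identify languages by short codes rather than display names. The sketch below shows how the names above might map to codes. The dictionary and function names are hypothetical, and the codes are the common ISO 639-1 values rather than values confirmed from voice_translator.py. Google Translate normally distinguishes simplified (`zh-cn`) from traditional (`zh-tw`) Chinese, while Whisper uses plain `zh`, so that entry may need per-service handling.

```python
from typing import Optional

# Display name -> ISO 639-1 code (assumed mapping, not read from the app's source).
LANGUAGE_CODES = {
    "English": "en", "Spanish": "es", "French": "fr", "German": "de",
    "Italian": "it", "Portuguese": "pt", "Russian": "ru", "Chinese": "zh",
    "Japanese": "ja", "Korean": "ko", "Arabic": "ar", "Persian": "fa",
    "Turkish": "tr", "Dutch": "nl", "Swedish": "sv", "Norwegian": "no",
    "Danish": "da", "Finnish": "fi",
}

AUTO_DETECT = "Auto Detect"


def source_code(name: str) -> Optional[str]:
    """Return the code for a source language, or None to request auto-detection."""
    if name == AUTO_DETECT:
        return None
    return LANGUAGE_CODES[name]


def target_code(name: str) -> str:
    """Return the code for a target language; auto-detect is not a valid target."""
    if name == AUTO_DETECT:
        raise ValueError("Target language must be explicit")
    return LANGUAGE_CODES[name]
```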

Audio Settings

Sample Rate: 16kHz (optimized for speech)
Chunk Duration: 3 seconds (adjustable in code)
Processing: Near real-time; each chunk is transcribed after it is captured, so text appears at least one chunk duration behind the speaker
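
At 16 kHz, a 3-second chunk holds 16,000 × 3 = 48,000 samples. The sketch below shows one way incoming audio blocks could be accumulated into fixed-size chunks before transcription. It uses plain Python lists to stay dependency-free; a real implementation would more likely use NumPy arrays, and the class name `ChunkBuffer` is hypothetical.

```python
from typing import List

SAMPLE_RATE = 16_000   # Hz, from the settings above
CHUNK_SECONDS = 3      # seconds of audio per transcription request


class ChunkBuffer:
    """Accumulate small audio blocks and emit fixed-size chunks."""

    def __init__(self, sample_rate: int = SAMPLE_RATE, chunk_seconds: float = CHUNK_SECONDS):
        self.chunk_size = int(sample_rate * chunk_seconds)
        self._samples: List[float] = []

    def add(self, block: List[float]) -> List[List[float]]:
        """Append a block and return any complete chunks now available."""
        self._samples.extend(block)
        chunks = []
        while len(self._samples) >= self.chunk_size:
            chunks.append(self._samples[: self.chunk_size])
            self._samples = self._samples[self.chunk_size:]
        return chunks
```

Shorter chunks reduce the delay before text appears but give the recognizer less context per request, which can hurt accuracy.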

🐛 Troubleshooting

Common Issues

1. "No module named 'whisper'" Error

pip install openai-whisper

2. Audio Device Not Found

Check that your microphone/audio device is connected
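Grant microphone permissions in System Preferences > Security & Privacy > Privacy > Microphone (on macOS 13 Ventura and later: System Settings > Privacy & Security > Microphone)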

3. BlackHole Not Working

Restart the application after installing BlackHole
Check Audio MIDI Setup configuration
Ensure BlackHole is selected as the input device when capturing system audio

4. Translation Errors

Check internet connection
Try different source/target language combinations
Restart the application if translation service becomes unresponsive
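
Because the translation step depends on a network service, transient failures are expected. A small retry wrapper with exponential backoff is one way to avoid restarting the application on every hiccup. This is a sketch, not the application's actual behavior: `translate_fn` stands in for whatever call the app makes to googletrans, and the function name and defaults are hypothetical.

```python
import time
from typing import Callable


def translate_with_retry(
    translate_fn: Callable[[str], str],
    text: str,
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call translate_fn(text), retrying on any exception with exponential backoff.

    Re-raises the last exception if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return translate_fn(text)
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # waits 0.5 s, then 1 s, then 2 s, ...
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists only so the delay can be replaced in tests; normal callers can leave it at its default.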

5. Performance Issues

Close other resource-intensive applications
Use the "base" Whisper model (the default) or the smaller "tiny" model; larger models are more accurate but slower
Consider upgrading RAM if processing is slow

macOS Permissions

The app may request permissions for:

Microphone Access: Required for audio capture
Accessibility: May be needed for system audio capture

Grant these permissions in System Preferences > Security & Privacy (System Settings > Privacy & Security on macOS 13 Ventura and later).

🔄 Updates and Improvements

Potential Enhancements

Offline Translation: Add support for offline translation models
Audio Recording: Save audio clips with translations
Custom Models: Support for specialized Whisper models
Hotkeys: Global shortcuts for start/stop
Themes: Dark mode and custom UI themes

Performance Optimization

Use the "tiny" Whisper model for faster processing; "small" and larger models improve accuracy but run slower than the default "base"
Adjust chunk duration based on your needs
Consider using GPU acceleration if available
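
Model size is the main speed lever. The sketch below shows how a model name and decoding options might be passed to the openai-whisper package. `whisper.load_model` and `model.transcribe` are real calls in that package; the wrapper functions, and the assumption that the app exposes the model name as a setting, are hypothetical.

```python
from typing import Any, Dict, Optional


def transcribe_options(source_language: Optional[str], use_gpu: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for whisper's model.transcribe().

    source_language: a code such as "en", or None to let Whisper detect the language.
    use_gpu: half precision (fp16) is meant for GPU inference; on CPU Whisper
             falls back to fp32 with a warning, so it is disabled here.
    """
    return {"language": source_language, "fp16": use_gpu}


def transcribe_file(path: str, model_name: str = "tiny", source_language: Optional[str] = None) -> str:
    """Transcribe an audio file; requires `pip install openai-whisper` and FFmpeg."""
    import whisper  # imported lazily: loading it pulls in torch

    model = whisper.load_model(model_name)  # downloaded on first use, then cached
    result = model.transcribe(path, **transcribe_options(source_language))
    return result["text"].strip()
```

Passing `source_language=None` corresponds to the "Auto Detect" option; supplying an explicit code skips detection.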

📝 Technical Details

Architecture

Audio Capture: sounddevice library with real-time streaming
Speech Recognition: OpenAI Whisper (local processing)
Translation: Google Translate via googletrans (an unofficial client library)
GUI: Tkinter (cross-platform, included with Python)
Threading: Separate threads for audio processing and UI updates
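
The separation between capture, processing, and UI can be sketched with a queue and a worker thread. This is an illustrative pattern under stated assumptions, not the code in voice_translator.py: `transcribe` and `translate` are placeholder callables, and in a Tkinter app the `on_result` callback would need to hand results to the main thread (for example via `root.after`) rather than touch widgets directly.

```python
import queue
import threading
from typing import Callable, Optional

_STOP = None  # sentinel placed on the queue to shut the worker down


def start_worker(
    audio_queue: "queue.Queue[Optional[bytes]]",
    transcribe: Callable[[bytes], str],
    translate: Callable[[str], str],
    on_result: Callable[[str, str], None],
) -> threading.Thread:
    """Start a daemon thread that turns queued audio chunks into (original, translated) pairs."""

    def run() -> None:
        while True:
            chunk = audio_queue.get()  # blocks until the capture side adds a chunk
            if chunk is _STOP:
                break
            original = transcribe(chunk)
            if original:  # skip silent or empty chunks
                on_result(original, translate(original))

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker
```

The audio callback only has to call `audio_queue.put(chunk)`, so slow transcription never blocks capture; putting `None` on the queue stops the worker.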

File Structure

voice-translator/
├── voice_translator.py    # Main application
├── requirements.txt       # Python dependencies
├── README.md             # This file
└── venv/                 # Virtual environment (created during setup)

🆘 Support

If you encounter issues:

Check the troubleshooting section above
Ensure all dependencies are properly installed
Verify macOS permissions are granted
Try running with different language combinations
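
The dependency check can be partly automated. The sketch below reports which Python modules fail to resolve in the active environment. The module list is an assumption based on the libraries named in this README (the actual requirements.txt may differ), and the function name is hypothetical.

```python
import importlib.util
from typing import Iterable, List

# Import names assumed from the libraries this README mentions.
ASSUMED_MODULES = ("whisper", "sounddevice", "googletrans", "torch", "tkinter")


def missing_modules(names: Iterable[str] = ASSUMED_MODULES) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Run it with the virtual environment activated; an empty list means every listed module resolved. The import name `whisper` is provided by the `openai-whisper` package, as shown in the first troubleshooting item.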

📄 License

This project is for educational and personal use. Please respect the terms of service for the translation services used.

Note: This application requires an internet connection for translation services. Speech recognition (Whisper) works offline once the model is downloaded.
