Voice Tools
Extract and profile specific voices from audio files using AI-powered speaker diarization and voice matching.
Choose a workflow below to get started.
Speaker Separation
Analyze multi-speaker audio files to automatically detect individual speakers and split them into separate audio streams.
Upload an audio file with multiple speakers, and this tool will:
- Detect all speakers automatically
- Separate each speaker's audio
- Export clean individual streams
Input File
Speaker Detection Settings
Results
Downloads
Usage Tips
How to Use:
- Upload Audio: Select an M4A, WAV, or MP3 file with multiple speakers
- Configure Detection:
  - Use min/max speakers for auto-detection (recommended)
  - Or set the exact speaker count if you know it
- Choose Output: Select format, sample rate, and bitrate
- Separate: Click the button and wait for processing
- Download: Get individual speaker files and a detailed report
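The separation step above can be pictured as grouping diarization output by speaker. The following is a minimal sketch, assuming the diarizer yields (start, end, label) tuples in seconds; the function names and the `SPEAKER_00`-style labels are illustrative and not taken from the tool itself.

```python
from collections import defaultdict


def group_segments_by_speaker(segments):
    """Group (start_sec, end_sec, speaker_label) tuples into per-speaker timelines.

    The tuple layout is an assumption made for this sketch.
    """
    by_speaker = defaultdict(list)
    for start, end, speaker in sorted(segments):
        if end <= start:
            continue  # skip empty or inverted segments
        by_speaker[speaker].append((start, end))
    return dict(by_speaker)


def total_speaking_time(timeline):
    """Sum the duration of every (start, end) pair in one speaker's timeline."""
    return sum(end - start for start, end in timeline)


# Hypothetical diarization output for a 12-second, two-speaker recording.
segments = [
    (0.0, 4.0, "SPEAKER_00"),
    (4.0, 9.5, "SPEAKER_01"),
    (9.5, 12.0, "SPEAKER_00"),
]
streams = group_segments_by_speaker(segments)
for speaker, timeline in streams.items():
    print(speaker, timeline, total_speaking_time(timeline))
```

Each timeline could then be used to cut the source audio into one stream per speaker; the cutting and export are left out here because they depend on the audio library in use.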
Best Practices:
- Clear audio with distinct speakers works best
- If you know the exact speaker count, specify it for better results
- Processing time scales with file duration (expect ~2x realtime)
- M4A format provides best quality-to-size ratio
- For long files (>1 hour), expect several minutes of processing
Troubleshooting:
- If fewer speakers are detected than expected, try increasing max_speakers
- If too many speakers are detected, try decreasing max_speakers
- Overlapping speech is assigned to the dominant speaker
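To make the relationship between the speaker-count settings concrete, here is a small sketch of how such settings might be resolved into bounds. `min_speakers` and `max_speakers` are the names used above; `num_speakers`, `resolve_speaker_bounds`, and `bound_hint` are hypothetical names introduced only for illustration, and the hint logic is a heuristic rather than the tool's documented behavior.

```python
def resolve_speaker_bounds(min_speakers=None, max_speakers=None, num_speakers=None):
    """Return (low, high) bounds on the speaker count; high=None means no cap.

    An exact count (num_speakers, a hypothetical parameter) overrides the range.
    """
    if num_speakers is not None:
        if num_speakers < 1:
            raise ValueError("num_speakers must be at least 1")
        return num_speakers, num_speakers
    low = 1 if min_speakers is None else min_speakers
    high = max_speakers
    if low < 1:
        raise ValueError("min_speakers must be at least 1")
    if high is not None and high < low:
        raise ValueError("max_speakers must not be smaller than min_speakers")
    return low, high


def bound_hint(detected, low, high):
    """Suggest which bound to revisit when the detected count sits on a limit."""
    if low == high:
        return None  # exact count was requested, nothing to tune
    if high is not None and detected == high:
        return "raise max_speakers"
    if detected == low and low > 1:
        return "lower min_speakers"
    return None


low, high = resolve_speaker_bounds(min_speakers=2, max_speakers=6)
print(low, high, bound_hint(6, low, high))
```

A detected count that lands exactly on a bound is only a hint that the bound may be constraining the result, so it is worth checking the audio before changing the setting.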
Extract a specific speaker from audio using a reference clip. Upload a short clip (3+ seconds) of the target speaker's voice, then upload the audio file to extract from.
Step 1: Upload Reference Clip
Step 2: Upload Target Audio
Step 3: Configure Parameters
Results
Examples
Voice Denoising
Remove silence and background noise from audio files using voice activity detection (VAD).
How it works:
- Upload an audio file
- Adjust VAD and silence thresholds
- Process the audio to remove unwanted segments
- Download the cleaned result
Tips:
- VAD Threshold: Higher values are more aggressive (remove more segments)
- Silence Threshold: Larger values keep longer silent gaps
- Min Duration: Filters out very short voice segments (reduces false positives)
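The three parameters above can be read as successive filters over per-frame speech probabilities. The sketch below is one plausible interpretation, not the tool's actual implementation: the function name, parameter names, default values, and the 0.25-second frame length are all assumptions.

```python
def speech_segments(frame_probs, frame_sec=0.25, vad_threshold=0.5,
                    silence_threshold_sec=0.25, min_duration_sec=0.5):
    """Turn per-frame speech probabilities into (start_sec, end_sec) segments."""
    # 1. VAD threshold: a frame counts as speech when its probability is high enough.
    raw = []
    start = None
    for i, prob in enumerate(frame_probs):
        is_speech = prob >= vad_threshold
        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            raw.append([start, i])
            start = None
    if start is not None:
        raw.append([start, len(frame_probs)])

    # 2. Silence threshold: gaps up to this length are kept by merging neighbors.
    merged = []
    for seg in raw:
        if merged and (seg[0] - merged[-1][1]) * frame_sec <= silence_threshold_sec:
            merged[-1][1] = seg[1]
        else:
            merged.append(seg)

    # 3. Min duration: drop segments too short to be real speech.
    return [(s * frame_sec, e * frame_sec)
            for s, e in merged
            if (e - s) * frame_sec >= min_duration_sec]


# Hypothetical probabilities for twelve 0.25-second frames.
probs = [0.1, 0.9, 0.8, 0.2, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1, 0.7, 0.1]
default_result = speech_segments(probs)
strict_result = speech_segments(probs, vad_threshold=0.85)
print(default_result)
print(strict_result)
```

With these inputs, the stricter threshold keeps less audio than the default, which matches the tip that higher VAD thresholds are more aggressive.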
Input
Parameters
Output
Examples
Extract specific voices from audio files using a reference clip. Upload a reference voice clip and one or more audio files to extract matching voice segments.
Input Files
Configuration
Results
Downloads
Examples
Quick Start Guide:
- Upload Reference Voice: A short, clear clip (5-30 seconds) of the voice you want to extract
- Upload Audio Files: One or more files to process (can be long recordings)
- Select Mode: Choose what to extract:
  - Speech: Only spoken words and sentences
  - Nonverbal: Sighs, laughs, moans, humming, etc.
  - Both: Everything from the matched voice
- Start Extraction: Click the button and wait for results
- Download: Get individual segments or download everything as a ZIP
Tips for Best Results:
- Use a high-quality reference clip with minimal background noise
- Reference should contain only the target voice (no other speakers)
- Enable VAD optimization for faster processing of sparse audio
- Adjust voice threshold if you're getting too many/few matches
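Reference-based matching of this kind is commonly done by comparing speaker embeddings against a similarity threshold. The sketch below assumes cosine similarity over fixed-length embedding vectors; whether the tool works this way is not stated above, and `match_segments`, `voice_threshold`, and the two-dimensional toy vectors are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_segments(reference, candidates, voice_threshold=0.7):
    """Return ids of candidate segments whose embedding is close to the reference.

    candidates is a list of (segment_id, embedding) pairs (an assumed layout).
    """
    return [seg_id for seg_id, emb in candidates
            if cosine_similarity(reference, emb) >= voice_threshold]


# Toy embeddings; real speaker embeddings have hundreds of dimensions.
reference = [1.0, 0.0]
candidates = [
    ("seg_a", [0.9, 0.1]),  # very close to the reference
    ("seg_b", [0.0, 1.0]),  # unrelated voice
    ("seg_c", [0.6, 0.8]),  # borderline match
]
strict_matches = match_segments(reference, candidates, voice_threshold=0.7)
loose_matches = match_segments(reference, candidates, voice_threshold=0.5)
print(strict_matches, loose_matches)
```

Under this assumption, a higher threshold returns fewer, more confident matches, and a lower one admits borderline segments such as `seg_c`.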