👥 Speaker Separation

Analyze multi-speaker audio files to detect individual speakers automatically and split each one into its own audio stream.

Upload an audio file with multiple speakers, and this tool will:

- Detect all speakers automatically
- Separate each speaker's audio
- Export clean individual streams

📤 Input File

Multi-Speaker Audio File

⚙️ Speaker Detection Settings

Minimum Speakers

Minimum number of speakers expected (range: 1 to 10)

Maximum Speakers

Maximum number of speakers expected (range: 1 to 10)

Exact Speaker Count (0 = auto-detect)

Set to a non-zero value to specify the exact number of speakers (range: 0 to 10)
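The three settings above interact: an exact count, when non-zero, replaces the min/max range. The following is a minimal sketch of how such settings could be resolved into a single search range. The names `SpeakerBounds` and `resolve_speaker_bounds` are hypothetical and are not taken from the tool itself; only the slider ranges (1 to 10, and 0 to 10 for the exact count) come from the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeakerBounds:
    """Inclusive range of speaker counts the detector may return."""
    min_speakers: int
    max_speakers: int


def resolve_speaker_bounds(min_speakers: int, max_speakers: int,
                           exact: int = 0, limit: int = 10) -> SpeakerBounds:
    """Combine the three slider values into one (min, max) range.

    Assumption: a non-zero exact count overrides min/max entirely.
    """
    if not 0 <= exact <= limit:
        raise ValueError(f"exact must be between 0 and {limit}")
    if exact != 0:
        # Exact count given: pin both ends of the range to it.
        return SpeakerBounds(exact, exact)
    for name, value in (("min_speakers", min_speakers),
                        ("max_speakers", max_speakers)):
        if not 1 <= value <= limit:
            raise ValueError(f"{name} must be between 1 and {limit}")
    if min_speakers > max_speakers:
        raise ValueError("min_speakers cannot exceed max_speakers")
    return SpeakerBounds(min_speakers, max_speakers)
```

For example, `resolve_speaker_bounds(2, 5)` leaves the range at 2 to 5 for auto-detection, while `resolve_speaker_bounds(2, 5, exact=3)` pins it to exactly 3.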

Output Format

Options: m4a, wav, mp3

Sample Rate (Hz)

Range: 8000 to 48000

Bitrate
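To illustrate how the format, sample rate, and bitrate settings relate, here is a rough sketch that builds an `ffmpeg` command line for one exported stream. Whether this tool uses `ffmpeg` at all is an assumption, and `build_ffmpeg_args` is a hypothetical helper. The flags themselves (`-y`, `-i`, `-ar`, `-b:a`) are standard `ffmpeg` options.

```python
from typing import List

SUPPORTED_FORMATS = {"m4a", "wav", "mp3"}
LOSSY_FORMATS = {"m4a", "mp3"}  # WAV is uncompressed PCM, so bitrate is not applied


def build_ffmpeg_args(src: str, dst_stem: str, fmt: str = "m4a",
                      sample_rate: int = 16000, bitrate: str = "128k") -> List[str]:
    """Return an ffmpeg argument list that re-encodes src with the chosen settings."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if not 8000 <= sample_rate <= 48000:
        raise ValueError("sample_rate must be between 8000 and 48000 Hz")
    args = ["ffmpeg", "-y", "-i", src, "-ar", str(sample_rate)]
    if fmt in LOSSY_FORMATS:
        # Audio bitrate only matters for compressed output.
        args += ["-b:a", bitrate]
    args.append(f"{dst_stem}.{fmt}")
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)` on a machine where `ffmpeg` is installed.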

📊 Results

Status

💾 Downloads

📚 Usage Tips

How to Use:

1. Upload Audio: Select an M4A, WAV, or MP3 file with multiple speakers
2. Configure Detection:
   - Use min/max speakers for auto-detection (recommended)
   - Or set exact speaker count if you know it
3. Choose Output: Select format, sample rate, and bitrate
4. Separate: Click the button and wait for processing
5. Download: Get individual speaker files and a detailed report

Best Practices:

- Clear audio with distinct speakers works best
- If you know the exact speaker count, specify it for better results
- Processing time scales with file duration (expect ~2x realtime)
- M4A format provides best quality-to-size ratio
- For long files (>1 hour), expect several minutes of processing

Troubleshooting:

- If fewer speakers are detected than expected, try increasing max_speakers
- If too many speakers are detected, try decreasing max_speakers
- Overlapping speech is assigned to the dominant speaker
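The overlap rule can be pictured as a per-frame "winner takes all" choice. The sketch below assumes the separator produces one activity score per speaker per frame and that "dominant" means the highest score. The tool may define it differently. `assign_dominant` and the `floor` cutoff are hypothetical.

```python
from typing import List, Sequence


def assign_dominant(scores: Sequence[Sequence[float]], floor: float = 0.5) -> List[int]:
    """Pick one speaker per frame.

    scores[t][s] is the activity score of speaker s in frame t.
    Returns the index of the highest-scoring speaker for each frame,
    or -1 when no speaker reaches the floor (treated as silence).
    """
    labels: List[int] = []
    for frame in scores:
        if not frame:
            labels.append(-1)
            continue
        best = max(range(len(frame)), key=lambda s: frame[s])
        labels.append(best if frame[best] >= floor else -1)
    return labels


frames = [
    [0.9, 0.1],  # only speaker 0 is active
    [0.6, 0.8],  # both overlap; speaker 1 is louder and wins the frame
    [0.2, 0.1],  # nobody reaches the floor
]
print(assign_dominant(frames))  # [0, 1, -1]
```

A consequence of this kind of rule is that the quieter voice in an overlap is dropped from its own stream for that frame, which is why clear turn-taking tends to separate better.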

Extract a specific speaker from audio using a reference clip. Upload a short clip (3+ seconds) of the target speaker's voice, then upload the audio file to extract from.

Step 1: Upload Reference Clip

Reference Clip (3+ seconds of target speaker)

Reference Validation

Step 2: Upload Target Audio

Target Audio File

Step 3: Configure Parameters

Matching Threshold

Lower = stricter matching (range: 0.0 to 1.0)

Minimum Confidence

Minimum confidence to include segments (range: 0.0 to 1.0)
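One way to read these two parameters: the matching threshold caps the distance between a segment's voice embedding and the reference embedding (so a lower value is stricter), while the minimum confidence filters out segments the detector itself was unsure about. The sketch below assumes cosine distance over plain float vectors. The actual embedding model and distance metric used by the tool are not stated on the page, and `select_segments` is a hypothetical name.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # treat a zero vector as "no match"
    return 1.0 - dot / norm


def select_segments(reference: Sequence[float],
                    segments: Sequence[Tuple[Sequence[float], float]],
                    threshold: float = 0.5,
                    min_confidence: float = 0.5) -> List[int]:
    """Return indices of segments that match the reference voice.

    Each segment is (embedding, confidence). A segment is kept when its
    distance to the reference is at most `threshold` AND its confidence
    is at least `min_confidence`.
    """
    kept: List[int] = []
    for index, (embedding, confidence) in enumerate(segments):
        close_enough = cosine_distance(reference, embedding) <= threshold
        if close_enough and confidence >= min_confidence:
            kept.append(index)
    return kept
```

With this reading, lowering the threshold rejects more borderline voices, and raising the minimum confidence rejects more uncertain segments. Both changes shrink the extracted audio.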

Results

Status

Extracted Audio

Voice Denoising

Remove silence and background noise from audio files using voice activity detection (VAD).

How it works:

1. Upload an audio file
2. Adjust VAD and silence thresholds
3. Process the audio to remove unwanted segments
4. Download the cleaned result

Tips:

- VAD Threshold: Higher values are more aggressive (remove more segments)
- Silence Threshold: Larger values keep longer silent gaps
- Min Duration: Filters out very short voice segments (reduces false positives)
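The three tips above can be sketched as a small post-processing pass over per-frame speech probabilities. This is an illustration under stated assumptions, not the tool's actual pipeline: the VAD model is assumed to emit one probability per fixed-length frame, short segments are assumed to be dropped before gaps are bridged, and `speech_segments` is a hypothetical name.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds)


def speech_segments(probs: Sequence[float], frame_sec: float = 0.03,
                    vad_threshold: float = 0.5, silence_sec: float = 1.0,
                    min_segment_sec: float = 0.3) -> List[Segment]:
    """Turn per-frame speech probabilities into the segments to keep."""
    # 1. VAD threshold: a frame counts as speech when prob >= threshold,
    #    so a higher threshold discards more frames.
    raw: List[Segment] = []
    start = None
    for i, p in enumerate(probs):
        if p >= vad_threshold:
            if start is None:
                start = i
        elif start is not None:
            raw.append((start * frame_sec, i * frame_sec))
            start = None
    if start is not None:
        raw.append((start * frame_sec, len(probs) * frame_sec))

    # 2. Min duration: drop very short bursts (likely false positives).
    voiced = [s for s in raw if s[1] - s[0] >= min_segment_sec]

    # 3. Silence threshold: gaps up to silence_sec are kept by merging the
    #    neighbouring segments; longer gaps are cut out of the result.
    merged: List[Segment] = []
    for seg in voiced:
        if merged and seg[0] - merged[-1][1] <= silence_sec:
            merged[-1] = (merged[-1][0], seg[1])
        else:
            merged.append(seg)
    return merged
```

Running the minimum-duration filter before merging means an isolated click near real speech is discarded rather than bridged into it. The opposite order would be more forgiving of speech that the detector briefly drops.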

Input

Audio File

Parameters

VAD Threshold

Voice activity detection threshold (higher = more aggressive; range: 0 to 1)

Silence Threshold (seconds)

Maximum silence duration to keep; longer gaps are removed (range: 0.5 to 5 seconds)

Min Segment Duration (seconds)

Minimum voice segment length to keep (range: 0.1 to 2 seconds)

Output

Denoised Audio

Upload an audio file to begin

Extract specific voices from audio files using a reference clip. Upload a reference voice clip and one or more audio files to extract matching voice segments.

📤 Input Files

Reference Voice

⚙️ Configuration

Extraction Mode

Options: Speech, Nonverbal, Both

📊 Results

Status

💾 Downloads

📚 Examples

Quick Start Guide:

1. Upload Reference Voice: A short, clear clip (5-30 seconds) of the voice you want to extract
2. Upload Audio Files: One or more files to process (can be long recordings)
3. Select Mode: Choose what to extract:
   - Speech: Only spoken words and sentences
   - Nonverbal: Sighs, laughs, moans, humming, etc.
   - Both: Everything from the matched voice
4. Start Extraction: Click the button and wait for results
5. Download: Get individual segments or download everything as a ZIP
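The extraction mode in step 3 amounts to a filter over the matched segments. A minimal sketch follows, assuming each matched segment already carries a label of either speech or nonverbal. How the tool classifies segments is not described on the page, and `MatchedSegment` and `filter_by_mode` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class MatchedSegment:
    start: float  # seconds
    end: float    # seconds
    kind: str     # "speech" or "nonverbal" (assumed labels)


def filter_by_mode(segments: Sequence[MatchedSegment], mode: str) -> List[MatchedSegment]:
    """Keep the segments selected by the Extraction Mode setting."""
    mode = mode.lower()
    if mode not in {"speech", "nonverbal", "both"}:
        raise ValueError(f"unknown extraction mode: {mode}")
    if mode == "both":
        return list(segments)
    return [seg for seg in segments if seg.kind == mode]
```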

Tips for Best Results:

- Use a high-quality reference clip with minimal background noise
- Reference should contain only the target voice (no other speakers)
- Enable VAD optimization for faster processing of sparse audio
- Adjust voice threshold if you're getting too many/few matches

🎤 Voice Tools
