OwnVocalRemover

A high-performance audio source separation library for .NET that uses ONNX models to separate vocals and instrumental tracks from mixed audio files. I spent a long time looking for C# code that separates vocals from music with decent quality, but I could only find such code in Python. So I decided to write a pure C# vocal separator that delivers the quality of the Python implementations!

Features

ONNX Model Support: Works with pre-trained ONNX models for audio separation
GPU Acceleration: Automatic CUDA support with CPU fallback
Parallel Processing: Multi-threaded chunk processing with session pooling
Memory Management: Adaptive memory pressure monitoring
Chunked Processing: Handles large files by processing in configurable chunks
Noise Reduction: Optional denoising for improved separation quality
Batch Processing: Process multiple files efficiently
Progress Tracking: Real-time progress reporting with events
Auto-Configuration: Automatically detects model parameters from ONNX metadata

Dependencies

MathNet.Numerics - FFT operations
Microsoft.ML.OnnxRuntime - ONNX model inference
Ownaudio - Audio I/O operations
Microsoft.Extensions.ObjectPool - Session pooling for parallel processing

Support My Work

If you find this project helpful, consider buying me a coffee!

Quick Start

Basic Usage (Traditional Mode)

// Basic usage with included model
var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);
await service.InitializeAsync();

var result = await service.SeparateAsync("input_song.wav");
Console.WriteLine($"Vocals: {result.VocalsPath}");
Console.WriteLine($"Instrumental: {result.InstrumentalPath}");

service.Dispose();

Parallel Processing Mode

// Parallel processing for faster performance
var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);

var parallelOptions = new ParallelProcessingOptions
{
    MaxDegreeOfParallelism = 4,
    SessionPoolSize = 3,
    EnableMemoryPressureMonitoring = true
};

await service.InitializeParallelAsync(parallelOptions);
var result = await service.SeparateAsync("input_song.wav");

service.Dispose();

Configuration Options

SeparationOptions

ModelPath: Path to ONNX model file
OutputDirectory: Output directory for separated files
DisableNoiseReduction: Disable denoising (default: false)
Margin: Overlap margin for chunks (default: 44100 samples)
ChunkSizeSeconds: Chunk duration in seconds (0 = process entire file)
NFft: FFT size (default: 6144)
DimT: Temporal dimension parameter (default: 8)
DimF: Frequency dimension parameter (default: 2048)
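
To make Margin and ChunkSizeSeconds more concrete, here is a minimal sketch of the arithmetic behind them. It assumes 44.1 kHz audio (the rate listed under Model Requirements). The helper names are hypothetical and not part of the library, and the sketch does not show how the library applies the margin internally.

```csharp
using System;

// Hypothetical helpers for illustration only; not part of OwnVocalRemover.
const int SampleRate = 44100; // assumed: 44.1 kHz input

// Samples covered by one chunk of the given duration.
long ChunkSamples(int chunkSizeSeconds) => (long)chunkSizeSeconds * SampleRate;

// Number of chunks needed to cover a track; 0 seconds means "whole file in one pass".
int ChunkCount(long totalSamples, int chunkSizeSeconds)
{
    if (chunkSizeSeconds <= 0) return 1;
    long perChunk = ChunkSamples(chunkSizeSeconds);
    return (int)((totalSamples + perChunk - 1) / perChunk); // ceiling division
}

long threeMinutes = 180L * SampleRate;
Console.WriteLine(ChunkSamples(20));             // 882000 samples per 20 s chunk
Console.WriteLine(ChunkCount(threeMinutes, 20)); // 9
Console.WriteLine(44100 / (double)SampleRate);   // 1 (the default Margin is one second)
```

Smaller chunks lower peak memory use but produce more chunks to process and stitch together.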

ParallelProcessingOptions

MaxDegreeOfParallelism: Maximum concurrent chunks (0 = auto-detect)
SessionPoolSize: Number of ONNX sessions in pool (0 = auto-detect)
EnableMemoryPressureMonitoring: Monitor memory usage (default: true)
MemoryPressureThreshold: Memory threshold in bytes (default: 2GB)
ChunkQueueCapacity: Queue capacity for chunks (default: 10)
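
MemoryPressureThreshold is given in bytes, and this README does not say whether "2GB" means decimal or binary gigabytes. The sketch below shows both conversions so the value you pass is unambiguous. The helper names are hypothetical and not part of the library.

```csharp
using System;

// Hypothetical helpers for illustration; not part of OwnVocalRemover.
long GigabytesToBytes(double gb) => (long)(gb * 1_000_000_000L);       // decimal GB
long GibibytesToBytes(double gib) => (long)(gib * 1024 * 1024 * 1024); // binary GiB

Console.WriteLine(GigabytesToBytes(3)); // 3000000000, as in the Custom Configuration example
Console.WriteLine(GibibytesToBytes(3)); // 3221225472
```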

Usage Examples

Custom Configuration with Parallel Processing

var options = new SeparationOptions
{
    ModelPath = "",
    Model = InternalModel.Default,
    OutputDirectory = "output",
    ChunkSizeSeconds = 20,
    DisableNoiseReduction = false
};

var parallelOptions = new ParallelProcessingOptions
{
    MaxDegreeOfParallelism = 6,
    SessionPoolSize = 4,
    EnableMemoryPressureMonitoring = true,
    MemoryPressureThreshold = 3_000_000_000 // 3GB
};

var service = new AudioSeparationService(options);
await service.InitializeParallelAsync(parallelOptions);

System-Optimized Configuration

// Automatically configure based on system capabilities
var (service, parallelOptions) = AudioSeparationFactory.CreateSystemOptimized(InternalModel.Default, @"output");
await service.InitializeParallelAsync(parallelOptions);

Progress Monitoring

service.ProgressChanged += (sender, progress) =>
{
    Console.WriteLine($"Progress: {progress.OverallProgress:F1}% - {progress.Status}");
    Console.WriteLine($"Chunks: {progress.ProcessedChunks}/{progress.TotalChunks}");
};

service.ProcessingStarted += (sender, file) =>
{
    Console.WriteLine($"Started processing: {file}");
};

service.ProcessingCompleted += (sender, result) =>
{
    Console.WriteLine($"Completed in {result.ProcessingTime}");
};

Batch Processing

var files = new[] { "song1.wav", "song2.wav", "song3.wav" };
var results = await service.SeparateMultipleAsync(files);

foreach (var result in results)
{
    Console.WriteLine($"Processed: {result.VocalsPath}");
}

Pre-configured Factory Methods

Mobile Optimized (Faster)

var service = AudioSeparationFactory.CreateMobileOptimized(
    InternalModel.Default, 
    "output", 
    disableNoiseReduction: true
);
await service.InitializeAsync(); // Traditional mode for mobile

Desktop Optimized (Better Quality)

var service = AudioSeparationFactory.CreateDesktopOptimized(
    InternalModel.Default, 
    "output"
);
await service.InitializeParallelAsync(); // Parallel mode for desktop

System-Optimized with Parallel Processing

var (service, parallelOptions) = AudioSeparationFactory.CreateSystemOptimized(
    InternalModel.Default, 
    "output"
);
await service.InitializeParallelAsync(parallelOptions);

Choosing the Right Model

For general use: Start with default model

var service = AudioSeparationFactory.CreateBatchOptimized(InternalModel.Default, "output");

For best quality: Use best model with desktop settings

var service = AudioSeparationFactory.CreateDesktopOptimized(InternalModel.Best, "output");

For karaoke creation: Use karaoke model

var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Karaoke);

For custom MDXNET models: Any compatible model works

var service = AudioSeparationExtensions.CreateDefaultService("models/custom_mdxnet.onnx");

Processing Modes

Traditional Mode

Single-threaded processing
Lower memory usage
Suitable for mobile/low-end devices
Initialize with InitializeAsync()

Parallel Processing Mode

Multi-threaded chunk processing
Higher performance on multi-core systems
Session pooling for better resource utilization
Memory pressure monitoring
Initialize with InitializeParallelAsync()
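
One possible way to choose between the two modes at runtime is sketched below. The thresholds (4 cores, 4 GB) are assumptions made for illustration, not values used by the library, and PreferParallel is a hypothetical helper.

```csharp
using System;

// Hypothetical selection rule; the thresholds are assumptions, not library behavior.
bool PreferParallel(int logicalCores, double availableMemoryGb) =>
    logicalCores >= 4 && availableMemoryGb >= 4.0;

// Memory available to the runtime, converted from bytes to GB.
double memoryGb = GC.GetGCMemoryInfo().TotalAvailableMemoryBytes / (1024.0 * 1024.0 * 1024.0);
bool useParallel = PreferParallel(Environment.ProcessorCount, memoryGb);

// With the library, this would translate to calling InitializeParallelAsync()
// when useParallel is true and InitializeAsync() otherwise.
Console.WriteLine(useParallel ? "parallel" : "traditional");
```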

Supported Audio Formats

WAV (.wav)
MP3 (.mp3)
FLAC (.flac)
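
When collecting input files for a batch, it can help to filter by extension first. The sketch below uses the three extensions listed above; IsSupportedAudio is a hypothetical helper and not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Extensions taken from the list above; the helper itself is hypothetical.
var supported = new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".wav", ".mp3", ".flac" };

// True when the file extension is one of the supported formats (case-insensitive).
bool IsSupportedAudio(string path) => supported.Contains(Path.GetExtension(path));

Console.WriteLine(IsSupportedAudio("song.WAV"));  // True
Console.WriteLine(IsSupportedAudio("notes.txt")); // False
```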

Output Files

The service generates two files per input:

{filename}_vocals.wav - Extracted vocals
{filename}_music.wav - Instrumental track
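
If you need to predict the output locations before processing, the naming pattern above can be reproduced as follows. This is a sketch that assumes both files are written directly into OutputDirectory; OutputPath is a hypothetical helper, and the VocalsPath and InstrumentalPath values on the result remain the authoritative source.

```csharp
using System;
using System.IO;

// Hypothetical helper mirroring the naming pattern above; not the library's code.
// stem is "vocals" or "music".
string OutputPath(string inputPath, string outputDirectory, string stem) =>
    Path.Combine(outputDirectory, $"{Path.GetFileNameWithoutExtension(inputPath)}_{stem}.wav");

Console.WriteLine(OutputPath("input_song.mp3", "output", "vocals"));
Console.WriteLine(OutputPath("input_song.mp3", "output", "music"));
```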

Error Handling

try
{
    var result = await service.SeparateAsync("input.wav");
}
catch (FileNotFoundException ex)
{
    Console.WriteLine($"File not found: {ex.Message}");
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"Service error: {ex.Message}");
}
catch (AggregateException ex) when (ex.InnerExceptions.Any())
{
    Console.WriteLine("Parallel processing errors occurred:");
    foreach (var innerEx in ex.InnerExceptions)
    {
        Console.WriteLine($"- {innerEx.Message}");
    }
}

Performance Tips

Processing Mode: Use parallel processing on multi-core systems
GPU Acceleration: Ensure CUDA is available for faster processing
Chunk Size: Adjust ChunkSizeSeconds based on available memory
Session Pool: Increase SessionPoolSize for better parallel performance
Memory Management: Enable memory pressure monitoring for large files
Noise Reduction: Disable for faster processing in batch scenarios

Memory Management

The parallel processing mode includes adaptive memory management:

Memory Pressure Monitoring: Automatically detects high memory usage
Garbage Collection: Forces GC under memory pressure
Throttling: Reduces parallelism when memory is constrained
Session Pooling: Efficient reuse of ONNX sessions
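
The interplay of monitoring, garbage collection, and throttling can be pictured with a small sketch. This is not the library's implementation: the halving rule and the NextParallelism helper are assumptions chosen for illustration.

```csharp
using System;

// Hypothetical throttling rule; the library's real policy may differ.
int NextParallelism(long usedBytes, long thresholdBytes, int current, int maximum)
{
    if (usedBytes > thresholdBytes)
        return Math.Max(1, current / 2);   // under pressure: halve, but keep one worker
    return Math.Min(maximum, current + 1); // otherwise recover gradually
}

long threshold = 2L * 1024 * 1024 * 1024;                  // assumed 2 GB threshold
long used = GC.GetTotalMemory(forceFullCollection: false); // managed heap size in bytes

if (used > threshold)
    GC.Collect(); // forced collection under memory pressure

Console.WriteLine(NextParallelism(used, threshold, current: 4, maximum: 6));
```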

Statistics and Analysis

The SeparationResult includes audio statistics:

var stats = result.Statistics;
Console.WriteLine($"Vocals RMS: {stats.VocalsRMS:F4}");
Console.WriteLine($"Instrumental RMS: {stats.InstrumentalRMS:F4}");
Console.WriteLine($"Sample Rate: {stats.SampleRate} Hz");
Console.WriteLine($"Processing Time: {result.ProcessingTime}");
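
For reference, RMS (root mean square) is the square root of the mean of the squared samples. The sketch below shows the standard definition on a float buffer; it is assumed, not confirmed, that VocalsRMS and InstrumentalRMS are computed this way.

```csharp
using System;

// Root mean square of a sample buffer: sqrt(mean(x^2)).
double Rms(float[] samples)
{
    if (samples.Length == 0) return 0.0;
    double sumOfSquares = 0.0;
    foreach (float s in samples)
        sumOfSquares += (double)s * s;
    return Math.Sqrt(sumOfSquares / samples.Length);
}

Console.WriteLine(Rms(new[] { 0.5f, -0.5f, 0.5f, -0.5f })); // 0.5
```

An RMS close to zero on the vocals track usually indicates that little or no vocal content was extracted.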

Included Models

The library comes with three pre-trained models:

DEFAULT model

Type: Basic instrumental separation
Quality: Good baseline performance
Use case: General purpose separation, fastest processing
Output: Clean vocals and instrumental tracks

BEST model

Type: High-quality instrumental separation
Quality: Superior separation accuracy
Use case: When quality is more important than speed
Output: High-fidelity vocals and instrumental tracks

Karaoke model

Type: Karaoke model (lead vocal removal)
Quality: Specialized for karaoke creation
Use case: Remove lead vocals while preserving backing vocals
Output: Lead vocals and music with backing vocals intact

Model Usage Examples

// Using the default model
var defaultService = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);

// Using the best quality model with parallel processing
var bestService = AudioSeparationExtensions.CreateDefaultService(InternalModel.Best);
await bestService.InitializeParallelAsync();

// Using the karaoke model
var karaokeService = AudioSeparationExtensions.CreateDefaultService(InternalModel.Karaoke);

MDXNET Model Support

The library works with MDXNET models that meet the requirements listed under Model Requirements:

// Using custom MDXNET model with parallel processing
var mdxService = AudioSeparationExtensions.CreateDefaultService("models/my_mdxnet_model.onnx");
await mdxService.InitializeParallelAsync(); // Auto-detects model parameters

Model Requirements

ONNX models should:

Accept input shape: [batch, 4, frequency, time]
Output same shape as input
Support 44.1kHz stereo audio
Use STFT-based processing
Be compatible with MDXNET architecture
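
The shape requirement can be checked with plain arithmetic before loading a custom model. In MDX-Net style models the four channels typically hold the real and imaginary STFT parts of the left and right channels, although this README does not spell that out. In the sketch below, the helpers are hypothetical and the 256 time frames are an assumed example value.

```csharp
using System;

// Hypothetical check for the [batch, 4, frequency, time] layout described above.
bool HasExpectedLayout(long[] shape) =>
    shape.Length == 4 && shape[1] == 4;

// Element count of a tensor with the given shape.
long ElementCount(long[] shape)
{
    long count = 1;
    foreach (long d in shape) count *= d;
    return count;
}

// Assumed example dimensions: DimF = 2048 frequency bins and 256 time frames.
long[] inputShape = { 1, 4, 2048, 256 };
Console.WriteLine(HasExpectedLayout(inputShape));            // True
Console.WriteLine(ElementCount(inputShape) * sizeof(float)); // 8388608 bytes (8 MiB)
```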

Thread Safety

The AudioSeparationService is not thread-safe for concurrent operations on the same instance
Parallel processing is handled internally and is thread-safe
Create separate instances for concurrent processing of different files
Session pooling ensures safe concurrent access to ONNX models
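
If several callers must share one instance instead of creating separate ones, calls can be serialized with a gate. The sketch below shows the generic SemaphoreSlim pattern with a delay standing in for the real work. It is not library code, and UseSharedInstanceAsync is a hypothetical name.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Generic pattern, not library code: a SemaphoreSlim lets one caller at a time
// use a shared, non-thread-safe object.
var gate = new SemaphoreSlim(1, 1);
int active = 0, maxActive = 0;

async Task UseSharedInstanceAsync()
{
    await gate.WaitAsync();
    try
    {
        active++;                                // safe: only one caller is inside the gate
        maxActive = Math.Max(maxActive, active); // record the highest concurrency observed
        await Task.Delay(10);                    // stands in for a call such as SeparateAsync
        active--;
    }
    finally
    {
        gate.Release();
    }
}

await Task.WhenAll(UseSharedInstanceAsync(), UseSharedInstanceAsync(), UseSharedInstanceAsync());
Console.WriteLine(maxActive); // 1
```

Separate instances, as recommended above, allow files to be processed truly concurrently. The gate trades that throughput for lower memory use.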

Best Practices

For Single Files

using var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);
await service.InitializeParallelAsync();
var result = await service.SeparateAsync("song.wav");

For Batch Processing

var service = AudioSeparationFactory.CreateBatchOptimized(InternalModel.Default, "output");
await service.InitializeParallelAsync();

var files = Directory.GetFiles("input", "*.wav");
var results = await service.SeparateMultipleAsync(files);

service.Dispose();

For System-Specific Optimization

var (service, options) = AudioSeparationFactory.CreateSystemOptimized(
    InternalModel.Default, 
    "output",
    Environment.ProcessorCount,
    GC.GetGCMemoryInfo().TotalAvailableMemoryBytes / (1024.0 * 1024.0 * 1024.0) // Memory available to the runtime, in GB
);
await service.InitializeParallelAsync(options);

Disposal

Always dispose the service to free ONNX resources and session pools:

using var service = new AudioSeparationService(options);
await service.InitializeParallelAsync();
// Use service...
// Automatically disposed, including session pool cleanup


Repository: ModernMube/OwnVocalRemover
