A high-performance audio source separation library for .NET that uses ONNX models to split mixed audio files into vocal and instrumental tracks. For a long time I looked for C# code that separates vocals from music with decent quality, but I could only find such code in Python. So I wrote a pure C# vocal separator that aims to match the quality of the Python implementations.
- ONNX Model Support: Works with pre-trained ONNX models for audio separation
- GPU Acceleration: Automatic CUDA support with CPU fallback
- Parallel Processing: Multi-threaded chunk processing with session pooling
- Memory Management: Adaptive memory pressure monitoring
- Chunked Processing: Handles large files by processing in configurable chunks
- Noise Reduction: Optional denoising for improved separation quality
- Batch Processing: Process multiple files efficiently
- Progress Tracking: Real-time progress reporting with events
- Auto-Configuration: Automatically detects model parameters from ONNX metadata
Dependencies:
- MathNet.Numerics: FFT operations
- Microsoft.ML.OnnxRuntime: ONNX model inference
- Ownaudio: Audio I/O operations
- Microsoft.Extensions.ObjectPool: Session pooling for parallel processing
// Basic usage with included model
var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);
await service.InitializeAsync();
var result = await service.SeparateAsync("input_song.wav");
Console.WriteLine($"Vocals: {result.VocalsPath}");
Console.WriteLine($"Instrumental: {result.InstrumentalPath}");
service.Dispose();

// Parallel processing for faster performance
var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);
var parallelOptions = new ParallelProcessingOptions
{
    MaxDegreeOfParallelism = 4,
    SessionPoolSize = 3,
    EnableMemoryPressureMonitoring = true
};
await service.InitializeParallelAsync(parallelOptions);
var result = await service.SeparateAsync("input_song.wav");
service.Dispose();

SeparationOptions:
- ModelPath: Path to ONNX model file
- OutputDirectory: Output directory for separated files
- DisableNoiseReduction: Disable denoising (default: false)
- Margin: Overlap margin for chunks (default: 44100 samples)
- ChunkSizeSeconds: Chunk duration in seconds (0 = process entire file)
- NFft: FFT size (default: 6144)
- DimT: Temporal dimension parameter (default: 8)
- DimF: Frequency dimension parameter (default: 2048)
ParallelProcessingOptions:
- MaxDegreeOfParallelism: Maximum concurrent chunks (0 = auto-detect)
- SessionPoolSize: Number of ONNX sessions in pool (0 = auto-detect)
- EnableMemoryPressureMonitoring: Monitor memory usage (default: true)
- MemoryPressureThreshold: Memory threshold in bytes (default: 2GB)
- ChunkQueueCapacity: Queue capacity for chunks (default: 10)
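To picture how ChunkSizeSeconds and Margin interact, here is a minimal sketch of one plausible way to plan overlapping chunks. It is not taken from the library's source: PlanChunks is a hypothetical helper, and the assumption that the margin is added on both sides of each chunk and clamped to the file bounds is mine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the library): splits a file of totalSamples
// into chunks of chunkSizeSeconds, extending each chunk by `margin` samples on
// both sides and clamping the result to the file bounds.
static List<(long Start, long End)> PlanChunks(
    long totalSamples, int sampleRate, int chunkSizeSeconds, int margin)
{
    var chunks = new List<(long Start, long End)>();
    if (totalSamples <= 0) return chunks;

    // 0 means "process the entire file", as in the option description.
    long chunkSize = chunkSizeSeconds <= 0
        ? totalSamples
        : (long)chunkSizeSeconds * sampleRate;

    for (long pos = 0; pos < totalSamples; pos += chunkSize)
    {
        long start = Math.Max(0, pos - margin);
        long end = Math.Min(totalSamples, pos + chunkSize + margin);
        chunks.Add((start, end));
    }
    return chunks;
}

// A 60-second file at 44.1 kHz, 20-second chunks, 44100-sample margin.
var plan = PlanChunks(60 * 44100, 44100, 20, 44100);
foreach (var (start, end) in plan)
    Console.WriteLine($"{start} .. {end}");
```

With these inputs the plan contains three chunks, and neighbouring chunks share two margins' worth of samples. Overlap of this kind is commonly used to hide artifacts at chunk boundaries.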
var options = new SeparationOptions
{
    ModelPath = "",
    Model = InternalModel.Default,
    OutputDirectory = "output",
    ChunkSizeSeconds = 20,
    DisableNoiseReduction = false
};
var parallelOptions = new ParallelProcessingOptions
{
    MaxDegreeOfParallelism = 6,
    SessionPoolSize = 4,
    EnableMemoryPressureMonitoring = true,
    MemoryPressureThreshold = 3_000_000_000 // 3GB
};
var service = new AudioSeparationService(options);
await service.InitializeParallelAsync(parallelOptions);

// Automatically configure based on system capabilities
var (service, parallelOptions) = AudioSeparationFactory.CreateSystemOptimized(InternalModel.Default, "output");
await service.InitializeParallelAsync(parallelOptions);

service.ProgressChanged += (sender, progress) =>
{
    Console.WriteLine($"Progress: {progress.OverallProgress:F1}% - {progress.Status}");
    Console.WriteLine($"Chunks: {progress.ProcessedChunks}/{progress.TotalChunks}");
};
service.ProcessingStarted += (sender, file) =>
{
    Console.WriteLine($"Started processing: {file}");
};
service.ProcessingCompleted += (sender, result) =>
{
    Console.WriteLine($"Completed in {result.ProcessingTime}");
};

var files = new[] { "song1.wav", "song2.wav", "song3.wav" };
var results = await service.SeparateMultipleAsync(files);
foreach (var result in results)
{
    Console.WriteLine($"Processed: {result.VocalsPath}");
}

var service = AudioSeparationFactory.CreateMobileOptimized(
    InternalModel.Default,
    "output",
    disableNoiseReduction: true
);
await service.InitializeAsync(); // Traditional mode for mobile

var service = AudioSeparationFactory.CreateDesktopOptimized(
    InternalModel.Default,
    "output"
);
await service.InitializeParallelAsync(); // Parallel mode for desktop

var (service, parallelOptions) = AudioSeparationFactory.CreateSystemOptimized(
    InternalModel.Default,
    "output"
);
await service.InitializeParallelAsync(parallelOptions);

For general use: Start with the default model
var service = AudioSeparationFactory.CreateBatchOptimized(InternalModel.Default, "output");

For best quality: Use the best model with desktop settings
var service = AudioSeparationFactory.CreateDesktopOptimized(InternalModel.Best, "output");

For karaoke creation: Use the karaoke model
var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Karaoke);

For custom MDXNET models: Any compatible model works
var service = AudioSeparationExtensions.CreateDefaultService("models/custom_mdxnet.onnx");

Traditional mode:
- Single-threaded processing
- Lower memory usage
- Suitable for mobile/low-end devices
- Initialize with InitializeAsync()

Parallel mode:
- Multi-threaded chunk processing
- Higher performance on multi-core systems
- Session pooling for better resource utilization
- Memory pressure monitoring
- Initialize with InitializeParallelAsync()

Supported audio formats:
- WAV (.wav)
- MP3 (.mp3)
- FLAC (.flac)
The service generates two files per input:
- {filename}_vocals.wav: Extracted vocals
- {filename}_music.wav: Instrumental track
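
The naming scheme above can be reproduced when you need to locate the output files yourself. This is a small sketch: OutputPaths is a hypothetical helper, and it assumes that {filename} means the input file name without its extension and that both files are written directly into the output directory. In practice, the VocalsPath and InstrumentalPath properties of the result are the authoritative source.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the library): derives the two output paths
// from an input file, assuming {filename} is the name without its extension.
static (string Vocals, string Music) OutputPaths(string inputPath, string outputDirectory)
{
    string name = Path.GetFileNameWithoutExtension(inputPath);
    return (Path.Combine(outputDirectory, name + "_vocals.wav"),
            Path.Combine(outputDirectory, name + "_music.wav"));
}

var (vocals, music) = OutputPaths("input_song.mp3", "output");
Console.WriteLine(vocals);
Console.WriteLine(music);
```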
try
{
    var result = await service.SeparateAsync("input.wav");
}
catch (FileNotFoundException ex)
{
    Console.WriteLine($"File not found: {ex.Message}");
}
catch (InvalidOperationException ex)
{
    Console.WriteLine($"Service error: {ex.Message}");
}
catch (AggregateException ex) when (ex.InnerExceptions.Any())
{
    Console.WriteLine("Parallel processing errors occurred:");
    foreach (var innerEx in ex.InnerExceptions)
    {
        Console.WriteLine($"- {innerEx.Message}");
    }
}

Performance tips:
- Processing Mode: Use parallel processing on multi-core systems
- GPU Acceleration: Ensure CUDA is available for faster processing
- Chunk Size: Adjust ChunkSizeSeconds based on available memory
- Session Pool: Increase SessionPoolSize for better parallel performance
- Memory Management: Enable memory pressure monitoring for large files
- Noise Reduction: Disable for faster processing in batch scenarios
The parallel processing mode includes adaptive memory management:
- Memory Pressure Monitoring: Automatically detects high memory usage
- Garbage Collection: Forces GC under memory pressure
- Throttling: Reduces parallelism when memory is constrained
- Session Pooling: Efficient reuse of ONNX sessions
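
As an illustration of the throttling idea, the sketch below compares managed memory use against a threshold and lowers the number of workers when the threshold is exceeded. The halving policy and the EffectiveParallelism helper are assumptions made for this example; the library's internal strategy may differ.

```csharp
using System;

// Hypothetical policy (not the library's actual logic): halve the allowed
// parallelism while memory use exceeds the threshold, keeping at least one worker.
static int EffectiveParallelism(int maxDegree, long usedBytes, long thresholdBytes)
{
    if (usedBytes <= thresholdBytes) return maxDegree;
    return Math.Max(1, maxDegree / 2);
}

long threshold = 2L * 1024 * 1024 * 1024; // roughly the documented 2GB default
long used = GC.GetTotalMemory(forceFullCollection: false); // managed heap bytes in use
int workers = EffectiveParallelism(4, used, threshold);
Console.WriteLine($"Managed memory: {used / (1024.0 * 1024.0):F1} MB, workers: {workers}");
```

GC.GetTotalMemory only reports the managed heap, so native memory held by ONNX Runtime sessions is not counted. A real monitor would likely need a broader measure such as the process working set.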
The SeparationResult includes audio statistics:
var stats = result.Statistics;
Console.WriteLine($"Vocals RMS: {stats.VocalsRMS:F4}");
Console.WriteLine($"Instrumental RMS: {stats.InstrumentalRMS:F4}");
Console.WriteLine($"Sample Rate: {stats.SampleRate} Hz");
Console.WriteLine($"Processing Time: {result.ProcessingTime}");

The library comes with three pre-trained models:

InternalModel.Default:
- Type: Basic instrumental separation
- Quality: Good baseline performance
- Use case: General purpose separation, fastest processing
- Output: Clean vocals and instrumental tracks

InternalModel.Best:
- Type: High-quality instrumental separation
- Quality: Superior separation accuracy
- Use case: When quality is more important than speed
- Output: High-fidelity vocals and instrumental tracks

InternalModel.Karaoke:
- Type: Karaoke model (lead vocal removal)
- Quality: Specialized for karaoke creation
- Use case: Remove lead vocals while preserving backing vocals
- Output: Lead vocals and music with backing vocals intact

// Using the default model
var defaultService = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);
// Using the best quality model with parallel processing
var bestService = AudioSeparationExtensions.CreateDefaultService(InternalModel.Best);
await bestService.InitializeParallelAsync();
// Using the karaoke model
var karaokeService = AudioSeparationExtensions.CreateDefaultService(InternalModel.Karaoke);

The library is fully compatible with any MDXNET model:
// Using custom MDXNET model with parallel processing
var mdxService = AudioSeparationExtensions.CreateDefaultService("models/my_mdxnet_model.onnx");
await mdxService.InitializeParallelAsync(); // Auto-detects model parameters

ONNX models should:
- Accept input shape: [batch, 4, frequency, time]
- Output same shape as input
- Support 44.1kHz stereo audio
- Use STFT-based processing
- Be compatible with MDXNET architecture
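
A quick way to sanity-check a candidate model against the shape requirement is to inspect its input dimensions. The helper below is a hypothetical sketch that only checks the rank and the channel count. The 4 channels presumably correspond to the real and imaginary STFT parts of the two stereo channels, and the example dimensions are illustrative. With ONNX Runtime, the dimensions can be read from the Dimensions of an entry in InferenceSession.InputMetadata, where dynamic axes are typically reported as -1.

```csharp
using System;

// Hypothetical check (not part of the library): the input must be rank 4 with
// 4 channels in the second position, i.e. [batch, 4, frequency, time].
static bool LooksLikeMdxNetInput(int[] dims) =>
    dims.Length == 4 && dims[1] == 4;

Console.WriteLine(LooksLikeMdxNetInput(new[] { 1, 4, 2048, 256 }));  // matching layout
Console.WriteLine(LooksLikeMdxNetInput(new[] { 1, 2, 2048, 256 }));  // wrong channel count
```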

Thread safety:
- The AudioSeparationService is not thread-safe for concurrent operations on the same instance
- Parallel processing is handled internally and is thread-safe
- Create separate instances for concurrent processing of different files
- Session pooling ensures safe concurrent access to ONNX models
using var service = AudioSeparationExtensions.CreateDefaultService(InternalModel.Default);
await service.InitializeParallelAsync();
var result = await service.SeparateAsync("song.wav");

var service = AudioSeparationFactory.CreateBatchOptimized(InternalModel.Default, "output");
await service.InitializeParallelAsync();
var files = Directory.GetFiles("input", "*.wav");
var results = await service.SeparateMultipleAsync(files);
service.Dispose();

var (service, options) = AudioSeparationFactory.CreateSystemOptimized(
    InternalModel.Default,
    "output",
    Environment.ProcessorCount,
    // Total memory available to the runtime, in GB
    GC.GetGCMemoryInfo().TotalAvailableMemoryBytes / (1024.0 * 1024.0 * 1024.0)
);
await service.InitializeParallelAsync(options);

Always dispose the service to free ONNX resources and session pools:
using var service = new AudioSeparationService(options);
await service.InitializeParallelAsync();
// Use service...
// Automatically disposed, including session pool cleanup