Configuration
Overview
Amphion can be configured through environment variables, Python code, and YAML configuration files. This guide covers the options available through each.
Basic Configuration
Environment Variables
    # Set CUDA device
    export AMPHION_DEVICE="cuda:0"

    # Set model directory
    export AMPHION_MODEL_DIR="/path/to/models"

    # Set cache directory
    export AMPHION_CACHE_DIR="/path/to/cache"
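To inspect the same variables from Python, a minimal standard-library sketch could look like the following. The helper name `read_amphion_env` and the fallback values are assumptions made for illustration; this guide does not state which defaults Amphion uses or how it reads these variables internally.

```python
import os

# Illustrative fallbacks only (assumption): the guide does not document
# Amphion's real defaults for these variables.
_ASSUMED_DEFAULTS = {
    "AMPHION_DEVICE": "cpu",
    "AMPHION_MODEL_DIR": "./models",
    "AMPHION_CACHE_DIR": "./cache",
}


def read_amphion_env(environ=None):
    """Return the three AMPHION_* settings as a dict.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return {name: env.get(name, default)
            for name, default in _ASSUMED_DEFAULTS.items()}
```

Calling `read_amphion_env()` after the `export` lines above would report the exported values; any variable left unset falls back to the assumed default.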
Python Configuration
    from amphion.config import Config

    config = Config(
        device="cuda:0",
        model_dir="/path/to/models",
        cache_dir="/path/to/cache",
        sample_rate=44100,
        batch_size=32
    )
Model Configuration
Text-to-Speech
    tts_config = Config(
        model_name="fastspeech2",
        vocoder="hifigan",
        speaker_embedding=True,
        prosody_modeling=True
    )
Voice Conversion
    vc_config = Config(
        model_name="contentvec",
        speaker_embedding="dvec",
        conversion_mode="zero-shot"
    )
Text-to-Audio
    t2a_config = Config(
        model_name="musicgen",
        sampling_rate=44100,
        audio_length=10.0
    )
Advanced Configuration
Custom Model Paths
    config = Config(
        model_paths={
            "tts": "/path/to/tts/model",
            "vocoder": "/path/to/vocoder",
            "speaker_encoder": "/path/to/speaker/encoder"
        }
    )
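A mistyped path in a mapping like this usually surfaces only when a model is loaded, so it can help to check the entries first. The following is a small sketch and not part of Amphion: `find_missing_paths` is a hypothetical helper that relies only on `pathlib`.

```python
from pathlib import Path


def find_missing_paths(model_paths):
    """Return the sorted keys whose paths do not exist on disk."""
    return sorted(
        key for key, path in model_paths.items()
        if not Path(path).exists()
    )
```

An empty result means every entry points at something that exists; otherwise the returned keys (for example `["vocoder"]`) tell you which entries to fix before building the configuration.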
Performance Tuning
    config = Config(
        batch_size=32,
        num_workers=4,
        pin_memory=True,
        mixed_precision=True
    )
Logging Configuration
    config = Config(
        log_level="INFO",
        log_file="/path/to/log/file.log",
        enable_tensorboard=True
    )
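This guide does not describe which values `log_level` accepts or how Amphion applies them. Assuming the names follow Python's standard `logging` levels, a level string can be validated up front with a sketch like the one below; `resolve_log_level` is a hypothetical helper, not an Amphion function.

```python
import logging


def resolve_log_level(name):
    """Translate a level name such as "INFO" into the stdlib numeric level.

    Raises ValueError for names that the logging module does not define.
    """
    level = getattr(logging, str(name).upper(), None)
    if not isinstance(level, int):
        raise ValueError(f"unknown log level: {name!r}")
    return level
```

The lookup is case-insensitive, so `"info"` and `"INFO"` resolve to the same level, while a typo such as `"INOF"` raises an error before any configuration is built.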
Configuration File
You can also use a YAML configuration file:
    # config.yaml
    device: cuda:0
    model_dir: /path/to/models
    cache_dir: /path/to/cache

    tts:
      model_name: fastspeech2
      vocoder: hifigan
      speaker_embedding: true

    voice_conversion:
      model_name: contentvec
      conversion_mode: zero-shot

    logging:
      level: INFO
      file: /path/to/log/file.log
Load the configuration file in Python:
    from amphion.config import Config

    config = Config.from_yaml("config.yaml")
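Typos in a YAML file, such as a misspelled key or a section written as a scalar, are easy to miss. The sketch below lints a mapping that has already been parsed (for instance with PyYAML's `yaml.safe_load`). It treats the keys from the example file above as the expected set, which is an assumption: the guide does not say which keys Amphion requires or rejects, and `lint_config` is a hypothetical helper.

```python
# Keys taken from the example config.yaml in this guide (assumed set).
KNOWN_TOP_LEVEL = {
    "device", "model_dir", "cache_dir",
    "tts", "voice_conversion", "logging",
}
# Top-level keys that the example writes as nested sections.
SECTIONS = ("logging", "tts", "voice_conversion")


def lint_config(cfg):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []
    # Flag keys that do not appear in the example file.
    for key in cfg:
        if key not in KNOWN_TOP_LEVEL:
            problems.append(f"unknown top-level key: {key}")
    # Sections must be mappings, not scalars or lists.
    for section in SECTIONS:
        if section in cfg and not isinstance(cfg[section], dict):
            problems.append(f"section '{section}' should be a mapping")
    return problems
```

An empty list means the file matches the shape of the example; anything else is worth reviewing before the file is loaded.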