GitHub - niknah/ComfyUI-F5-TTS: ComfyUI node for F5-Text To Speech

ComfyUI node to make text to speech audio with your own voice.

Using F5-TTS https://github.com/SWivid/F5-TTS

Instructions

Put a .wav file of the voice you'd like to use into ComfyUI's input folder. Remove any background music or noise from it first.
Add a .txt file of the same name containing what was said in the recording.
Press refresh to see it in the node.
The input/F5-TTS and input/audio folders will also work.
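
To check that every voice sample has its transcript before pressing refresh, a small script along these lines may help. This is a minimal sketch and not part of the node; the function name `find_unpaired_samples` is made up for illustration.

```python
from pathlib import Path


def find_unpaired_samples(input_dir):
    """Return names of .wav files in input_dir that lack a same-named .txt file."""
    folder = Path(input_dir)
    missing = []
    for wav in sorted(folder.glob("*.wav")):
        # voice.wav needs voice.txt; voice.deep.wav needs voice.deep.txt
        if not wav.with_suffix(".txt").is_file():
            missing.append(wav.name)
    return missing
```

Calling `find_unpaired_samples("ComfyUI/input")` (adjust the path to your install) returns the .wav files that still need a transcript.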

You can use the examples here...

Other languages / custom models...

If you have more models, you can put the model and vocab .txt files into the models/checkpoints/F5-TTS folder. Give the .txt vocab file and the model file (.pt, or .safetensors as in the example below) the same name. Press "refresh" and the model should appear under the "model" selection.

Example...

YourLanguage.txt
YourLanguage.safetensors
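
The naming rule can be checked with a short script. This is only a sketch: `list_custom_models` is a hypothetical helper, and treating both .pt and .safetensors as model files is an assumption based on the extensions mentioned in this section.

```python
from pathlib import Path

# Assumption: these are the model file extensions to look for.
MODEL_SUFFIXES = (".pt", ".safetensors")


def list_custom_models(model_dir):
    """Map each model file name to whether a same-named .txt vocab file exists."""
    folder = Path(model_dir)
    result = {}
    if not folder.is_dir():
        return result
    for path in sorted(folder.iterdir()):
        if path.suffix in MODEL_SUFFIXES:
            result[path.name] = path.with_suffix(".txt").is_file()
    return result
```

A model mapped to `False` is missing its vocab file, or the two files are named differently.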

Custom F5-TTS languages on Hugging Face

I haven't tried these...

Multi voices output...

Use the F5-TTS Audio node (not the "from input" node).

Put your sample voice files into the input folder like...

voice.wav
voice.txt
voice.deep.wav
voice.deep.txt
voice.chipmunk.wav
voice.chipmunk.txt

Then you can use prompts for different voices...

{main} Hello World this is the end
{deep} This is the narrator
{chipmunk} Please, I need more helium
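
How the `{name}` tags relate to the sample files can be sketched as below. This is an illustration only, not the node's actual parser: the function names are hypothetical, and the mapping of `{main}` to the untagged voice.wav is inferred from the file list and prompt above.

```python
import re

# A line that starts with {name}, followed by the text to speak.
TAG_LINE = re.compile(r"^\{(\w+)\}\s*(.*)$")


def split_by_voice(prompt, default_voice="main"):
    """Split a prompt into (voice_name, text) pairs, one per non-empty line."""
    segments = []
    for line in prompt.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TAG_LINE.match(line)
        if match:
            segments.append((match.group(1), match.group(2)))
        else:
            # Assumption: untagged lines are spoken by the default voice.
            segments.append((default_voice, line))
    return segments


def sample_file_for(voice_name, base="voice"):
    """Assumed naming: {main} -> voice.wav, {deep} -> voice.deep.wav."""
    if voice_name == "main":
        return f"{base}.wav"
    return f"{base}.{voice_name}.wav"


prompt = """{main} Hello World this is the end
{deep} This is the narrator
{chipmunk} Please, I need more helium"""

for voice, text in split_by_voice(prompt):
    print(sample_file_for(voice), "<-", text)
```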

Multi voice input...

Put a sentence from voice 1 and a sentence from voice 2 into the input audio sample. F5-TTS cuts the audio off at 15 seconds, so keep the sample short. Example

BigVGAN models.

To use BigVGAN, you have to add a little dot to make it work with ComfyUI...

In the file custom_nodes/ComfyUI-F5-TTS/F5-TTS/src/third_party/BigVGAN/bigvgan.py

Add a little dot on the line at the top that says...

from utils import init_weights, get_padding

so it becomes...

from .utils import init_weights, get_padding
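
If you would rather not edit the file by hand, the same one-line change can be scripted. This is a rough sketch: `patch_bigvgan_import` is a hypothetical helper, and it assumes the import line appears in the file exactly as shown above.

```python
from pathlib import Path

OLD_IMPORT = "from utils import init_weights, get_padding"
NEW_IMPORT = "from .utils import init_weights, get_padding"


def patch_bigvgan_import(bigvgan_py):
    """Turn the absolute import into a relative one. Returns True if the file changed."""
    path = Path(bigvgan_py)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    changed = False
    for i, line in enumerate(lines):
        # Compare the whole line so an already patched file is left alone.
        if line.strip() == OLD_IMPORT:
            lines[i] = line.replace(OLD_IMPORT, NEW_IMPORT)
            changed = True
    if changed:
        path.write_text("".join(lines), encoding="utf-8")
    return changed
```

Pass it the bigvgan.py path given above. Running it a second time does nothing, because the patched line no longer matches.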

Tips...

F5-TTS cuts your voice sample off at 15 secs. The cut may land in the middle of a word, and only the audio is cut, not the text. Make sure your input samples are shorter than 15 secs.
If you're using the ComfyUI-Whisper node, you will also need to install ffmpeg.
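
To see whether a sample is under the 15 second limit before using it, something like the following can work. It is a minimal sketch with made-up function names, and it relies on Python's standard `wave` module, which reads uncompressed PCM .wav files only.

```python
import wave

MAX_SECONDS = 15.0  # the cut-off mentioned in the tip above


def wav_duration_seconds(path):
    """Return the length of a PCM .wav file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_short_enough(path, limit=MAX_SECONDS):
    """True if the sample is shorter than the limit, so it will not be cut."""
    return wav_duration_seconds(path) < limit
```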

Install from git

It's best to install from ComfyUI-manager because it will update all your custom_nodes when you click "update all". With git, you will have to update manually.

Clone this repository into custom_nodes and run this to install from git

cd custom_nodes/ComfyUI-F5-TTS
git submodule update --init --recursive
pip install -r requirements.txt

No module named f5_tts

Some versions of git don't handle submodules well. Remove the custom_nodes/ComfyUI-F5-TTS/F5-TTS folder and clone the F5-TTS repository...

cd ComfyUI/custom_nodes/ComfyUI-F5-TTS/

On Linux or macOS:
rm -rf F5-TTS

On Windows:
rmdir /s F5-TTS

git clone https://github.com/SWivid/F5-TTS.git F5-TTS
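
One way to tell whether the submodule is the cause: a submodule that git never fetched usually leaves the F5-TTS folder missing or empty. A hedged sketch of that check, using a hypothetical helper name:

```python
from pathlib import Path


def submodule_is_missing(node_dir):
    """True if the F5-TTS folder under node_dir is absent or empty."""
    folder = Path(node_dir) / "F5-TTS"
    if not folder.is_dir():
        return True
    # An empty directory has nothing to iterate over.
    return not any(folder.iterdir())
```

For example, `submodule_is_missing("ComfyUI/custom_nodes/ComfyUI-F5-TTS")` returning `True` suggests the steps above are needed.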

Changes

1.0.22: Added TDHS (time-domain harmonic scaling) to advanced node.
1.0.21: Added advanced node.
1.0.19: Added model_type.
