Commit d32e41d

Merge branch 'MoonInTheRiver:master' into master
2 parents e0dc542 + 5f2f6eb commit d32e41d

1 file changed

Lines changed: 11 additions & 10 deletions

README.md

@@ -9,16 +9,6 @@
 
 This repository is the official PyTorch implementation of our AAAI-2022 [paper](https://arxiv.org/abs/2105.02446), in which we propose DiffSinger (for Singing-Voice-Synthesis) and DiffSpeech (for Text-to-Speech).
 
-<table style="width:100%">
-<tr>
-<th>DiffSinger/DiffSpeech at training</th>
-<th>DiffSinger/DiffSpeech at inference</th>
-</tr>
-<tr>
-<td><img src="resources/model_a.png" alt="Training" height="300"></td>
-<td><img src="resources/model_b.png" alt="Inference" height="300"></td>
-</tr>
-</table>
 
 :tada: :tada: :tada: **Updates**:
 - Sep.11, 2022: :electric_plug: [DiffSinger-PN](docs/README-SVS-opencpop-pndm.md). Add plug-in [PNDM](https://arxiv.org/abs/2202.09778), ICLR 2022 in our laboratory, to accelerate DiffSinger freely.
@@ -48,6 +38,17 @@ or pip install -r requirements_3090.txt (GPU 3090, CUDA 11.4)
 - [Run DiffSpeech (TTS version)](docs/README-TTS.md).
 - [Run DiffSinger (SVS version)](docs/README-SVS.md).
 
+## Overview
+| Mel Pipeline | Dataset | Pitch Input | F0 Prediction | Acceleration Method | Vocoder |
+| ------------------------------------------------------------------------------------------- | ---------------------------------------------------------| ----------------- | ------------- | --------------------------- | ----------------------------- |
+| [DiffSpeech (Text->F0, Text+F0->Mel, Mel->Wav)](docs/README-TTS.md) | [Ljspeech](https://keithito.com/LJ-Speech-Dataset/) | None | Explicit | Shallow Diffusion | NSF-HiFiGAN |
+| [DiffSinger (Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-popcs.md) | [PopCS](https://github.com/MoonInTheRiver/DiffSinger) | Ground-Truth F0 | None | Shallow Diffusion | NSF-HiFiGAN |
+| [DiffSinger (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-opencpop-cascade.md) | [OpenCpop](https://wenet.org.cn/opencpop/) | MIDI | Explicit | Shallow Diffusion | NSF-HiFiGAN |
+| [FFT-Singer (Lyric+MIDI->F0, Lyric+F0->Mel, Mel->Wav)](docs/README-SVS-opencpop-cascade.md) | [OpenCpop](https://wenet.org.cn/opencpop/) | MIDI | Explicit | Invalid | NSF-HiFiGAN |
+| [DiffSinger (Lyric+MIDI->Mel, Mel->Wav)](docs/README-SVS-opencpop-e2e.md) | [OpenCpop](https://wenet.org.cn/opencpop/) | MIDI | Implicit | None | Pitch-Extractor + NSF-HiFiGAN |
+| [DiffSinger+PNDM (Lyric+MIDI->Mel, Mel->Wav)](docs/README-SVS-opencpop-pndm.md) | [OpenCpop](https://wenet.org.cn/opencpop/) | MIDI | Implicit | PLMS | Pitch-Extractor + NSF-HiFiGAN |
+
+
 ## Tensorboard
 ```sh
 tensorboard --logdir_spec exp_name
