from https://github.com/18F/open-source-guide/blob/18f-pages/pages/making-readmes-readable.md
30
-
1. What is this repo or project? (You can reuse the repo description you used earlier because this section doesn’t have to be long.)
31
-
2. How does it work?
32
-
3. Who will use this repo or project?
33
-
4. What is the goal of this project?
34
-
-->
35
-
36
28
37
29
**PaddleSpeech** is an open-source toolkit on the [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) platform for a variety of critical tasks in speech and audio, with state-of-the-art and influential models.
38
30
@@ -61,7 +53,6 @@ from https://github.com/18F/open-source-guide/blob/18f-pages/pages/making-readme
61
53
</td>
62
54
<td>我认为跑步最重要的就是给我带来了身体健康。</td>
63
55
</tr>
64
-
65
56
</tbody>
66
57
</table>
67
58
@@ -95,7 +86,7 @@ from https://github.com/18F/open-source-guide/blob/18f-pages/pages/making-readme
95
86
<table style="width:100%">
96
87
<thead>
97
88
<tr>
98
-
<th><img width="200" height="1"> Input Text <img width="200" height="1"> </th>
89
+
<th width="550" > Input Text</th>
99
90
<th>Synthetic Audio</th>
100
91
</tr>
101
92
</thead>
@@ -114,14 +105,53 @@ from https://github.com/18F/open-source-guide/blob/18f-pages/pages/making-readme
4-Day Live Courses: In-depth interpretation of PaddleSpeech!
151
+
152
+
**Course videos and related materials: https://aistudio.baidu.com/aistudio/education/group/info/25130**
153
+
154
+
### Features
125
155
126
156
Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial applications and academic research, including training, inference & testing modules, and the deployment process. To be more specific, this toolkit features:
127
157
- 📦 **Ease of Use**: low barrier to installation, and a [CLI](#quick-start) is available to quick-start your journey.
@@ -132,34 +162,30 @@ Via the easy-to-use, efficient, flexible and scalable implementation, our vision
132
162
- 🔬 *Integration of mainstream models and datasets*: the toolkit implements modules that participate in the whole pipeline of the speech tasks, and uses mainstream datasets like LibriSpeech, LJSpeech, AIShell, CSMSC, etc. See also [model list](#model-list) for more details.
133
163
- 🧩 *Cascaded models application*: as an extension of the typical traditional audio tasks, we combine the workflows of the aforementioned tasks with other fields like Natural Language Processing (NLP) and Computer Vision (CV).
134
164
135
-
136
-
### Recent Update:
165
+
### Recent Update
137
166
138
167
<!---
139
168
2021.12.14: We would like to hold online courses to introduce the basics and research of speech, as well as code practice with `paddlespeech`. Please keep an eye on our [Calendar](https://www.paddlepaddle.org.cn/live).
140
169
--->
141
170
- 🤗 2021.12.14: Our PaddleSpeech [ASR](https://huggingface.co/spaces/KPatrick/PaddleSpeechASR) and [TTS](https://huggingface.co/spaces/akhaliq/paddlespeech) Demos on Hugging Face Spaces are available!
142
171
- 👏🏻 2021.12.10: PaddleSpeech CLI is available for Audio Classification, Automatic Speech Recognition, Speech Translation (English to Chinese) and Text-to-Speech.
143
172
144
-
### Communication
145
-
If you are in China, we recommend you to join our WeChat group to contact directly with our team members!
173
+
### Community
174
+
- Scan the QR code below with WeChat (reply【语音】, meaning "speech", after your friend request is approved) to join the official technical exchange group. We look forward to your participation.
We strongly recommend our users to install PaddleSpeech in **Linux** with *python>=3.7*, where `paddlespeech` can be easily installed with `pip`:
155
-
```python
156
-
pip install paddlepaddle paddlespeech
157
-
```
158
-
Up to now, **Linux** supports CLI for the all our tasks, **Mac OSX and Windows** only supports PaddleSpeech CLI for Audio Classification, Speech-to-Text and Text-to-Speech. Please see [installation](./docs/source/install.md) for other alternatives.
182
+
We strongly recommend installing PaddleSpeech on **Linux** with *python>=3.7*.
183
+
Currently, **Linux** supports the CLI for all our tasks, while **Mac OSX** and **Windows** only support the PaddleSpeech CLI for Audio Classification, Speech-to-Text and Text-to-Speech. To install `PaddleSpeech`, please see [installation](./docs/source/install.md).
159
184
185
+
<a name="quickstart"></a>
160
186
## Quick Start
161
187
162
-
Developers can have a try of our models with [PaddleSpeech Command Line](./demos/README.md). Change `--input` to test your own audio/text.
188
+
Developers can try our models with the [PaddleSpeech Command Line](./paddlespeech/cli/README.md). Change `--input` to test your own audio/text.
163
189
164
190
**Audio Classification**
165
191
```shell
@@ -177,11 +203,20 @@ paddlespeech st --input input_16k.wav
- A web demo for Text-to-Speech is integrated into [Huggingface Spaces](https://huggingface.co/spaces) with [Gradio](https://github.com/gradio-app/gradio). See Demo: [TTS Demo](https://huggingface.co/spaces/akhaliq/paddlespeech)
183
209
210
+
**Text Postprocessing**
211
+
- Punctuation Restoration
212
+
```bash
213
+
paddlespeech text --task punc --input 今天的天气真不错啊你下午有空吗我想约你一起去吃饭
214
+
```
215
+
184
216
217
+
218
+
For more command-line examples, please see: [demos](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/demos)
219
+
185
220
If you want to try more functions like training and tuning, please have a look at [Speech-to-Text Quick Start](./docs/source/asr/quick_start.md) and [Text-to-Speech Quick Start](./docs/source/tts/quick_start.md).
186
221
187
222
## Model List
@@ -190,10 +225,6 @@ PaddleSpeech supports a series of most popular models. They are summarized in [r
190
225
191
226
**Speech-to-Text** contains *Acoustic Model*, *Language Model*, and *Speech Translation*, with the following details:
192
227
193
-
<!---
194
-
The current hyperlinks redirect to [Previous Parakeet](https://github.com/PaddlePaddle/Parakeet/tree/develop/examples).
195
-
-->
196
-
197
228
<table style="width:100%">
198
229
<thead>
199
230
<tr>
@@ -313,7 +344,7 @@ The current hyperlinks redirect to [Previous Parakeet](https://github.com/Paddle
313
344
</td>
314
345
</tr>
315
346
<tr>
316
-
<td rowspan="3">Vocoder</td>
347
+
<td rowspan="5">Vocoder</td>
317
348
<td >WaveFlow</td>
318
349
<td >LJSpeech</td>
319
350
<td>
@@ -333,7 +364,21 @@ The current hyperlinks redirect to [Previous Parakeet](https://github.com/Paddle
333
364
<td>
334
365
<a href = "./examples/csmsc/voc3">Multi Band MelGAN-csmsc</a>
Normally, [Speech SoTA](https://paperswithcode.com/area/speech), [Audio SoTA](https://paperswithcode.com/area/audio) and [Music SoTA](https://paperswithcode.com/area/music) give you an overview of the hot academic topics in the related area. To focus on the tasks in PaddleSpeech, you will find the following guidelines are helpful to grasp the core ideas.
The Text-to-Speech module was originally called [Parakeet](https://github.com/PaddlePaddle/Parakeet) and has now been merged into this repository. If you are interested in academic research about this task, please see [TTS research overview](https://github.com/PaddlePaddle/PaddleSpeech/tree/develop/docs/source/tts#overview). Also, [this document](https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/docs/source/tts/models_introduction.md) is a good guide to the pipeline components.
You are warmly welcome to submit questions in [discussions](https://github.com/PaddlePaddle/PaddleSpeech/discussions) and bug reports in [issues](https://github.com/PaddlePaddle/PaddleSpeech/issues)! Also, we would highly appreciate it if you are willing to contribute to this project!
@@ -460,13 +534,16 @@ You are warmly welcome to submit questions in [discussions](https://github.com/P
460
534
461
535
## Acknowledgement
462
536
463
-
- Many thanks to [yeyupiaoling](https://github.com/yeyupiaoling) for years of attention, constructive advice and great help.
537
+
538
+
- Many thanks to [yeyupiaoling](https://github.com/yeyupiaoling)/[PPASR](https://github.com/yeyupiaoling/PPASR)/[PaddlePaddle-DeepSpeech](https://github.com/yeyupiaoling/PaddlePaddle-DeepSpeech)/[VoiceprintRecognition-PaddlePaddle](https://github.com/yeyupiaoling/VoiceprintRecognition-PaddlePaddle)/[AudioClassification-PaddlePaddle](https://github.com/yeyupiaoling/AudioClassification-PaddlePaddle) for years of attention, constructive advice and great help.
464
539
- Many thanks to [AK391](https://github.com/AK391) for TTS web demo on Huggingface Spaces using Gradio.
465
540
- Many thanks to [mymagicpower](https://github.com/mymagicpower) for the Java implementation of ASR for [short](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_sdk) and [long](https://github.com/mymagicpower/AIAS/tree/main/3_audio_sdks/asr_long_audio_sdk) audio files.
466
-
541
+
- Many thanks to [JiehangXie](https://github.com/JiehangXie)/[PaddleBoBo](https://github.com/JiehangXie/PaddleBoBo) for developing Virtual Uploader (VUP)/Virtual YouTuber (VTuber) with the PaddleSpeech TTS function.
542
+
- Many thanks to [745165806](https://github.com/745165806)/[PaddleSpeechTask](https://github.com/745165806/PaddleSpeechTask) for contributing the Punctuation Restoration model.
467
543
468
544
Besides, PaddleSpeech depends on a lot of open source repositories. See [references](./docs/source/reference.md) for more information.
469
545
546
+
<a name="License"></a>
470
547
## License
471
548
472
549
PaddleSpeech is provided under the [Apache-2.0 License](./LICENSE).