Add audio text to text #1691

Dhiraj309 · 2025-08-17T14:02:30Z

This PR adds a new audio-text-to-text task to the huggingface.js library. The task converts audio input into text using automatic speech recognition (ASR) models.

Key updates:

Added packages/tasks/src/tasks/audio-text-to-text/ with:

data.ts – metadata for datasets, models, metrics, and demo.

inference.ts – logic for converting audio to text.

about.md – task description.

spec/input.json & spec/output.json – example inputs and expected outputs.

Task summary: "Convert audio input into text using speech-to-text (ASR) models."

Demonstration includes a sample .wav file and expected transcription output.
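For reviewers who want a picture of what the metadata in data.ts might contain, here is a minimal sketch. It is not the code from this PR: the `TaskMetadata`, `TaskEntry`, and `DemoItem` interfaces are simplified local stand-ins rather than the library's real types, and the dataset id, model id, file name, and transcription are placeholders. Only the summary string is taken from the description above.

```typescript
// Simplified, hypothetical stand-ins for the task metadata types.
// The real types in packages/tasks may have different or additional fields.
interface TaskEntry {
  id: string;
  description: string;
}

interface DemoItem {
  type: "audio" | "text";
  label: string;
  // File name for audio items, literal content for text items.
  value: string;
}

interface TaskMetadata {
  summary: string;
  datasets: TaskEntry[];
  models: TaskEntry[];
  metrics: TaskEntry[];
  demo: { inputs: DemoItem[]; outputs: DemoItem[] };
}

const taskData: TaskMetadata = {
  // Summary quoted from the PR description.
  summary: "Convert audio input into text using speech-to-text (ASR) models.",
  // Placeholder ids, not the entries used in the PR.
  datasets: [{ id: "example-org/example-speech-dataset", description: "Placeholder dataset id." }],
  models: [{ id: "example-org/example-asr-model", description: "Placeholder model id." }],
  metrics: [{ id: "wer", description: "Word error rate, a common ASR metric." }],
  demo: {
    inputs: [{ type: "audio", label: "Input", value: "sample.wav" }],
    outputs: [{ type: "text", label: "Transcription", value: "Placeholder transcription." }],
  },
};
```

Keeping the demo as paired `inputs` and `outputs` arrays makes it easy to see at a glance that an audio input maps to a text output.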

⚠️ Note: This task is not yet integrated into the main pipeline or covered by automated tests. Manual testing is recommended before it is included in the pipeline.
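Until automated tests exist, one possible manual check is to compare a model response against the expected transcription. The sketch below assumes a hypothetical output shape of `{ "text": string }`; the actual schema in spec/output.json may differ, and `isTranscriptionOutput` and `checkTranscription` are illustrative helpers, not part of the library.

```typescript
// Hypothetical output shape: a single transcription string.
// The real spec/output.json in the PR may define something different.
interface TranscriptionOutput {
  text: string;
}

// Type guard: true only for non-null objects with a string `text` field.
function isTranscriptionOutput(value: unknown): value is TranscriptionOutput {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>).text === "string"
  );
}

// Parses a raw JSON response and compares its text to the expected
// transcription, ignoring case and extra whitespace.
// Note: JSON.parse throws if `raw` is not valid JSON.
function checkTranscription(raw: string, expected: string): boolean {
  const parsed: unknown = JSON.parse(raw);
  if (!isTranscriptionOutput(parsed)) return false;
  const normalize = (s: string): string => s.trim().toLowerCase().replace(/\s+/g, " ");
  return normalize(parsed.text) === normalize(expected);
}
```

Normalizing case and whitespace before comparing avoids false failures from formatting differences that do not change the transcription itself.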

Dhiraj309 added 2 commits August 17, 2025 15:35

Add audio-text-to-text task (speech-to-text)

a9d8076

Add audio-text-to-text task (speech-to-text)

c34b22e

Dhiraj309 requested review from SBrandeis, gary149, Wauplin, julien-c, pcuenca and ngxson as code owners August 17, 2025 14:02

Dhiraj309 added 4 commits August 18, 2025 12:16

feat(tasks): add audio-text-to-text task definition

e613750

Merge branch 'main' into add-audio-text-to-text

c45c8c8

Merge branch 'main' into add-audio-text-to-text

c85f637

Merge branch 'main' into add-audio-text-to-text

1190be1
