
Conversation

@lucataco (Contributor) commented Aug 7, 2025

Hello! This PR adds support for the Automatic Speech Recognition task type for Replicate models.

Example:

- [huggingface.co/openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)
- [replicate.com/openai/whisper](https://replicate.com/openai/whisper)
cc @hanouticelina

@zeke (Contributor) left a comment

Looks good to me. 👍🏼

@hanouticelina (Contributor) left a comment

Hi @lucataco, thanks a lot for the contribution! Could you also add the `automatic-speech-recognition` mapping for Replicate in

```ts
export const PROVIDERS: Record<InferenceProvider, Partial<Record<InferenceTask, TaskProviderHelper>>> = {
```

You can find the complete guideline for provider/task JS integration in the documentation here: https://huggingface.co/docs/inference-providers/register-as-a-provider#2-js-client-integration
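The requested registration can be sketched as follows. This is a simplified illustration, not the actual implementation: the types are reduced stand-ins for the real `InferenceProvider`/`InferenceTask`/`TaskProviderHelper` types, and the helper class body is an assumption.

```typescript
// Simplified sketch of registering the new ASR task for Replicate
// (the real record lives in packages/inference/src/lib/getProviderHelper.ts).
interface TaskProviderHelper {
	makeRoute(): string;
}

// Hypothetical stand-in for the real ReplicateAutomaticSpeechRecognitionTask.
class ReplicateAutomaticSpeechRecognitionTask implements TaskProviderHelper {
	makeRoute(): string {
		return "predictions"; // Replicate's predictions endpoint
	}
}

const PROVIDERS: Record<string, Partial<Record<string, TaskProviderHelper>>> = {
	replicate: {
		// New mapping added by this PR:
		"automatic-speech-recognition": new ReplicateAutomaticSpeechRecognitionTask(),
		// ...existing Replicate tasks (text-to-image, text-to-speech, ...) omitted
	},
};
```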

@lucataco (Contributor, Author)

Thank you for taking a look! I've added the mapping as specified.

Comment on lines +206 to +212:

```ts
const out = response?.output as
	| undefined
	| {
			transcription?: string;
			translation?: string;
			txt_file?: string;
	  };
```
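The union above can be narrowed to a single transcript string with a small helper along these lines. The helper name and the fallback order are our assumptions, not the PR's exact code:

```typescript
// Shape of Replicate's ASR output, per the snippet above.
interface ReplicateAsrOutput {
	transcription?: string;
	translation?: string;
	txt_file?: string;
}

// Prefer an inline transcription, fall back to a translation; a txt_file
// URL would require a follow-up fetch, so treat it as unsupported here.
function resolveTranscript(out: ReplicateAsrOutput | undefined): string {
	if (typeof out?.transcription === "string") return out.transcription;
	if (typeof out?.translation === "string") return out.translation;
	throw new Error("Received malformed response from Replicate ASR API");
}
```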

@hanouticelina (Contributor) left a comment

Thanks @lucataco for the PR! I pushed a commit to fix the response parsing part.
Also, I think the version is missing in the `providerId` defined in the Replicate model mapping: https://huggingface.co/api/partners/replicate/models. It should be
`"openai/whisper:8099696689d249cf8b122d833c36ac3f75505c666a395ca40ef26f68e7d3d16e"`. Could you update it accordingly? Thanks 🙏

@lucataco (Contributor, Author)

Oh, good catch, thank you! Yes, of course.
I've updated the mapping with the specified Whisper version here.

@zeke (Contributor) commented Aug 21, 2025

Gentle bump. Anything blocking getting this shipped?

@coyotte508 (Member)
merging @SBrandeis @hanouticelina

@coyotte508 coyotte508 requested a review from Copilot August 25, 2025 10:34
@Copilot (Copilot AI) left a comment

Pull Request Overview

This PR adds Automatic Speech Recognition (ASR) support for the Replicate provider in the inference package. It enables users to perform speech-to-text transcription using Replicate models like OpenAI's Whisper.

  • Implements ReplicateAutomaticSpeechRecognitionTask class to handle ASR requests for Replicate provider
  • Removes existing output validation from the generic ASR function to allow provider-specific handling
  • Registers the new ASR task in the provider configuration
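The audio input processing mentioned above can be sketched roughly like this. Replicate's predictions API accepts file inputs as data URIs, so the task can inline the audio Blob into the request; the helper names and exact payload shape here are assumptions, not the PR's code:

```typescript
// Convert an audio Blob into a data URI suitable for a Replicate payload.
// Assumes a Node 18+ environment, where Blob and Buffer are both global.
async function blobToDataUri(blob: Blob): Promise<string> {
	const base64 = Buffer.from(await blob.arrayBuffer()).toString("base64");
	return `data:${blob.type || "audio/wav"};base64,${base64}`;
}

// The prediction payload then pins the model version and passes the audio.
function preparePayload(version: string, audioDataUri: string) {
	return { version, input: { audio: audioDataUri } };
}
```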

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| packages/inference/src/tasks/audio/automaticSpeechRecognition.ts | Removes generic output validation to allow provider-specific response handling |
| packages/inference/src/providers/replicate.ts | Implements new ASR task class with audio input processing and response parsing |
| packages/inference/src/lib/getProviderHelper.ts | Registers the new ASR task for the Replicate provider |


Comment on lines +187 to +188:

```ts
if (!blob || !(blob instanceof Blob)) {
	throw new Error("Audio input must be a Blob");
```

Copilot AI commented Aug 25, 2025

The error message 'Audio input must be a Blob' is not descriptive enough. Consider providing more context about expected input formats and how to convert them to Blob.

Suggested change:

```diff
 if (!blob || !(blob instanceof Blob)) {
-	throw new Error("Audio input must be a Blob");
+	throw new Error(
+		"Audio input must be a Blob (e.g., a File or Blob object from the browser). " +
+			"Received: " + (blob === undefined ? "undefined" : typeof blob) + ". " +
+			"To convert an ArrayBuffer or base64 string to a Blob, use: " +
+			"`new Blob([arrayBuffer], { type: 'audio/wav' })` or " +
+			"`fetch('data:audio/wav;base64,...').then(res => res.blob())`. " +
+			"See documentation for supported input formats."
+	);
```

@coyotte508 coyotte508 merged commit 166cd60 into huggingface:main Aug 25, 2025
4 checks passed
AlpineVibrations pushed a commit to aifx-art/huggingface.js that referenced this pull request Aug 25, 2025