aws-marketplace-docs

This repository contains usage instructions for the pyannoteAI AWS Marketplace Diarization Model.

This solution provides precise diarization tools for businesses that require data security and accurate speaker labeling.

Benchmark

The benchmark was run on an ml.g4dn.xlarge instance.

Audio files used for benchmarking:

16 kHz sample rate
Mono channel (channels averaged)
Compressed formats such as MP3 are recommended for optimal performance.
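
Audio in another layout can be converted to the properties listed above before it is submitted. The sketch below builds an `ffmpeg` command that downmixes to one channel and resamples to 16 kHz. Using `ffmpeg` is an assumption (this README does not prescribe a conversion tool), and the helper names and file names are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes a mono, resampled copy of `src` to `dst`.

    -ac 1 downmixes to a single channel, -ar sets the output sample rate,
    and -y overwrites `dst` if it already exists.
    """
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", str(sample_rate), dst]


def convert(src: str, dst: str) -> None:
    """Run the conversion; requires ffmpeg to be installed and on PATH."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)


# Hypothetical file names, for illustration only.
print(" ".join(build_ffmpeg_command("meeting.wav", "meeting_16k_mono.mp3")))
```

The output container is inferred by `ffmpeg` from the extension of the destination file, so choosing `.mp3` also yields a compressed file.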

Real-time endpoints have a 20 MiB request size limit.
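
A file can be checked against this limit before it is sent. The following is a minimal sketch, assuming the limit applies to the raw size of the audio file; if the request wraps the audio in another encoding (for example base64 inside JSON), the payload will be larger than the file. The function name is hypothetical.

```python
import os

# 20 MiB limit stated for real-time endpoints.
REALTIME_LIMIT_BYTES = 20 * 1024 * 1024


def fits_realtime_limit(path: str, limit: int = REALTIME_LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

Files that exceed the limit can be re-encoded in a compressed format to reduce their size.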

Product Highlights

The Speaker Diarization Model enables accurate segmentation of audio recordings by detecting and labeling individual speakers across time. Designed for seamless integration into transcription pipelines, media workflows, and audio analytics systems, it supports a wide range of formats including WAV, MP3, FLAC, and OGG. The service is language-agnostic and works across diverse audio sources: calls, meetings, interviews, podcasts, and more. With built-in support for mono and stereo channels, varying sample rates, and flexible input options, it can be deployed in batch or near-real-time use cases. Key features include automatic speaker count estimation, precise time-stamped speaker labeling, and detection of overlapping speech. Outputs are returned in structured JSON for easy integration with transcription engines, search indexes, or business intelligence tools. Whether you are enriching speech-to-text transcripts, analyzing call center performance, or processing long-form media, this API improves clarity, organization, and data usability.

Data Security

This solution operates within Amazon SageMaker, securely within your AWS account. Per AWS security policy, SageMaker deploys images in an environment devoid of network or AWS service endpoint access. This means that when you launch this product from AWS Marketplace, this model is deployed without network access. Additionally, AWS restricts the model from accessing the internet during runtime. Therefore, when you use this product, only you can access your data. For more detailed information about security and data access, please refer to the following resources here and here.

Delivery via SageMaker supports the following compliance certifications: C5, CCCS, CISPE, DESC CSP, DoD CC SRG, ENS High, FedRAMP, FINMA, GSMA, HIPAA BAA, HITRUST CSF, IAR, IRAP, ISMAP, ISO and CSA STAR certificates, K-ISMS, MTCS, OSPAR, PCI, Pinakes, PiTuKri, SNI 27001, SOC. For further details, please visit this page.

Features

Diarization (speaker detection and labeling)

This solution detects and labels any number of speakers. It can be used to segment audio recordings into speaker turns, providing precise time-stamped labels for each speaker. This is particularly useful for applications such as meeting transcription, call center analysis, and media content organization.

The number of speakers is automatically estimated, and the model can handle overlapping speech scenarios. The output is structured in JSON format, making it easy to integrate with transcription engines or other data processing systems.
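
As an illustration of consuming the JSON output, the sketch below sums the speaking time of each speaker. The field names (`diarization`, `speaker`, `start`, `end`) and the speaker labels are assumptions made for this example, not the documented schema; check the model's actual response before relying on them.

```python
import json
from collections import defaultdict


def speaking_time(payload: str) -> dict[str, float]:
    """Sum segment durations (in seconds) per speaker.

    Assumes a hypothetical schema: {"diarization": [{"speaker", "start", "end"}, ...]}.
    """
    result = json.loads(payload)
    totals: dict[str, float] = defaultdict(float)
    for segment in result["diarization"]:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


# Hypothetical response; the second segment overlaps the first by 0.5 s.
sample = json.dumps({
    "diarization": [
        {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.5},
        {"speaker": "SPEAKER_01", "start": 2.0, "end": 4.0},
        {"speaker": "SPEAKER_00", "start": 4.0, "end": 5.0},
    ]
})
print(speaking_time(sample))
```

Because overlapping speech is reported as separate segments in this assumed layout, the per-speaker totals can add up to more than the duration of the recording.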

Contact us

If you have any questions about our product, feel free to email us at support@pyannote.ai

