Commit 7c6f2b4

Author: Buddhi Dhananjaya
Refactor documentation to improve clarity on local setup and AWS configuration; update sidebar, links, and environment variable descriptions.
1 parent ce3b43a commit 7c6f2b4

7 files changed, +159 -56 lines changed

docs/aws-setup.md

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ displayed_sidebar: pageSidebar

 Before you begin, ensure the following tools are installed on your system:

-- [Amazon AWS](https://signin.aws.amazon.com/signup?request_type=register)
+- [Amazon AWS Account](https://signin.aws.amazon.com/signup?request_type=register)

 ## Deployment of Meeting Bots using AWS CodePipeline and ECS (Fargate)

docs/introduction.md

Lines changed: 16 additions & 16 deletions
@@ -10,31 +10,31 @@ CueMeet.ai is a pioneering open source API that enables developers to build inte

 This documentation will guide you through setting up and using the CueMeet platform effectively.

-## Overview
+### **1. System Architecture**

-CueMeet is a powerful platform that offer a self-hostable, unified API solution for deploying intelligent meeting bots across platforms like Zoom, Google Meet, Microsoft Teams, and more. This guide will walk you through:
+To fully leverage CueMeet, first, go through our **[System Architecture Guide](/cuemeet-documentation/docs/system-architecture)**, where we break down:
+- **Core components** of the CueMeet ecosystem
+- **How the API communicates with meeting platforms**
+- **Data flow and processing pipeline**
+- **Security and scalability considerations**

-1. **Local Setup** - Getting your development environment ready
-2. **AWS Setup** - Configuring your AWS infrastructure to run the meeting bots on demand
-3. **API Integration** - Using CueMeet's APIs for communication
-4. **Bot Configuration** - Setting up and customizing your bots
+Understanding the architecture will help you make informed deployment decisions.

-## Getting Started
+### **2. Getting Started**

-We recommend following these steps in order:
+Once you have a clear understanding of the architecture, follow these steps:

-1. First, complete the [Local Setup](/cuemeet-documentation/docs/local-setup) to prepare your development environment
-2. Then, follow the [AWS Setup Guide](/cuemeet-documentation/docs/aws-setup) to configure your cloud infrastructure
-3. Learn how to integrate with the [CueMeet API](/cuemeet-documentation/docs/bot/api-info) for seamless communication
-4. Finally, explore our [Bot Configuration Guide](/cuemeet-documentation/docs/meeting-bots) to customize your bots
+1. **[AWS Setup Guide](/cuemeet-documentation/docs/aws-setup)** – Configure your cloud infrastructure to run meeting bots efficiently.
+2. **[Local Setup](/cuemeet-documentation/docs/local-setup)** – Prepare your development environment for customization.
+3. **[CueMeet API Integration](/cuemeet-documentation/docs/bot/api-info)** – Learn how to interact with CueMeet's APIs.
+4. **[Bot Configuration Guide](/cuemeet-documentation/docs/meeting-bots)** – Customize and improve bot functionalities.

 Each section contains detailed instructions and best practices to ensure a smooth implementation.

+---
+
 ## Need Help?

 If you encounter any issues or have questions, please:
 - Check out our [GitHub Organization](https://github.com/CueMeet) for updates and contributions
-- Join our [Discord Server](https://discord.gg/55FbAHA9) to connect with the community
-
-
-Let's get started with the [Local Setup](./local-setup)!
+- Join our [Discord Server](https://discord.gg/GjBS3EeMzp) to connect with the community

docs/local-setup.md

Lines changed: 22 additions & 22 deletions
@@ -3,7 +3,7 @@ sidebar_position: 3
 displayed_sidebar: pageSidebar
 ---

-# Local Setup
+# Control Panel Local Setup

 This guide walks you through setting up Cuemeet locally for development and testing purposes.

@@ -93,24 +93,24 @@ REDIS_HOST=redis
 REDIS_PORT=6379


-# AWS
-AWS_ACCESS_KEY=
-AWS_SECRET_KEY=
+# AWS (Must be filled in from AWS setup steps)
+AWS_ACCESS_KEY= # Your AWS Access Key from AWS setup
+AWS_SECRET_KEY= # Your AWS Secret Key from AWS setup

 ## S3
-AWS_BUCKET_REGION=
-AWS_MEETING_BOT_BUCKET_NAME=
+AWS_BUCKET_REGION= # Your S3 bucket region
+AWS_MEETING_BOT_BUCKET_NAME= # Your S3 bucket name

-## ECS
-AWS_ECS_CLUSTER_NAME=
-AWS_SECURITY_GROUP=
-AWS_VPS_SUBNET=
-ECS_TASK_DEFINITION_GOOGLE=
-ECS_CONTAINER_NAME_GOOGLE=
-ECS_TASK_DEFINITION_ZOOM=
-ECS_CONTAINER_NAME_ZOOM=
-ECS_TASK_DEFINITION_TEAMS=
-ECS_CONTAINER_NAME_TEAMS=
+## ECS (Must match the AWS configurations)
+AWS_ECS_CLUSTER_NAME= # Your AWS ECS Cluster Name
+AWS_SECURITY_GROUP= # Your AWS Security Group ID
+AWS_VPS_SUBNET= # Your AWS Subnet ID
+ECS_TASK_DEFINITION_GOOGLE= # Task Definition for Google Meet bots
+ECS_CONTAINER_NAME_GOOGLE= # Container Name for Google Meet bots
+ECS_TASK_DEFINITION_ZOOM= # Task Definition for Zoom bots
+ECS_CONTAINER_NAME_ZOOM= # Container Name for Zoom bots
+ECS_TASK_DEFINITION_TEAMS= # Task Definition for Microsoft Teams bots
+ECS_CONTAINER_NAME_TEAMS= # Container Name for Microsoft Teams bots


 # Meeting Bot
@@ -120,7 +120,7 @@ MEETING_BOT_RETRY_COUNT=2
 # Worker Backend gRPC URL
 WORKER_BACKEND_GRPC_URL=worker-grpc
 ```
-
+⚠️ Important: The AWS-related environment variables must be obtained from the AWS Setup Guide. Complete the AWS setup first and copy the relevant values into this file.
 </details>

 <details>
@@ -151,12 +151,12 @@ REDIS_DB=2


 # AWS Configuration
-AWS_ACCESS_KEY_ID=
-AWS_SECRET_ACCESS_KEY=
+AWS_ACCESS_KEY_ID= # Your AWS Access Key from AWS setup
+AWS_SECRET_ACCESS_KEY= # Your AWS Secret Key from AWS setup

 ## AWS S3
-AWS_REGION=
-AWS_STORAGE_BUCKET_NAME=
+AWS_REGION= # Your S3 bucket region
+AWS_STORAGE_BUCKET_NAME= # Your S3 bucket name

 _SIGNED_URL_EXPIRY_TIME=60

@@ -166,7 +166,7 @@ HIGHLIGHT_ENVIRONMENT_NAME=""


 ## ASSEMBLY AI
-ASSEMBLY_AI_API_KEY=""
+ASSEMBLY_AI_API_KEY="" # https://www.assemblyai.com API KEY
 ```

 </details>
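
Since the new comments stress that the AWS values must be copied in from the AWS setup, a quick pre-flight check can catch blanks before the services start. The following is only a minimal sketch: the helper names `parse_env` and `missing_aws_keys` are hypothetical, the key list is a subset of the Control Backend variables shown in the diff above, and the parser assumes values never contain a literal `#`.

```python
# Hypothetical helper: report AWS-related keys that are still blank in a .env file.
# Key names are taken from the Control Backend variables shown in the diff;
# everything else here is an illustrative assumption.
REQUIRED_AWS_KEYS = [
    "AWS_ACCESS_KEY",
    "AWS_SECRET_KEY",
    "AWS_BUCKET_REGION",
    "AWS_MEETING_BOT_BUCKET_NAME",
    "AWS_ECS_CLUSTER_NAME",
    "AWS_SECURITY_GROUP",
    "AWS_VPS_SUBNET",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping comment lines and trailing '# ...' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Simplification: treat everything after the first '#' as a comment.
        value = value.split("#", 1)[0].strip().strip('"')
        values[key.strip()] = value
    return values


def missing_aws_keys(text: str) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    env = parse_env(text)
    return [key for key in REQUIRED_AWS_KEYS if not env.get(key)]
```

Calling `missing_aws_keys(open(".env").read())` before starting the containers would list the keys that still need values; an empty list means the checked AWS entries are filled in.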

docs/system-architecture.mdx

Lines changed: 112 additions & 9 deletions
@@ -3,7 +3,9 @@ sidebar_position: 3
 displayed_sidebar: pageSidebar
 ---

-# System Design Diagram
+# System Architecture Overview
+
+## **System Design Diagram:**

 Below is a diagram of the system architecture for CueMeet and Meeting Bots Infrastructure and inner workings.

@@ -16,18 +18,119 @@ Below is a diagram of the system architecture for CueMeet and Meeting Bots Infra

 <br /><br />

-This architecture diagram illustrates an on-demand meeting bot system deployed on Amazon Web Services (AWS). It integrates several cloud-native and backend technologies to manage the lifecycle of meeting bots, from task orchestration to containerized execution and error-handling mechanisms. Let’s break this down in digestible chunks.
+The CueMeet.ai system is designed to automate and manage meeting bots across various platforms using a robust, cloud-native architecture on Amazon Web Services (AWS). This system orchestrates bot execution, data processing, and error handling to provide a seamless meeting automation experience.
+
+## **System Overview:**
+
+**1. Control Backend (NestJS with BullMQ):**
+
+- **Purpose:** Orchestrates the lifecycle of meeting bot jobs and directly manages ECS tasks.
+- **Technology:** Built using NestJS with BullMQ for task queuing and scheduling.
+- **Functionality:**
+- Exposes APIs for managing meeting bots (no dedicated frontend).
+- Schedules and manages bot jobs.
+- Uses the AWS SDK to directly interact with AWS ECS.
+- Stores meeting bot configuration and status in a PostgreSQL database.
+- Utilizes BullMQ for task queuing and processing.
+- Includes cron jobs for:
+- `syncTaskStatus`: Periodically checks the status of running ECS tasks and updates the database accordingly.
+- `initiateScheduledBot`: Triggers scheduled meeting bot tasks.
+- Manages the initiation and re-initiation of ECS tasks.
+- **ECS Client Service:**
+- Manages ECS task lifecycle (run, stop, list, describe).
+- Handles error scenarios and retries.
+- Stores and retrieves ECS task information.
+
+**2. AWS Fargate (Container Execution):**
+
+- **Purpose:** Provides a serverless compute platform for running Docker containers.
+- **Functionality:**
+- Dynamically spins up containers based on requests from the Control Backend.
+- Executes the MeetingBots, which are containerized Python applications.
+- Scales automatically based on demand.
+- **Amazon ECR (Elastic Container Registry):** Stores the Docker images required for the MeetingBots, ensuring efficient deployment and version control.
+
+**3. MeetingBots (Python with [Selenium](https://selenium-python.readthedocs.io/) and [FFmpeg](https://www.ffmpeg.org/)):**
+
+- **Purpose:** Automates meeting participation and data capture.
+- **Technology:** Built using Python, [Selenium](https://selenium-python.readthedocs.io/) (for browser automation), and [FFmpeg](https://www.ffmpeg.org/) (for audio recording).
+- **Functionality:**
+- Programmatically joins meetings on various platforms (Google Meet, Zoom, Microsoft Teams).
+- Captures high-quality audio recordings.
+- Extracts live captions and transcripts from the respective meeting platform.
+- Uploads the audio and transcript data in a combined `.tar` file and also standalone audio files (.opus) to Amazon S3 using presigned URLs.
+- The output files are stored in the `/out` directory inside the container.
+
+**4. Amazon S3 (Object Storage):**
+
+- **Purpose:** Stores the meeting data (audio and transcript files).
+- **Functionality:**
+- Receives uploaded data from the MeetingBots.
+- Provides durable and scalable storage.
+- Triggers SNS notifications on new file uploads.
+
+**5. Worker Backend (Django with Celery):**

-Core Workflow
+- **Purpose:** Processes the uploaded meeting data, generating enhanced transcripts and performing post-processing tasks.
+- **Technology:** Built using Django and Celery, a distributed task queue.
+- **Functionality:**
+- Receives notifications from S3 via SNS.
+- Downloads the `.tar` file from S3.
+- Extracts audio and transcript files.
+- Uses [AssemblyAI](https://www.assemblyai.com/) for high-quality audio transcription.
+- Uses [NLTK](https://www.nltk.org/) and [rapidfuzz](https://github.com/rapidfuzz/RapidFuzz) for speaker reconciliation, combining AssemblyAI's audio transcription with the original meeting captions.
+- Stores the processed data in a PostgreSQL database.
+- Exposes a gRPC interface for communication with the Control Backend.
+- Provides a Celery task to retry failed transcriptions.
+- **Post-Processing (Speaker Reconciliation):**
+- Uses [AssemblyAI](https://www.assemblyai.com/) for accurate audio transcription.
+- Aligns speaker names from the original meeting captions with the [AssemblyAI](https://www.assemblyai.com/) transcript.
+- Employs [NLTK](https://www.nltk.org/) for sentence tokenization and [rapidfuzz](https://github.com/rapidfuzz/RapidFuzz) for fuzzy string matching to reconcile speakers.
+- Handles timestamp alignment and text similarity comparisons.
+- **Django Celery Tasks:**
+- `_transcript_generator_worker`:
+- Downloads the `.tar` file from S3.
+- Extracts audio and transcript files.
+- Transcribes the audio using [AssemblyAI](https://www.assemblyai.com/).
+- Reconciles speaker labels using the meeting metadata and NLTK/rapidfuzz.
+- Stores the transcription data in the PostgreSQL database.
+- Handles error logging and retries.
+- `_transcript_retry_cronjob`:
+- Periodically checks for failed transcription tasks.
+- Retries failed tasks, up to a maximum number of retries.
+- Sends failure notifications if retries are exhausted.
+- **PostgreSQL Database:** Stores processed meeting data, including transcripts and metadata.
+- **Redis Integration:** Celery uses Redis as a message broker and result backend.
+- **gRPC Interface:** Enables communication between the Worker Backend and the Control Backend.
+- **SNS/SQS Integration:** S3 triggers SNS notifications upon file upload, which are then queued in SQS for processing by the Worker Backend. This system includes a Dead Letter Queue (DLQ) for handling failed messages.
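
The speaker reconciliation step described for the Worker Backend can be illustrated with a small sketch. This is not the project's implementation: it swaps in the standard library's `difflib.SequenceMatcher` for rapidfuzz so the example runs without third-party packages, it ignores timestamp alignment, and the function names, data shapes, and the 0.6 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stdlib stand-in for a rapidfuzz ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile_speakers(captions, utterances, threshold=0.6):
    """Attach caption speaker names to transcribed utterances.

    captions:   list of (speaker_name, caption_text) from the meeting platform
    utterances: list of transcribed sentences (e.g. from the audio transcript)
    Returns a list of (speaker_name or None, utterance) pairs; None means no
    caption was similar enough to claim the utterance.
    """
    result = []
    for text in utterances:
        best_speaker, best_score = None, 0.0
        for speaker, caption in captions:
            score = similarity(text, caption)
            if score > best_score:
                best_speaker, best_score = speaker, score
        result.append((best_speaker if best_score >= threshold else None, text))
    return result
```

With captions such as `[("Alice", "lets review the quarterly numbers")]`, an utterance like `"Let's review the quarterly numbers."` would be attributed to Alice, while unrelated text falls below the threshold and stays unattributed. A real pipeline would also narrow candidates by timestamp before comparing text.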

-At the heart of the system lies a Control Backend, built with BullMQ (a Redis-based task queue) and a reactive framework (likely NestJS, judging by the pink logo). This backend schedules and manages bot jobs, using AWS SDK to dynamically spin up containers in AWS Fargate a serverless compute engine for containers. These containers run the MeetingBot, likely a Python-based automation tool, utilizing Selenium for browser control.
+**6. Error Monitoring and Observability:**

-Execution & Persistence
+- **[highlight.io](https://www.highlight.io/):**
+- Monitors the system for errors and exceptions for both meeting bots and each of the control panel services.
+- Provides real-time alerts via Slack.
+- Captures detailed logs and error reports.
+- **CloudWatch:**
+- Monitors ECS logs and application-level metrics.
+- Aids in debugging and performance optimization.

-When a bot job is dispatched, Fargate pulls the required Docker image from Amazon ECR (Elastic Container Registry). These bots, upon execution, may generate artifacts (e.g., screenshots, logs) that are uploaded to Amazon S3. This decouples the data from the execution lifecycle and ensures durability. Meanwhile, the Worker Backend (built on Django and powered by Celery) interacts with the Control Backend using gRPC. It’s likely responsible for long-running jobs, failure recovery, and scheduling. Redis and PostgreSQL back these services with fast in-memory caching and persistent state, respectively.
+### **Workflow Summary:**

-Failure Handling & Observability
+1. A meeting bot task is scheduled or triggered via the Control Backend API.
+2. The Control Backend uses the ECS Client Service to launch an ECS task in Fargate.
+3. The MeetingBot joins the meeting, records audio, and extracts captions.
+4. The MeetingBot uploads the data to S3.
+5. S3 triggers an SNS notification.
+6. The Worker Backend, via Celery, processes the uploaded data, transcribes it using [AssemblyAI](https://www.assemblyai.com/), and reconciles speakers using [NLTK](https://www.nltk.org/) and [rapidfuzz](https://github.com/rapidfuzz/RapidFuzz).
+7. The processed data is stored in the PostgreSQL database.
+8. [highlight.io](https://www.highlight.io/) and CloudWatch monitor the system for errors and performance issues.
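
Steps 4 to 6 of the workflow hinge on the worker learning which object was uploaded. As a hedged sketch of that hand-off: when S3 publishes to SNS, the S3 event arrives as a JSON string in the envelope's `Message` field, and object keys inside it are URL-encoded. The function name `s3_objects_from_sns` and the sample bucket and key are hypothetical; how the actual Worker Backend consumes the message is not shown in this commit.

```python
import json
from urllib.parse import unquote_plus


def s3_objects_from_sns(envelope_json: str) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an SNS envelope that wraps an S3 event.

    With SNS-to-SQS delivery (raw message delivery disabled), the SQS message
    body is this same envelope, so the function can be applied to it directly.
    """
    envelope = json.loads(envelope_json)
    # The S3 event itself is a JSON document stored as a string under "Message".
    s3_event = json.loads(envelope["Message"])
    objects = []
    # Test events sent when a notification is first configured carry no "Records".
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the event, with spaces encoded as '+'.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


# Illustrative envelope; the bucket and key are made-up example values.
sample_event = {"Records": [{"s3": {"bucket": {"name": "example-meeting-bucket"},
                                    "object": {"key": "meetings/demo+recording.tar"}}}]}
sample_envelope = json.dumps({"Type": "Notification", "Message": json.dumps(sample_event)})
print(s3_objects_from_sns(sample_envelope))
```

Each returned pair is enough to download the `.tar` archive and begin extraction.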

-In the event of issues like failure to publish events or data corruption notifications are sent to an SNS Topic (Simple Notification Service). To ensure reliability, the SNS topic is paired with a Dead Letter Queue (DLQ) via SQS (Simple Queue Service), catching unprocessed or failed messages for further inspection or reprocessing.
+### **Key Technologies**

-CloudWatch (not shown here but implied in AWS architecture) is typically used to monitor ECS logs and application-level metrics, aiding in debugging and performance optimization.
+* **Containerization:** Docker, AWS Fargate, Amazon ECR
+* **Orchestration:** NestJS, BullMQ, AWS SDK
+* **Automation:** Python, [Selenium](https://selenium-python.readthedocs.io/), [FFmpeg](https://www.ffmpeg.org/)
+* **Data Processing:** Django, Celery, [AssemblyAI](https://www.assemblyai.com/), [NLTK](https://www.nltk.org/)
+* **Storage:** Amazon S3, PostgreSQL, Redis
+* **Messaging:** Amazon SNS, Amazon SQS
+* **Monitoring:** [highlight.io](https://www.highlight.io/), AWS CloudWatch
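
The retry behavior described for `_transcript_retry_cronjob` in the Worker Backend section (retry failed tasks up to a maximum, then send a failure notification) could be sketched as below. The `TranscriptJob` shape, the status strings, and the limit of 3 are assumptions for illustration only; the commit does not show the real model or the configured retry count.

```python
from dataclasses import dataclass

MAX_RETRIES = 3  # assumed limit; the actual value is not stated in the docs


@dataclass
class TranscriptJob:
    """Hypothetical record of one transcription attempt."""
    job_id: str
    status: str  # e.g. "failed" or "done"
    retry_count: int = 0


def plan_retries(jobs, max_retries=MAX_RETRIES):
    """Split failed jobs into those to retry and those whose retries are exhausted.

    Jobs selected for retry have their retry_count incremented; exhausted jobs
    are the ones a failure notification would be sent for.
    """
    to_retry, exhausted = [], []
    for job in jobs:
        if job.status != "failed":
            continue
        if job.retry_count < max_retries:
            job.retry_count += 1
            to_retry.append(job.job_id)
        else:
            exhausted.append(job.job_id)
    return to_retry, exhausted
```

A periodic task would call `plan_retries` on the current failed jobs, re-enqueue the first list, and notify on the second.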

docusaurus.config.ts

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ const config: Config = {
 items: [
 {
 label: "Discord",
-href: "https://discord.gg/55FbAHA9",
+href: "https://discord.gg/GjBS3EeMzp",
 },
 {
 label: "Twitter",

sidebars.ts

Lines changed: 1 addition & 1 deletion
@@ -19,8 +19,8 @@ const sidebars: SidebarsConfig = {
 collapsible: true,
 collapsed: false,
 items: [
-{type: "doc", id: "local-setup"},
 {type: "doc", id: "aws-setup"},
+{type: "doc", id: "local-setup"},
 ],
 },
 {

src/pages/index.md

Lines changed: 6 additions & 6 deletions
@@ -52,30 +52,30 @@ This page provides documentation for our API, including setup instructions and c
 <div className="col col--6 margin-bottom--lg">
 <div className="card">
 <div className="card__header">
-<h3>Local Setup</h3>
+<h3>AWS Meeting Bots Configuration</h3>
 </div>
 <div className="card__body">
 <p>
-Learn how to set up the API locally for development and testing purposes.
+Instructions for deploying and hosting the meeting bots on AWS.
 </p>
 </div>
 <div className="card__footer">
-<a href="/cuemeet-documentation/docs/local-setup" className="button button--primary button--block">Learn More</a>
+<a href="/cuemeet-documentation/docs/aws-setup" className="button button--primary button--block">Learn More</a>
 </div>
 </div>
 </div>
 <div className="col col--6 margin-bottom--lg">
 <div className="card">
 <div className="card__header">
-<h3>AWS Meeting Bots Configuration</h3>
+<h3>Control Panel Local Setup</h3>
 </div>
 <div className="card__body">
 <p>
-Instructions for deploying and hosting the API on your own infrastructure.
+Learn how to set up the API locally for development and testing purposes.
 </p>
 </div>
 <div className="card__footer">
-<a href="/cuemeet-documentation/docs/aws-setup" className="button button--primary button--block">Learn More</a>
+<a href="/cuemeet-documentation/docs/local-setup" className="button button--primary button--block">Learn More</a>
 </div>
 </div>
 </div>
