| 1 | +# Nova Sonic Solution |
| 2 | + |
| 3 | +## Table of Contents |
| 4 | + |
| 5 | +- [Overview](#overview) |
| 6 | +- [Architecture](#architecture) |
| 7 | +- [Project Structure](#project-structure) |
| 8 | +- [Prerequisites](#prerequisites) |
| 9 | +- [Deployment](#deployment) |
| 10 | +- [User creation](#user-creation) |
| 11 | +- [Usage](#usage) |
| 12 | +- [Load testing](#load-testing) |
| 13 | +- [Clean Up](#clean-up) |
| 14 | +- [Content Security Legal Disclaimer](#content-security-legal-disclaimer) |
| 15 | +- [Operational Metrics Collection](#operational-metrics-collection) |
| 16 | + |
| 17 | +## Overview |
| 18 | + |
| 19 | +This sample is a real-time speech-to-speech communication platform built on the Amazon Nova Sonic model in Amazon Bedrock, with audio streamed over WebSockets between a Java WebSocket server and a React frontend. Nova Sonic enables natural, context-aware speech-to-speech conversations through its language understanding and generation capabilities. |
| 20 | + |
| 21 | +## Architecture |
| 22 | + |
| 23 | + |
| 24 | + |
| 25 | +The solution consists of three main components: |
| 26 | + |
| 27 | +1. **Frontend Application** |
| 28 | + - React + TypeScript application |
| 29 | + - Real-time WebSocket communication |
| 30 | + - AWS Amplify for authentication |
| 31 | + - Tailwind CSS for styling |
| 32 | + |
| 33 | +2. **Backend Infrastructure** |
| 34 | + - AWS CDK for infrastructure as code |
| 35 | + - Java WebSocket server running on AWS Fargate |
| 36 | + - Amazon Cognito for user authentication |
| 37 | + - CloudFront for content delivery |
| 38 | + - S3 for static website hosting |
| 39 | + - Network Load Balancer for WebSocket traffic |
| 40 | + |
| 41 | +3. **Development Tools** |
| 42 | + - Load testing suite for WebSocket performance testing |
| 43 | + - Automated deployment pipeline |
| 44 | + |
| 45 | +## Project Structure |
| 46 | + |
| 47 | +``` |
| 48 | +. |
| 49 | +├── frontend/ # React + TypeScript frontend application |
| 50 | +├── backend/ # AWS CDK infrastructure and Java WebSocket server |
| 51 | +│ ├── app/ # Java WebSocket server implementation |
| 52 | +│ ├── stack/ # CDK infrastructure code |
| 53 | +│ └── load-test/ # WebSocket load testing suite |
| 54 | +└── images/ # Architecture diagrams and documentation images |
| 55 | +``` |
| 56 | + |
| 57 | +## Prerequisites |
| 58 | + |
| 59 | +- [Python](https://www.python.org/downloads/) 3.11 or higher |
| 60 | +- [Docker Desktop](https://docs.docker.com/desktop/install/) |
| 61 | +- [Gradle](https://gradle.org/install/) 7.x or higher |
| 62 | +- [Git](https://git-scm.com/downloads) |
| 63 | +- [AWS CDK Toolkit](https://docs.aws.amazon.com/cdk/v2/guide/cli.html) |
| 64 | +- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) |
| 65 | +``` |
| 66 | +aws configure --profile [your-profile] |
| 67 | +AWS Access Key ID [None]: xxxxxx |
| 68 | +AWS Secret Access Key [None]: yyyyyyyyyy |
| 69 | +Default region name [None]: us-east-1 |
| 70 | +Default output format [None]: json |
| 71 | +``` |
| 72 | +- Node.js: v18.12.1 or higher |
| 73 | +- npm 8.x or higher |
| 74 | +- Ensure you enable model access to Amazon Nova Sonic in the [Bedrock console](https://us-east-1.console.aws.amazon.com/bedrock/home?region=us-east-1#/modelaccess) in the region where you intend to deploy this sample. For an up-to-date list of supported regions for Amazon Nova Sonic, refer to the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html). |
| 75 | +- Chrome, Safari, or Edge browser environment (Firefox is currently not supported) |
| 76 | +- Microphone and speakers |
| 77 | + |
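Before deploying, it can help to confirm that the command-line prerequisites listed above are on your `PATH`. The following is a minimal sketch, not part of the sample: `missing_tools` is a hypothetical helper name, and it only checks that each command exists, not that its version meets the minimums above.

```shell
# Hypothetical helper (not shipped with this sample).
# Prints the name of every command passed as an argument that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, `missing_tools python3 docker gradle git cdk aws node npm` prints nothing when every tool is installed, and prints one name per line otherwise.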
| 78 | +## Deployment |
| 79 | + |
| 80 | +1. If not done already, clone this repository: |
| 81 | + |
| 82 | + ```shell |
| 83 | + $ git clone https://github.com/aws-samples/generative-ai-cdk-constructs-samples.git |
| 84 | + ``` |
| 85 | + |
| 86 | +2. Enter the sample directory: |
| 87 | + |
| 88 | + ```shell |
| 89 | + $ cd samples/speech-to-speech |
| 90 | + ``` |
| 91 | + |
| 92 | +3. Build the frontend first: |
| 93 | + |
| 94 | + ```shell |
| 95 | + $ cd frontend |
| 96 | + ``` |
| 97 | + |
| 98 | + Install dependencies: |
| 99 | + |
| 100 | + ```shell |
| 101 | + $ npm install |
| 102 | + ``` |
| 103 | + |
| 104 | + Build the web application: |
| 105 | + |
| 106 | + ```shell |
| 107 | + $ npm run build |
| 108 | + ``` |
| 109 | + |
| 110 | +The build output in the `frontend/dist/` directory is deployed automatically to S3 by the backend CDK stack and served through CloudFront. The environment variables are configured automatically by `custom_resource_construct.py` in the CDK stack, which updates the frontend configuration during deployment. |
| 111 | + |
| 112 | +4. Go to the backend directory: |
| 113 | + |
| 114 | + ```shell |
| 115 | + $ cd ../backend |
| 116 | + ``` |
| 117 | + |
| 118 | +5. Create a virtualenv on macOS and Linux: |
| 119 | + |
| 120 | + ```shell |
| 121 | + $ python3 -m venv .venv |
| 122 | + ``` |
| 123 | + |
| 124 | + After the init process completes and the virtualenv is created, you can use the following |
| 125 | + step to activate your virtualenv. |
| 126 | + |
| 127 | + ```shell |
| 128 | + $ source .venv/bin/activate |
| 129 | + ``` |
| 130 | + |
| 131 | + If you are on Windows, activate the virtualenv like this: |
| 132 | + |
| 133 | + ```shell |
| 134 | + $ .venv\Scripts\activate.bat |
| 135 | + ``` |
| 136 | + |
| 137 | +6. Once the virtualenv is activated, you can install the required dependencies. |
| 138 | + |
| 139 | + ```shell |
| 140 | + $ pip install -r requirements.txt |
| 141 | + ``` |
| 142 | + |
| 143 | +7. Run the following to bootstrap your account: |
| 144 | + |
| 145 | + ```shell |
| 146 | + $ cdk bootstrap |
| 147 | + ``` |
| 148 | + |
| 149 | +8. Run the AWS CDK Toolkit to deploy the backend stack with the runtime resources: |
| 150 | + |
| 151 | + ```shell |
| 152 | + $ cdk deploy --require-approval=never |
| 153 | + ``` |
| 154 | + |
| 155 | + Any modifications made to the code can be applied to the deployed stack by running the same command again. |
| 156 | + |
| 157 | + ```shell |
| 158 | + cdk deploy --require-approval=never |
| 159 | + ``` |
| 160 | + |
| 161 | +The command above deploys one stack in your account. With the default configuration of this sample, the observed deployment time was ~646 seconds (about 10.8 minutes). |
| 162 | + |
| 163 | +Get the CloudFront domain name: |
| 164 | + |
| 165 | +```shell |
| 166 | +aws cloudformation describe-stacks \ |
| 167 | + --stack-name NovaSonicSolutionBackendStack \ |
| 168 | + --query 'Stacks[0].Outputs[?OutputKey==`CloudFrontDistributionDomainName`].OutputValue' \ |
| 169 | + --output text |
| 170 | +``` |
| 171 | + |
| 172 | +The frontend can be accessed at the domain name above (XXXX.cloudfront.net). |
| 173 | + |
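Because several later steps read values from the stack outputs, the query above can be wrapped in a small function. This is a sketch under assumptions: `stack_output` is a hypothetical helper name, and it expects the AWS CLI to be configured for the account and region where the stack was deployed.

```shell
# Hypothetical helper (not shipped with this sample).
# Usage: stack_output <stack-name> <exact-output-key>
# Prints the value of one CloudFormation stack output as plain text.
stack_output() {
  aws cloudformation describe-stacks \
    --stack-name "$1" \
    --query "Stacks[0].Outputs[?OutputKey=='$2'].OutputValue" \
    --output text
}
```

For example, `stack_output NovaSonicSolutionBackendStack CloudFrontDistributionDomainName` prints the same domain name as the command above.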
| 174 | +## User creation |
| 175 | + |
| 176 | +First, retrieve the Cognito User Pool ID using the AWS CLI: |
| 177 | + |
| 178 | +```shell |
| 179 | +$ aws cloudformation describe-stacks --stack-name NovaSonicSolutionBackendStack --query "Stacks[0].Outputs[?contains(OutputKey, 'UserPoolId')].OutputValue" |
| 180 | + |
| 181 | +[ |
| 182 | + "<region>_a1aaaA1Aa" |
| 183 | +] |
| 184 | +``` |
| 185 | + |
| 186 | +1. Open the AWS Management Console. |
| 187 | +2. Search for "Cognito" in the console search bar, click "Cognito" under Services, then click "User Pools" in the left navigation. |
| 188 | + Find and click the User Pool whose ID you retrieved above. |
| 189 | +3. In the User Pool dashboard, click "Users" in the left navigation. Click the "Create user" button and create a user with a password. |
| 190 | + |
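The console steps above can also be approximated from the AWS CLI. The sketch below is an assumption-laden alternative, not the documented procedure for this sample: `create_cognito_user` is a hypothetical helper name, and it assumes the User Pool accepts a plain username with no additional required attributes and that the password satisfies the pool's password policy.

```shell
# Hypothetical helper (not shipped with this sample).
# Usage: create_cognito_user <user-pool-id> <username> <password>
# Assumes the pool needs no extra required attributes (for example, email).
create_cognito_user() {
  pool_id="$1"
  username="$2"
  password="$3"

  # Create the user without sending an invitation message.
  aws cognito-idp admin-create-user \
    --user-pool-id "$pool_id" \
    --username "$username" \
    --message-action SUPPRESS &&
  # Set a permanent password so the user is not forced to change it at first sign-in.
  aws cognito-idp admin-set-user-password \
    --user-pool-id "$pool_id" \
    --username "$username" \
    --password "$password" \
    --permanent
}
```

Call it with the User Pool ID retrieved above, for example `create_cognito_user "<region>_a1aaaA1Aa" demo-user '<your-password>'`. If the pool is configured differently from these assumptions, the console flow remains the safer route.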
| 191 | +## Usage |
| 192 | + |
| 193 | +1. Open your browser and go to the application URL (the CloudFront domain from the CDK outputs, retrieved earlier). |
| 194 | +2. Click "Speech to Speech" in the sidebar navigation menu. |
| 195 | +3. Click the "Start Streaming" button. When prompted, allow access to your microphone. |
| 196 | +4. Begin speaking. You should see your speech transcribed in real time in the UI. |
| 197 | +5. The assistant automatically processes your message and responds through speech. |
| 198 | +6. Click "Stop Streaming" when you're done. |
| 199 | + |
| 200 | + |
| 201 | + |
| 202 | +> Note: Ensure your microphone is properly connected and working before testing. The browser may require you to grant microphone permissions the first time you use the feature. |
| 203 | + |
| 204 | +## Load testing |
| 205 | + |
| 206 | +The [backend/load-test](backend/load-test/) directory contains [Artillery](https://www.artillery.io/docs) scripts for WebSocket performance testing. This will require the installation of [Artillery](https://www.artillery.io/docs/get-started/get-artillery). |
| 207 | + |
| 208 | +1. Set up load testing: |
| 209 | + |
| 210 | + ```shell |
| 211 | + $ cd backend/load-test |
| 212 | + $ npm install |
| 213 | + $ ./setup-load-test.sh |
| 214 | + ``` |
| 215 | + |
| 216 | +2. Run load tests: |
| 217 | + |
| 218 | + ```shell |
| 219 | + $ ./run-load-test.sh |
| 220 | + ``` |
| 221 | + |
| 222 | +3. Generate an HTML report: |
| 223 | + |
| 224 | + ```shell |
| 225 | + $ artillery report report.json |
| 226 | + ``` |
| 227 | + |
| 228 | +## Clean Up |
| 229 | + |
| 230 | +Do not forget to delete the stack to avoid unexpected charges. |
| 231 | + |
| 232 | +```shell |
| 233 | +cdk destroy NovaSonicSolutionBackendStack |
| 234 | +``` |
| 235 | + |
| 236 | +Delete the associated log groups created by the different services in Amazon CloudWatch Logs. |
| 237 | + |
| 238 | +Ensure S3 buckets are emptied before deletion. |
| 239 | + |
| 240 | +## Content Security Legal Disclaimer |
| 241 | + |
| 242 | +The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage. |
| 243 | + |
| 244 | +## Operational Metrics Collection |
| 245 | + |
| 246 | +This solution collects anonymous operational metrics to help AWS improve the quality and features of the solution. Data collection is subject to the AWS Privacy Policy (https://aws.amazon.com/privacy/). To opt out of this feature, simply remove the tag(s) starting with “uksb-” or “SO” from the description(s) in any CloudFormation templates or CDK TemplateOptions. |