ISL-INTELLIGENT-SYSTEMS-LAB/TC3-RT-Documentation

Tourniquet Detection and Tracking System

Python · YOLO · Whisper · PyTorch · Tkinter · Audio · NLP

This application provides a real-time computer vision and audio processing system for detecting, tracking, and verifying the correct application of tourniquets in live video feeds. It integrates object detection, motion tracking, speech transcription, and contextual analysis.


Features

  • Device Adaptability: Automatically detects Intel RealSense cameras, with fallback to standard webcams.
  • Integrated video and audio pipelines.
  • Live graphical interface showing video and detection status.
  • Real-time audio recording, transcription, and natural language processing.

Video Processing

  • Real-time video capture from supported cameras.
  • Multi-scale object detection using YOLO models to identify tourniquets.
  • Feature-based tracking to monitor tourniquet movement and position.
  • Stability detection to assess proper application based on minimal motion.
  • GUI displaying:
    • Live video feed
    • Bounding boxes and labels for detected tourniquets
    • Tracking stability information
  • Optional debug logging and video recording.
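The stability check described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes bounding-box centers sampled once per frame and a hypothetical pixel-motion threshold.

```python
import math

def is_stable(centers, max_motion=3.0):
    """Return True when the tracked tourniquet's bounding-box center
    moved less than `max_motion` pixels between consecutive frames.

    centers: list of (x, y) box centers from recent frames.
    max_motion: hypothetical threshold for "minimal motion".
    """
    if len(centers) < 2:
        return False  # not enough history to judge stability
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        if math.hypot(x1 - x0, y1 - y0) > max_motion:
            return False
    return True
```

For example, `is_stable([(100, 100), (101, 100), (101, 101)])` is stable, while a jump of 50 pixels between frames is not.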

Example Output in GUI:

*(Screenshot: video output example)*

The GUI highlights detected tourniquets with labeled bounding boxes and stability indicators. A color-coded overlay (e.g., green for stable, red for unstable) indicates whether the detected item is considered properly applied. The detection confidence and object ID are also shown.
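One way to map the stability state, confidence, and object ID to an overlay style is sketched below. The color tuples follow OpenCV's BGR channel order; the exact colors and label format are assumptions, not the project's actual scheme.

```python
# BGR color tuples, following OpenCV's channel order.
STABLE_COLOR = (0, 255, 0)    # green: considered properly applied
UNSTABLE_COLOR = (0, 0, 255)  # red: still moving / not yet verified

def overlay_style(stable, confidence, object_id):
    """Build the box color and label text for one detection."""
    color = STABLE_COLOR if stable else UNSTABLE_COLOR
    state = "stable" if stable else "unstable"
    label = f"tourniquet #{object_id} {confidence:.0%} {state}"
    return color, label
```

For instance, `overlay_style(True, 0.92, 3)` yields a green box labeled `tourniquet #3 92% stable`.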


Audio Processing

  • Real-time audio capture during video monitoring sessions.
  • Speech detection using adaptive volume thresholds.
  • Transcription performed using the Whisper model for high-accuracy speech-to-text conversion.
  • Natural language processing (NLP) applied to extract contextual information from the transcribed text.
  • Timestamping and duration logging for all detected speech segments.
  • Optional saving of transcriptions and audio logs for later review.
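The adaptive volume threshold mentioned above can be illustrated with a simple running-average noise floor. This is a sketch under assumed parameters (the `margin` multiplier and smoothing factor are hypothetical); the project's real tuning may differ.

```python
def detect_speech(rms_levels, margin=1.5):
    """Flag frames whose RMS volume exceeds an adaptive threshold.

    The threshold tracks a running average of recent energy, so it
    adapts to the ambient noise floor. `margin` is a hypothetical
    multiplier above that floor.
    """
    flags = []
    noise_floor = rms_levels[0] if rms_levels else 0.0
    for level in rms_levels:
        flags.append(level > noise_floor * margin)
        # slowly adapt the noise floor toward the current level
        noise_floor = 0.9 * noise_floor + 0.1 * level
    return flags
```

A burst of loud frames is flagged as speech, and the floor then decays back toward quiet levels once the speaker stops.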

Example Output in GUI:

*(Screenshot: audio output example)*

Transcribed speech is displayed in a side panel with associated timestamps and durations. Contextual tags or summaries are extracted using NLP and shown beneath each transcription block to help users understand key spoken content related to the scene (e.g., commands, status updates).
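A very reduced sketch of the contextual-tag idea is shown below, using a keyword lookup. The tag vocabulary here is entirely hypothetical; the project's NLP step is presumably more involved than substring matching.

```python
# Hypothetical tag vocabulary for illustration only.
TAG_KEYWORDS = {
    "command": ("apply", "tighten", "hold"),
    "status": ("applied", "stable", "bleeding controlled"),
}

def tag_transcript(text):
    """Return the contextual tags found in one transcription block."""
    lowered = text.lower()
    return sorted(
        tag for tag, words in TAG_KEYWORDS.items()
        if any(word in lowered for word in words)
    )
```

For example, "Tourniquet applied, hold pressure" would be tagged both as a command and a status update.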


Requirements

Whisper Model

A Whisper speech-to-text model is required: use a pre-trained model or train your own.


Installation

  1. Clone the repository:

    git clone https://github.com/Matt-y-Ice/TC3-RT-Documentation.git
  2. Install the required packages:

    pip install -r requirements.txt
  3. (Optional) Install RealSense support:

    pip install pyrealsense2

Usage

Run the application with:

python gui_integration.py

The system will:

  • Attempt to initialize a RealSense camera, or fall back to a standard webcam.
  • Begin video detection and tracking of tourniquets in real time.
  • If audio features are enabled:
    • Begin recording audio.
    • Transcribe speech using Whisper.
    • Analyze transcripts for context using NLP.
    • Display or save results for review.
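The RealSense-or-webcam fallback can be sketched roughly as below. This is a simplified illustration: the application's actual initialization would also open and configure the chosen stream, not just pick a backend name.

```python
import importlib.util

def pick_camera_backend():
    """Prefer the RealSense SDK when its Python bindings are
    importable; otherwise fall back to a standard webcam
    (e.g. opened via OpenCV's VideoCapture)."""
    if importlib.util.find_spec("pyrealsense2") is not None:
        return "realsense"
    return "webcam"
```

Probing with `find_spec` avoids a hard dependency on `pyrealsense2`, which matches its status as an optional install.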

Camera Troubleshooting

For RealSense cameras:

  • Ensure the RealSense SDK Python bindings (pyrealsense2) are installed.
  • Use USB 3.0 ports for stable bandwidth.
  • Verify device visibility in system tools.

For standard webcams:

  • Check physical connections and close other applications using the camera.
  • On Linux: add your user to the video group (sudo usermod -aG video $USER), then verify membership with groups $USER.
  • On Windows: use Device Manager for diagnostics.
  • On macOS: allow camera permissions via System Settings.
