umitkacar/Ear-segmentation-ai

🦻 Ear Segmentation AI

An ear segmentation library powered by deep learning. It detects and segments human ears in images and video streams, running at about 15 FPS on CPU and 120 FPS on GPU at 480×320 (see Performance).

Ear Segmentation Demo

✨ Features

  • 🚀 High Performance: Optimized for both CPU and GPU processing
  • 🎯 Accurate Detection: State-of-the-art U-Net architecture with ResNet18 encoder
  • 📷 Multiple Input Sources: Images, videos, webcam, and URLs
  • 🔄 Real-time Processing: Smooth webcam segmentation with temporal smoothing
  • 📊 Batch Processing: Efficient processing of multiple images
  • 🛠️ Easy to Use: Simple Python API and CLI
  • 🎨 Visualization Tools: Built-in mask overlay and heatmap visualization
  • 📦 Lightweight: Minimal dependencies, easy to install

🚀 Quick Start

Installation

# Using pip
pip install earsegmentationai

# Using poetry (recommended)
poetry add earsegmentationai

For detailed installation instructions, see the Installation Guide.

Basic Usage

Python API

from earsegmentationai import ImageProcessor

# Initialize processor
processor = ImageProcessor(device="cpu")  # or "cuda:0" for GPU

# Process single image
result = processor.process("path/to/image.jpg")
print(f"Ear detected: {result.has_ear}")
print(f"Ear area: {result.ear_percentage:.2f}% of image")

# Process with visualization
result = processor.process(
    "path/to/image.jpg",
    return_visualization=True
)
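
To make the visualization option more concrete, here is a minimal sketch of how a binary mask can be alpha-blended onto an image with NumPy. It is an illustration only and not the library's implementation: the `overlay_mask` helper, the overlay color, and the `alpha` value are all hypothetical.

```python
import numpy as np


def overlay_mask(image, mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` into `image` wherever `mask` is non-zero.

    Hypothetical helper for illustration; not part of earsegmentationai.
    image: (H, W, 3) uint8 array, mask: (H, W) array of 0/1 values.
    """
    out = image.astype(np.float32)
    selected = mask.astype(bool)
    tint = np.array(color, dtype=np.float32)
    # Weighted average of the original pixel and the overlay color.
    out[selected] = (1.0 - alpha) * out[selected] + alpha * tint
    return out.round().astype(np.uint8)


# Tiny synthetic example: a black 4x4 image with a 2x2 "ear" region.
image = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

blended = overlay_mask(image, mask)
```

Pixels outside the mask are left unchanged, so the overlay only tints the segmented region.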

Command Line

# Process single image
earsegmentationai process-image path/to/image.jpg --save-viz

# Process directory
earsegmentationai process-image path/to/images/ -o output/

# Real-time webcam
earsegmentationai webcam --device cuda:0

# Process video
earsegmentationai process-video path/to/video.mp4 -o output.avi

📚 Documentation

📚 Advanced Usage

Batch Processing

from earsegmentationai import ImageProcessor

processor = ImageProcessor(device="cuda:0")

# Process multiple images
results = processor.process([
    "image1.jpg",
    "image2.jpg",
    "image3.jpg"
])

print(f"Detection rate: {results.detection_rate:.1f}%")
print(f"Average ear area: {results.average_ear_area:.0f} pixels")
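
As a rough illustration of what the two batch statistics above could mean, the sketch below computes a detection rate and an average ear area from a list of per-image ear areas. The `summarize` helper is hypothetical, and the choice to average over detections only (rather than over all images) is an assumption; the library may define these values differently.

```python
from statistics import mean


def summarize(ear_areas):
    """Summarize per-image ear areas in pixels; 0 means no ear was found.

    Hypothetical helper for illustration; not part of earsegmentationai.
    Assumes the average is taken over images with a detection only.
    """
    detected = [area for area in ear_areas if area > 0]
    detection_rate = 100.0 * len(detected) / len(ear_areas) if ear_areas else 0.0
    average_area = mean(detected) if detected else 0.0
    return detection_rate, average_area


# Four images, one of them without a detected ear.
rate, area = summarize([1200, 0, 1800, 1500])
print(f"Detection rate: {rate:.1f}%")          # Detection rate: 75.0%
print(f"Average ear area: {area:.0f} pixels")  # Average ear area: 1500 pixels
```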

Video Processing

from earsegmentationai import VideoProcessor

processor = VideoProcessor(
    device="cuda:0",
    skip_frames=2,      # Process every 3rd frame
    smooth_masks=True   # Temporal smoothing
)

# Process video file
stats = processor.process(
    "video.mp4",
    output_path="output.mp4",
    display=True
)

print(f"FPS: {stats['average_fps']:.1f}")
print(f"Detection rate: {stats['detection_rate']:.1f}%")
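
The `skip_frames` comment above (a value of 2 processes every 3rd frame) can be sketched as simple index arithmetic. The `frames_to_process` helper is hypothetical, and starting at frame 0 is an assumption about how the library counts frames.

```python
def frames_to_process(total_frames, skip_frames):
    """Return the indices of frames that are run through the model.

    Hypothetical helper for illustration. Assumes skip_frames=N means
    "skip N frames after each processed one", starting at frame 0.
    """
    step = skip_frames + 1
    return list(range(0, total_frames, step))


print(frames_to_process(10, 0))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(frames_to_process(10, 2))  # [0, 3, 6, 9]
```

Under this reading, `skip_frames=2` cuts the number of model calls to roughly a third, at the cost of updating the mask less often.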

Custom Configuration

from earsegmentationai import ImageProcessor, Config

# Create custom configuration
config = Config(
    model={"architecture": "FPN", "encoder_name": "resnet50"},
    processing={"input_size": (640, 480), "batch_size": 8}
)

processor = ImageProcessor(config=config, threshold=0.7)

🔧 Configuration

Model Settings

| Parameter | Default | Description |
| --- | --- | --- |
| architecture | "Unet" | Model architecture (Unet, FPN, PSPNet, DeepLabV3, DeepLabV3Plus) |
| encoder_name | "resnet18" | Encoder backbone |
| input_size | (480, 320) | Input image size (width, height) |
| threshold | 0.5 | Binary mask threshold |
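
To show what the `threshold` setting controls, here is a minimal sketch that turns a per-pixel probability map into a binary mask and reports the masked share of the image. The `binarize` and `ear_percentage` helpers are hypothetical, and using a strict `>` comparison is an assumption; the library may use `>=` or a different post-processing step.

```python
import numpy as np


def binarize(probabilities, threshold=0.5):
    """Mark pixels whose probability exceeds `threshold` as ear (1), else 0.

    Hypothetical helper for illustration; not part of earsegmentationai.
    """
    return (probabilities > threshold).astype(np.uint8)


def ear_percentage(mask):
    """Share of pixels labeled as ear, in percent of the whole image."""
    return 100.0 * mask.sum() / mask.size


# A made-up 2x3 probability map.
probs = np.array([[0.10, 0.60, 0.80],
                  [0.20, 0.55, 0.90]])

mask_default = binarize(probs)        # threshold 0.5 keeps 4 of 6 pixels
mask_strict = binarize(probs, 0.7)    # threshold 0.7 keeps 2 of 6 pixels

print(f"{ear_percentage(mask_default):.2f}%")
print(f"{ear_percentage(mask_strict):.2f}%")
```

Raising the threshold shrinks the mask, which trades missed ear pixels for fewer false positives.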

Processing Options

| Parameter | Default | Description |
| --- | --- | --- |
| device | "cpu" | Processing device (cpu, cuda:0) |
| batch_size | 1 | Batch size for processing |
| skip_frames | 0 | Frame skipping for video (0 = process all) |
| smooth_masks | True | Enable temporal smoothing for video |
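
This README does not say how `smooth_masks` works internally. One common way to smooth masks over time is an exponential moving average of the per-frame probability maps, sketched below. The `MaskSmoother` class and its `momentum` parameter are hypothetical and only illustrate the general idea.

```python
import numpy as np


class MaskSmoother:
    """Exponential moving average over per-frame probability maps.

    Hypothetical class for illustration; the library's smooth_masks
    option may use a different method.
    """

    def __init__(self, momentum=0.6):
        self.momentum = momentum  # weight given to the previous state
        self.state = None

    def update(self, probabilities):
        if self.state is None:
            # First frame: nothing to average with yet.
            self.state = probabilities.astype(np.float32)
        else:
            self.state = (self.momentum * self.state
                          + (1.0 - self.momentum) * probabilities)
        return self.state


# Three 2x2 frames; the middle one is a one-frame dropout.
smoother = MaskSmoother(momentum=0.6)
frames = [np.full((2, 2), p, dtype=np.float32) for p in (0.9, 0.1, 0.9)]
smoothed = [smoother.update(frame) for frame in frames]
```

In this example the raw middle frame (0.1) would fall below a 0.5 threshold, while the smoothed value (0.58) stays above it, so the mask does not flicker.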

🏗️ Architecture

The library uses a modular architecture with clear separation of concerns:

earsegmentationai/
├── core/           # Core model and prediction logic
├── preprocessing/  # Image preprocessing and validation
├── postprocessing/ # Visualization and export utilities
├── api/            # High-level Python API
├── cli/            # Command-line interface
└── utils/          # Logging, exceptions, and helpers

🧪 Testing

# Run all tests
make test

# Run with coverage
make test-cov

# Run specific test suite
poetry run pytest tests/unit/test_transforms.py

🀝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

# Setup development environment
make install-dev

# Run linting and formatting
make format
make lint

# Run pre-commit hooks
make pre-commit

📈 Performance

| Device | Image Size | FPS | Memory |
| --- | --- | --- | --- |
| CPU (i7-9700K) | 480×320 | 15 | 200 MB |
| GPU (RTX 3080) | 480×320 | 120 | 400 MB |
| GPU (RTX 3080) | 1920×1080 | 45 | 800 MB |
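
For a latency view of the same numbers, the short sketch below converts the FPS figures from the table into per-frame time and computes the GPU-over-CPU speedup at 480×320. It only rearranges the values listed above; the `frame_time_ms` helper is hypothetical.

```python
def frame_time_ms(fps):
    """Average time per frame in milliseconds for a given throughput."""
    return 1000.0 / fps


# FPS values copied from the table above.
benchmarks = {
    "CPU (i7-9700K), 480x320": 15,
    "GPU (RTX 3080), 480x320": 120,
    "GPU (RTX 3080), 1920x1080": 45,
}

for name, fps in benchmarks.items():
    print(f"{name}: {frame_time_ms(fps):.1f} ms per frame")

speedup = benchmarks["GPU (RTX 3080), 480x320"] / benchmarks["CPU (i7-9700K), 480x320"]
print(f"GPU speedup at 480x320: {speedup:.1f}x")
```

This works out to about 66.7 ms per frame on the CPU and 8.3 ms on the GPU at 480×320, an 8× difference.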

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support


Made with ❤️ by the Ear Segmentation AI Team
