A state-of-the-art ear segmentation library powered by deep learning. Detect and segment human ears in images and video streams with high accuracy and real-time performance.
- 🚀 High Performance: Optimized for both CPU and GPU processing
- 🎯 Accurate Detection: State-of-the-art U-Net architecture with ResNet18 encoder
- 📷 Multiple Input Sources: Images, videos, webcam, and URLs
- ⚡ Real-time Processing: Smooth webcam segmentation with temporal smoothing
- 📁 Batch Processing: Efficient processing of multiple images
- 🛠️ Easy to Use: Simple Python API and CLI interface
- 🎨 Visualization Tools: Built-in mask overlay and heatmap visualization
- 📦 Lightweight: Minimal dependencies, easy to install
```bash
# Using pip
pip install earsegmentationai

# Using poetry (recommended)
poetry add earsegmentationai
```

For detailed installation instructions, see the Installation Guide.
```python
from earsegmentationai import ImageProcessor

# Initialize processor
processor = ImageProcessor(device="cpu")  # or "cuda:0" for GPU

# Process single image
result = processor.process("path/to/image.jpg")
print(f"Ear detected: {result.has_ear}")
print(f"Ear area: {result.ear_percentage:.2f}% of image")

# Process with visualization
result = processor.process(
    "path/to/image.jpg",
    return_visualization=True,
)
```
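The `ear_percentage` field above reports how much of the frame the segmented ear covers. As a rough, library-independent sketch of that computation (the helper name and the nested-list mask representation are assumptions for illustration, not the library's internals):

```python
def ear_percentage(mask):
    """Percentage of pixels marked as ear in a binary mask (rows of 0/1)."""
    total = sum(len(row) for row in mask)
    ear = sum(sum(row) for row in mask)
    return 100.0 * ear / total

# 5 of 12 pixels belong to the ear region
mask = [[0, 0, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 0, 0]]
print(f"{ear_percentage(mask):.2f}%")  # 41.67%
```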
```bash
# Process single image
earsegmentationai process-image path/to/image.jpg --save-viz

# Process directory
earsegmentationai process-image path/to/images/ -o output/

# Real-time webcam
earsegmentationai webcam --device cuda:0

# Process video
earsegmentationai process-video path/to/video.mp4 -o output.avi
```

- Installation Guide
- Quick Start Guide
- Architecture Overview
- API Reference
- Contributing Guide
- Migration Guide
```python
from earsegmentationai import ImageProcessor

processor = ImageProcessor(device="cuda:0")

# Process multiple images
results = processor.process([
    "image1.jpg",
    "image2.jpg",
    "image3.jpg",
])
print(f"Detection rate: {results.detection_rate:.1f}%")
print(f"Average ear area: {results.average_ear_area:.0f} pixels")
```

```python
from earsegmentationai import VideoProcessor

processor = VideoProcessor(
    device="cuda:0",
    skip_frames=2,      # Process every 3rd frame
    smooth_masks=True,  # Temporal smoothing
)

# Process video file
stats = processor.process(
    "video.mp4",
    output_path="output.mp4",
    display=True,
)
print(f"FPS: {stats['average_fps']:.1f}")
print(f"Detection rate: {stats['detection_rate']:.1f}%")
```

```python
from earsegmentationai import ImageProcessor, Config

# Create custom configuration
config = Config(
    model={"architecture": "FPN", "encoder_name": "resnet50"},
    processing={"input_size": (640, 480), "batch_size": 8},
)
processor = ImageProcessor(config=config, threshold=0.7)
```

| Parameter | Default | Description |
|---|---|---|
| `architecture` | `"Unet"` | Model architecture (`Unet`, `FPN`, `PSPNet`, `DeepLabV3`, `DeepLabV3Plus`) |
| `encoder_name` | `"resnet18"` | Encoder backbone |
| `input_size` | `(480, 320)` | Input image size (width, height) |
| `threshold` | `0.5` | Binary mask threshold |
| Parameter | Default | Description |
|---|---|---|
| `device` | `"cpu"` | Processing device (`cpu`, `cuda:0`) |
| `batch_size` | `1` | Batch size for processing |
| `skip_frames` | `0` | Frame skipping for video (0 = process all) |
| `smooth_masks` | `True` | Enable temporal smoothing for video |
The library uses a modular architecture with clear separation of concerns:
```
earsegmentationai/
├── core/            # Core model and prediction logic
├── preprocessing/   # Image preprocessing and validation
├── postprocessing/  # Visualization and export utilities
├── api/             # High-level Python API
├── cli/             # Command-line interface
└── utils/           # Logging, exceptions, and helpers
```
```bash
# Run all tests
make test

# Run with coverage
make test-cov

# Run specific test suite
poetry run pytest tests/unit/test_transforms.py
```

We welcome contributions! Please see our Contributing Guide for details.
```bash
# Setup development environment
make install-dev

# Run linting and formatting
make format
make lint

# Run pre-commit hooks
make pre-commit
```

| Device | Image Size | FPS | Memory |
|---|---|---|---|
| CPU (i7-9700K) | 480×320 | 15 | 200 MB |
| GPU (RTX 3080) | 480×320 | 120 | 400 MB |
| GPU (RTX 3080) | 1920×1080 | 45 | 800 MB |
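For latency budgeting it can help to restate those FPS figures as per-frame processing time (latency_ms = 1000 / fps):

```python
# Per-frame latency implied by the benchmark FPS numbers above.
benchmarks = {
    "CPU (i7-9700K) @ 480x320":   15,
    "GPU (RTX 3080) @ 480x320":   120,
    "GPU (RTX 3080) @ 1920x1080": 45,
}
for setup, fps in benchmarks.items():
    print(f"{setup}: {1000.0 / fps:.1f} ms/frame")
# CPU (i7-9700K) @ 480x320: 66.7 ms/frame
# GPU (RTX 3080) @ 480x320: 8.3 ms/frame
# GPU (RTX 3080) @ 1920x1080: 22.2 ms/frame
```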
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with PyTorch and segmentation-models-pytorch
- Inspired by state-of-the-art segmentation research
- Thanks to all contributors and the open-source community
- 📧 Email: umitkacar.phd@gmail.com
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
Made with ❤️ by the Ear Segmentation AI Team
