enhance benchmark with dataset discovery, validation, performance monitoring, and improved Docker support #32

Merged: 18 commits, Jul 13, 2025

@fcostaoliveira fcostaoliveira commented Jul 12, 2025

Key Features

Dataset & Engine Discovery

  • Add --describe datasets and --describe engines commands for easy exploration
  • Columnar table display with dimensions, vector counts, descriptions, and schema information
  • Smart dataset sorting by dimension → vector count → name
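
The sort order described above (dimension, then vector count, then name) can be sketched with a tuple sort key. This is a minimal illustration, not the code from this PR: the field names `vector_size` and `vector_count`, the helper names, and the column layout are assumptions made for the example.

```python
from typing import Dict, List, Union

# Hypothetical record shape: one dict per dataset.
Dataset = Dict[str, Union[str, int]]


def sort_datasets(datasets: List[Dataset]) -> List[Dataset]:
    """Order by dimension, then vector count, then name (all ascending)."""
    return sorted(
        datasets,
        key=lambda d: (d["vector_size"], d["vector_count"], d["name"]),
    )


def format_table(datasets: List[Dataset]) -> str:
    """Render the sorted datasets as a left-aligned columnar table."""
    rows = [("Name", "Dim", "Vectors")]
    for d in sort_datasets(datasets):
        rows.append((str(d["name"]), str(d["vector_size"]), f'{d["vector_count"]:,}'))
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    lines = []
    for row in rows:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append("  ".join(cells).rstrip())
    return "\n".join(lines)
```

Sorting on a tuple keeps the three-level ordering in one place, so adding or reordering a criterion only changes the key.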

Real-time Performance Monitoring

  • Single-line performance summaries after each benchmark iteration
  • Display QPS, P50/P95 latency, and precision metrics
  • Standardized precision formatting with proper rounding rules
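
As a rough illustration of the single-line summary described above, the sketch below derives QPS and P50/P95 latency from a list of per-query latencies. The function names, the nearest-rank percentile method, and the exact layout of the output line are assumptions for this example; the PR does not state how the benchmark computes or prints them.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_s: Sequence[float], total_time_s: float, precision: float) -> str:
    """Build a one-line summary; latencies are given in seconds, printed in milliseconds."""
    qps = len(latencies_s) / total_time_s
    p50_ms = percentile(latencies_s, 50) * 1000
    p95_ms = percentile(latencies_s, 95) * 1000
    return (
        f"QPS: {qps:.1f} | P50: {p50_ms:.2f} ms | "
        f"P95: {p95_ms:.2f} ms | precision: {precision:.4f}"
    )
```

QPS here is the query count divided by wall-clock time for the iteration, so it reflects parallelism; dividing by the sum of latencies instead would understate throughput for concurrent runs.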

Enhanced Docker Experience

  • Improved Docker build and run scripts with better error handling
  • Updated Docker configurations for Redis testing with proper defaults
  • Streamlined Docker test workflows and container management
  • Enhanced .dockerignore for optimized build contexts
  • Removed deprecated docker-compose.yml in favor of simplified Docker workflows

Comprehensive Data Validation

  • GitHub Actions for automatic dataset validation on changes
  • Local validation script with comprehensive checks for JSON structure, required fields, and data consistency
  • Multi-Python version testing (3.10-3.13) with Redis container integration
  • Validates --describe functionality to ensure reliability
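
The kind of checks listed above (JSON structure, required fields, data consistency) might look like the following sketch. It assumes the dataset configuration is a JSON list of objects; the required-field set and the function name `validate_datasets` are hypothetical and are not taken from the PR's validation script.

```python
import json
from typing import List

# Assumed required fields and their expected types (illustrative only).
REQUIRED_FIELDS = {"name": str, "vector_count": int, "description": str}


def validate_datasets(raw_json: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the input passed."""
    try:
        entries = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(entries, list):
        return ["top-level value must be a list of dataset entries"]

    errors: List[str] = []
    seen_names = set()
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            errors.append(f"entry {index}: expected an object")
            continue
        label = entry.get("name", f"entry {index}")
        for field, expected in REQUIRED_FIELDS.items():
            if field not in entry:
                errors.append(f"{label}: missing required field '{field}'")
            elif type(entry[field]) is not expected:  # also rejects bool for int
                errors.append(f"{label}: field '{field}' must be {expected.__name__}")
        count = entry.get("vector_count")
        if type(count) is int and count <= 0:
            errors.append(f"{label}: vector_count must be positive")
        name = entry.get("name")
        if isinstance(name, str):
            if name in seen_names:
                errors.append(f"{name}: duplicate dataset name")
            seen_names.add(name)
    return errors
```

Collecting every problem instead of stopping at the first one lets a CI run report all broken entries in a single pass.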

Enhanced Dataset Metadata

  • Complete vector_count and description fields for all 42 datasets
  • Improved dataset configuration with backward compatibility
  • Better data organization and consistency

Improved Reliability

  • Enhanced download functionality with proper HTTP headers to avoid 403 errors
  • Robust error handling and validation across Docker and native environments
  • Updated Redis configurations and container specifications
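
Some hosts reject requests that carry the default Python user agent, which is one common cause of 403 responses on downloads. The sketch below shows the general technique with the standard library only. The specific header values and the helper names `build_request` and `download` are assumptions; the PR does not say which headers it sends.

```python
import shutil
import urllib.request
from typing import Dict, Optional

# Assumed defaults: a browser-like User-Agent and a permissive Accept header.
DEFAULT_HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; vector-db-benchmark)",
    "Accept": "*/*",
}


def build_request(url: str, headers: Optional[Dict[str, str]] = None) -> urllib.request.Request:
    """Create a request with explicit headers; caller-supplied headers override the defaults."""
    merged = {**DEFAULT_HEADERS, **(headers or {})}
    return urllib.request.Request(url, headers=merged)


def download(url: str, destination: str, timeout: float = 30.0) -> None:
    """Stream the response body to a file without loading it fully into memory."""
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(destination, "wb") as out:
            shutil.copyfileobj(response, out)
```

Keeping request construction separate from the network call makes the header logic testable without touching the network.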

Technical Details

  • Docker Changes: Updated build scripts, run configurations, test workflows, and container specifications
  • Files Changed: Core functionality (run.py, engine clients, dataset handling), Docker infrastructure, CI/CD workflows
  • Testing: Added comprehensive GitHub Actions for validation and basic functionality testing with Redis containers
  • Documentation: Added validation documentation and updated Docker setup guides

- Enhanced Dockerfile with multi-stage build and security best practices
- Added Docker build, run, and test scripts with Redis-specific configurations
- Created GitHub Actions workflows for PR validation, master publishing, and release publishing
- Added docker-compose.yml for local development with Redis
- Updated documentation with Docker usage examples
- Configured for redis-performance/vector-db-benchmark Docker Hub repository
- Default configuration: engines=redis, dataset=random-100, experiment=redis-m-16-ef-64
- Multi-platform support (linux/amd64, linux/arm64)
- Security scanning with Trivy for releases
- Updated PR validation to trigger on update-redisearch branch
- Updated publishing workflow to use update-redisearch branch instead of master
- Updated Docker tags to use update-redisearch-{sha} format
- Updated documentation to reflect correct default branch

🐳 Docker Build Validation

Docker build successful!

Platforms tested:

  • ✅ linux/amd64 (built and tested)
  • ✅ linux/arm64 (build validated)

Git SHA: e64ba801ffda2b24505edd6c2555240192b4e612

Docker Hub Status: ✅ Docker Hub credentials configured

Image details:

  • Single platform: vector-db-benchmark-pr:pr-32
  • Multi-platform: vector-db-benchmark-pr:pr-32-multiplatform

Tests performed:

  • ✅ Docker Hub credentials check
  • ✅ Help command execution
  • ✅ Python environment validation
  • ✅ Redis connectivity test
  • ✅ Benchmark execution test (redis-m-16-ef-64)
  • ✅ Multi-platform build validation

The Docker image is ready for deployment! 🚀

…ion, and performance monitoring

- Add --describe command for datasets and engines with columnar display
- Implement real-time performance summaries (QPS, P50/P95 latency)
- Add comprehensive dataset validation system with GitHub Actions
- Complete dataset metadata with vector_count and description fields
- Improve download reliability with proper HTTP headers
- Standardize precision formatting (0.01 increments up to 0.97, then 0.0025 increments above that)
- Enhanced Docker configurations for better Redis testing defaults
- Add validation documentation and automated CI/CD checks

This maintains backward compatibility while significantly improving usability,
data quality, and performance insights for vector database benchmarking.
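
One possible reading of the precision rule in the commit message above (0.01 increments up to 0.97, then 0.0025) is sketched below. The rounding direction (round half up) and the function name `round_precision` are assumptions; the commit message does not say whether values are rounded, floored, or bucketed differently.

```python
from decimal import Decimal, ROUND_HALF_UP

COARSE_STEP = Decimal("0.01")    # used for values up to 0.97
FINE_STEP = Decimal("0.0025")    # used for values above 0.97
THRESHOLD = Decimal("0.97")


def round_precision(value: float) -> float:
    """Snap a precision value to the nearest step for its range (assumed round-half-up)."""
    exact = Decimal(str(value))
    step = COARSE_STEP if exact <= THRESHOLD else FINE_STEP
    # Count whole steps, rounding halves up, then scale back.
    steps = (exact / step).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return float(steps * step)
```

`Decimal` is used here so that step boundaries such as 0.9975 are compared exactly rather than through binary floating-point approximations.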

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: 5b3a17ab6554afaf59fb53787e189b0c8b25d36f

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: dcadcfb85edeffdd765fd95fdd525cca706dac89

- Add Poetry installation to validate-datasets workflow
- Use --no-root to install dependencies without packaging the project
- Run validation script with 'poetry run' to access all dependencies
- Fixes ModuleNotFoundError for stopit and other dependencies when testing --describe functionality

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: c8efba615241691af2b64cda1398e91789aa6d5a

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: 0b3e068fbe599038ce9bdd240e43f8e9c777995b

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: 57b807a36e71ae3c7be891fc8c021aa8ad60b653

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: 6b3ace1b605195d96f06f842e885a820f5de0745

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: dfd5659c90b9d1c6215f440b5369967104a7895c

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: 7e5ae94d494589fb78f36b0b790a52cc89bb61e4

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: b8f601b66bd324b5345aa9741b32c0846a4b520d

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: 54d69f27b12de490b26da3fa8d10c5702b74c7d1

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: b3b21b3310455766a65cda6b251cae951cd38fae

🐳 Docker Build Validation

Docker build successful! Platforms, image details, and tests performed are identical to the first validation report above.

Git SHA: ac5a268ca67f653a49b0918ea4c7c7fc7303247e

@fcostaoliveira changed the title from "Add comprehensive Docker CI/CD pipeline" to "enhance benchmark with dataset discovery, validation, performance monitoring, and improved Docker support" on Jul 13, 2025
@fcostaoliveira merged commit ad7d53d into update.redisearch on Jul 13, 2025
18 checks passed