# Reverse Image Dataset Generator

Create high-quality training datasets for image generation models by automatically generating optimized prompts from your images using advanced GPT models.
## Features

- **AI-Powered Analysis**: Automatically generate optimized prompts using GPT models
- **Context-Aware**: Include custom context to guide prompt generation
- **Batch Processing**: Handle multiple images simultaneously
- **Smart Tagging**: Auto-generate relevant tags for better organization
- **Standard Format**: Export datasets in JSONL format, ready for fine-tuning
- **Live Preview**: Real-time visualization of generated prompts and tags
- **Progress Tracking**: Monitor batch processing with detailed progress updates
## Prerequisites

- Node.js (v18 or later)
- npm (v8 or later)
- OpenAI API key
- PostgreSQL database (setup guide)
## Quick Start

1. Clone the repository and install dependencies:

   ```bash
   git clone https://github.com/yourusername/reverse-image-dataset-generator.git
   cd reverse-image-dataset-generator
   npm install
   ```

2. Set up your environment:

   ```bash
   cp .env.example .env
   ```

3. Add your OpenAI API key to `.env`:

   ```bash
   OPENAI_API_KEY=your_api_key_here
   ```

4. Start the development server:

   ```bash
   npm run dev
   ```

Visit http://localhost:5050 to start using the application.
## Use Cases

- Create custom training datasets for models like OmniGen
- Generate high-quality prompts from existing image collections
- Maintain consistency in prompt style across datasets
- Convert image libraries into structured training data
- Add context-aware descriptions to image collections
- Generate semantic tags for better organization
- Generate optimized prompts for existing images
- Create consistent image-prompt pairs
- Build reference libraries for prompt engineering
## How It Works

1. **Upload Images**
   - Drag & drop or select multiple images
   - Add optional context for better descriptions
   - Preview selected images instantly

2. **AI Processing**
   - Images analyzed by GPT models
   - Context-aware prompt generation
   - Automatic tag extraction
   - Real-time progress tracking

3. **Dataset Creation**
   - JSONL file generation with prompts
   - Original images included
   - ZIP archive for easy download
   - Standard format for fine-tuning
Example output format:

```json
{
  "task_type": "text_to_image",
  "instruction": "a serene landscape with snow-capped mountains reflecting in a crystal-clear lake at sunset",
  "input_images": [],
  "output_image": "mountain_lake.jpg"
}
```
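Each line of the exported JSONL file is one such JSON object. As a minimal sketch of how entries in this format could be serialized (the `DatasetEntry` type and `toJsonl` helper are illustrative, not taken from the project's source):

```typescript
// Sketch: build OmniGen-style JSONL lines from prompts and image filenames.
// The entry shape mirrors the example above.
interface DatasetEntry {
  task_type: "text_to_image";
  instruction: string;
  input_images: string[];
  output_image: string;
}

function toJsonl(entries: DatasetEntry[]): string {
  // One JSON object per line -- the standard JSONL convention.
  return entries.map((e) => JSON.stringify(e)).join("\n");
}

const lines = toJsonl([
  {
    task_type: "text_to_image",
    instruction: "a serene landscape with snow-capped mountains",
    input_images: [],
    output_image: "mountain_lake.jpg",
  },
]);
console.log(lines);
```

Because every line is independently parseable, entries can be streamed or appended one image at a time during batch processing.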
## Tech Stack

- **Frontend**: React + TypeScript + Shadcn UI
- **Backend**: Express.js + Multer
- **Database**: PostgreSQL (configuration guide)
- **AI**: OpenAI GPT-4o series models
- **Storage**: File system with organized structure
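As a sketch of what "organized structure" on the file system could look like, the helper below maps a dataset ID and an uploaded filename to a per-dataset images directory. The `datasets/<id>/images/` layout and the `datasetPath` helper are assumptions for illustration, not the project's actual layout:

```typescript
import * as path from "path";

// Sketch: resolve where an uploaded image would be stored for a given dataset.
// Layout assumed here: datasets/<datasetId>/images/<filename>
function datasetPath(datasetId: string, filename: string): string {
  // basename() strips any directory components, so a user-supplied name like
  // "../../etc/passwd" cannot escape the dataset directory.
  const safe = path.basename(filename);
  return path.posix.join("datasets", datasetId, "images", safe);
}

console.log(datasetPath("abc123", "mountain_lake.jpg"));
// → datasets/abc123/images/mountain_lake.jpg
```

Keeping each dataset under its own directory makes the final ZIP export a straightforward archive of one folder plus its JSONL file.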
## Supported Models

- `gpt-4o-mini`: Fast, efficient processing
- `gpt-4o`: Balanced performance
- `gpt-4o-2024-11-20`: Latest model with enhanced capabilities
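These models accept image input through OpenAI's chat-completions API. A hedged sketch of how a vision request payload for prompt generation could be assembled; the prompt wording, the `buildVisionRequest` helper, and the default model choice are illustrative assumptions, not the app's actual request:

```typescript
// Sketch: assemble a chat-completions payload that asks a GPT-4o model to
// describe an image as a text-to-image prompt, with optional user context.
type ChatMessage = {
  role: "user";
  content: (
    | { type: "text"; text: string }
    | { type: "image_url"; image_url: { url: string } }
  )[];
};

function buildVisionRequest(imageDataUrl: string, context?: string) {
  const instruction =
    "Describe this image as a concise text-to-image prompt." +
    (context ? ` Additional context: ${context}` : "");
  const messages: ChatMessage[] = [
    {
      role: "user",
      content: [
        { type: "text", text: instruction },
        { type: "image_url", image_url: { url: imageDataUrl } },
      ],
    },
  ];
  return { model: "gpt-4o-mini", messages };
}

const req = buildVisionRequest("data:image/png;base64,...", "fantasy style");
console.log(req.model);
```

Images are typically sent as base64 data URLs in the `image_url` field, which is how a server could forward Multer-uploaded files without hosting them publicly.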
## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Roadmap

- Add user authentication
- Implement dataset management UI
- Add advanced image processing options
- Optimize batch processing
- Add dataset search and filtering
## Ports

The application uses the following default ports:

- `5050`: Main application server
- `5173`: Vite development server (development only)
- `5432`: PostgreSQL database

To avoid port conflicts, you can customize these ports in your environment files:

```bash
PORT=5050       # Main application port
VITE_PORT=5173  # Vite dev server (development only)
DB_PORT=5432    # PostgreSQL port
```
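A server reading these variables should fall back to the documented defaults when they are unset or malformed. A small illustrative helper (not from the project's code) might look like:

```typescript
// Sketch: read a port from the environment, falling back to a default when
// the variable is missing or not a valid TCP port number.
function portFromEnv(name: string, fallback: number): number {
  const raw = process.env[name];
  const parsed = raw ? Number.parseInt(raw, 10) : NaN;
  return Number.isInteger(parsed) && parsed > 0 && parsed < 65536
    ? parsed
    : fallback;
}

const appPort = portFromEnv("PORT", 5050);
console.log(appPort);
```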
## Docker Development Setup

1. Copy the environment file:

   ```bash
   cp .env.development.example .env.development
   ```

2. Adjust ports if needed in `.env.development`

3. Start the development environment:

   ```bash
   docker compose -f docker-compose.dev.yml --env-file .env.development up --build
   ```

The development environment includes:

- Hot reloading for both frontend and backend
- PostgreSQL database with automatic schema initialization
- Vite dev server for frontend development
## Docker Production Setup

1. Copy the environment file:

   ```bash
   cp .env.production.example .env.production
   ```

2. Update the production environment variables:
   - Set a secure PostgreSQL password
   - Configure the OpenAI API key
   - Adjust ports and other settings as needed

3. Start the production environment:

   ```bash
   docker compose -f docker-compose.prod.yml --env-file .env.production up -d --build
   ```

The production environment includes:

- Optimized production build
- PostgreSQL database with persistent storage
- Automatic container restart
- Health checks for all services
## Database Management

The database is automatically initialized with the schema and migrations when the container starts. To manage the database:

- Generate new migrations:

  ```bash
  docker compose -f docker-compose.dev.yml --env-file .env.development exec app npm run db:generate
  ```

- Apply migrations:

  ```bash
  docker compose -f docker-compose.dev.yml --env-file .env.development exec app npm run db:push
  ```

- Access the database:

  ```bash
  docker compose -f docker-compose.dev.yml --env-file .env.development exec db psql -U postgres
  ```
## Environment Variables

See `.env.development.example` and `.env.production.example` for the required environment variables.
## Docker Volumes

- Development: `pgdata_dev` (PostgreSQL data)
- Production: `pgdata_prod` (PostgreSQL data)
## Exposed Ports

- Server: `5050`
- Vite dev server: `5173` (development only)
- PostgreSQL: `5432`