
Commit 852ce42

Augusto 'Lex' Ochoa Ughini authored and committed
docs(video-analyzer): add Video Analyzer application example
1 parent eee21a7 commit 852ce42

1 file changed: 88 additions & 0 deletions
# Video Analyzer – Unlock Insights from Your Videos

> **Status:** Draft – initial contribution by @ochoaughini
> This page introduces a reference application that demonstrates how to use the Gemini API to transcribe, summarise and search video content.

## What the app does

* Drag-and-drop a local video (mp4 / mov / webm).
* Automatically extracts audio → **speech-to-text** using Gemini audio transcription.
* Captures still frames every *N* seconds and sends them to the Gemini **multimodal** endpoint for scene description.
* Generates:
  * An SRT subtitle file.
  * A bullet-point **summary** (topics, key moments).
  * An embeddings index allowing **semantic search** over the transcript.
## Quick start

```bash
pip install google-generativeai moviepy pillow ffmpeg-python
python video_analyzer.py --input my_video.mp4 --model gemini-pro-vision
```

The script will write:

* `my_video.srt` – subtitles
* `my_video.summary.txt` – text summary
* `my_video.index.json` – embedding index for search
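
The core snippets below cover transcription, frame description and summarisation; how the three output files might be written is sketched here. This is a minimal sketch, assuming a hypothetical `segments` list of `(start_sec, end_sec, text)` tuples from the transcription step, plus the `summary`, `lines` and `embeddings` produced later in snippet 3.

```python
import json

def to_srt_time(seconds: float) -> str:
    # SRT timestamps use the form HH:MM:SS,mmm
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def write_outputs(stem, segments, summary, lines, embeddings):
    # segments: hypothetical list of (start_sec, end_sec, text) tuples
    with open(f"{stem}.srt", "w", encoding="utf-8") as srt:
        for i, (start, end, text) in enumerate(segments, 1):
            srt.write(f"{i}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{text}\n\n")
    with open(f"{stem}.summary.txt", "w", encoding="utf-8") as f:
        f.write(summary)
    # One JSON record per transcript line, paired with its embedding vector.
    with open(f"{stem}.index.json", "w", encoding="utf-8") as f:
        json.dump([{"text": t, "embedding": e} for t, e in zip(lines, embeddings)], f)

# write_outputs("my_video", segments, summary, lines, embeddings)
```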
## Core code snippets

### 1 · Extract audio and transcribe
```python
from pathlib import Path

import google.generativeai as genai
from moviepy.editor import VideoFileClip

genai.configure(api_key="YOUR_API_KEY")
video_path = "my_video.mp4"
audio_path = "tmp_audio.mp3"

# Extract the audio track, then send it as an inline part (mime type + bytes).
VideoFileClip(video_path).audio.write_audiofile(audio_path, logger=None)

model = genai.GenerativeModel("gemini-pro")  # audio input needs an audio-capable model
transcription = model.generate_content([
    "Transcribe this audio.",
    {"mime_type": "audio/mpeg", "data": Path(audio_path).read_bytes()},
])
```
### 2 · Describe video frames
```python
from PIL import Image

def sample_frames(video_path, every_sec=5):
    """Yield one saved PNG per sampled frame, every `every_sec` seconds."""
    clip = VideoFileClip(video_path)
    for t in range(0, int(clip.duration), every_sec):
        frame = clip.get_frame(t)  # numpy array of shape (height, width, 3)
        img = Image.fromarray(frame)
        fname = f"frame_{t:04}.png"
        img.save(fname)
        yield fname

vision_model = genai.GenerativeModel("gemini-pro-vision")

scene_descriptions = []
for frame_file in sample_frames(video_path):
    # PIL images can be passed directly as content parts next to the text prompt.
    desc = vision_model.generate_content(["Describe this scene.", Image.open(frame_file)])
    scene_descriptions.append(desc.text)
```
### 3 · Summarise and index
```python
summary = model.generate_content(
    "Summarise this transcript:\n" + transcription.text
).text

# Embeddings come from the dedicated embedding model, one vector per transcript line.
lines = [l for l in transcription.text.split("\n") if l.strip()]
embeddings = genai.embed_content(model="models/embedding-001", content=lines)["embedding"]
```
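
The embedding index is what powers the semantic search mentioned above. The snippet below is a minimal sketch of the query side, assuming the `lines` and `embeddings` from the previous block and using NumPy (an extra dependency) for a plain cosine-similarity ranking; the query is embedded with the same `models/embedding-001` model.

```python
import numpy as np

def search(query, lines, embeddings, top_k=3):
    # Embed the query with the same model used for the transcript lines.
    q = np.array(genai.embed_content(model="models/embedding-001", content=query)["embedding"])
    matrix = np.array(embeddings)
    # Cosine similarity between the query vector and every transcript line.
    scores = matrix @ q / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(q))
    best = np.argsort(scores)[::-1][:top_k]
    return [(lines[i], float(scores[i])) for i in best]

print(search("moments where pricing is discussed", lines, embeddings))
```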
## Folder layout

```
video-analyzer/
├── video_analyzer.py   # main script
├── templates/          # optional web UI
└── README.md           # setup & usage docs
```
## Next steps

* Add a Streamlit front-end.
* Integrate **Gemini function-calling** for automatic action extraction (see the sketch below).
* Accept YouTube URLs (download + analyse).
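
As a starting point for the function-calling item, here is a minimal sketch of what automatic action extraction could look like. The `record_action` tool and its fields are hypothetical, and automatic function calling assumes a recent `google-generativeai` release.

```python
def record_action(description: str, owner: str, due_date: str) -> dict:
    """Hypothetical tool: store one action item mentioned in the transcript."""
    action = {"description": description, "owner": owner, "due_date": due_date}
    print("Action item:", action)
    return action

# Python callables passed as tools are exposed to the model as function declarations.
action_model = genai.GenerativeModel("gemini-pro", tools=[record_action])
chat = action_model.start_chat(enable_automatic_function_calling=True)
chat.send_message("Extract the action items from this transcript:\n" + transcription.text)
```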
---
**Contributing** – please feel free to open issues or PRs to improve this example.
