Gemini Computer Use Agent

A minimal browser automation agent using Google's Gemini 2.5 Computer Use Preview model and Playwright for web browser control.

Features

Visual Browser Control: Uses screenshots to "see" and interact with web pages
Automated Actions: Supports mouse clicks, keyboard input, scrolling, navigation, and more
Safety Controls: Built-in confirmation prompts for risky actions
Human-in-the-Loop: Optional user confirmation for sensitive operations
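The confirmation step mentioned above could look roughly like the sketch below. It is an illustration, not the code in agent.py: the function name `confirm_action` and the injectable `ask` parameter are hypothetical and exist only to make the example easy to test.

```python
from typing import Callable


def confirm_action(description: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only when the user explicitly approves the action.

    `ask` defaults to the built-in input(); it is a parameter (an assumption
    of this sketch) so the prompt can be replaced in tests.
    """
    answer = ask(f"The agent wants to: {description}. Proceed? [y/N] ")
    # Anything other than an explicit yes counts as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" on empty or unrecognized input keeps a stray Enter key from approving a risky action.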

Supported Actions

open_web_browser, navigate, search
click_at, hover_at, type_text_at
key_combination, scroll_document, scroll_at
drag_and_drop, go_back, go_forward
wait_5_seconds
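As a rough sketch of how a few of these actions might map onto Playwright's sync API, assuming the model reports coordinates on a 0-999 grid that must be scaled to the viewport. The argument keys (`x`, `y`, `text`, `url`) and both function names are assumptions for illustration and may not match agent.py.

```python
from typing import Any, Dict


def denormalize(value: int, size: int) -> int:
    """Scale a coordinate from an assumed 0-999 grid to a pixel offset."""
    return value * size // 1000


def execute_action(page: Any, name: str, args: Dict[str, Any],
                   width: int, height: int) -> None:
    """Run one model-requested action against a Playwright-style `page`."""
    if name == "navigate":
        page.goto(args["url"])
    elif name == "click_at":
        page.mouse.click(denormalize(args["x"], width),
                         denormalize(args["y"], height))
    elif name == "hover_at":
        page.mouse.move(denormalize(args["x"], width),
                        denormalize(args["y"], height))
    elif name == "type_text_at":
        # Focus the target by clicking it, then type the text.
        page.mouse.click(denormalize(args["x"], width),
                         denormalize(args["y"], height))
        page.keyboard.type(args["text"])
    elif name == "go_back":
        page.go_back()
    elif name == "go_forward":
        page.go_forward()
    elif name == "wait_5_seconds":
        page.wait_for_timeout(5000)  # milliseconds
    else:
        # Actions not covered by this sketch are rejected rather than ignored.
        raise ValueError(f"Unsupported action: {name}")
```

Only a subset of the listed actions is handled here; the remaining ones would follow the same pattern.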

Setup

1. Create and activate environment

conda create -n gemcu python=3.11 -y
conda activate gemcu

2. Install packages

python -m pip install --upgrade pip
python -m pip install google-genai playwright termcolor

3. Install Playwright browser

playwright install chromium

4. Set API key

# Windows PowerShell
$env:GEMINI_API_KEY="PASTE_YOUR_KEY_HERE"

# Linux/Mac
export GEMINI_API_KEY="PASTE_YOUR_KEY_HERE"
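To fail early when the key is missing, a script could check the variable before starting the browser. This is a minimal sketch; `load_api_key` is a hypothetical helper, not necessarily part of agent.py.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Gemini API key, or raise if it is unset or still a placeholder."""
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key or key == "PASTE_YOUR_KEY_HERE":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; export it as shown in step 4."
        )
    return key
```

The returned value can then be passed to the google-genai client when it is created.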

Usage

python agent.py "Find the Wikipedia article about Niagara Falls and open the History section"
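The task is passed as a single quoted argument. One way a script could read it is sketched below with argparse; the `parse_args` helper and the argument name `task` are assumptions, and agent.py may parse its input differently.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse the command line; the only positional argument is the task text."""
    parser = argparse.ArgumentParser(
        prog="agent.py",
        description="Run a browser task described in natural language.",
    )
    parser.add_argument("task", help="what the agent should do, in quotes")
    return parser.parse_args(argv)
```

Quoting matters: without the quotes, the shell splits the sentence into several arguments and argparse rejects the extras.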

Requirements

Python 3.11+
Gemini API key (available from Google AI Studio)
Chromium browser (installed by Playwright in step 3)

Safety

This agent runs in a controlled browser environment. For production use, consider running in a sandboxed virtual machine or container for additional security.

Based on Google's Gemini Computer Use API.
