pmbstyle/gemini-computer-use

Gemini Computer Use Agent

A minimal browser automation agent using Google's Gemini 2.5 Computer Use Preview model and Playwright for web browser control.

Features

  • Visual Browser Control: Uses screenshots to "see" and interact with web pages
  • Automated Actions: Supports mouse clicks, keyboard input, scrolling, navigation, and more
  • Safety Controls: Built-in confirmation prompts for risky actions
  • Human-in-the-Loop: Optional user confirmation for sensitive operations
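
The confirmation step described above could look roughly like the sketch below. It is an illustration only, not the repository's code: the `RISKY_ACTIONS` set and the `confirm_action` helper are hypothetical names, and which actions count as risky is an assumption.

```python
from typing import Callable

# Hypothetical set of action names treated as risky (an assumption,
# not taken from the repository).
RISKY_ACTIONS = {"type_text_at", "key_combination", "drag_and_drop"}


def confirm_action(name: str, args: dict, ask: Callable[[str], str] = input) -> bool:
    """Return True if the action may run, prompting the user for risky ones."""
    if name not in RISKY_ACTIONS:
        # Low-risk actions run without asking.
        return True
    answer = ask(f"Allow {name} with {args}? [y/N] ")
    # Anything other than an explicit yes is treated as a refusal.
    return answer.strip().lower() in {"y", "yes"}
```

Passing `ask` as a parameter keeps the prompt swappable, so the gate can be exercised without a terminal.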

Supported Actions

  • open_web_browser, navigate, search
  • click_at, hover_at, type_text_at
  • key_combination, scroll_document, scroll_at
  • drag_and_drop, go_back, go_forward
  • wait_5_seconds
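
As a rough sketch of how a few of these action names might map onto Playwright's sync API, consider the dispatcher below. Several things here are assumptions rather than facts from this repository: that the model reports coordinates on a 0 to 999 grid that must be scaled to the viewport, the argument keys (`x`, `y`, `text`, `url`), the 1440x900 default viewport, and the helper names `denormalize` and `dispatch`.

```python
def denormalize(value: int, size: int, grid: int = 1000) -> int:
    """Scale a grid coordinate (assumed 0..grid-1) to pixels along one axis."""
    return int(value / grid * size)


def dispatch(page, name: str, args: dict, width: int = 1440, height: int = 900) -> None:
    """Run one named action against a Playwright sync `page` (partial sketch)."""
    if name in {"click_at", "hover_at", "type_text_at"}:
        # Coordinate-based actions: convert to viewport pixels first.
        x = denormalize(args["x"], width)
        y = denormalize(args["y"], height)
        if name == "hover_at":
            page.mouse.move(x, y)
        else:
            page.mouse.click(x, y)
            if name == "type_text_at":
                # Click to focus the target, then type the text.
                page.keyboard.type(args["text"])
    elif name == "navigate":
        page.goto(args["url"])
    elif name == "go_back":
        page.go_back()
    elif name == "go_forward":
        page.go_forward()
    elif name == "wait_5_seconds":
        page.wait_for_timeout(5000)  # milliseconds
    else:
        # The remaining actions from the list are left out of this sketch.
        raise ValueError(f"Unsupported action in this sketch: {name}")
```

Under these assumptions, `dispatch(page, "click_at", {"x": 500, "y": 250})` would click at pixel (720, 225) in a 1440x900 viewport.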

Setup

1. Create and activate environment

conda create -n gemcu python=3.11 -y
conda activate gemcu

2. Install packages

python -m pip install --upgrade pip
python -m pip install google-genai playwright termcolor

3. Install Playwright browser

playwright install chromium

4. Set API key

# Windows PowerShell
$env:GEMINI_API_KEY="PASTE_YOUR_KEY_HERE"

# Linux/Mac
export GEMINI_API_KEY="PASTE_YOUR_KEY_HERE"
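
To check that the key is visible to Python before starting the agent, a small helper along these lines may be useful. The `load_api_key` function is hypothetical and not part of the repository. It only reads the `GEMINI_API_KEY` variable set above and rejects the placeholder value.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the Gemini API key from the environment or raise a clear error."""
    key = env.get("GEMINI_API_KEY", "").strip()
    # Treat an unset variable and the README placeholder as missing.
    if not key or key == "PASTE_YOUR_KEY_HERE":
        raise RuntimeError("GEMINI_API_KEY is not set; export it before running the agent.")
    return key
```

Failing early this way gives a readable message instead of an authentication error partway through a run.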

Usage

python agent.py "Find the Wikipedia article about Niagara Falls and open the History section"
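
The command above passes the task as a single quoted argument. A script could read it with the standard `argparse` module, as in the sketch below. How `agent.py` actually parses its arguments is not shown here, so `parse_task` and the `task` argument name are hypothetical.

```python
import argparse


def parse_task(argv=None) -> str:
    """Parse the command line and return the natural-language task string."""
    parser = argparse.ArgumentParser(
        prog="agent.py",
        description="Run a browser task described in plain language.",
    )
    # One positional argument; quote it in the shell so it arrives as one string.
    parser.add_argument("task", help="what the agent should do in the browser")
    return parser.parse_args(argv).task
```

With `argv=None`, `argparse` falls back to `sys.argv[1:]`, so the same function serves both the command line and direct calls.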

Requirements

  • Python 3.11+
  • Gemini API key (available from Google AI Studio)
  • Chrome/Chromium browser

Safety

This agent runs in a controlled browser environment. For production use, consider running in a sandboxed virtual machine or container for additional security.

Based on Google's Gemini Computer Use API.
