tkattkat
Collaborator

@tkattkat tkattkat commented Oct 7, 2025

why

Currently, the Anthropic CUA client does not have a goto tool, which limits its ability to navigate the web.

what changed

Added a goto tool to the AnthropicCUAClient.

test plan

Tested locally and on Browserbase.


changeset-bot bot commented Oct 7, 2025

🦋 Changeset detected

Latest commit: 044763d

The changes in this PR will be included in the next version bump.


Contributor

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Summary

This pull request adds a new "goto" tool to the AnthropicCUAClient to enable URL navigation capabilities. The implementation follows the established pattern for tool integration within the client, adding three key components:
  1. Tool Definition: A new goto tool is added to the API request configuration with proper JSON schema validation, requiring a URL parameter as a string input.

  2. Tool Execution Handling: The takeAction method now includes logic to process goto tool requests, providing basic success/error messaging when the tool is invoked.

  3. Action Conversion: The convertToolUseToAction method can now convert goto tool use items into AgentAction objects with the appropriate function name and URL arguments.

This change addresses a fundamental limitation where the Anthropic CUA client lacked basic web navigation capabilities, which is essential for web automation tasks that need to move between different URLs during execution. The implementation integrates seamlessly with the existing tool architecture and follows the same patterns used by other built-in tools in the client.
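
The tool definition and action conversion described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the `ToolUseItem` and `AgentAction` shapes, the `convertGotoToolUse` helper name, the description strings, and the `type: "function"` field are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the real types in
// lib/agent/AnthropicCUAClient.ts may differ.
interface ToolUseItem {
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface AgentAction {
  type: string;
  [key: string]: unknown;
}

// Tool definition: a JSON schema that requires a single string `url`.
const gotoTool = {
  name: "goto",
  description: "Navigate the browser to the given URL", // assumed wording
  input_schema: {
    type: "object",
    properties: {
      url: { type: "string", description: "The URL to navigate to" },
    },
    required: ["url"],
  },
};

// Action conversion: map a goto tool_use item to an AgentAction that
// carries the function name and the URL argument.
function convertGotoToolUse(item: ToolUseItem): AgentAction | null {
  if (item.name !== "goto") return null;
  const url = item.input.url;
  // Reject items whose input does not match the schema.
  if (typeof url !== "string" || url.length === 0) return null;
  return { type: "function", name: "goto", arguments: { url } };
}
```

Returning `null` for a malformed input keeps the caller from emitting an action with an undefined URL; whether the real client throws or skips in that case is not stated in this conversation.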

Important Files Changed

Changed Files
| Filename | Score | Overview |
| --- | --- | --- |
| lib/agent/AnthropicCUAClient.ts | 2/5 | Added goto tool definition, execution handling, and action conversion but missing actual navigation implementation |

Confidence score: 2/5

  • This PR has a critical implementation gap that prevents it from functioning as intended
  • Score reflects incomplete functionality where the goto tool is defined but doesn't perform actual navigation
  • Pay close attention to lib/agent/AnthropicCUAClient.ts as it needs the missing action handler call to complete the navigation functionality

Sequence Diagram

sequenceDiagram
    participant User
    participant AnthropicCUAClient
    participant AnthropicAPI
    participant ScreenshotProvider
    participant ActionHandler

    User->>AnthropicCUAClient: "execute(executionOptions)"
    AnthropicCUAClient->>AnthropicCUAClient: "createInitialInputItems(instruction)"
    
    loop "while !completed && currentStep < maxSteps"
        AnthropicCUAClient->>AnthropicCUAClient: "executeStep(inputItems, logger)"
        AnthropicCUAClient->>AnthropicCUAClient: "getAction(inputItems)"
        AnthropicCUAClient->>AnthropicAPI: "beta.messages.create(requestParams with goto tool)"
        AnthropicAPI-->>AnthropicCUAClient: "response with content blocks"
        
        alt "tool_use block found"
            AnthropicCUAClient->>AnthropicCUAClient: "convertToolUseToAction(toolUseItem)"
            
            alt "goto tool"
                AnthropicCUAClient->>AnthropicCUAClient: "return goto action"
            else "computer tool"
                AnthropicCUAClient->>AnthropicCUAClient: "return computer action"
            else "custom tool"
                AnthropicCUAClient->>AnthropicCUAClient: "return custom_tool action"
            end
            
            opt "actionHandler exists"
                AnthropicCUAClient->>ActionHandler: "actionHandler(action)"
                ActionHandler-->>AnthropicCUAClient: "action executed"
            end
            
            AnthropicCUAClient->>AnthropicCUAClient: "takeAction(toolUseItems, logger)"
            
            alt "computer tool"
                AnthropicCUAClient->>ScreenshotProvider: "captureScreenshot()"
                ScreenshotProvider-->>AnthropicCUAClient: "base64 screenshot"
            else "goto tool"
                AnthropicCUAClient->>AnthropicCUAClient: "create success message"
            else "custom tool"
                AnthropicCUAClient->>AnthropicCUAClient: "execute custom tool"
            end
        else "text block found"
            AnthropicCUAClient->>AnthropicCUAClient: "extract message text"
        end
        
        AnthropicCUAClient->>AnthropicCUAClient: "update nextInputItems with assistant response"
    end
    
    AnthropicCUAClient-->>User: "AgentResult with success, actions, message, completed"

1 file reviewed, 1 comment


Comment on lines +604 to +610
if (item.name === "goto") {
  try {
    const url = item.input.url;
    toolResult = `Successfully navigated to ${url}`;
  } catch (error) {
    toolResult = `Error with goto: ${error instanceof Error ? error.message : String(error)}`;
  }

logic: The goto tool claims success but doesn't actually perform navigation. It only extracts the URL and returns a success message without calling any navigation logic.
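
One way to address this is to report success only after a real navigation call resolves. The sketch below is an illustration under assumptions, not the fix adopted in the PR: `runGotoTool` and the injected `Navigate` callback are hypothetical names, and the callback stands in for whatever actually drives the browser.

```typescript
// Hypothetical callback type: whatever actually drives the browser,
// for example a wrapper around the page's navigation method.
type Navigate = (url: string) => Promise<void>;

// Build the tool result for a goto request. Success is reported only
// after the injected navigation resolves; failures become error text.
async function runGotoTool(
  input: Record<string, unknown>,
  navigate: Navigate,
): Promise<string> {
  const url = input.url;
  if (typeof url !== "string" || url.length === 0) {
    return "Error with goto: missing or invalid url";
  }
  try {
    await navigate(url);
    return `Successfully navigated to ${url}`;
  } catch (error) {
    return `Error with goto: ${error instanceof Error ? error.message : String(error)}`;
  }
}
```

With this shape the `try`/`catch` has something that can actually fail, so the error branch becomes reachable, and a stub `navigate` can be passed in tests.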


@tkattkat tkattkat marked this pull request as draft October 7, 2025 21:17
@tkattkat tkattkat marked this pull request as ready for review October 7, 2025 21:43
Contributor

@greptile-apps greptile-apps bot left a comment


Greptile Overview

Summary

This PR adds a `goto` tool to the `AnthropicCUAClient` class to enable web navigation capabilities for the Anthropic Computer Use Agent (CUA). The implementation follows the established pattern in the codebase by adding the tool definition to the tools array, handling execution in the `takeAction` method, and converting it to an `AgentAction` in the `convertToolUseToAction` method.

The change addresses a stated limitation where Anthropic's CUA lacks navigation functionality, which is essential for web automation tasks. The goto tool is defined with a simple schema requiring only a URL parameter and integrates into the existing tool execution pipeline alongside the computer tool and custom tools.

However, the current implementation is fundamentally incomplete - it only extracts the URL parameter and returns a success message without actually performing any navigation. This means the tool will report successful navigation while the browser remains on the current page, creating a mismatch between the agent's understanding and the actual browser state.
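
To keep the agent's view aligned with the browser, the tool result could be derived from the page's actual URL rather than from the requested one. The helper below is a hypothetical sketch (the name `describeGotoResult` and the message wording are assumptions); it relies only on the standard `URL` class to normalize both values before comparing them.

```typescript
// Hypothetical helper: build the goto tool result from the URL the
// browser actually reports, instead of echoing the requested URL.
function describeGotoResult(requestedUrl: string, actualUrl: string): string {
  let requested: URL;
  let actual: URL;
  try {
    // Normalize both values so "https://example.com" and
    // "https://example.com/" compare as equal.
    requested = new URL(requestedUrl);
    actual = new URL(actualUrl);
  } catch {
    return `Error with goto: could not parse URL (requested ${requestedUrl}, current ${actualUrl})`;
  }
  if (requested.href === actual.href) {
    return `Successfully navigated to ${actual.href}`;
  }
  return `Navigation requested ${requested.href} but the browser is at ${actual.href}`;
}
```

A strict comparison like this treats redirects as mismatches; the message still tells the model where the browser ended up, which is the information it needs to plan its next step.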

Important Files Changed

Changed Files
| Filename | Score | Overview |
| --- | --- | --- |
| lib/agent/AnthropicCUAClient.ts | 1/5 | Added goto tool definition and handling, but implementation only simulates navigation without performing actual browser navigation |

Confidence score: 1/5

  • This PR is not safe to merge due to a critical logic flaw that will cause agent confusion
  • Score reflects incomplete implementation that claims success while performing no actual navigation
  • Pay close attention to the goto tool execution logic in AnthropicCUAClient.ts which needs actual navigation implementation

Sequence Diagram

sequenceDiagram
    participant User
    participant AnthropicCUAClient
    participant Anthropic_API
    participant ActionHandler
    participant ScreenshotProvider

    User->>AnthropicCUAClient: "execute(instruction)"
    AnthropicCUAClient->>AnthropicCUAClient: "createInitialInputItems(instruction)"
    
    loop "Until completed or maxSteps reached"
        AnthropicCUAClient->>AnthropicCUAClient: "executeStep(inputItems)"
        AnthropicCUAClient->>Anthropic_API: "client.beta.messages.create(requestParams)"
        Note over AnthropicCUAClient,Anthropic_API: "Includes computer tool and new goto tool"
        Anthropic_API-->>AnthropicCUAClient: "response with content blocks"
        
        AnthropicCUAClient->>AnthropicCUAClient: "convertToolUseToAction(toolUseItem)"
        
        opt "If actionHandler exists and actions found"
            loop "For each action"
                AnthropicCUAClient->>ActionHandler: "actionHandler(action)"
                ActionHandler-->>AnthropicCUAClient: "action completed"
            end
        end
        
        alt "Tool is computer"
            AnthropicCUAClient->>ScreenshotProvider: "captureScreenshot()"
            ScreenshotProvider-->>AnthropicCUAClient: "base64 screenshot"
        else "Tool is goto"
            Note over AnthropicCUAClient: "Process goto URL navigation"
        else "Tool is custom tool"
            AnthropicCUAClient->>AnthropicCUAClient: "tools[toolName].execute(input)"
        end
        
        AnthropicCUAClient->>AnthropicCUAClient: "takeAction(toolUseItems)"
        AnthropicCUAClient->>AnthropicCUAClient: "Update nextInputItems for conversation"
    end
    
    AnthropicCUAClient-->>User: "AgentResult with success, actions, message, completed"

1 file reviewed, no comments

