Back to Course|Claude Computer Use: Building Autonomous Desktop & Browser Agents
Lab

Analyze a Computer Use API Response

20 min
Beginner
3 Free Attempts

Instructions

Objective

In this lab, you'll write a Python function that parses Computer Use API responses and extracts key information about the agent's actions.

Background

When Claude uses the Computer Use tool, it returns structured JSON describing actions like mouse clicks, keyboard input, and screenshots. Understanding this structure is essential for building monitoring and debugging systems.

Requirements

Create a function parse_computer_use_response(response: dict) -> dict that:

  1. Extracts action type: Identify whether it's a mouse_move, left_click, type, key, screenshot, scroll, or wait action

  2. Extracts coordinates: For mouse actions, return {"x": int, "y": int}

  3. Extracts text content: For type actions, return the typed text

  4. Identifies tool version: Extract computer_20250124 or older version

  5. Returns structured output:

    {
        "action_type": str,
        "coordinates": {"x": int, "y": int} | None,
        "text": str | None,
        "tool_version": str,
        "is_screenshot_request": bool
    }
    

Example Input

response = {
    "type": "tool_use",
    "name": "computer",
    "input": {
        "action": "left_click",
        "coordinate": [512, 384]
    }
}

Example Output

{
    "action_type": "left_click",
    "coordinates": {"x": 512, "y": 384},
    "text": None,
    "tool_version": "computer_20250124",
    "is_screenshot_request": False
}

Hints

  • The coordinate field is a list [x, y], not a dict
  • Screenshot actions have action: "screenshot"
  • The type action has a text field with the content to type
  • Handle missing fields gracefully with defaults

Grading Rubric

Correctly extracts action type from response25 points
Properly parses coordinate array to x/y dict25 points
Extracts text content for type actions25 points
Handles edge cases and missing fields gracefully25 points

Your Solution

Use any programming language
3 free attempts remaining