Build a Web Data Extraction Agent
Instructions
Objective
Build a Python agent that uses Claude's Computer Use capability to navigate a website and extract structured data (like product listings) into JSON format.
Background
Unlike traditional web scraping, which parses HTML, Computer Use agents interact with websites visually. Because the agent works from rendered screenshots rather than page source, this approach also works on JavaScript-heavy sites where HTML-only scrapers often fail.
Requirements
Create a class WebDataExtractor with the following methods:
1. generate_navigation_sequence(url: str) -> list[dict]
Returns Computer Use tool calls to:
- Focus the URL bar (Ctrl+L)
- Type the URL
- Press Enter to navigate
- Wait for page load
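The four steps above can be sketched as follows, using the payloads listed under Tool Call Formats. This is a minimal sketch written as a standalone function rather than a method; the `_tool_call` helper and the `wait_duration` parameter are hypothetical names, and the default of 3000 is copied from the example wait call.

```python
def _tool_call(**tool_input) -> dict:
    """Wrap an action payload in the tool_use envelope (hypothetical helper)."""
    return {"type": "tool_use", "name": "computer", "input": tool_input}


def generate_navigation_sequence(url: str, wait_duration: int = 3000) -> list[dict]:
    """Focus the URL bar, type the URL, press Enter, then wait for the page."""
    return [
        _tool_call(action="key", text="ctrl+l"),   # focus the URL bar
        _tool_call(action="type", text=url),       # type the URL
        _tool_call(action="key", text="Return"),   # navigate
        _tool_call(action="wait", duration=wait_duration),  # wait for page load
    ]
```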
2. generate_scroll_sequence(direction: str, amount: int) -> list[dict]
Returns tool calls to scroll:
- direction: "up" or "down"
- amount: number of scroll units
Use the scroll action from computer_20250124.
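One possible shape for the scroll method, shown as a standalone function. The `coordinate` parameter and its default of (512, 384) are assumptions taken from the example scroll call, and the input validation is an optional extra that the requirements do not ask for.

```python
def generate_scroll_sequence(
    direction: str,
    amount: int,
    coordinate: tuple[int, int] = (512, 384),  # assumed default, from the example call
) -> list[dict]:
    """Return a single scroll tool call in the requested direction."""
    if direction not in ("up", "down"):
        raise ValueError(f"direction must be 'up' or 'down', got {direction!r}")
    if amount < 0:
        raise ValueError("amount must be non-negative")
    return [
        {
            "type": "tool_use",
            "name": "computer",
            "input": {
                "action": "scroll",
                "coordinate": list(coordinate),
                "direction": direction,
                "amount": amount,
            },
        }
    ]
```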
3. parse_screenshot_data(screenshot_description: str) -> list[dict]
Given a text description of what Claude sees on screen, extract structured data:
[
{
"title": "Product Name",
"price": "$99.99",
"rating": "4.5/5",
"in_stock": True
}
]
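The exercise does not specify what the screenshot description looks like, so any parser depends on an assumed input format. The sketch below assumes, purely for illustration, one product per line in the form `Title | $price | rating/5 | In stock`; lines that do not match are skipped. The `_PRODUCT_LINE` pattern is a hypothetical helper.

```python
import re

# Assumed line format (not given in the exercise):
#   "Product Name | $99.99 | 4.5/5 | In stock"
_PRODUCT_LINE = re.compile(
    r"^(?P<title>.+?)\s*\|\s*"
    r"(?P<price>\$\d+(?:\.\d{2})?)\s*\|\s*"
    r"(?P<rating>\d+(?:\.\d+)?/5)\s*\|\s*"
    r"(?P<stock>.+)$"
)


def parse_screenshot_data(screenshot_description: str) -> list[dict]:
    """Extract one product dict per line that matches the assumed format."""
    products = []
    for line in screenshot_description.splitlines():
        match = _PRODUCT_LINE.match(line.strip())
        if match is None:
            continue  # ignore headers, blank lines, and anything unrecognized
        products.append(
            {
                "title": match["title"],
                "price": match["price"],
                "rating": match["rating"],
                "in_stock": match["stock"].strip().lower() == "in stock",
            }
        )
    return products
```

For example, the input `"Widget | $99.99 | 4.5/5 | In stock"` yields a single dict with `"title": "Widget"` and `"in_stock": True`. A real description will likely be free-form text, so the pattern would need to be adapted to whatever the agent actually reports.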
4. generate_extraction_workflow(url: str, scroll_pages: int) -> list[dict]
Returns a complete workflow:
- Navigate to URL
- Take screenshot
- Scroll down
- Take screenshot
- Repeat for scroll_pages iterations
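A sketch of the complete workflow, under one reading of the steps above: navigate, take an initial screenshot, then repeat a scroll-plus-screenshot pair `scroll_pages` times. The `_call` helper is a hypothetical name, and the scroll amount of 3, the coordinate, and the wait duration are assumptions copied from the example tool calls.

```python
def _call(**tool_input) -> dict:
    """Wrap an action payload in the tool_use envelope (hypothetical helper)."""
    return {"type": "tool_use", "name": "computer", "input": tool_input}


def generate_extraction_workflow(url: str, scroll_pages: int) -> list[dict]:
    """Navigate, screenshot, then scroll and screenshot scroll_pages times."""
    steps = [
        _call(action="key", text="ctrl+l"),
        _call(action="type", text=url),
        _call(action="key", text="Return"),
        _call(action="wait", duration=3000),
        _call(action="screenshot"),  # first screenshot, taken after the wait
    ]
    for _ in range(scroll_pages):
        steps.append(
            _call(action="scroll", coordinate=[512, 384], direction="down", amount=3)
        )
        steps.append(_call(action="screenshot"))
    return steps
```

Under this reading the workflow contains `5 + 2 * scroll_pages` tool calls and `1 + scroll_pages` screenshots.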
Tool Call Formats
# URL bar focus
{"type": "tool_use", "name": "computer", "input": {"action": "key", "text": "ctrl+l"}}
# Type URL
{"type": "tool_use", "name": "computer", "input": {"action": "type", "text": "https://example.com"}}
# Press Enter
{"type": "tool_use", "name": "computer", "input": {"action": "key", "text": "Return"}}
# Wait for load
{"type": "tool_use", "name": "computer", "input": {"action": "wait", "duration": 3000}}
# Scroll down
{"type": "tool_use", "name": "computer", "input": {"action": "scroll", "coordinate": [512, 384], "direction": "down", "amount": 3}}
# Take screenshot
{"type": "tool_use", "name": "computer", "input": {"action": "screenshot"}}
Hints
- Always wait after navigation for the page to load
- Use coordinate in scroll to specify where to scroll
- Screenshots should be taken after wait actions
- Handle both Mac (cmd) and Linux (ctrl) keyboard shortcuts
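For the last hint, one way to choose the modifier key is sketched below. The function name `url_bar_shortcut` is hypothetical. It falls back to `platform.system()`, which describes the machine running the script; if the agent controls a different machine, passing the target system explicitly is the safer choice.

```python
import platform
from typing import Optional


def url_bar_shortcut(system: Optional[str] = None) -> str:
    """Return the key combination that focuses the URL bar.

    `system` uses the names reported by platform.system(), e.g. "Darwin"
    for macOS or "Linux". When omitted, the local platform is assumed.
    """
    system = system or platform.system()
    return "cmd+l" if system == "Darwin" else "ctrl+l"
```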
What to Submit
Your submission should contain 1 file section in the editor below: a Python file with the complete WebDataExtractor class.