Browser Automation with Claude

Web Navigation Fundamentals

Browser automation is one of the most powerful use cases for Computer Use. Working from screenshots of the screen, Claude can navigate websites, fill in forms, and extract data from the pages it can see.

Basic Navigation

task = """
Open Firefox and:
1. Go to news.ycombinator.com
2. Find the top story
3. Click on it to read more
4. Report the headline and first paragraph
"""

Claude handles navigation by:

  • Clicking the URL bar
  • Typing the address
  • Pressing Enter
  • Waiting for page load
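The four steps above can be sketched as a plain list of actions. This is only an illustration: `navigation_actions` is a hypothetical helper, the URL bar coordinate is a placeholder, and the action names (`left_click`, `type`, `key`, `wait`, `screenshot`) are modeled on the Computer Use tool's action vocabulary, whose exact names and fields depend on the tool version.

```python
from typing import Dict, List, Tuple


def navigation_actions(
    url: str, url_bar_xy: Tuple[int, int] = (640, 60)
) -> List[Dict[str, object]]:
    """Return an illustrative action sequence for loading `url`.

    `url_bar_xy` is a placeholder coordinate (assumption); in practice
    Claude picks the click position from the current screenshot.
    """
    return [
        {"action": "left_click", "coordinate": list(url_bar_xy)},  # click the URL bar
        {"action": "type", "text": url},                           # type the address
        {"action": "key", "text": "Return"},                       # press Enter
        {"action": "wait", "duration": 2},                         # wait for page load
        {"action": "screenshot"},                                  # check the result
    ]


for step in navigation_actions("news.ycombinator.com"):
    print(step)
```

You do not write this sequence yourself. Claude chooses each action after looking at the latest screenshot, so the list only shows roughly what a single navigation looks like from the outside.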

Page Understanding

Claude's vision capabilities enable:

| Capability | Example |
| --- | --- |
| Text reading | Headlines, articles, labels |
| Button identification | "Submit", "Login", icons |
| Form detection | Input fields, dropdowns |
| Layout understanding | Sidebars, navigation, content |

Waiting for Content

Many web pages keep loading content after the initial page appears. Tell Claude explicitly to wait for it:

task = """
Go to a weather site and:
1. Wait for the page to fully load
2. Find the current temperature
3. If there's a loading spinner, wait for it to disappear
4. Report the weather conditions
"""
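If you drive the screenshot loop yourself, one way to approximate "wait for the page to fully load" is to poll until two consecutive screenshots are identical. The sketch below assumes that approach; `wait_until_stable` and the `take_screenshot` callable are hypothetical names, not part of any library.

```python
import time
from typing import Callable


def wait_until_stable(
    take_screenshot: Callable[[], bytes],
    interval: float = 1.0,
    timeout: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until two consecutive screenshots match.

    Returns True once the screen stops changing, or False if `timeout`
    seconds of waiting pass first. `take_screenshot` is supplied by the
    caller (assumption: it returns the raw image bytes).
    """
    waited = 0.0
    previous = take_screenshot()
    while waited < timeout:
        sleep(interval)
        waited += interval
        current = take_screenshot()
        if current == previous:
            return True  # nothing changed between two polls
        previous = current
    return False  # still changing when the timeout expired
```

Byte-for-byte comparison is a deliberately simple choice. Pages with animations, carousels, or a blinking cursor may never look identical twice, so a real setup would likely compare only a region of the screen or allow a small difference.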

Handling Page Elements

task = """
On the product page:
1. Click 'Add to Cart' button
2. Wait for cart update
3. Click the cart icon
4. Verify item is in cart
"""

Scrolling

Claude can scroll to find content:

task = """
Go to the documentation page and:
1. Scroll down to the 'API Reference' section
2. Find the authentication endpoint
3. Extract the example code
"""
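The underlying pattern is "look, scroll one step, look again", with a limit so the loop cannot run forever. A minimal sketch, assuming hypothetical `find` and `scroll_down` callables that stand in for inspecting a screenshot and issuing a scroll action:

```python
from typing import Callable, Optional


def scroll_until_found(
    find: Callable[[], Optional[str]],
    scroll_down: Callable[[], None],
    max_scrolls: int = 10,
) -> Optional[str]:
    """Scroll one step at a time until `find` returns a result.

    Returns whatever `find` reports, or None if the target is still
    not visible after `max_scrolls` scroll steps.
    """
    found = find()  # check the current view before scrolling at all
    scrolls = 0
    while found is None and scrolls < max_scrolls:
        scroll_down()
        scrolls += 1
        found = find()  # re-check after each increment
    return found
```

The same loop suits infinite-scroll pages, where the `max_scrolls` cap matters most because the page never reaches a bottom.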

Tab Management

task = """
1. Open google.com in a new tab (Ctrl+T)
2. Search for 'Claude AI'
3. Open the first result in a new tab
4. Switch back to the search results tab
5. Report what you found
"""

Common Challenges

| Challenge | Solution |
| --- | --- |
| Slow loading | Add wait actions |
| Pop-ups | Click close or accept |
| Cookie banners | Dismiss or accept |
| Infinite scroll | Scroll incrementally |
| Dynamic content | Wait for stability |
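One way to apply these solutions consistently is to append them to every task as standing instructions. The sketch below assumes that approach; `build_task` and `DEFAULT_GUIDANCE` are hypothetical names, and the guidance wording is only an example.

```python
from typing import List, Sequence

# Example standing instructions (assumption: adjust the wording to your site).
DEFAULT_GUIDANCE: List[str] = [
    "If a cookie banner or pop-up appears, dismiss or accept it before continuing.",
    "If the page is still loading, wait and take another screenshot.",
    "On infinite-scroll pages, scroll in small increments and check after each one.",
]


def build_task(
    intro: str,
    steps: Sequence[str],
    guidance: Sequence[str] = DEFAULT_GUIDANCE,
) -> str:
    """Build a numbered task prompt followed by standing instructions."""
    lines = [intro]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    if guidance:
        lines += ["", "Throughout:"]
        lines += [f"- {note}" for note in guidance]
    return "\n".join(lines)


task = build_task(
    "On the product page:",
    ["Click 'Add to Cart' button", "Wait for cart update", "Verify item is in cart"],
)
print(task)
```

Keeping the guidance in one place means each task only has to list its own steps.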

Tip: Use Ctrl+L to quickly focus the URL bar in most browsers.

Next, we'll learn how to fill forms and handle authentication.