---
name: browser-tools
description: Headless browser automation via Playwright. Use when you need to navigate pages, inspect DOM state, extract content, or capture screenshots on a headless machine.
---

# Browser Tools

Headless browser tools for agent-assisted web automation. On this machine they run with a persistent Playwright Chromium profile and no display.

## Setup

Run once before first use:

```bash
cd {baseDir}/browser-tools
npm install
```

## Initialize Browser Profile

```bash
{baseDir}/browser-start.js              # Fresh profile
{baseDir}/browser-start.js --profile    # Copy user's profile (cookies, logins)
```

Prepare the persistent headless browser profile. Use `--profile` to sync an existing local Chrome/Chromium profile when one is available.

## Navigate

```bash
{baseDir}/browser-nav.js https://example.com
{baseDir}/browser-nav.js https://example.com --new
```

Navigate to a URL. Use the `--new` flag to open it in a new tab instead of reusing the current tab.

## Evaluate JavaScript

```bash
{baseDir}/browser-eval.js 'document.title'
{baseDir}/browser-eval.js 'document.querySelectorAll("a").length'
```

Execute JavaScript in the active tab. The code runs in an async context. Use this to extract data, inspect page state, or perform DOM operations programmatically.

## Screenshot

```bash
{baseDir}/browser-screenshot.js
```

Capture the current viewport and return the path to a temporary file. Use this to visually inspect page state or verify UI changes.

## Cookies

```bash
{baseDir}/browser-cookies.js
```

Display all cookies for the current tab, including the domain, path, httpOnly, and secure flags. Use this to debug authentication issues or inspect session state.

## Extract Page Content

```bash
{baseDir}/browser-content.js https://example.com
```

Navigate to a URL and extract its readable content as markdown. Uses Mozilla Readability for article extraction and Turndown for HTML-to-markdown conversion. Works on pages that render content with JavaScript (it waits for the page to load).
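
Returning to the Cookies tool: when a login does not stick, it can help to triage the cookie list rather than read it by eye. The sketch below is one possible way to do that. `auditCookies` is a hypothetical helper, not part of browser-tools, and it assumes cookie objects in the Playwright shape (`name`, `domain`, `expires` in Unix seconds with `-1` for session cookies, `httpOnly`, `secure`). Whether `browser-cookies.js` prints exactly those fields is an assumption.

```javascript
// Hypothetical helper: list cookies with properties that often explain
// authentication problems. Input is assumed to follow the Playwright
// cookie shape; adapt the field names if the tool's output differs.
function auditCookies(cookies, nowSeconds = Date.now() / 1000) {
  return cookies
    .map(c => {
      const issues = [];
      // Playwright reports session cookies with expires === -1.
      if (c.expires !== -1 && c.expires < nowSeconds) issues.push('expired');
      if (!c.secure) issues.push('not secure');
      if (!c.httpOnly) issues.push('readable from JavaScript');
      return { name: c.name, domain: c.domain, issues };
    })
    // Keep only cookies that have at least one flagged issue.
    .filter(c => c.issues.length > 0);
}
```

A cookie that is flagged is not necessarily wrong (many preference cookies are intentionally readable from scripts), so treat the result as a shortlist to inspect.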
## When to Use

- Testing frontend code in a headless browser
- Interacting with pages that require JavaScript
- Debugging authentication or session issues
- Scraping dynamic content that requires JS execution
- Capturing DOM state and screenshots on a server with no display

---

## Efficiency Guide

### DOM Inspection Over Screenshots

**Don't** rely on screenshots to read page state. **Do** parse the DOM directly:

```javascript
// Get page structure (first 5000 characters of markup)
document.body.innerHTML.slice(0, 5000)

// Find interactive elements
Array.from(document.querySelectorAll('button, input, [role="button"]')).map(e => ({
  id: e.id,
  text: e.textContent.trim(),
  class: e.className
}))
```

### Complex Scripts in Single Calls

Wrap multi-statement code in an IIFE so that it runs as a single expression:

```javascript
(function() {
  // Multiple operations; ?. avoids a throw when an element is missing
  const data = document.querySelector('#target')?.textContent;
  const buttons = document.querySelectorAll('button');

  // Interactions
  buttons[0]?.click();

  // Return results
  return JSON.stringify({ data, buttonCount: buttons.length });
})()
```

### Batch Interactions

**Don't** make separate calls for each click.
**Do** batch them:

```javascript
(function() {
  const actions = ["btn1", "btn2", "btn3"];
  // ?. skips ids that are not on the page instead of throwing
  actions.forEach(id => document.getElementById(id)?.click());
  return "Done";
})()
```

### Typing/Input Sequences

```javascript
(function() {
  const text = "HELLO";
  // Click the on-screen key for each character, then submit
  for (const char of text) {
    document.getElementById("key-" + char).click();
  }
  document.getElementById("submit").click();
  return "Submitted: " + text;
})()
```

### Reading App/Game State

Extract structured state in one call:

```javascript
(function() {
  const state = {
    score: document.querySelector('.score')?.textContent,
    status: document.querySelector('.status')?.className,
    items: Array.from(document.querySelectorAll('.item')).map(el => ({
      text: el.textContent,
      active: el.classList.contains('active')
    }))
  };
  return JSON.stringify(state, null, 2);
})()
```

### Waiting for Updates

If the DOM updates asynchronously after an action, add a short delay in bash before the next call:

```bash
sleep 0.5 && {baseDir}/browser-eval.js '...'
```

### Investigate Before Interacting

Always start by understanding the page structure:

```javascript
(function() {
  return {
    title: document.title,
    forms: document.forms.length,
    buttons: document.querySelectorAll('button').length,
    inputs: document.querySelectorAll('input').length,
    mainContent: document.body.innerHTML.slice(0, 3000)
  };
})()
```

Then target specific elements based on what you find.
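
As a possible alternative to the fixed `sleep` under Waiting for Updates, a script can poll for the condition it needs and return as soon as it holds. The sketch below is a minimal version of that idea. `waitFor` is a hypothetical helper, not part of browser-tools, and using it inside `browser-eval.js` assumes the tool awaits a returned promise, which the note that code runs in an async context suggests but does not confirm.

```javascript
// Hypothetical helper: call `check` repeatedly until it returns a truthy
// value, or give up and return null once `timeoutMs` has elapsed.
async function waitFor(check, { timeoutMs = 2000, intervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const value = check();
    if (value) return value;
    if (Date.now() >= deadline) return null;
    // Yield to the page so pending DOM updates can run before the next check.
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}

// In a page, this might be used after a click, for example:
//   await waitFor(() => document.querySelector('.status')?.textContent)
```

Polling returns as soon as the element appears, so it avoids both waiting longer than necessary and reading state too early. The `.status` selector in the comment is borrowed from the state example above and is only illustrative.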