# Browser Automation via Mobile Phone

## Overview

Bridge is a system for programmatic browser automation that uses a real mobile browser (Firefox Android) as its execution engine. A Python server sends commands to a browser extension over WebSocket, enabling scraping and interaction with websites that rely on JavaScript rendering, SPAs, cookie state, or anti-bot detection — all through an actual browser session on a physical device.

Unlike headless browser tools (Puppeteer, Playwright), Bridge operates on a real phone with a real browser fingerprint, making it suitable for sites that block automated browsers.

## Architecture

```
 Python scripts (CLI)
       |
       | (Unix socket or in-process)
       v
 Python WebSocket server
       |
       | WSS (TLS via reverse proxy)
       v
 Firefox extension (phone browser)
       |
       | tabs.executeScript / content script
       v
 Target website DOM
```

### Components

1. **Firefox Extension** — Runs on Firefox Android (Nightly). Connects to the server via WebSocket, authenticates with a shared token, and executes commands (navigate, run JS, click, fill, wait, screenshot, etc.) against the active browser tab.

2. **WebSocket Server** — Python asyncio server using the `websockets` library. Accepts one extension connection, authenticates it, and exposes a command interface. Also listens on a Unix domain socket for local CLI commands.

3. **Python Client SDK** (`BridgeClient`) — Thin async wrapper around the server's command interface. Used by site-specific automation scripts. Can connect in-process (for long-running orchestration) or via the Unix control socket (for one-off CLI commands).

4. **Site Modules** — Per-website automation scripts that combine navigation, JS extraction, and result formatting. Each module is a self-contained CLI tool.

5. **Reverse Proxy** — Nginx terminates TLS (via Let's Encrypt) and proxies `wss://` to the local WebSocket server. This allows the phone extension to connect securely over the internet.

---

## Extension

### Manifest (v2, Firefox)

```json
{
  "manifest_version": 2,
  "name": "Bridge",
  "permissions": ["activeTab", "tabs", "<all_urls>", "cookies", "webNavigation", "storage"],
  "background": { "scripts": ["background.js"], "persistent": true },
  "content_scripts": [{ "matches": ["<all_urls>"], "js": ["content.js"], "run_at": "document_idle" }],
  "browser_specific_settings": { "gecko": { "id": "bridge@local" } }
}
```

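The extension uses Manifest V2: it relies on a persistent background script and on `tabs.executeScript`, neither of which is available in Manifest V3. Firefox Android Nightly can sideload the resulting `.xpi` once it has been signed via AMO (addons.mozilla.org) as an unlisted add-on.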

### Background Script

The background script is the core of the extension. It:

1. **Loads a stored auth token** from `browser.storage.local`.
2. **Connects to the WebSocket server** and sends `{ type: "auth", token: "..." }`.
3. **Receives commands** as `{ type: "command", id, command, params }` messages.
4. **Dispatches** each command to a handler function.
5. **Returns results** as `{ type: "result", id, success, data?, error? }`.
6. **Emits events** (e.g. `pageLoaded`) when navigation completes.
7. **Auto-reconnects** with exponential backoff + jitter on disconnection.
8. **Sends heartbeat pings** every 25 seconds to keep the connection alive.
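
The reconnect schedule in step 7 can be illustrated with a short sketch. It is written in Python for consistency with the server code, even though the extension itself is JavaScript. The function name, the 1 s base delay, the 60 s cap, and the "full jitter" strategy are all assumptions for illustration, not values taken from the extension.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Return the wait in seconds before reconnect attempt `attempt` (0-based).

    The ceiling doubles with every failed attempt up to `cap`. The actual
    delay is drawn uniformly from [0, ceiling] ("full jitter") so that
    reconnect attempts do not line up in lockstep.
    """
    rng = rng or random
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)


if __name__ == "__main__":
    for attempt in range(6):
        print(f"attempt {attempt}: wait up to {min(60.0, 2 ** attempt):.0f}s, "
              f"drew {backoff_delay(attempt):.2f}s")
```

Resetting `attempt` to 0 after a successful authentication keeps the next outage from starting at the maximum delay.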

#### Supported Commands

| Command | Params | Description |
|---|---|---|
| `getPageInfo` | — | Returns `{ url, title }` of the active tab |
| `navigate` | `{ url }` | Navigates the active tab to a URL |
| `executeJs` | `{ code, context? }` | Executes JavaScript in the active tab. `context: "page"` runs in the page's own JS context (needed for accessing page-scope variables); default `"content"` runs via `tabs.executeScript` |
| `getHtml` | `{ selector? }` | Returns `outerHTML` of a selector match, or the full document |
| `click` | `{ selector }` | Clicks the first element matching a CSS selector |
| `fill` | `{ selector, value }` | Sets a form field's value and dispatches `input`/`change` events |
| `scroll` | `{ y?, selector? }` | Scrolls by `y` pixels, or scrolls an element into view |
| `waitFor` | `{ selector, timeout? }` | Waits for a CSS selector to appear (MutationObserver-based), default 10s timeout |
| `screenshot` | — | Returns a `data:image/png;base64,...` screenshot of the visible tab |
| `getCookies` | `{ domain? }` | Returns cookies for a domain or the current page |

#### Token Prompt

If no token is stored, the extension injects a full-screen overlay into the current page (via the content script) prompting the user to enter the token phrase. This is more reliable on Firefox Android than using extension popups. The token is persisted in `browser.storage.local`.

#### Content Script

The content script serves two purposes:

1. **Page-context JS execution** — When `executeJs` is called with `context: "page"`, the content script injects a `<script>` element into the page and communicates results back via `window.postMessage`. This allows access to the page's own JavaScript scope (e.g., `__NEXT_DATA__`, Angular services).

2. **Token prompt overlay** — Renders and manages the token input UI.

#### Firefox Android Compatibility

- `getActiveTab()` uses 3 fallback strategies because `browser.tabs.query({ currentWindow: true })` is unreliable on Android.
- Auth rejection (WebSocket close code 4001) clears the stored token and re-prompts.
- All `ws.send()` calls are wrapped in try/catch.

### Building and Signing

The extension must be signed via the AMO API for Firefox Android to accept it:

```bash
web-ext sign \
  --source-dir=extension \
  --api-key="$AMO_API_KEY" \
  --api-secret="$AMO_API_SECRET" \
  --channel=unlisted \
  --artifacts-dir=dist
```

Install on the phone: download the `.xpi`, enable the debug menu in Firefox Nightly (Settings > About > tap the logo 5x), then choose Settings > Advanced > Install add-on from file.

---

## Server

### WebSocket Server (`ws_server.py`)

A single-file asyncio server with two interfaces:

**WebSocket interface** (for the extension):
- Listens on `127.0.0.1:8767` (behind the reverse proxy).
- First message must be `{ type: "auth", token }` matching the `BRIDGE_TOKEN` environment variable.
- On auth failure, closes with code 4001.
- Uses `websockets` library with `ping_interval=10, ping_timeout=10` for dead connection detection.
- Tracks pending commands as `{ id: Future }` — each `send_command()` creates a Future resolved when the extension sends back a matching result.

**Unix socket interface** (for local CLI):
- Listens at `/tmp/bridge-control.sock` (mode 0600).
- Accepts JSON messages: `{ command, params, timeout }`.
- Forwards to the extension and returns the result.
- Enables one-off commands without embedding the server in each script.
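
A local client for the control socket might look like the sketch below. The socket path and the request fields come from this section; the newline-delimited framing and the function name `send_control_command` are assumptions, since the wire framing is not specified here.

```python
import asyncio
import json
from typing import Any, Optional

CONTROL_SOCKET = "/tmp/bridge-control.sock"  # path given in this section


async def send_control_command(command: str, params: Optional[dict] = None,
                               timeout: float = 30.0,
                               path: str = CONTROL_SOCKET) -> Any:
    """Send one command to the running server and return its `data` field.

    Assumes one JSON object per line in each direction (a guess for this sketch).
    """
    reader, writer = await asyncio.open_unix_connection(path)
    try:
        request = {"command": command, "params": params or {}, "timeout": timeout}
        writer.write(json.dumps(request).encode() + b"\n")
        await writer.drain()
        # Allow slightly longer than the server-side timeout for the reply.
        line = await asyncio.wait_for(reader.readline(), timeout + 5)
    finally:
        writer.close()
        await writer.wait_closed()
    if not line:
        raise ConnectionError("server closed the control socket without replying")
    response = json.loads(line)
    if not response.get("success"):
        raise RuntimeError(response.get("error", "unknown error"))
    return response.get("data")
```

With the server running, `asyncio.run(send_control_command("getPageInfo"))` would return the page info dictionary, or raise `RuntimeError` if the extension is not connected.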

### CLI Entry Point (`__main__.py`)

```bash
# Start the server (blocks, waits for extension)
python -m server

# Send a one-off command to the running server
python -m server cmd getPageInfo
python -m server cmd executeJs 'document.title'
python -m server cmd navigate 'https://example.com'
```

### Deployment

The server runs as a systemd service on any Linux machine reachable from the internet (a cloud server, home server, etc.):

**systemd unit:**
```ini
[Unit]
Description=Bridge WebSocket Server
After=network.target

[Service]
Type=simple
WorkingDirectory=/opt/bridge
EnvironmentFile=/opt/bridge/.env
ExecStart=/opt/bridge/.venv/bin/python -m server
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target
```

**Environment file** (`.env`):
```
BRIDGE_TOKEN=your shared secret phrase
```

**Nginx reverse proxy:**
```nginx
location / {
    proxy_pass http://127.0.0.1:8767;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 86400s;
    proxy_send_timeout 86400s;
}
```

TLS is provided by Let's Encrypt / certbot with the nginx plugin.

---

## Python Client SDK

### `BridgeClient`

```python
from bridge.client import BridgeClient

# Connect via Unix control socket (talks to running server)
client = BridgeClient.connect()

# Use in an async context
info = await client.get_page_info()          # -> { url, title }
await client.navigate("https://example.com")
result = await client.execute_js("document.title")
await client.click("#submit-button")
await client.fill("#email", "user@example.com")
await client.wait_for(".results-loaded", timeout=15000)
html = await client.get_html(".product-card")
await client.scroll(y=500)
screenshot_data_url = await client.screenshot()
cookies = await client.get_cookies(domain=".example.com")
```

### `run_js_file()`

For complex extraction logic, JavaScript is stored in separate `.js` files and loaded at runtime:

```python
result = await client.run_js_file("sites/mysite/js/extract_data.js")
```

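Keyword arguments are substituted into the file with `str.format()`, so a script that uses placeholders must double its literal braces (`{{` and `}}`):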

```python
result = await client.run_js_file("sites/mysite/js/search.js", query="coffee shops")
```
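
Substituting raw strings into JavaScript breaks as soon as a value contains a quote. The sketch below shows one way to handle this, by encoding every value with `json.dumps()` before formatting. The template text and the `render_js` helper are hypothetical; whether `run_js_file()` encodes values itself is not stated here.

```python
import json

# Hypothetical contents of sites/mysite/js/search.js. The {query} placeholder
# is unquoted because json.dumps() supplies the quotes; literal JavaScript
# braces are doubled so str.format() leaves them alone.
SEARCH_JS_TEMPLATE = """
var box = document.querySelector("input[type=search]");
if (box) {{ box.value = {query}; "filled"; }} else {{ "no_search_box"; }}
"""


def render_js(template: str, **values: object) -> str:
    """Fill placeholders, encoding each value as a JavaScript literal."""
    encoded = {name: json.dumps(value) for name, value in values.items()}
    return template.format(**encoded)


if __name__ == "__main__":
    print(render_js(SEARCH_JS_TEMPLATE, query='coffee "shops"'))
```

Because the value arrives as a quoted, escaped literal, a query containing quotes or backslashes cannot terminate the string early or inject extra statements.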

### Optional Command Logging

`BridgeClient` accepts a `run_id` for logging all commands to a SQLite database (command name, params, result, duration, errors). Useful for debugging and replay analysis.

---

## Site Modules

### Structure

Each site module lives under `sites/<sitename>/` with the following convention:

```
sites/
  mysite/
    __init__.py
    search.py          # Main CLI script
    detail.py           # Optional detail extraction
    js/
      extract_search.js # JS executed in the browser to extract data
      extract_detail.js
      accept_cookies.js
    MYSITE.md           # Documentation: DOM structure, selectors, quirks
```

### Pattern

Every site module follows the same pattern:

1. **Navigate** to the target URL (often with SPA cache-busting: navigate to `about:blank` first).
2. **Wait** for the page to load (`asyncio.sleep` or `client.wait_for`).
3. **Dismiss overlays** (cookie banners, login prompts).
4. **Execute JS** to extract structured data from the DOM.
5. **Return JSON** from the JS to Python.
6. **Pretty-print** results for CLI output.
7. **Optionally drill down** into detail views.
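
The first six steps can be sketched end to end. To keep the example runnable without a phone, it uses a `StubClient` that mimics the few `BridgeClient` methods involved; the stub, the `example.com` URL, the selector, and the canned result are all placeholders, not part of any real module.

```python
import asyncio
import json
from urllib.parse import quote_plus


class StubClient:
    """Stand-in for BridgeClient so the sketch runs without a phone."""

    async def navigate(self, url: str) -> None:
        self.url = url

    async def wait_for(self, selector: str, timeout: int = 10000) -> None:
        await asyncio.sleep(0)

    async def execute_js(self, code: str) -> str:
        if "cookie" in code:
            return "no_banner"
        # Canned extraction result in the shape the JS scripts return.
        return json.dumps({"count": 1, "results": [
            {"name": "Labyrinths", "price": "9.99",
             "url": "https://example.com/p/1"}]})


async def search(client, query: str) -> dict:
    await client.navigate("about:blank")                     # 1. SPA cache-busting
    await client.navigate(
        f"https://example.com/search?q={quote_plus(query)}")
    await client.wait_for(".product-card", timeout=15000)    # 2. wait for results
    await client.execute_js("/* accept cookie banner */")    # 3. dismiss overlays
    raw = await client.execute_js("/* extraction script */")  # 4. extract
    return json.loads(raw)                                   # 5. JSON -> Python


def format_results(data: dict) -> str:                       # 6. pretty-print
    return "\n".join(f"{i:3d}. {r['name']:<30} {r['price']}"
                     for i, r in enumerate(data["results"]))


if __name__ == "__main__":
    print(format_results(asyncio.run(search(StubClient(), "borges"))))
```

Passing the client in as a parameter keeps the flow testable: the same `search()` coroutine works with the stub or with a real client.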

### JS Extraction Pattern

Extraction scripts are plain JavaScript that runs in the browser tab. They query the DOM, build a data structure, and return it as a JSON string:

```javascript
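// sites/mysite/js/extract_search.js
// Returns the trimmed text of a child element, or null if it is missing,
// so one incomplete card does not abort the whole extraction.
function text(root, selector) {
  var el = root.querySelector(selector);
  return el ? el.textContent.trim() : null;
}
var results = [];
document.querySelectorAll(".product-card").forEach(function(item) {
  var link = item.querySelector("a");
  results.push({
    name: text(item, ".title"),
    price: text(item, ".price"),
    url: link ? link.href : null
  });
});
JSON.stringify({ count: results.length, results: results });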
```

The last expression in the script is the return value. `JSON.stringify()` is used because `tabs.executeScript` serializes return values, and complex DOM-derived objects may not survive serialization.

### CLI Interface Pattern

Each module is invoked as a Python module with consistent argument conventions:

```bash
# Basic search
python -m sites.mysite.search "query term"

# Drill into a specific result (by 0-based index)
python -m sites.mysite.search "query term" --detail 0

# Pagination
python -m sites.mysite.search "query term" --page 2

# Site-specific flags
python -m sites.mysite.search "query term" --direct --class business
```

The `--detail N` pattern is universal: search first, then drill into result N for more information.
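
A minimal `argparse` setup for these conventions might look like the following. It covers only the positional query, `--detail`, and `--page`; the default page of 1 and the function names are assumptions, and real modules add their own site-specific flags.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="python -m sites.mysite.search")
    parser.add_argument("query", help="search term")
    parser.add_argument("--detail", type=int, metavar="N",
                        help="0-based index of the result to drill into")
    parser.add_argument("--page", type=int, default=1,
                        help="results page to load")
    return parser


def parse_cli(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments; `argv=None` reads from sys.argv as usual."""
    args = build_parser().parse_args(argv)
    if args.detail is not None and args.detail < 0:
        raise SystemExit("--detail must be a non-negative index")
    return args


if __name__ == "__main__":
    print(parse_cli(["query term", "--detail", "0"]))
```

Leaving `--detail` unset (`None`) rather than defaulting it to 0 lets the module distinguish "list results" from "show the first result".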

### Example: Library Catalog Search

```bash
# Search for books
python -m sites.library.search "borges"
#   0. Collected fictions                    Jorge Luis Borges                    Book [available]
#   1. Labyrinths                            Jorge Luis Borges                    Book [available]
#   2. The Aleph and other stories           Jorge Luis Borges                    Book

# Show detail for first result (copies, availability, ISBN)
python -m sites.library.search "borges" --detail 0

# Page 2 of results
python -m sites.library.search "borges" --page 2

# Show 50 results per page
python -m sites.library.search "borges" --per-page 50
```

### Example: Flight Aggregator

```bash
# One-way flight search
python -m sites.flights.search AMS BKK 2026-03-17

# Return trip
python -m sites.flights.search AMS BKK 2026-03-17 --return 2026-03-24

# Direct flights only, business class
python -m sites.flights.search AMS BKK 2026-03-17 --direct --class business

# Show booking providers for flight #0
python -m sites.flights.search AMS BKK 2026-03-17 --detail 0

# Load more results
python -m sites.flights.search AMS BKK 2026-03-17 --more
```

### Example: Online Grocery Store (Search + Cart)

```bash
# Search products
python -m sites.grocery.search "milk"

# Add first result to cart
python -m sites.grocery.search "milk" --add 0

# Add 3 of something
python -m sites.grocery.search "milk" --add 0 -q 3

# Remove from cart (set quantity to 0)
python -m sites.grocery.search "milk" --add 0 -q 0

# Show product detail
python -m sites.grocery.search "milk" --detail 0
```

### Example: Google Maps Business Search

```bash
# Search for businesses, visits each result to extract full details
python -m sites.gmaps.search "coffee shops amsterdam"
# Outputs: name, website, phone, address, rating for each business
# Saves full results to data/gmaps_coffee_shops_amsterdam.json
```

---

## SPA and Anti-Detection Patterns

### SPA Cache Busting

Single-page applications cache state between navigations. To force a fresh page load:

```python
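import asyncio

# If the tab is already on the target site, leave it first so the SPA
# cannot reuse its in-memory state.
info = await client.get_page_info()
if "targetsite.com" in info.get("url", ""):
    await client.navigate("about:blank")
    await asyncio.sleep(1)  # give the blank page time to load
await client.navigate(target_url)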
```

### Cookie Banner Dismissal

Most sites show a cookie consent banner on first visit. Each module handles this with a small JS snippet:

```javascript
var btn = document.getElementById("cookie-accept");
if (btn) { btn.click(); "accepted"; } else { "no_banner"; }
```

### Angular / React Input Filling

Frameworks that use virtual DOMs or change detection often don't respond to `element.value = x`. Workarounds:

- **Angular (contenteditable):** Use `document.execCommand("insertText")` + dispatch `InputEvent`.
- **React:** Set the value, then dispatch `input` and `change` events with `{ bubbles: true }`.
- **Clear first:** Use Selection API + `execCommand("delete")` rather than setting `textContent = ""`.

### Avoiding Direct Navigation

Some sites return "Access Denied" when navigating directly to product URLs. The workaround is to search first, then click the product link in the search results page:

```python
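import json

# Don't do this: navigating straight to the product URL triggers bot detection.
await client.navigate("https://store.com/product/12345")

# Do this instead: click through from the search results page.
href = "/product/12345"  # hypothetical: the link's href attribute as it appears in the results
selector = f'a[href="{href}"]'
await client.execute_js(f"document.querySelector({json.dumps(selector)}).click()")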
```

### Extracting Data from Hidden Elements

Some SPAs render detail panels off-screen or with `visibility: hidden`. Use `textContent` instead of `innerText` (which respects CSS visibility):

```javascript
var panel = document.querySelector(".detail-panel");
// innerText returns "" if panel is hidden
// textContent returns the full text regardless of visibility
var data = panel.textContent;
```

---

## Database

An optional SQLite database (`data/bridge.db`) logs automation runs:

**Schema:**
- `runs` — Tracks each automation invocation (recipe name, start/end time, status, error).
- `results` — Stores extracted data per URL per run.
- `command_log` — Every command sent to the extension (command name, params, result, duration_ms, error).

This is useful for debugging failed extractions and measuring performance.
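
As an illustration of how the log supports performance measurement, the sketch below creates the three tables and queries the slowest commands. Only the table names and the listed fields come from the schema above; the exact column names, types, and the `slowest_commands` helper are assumptions.

```python
import sqlite3

# Column names and types are guesses based on the field list above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    id INTEGER PRIMARY KEY,
    recipe TEXT NOT NULL,
    started_at TEXT NOT NULL,
    ended_at TEXT,
    status TEXT,
    error TEXT
);
CREATE TABLE IF NOT EXISTS results (
    id INTEGER PRIMARY KEY,
    run_id INTEGER NOT NULL REFERENCES runs(id),
    url TEXT NOT NULL,
    data TEXT
);
CREATE TABLE IF NOT EXISTS command_log (
    id INTEGER PRIMARY KEY,
    run_id INTEGER REFERENCES runs(id),
    command TEXT NOT NULL,
    params TEXT,
    result TEXT,
    duration_ms REAL,
    error TEXT
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the tables if they do not exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def slowest_commands(conn: sqlite3.Connection, limit: int = 5) -> list:
    """Return (command, average duration in ms) pairs, slowest first."""
    return conn.execute(
        "SELECT command, AVG(duration_ms) FROM command_log "
        "GROUP BY command ORDER BY AVG(duration_ms) DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Calling `open_db("data/bridge.db")` would target the file named above; the in-memory default is convenient for experiments.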

---

## Setup Checklist

### Server

1. Have a server reachable from the internet (cloud instance, home server, etc.) with a public IP and domain name.
2. Install Python 3.9+, nginx, certbot.
3. Create a Python venv and install `websockets`.
4. Set `BRIDGE_TOKEN` in a `.env` file.
5. Deploy the server code, nginx config, and systemd unit.
6. Obtain a TLS certificate with certbot.
7. Start the service: `systemctl start bridge`.

### Extension

1. Set the `WS_URL` constant in `background.js` to your `wss://` server URL.
2. Sign the extension via the AMO API (`web-ext sign`).
3. Install Firefox Nightly on an Android phone.
4. Enable the debug menu (Settings > About > tap logo 5x).
5. Install the `.xpi` via Settings > Advanced > Install add-on from file.
6. Visit any webpage — the token prompt appears.
7. Enter the same token phrase configured on the server.

### Adding a New Site Module

1. Create `sites/<sitename>/` with `__init__.py` and `search.py`.
2. Create `sites/<sitename>/js/` with extraction scripts.
3. Write a documentation file `sites/<sitename>/SITENAME.md` with:
   - DOM structure and CSS selectors used.
   - Known quirks and limitations.
   - SPA behavior notes.
4. Follow the standard pattern: navigate, wait, dismiss overlays, extract, format.
5. Use `BridgeClient.connect()` for the client.
6. Make it runnable as `python -m sites.<sitename>.search "query"`.

---

## Protocol Reference

### WebSocket Messages (Server <-> Extension)

**Extension -> Server:**
```json
{ "type": "auth", "token": "shared secret" }
{ "type": "pong", "id": "uuid" }
{ "type": "result", "id": "uuid", "success": true, "data": ... }
{ "type": "result", "id": "uuid", "success": false, "error": "message" }
{ "type": "event", "id": "uuid", "event": "pageLoaded", "data": { "url": "...", "title": "..." } }
{ "type": "ping", "id": "uuid" }
```

**Server -> Extension:**
```json
{ "type": "auth_result", "success": true }
{ "type": "auth_result", "success": false }
{ "type": "command", "id": "uuid", "command": "executeJs", "params": { "code": "..." } }
{ "type": "ping", "id": "uuid" }
{ "type": "pong", "id": "uuid" }
```
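
The authentication handshake can be sketched as a pure function that validates the first message and builds the `auth_result` reply. The function name `check_auth` and the use of `hmac.compare_digest` for a constant-time comparison are assumptions; the server may compare tokens differently.

```python
import hmac
import json
from typing import Tuple

AUTH_FAILED_CLOSE_CODE = 4001  # close code named in the Server section


def check_auth(raw_message: str, expected_token: str) -> Tuple[bool, str]:
    """Validate the first WebSocket message and build the `auth_result` reply."""
    try:
        msg = json.loads(raw_message)
    except json.JSONDecodeError:
        msg = {}
    token = msg.get("token") if isinstance(msg, dict) else None
    ok = (
        isinstance(msg, dict)
        and msg.get("type") == "auth"
        and isinstance(token, str)
        # Constant-time comparison avoids leaking how much of the token matched.
        and hmac.compare_digest(token.encode(), expected_token.encode())
    )
    return ok, json.dumps({"type": "auth_result", "success": ok})


if __name__ == "__main__":
    print(check_auth('{"type": "auth", "token": "shared secret"}', "shared secret"))
```

When `ok` is false, the server would send the reply and then close the connection with `AUTH_FAILED_CLOSE_CODE`, which the extension treats as a signal to clear its stored token.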

### Control Socket Messages (CLI -> Server)

**Request:**
```json
{ "command": "executeJs", "params": { "code": "document.title" }, "timeout": 30 }
```

**Response:**
```json
{ "success": true, "data": "Page Title" }
{ "success": false, "error": "Extension not connected" }
```

---

## Dependencies

- **Server:** Python 3.9+, `websockets` (single pip dependency)
- **Extension:** Firefox 68+ (Manifest V2), no external dependencies
- **Infrastructure:** nginx (reverse proxy + TLS), certbot (Let's Encrypt), systemd
- **Optional:** SQLite (command logging), `web-ext` (extension signing)

## Limitations

- **Single browser tab** — Commands target the active tab. Running multiple automations concurrently is not supported.
- **Single extension connection** — The server accepts one extension at a time.
- **Timing-dependent** — Extraction relies on `asyncio.sleep()` waits for page loads. Adjust delays per site and network conditions.
- **Phone must stay awake** — The browser must remain in the foreground (or at least active) during automation. Screen-off or app switching may disconnect the WebSocket.
- **Manual cookie/login state** — Login sessions are managed by the real browser. If a site requires login, log in manually first; the automation uses the existing session.
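
One way to soften the timing dependence noted above is to poll for a condition instead of sleeping for a fixed interval. The helper below is a generic sketch, not part of the SDK; the name `poll_until` and its defaults are assumptions.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def poll_until(check: Callable[[], Awaitable[bool]],
                     timeout: float = 10.0, interval: float = 0.25) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass.

    Returns True as soon as the check succeeds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if await check():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval)
```

A site module could pass a coroutine that runs a small `execute_js` probe and returns whether the expected element or data is present, so fast pages proceed immediately and slow pages get the full timeout.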