TLD2 Architecture Reference

Complete technical documentation for Chrome Extension Manifest V3 with local AI, neural TTS, and CSP-compliant architecture

Last updated: January 2025 • For developers & technical users

System Overview
Manifest V3 Structure
Component Architecture
Data Flow & Message Passing
Content Extraction (Readability.js)
AI Summarization Pipeline
TTS Integration (StreamingKokoroJS)
Storage Architecture
CSP Compliance & Static Bundling
Permissions & Security
Performance Optimizations
Technology Stack

System Overview

TLD2 is a privacy-first Chrome extension that provides AI-powered article summarization with high-quality neural text-to-speech. The architecture emphasizes local execution, CSP compliance, and modern web technologies.

Core Capabilities

Content Extraction: Readability.js parses web pages to isolate main article content
AI Summarization: Chrome's built-in AI or Google Gemini API (optional)
Neural TTS: StreamingKokoroJS with GPU acceleration (WebGPU/WASM)
Privacy-First: 100% local processing by default, no telemetry

High-Level Data Flow

┌─────────────┐
│  Web Page   │
└──────┬──────┘
       │
       ↓ (User clicks TLD2 icon)
┌─────────────────────────┐
│ Content Script          │
│ (Readability.js)        │
│ Extracts article text   │
└──────┬──────────────────┘
       │
       ↓ (Message passing)
┌─────────────────────────┐
│ Background Script       │
│ (Service Worker)        │
│ AI Summarization        │
└──────┬──────────────────┘
       │
       ↓ (Streaming tokens)
┌─────────────────────────┐
│ Sidebar UI              │
│ (chrome.sidePanel)      │
│ Display + Controls      │
└──────┬──────────────────┘
       │
       ↓ (Text chunking)
┌─────────────────────────┐
│ TTS Engine              │
│ (StreamingKokoroJS)     │
│ Audio synthesis         │
└──────┬──────────────────┘
       │
       ↓
┌─────────────────────────┐
│ Web Audio API           │
│ Audio playback          │
└─────────────────────────┘

Manifest V3 Structure

TLD2 uses Chrome Extension Manifest V3 with strict Content Security Policy for maximum security.

// manifest.json
{
  "manifest_version": 3,
  "name": "TLD2",
  "version": "1.0.2",
  "description": "Local AI article summarizer with neural TTS",

  // Required permissions
  "permissions": [
    "activeTab",      // Read current tab content
    "scripting",      // Inject content scripts
    "storage",        // Save user settings
    "sidePanel",      // Display sidebar UI
    "contextMenus"    // Right-click "Summarize" option
  ],

  // Background service worker (Manifest V3 requirement)
  "background": {
    "service_worker": "background/service-worker.js",
    "type": "module"
  },

  // Sidebar UI
  "side_panel": {
    "default_path": "sidepanel/sidepanel.html"
  },

  // Extension icon
  "action": {
    "default_icon": {
      "16": "logo/icon/icon16.png",
      "32": "logo/icon/icon32.png",
      "48": "logo/icon/icon48.png",
      "128": "logo/icon/icon128.png"
    }
  },

  // Content Security Policy - Critical for ONNX Runtime
  "content_security_policy": {
    "extension_pages": "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'"
  },

  // Web-accessible resources for models and libraries
  "web_accessible_resources": [{
    "resources": [
      "lib/onnxruntime-web/*",
      "models/*"
    ],
    "matches": ["<all_urls>"]
  }]
}
        

Critical CSP Configuration

'wasm-unsafe-eval' is required for ONNX Runtime Web to execute WebAssembly. This is the only CSP exception needed and is necessary for neural network inference in the browser.

Security Note: This directive only allows WebAssembly execution, not arbitrary JavaScript eval(). All JavaScript remains statically bundled and verified.
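
To make the constraint concrete, here is a minimal sketch of a build-time check that a CSP string stays within what Manifest V3 permits for script-src on extension pages. The helper names (parseCsp, findDisallowedScriptSources) are hypothetical and not part of TLD2, and the allow-list ignores the localhost sources Chrome tolerates for unpacked extensions.

```javascript
// Sources Chrome accepts in an MV3 extension_pages script-src directive.
// Assumption: localhost sources (allowed for unpacked extensions) are ignored.
const ALLOWED_SCRIPT_SOURCES = new Set(["'self'", "'wasm-unsafe-eval'", "'none'"]);

// Hypothetical helper: parse a CSP string into { directiveName: [sources] }.
function parseCsp(csp) {
  const directives = {};
  for (const part of csp.split(';')) {
    const [name, ...sources] = part.trim().split(/\s+/);
    if (name) directives[name.toLowerCase()] = sources;
  }
  return directives;
}

// Hypothetical helper: list script-src sources outside the allowed set.
function findDisallowedScriptSources(csp) {
  const sources = parseCsp(csp)['script-src'] || [];
  return sources.filter((source) => !ALLOWED_SCRIPT_SOURCES.has(source));
}
```

Run against the policy from the manifest above, the check should report no disallowed sources, while a policy containing 'unsafe-eval' or a CDN origin would be flagged.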

Component Architecture

1. Background Service Worker

File: background/service-worker.js

Role: Orchestrates extension lifecycle, message routing, and side effects

Responsibilities:

Initialize extension on install/startup
Create context menu items
Handle icon clicks (open sidebar)
Route messages between content scripts and sidebar
Manage chrome.storage operations

// Background script initialization
chrome.runtime.onInstalled.addListener(() => {
  initialize();
});

chrome.runtime.onStartup.addListener(() => {
  initialize();
});

async function initialize() {
  // Create right-click context menu
  await chrome.contextMenus.removeAll();
  chrome.contextMenus.create({
    id: 'tld2-summarize',
    title: 'TLD2 - Summarize Article',
    contexts: ['page', 'selection', 'link']
  });

  // Initialize default settings if first install
  const settings = await chrome.storage.local.get('settings');
  if (!settings.settings) {
    await chrome.storage.local.set({
      settings: {
        voice: 'af_sky',
        speed: 1.0,
        pitch: 1.0,
        autoplay: true,
        summaryLength: 'medium'
      }
    });
  }
}

// Handle extension icon click
chrome.action.onClicked.addListener(async (tab) => {
  await chrome.sidePanel.open({ tabId: tab.id });
});

// Handle context menu click
chrome.contextMenus.onClicked.addListener(async (info, tab) => {
  if (info.menuItemId === 'tld2-summarize') {
    await chrome.sidePanel.open({ tabId: tab.id });
  }
});
        

2. Content Script

File: content/content-script.js

Role: Executes in web page context to extract article content

Injection Strategy:

Dynamic Injection: Injected when sidebar opens (not on every page load)
Isolated World: Runs in separate JavaScript context from page scripts
Message-Based: Communicates with background via chrome.runtime.sendMessage

// Content script injection (from sidebar)
async function extractContent(tabId) {
  // Inject Readability.js and extraction script
  await chrome.scripting.executeScript({
    target: { tabId: tabId },
    files: ['lib/readability/Readability.js']
  });

  // Extract article content
  const result = await chrome.scripting.executeScript({
    target: { tabId: tabId },
    func: () => {
      const reader = new Readability(document.cloneNode(true));
      const article = reader.parse();
      return {
        title: article?.title || document.title,
        content: article?.textContent || document.body.innerText,
        excerpt: article?.excerpt || ''
      };
    }
  });

  return result[0].result;
}
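
Script injection fails on pages Chrome does not let extensions touch, such as chrome:// pages and the Chrome Web Store. A guard along the following lines could run before extractContent; canInject and the blocked-host list are hypothetical names introduced for illustration, not part of TLD2.

```javascript
// Hosts assumed to be off-limits to extension scripts (Chrome Web Store).
const BLOCKED_HOSTS = new Set(['chromewebstore.google.com']);

// Hypothetical helper: returns true when a tab URL looks scriptable.
function canInject(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return false; // Missing or malformed URL
  }
  const isWeb = parsed.protocol === 'http:' || parsed.protocol === 'https:';
  return isWeb && !BLOCKED_HOSTS.has(parsed.hostname);
}
```

A caller would check canInject(tab.url) first and show a short notice instead of attempting injection when it returns false.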
        

3. Sidebar UI

Files: sidepanel/sidepanel.html, sidepanel/sidepanel.js, sidepanel/sidepanel.css

Role: Main user interface for displaying summaries and controlling TTS

Architecture Pattern:

Single Page Application (SPA): No page reloads, dynamic content updates
Event-Driven: Button clicks, slider changes trigger handlers
State Management: Local state object tracks playback, settings, content

Key UI Components:

Component	Purpose	Implementation
Menu Bar	Copy, share, settings actions	Button group with icon + text
Playbar	TTS playback controls	Play/pause, progress bar, shuttle controls
Summary Display	Show streaming summary text	Scrollable div with fade-in animation
Progress Bar	Visual playback position	Two-layer: played + buffered
Settings Panel	Configure TTS, AI, preferences	Slide-down modal overlay

Data Flow & Message Passing

Message Passing Architecture

Chrome Manifest V3 requires asynchronous message passing between extension components. TLD2 uses a structured message protocol:

// Message structure
{
  action: string,      // Action type (e.g., 'extractContent', 'summarize')
  data: object,        // Payload specific to action
  tabId?: number       // Target tab (if applicable)
}

// Example: Request content extraction
const response = await chrome.runtime.sendMessage({
  action: 'extractContent',
  data: { tabId: currentTab.id }
});

if (response.success) {
  const { title, content } = response.data;
  // Process extracted content
}
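
The receiving side of this protocol is not shown above. As a rough sketch, a background listener could dispatch on the action field and always answer with a { success, data | error } envelope. createMessageRouter is a hypothetical helper, and the handler map is an assumption about how actions might be organized.

```javascript
// Hypothetical helper: build an onMessage listener from a map of
// action name -> handler(data, sender). Handlers may be sync or async.
function createMessageRouter(handlers) {
  return function onMessage(message, sender, sendResponse) {
    const action = message && message.action;
    const known = typeof action === 'string' &&
      Object.prototype.hasOwnProperty.call(handlers, action);

    if (!known) {
      sendResponse({ success: false, error: `Unknown action: ${String(action)}` });
      return false; // Responded synchronously; no need to keep the channel open
    }

    Promise.resolve()
      .then(() => handlers[action](message.data, sender))
      .then((data) => sendResponse({ success: true, data }))
      .catch((error) => sendResponse({
        success: false,
        error: error instanceof Error ? error.message : String(error)
      }));

    return true; // Tells Chrome the response will arrive asynchronously
  };
}

// Usage inside the service worker (illustrative handler name):
// chrome.runtime.onMessage.addListener(createMessageRouter({
//   extractContent: ({ tabId }) => extractContent(tabId)
// }));
```

Returning true from the listener is what keeps the message channel open until the handler's promise settles.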
        

Message Flow Examples

1. Content Extraction Flow

Sidebar UI
  └──> sendMessage('extractContent', { tabId })
       └──> Background Script
            └──> executeScript(tabId, readability.js)
                 └──> Content Script (in page context)
                      └──> Readability.parse()
                           └──> Return { title, content }
                                └──> Background forwards to Sidebar
                                     └──> Sidebar displays content

2. Settings Update Flow

Sidebar UI
  └──> User changes voice → sendMessage('updateSettings', { voice: 'af_nicole' })
       └──> Background Script
            └──> chrome.storage.local.set({ settings: {...} })
                 └──> Return success
                      └──> Sidebar confirms "Settings saved"

Content Extraction (Readability.js)

Readability.js Integration

Library: Mozilla's @mozilla/readability (v0.5.0)

Purpose: Parse web pages to extract main article content, removing ads, navigation, and clutter

How It Works:

Clone Document: Create deep clone of current DOM to avoid modifying page
Parse HTML: Algorithm analyzes HTML structure, content density, and semantic markup
Score Content: Assigns scores to page elements based on article likelihood
Extract Article: Returns highest-scoring content as plain text

// Readability.js usage
import { Readability } from '@mozilla/readability';

function extractArticle() {
  // Clone document to avoid side effects
  const documentClone = document.cloneNode(true);

  // Create Readability instance
  const reader = new Readability(documentClone, {
    charThreshold: 500,     // Minimum characters for valid article
    classesToPreserve: [],  // Don't preserve any CSS classes
    keepClasses: false      // Remove all classes for clean text
  });

  // Parse and extract
  const article = reader.parse();

  if (article) {
    return {
      title: article.title,
      content: article.textContent,      // Plain text, no HTML
      excerpt: article.excerpt,          // Page description or short excerpt
      length: article.length,            // Character count
      siteName: article.siteName
    };
  }

  // Fallback if extraction fails
  return {
    title: document.title,
    content: document.body.innerText,
    excerpt: ''
  };
}
        

Handling Edge Cases

Dynamic Content: Wait for lazy-loaded content before extraction
Single-Page Apps: May require manual scrolling to trigger content load
Paywalls: Extraction fails if content is behind a login; TLD2 returns an error gracefully
Non-Article Pages: Search results and index pages return minimal content, so the user is warned
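
One way to implement the last two cases is to classify the extraction result before summarizing. The sketch below is an assumption about how that check might look: assessExtraction is a hypothetical helper, and its threshold simply reuses the charThreshold of 500 from the Readability options.

```javascript
// Assumed threshold, mirroring the charThreshold passed to Readability.
const MIN_ARTICLE_CHARS = 500;

// Hypothetical helper: classify an extraction result so the UI can react.
function assessExtraction(result, minChars = MIN_ARTICLE_CHARS) {
  const text = (result && result.content ? result.content : '').trim();
  if (text.length === 0) {
    return { ok: false, reason: 'empty' };  // e.g. paywall or blocked page
  }
  if (text.length < minChars) {
    return { ok: true, reason: 'short' };   // e.g. search results or index page
  }
  return { ok: true, reason: null };        // Looks like a full article
}
```

The sidebar could show an error for 'empty' and a softer warning for 'short' while still offering to summarize.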

AI Summarization Pipeline

Dual-Mode Architecture

TLD2 supports two summarization backends with automatic fallback:

Mode 1: Local AI (Chrome Built-in)

API: Chrome's built-in Prompt API, exposed to extensions as chrome.aiOriginTrial.languageModel during its origin trial (Chrome 131+)

Advantages: 100% private, offline, no API costs

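// Summarize with Chrome's built-in model; returns null when it is unavailable
async function summarizeLocally(articleText) {
  // Check availability ('readily', 'after-download', or 'no')
  const canSummarize = await chrome.aiOriginTrial.languageModel
    .capabilities();

  if (canSummarize.available !== 'readily') {
    return null;  // Caller can fall back to the cloud backend
  }

  const session = await chrome.aiOriginTrial.languageModel.create({
    systemPrompt: 'You are a concise article summarizer.'
  });

  try {
    return await session.prompt(
      `Summarize this article in 200 words:\n\n${articleText}`
    );
  } finally {
    session.destroy();  // Release model resources
  }
}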
        

Mode 2: Cloud AI (Google Gemini)

API: Google Gemini 2.5 Flash Lite

Advantages: Higher quality, better comprehension of technical content

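// Gemini API integration
async function summarizeWithGemini(text, apiKey, length = 'medium') {
  const lengthMap = {
    short: '100-200 words',
    medium: '200-400 words',
    long: '400-600 words'
  };
  // Fall back to the medium range for unknown length values
  const targetLength = lengthMap[length] ?? lengthMap.medium;

  const response = await fetch(
    `https://generativelanguage.googleapis.com/v1/models/gemini-2.0-flash-exp:generateContent?key=${encodeURIComponent(apiKey)}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        contents: [{
          parts: [{
            text: `Summarize the following article in ${targetLength}. Focus on key points and main ideas:\n\n${text}`
          }]
        }]
      })
    }
  );

  // Surface HTTP errors (invalid key, quota, etc.) instead of failing later
  if (!response.ok) {
    throw new Error(`Gemini request failed: HTTP ${response.status}`);
  }

  const data = await response.json();
  const summary = data.candidates?.[0]?.content?.parts?.[0]?.text;
  if (!summary) {
    throw new Error('Gemini response contained no summary text');
  }
  return summary;
}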
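
The automatic fallback between the two modes is not spelled out in the code above. A minimal sketch of the selection step might look like this, assuming local AI is preferred whenever the built-in model reports 'readily' and Gemini is used only when an API key is configured. chooseBackend is a hypothetical helper, not TLD2's actual logic.

```javascript
// Hypothetical helper: decide which summarization backend to use.
// localAvailability is the `available` value reported by capabilities()
// ('readily', 'after-download', or 'no'); geminiApiKey comes from settings.
function chooseBackend({ localAvailability, geminiApiKey }) {
  if (localAvailability === 'readily') {
    return 'local';   // Private, offline path is preferred by default
  }
  if (typeof geminiApiKey === 'string' && geminiApiKey.trim() !== '') {
    return 'gemini';  // Cloud fallback, only when the user supplied a key
  }
  return 'none';      // Neither backend is usable; the UI should explain why
}
```

Keeping the decision in a pure function makes the fallback order easy to test without touching Chrome or network APIs.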
        

Streaming Display

Summaries are displayed with a streaming animation that simulates LLM-style output at about 2 words per second (120 words per minute):

// Stream summary to UI
function streamSummary(summary, targetElement, onStreamComplete = () => {}) {
  const words = summary.split(/\s+/).filter(Boolean);
  const msPerWord = 1000 / 2;  // 2 words/sec = 120 words/min

  let index = 0;
  const interval = setInterval(() => {
    if (index < words.length) {
      const word = words[index++];
      const span = document.createElement('span');
      span.textContent = word + ' ';
      span.style.opacity = '0';
      span.style.transition = 'opacity 0.3s';
      targetElement.appendChild(span);

      // Fade in word
      setTimeout(() => { span.style.opacity = '1'; }, 10);
    } else {
      clearInterval(interval);
      onStreamComplete();  // Notify caller (e.g. to start TTS autoplay)
    }
  }, msPerWord);
}
        

TTS Integration (StreamingKokoroJS)

See TTS Implementation Documentation for complete technical details.

High-Level Integration

Library: StreamingKokoroJS (Kokoro 82M ONNX model)
Model: model_q8f16.onnx (86MB, quantized)
Backend: WebGPU (preferred) or WASM (fallback)
Voices: af_sky, af_nicole, bm_fable, bm_lewis

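// TTS initialization
import { KokoroTTS } from './lib/kokoro-js/kokoro.js';

const tts = await KokoroTTS.from_pretrained(
  chrome.runtime.getURL('models/kokoro/'),
  {
    dtype: 'q8',                           // Quantized model
    device: navigator.gpu ? 'webgpu' : 'wasm',
    progress_callback: (progress) => {
      // Not every progress event carries byte counts
      if (progress.total) {
        updateStatus(`Loading model: ${Math.round(progress.loaded / progress.total * 100)}%`);
      }
    }
  }
);

// Generate audio for summary
const audio = await tts.generate(summaryText, {
  voice: settings.voice,
  speed: settings.speed
});

// Play via Web Audio API. This assumes generate() returns raw PCM as
// { audio: Float32Array, sampling_rate: number }, as kokoro-js does, so the
// samples are copied into an AudioBuffer before playback.
const audioContext = new AudioContext();
const buffer = audioContext.createBuffer(1, audio.audio.length, audio.sampling_rate);
buffer.copyToChannel(audio.audio, 0);
const source = audioContext.createBufferSource();
source.buffer = buffer;
source.connect(audioContext.destination);
source.start();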
        

Storage Architecture

chrome.storage.local vs. chrome.storage.sync

Storage Type	Use Case	Limit	Syncs Across Devices?
`local`	Cached models, large data	10 MB	No
`sync`	User settings, preferences	100 KB	Yes (if Chrome Sync enabled)

Storage Schema:

// Settings object (chrome.storage.local)
{
  settings: {
    voice: 'af_sky',           // Selected TTS voice
    speed: 1.0,                // Playback speed (0.5-2.0)
    pitch: 1.0,                // Pitch adjustment
    pitchCorrection: true,     // Auto-adjust pitch with speed
    autoplay: true,            // Auto-play TTS after summary
    summaryLength: 'medium',   // short | medium | long
    geminiApiKey: '',          // Optional Gemini API key
    theme: 'dark',             // dark | light | auto
    infoPrintout: false        // Show status text
  }
}

// Model cache (chrome.storage.local)
{
  modelCached: true,           // Has model been downloaded?
  modelVersion: '1.0',         // Model version for cache invalidation
  lastUsed: 1704067200000      // Timestamp for cleanup
}
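
Values read back from storage may be missing or out of range, for example after an upgrade that adds a new setting. The sketch below shows one way to merge stored settings with defaults and enforce the ranges noted in the schema comments. normalizeSettings and isModelCacheCurrent are hypothetical helpers, and the default values are copied from the schema above.

```javascript
// Defaults copied from the storage schema.
const DEFAULT_SETTINGS = {
  voice: 'af_sky', speed: 1.0, pitch: 1.0, pitchCorrection: true,
  autoplay: true, summaryLength: 'medium', geminiApiKey: '',
  theme: 'dark', infoPrintout: false
};
const SUMMARY_LENGTHS = ['short', 'medium', 'long'];
const THEMES = ['dark', 'light', 'auto'];

// Hypothetical helper: fill in missing keys and repair invalid values.
function normalizeSettings(stored = {}) {
  const merged = { ...DEFAULT_SETTINGS, ...stored };

  // Clamp playback speed to the documented 0.5-2.0 range.
  const speed = Number(merged.speed);
  merged.speed = Number.isFinite(speed)
    ? Math.min(2.0, Math.max(0.5, speed))
    : DEFAULT_SETTINGS.speed;

  if (!SUMMARY_LENGTHS.includes(merged.summaryLength)) {
    merged.summaryLength = DEFAULT_SETTINGS.summaryLength;
  }
  if (!THEMES.includes(merged.theme)) {
    merged.theme = DEFAULT_SETTINGS.theme;
  }
  return merged;
}

// Hypothetical helper: the cache is usable only if the model was downloaded
// and its recorded version matches the version the extension expects.
function isModelCacheCurrent(cache, expectedVersion) {
  return Boolean(cache && cache.modelCached) && cache.modelVersion === expectedVersion;
}
```

A caller would pass the object returned by chrome.storage through normalizeSettings before using it, and re-download the model whenever isModelCacheCurrent returns false.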
        

CSP Compliance & Static Bundling

Content Security Policy Challenges

Manifest V3's strict CSP prevents:

❌ Loading scripts from CDNs
❌ Using eval() or new Function()
❌ Inline scripts (hashes and nonces cannot re-enable them in extension pages)
❌ Remotely hosted code (models are data rather than code, but TLD2 bundles them as well so TTS works offline)

TLD2's CSP Solutions

1. Static Library Bundling

All dependencies bundled at build time using esbuild:

// build.js
import * as esbuild from 'esbuild';

await esbuild.build({
  entryPoints: [
    'sidepanel/sidepanel.js',
    'background/service-worker.js'
  ],
  bundle: true,
  outdir: 'dist',
  format: 'esm',
  target: 'chrome120',
  minify: true,
  // Bundle external dependencies
  external: [],  // Nothing external - bundle everything
});
        

2. Local Model Storage

ONNX models included in extension package:

extension/
├── models/
│   └── kokoro/
│       ├── model_q8f16.onnx       // 86MB
│       ├── config.json
│       ├── tokenizer.json
│       └── voices-v1.0.bin
├── lib/
│   ├── onnxruntime-web/
│   │   ├── ort-wasm.wasm
│   │   └── ort-wasm-simd.wasm
│   └── readability/
│       └── Readability.js
└── manifest.json
        

3. WASM Path Override

Point ONNX Runtime to local WASM files:

// Override WASM paths for CSP compliance
import * as ort from 'onnxruntime-web';
import { env } from '@xenova/transformers';

// A string prefix makes ONNX Runtime load all of its .wasm files
// from this directory inside the extension package
ort.env.wasm.wasmPaths = chrome.runtime.getURL('lib/onnxruntime-web/');

// Disable remote model loading (a Transformers.js setting;
// ONNX Runtime itself has no such option)
env.allowRemoteModels = false;
        

Permissions & Security

Required Permissions Breakdown

Permission	Purpose	Security Implication
`activeTab`	Read content of active tab when TLD2 is triggered	Only accesses pages user explicitly activates TLD2 on
`scripting`	Inject Readability.js content script	Required for content extraction; runs in isolated context
`storage`	Save user settings and cache models	Local only; no remote sync without user consent
`sidePanel`	Display sidebar UI	No sensitive data access; UI-only permission
`contextMenus`	Add right-click "Summarize" option	No data access; UX enhancement only

Security Best Practices

✅ No remote code execution
✅ API keys stored locally via chrome.storage (which does not encrypt data at rest)
✅ No analytics or telemetry
✅ No external network requests (in local AI mode)
✅ Minimal permissions (no <all_urls> or tabs)

Performance Optimizations

1. Lazy Loading

Models loaded only when TTS is first used:

// Lazy TTS initialization
let ttsInstance = null;

async function getTTS() {
  if (!ttsInstance) {
    ttsInstance = await KokoroTTS.from_pretrained(modelPath, options);
  }
  return ttsInstance;
}
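
One caveat with the pattern above: if getTTS is called twice before the first load finishes, the model is loaded twice, because ttsInstance is still null during the await. A common remedy is to cache the promise rather than the resolved instance. createLazyLoader below is a hypothetical helper sketching that idea.

```javascript
// Hypothetical helper: wrap an async factory so concurrent callers share
// a single in-flight load, and a failed load can be retried later.
function createLazyLoader(factory) {
  let pending = null;
  return function load() {
    if (!pending) {
      // The executor runs synchronously, so `pending` is set before any
      // second caller can arrive; a throwing factory becomes a rejection.
      pending = new Promise((resolve) => resolve(factory()))
        .catch((error) => {
          pending = null; // Allow a retry after a failed load
          throw error;
        });
    }
    return pending;
  };
}

// Usage (names taken from the example above):
// const getTTS = createLazyLoader(() => KokoroTTS.from_pretrained(modelPath, options));
```

Every caller awaits the same promise, so the model download and initialization happen at most once per successful load.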
        

2. Streaming Architecture

Summary text streams to UI while TTS chunks process in parallel:

Summary displays incrementally (perceived speed boost)
TTS generates first audio chunk while later text still streaming
Playback starts before full article synthesized
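
The text chunking step that feeds the TTS engine can be sketched as a sentence-aligned splitter. This is an illustration under stated assumptions rather than TLD2's actual chunker: chunkText is a hypothetical helper, and the default of 200 characters per chunk is an arbitrary choice.

```javascript
// Hypothetical helper: split text into sentence-aligned chunks of at most
// maxChars characters. A single sentence longer than maxChars stays whole.
function chunkText(text, maxChars = 200) {
  // Split after sentence-ending punctuation that is followed by whitespace.
  const sentences = text
    .split(/(?<=[.!?])\s+/)
    .map((sentence) => sentence.trim())
    .filter(Boolean);

  const chunks = [];
  let current = '';
  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (current && candidate.length > maxChars) {
      chunks.push(current);   // Current chunk is full; start a new one
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Smaller chunks shorten the wait before the first audio plays, at the cost of more synthesis calls; splitting on sentence boundaries avoids cutting a sentence between two audio segments.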

3. GPU Acceleration

WebGPU can provide a 2-10x speedup over WASM for TTS synthesis, depending on hardware. See GPU Acceleration Guide for details.

Technology Stack

Technology	Version	Purpose
Chrome Extension API	Manifest V3	Extension framework
@mozilla/readability	0.5.0	Article content extraction
kokoro-js	1.0.0	Neural TTS engine
@xenova/transformers	2.17.2	ML model loading (Transformers.js)
onnxruntime-web	1.20.0	Neural network inference
esbuild	0.25.10	Build tool & bundler

Build Process

# Install dependencies
npm install

# Bundle extension
npm run build

# Output structure
dist/
├── manifest.json
├── background/
│   └── service-worker.js    // Bundled
├── sidepanel/
│   ├── sidepanel.html
│   ├── sidepanel.js         // Bundled
│   └── sidepanel.css
├── lib/                     // Static dependencies
└── models/                  // ONNX models