AppleLamps/Grokipedia
Grokipedia

Grokipedia is a proof-of-concept application that rewrites Wikipedia articles using the OpenRouter API (x-ai/grok-4-fast) while fetching source content through Firecrawl. The objective is to produce fact-focused, bias-aware summaries that align with Grokipedia’s “maximum truth” philosophy.

The project consists of:

  • Backend (backend/) — An Express server that validates Wikipedia URLs, scrapes article content with Firecrawl, enriches requests with configurable options (prompt caching, usage tracking, structured outputs, web search, tool calls), and returns rewritten markdown. It also supports server-sent event (SSE) streaming and has Jest/Supertest integration tests.
  • Frontend (frontend/) — A lightweight HTML/CSS/JS client that collects URLs, exposes advanced controls, renders markdown (streaming or static), and displays usage/bias/context insights while persisting user preferences.

Architecture Overview

  1. The user submits a Wikipedia URL through the frontend.
  2. The backend calls Firecrawl to scrape the page in markdown format (main content only).
  3. The backend sends the scraped markdown and a Grokipedia-specific system prompt to OpenRouter’s chat completion endpoint, optionally enabling features like streaming, structured outputs, web search plugins, and tool calls.
  4. The rewritten article (and optional metadata) flows back to the frontend for rendering.

Supporting components include rate limiting, request logging with redaction, and helper utilities for hashing prompts and extracting sections during tool calls.

Features

  • Prompt transformation: Rewrites articles with a Grokipedia system prompt emphasizing truth-seeking, bias reduction, and context correction.
  • Usage telemetry: Returns token counts, cost estimates, and cache savings when available.
  • Streaming responses: Optional SSE proxy that relays OpenRouter streaming output to the browser.
  • Structured output mode: Requests JSON responses summarizing flagged biases, added context, and confidence notes.
  • Web search augmentation: Integrates OpenRouter web plugin / :online model variants for fresh citations.
  • Tool scaffold: Provides a lookup_section function call to re-fetch specific article sections if the model requests them.
  • Frontend UX: Advanced settings accordion with toggles for streaming, caching, web search, structured output, and sampling parameters; temperature/top-p range sliders with inline tooltips; insights sidebar with collapsible panels (usage, biases, context, tool activity, citations) that highlight matching spans in the article; preference persistence via localStorage; skeleton loader + staged status updates; copy-to-clipboard support, reading-time estimator, and auto-generated table of contents.
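The auto-generated table of contents mentioned above could be built by scanning the rewritten markdown for headings. The following is an assumed sketch only; the actual logic in frontend/main.js may differ in heading levels, slug rules, or DOM handling.

```javascript
// Sketch of a table-of-contents builder for rendered markdown
// (illustrative only; frontend/main.js may implement this differently).
function buildToc(markdown) {
  const toc = [];
  for (const line of markdown.split("\n")) {
    const match = /^(#{2,3})\s+(.*)/.exec(line);
    if (match) {
      toc.push({
        level: match[1].length,  // 2 = section, 3 = subsection
        title: match[2].trim(),
        // Slug used as an in-page anchor id.
        slug: match[2].trim().toLowerCase().replace(/[^\w]+/g, "-"),
      });
    }
  }
  return toc;
}
```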

Prerequisites

  • Node.js 18+ and npm
  • Firecrawl API key (keys begin with fc-...)
  • OpenRouter API key with access to x-ai/grok-4-fast

Repository Structure

backend/
  server.js
  services/
  utils/
  constants/
  __tests__/
  package.json
frontend/
  index.html
  main.js
  styles.css
plan.md
nextsteps.md
README.md
.gitignore

Environment Variables

Create backend/.env (see backend/.env.example) with:

FIRECRAWL_API_KEY=your_firecrawl_key
OPENROUTER_API_KEY=your_openrouter_key
# Optional tuning
MIDDLE_OUT_THRESHOLD=20000
RATE_LIMIT_WINDOW_MINUTES=1
RATE_LIMIT_MAX=30

Installation

Install backend dependencies:

cd backend
npm install

No build steps are required for the static frontend.

Running the Application

Backend

cd backend
npm start

This starts the Express server on port 3000 by default. Endpoints:

  • POST /api/grokify — Main rewrite endpoint
  • GET /health — Basic health check ({ status: "ok" })

Frontend

The frontend is pure HTML/JS/CSS. You can open frontend/index.html directly in a browser or serve it with a lightweight static server (e.g., VS Code Live Server, npx serve frontend, or similar). Ensure it can reach the backend at http://localhost:3000 (adjust via proxy or CORS if hosting separately).

API Reference

POST /api/grokify

Request body:

{
  "url": "https://en.wikipedia.org/wiki/Example",
  "stream": false,
  "settings": {
    "enablePromptCaching": false,
    "useMiddleOut": true,
    "structuredOutput": { "enabled": false },
    "webSearch": {
      "enabled": false,
      "maxResults": 3,
      "engine": "exa",
      "mode": "plugin"
    },
    "tooling": { "enabled": false },
    "temperature": 1.0,
    "top_p": 1.0,
    "max_tokens": 2048
  }
}
  • url must be a valid Wikipedia domain.
  • stream enables SSE streaming when true.
  • settings maps to OpenRouter request parameters; omit fields to use defaults.
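The domain check on url could look like the sketch below. This is an assumption for illustration; the server's actual validation (and its function name) may be stricter or more lenient.

```javascript
// Sketch of the Wikipedia URL validation described above (assumed
// implementation; the server's actual check may differ).
function isWikipediaArticleUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return false; // not a parseable URL at all
  }
  return (
    url.protocol === "https:" &&
    // Accept en.wikipedia.org, de.wikipedia.org, etc., but not
    // look-alike hosts that merely end in "wikipedia.org".
    /(^|\.)wikipedia\.org$/.test(url.hostname) &&
    url.pathname.startsWith("/wiki/")
  );
}
```

Checking the parsed hostname rather than doing a substring match avoids accepting hosts like `notwikipedia.org`.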

Response (non-streaming example):

{
  "content": "# Rewritten Article...",
  "usage": {
    "prompt_tokens": 1234,
    "completion_tokens": 567,
    "total_tokens": 1801,
    "cached_tokens": 400,
    "cache_discount": 0.1234,
    "cost": 0.0234
  },
  "annotations": [
    { "title": "Example Source", "url": "https://example.com" }
  ],
  "structured": null,
  "transforms": ["middle-out"],
  "toolResults": []
}

Error responses follow JSON { "error": "message" } with appropriate HTTP status codes.

Streaming

When stream=true, the endpoint returns SSE chunks compatible with OpenAI response streams. The frontend concatenates delta.content segments; a terminating data: [DONE] indicates completion.
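Concatenating delta.content segments from an SSE chunk might look like this sketch. The helper name is an assumption; the real parsing lives in frontend/main.js, and a production reader would also buffer partial lines across chunks.

```javascript
// Sketch of parsing one SSE chunk into text deltas (assumed helper;
// the frontend's actual stream handling may differ).
function parseSseChunk(chunk) {
  const deltas = [];
  let done = false;
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") {
      done = true;
      continue;
    }
    try {
      const parsed = JSON.parse(payload);
      const text = parsed.choices?.[0]?.delta?.content;
      if (text) deltas.push(text);
    } catch {
      // Incomplete JSON lines should be buffered by the caller; skipped here.
    }
  }
  return { deltas, done };
}
```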

Frontend Usage

  1. Open the frontend page and paste a Wikipedia URL (or use the sample link).
  2. Optionally expand Advanced settings to configure streaming, caching, structured outputs, web augmentation, or sampling controls.
  3. Click Grokify. A skeleton preview and staged status messages (“Fetching…”, “Analyzing…”, “Generating…”) appear while the article loads; streaming shows a “Streaming…” badge with live tokens.
  4. Use the Copy markdown button to copy the raw rewrite once available (feedback shows “Copied!” when successful).
  5. Review results and insights panels:
    • Usage shows token/cost metrics.
    • Bias corrections and Added context summarize structured findings; hovering/focusing these items highlights the corresponding spans in the article.
    • Tool activity surfaces any lookup_section calls.
    • Citations lists sources returned by OpenRouter.

Frontend preferences persist in localStorage so recurring users keep their settings.

Additional UI touches include an automatically generated table of contents for multi-section articles and an estimated reading time badge.
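The reading-time badge could be computed as in the sketch below. The ~200 words/minute rate and rounding are assumptions; the frontend's actual estimator may use different values.

```javascript
// Sketch of a reading-time estimate (assumed approach; the frontend's
// actual word rate and rounding may differ).
function readingTimeMinutes(markdown, wordsPerMinute = 200) {
  const words = markdown.split(/\s+/).filter(Boolean).length;
  // Round to the nearest minute, with a floor of 1 for short articles.
  return Math.max(1, Math.round(words / wordsPerMinute));
}
```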

Testing

Backend tests use Jest + Supertest:

cd backend
npm test

Tests cover standard responses, structured mode, middle-out transform, tool replanning, and streaming proxy behavior (mocked).

Development Notes

  • Rate limiting defaults to 30 requests/minute; adjust via environment variables.
  • Prompt caching is configured with cache_control metadata for providers that require it.
  • Logging uses redacted outputs; suppressed during test runs.
  • See plan.md for the original project plan and nextsteps.md for enhancement roadmap (usage telemetry drilldowns, diff viewer, caching inspection, etc.).
  • Advanced settings sliders (temperature/top-p) emit live values and include CSS-only tooltips for quick parameter guidance.

Troubleshooting

  • 400 Invalid URL — Ensure the URL points to a wikipedia.org host and a valid article path.
  • Missing API keys — Populate backend/.env with FIRECRAWL_API_KEY and OPENROUTER_API_KEY.
  • Streaming issues — Confirm the frontend is served from a host that can reach the backend without CORS blocks. Disable streaming to fall back to classic responses.
  • Structured output parsing errors — Structured mode expects valid JSON from the model; if parsing fails, the raw markdown is displayed instead.

License

License information is not specified; add one here if required before publishing.
