Grokipedia is a proof-of-concept application that rewrites Wikipedia articles using the OpenRouter API (`x-ai/grok-4-fast`) while fetching source content through Firecrawl. The objective is to produce fact-focused, bias-aware summaries that align with Grokipedia’s “maximum truth” philosophy.
The project consists of:
- Backend (`backend/`) — An Express server that validates Wikipedia URLs, scrapes article content with Firecrawl, enriches requests with configurable options (prompt caching, usage tracking, structured outputs, web search, tool calls), and returns rewritten markdown. It also supports server-sent event (SSE) streaming and has Jest/Supertest integration tests.
- Frontend (`frontend/`) — A lightweight HTML/CSS/JS client that collects URLs, exposes advanced controls, renders markdown (streaming or static), and displays usage/bias/context insights while persisting user preferences.
- The user submits a Wikipedia URL through the frontend.
- The backend calls Firecrawl to scrape the page in markdown format (main content only).
- The backend sends the scraped markdown and a Grokipedia-specific system prompt to OpenRouter’s chat completion endpoint, optionally enabling features like streaming, structured outputs, web search plugins, and tool calls.
- The rewritten article (and optional metadata) flows back to the frontend for rendering.
Supporting components include rate limiting, request logging with redaction, and helper utilities for hashing prompts and extracting sections during tool calls.
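The payload-assembly step of this flow can be sketched as follows. This is a hedged illustration, not the project's actual code: the names `buildOpenRouterPayload` and `SYSTEM_PROMPT`, and the exact prompt wording, are assumptions.

```javascript
// Illustrative sketch of how the backend might turn scraped markdown plus
// user settings into an OpenRouter chat-completion request body.
// SYSTEM_PROMPT wording and function name are assumptions, not project code.
const SYSTEM_PROMPT =
  "Rewrite the following Wikipedia article with a focus on factual accuracy, " +
  "bias reduction, and missing context.";

function buildOpenRouterPayload(markdown, settings = {}) {
  const payload = {
    model: "x-ai/grok-4-fast",
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: markdown },
    ],
    temperature: settings.temperature ?? 1.0,
    top_p: settings.top_p ?? 1.0,
    max_tokens: settings.max_tokens ?? 2048,
    stream: Boolean(settings.stream),
  };
  // OpenRouter's middle-out transform compresses oversized prompts server-side.
  if (settings.useMiddleOut) payload.transforms = ["middle-out"];
  return payload;
}
```

The real server additionally wires in structured-output schemas, web-search plugins, and tool definitions when the corresponding settings are enabled.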
- Prompt transformation: Rewrites with Grokipedia prompt emphasizing truth-seeking, bias reduction, and context correction.
- Usage telemetry: Returns token counts, cost estimates, and cache savings when available.
- Streaming responses: Optional SSE proxy that relays OpenRouter streaming output to the browser.
- Structured output mode: Requests JSON responses summarizing flagged biases, added context, and confidence notes.
- Web search augmentation: Integrates the OpenRouter `web` plugin / `:online` model variants for fresh citations.
- Tool scaffold: Provides a `lookup_section` function call to re-fetch specific article sections if the model requests them.
- Frontend UX: Advanced settings accordion with toggles for streaming, caching, web search, structured output, and sampling parameters; temperature/top-p range sliders with inline tooltips; insights sidebar with collapsible panels (usage, biases, context, tool activity, citations) that highlight matching spans in the article; preference persistence via `localStorage`; skeleton loader + staged status updates; copy-to-clipboard support, reading-time estimator, and auto-generated table of contents.
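The preference persistence mentioned above can be sketched as follows. The storage key `grokipedia:prefs` and the injectable `storage` parameter are illustrative assumptions; the real frontend presumably calls `localStorage` directly.

```javascript
// Hypothetical sketch of saving/loading user preferences. `storage` is any
// object with the Web Storage getItem/setItem shape (e.g. window.localStorage).
const PREFS_KEY = "grokipedia:prefs";

function savePrefs(storage, prefs) {
  storage.setItem(PREFS_KEY, JSON.stringify(prefs));
}

function loadPrefs(storage, defaults = {}) {
  try {
    const raw = storage.getItem(PREFS_KEY);
    // Merge over defaults so newly added settings still get a value.
    return raw ? { ...defaults, ...JSON.parse(raw) } : { ...defaults };
  } catch {
    return { ...defaults }; // corrupted JSON falls back to defaults
  }
}
```

Injecting the storage object keeps the logic testable outside a browser; in the page itself, `loadPrefs(localStorage, DEFAULTS)` would run once on startup.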
- Node.js 18+ (tested with npm)
- Firecrawl API key (provided example: `fc-...`)
- OpenRouter API key with access to `x-ai/grok-4-fast`
```
backend/
  server.js
  services/
  utils/
  constants/
  __tests__/
  package.json
frontend/
  index.html
  main.js
  styles.css
plan.md
nextsteps.md
README.md
.gitignore
```
Create `backend/.env` (see `backend/.env.example`) with:
```
FIRECRAWL_API_KEY=your_firecrawl_key
OPENROUTER_API_KEY=your_openrouter_key

# Optional tuning
MIDDLE_OUT_THRESHOLD=20000
RATE_LIMIT_WINDOW_MINUTES=1
RATE_LIMIT_MAX=30
```
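The fixed-window throttling that `RATE_LIMIT_WINDOW_MINUTES` and `RATE_LIMIT_MAX` control can be sketched as follows. This is a simplified illustration, not the project's actual middleware; real deployments typically lean on a library such as `express-rate-limit`.

```javascript
// Illustrative fixed-window rate limiter: at most `max` requests per client
// per `windowMs` window. Defaults mirror the env defaults above
// (30 requests per 1-minute window).
function createRateLimiter({ windowMs = 60_000, max = 30 } = {}) {
  const hits = new Map(); // client key -> { count, windowStart }
  return function allow(key, now = Date.now()) {
    const entry = hits.get(key);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: now }); // start a fresh window
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}
```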
Install backend dependencies:

```bash
cd backend
npm install
```

No build steps are required for the static frontend.
```bash
cd backend
npm start
```

This starts the Express server on port 3000 by default. Endpoints:

- `POST /api/grokify` — Main rewrite endpoint
- `GET /health` — Basic health check (`{ status: "ok" }`)
The frontend is pure HTML/JS/CSS. You can open `frontend/index.html` directly in a browser or serve it with a lightweight static server (e.g., VS Code Live Server, `npx serve frontend`, or similar). Ensure it can reach the backend at `http://localhost:3000` (adjust via proxy or CORS if hosting separately).
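A minimal sketch of the frontend's non-streaming call to the backend, under the request shape documented below. The function name and the injectable `fetchImpl` parameter are illustrative (injection just makes the sketch exercisable without a running server); the real `main.js` may differ.

```javascript
// Hypothetical non-streaming client call. Node 18+ / browsers provide
// global fetch; fetchImpl is injectable for testing.
async function grokify(url, settings = {}, fetchImpl = fetch) {
  const res = await fetchImpl("http://localhost:3000/api/grokify", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url, stream: false, settings }),
  });
  const body = await res.json();
  // Errors come back as { "error": "message" } with a non-2xx status.
  if (!res.ok) throw new Error(body.error || `HTTP ${res.status}`);
  return body;
}
```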
Request body:
```json
{
  "url": "https://en.wikipedia.org/wiki/Example",
  "stream": false,
  "settings": {
    "enablePromptCaching": false,
    "useMiddleOut": true,
    "structuredOutput": { "enabled": false },
    "webSearch": {
      "enabled": false,
      "maxResults": 3,
      "engine": "exa",
      "mode": "plugin"
    },
    "tooling": { "enabled": false },
    "temperature": 1.0,
    "top_p": 1.0,
    "max_tokens": 2048
  }
}
```

- `url` must be a valid Wikipedia domain.
- `stream` enables SSE streaming when `true`.
- `settings` maps to OpenRouter request parameters; omit fields to use defaults.
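The Wikipedia-domain check on `url` can be sketched like this. It is an assumption about the validation logic, not the backend's actual code.

```javascript
// Illustrative URL validation: accept http(s) URLs whose host is
// wikipedia.org or any of its subdomains (en.wikipedia.org, de.wikipedia.org, ...).
function isWikipediaUrl(input) {
  try {
    const { protocol, hostname } = new URL(input);
    return (
      (protocol === "https:" || protocol === "http:") &&
      (hostname === "wikipedia.org" || hostname.endsWith(".wikipedia.org"))
    );
  } catch {
    return false; // not parseable as a URL at all
  }
}
```

Checking the parsed hostname (rather than doing a substring match on the raw string) avoids false positives like `evilwikipedia.org`.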
Response (non-streaming example):
```json
{
  "content": "# Rewritten Article...",
  "usage": {
    "prompt_tokens": 1234,
    "completion_tokens": 567,
    "total_tokens": 1801,
    "cached_tokens": 400,
    "cache_discount": 0.1234,
    "cost": 0.0234
  },
  "annotations": [
    { "title": "Example Source", "url": "https://example.com" }
  ],
  "structured": null,
  "transforms": ["middle-out"],
  "toolResults": []
}
```

Error responses follow the JSON shape `{ "error": "message" }` with appropriate HTTP status codes.
When `stream=true`, the endpoint returns SSE chunks compatible with OpenAI-style response streams. The frontend concatenates `delta.content` segments; a terminating `data: [DONE]` event indicates completion.
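The accumulation logic can be sketched as a pure function over received SSE lines. This is a simplified illustration (the real frontend presumably parses incrementally from a `ReadableStream` as bytes arrive):

```javascript
// Illustrative folding of OpenAI-style SSE lines into the final article text.
function accumulateSse(lines) {
  let text = "";
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue; // skip comments/keep-alives
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // stream terminator sentinel
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}
```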
- Open the frontend page and paste a Wikipedia URL (or use the sample link).
- Optionally expand Advanced settings to configure streaming, caching, structured outputs, web augmentation, or sampling controls.
- Click Grokify. A skeleton preview and staged status messages (“Fetching…”, “Analyzing…”, “Generating…”) appear while the article loads; streaming shows a “Streaming…” badge with live tokens.
- Use the Copy markdown button to copy the raw rewrite once available (feedback shows “Copied!” when successful).
- Review results and insights panels:
- Usage shows token/cost metrics.
- Bias corrections and Added context summarize structured findings; hovering/focusing these items highlights the corresponding spans in the article.
- Tool activity surfaces any `lookup_section` calls.
- Citations lists sources returned by OpenRouter.
Frontend preferences persist in `localStorage` so returning users keep their settings.
Additional UI touches include an automatically generated table of contents for multi-section articles and an estimated reading time badge.
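The auto-generated table of contents can be sketched by scanning the rewritten markdown for ATX headings. This is an illustrative assumption about the approach, not the frontend's actual implementation.

```javascript
// Illustrative TOC extraction from markdown headings. Note: a naive
// line scan like this would also match "#" lines inside fenced code blocks;
// a production version should skip fenced regions.
function buildToc(markdown) {
  const toc = [];
  for (const line of markdown.split("\n")) {
    const m = /^(#{1,6})\s+(.+)$/.exec(line);
    if (m) toc.push({ level: m[1].length, title: m[2].trim() });
  }
  return toc;
}
```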
Backend tests use Jest + Supertest:
```bash
cd backend
npm test
```

Tests cover standard responses, structured mode, the middle-out transform, tool replanning, and streaming proxy behavior (mocked).
- Rate limiting defaults to 30 requests/minute; adjust via environment variables.
- Prompt caching is configured with `cache_control` metadata for providers that require it.
- Logging uses redacted outputs and is suppressed during test runs.
- See `plan.md` for the original project plan and `nextsteps.md` for the enhancement roadmap (usage telemetry drilldowns, diff viewer, caching inspection, etc.).
- Advanced settings sliders (temperature/top-p) emit live values and include CSS-only tooltips for quick parameter guidance.
- 400 Invalid URL — Ensure the URL's host ends with `wikipedia.org` and points to a valid article.
- Missing API keys — Populate `backend/.env` with `FIRECRAWL_API_KEY` and `OPENROUTER_API_KEY`.
- Streaming issues — Confirm the frontend is served from a host that can reach the backend without CORS blocks. Disable streaming to fall back to classic responses.
- Structured output parsing errors — Structured mode expects valid JSON from the model; if parsing fails, the raw markdown is displayed instead.
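The parse-or-fall-back behavior can be sketched as follows (function name is illustrative):

```javascript
// Illustrative structured-output fallback: try to parse the model reply as
// JSON; if parsing fails, surface the raw markdown instead.
function parseStructured(raw) {
  try {
    return { structured: JSON.parse(raw), content: null };
  } catch {
    return { structured: null, content: raw }; // show raw markdown on failure
  }
}
```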
License information is not specified; add one here if required before publishing.