1 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Per Stark | 9a623cbc3f | docs: evaluations instructions and readme refactoring | 2025-12-22 18:32:59 +01:00 |
7 changed files with 570 additions and 232 deletions

README.md

@@ -1,265 +1,61 @@
# Minne
**A graph-powered personal knowledge base that remembers for you.**
**Minne (Swedish for "memory")** is a personal knowledge management system and save-for-later application for capturing, organizing, and accessing your information. Inspired by the Zettelkasten method, it uses a graph database to automatically create connections between your notes without manual linking overhead.
Capture content effortlessly, let AI discover connections, and explore your knowledge visually. Self-hosted and privacy-focused.
[![Release Status](https://github.com/perstarkse/minne/actions/workflows/release.yml/badge.svg)](https://github.com/perstarkse/minne/actions/workflows/release.yml)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![Latest Release](https://img.shields.io/github/v/release/perstarkse/minne?sort=semver)](https://github.com/perstarkse/minne/releases/latest)
![Screenshot](screenshot-graph.webp)
## Demo deployment
To try _Minne_ out, visit [this read-only demo deployment](https://minne-demo.stark.pub) and explore its functionality.
## Noteworthy Features
- **Search & Chat Interface** - Find content or knowledge instantly with full-text search, or use the chat mode and conversational AI to find and reason about content
- **Manual and AI-assisted connections** - Build entities and relationships manually with full control, let AI create entities and relationships automatically, or blend both approaches with AI suggestions for manual approval
- **Hybrid Retrieval System** - Search combining vector similarity, full-text search, and graph traversal for highly relevant results
- **Scratchpad Feature** - Quickly capture thoughts and convert them to permanent content when ready
- **Visual Graph Explorer** - Interactive D3-based navigation of your knowledge entities and connections
- **Multi-Format Support** - Ingest text, URLs, PDFs, audio files, and images into your knowledge base
- **Performance Focus** - Built with Rust and server-side rendering for speed and efficiency
- **Self-Hosted & Privacy-Focused** - Full control over your data, and compatible with any OpenAI-compatible API that supports structured outputs
## The "Why" Behind Minne
For a while I've been fascinated by personal knowledge management systems. I wanted something that made it incredibly easy to capture content - snippets of text, URLs, and other media - while automatically discovering connections between ideas. But I also wanted to maintain control over my knowledge structure.
Traditional tools like Logseq and Obsidian are excellent, but the manual linking process often became a hindrance. Meanwhile, fully automated systems sometimes miss important context or create relationships I wouldn't have chosen myself.
So I built Minne to offer the best of both worlds: effortless content capture with AI-assisted relationship discovery, but with the flexibility to manually curate, edit, or override any connections. You can let AI handle the heavy lifting of extracting entities and finding relationships, take full control yourself, or use a hybrid approach where AI suggests connections that you can approve or modify.
While developing Minne, I discovered [KaraKeep](https://github.com/karakeep-app/karakeep) (formerly Hoarder), an excellent application in a similar space; you should check it out! However, if you're interested in a PKM that offers both intelligent automation and manual curation, with the ability to chat with your knowledge base, then Minne might be worth testing.
## Table of Contents
- [Quick Start](#quick-start)
- [Features in Detail](#features-in-detail)
- [Configuration](#configuration)
- [Tech Stack](#tech-stack)
- [Application Architecture](#application-architecture)
- [AI Configuration](#ai-configuration--model-selection)
- [Roadmap](#roadmap)
- [Development](#development)
- [Contributing](#contributing)
- [License](#license)
## Quick Start
The fastest way to get Minne running is with Docker Compose:
```bash
# Clone the repository
git clone https://github.com/perstarkse/minne.git
cd minne

# Set your OpenAI API key in docker-compose.yml, then start Minne and its database:
docker compose up -d

# Open http://localhost:3000
```
**Required Setup:**
- Replace `your_openai_api_key_here` in `docker-compose.yml` with your actual API key
- Configure `OPENAI_BASE_URL` if using a custom AI provider (like Ollama)
For detailed installation options, see [Configuration](#configuration).
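If you're unsure where the key lives, the relevant part of a compose file generally looks like the sketch below; the service name and exact layout here are illustrative, so check the repository's actual `docker-compose.yml`:

```yaml
# Hypothetical excerpt, not copied from Minne's docker-compose.yml
services:
  minne:                  # service name is illustrative
    environment:
      OPENAI_API_KEY: "your_openai_api_key_here"   # replace with your real key
      # Uncomment for custom providers such as Ollama:
      # OPENAI_BASE_URL: "http://host.docker.internal:11434/v1"
```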
## Features in Detail
### Search vs. Chat mode
**Search** - Use when you know roughly what you're looking for. Full-text search finds items quickly by matching your query terms.
**Chat Mode** - Use when you want to explore concepts, find connections, or reason about your knowledge. The AI analyzes your query and finds relevant context across your entire knowledge base.
### Content Processing
Minne automatically processes content you save:
1. **Web scraping** extracts readable text from URLs
2. **Text analysis** identifies key concepts and relationships
3. **Graph creation** builds connections between related content
4. **Embedding generation** enables semantic search capabilities
### Visual Knowledge Graph
Explore your knowledge as an interactive network with flexible curation options:
**Manual Curation** - Create knowledge entities and relationships yourself with full control over your graph structure
**AI Automation** - Let AI automatically extract entities and discover relationships from your content
**Hybrid Approach** - Get AI-suggested relationships and entities that you can manually review, edit, or approve
The graph visualization shows:
- Knowledge entities as nodes (manually created or AI-extracted)
- Relationships as connections (manually defined, AI-discovered, or suggested)
- Interactive navigation for discovery and editing
### Optional FastEmbed Reranking
Minne ships with an opt-in reranking stage powered by [fastembed-rs](https://github.com/Anush008/fastembed-rs). When enabled, the hybrid retrieval results are rescored with a lightweight cross-encoder before being returned to chat or ingestion flows. In practice this often means more relevant results, boosting answer quality and downstream enrichment.
⚠️ **Resource notes**
- Enabling reranking downloads and caches ~1.1GB of model data on first startup (cached under `<data_dir>/fastembed/reranker` by default).
- Initialization takes longer while warming the cache, and each query consumes extra CPU. The default pool size (2) is tuned for a single-user setup, but a pool size of 1 can also work.
- The feature is disabled by default. Set `reranking_enabled: true` (or `RERANKING_ENABLED=true`) if you're comfortable with the additional footprint.
Example configuration:
```yaml
reranking_enabled: true
reranking_pool_size: 2
fastembed_cache_dir: "/var/lib/minne/fastembed" # optional override; defaults to <data_dir>/fastembed/reranker
```
## Tech Stack
- **Backend:** Rust with Axum framework and Server-Side Rendering (SSR)
- **Frontend:** HTML with HTMX and minimal JavaScript for interactivity
- **Database:** SurrealDB (graph, document, and vector search)
- **AI Integration:** OpenAI-compatible API with structured outputs
- **Web Processing:** Headless Chrome for robust webpage content extraction
## Configuration
Minne can be configured using environment variables or a `config.yaml` file. Environment variables take precedence over `config.yaml`.
### Required Configuration
- `SURREALDB_ADDRESS`: WebSocket address of your SurrealDB instance (e.g., `ws://127.0.0.1:8000`)
- `SURREALDB_USERNAME`: Username for SurrealDB (e.g., `root_user`)
- `SURREALDB_PASSWORD`: Password for SurrealDB (e.g., `root_password`)
- `SURREALDB_DATABASE`: Database name in SurrealDB (e.g., `minne_db`)
- `SURREALDB_NAMESPACE`: Namespace in SurrealDB (e.g., `minne_ns`)
- `OPENAI_API_KEY`: Your API key for OpenAI compatible endpoint
- `HTTP_PORT`: Port for the Minne server (Default: `3000`)
### Optional Configuration
- `RUST_LOG`: Controls logging level (e.g., `minne=info,tower_http=debug`)
- `DATA_DIR`: Directory to store local data (e.g., `./data`)
- `OPENAI_BASE_URL`: Base URL for custom AI providers (like Ollama)
- `RERANKING_ENABLED` / `reranking_enabled`: Set to `true` to enable the FastEmbed reranking stage (default `false`)
- `RERANKING_POOL_SIZE` / `reranking_pool_size`: Maximum concurrent reranker workers (defaults to `2`)
- `FASTEMBED_CACHE_DIR` / `fastembed_cache_dir`: Directory for cached FastEmbed models (defaults to `<data_dir>/fastembed/reranker`)
- `FASTEMBED_SHOW_DOWNLOAD_PROGRESS` / `fastembed_show_download_progress`: Show model download progress when warming the cache (default `true`)
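The cache default chains off `DATA_DIR`. Here is a small shell sketch of how the documented fallbacks resolve (the resolution logic is illustrative; Minne performs this internally):

```shell
# Start from a clean slate so the documented defaults apply
unset DATA_DIR FASTEMBED_CACHE_DIR

# DATA_DIR falls back to ./data; the FastEmbed cache falls back to
# <data_dir>/fastembed/reranker, matching the defaults listed above
DATA_DIR="${DATA_DIR:-./data}"
FASTEMBED_CACHE_DIR="${FASTEMBED_CACHE_DIR:-$DATA_DIR/fastembed/reranker}"
echo "$FASTEMBED_CACHE_DIR"
```

With neither variable set, this prints `./data/fastembed/reranker`, the default noted above.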
### Example config.yaml
```yaml
surrealdb_address: "ws://127.0.0.1:8000"
surrealdb_username: "root_user"
surrealdb_password: "root_password"
surrealdb_database: "minne_db"
surrealdb_namespace: "minne_ns"
openai_api_key: "sk-YourActualOpenAIKeyGoesHere"
data_dir: "./minne_app_data"
http_port: 3000
# rust_log: "info"
```
## Installation Options
### 1. Docker Compose (Recommended)
```bash
# Clone and run
git clone https://github.com/perstarkse/minne.git
cd minne
docker compose up -d
```
The included `docker-compose.yml` handles SurrealDB and Chromium dependencies automatically.
### 2. Nix
Run Minne directly with Nix:
```bash
nix run 'github:perstarkse/minne#main'
```
This fetches Minne and all dependencies, including Chromium.
### 3. Pre-built Binaries
Download binaries for Windows, macOS, and Linux from the [GitHub Releases](https://github.com/perstarkse/minne/releases/latest).
**Requirements:** You'll need to provide SurrealDB and Chromium separately.
### 4. Build from Source
```bash
git clone https://github.com/perstarkse/minne.git
cd minne
cargo run --release --bin main
```
**Requirements:** SurrealDB and Chromium must be installed and accessible in your PATH.
## Features
- **Search & Chat** — Full-text search or conversational AI to find and reason about content
- **Knowledge Graph** — Visual exploration with automatic or manual relationship curation
- **Hybrid Retrieval** — Vector similarity + full-text + graph traversal for relevant results
- **Multi-Format** — Ingest text, URLs, PDFs, audio, and images
- **Self-Hosted** — Your data, your server, any OpenAI-compatible API
## Documentation
| Guide | Description |
|-------|-------------|
| [Installation](docs/installation.md) | Docker, Nix, binaries, source builds |
| [Configuration](docs/configuration.md) | Environment variables, config.yaml, AI setup |
| [Features](docs/features.md) | Search, Chat, Graph, Reranking, Ingestion |
| [Architecture](docs/architecture.md) | Tech stack, crate structure, data flow |
| [Vision](docs/vision.md) | Philosophy, roadmap, related projects |
## Application Architecture
Minne offers flexible deployment options:
- **`main`**: Combined server and worker in one process (recommended for most users)
- **`server`**: Web interface and API only
- **`worker`**: Background processing only (for resource optimization)
## Usage
Once Minne is running at `http://localhost:3000`:
1. **Web Interface**: Full-featured experience for desktop and mobile
2. **iOS Shortcut**: Use the [Minne iOS Shortcut](https://www.icloud.com/shortcuts/e433fbd7602f4e2eaa70dca162323477) for quick content capture
3. **Content Types**: Save notes, URLs, audio files, and more
4. **Knowledge Graph**: Explore automatic connections between your content
5. **Chat Interface**: Query your knowledge base conversationally
## AI Configuration & Model Selection
### Setting Up AI Providers
Minne uses OpenAI-compatible APIs. Configure via environment variables or `config.yaml`:
- `OPENAI_API_KEY` (required): Your API key
- `OPENAI_BASE_URL` (optional): Custom provider URL (e.g., Ollama: `http://localhost:11434/v1`)
### Model Selection
1. Access the `/admin` page in your Minne instance
2. Select models for content processing and chat from your configured provider
3. **Content Processing Requirements**: The model must support structured outputs
4. **Embedding Dimensions**: Update this setting when changing embedding models (e.g., 1536 for `text-embedding-3-small`, 768 for `nomic-embed-text`)
## Roadmap
Current development focus:
- TUI frontend with system editor integration
- Enhanced reranking for improved retrieval recall
- Additional content type support
Feature requests and contributions are welcome!
## Development
```bash
# Run tests
cargo test
# Development build
cargo build
# Comprehensive linting
cargo clippy --workspace --all-targets --all-features
```
The codebase includes extensive unit tests. Integration tests and additional contributions are welcome.
## Contributing
I've developed Minne primarily for my own use, but having been in the self-hosted space for a long time and benefited from others' efforts, I thought I'd share it with the community. Feature requests and contributions are welcome; see [Vision](docs/vision.md) for the roadmap.
## License
Minne is licensed under the **GNU Affero General Public License v3.0 (AGPL-3.0)**. See the [LICENSE](LICENSE) file for details.

docs/architecture.md Normal file

@@ -0,0 +1,74 @@
# Architecture
## Tech Stack
| Layer | Technology |
|-------|------------|
| Backend | Rust with Axum (SSR) |
| Frontend | HTML + HTMX + minimal JS |
| Database | SurrealDB (graph, document, vector) |
| AI | OpenAI-compatible API |
| Web Processing | Headless Chromium |
## Crate Structure
```
minne/
├── main/ # Combined server + worker binary
├── api-router/ # REST API routes
├── html-router/ # SSR web interface
├── ingestion-pipeline/ # Content processing pipeline
├── retrieval-pipeline/ # Search and retrieval logic
├── common/ # Shared types, storage, utilities
├── evaluations/ # Benchmarking framework
└── json-stream-parser/ # Streaming JSON utilities
```
## Process Modes
| Binary | Purpose |
|--------|---------|
| `main` | All-in-one: serves UI and processes content |
| `server` | UI and API only (no background processing) |
| `worker` | Background processing only (no UI) |
Split deployment is useful for scaling or resource isolation.
## Data Flow
```
Content In → Ingestion Pipeline → SurrealDB
                  ├── Entity Extraction
                  ├── Embedding Generation
                  └── Graph Relationships

Query → Retrieval Pipeline → Results
            ├── Vector Search + FTS + Graph
            └── RRF Fusion → (Optional Rerank) → Response
```
## Database Schema
SurrealDB stores:
- **TextContent** — Raw ingested content
- **TextChunk** — Chunked content with embeddings
- **KnowledgeEntity** — Extracted entities (people, concepts, etc.)
- **KnowledgeRelationship** — Connections between entities
- **User** — Authentication and preferences
- **SystemSettings** — Model configuration
Embeddings are stored in dedicated tables with HNSW indexes for fast vector search.
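As a rough illustration of what such an index looks like in SurrealQL (the table, field, and dimension below are assumptions rather than Minne's actual schema, and syntax details vary across SurrealDB versions):

```sql
-- Illustrative sketch only; not Minne's real schema
DEFINE INDEX chunk_embedding_idx ON TABLE text_chunk
    FIELDS embedding HNSW DIMENSION 1536 DIST COSINE;
```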
## Retrieval Strategy
1. **Collect candidates** — Vector similarity + full-text search
2. **Merge ranks** — Reciprocal Rank Fusion (RRF)
3. **Attach context** — Link chunks to parent entities
4. **Rerank** (optional) — Cross-encoder rescoring
5. **Return** — Top-k results with metadata
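Step 2's Reciprocal Rank Fusion is simple enough to sketch: each candidate scores the sum of 1/(k + rank) over every list it appears in, where k is a smoothing constant (60 is a common choice; the lists below are toy data, not Minne's internals):

```shell
# Toy Reciprocal Rank Fusion: score(doc) = sum over lists of 1/(k + rank)
best="$(awk 'BEGIN {
  n1 = split("a b c", l1, " ")    # e.g. ranks from vector search
  n2 = split("a c d", l2, " ")    # e.g. ranks from full-text search
  k = 60                          # RRF smoothing constant
  for (i = 1; i <= n1; i++) s[l1[i]] += 1 / (k + i)
  for (i = 1; i <= n2; i++) s[l2[i]] += 1 / (k + i)
  best = ""
  for (d in s) if (best == "" || s[d] > s[best]) best = d
  print best
}')"
echo "$best"
```

Here `a` ranks near the top of both toy lists, so it tops the fused ranking even though neither list alone is authoritative.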

docs/configuration.md Normal file

@@ -0,0 +1,77 @@
# Configuration
Minne can be configured via environment variables or a `config.yaml` file. Environment variables take precedence.
## Required Settings
| Variable | Description | Example |
|----------|-------------|---------|
| `OPENAI_API_KEY` | API key for OpenAI-compatible endpoint | `sk-...` |
| `SURREALDB_ADDRESS` | WebSocket address of SurrealDB | `ws://127.0.0.1:8000` |
| `SURREALDB_USERNAME` | SurrealDB username | `root_user` |
| `SURREALDB_PASSWORD` | SurrealDB password | `root_password` |
| `SURREALDB_DATABASE` | Database name | `minne_db` |
| `SURREALDB_NAMESPACE` | Namespace | `minne_ns` |
## Optional Settings
| Variable | Description | Default |
|----------|-------------|---------|
| `HTTP_PORT` | Server port | `3000` |
| `DATA_DIR` | Local data directory | `./data` |
| `OPENAI_BASE_URL` | Custom AI provider URL | OpenAI default |
| `RUST_LOG` | Logging level | `info` |
### Reranking (Optional)
| Variable | Description | Default |
|----------|-------------|---------|
| `RERANKING_ENABLED` | Enable FastEmbed reranking | `false` |
| `RERANKING_POOL_SIZE` | Concurrent reranker workers | `2` |
| `FASTEMBED_CACHE_DIR` | Model cache directory | `<data_dir>/fastembed/reranker` |
> [!NOTE]
> Enabling reranking downloads ~1.1 GB of model data on first startup.
## Example config.yaml
```yaml
surrealdb_address: "ws://127.0.0.1:8000"
surrealdb_username: "root_user"
surrealdb_password: "root_password"
surrealdb_database: "minne_db"
surrealdb_namespace: "minne_ns"
openai_api_key: "sk-your-key-here"
data_dir: "./minne_data"
http_port: 3000
# Optional reranking
reranking_enabled: true
reranking_pool_size: 2
```
## AI Provider Setup
Minne works with any OpenAI-compatible API that supports structured outputs.
### OpenAI (Default)
Set `OPENAI_API_KEY` only. The default base URL points to OpenAI.
### Ollama
```bash
OPENAI_API_KEY="ollama"
OPENAI_BASE_URL="http://localhost:11434/v1"
```
### Other Providers
Any provider exposing an OpenAI-compatible endpoint works. Set `OPENAI_BASE_URL` accordingly.
## Model Selection
1. Access `/admin` in your Minne instance
2. Select models for content processing and chat
3. **Content Processing**: Must support structured outputs
4. **Embedding Dimensions**: Update when changing embedding models (e.g., 1536 for `text-embedding-3-small`)

docs/features.md Normal file

@@ -0,0 +1,64 @@
# Features
## Search vs Chat
**Search** — Use when you know what you're looking for. Full-text search matches query terms across your content.
**Chat** — Use when exploring concepts or reasoning about your knowledge. The AI analyzes your query and retrieves relevant context from your entire knowledge base.
## Content Processing
Minne automatically processes saved content:
1. **Web scraping** extracts readable text from URLs (via headless Chrome)
2. **Text analysis** identifies key concepts and relationships
3. **Graph creation** builds connections between related content
4. **Embedding generation** enables semantic search
## Knowledge Graph
Explore your knowledge as an interactive network:
- **Manual curation** — Create entities and relationships yourself
- **AI automation** — Let AI extract entities and discover relationships
- **Hybrid approach** — AI suggests connections for your approval
The D3-based graph visualization shows entities as nodes and relationships as edges.
## Hybrid Retrieval
Minne combines multiple retrieval strategies:
- **Vector similarity** — Semantic matching via embeddings
- **Full-text search** — Keyword matching with BM25
- **Graph traversal** — Following relationships between entities
Results are merged using Reciprocal Rank Fusion (RRF) for optimal relevance.
## Reranking (Optional)
When enabled, retrieval results are rescored with a cross-encoder model for improved relevance. Powered by [fastembed-rs](https://github.com/Anush008/fastembed-rs).
**Trade-offs:**
- Downloads ~1.1 GB of model data
- Adds latency per query
- Potentially improves answer quality; see the [blog post](https://blog.stark.pub/posts/eval-retrieval-refactor/)
Enable via `RERANKING_ENABLED=true`. See [Configuration](./configuration.md).
## Multi-Format Ingestion
Supported content types:
- Plain text and notes
- URLs (web pages)
- PDF documents
- Audio files
- Images
## Scratchpad
Quickly capture content without committing to permanent storage. Convert to full content when ready.
## iOS Shortcut
Use the [Minne iOS Shortcut](https://www.icloud.com/shortcuts/e433fbd7602f4e2eaa70dca162323477) for quick content capture from your phone.

docs/installation.md Normal file

@@ -0,0 +1,67 @@
# Installation
Minne can be installed through several methods. Choose the one that best fits your setup.
## Docker Compose (Recommended)
The fastest way to get Minne running with all dependencies:
```bash
git clone https://github.com/perstarkse/minne.git
cd minne
docker compose up -d
```
The included `docker-compose.yml` handles SurrealDB and Chromium automatically.
**Required:** Set your `OPENAI_API_KEY` in `docker-compose.yml` before starting.
## Nix
Run Minne directly with Nix (includes Chromium):
```bash
nix run 'github:perstarkse/minne#main'
```
Configure via environment variables or a `config.yaml` file. See [Configuration](./configuration.md).
## Pre-built Binaries
Download binaries for Windows, macOS, and Linux from [GitHub Releases](https://github.com/perstarkse/minne/releases/latest).
**Requirements:**
- SurrealDB instance (local or remote)
- Chromium (for web scraping)
## Build from Source
```bash
git clone https://github.com/perstarkse/minne.git
cd minne
cargo build --release --bin main
```
The binary will be at `target/release/main`.
**Requirements:**
- Rust toolchain
- SurrealDB accessible at configured address
- Chromium in PATH
## Process Modes
Minne offers flexible deployment:
| Binary | Description |
|--------|-------------|
| `main` | Combined server + worker (recommended) |
| `server` | Web interface and API only |
| `worker` | Background processing only |
For most users, `main` is the right choice. Split deployments are useful for resource optimization or scaling.
## Next Steps
- [Configuration](./configuration.md) — Environment variables and config.yaml
- [Features](./features.md) — What Minne can do

docs/vision.md Normal file

@@ -0,0 +1,48 @@
# Vision
## The "Why" Behind Minne
Personal knowledge management has always fascinated me. I wanted something that made it incredibly easy to capture content—snippets of text, URLs, media—while automatically discovering connections between ideas. But I also wanted control over my knowledge structure.
Traditional tools like Logseq and Obsidian are excellent, but manual linking often becomes a hindrance. Fully automated systems sometimes miss important context or create relationships I wouldn't have chosen.
Minne offers the best of both worlds: effortless capture with AI-assisted relationship discovery, but with flexibility to manually curate, edit, or override connections. Let AI handle the heavy lifting, take full control yourself, or use a hybrid approach where AI suggests and you approve.
## Design Principles
- **Capture should be instant** — No friction between thought and storage
- **Connections should emerge** — AI finds relationships you might miss
- **Control should be optional** — Automate by default, curate when it matters
- **Privacy should be default** — Self-hosted, your data stays yours
## Roadmap
### Near-term
- [ ] TUI frontend with system editor integration
- [ ] Enhanced retrieval recall via improved reranking
- [ ] Additional content type support (e-books, research papers)
### Medium-term
- [ ] Embedded SurrealDB option (zero-config `nix run` with just `OPENAI_API_KEY`)
- [ ] Browser extension for seamless capture
- [ ] Mobile-native apps
### Long-term
- [ ] Federated knowledge sharing (opt-in)
- [ ] Local LLM integration (fully offline operation)
- [ ] Plugin system for custom entity extractors
## Related Projects
If Minne isn't quite right for you, check out:
- [Karakeep](https://github.com/karakeep-app/karakeep) (formerly Hoarder) — Excellent bookmark/read-later with AI tagging
- [Logseq](https://logseq.com/) — Outliner-based PKM with manual linking
- [Obsidian](https://obsidian.md/) — Markdown-based PKM with plugin ecosystem
## Contributing
Feature requests and contributions are welcome. Minne was built for personal use first, but the self-hosted community benefits when we share.

evaluations/README.md Normal file

@@ -0,0 +1,212 @@
# Evaluations
The `evaluations` crate provides a retrieval evaluation framework for benchmarking Minne's information retrieval pipeline against standard datasets.
## Quick Start
```bash
# Run SQuAD v2.0 evaluation (vector-only, recommended)
cargo run --package evaluations -- --ingest-chunks-only
# Run a specific dataset
cargo run --package evaluations -- --dataset fiqa --ingest-chunks-only
# Convert dataset only (no evaluation)
cargo run --package evaluations -- --convert-only
```
## Prerequisites
### 1. SurrealDB
Start a SurrealDB instance before running evaluations:
```bash
docker-compose up -d surrealdb
```
Or start SurrealDB directly with the default credentials:
```bash
surreal start --user root_user --pass root_password
```
### 2. Download Raw Datasets
Raw datasets must be downloaded manually and placed in `evaluations/data/raw/`. See [Dataset Sources](#dataset-sources) below for links and formats.
## Directory Structure
```
evaluations/
├── data/
│ ├── raw/ # Downloaded raw datasets (manual)
│ │ ├── squad/ # SQuAD v2.0
│ │ ├── nq-dev/ # Natural Questions
│ │ ├── fiqa/ # BEIR: FiQA-2018
│ │ ├── fever/ # BEIR: FEVER
│ │ ├── hotpotqa/ # BEIR: HotpotQA
│ │ └── ... # Other BEIR subsets
│ └── converted/ # Auto-generated (Minne JSON format)
├── cache/ # Ingestion and embedding caches
├── reports/ # Evaluation output (JSON + Markdown)
├── manifest.yaml # Dataset and slice definitions
└── src/ # Evaluation source code
```
## Dataset Sources
### SQuAD v2.0
Download and place at `data/raw/squad/dev-v2.0.json`:
```bash
mkdir -p evaluations/data/raw/squad
curl -L https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json \
-o evaluations/data/raw/squad/dev-v2.0.json
```
### Natural Questions (NQ)
Download and place at `data/raw/nq-dev/dev-all.jsonl`:
```bash
mkdir -p evaluations/data/raw/nq-dev
# Download from Google's Natural Questions page or HuggingFace
# File: dev-all.jsonl (simplified JSONL format)
```
Source: [Google Natural Questions](https://ai.google.com/research/NaturalQuestions)
### BEIR Datasets
All BEIR datasets share the same directory structure:
```
data/raw/<dataset>/
├── corpus.jsonl # Document corpus
├── queries.jsonl # Query set
└── qrels/
└── test.tsv # Relevance judgments (or dev.tsv)
```
Download datasets from the [BEIR Benchmark repository](https://github.com/beir-cellar/beir). Each dataset zip extracts to the required directory structure.
| Dataset | Directory |
|------------|---------------|
| FEVER | `fever/` |
| FiQA-2018 | `fiqa/` |
| HotpotQA | `hotpotqa/` |
| NFCorpus | `nfcorpus/` |
| Quora | `quora/` |
| TREC-COVID | `trec-covid/` |
| SciFact | `scifact/` |
| NQ (BEIR) | `nq/` |
Example download:
```bash
cd evaluations/data/raw
curl -L https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/fiqa.zip -o fiqa.zip
unzip fiqa.zip && rm fiqa.zip
```
## Dataset Conversion
Raw datasets are automatically converted to Minne's internal JSON format on first run. To force reconversion:
```bash
cargo run --package evaluations -- --force-convert
```
Converted files are saved to `data/converted/` and cached for subsequent runs.
## CLI Reference
### Common Options
| Flag | Description | Default |
|------|-------------|---------|
| `--dataset <NAME>` | Dataset to evaluate | `squad-v2` |
| `--limit <N>` | Max questions to evaluate (0 = all) | `200` |
| `--k <N>` | Precision@k cutoff | `5` |
| `--slice <ID>` | Use a predefined slice from manifest | — |
| `--rerank` | Enable FastEmbed reranking stage | disabled |
| `--embedding-backend <BE>` | `fastembed` or `hashed` | `fastembed` |
| `--ingest-chunks-only` | Skip entity extraction, ingest only text chunks | disabled |
> [!TIP]
> Use `--ingest-chunks-only` when evaluating vector-only retrieval strategies. This skips the LLM-based entity extraction and graph generation, significantly speeding up ingestion while focusing on pure chunk-based vector search.
### Available Datasets
```
squad-v2, natural-questions, beir, fever, fiqa, hotpotqa,
nfcorpus, quora, trec-covid, scifact, nq-beir
```
### Database Configuration
| Flag | Environment | Default |
|------|-------------|---------|
| `--db-endpoint` | `EVAL_DB_ENDPOINT` | `ws://127.0.0.1:8000` |
| `--db-username` | `EVAL_DB_USERNAME` | `root_user` |
| `--db-password` | `EVAL_DB_PASSWORD` | `root_password` |
| `--db-namespace` | `EVAL_DB_NAMESPACE` | auto-generated |
| `--db-database` | `EVAL_DB_DATABASE` | auto-generated |
### Example Runs
```bash
# Vector-only evaluation (recommended for benchmarking)
cargo run --package evaluations -- \
--dataset fiqa \
--ingest-chunks-only \
--limit 200
# Full FiQA evaluation with reranking
cargo run --package evaluations -- \
--dataset fiqa \
--ingest-chunks-only \
--limit 500 \
--rerank \
--k 10
# Use a predefined slice for reproducibility
cargo run --package evaluations -- --slice fiqa-test-200 --ingest-chunks-only
# Run the mixed BEIR benchmark
cargo run --package evaluations -- --dataset beir --slice beir-mix-600 --ingest-chunks-only
```
## Slices
Slices are predefined, reproducible subsets defined in `manifest.yaml`. Each slice specifies:
- **limit**: Number of questions
- **corpus_limit**: Maximum corpus size
- **seed**: Fixed RNG seed for reproducibility
View available slices in [manifest.yaml](./manifest.yaml).
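Putting those fields together, a slice entry plausibly looks something like the sketch below; everything beyond `limit`, `corpus_limit`, and `seed` is guessed, so treat `manifest.yaml` itself as the source of truth:

```yaml
# Hypothetical shape, not copied from manifest.yaml
slices:
  fiqa-test-200:        # slice id referenced via --slice
    dataset: fiqa
    limit: 200          # number of questions
    corpus_limit: 5000  # maximum corpus size
    seed: 42            # fixed RNG seed for reproducibility
```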
## Reports
Evaluations generate reports in `reports/`:
- **JSON**: Full structured results (`*-report.json`)
- **Markdown**: Human-readable summary with sample mismatches (`*-report.md`)
- **History**: Timestamped run history (`history/`)
## Performance Tuning
```bash
# Log per-stage performance timings
cargo run --package evaluations -- --perf-log-console
# Save telemetry to file
cargo run --package evaluations -- --perf-log-json ./perf.json
```
## License
See [../LICENSE](../LICENSE).