Mirror of https://github.com/perstarkse/minne.git (synced 2026-02-22 16:17:40 +01:00)
# Configuration
Minne can be configured via environment variables or a `config.yaml` file. Environment variables take precedence.
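Precedence means a variable set in the environment overrides the same key in `config.yaml`. A minimal sketch (the `minne` binary name is an assumption here):

```shell
# config.yaml may set http_port: 3000, but the environment wins:
HTTP_PORT=8080 ./minne
```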
## Required Settings
| Variable | Description | Example |
|----------|-------------|---------|
| `OPENAI_API_KEY` | API key for an OpenAI-compatible endpoint | `sk-...` |
| `SURREALDB_ADDRESS` | WebSocket address of SurrealDB | `ws://127.0.0.1:8000` |
| `SURREALDB_USERNAME` | SurrealDB username | `root_user` |
| `SURREALDB_PASSWORD` | SurrealDB password | `root_password` |
| `SURREALDB_DATABASE` | Database name | `minne_db` |
| `SURREALDB_NAMESPACE` | Namespace | `minne_ns` |

## Optional Settings
| Variable | Description | Default |
|----------|-------------|---------|
| `HTTP_PORT` | Server port | `3000` |
| `DATA_DIR` | Local data directory | `./data` |
| `OPENAI_BASE_URL` | Custom AI provider URL | OpenAI default |
| `RUST_LOG` | Logging level | `info` |
| `STORAGE` | Storage backend (`local`, `memory`, `s3`) | `local` |
| `PDF_INGEST_MODE` | PDF ingestion strategy (`classic`, `llm-first`) | `llm-first` |
| `RETRIEVAL_STRATEGY` | Default retrieval strategy | - |
| `EMBEDDING_BACKEND` | Embedding provider (`openai`, `fastembed`) | `fastembed` |
| `FASTEMBED_CACHE_DIR` | Model cache directory | `<data_dir>/fastembed` |
| `FASTEMBED_SHOW_DOWNLOAD_PROGRESS` | Show a progress bar for model downloads | `false` |
| `FASTEMBED_MAX_LENGTH` | Max sequence length for FastEmbed models | - |
| `INGEST_MAX_BODY_BYTES` | Max request body size for ingest endpoints | `20000000` |
| `INGEST_MAX_FILES` | Max files allowed per ingest request | `5` |
| `INGEST_MAX_CONTENT_BYTES` | Max `content` field size for ingest requests | `262144` |
| `INGEST_MAX_CONTEXT_BYTES` | Max `context` field size for ingest requests | `16384` |
| `INGEST_MAX_CATEGORY_BYTES` | Max `category` field size for ingest requests | `128` |

### S3 Storage (Optional)

Used when `STORAGE` is set to `s3`.

| Variable | Description | Default |
|----------|-------------|---------|
| `S3_BUCKET` | S3 bucket name | - |
| `S3_ENDPOINT` | Custom endpoint (e.g. MinIO) | AWS default |
| `S3_REGION` | AWS region | `us-east-1` |
| `AWS_ACCESS_KEY_ID` | Access key | - |
| `AWS_SECRET_ACCESS_KEY` | Secret key | - |

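As a sketch, the S3 variables above might be combined like this for a local MinIO instance (the bucket name and the `minioadmin` credentials are illustrative assumptions, not defaults shipped by Minne):

```shell
# Illustrative environment for pointing Minne at a local MinIO endpoint
export STORAGE="s3"
export S3_BUCKET="minne"
export S3_ENDPOINT="http://localhost:9000"   # MinIO, not AWS
export S3_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="minioadmin"        # assumed MinIO credentials
export AWS_SECRET_ACCESS_KEY="minioadmin"    # assumed MinIO credentials
```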
|
### Reranking (Optional)

| Variable | Description | Default |
|----------|-------------|---------|
| `RERANKING_ENABLED` | Enable FastEmbed reranking | `false` |
| `RERANKING_POOL_SIZE` | Concurrent reranker workers | - |

> [!NOTE]
> Enabling reranking downloads ~1.1 GB of model data on first startup.
## Example config.yaml

```yaml
surrealdb_address: "ws://127.0.0.1:8000"
surrealdb_username: "root_user"
surrealdb_password: "root_password"
surrealdb_database: "minne_db"
surrealdb_namespace: "minne_ns"
openai_api_key: "sk-your-key-here"
data_dir: "./minne_data"
http_port: 3000

# Storage backend
storage: "local"
# storage: "s3"
# s3_bucket: "my-bucket"
# s3_endpoint: "http://localhost:9000" # Optional, for MinIO etc.
# s3_region: "us-east-1"

pdf_ingest_mode: "llm-first"
embedding_backend: "fastembed"

# Optional reranking
reranking_enabled: true
reranking_pool_size: 2

# Ingest safety limits
ingest_max_body_bytes: 20000000
ingest_max_files: 5
ingest_max_content_bytes: 262144
ingest_max_context_bytes: 16384
ingest_max_category_bytes: 128
```
## AI Provider Setup
Minne works with any OpenAI-compatible API that supports structured outputs.
### OpenAI (Default)
Set `OPENAI_API_KEY` only. The default base URL points to OpenAI.
### Ollama
```bash
OPENAI_API_KEY="ollama"
OPENAI_BASE_URL="http://localhost:11434/v1"
```
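Ollama only serves models that have already been pulled locally, so fetch one before selecting it in Minne (the model name below is illustrative; pick one that supports structured outputs):

```shell
ollama pull llama3.2
```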
### Other Providers
Any provider exposing an OpenAI-compatible endpoint works. Set `OPENAI_BASE_URL` accordingly.
## Model Selection
1. Access `/admin` in your Minne instance
2. Select models for content processing and chat
3. **Content Processing**: must support structured outputs
4. **Embedding Dimensions**: update this setting when changing embedding models (e.g., 1536 for `text-embedding-3-small`)
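If you are unsure what dimension a model produces, one way to check is to call the provider's OpenAI-style `/v1/embeddings` endpoint and count the vector length. This is a sketch that assumes `curl` and `jq` are installed and `OPENAI_API_KEY` is set; substitute your own base URL for non-OpenAI providers:

```shell
curl -s https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-3-small", "input": "ping"}' \
  | jq '.data[0].embedding | length'   # prints 1536 for text-embedding-3-small
```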