Per Stark
4e8a58fff1
fix: load embedding dimensions once per persist and trim vector search select.
2026-06-12 13:54:51 +02:00
Per Stark
28e8ede478
release: 1.0.3
...
fix: load ort-version via bash script on all release runners, toolchain harmonization
v1.0.3
2026-06-12 12:42:40 +02:00
Per Stark
00453fdcbe
chore: bump to 1.0.3 and harmonize onnx runtime version across nix, ci, and docker.
2026-06-12 09:11:55 +02:00
Per Stark
c53ec8c0a1
fix: arc-share retrieved chunks, centralize entity embeddings, and trim hot-path clones.
2026-06-06 23:05:53 +02:00
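A minimal sketch of what "arc-share retrieved chunks" can look like, assuming chunks are wrapped in `Arc` so several consumers hold the same allocation instead of deep-cloning the text. The `Chunk` type and `share_chunks` helper are hypothetical and do not come from the codebase.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a retrieved chunk; the real type is not shown in this log.
#[derive(Debug)]
struct Chunk {
    id: u32,
    text: String,
}

/// Hands the same chunks to two consumers without copying the chunk text.
fn share_chunks(chunks: Vec<Chunk>) -> (Vec<Arc<Chunk>>, Vec<Arc<Chunk>>) {
    let shared: Vec<Arc<Chunk>> = chunks.into_iter().map(Arc::new).collect();
    // Cloning a Vec<Arc<_>> copies pointers and bumps reference counts only.
    let second_view = shared.clone();
    (shared, second_view)
}

fn main() {
    let chunks = vec![
        Chunk { id: 1, text: "alpha".to_string() },
        Chunk { id: 2, text: "beta".to_string() },
    ];
    let (a, b) = share_chunks(chunks);
    // Both vectors point at the same heap allocation for each chunk.
    println!("chunk {} shared: {}", a[0].id, Arc::ptr_eq(&a[0], &b[0]));
    println!("chunk {} text: {}", b[1].id, b[1].text);
}
```

The trade-off is one atomic reference-count update per clone in exchange for not copying the chunk contents.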
Per Stark
60cf63292a
fix: replaced several instances of cloning, reduced allocations
2026-06-06 19:45:18 +02:00
Per Stark
ac0d34bfbd
fix: leaner error handling by boxing large variants
2026-06-06 07:59:57 +02:00
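To illustrate why boxing a large variant makes an error type leaner: an enum is at least as large as its biggest variant, so every `Result<T, E>` carrying it pays for that size even on the success path. The sketch below uses hypothetical types (`ParseDetails`, `FatError`, `SlimError`) that are not taken from the codebase.

```rust
use std::mem::size_of;

/// Hypothetical large payload; the names here are illustrative only.
struct ParseDetails {
    buffer: [u8; 256],
    line: u32,
}

/// Error enum that stores the large payload inline.
#[allow(dead_code)]
enum FatError {
    NotFound,
    Parse(ParseDetails),
}

/// Same error enum with the large payload moved behind a Box.
#[allow(dead_code)]
enum SlimError {
    NotFound,
    Parse(Box<ParseDetails>),
}

/// Builds the boxed variant; the allocation happens only on the error path.
fn parse_failure(line: u32) -> SlimError {
    SlimError::Parse(Box::new(ParseDetails { buffer: [0; 256], line }))
}

fn main() {
    println!("unboxed enum: {} bytes", size_of::<FatError>());
    println!("boxed enum:   {} bytes", size_of::<SlimError>());
    if let SlimError::Parse(details) = parse_failure(42) {
        println!("parse error at line {} ({} byte buffer)", details.line, details.buffer.len());
    }
}
```

Clippy's `large_enum_variant` and `result_large_err` lints flag this pattern; whether those lints motivated the commit is not stated in the log.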
Per Stark
4e20da538d
feat: configure FastEmbed model in config and admin, with restart to apply
...
Expose fastembed_model in config and a model dropdown on Admin → Models.
Persist dimension from the chosen model, require restart to load it, and
align legacy OpenAI default settings so fresh local-embedding installs
start cleanly.
2026-06-04 21:51:57 +02:00
Per Stark
15c9f18f6e
feat: pool fastembed, batch embeddings, and reconcile embedding config on startup
2026-06-04 21:51:57 +02:00
Per Stark
7b850769c9
fix: html-router modals and add insta snapshot tests.
...
Avoid nested forms in the scratchpad editor, centralize modal lifecycle in modal.js, return HTMX partials from archive, and add template compile plus layout snapshots.
2026-06-03 20:20:43 +02:00
Per Stark
2a28243213
feat: can now choose search result types
2026-06-01 14:37:19 +02:00
Per Stark
b22c351785
fix: knowledge entity suggestions simplification
2026-05-31 20:23:40 +02:00
Per Stark
3897345ab3
chore: ingestion-pipeline refactor, sort technical debt, rustfmt
2026-05-31 19:48:41 +02:00
Per Stark
5c2d2e24d3
chore: refactor retrieval pipeline to chunk-first RRF with derived entities and slimmer eval surface.
...
Collapse the multi-strategy entity engine into one benchmarked chunk retrieval path, derive entities from retrieved chunks, and update consumers, docs, and clippy fixes across the workspace.
2026-05-30 22:19:08 +02:00
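For readers unfamiliar with RRF (reciprocal rank fusion): each ranked list contributes `1 / (k + rank)` to an item's score, and the items are ordered by the summed score. The sketch below is a generic std-only version, not the pipeline's code. The function name `rrf_fuse`, the chunk ids, and the choice of `k = 60` (a commonly used default) are assumptions.

```rust
use std::collections::HashMap;

/// Fuses ranked lists of chunk ids with reciprocal rank fusion.
/// `k` dampens the advantage of top ranks.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (index, id) in ranking.iter().enumerate() {
            // Ranks are 1-based in the usual formulation.
            let rank = (index + 1) as f64;
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Hypothetical results from a vector search and a full-text search.
    let vector_hits = vec!["c1", "c2", "c3"];
    let fts_hits = vec!["c2", "c4", "c1"];
    for (id, score) in rrf_fuse(&[vector_hits, fts_hits], 60.0) {
        println!("{id}: {score:.5}");
    }
}
```

Because only ranks are used, the fused lists do not need comparable raw scores, which is the usual reason for choosing RRF when combining vector and full-text results.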
Per Stark
c70141de35
chore: harden api-router errors and add router integration tests while slimming html handlers.
2026-05-30 15:18:12 +02:00
Per Stark
2aa92b6ad7
chore: improve html-router auth, caching, and analytics while centralizing search labels in common.
...
small fix
2026-05-29 15:03:55 +02:00
Per Stark
d3443d4153
chore: centralize embedding errors, retrieval strategy, and test DB helpers.
...
Replace anyhow in embedding production code with EmbeddingError, move
RetrievalStrategy into common config, and deduplicate Surreal test setup
via common::test_utils.
2026-05-29 14:44:23 +02:00
Per Stark
e3bb2935d0
chore: harden common storage bootstrap and slim embedded db assets
...
Unify embedding config, build providers from system settings, and fail
startup when index builds error or time out. Move Surreal assets under
common/db so embeds exclude crate source, and read storage via streams.
2026-05-29 14:44:23 +02:00
Per Stark
93d11b66eb
test: cover system settings sync, validation, and ingestion prompts
...
Add tests for embedding provider sync, patch isolation, typed backend
serde, and DB-backed ingestion prompts.
2026-05-29 14:44:23 +02:00
Per Stark
125b856c49
chore: harden system settings and unify prompt usage
...
Validate settings updates, use typed embedding backends, and route
ingestion through DB-stored prompts so admin edits take effect.
2026-05-29 14:44:23 +02:00
Per Stark
bc41a619ce
chore: move serde helpers to common utils
...
Relocate SurrealDB serde helpers out of storage types so they can be
reused broadly, and align retrieval-pipeline test setup with configured
embedding dimensions.
2026-05-29 14:44:23 +02:00
Per Stark
ba8c36da1e
chore: harden text chunk embeddings and text content storage
...
Align text chunk embedding identity with knowledge entities (chunk id as record id, UNIQUE chunk_id index, dimension validation), make cascade deletes transactional, and improve text content patch/search reliability with tests.
2026-05-29 14:44:23 +02:00
Per Stark
5724f11dc1
chore: harden knowledge graph storage and clear common clippy warnings
...
Enforce stable 1:1 entity embeddings, relationship endpoint auth, and
user-scoped deletes; align schemas/migrations and resolve common crate
clippy findings.
2026-05-29 14:44:23 +02:00
Per Stark
189adb1a5f
chore: harden analytics, conversation access, and per-user file dedup
...
Use UPSERT for analytics counters, enforce message ownership in SQL,
return NotFound when patch_title updates nothing, scope file dedup by
user_id with a composite unique index, and expand tests for auth,
ordering, and edge cases.
2026-05-29 14:44:23 +02:00
Per Stark
97beb91710
chore: optimize ingest payloads and add parallel task batch store
...
Parse content before building file payloads to move shared metadata when
possible, add create_all_and_add_to_db for concurrent stores, and extend
tests for batch persistence and payload edge cases.
2026-05-29 14:44:23 +02:00
Per Stark
85336d77a3
chore: harden common errors, fastembed blocking, and ingest ownership
...
Run FastEmbed inference on spawn_blocking, propagate Surreal take
failures, add AppError::internal and typed ingest/embedding parse
errors, and take owned file lists in ingestion payload construction.
2026-05-29 14:44:23 +02:00
Per Stark
9d5e7cd794
chore: improved error handling
2026-05-28 19:58:14 +02:00
Per Stark
30bb59f243
chore: rename get_id to id, add doc comments, pre-allocate format_history
2026-05-27 18:06:16 +02:00
Per Stark
224a7db451
chore: lowercase all error messages and add # Errors doc sections
...
- Fix err-lowercase-msg: normalize all #[error(...)] display strings to
lowercase (AppError, FileError, ApiErr) and update affected tests
- Fix err-doc-errors: add # Errors sections to 25+ fallible public
functions across db.rs, store.rs, embedding.rs, indexes.rs,
ingestion_task.rs, and ingest_limits.rs
2026-05-27 14:59:48 +02:00
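A small sketch of the two conventions named in this commit: lowercase error display strings and a `# Errors` doc section on fallible public functions. It uses a hand-written `Display` impl so it stays std-only (the commit itself refers to `#[error(...)]` attributes). `LimitError` and `check_size` are hypothetical names.

```rust
use std::fmt;

/// Hypothetical error type; not one of the crate's real errors.
#[derive(Debug)]
pub enum LimitError {
    TooLarge { size: usize, max: usize },
}

impl fmt::Display for LimitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Lowercase with no trailing period, so the message composes
            // cleanly when a caller wraps it in a longer error chain.
            Self::TooLarge { size, max } => {
                write!(f, "payload too large: {size} bytes exceeds limit of {max}")
            }
        }
    }
}

impl std::error::Error for LimitError {}

/// Checks a payload size against a configured maximum.
///
/// # Errors
///
/// Returns [`LimitError::TooLarge`] when `size` is greater than `max`.
pub fn check_size(size: usize, max: usize) -> Result<(), LimitError> {
    if size > max {
        return Err(LimitError::TooLarge { size, max });
    }
    Ok(())
}

fn main() {
    if let Err(err) = check_size(30, 20) {
        println!("upload rejected: {err}");
    }
}
```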
Per Stark
4579725130
chore: resolve remaining uninlined_format_args clippy warnings
2026-05-27 14:34:37 +02:00
Per Stark
0b08801c90
chore: fix and reduce clippy allows in knowledge_entity.rs
...
- rm duplicate 'document' match arm (match_same_arms)
- .get(0) -> .first() (get_first)
- for entity in all_entities.iter() -> &all_entities (explicit_iter_loop)
- 2x error!("{}", err_msg) -> error!("{err_msg}") (uninlined_format_args)
- 2x test format!()/assert!() positional -> inlined (uninlined_format_args)
- removed 6 now-unnecessary allow attributes
2026-05-27 14:28:08 +02:00
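The three lint fixes listed in this commit (`get_first`, `explicit_iter_loop`, `uninlined_format_args`) can be sketched as follows. The functions are hypothetical and show only the "after" form, with the "before" form in comments.

```rust
/// Returns the first entity label, if any.
fn first_label(all_entities: &[String]) -> Option<&str> {
    // get_first: `.first()` instead of `.get(0)`.
    all_entities.first().map(String::as_str)
}

/// Builds one log line per entity.
fn log_lines(all_entities: &[String]) -> Vec<String> {
    let mut lines = Vec::with_capacity(all_entities.len());
    // explicit_iter_loop: iterate over the reference directly
    // instead of writing `for entity in all_entities.iter()`.
    for entity in all_entities {
        // uninlined_format_args: `{entity}` instead of `"{}", entity`.
        lines.push(format!("entity: {entity}"));
    }
    lines
}

fn main() {
    let all_entities = vec!["document".to_string(), "idea".to_string()];
    println!("{:?}", first_label(&all_entities));
    for line in log_lines(&all_entities) {
        println!("{line}");
    }
}
```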
Per Stark
45d13230a6
chore: add must_use to 27 non-Result public functions
...
- constructors: KnowledgeEntity, TextChunk, Scratchpad, IngestionTask,
Conversation, KnowledgeRelationship, Message, TextContent,
KnowledgeEntityEmbedding, TextChunkEmbedding
- accessors: Theme::as_str, Theme::initial_theme, TaskState::as_str,
TaskState::display_label, StorageManager::backend_kind,
StorageManager::local_base_path, EmbeddingProvider::backend_label,
EmbeddingProvider::dimension, EmbeddingProvider::model_code
- queries: TaskState::is_terminal, IngestionTask::can_retry,
KnowledgeEntityType::variants, StorageManager::resolve_local_path,
resolve_base_dir, IngestionTask::lease_duration
- helpers: Message::format_history
- builders: StorageManager::with_backend
2026-05-27 14:23:56 +02:00
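As an illustration of what `#[must_use]` buys on accessors and queries: the compiler warns when the return value is silently dropped. `TaskState::as_str` and `TaskState::is_terminal` are named in the commit, but the variants and method bodies below are assumptions made for this sketch.

```rust
/// Simplified stand-in; the real enum's variants are not shown in this log.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Pending,
    Succeeded,
    Failed,
}

impl TaskState {
    /// Stable string form of the state.
    #[must_use]
    pub const fn as_str(self) -> &'static str {
        match self {
            Self::Pending => "pending",
            Self::Succeeded => "succeeded",
            Self::Failed => "failed",
        }
    }

    /// Whether the task can no longer change state.
    #[must_use]
    pub const fn is_terminal(self) -> bool {
        matches!(self, Self::Succeeded | Self::Failed)
    }
}

fn main() {
    let state = TaskState::Failed;
    // Writing `state.is_terminal();` on its own line would trigger an
    // `unused_must_use` warning, since the answer is discarded.
    if state.is_terminal() {
        println!("task finished as {}", state.as_str());
    }
}
```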
Per Stark
0acdba4f54
fix: replace manual embedding serialization with serde_json
...
- replaced write!() loops with serde_json::to_string in 4 re-embedding methods
- standardized SQL building to use write!() with proper error propagation
- eliminates manual f32 vector string building (memory waste + loop risk)
2026-05-27 14:13:19 +02:00
Per Stark
9609880cff
fix: revoke_api_key sets NONE, remove unused bind, lowercase error msgs
...
- fix bug where revoke_api_key set literal 'test_string_nullish' instead of NONE
- remove unused table_name bind in update_timezone
- lowercase ~16 error messages across 4 crates
2026-05-27 13:56:32 +02:00
Per Stark
31d585b59f
chore: removed anyhow from apperror for improved error handling
2026-05-27 13:33:02 +02:00
Per Stark
890a4b381d
chore: index slicing and lowercase errors
2026-05-27 12:41:26 +02:00
Per Stark
2d630e2af9
chore: tightening and removing super fn
2026-05-27 11:23:39 +02:00
Per Stark
9ec11e1f79
chore: clippy and nix fmt
2026-05-27 11:23:08 +02:00
Per Stark
c60db0fb56
perf: avoid cloning small owned values and intermediate Vec allocations
...
- Derive Copy on 6 small enums (MessageRole, TaskState, StorageKind, EmbeddingBackend, PdfIngestMode, KnowledgeEntityType)
- Change create_ingestion_payload files param from Vec<FileInfo> to &[FileInfo]
- Remove 5 intermediate Vec allocations (4 embedding serialization + 1 format_history) using write! loop
- Remove 7 unnecessary .clone() calls exposed by Copy derive
2026-05-27 10:28:08 +02:00
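A sketch of the three techniques in this commit, under stated assumptions: deriving `Copy` on a small enum so `.clone()` calls disappear, borrowing a slice instead of taking a `Vec`, and formatting straight into a `String` with a `write!`-style loop. The type names come from the commit message, but every field, variant, and signature below is hypothetical.

```rust
use std::fmt::Write;

/// Small fieldless enum: `Copy` makes passing it by value free.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MessageRole {
    User,
    Assistant,
}

impl MessageRole {
    const fn label(self) -> &'static str {
        match self {
            Self::User => "user",
            Self::Assistant => "assistant",
        }
    }
}

/// Hypothetical shape; the real FileInfo has other fields.
struct FileInfo {
    name: String,
}

/// Borrows the files as a slice, so callers keep ownership of their Vec.
fn file_names(files: &[FileInfo]) -> Vec<&str> {
    files.iter().map(|file| file.name.as_str()).collect()
}

/// Writes each line directly into one buffer instead of collecting
/// an intermediate Vec<String> and joining it afterwards.
fn format_history(messages: &[(MessageRole, &str)]) -> String {
    let estimated: usize = messages.iter().map(|(_, text)| text.len() + 16).sum();
    let mut out = String::with_capacity(estimated);
    for (role, text) in messages {
        // Writing into a String cannot fail, so the result is ignored here.
        let _ = writeln!(out, "{}: {text}", role.label());
    }
    out
}

fn main() {
    let files = vec![FileInfo { name: "notes.md".to_string() }];
    println!("{:?}", file_names(&files));
    let role = MessageRole::User;
    let history = [(role, "hi"), (MessageRole::Assistant, "hello")];
    // `role` is still usable here because MessageRole is Copy.
    print!("{}", format_history(&history));
    println!("last sender was {}", role.label());
}
```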
Per Stark
f5f0454904
fix: html-router dependency of json-stream-parser
2026-05-27 09:59:26 +02:00
Per Stark
18aadab8ee
refactor: json-stream-parser aligned to clippy standard
2026-05-27 09:07:38 +02:00
Per Stark
414d2f5b34
chore: additional clippy fixes after rebasing
2026-05-27 07:37:18 +02:00
Per Stark
293440b0ee
fix: pin surrealdb
2026-05-26 20:21:40 +02:00
Per Stark
041d9bd81f
clippy: evaluations crate
2026-05-26 20:21:25 +02:00
Per Stark
b4383bb227
perf: pre-allocate collections with known capacity in hot paths
...
- Use with_capacity for chunk_by_source, results, per_entity_traces,
and selected_chunks in assemble() where bound is known
- Pre-allocate tokens/terms vectors in normalize_fts_query and
extract_keywords based on input length
- Pre-allocate neighbor_ids, seen, and ordered in graph expansion
based on relationship count
2026-05-26 20:21:25 +02:00
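A minimal sketch of pre-allocating from a known bound, assuming a keyword extractor that splits on whitespace. The name `extract_keywords` appears in the commit, but this body, the length filter, and the capacity formula are illustrative guesses rather than the project's implementation.

```rust
/// Splits `input` into lowercase keywords of at least three characters.
fn extract_keywords(input: &str) -> Vec<String> {
    // Each token needs at least one byte plus one separator byte, so
    // `len / 2 + 1` is an upper bound on the token count. Reserving it
    // up front means the loop below never reallocates.
    let mut terms = Vec::with_capacity(input.len() / 2 + 1);
    for word in input.split_whitespace() {
        if word.chars().count() >= 3 {
            terms.push(word.to_lowercase());
        }
    }
    terms
}

fn main() {
    let terms = extract_keywords("A Quick fox");
    println!("{terms:?} (capacity {})", terms.capacity());
}
```

The bound trades a possibly oversized allocation for zero regrowth; when the input is long and most tokens are filtered out, a tighter estimate may be preferable.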
Per Stark
6c7b586fc5
perf: offload blocking calls to spawn_blocking
...
- Move headless_chrome PDF rasterization from async context to
spawn_blocking, keeping tokio worker threads responsive.
- Switch RerankerPool from tokio::sync::Mutex to std::sync::Mutex
and run TextRerank::rerank inside spawn_blocking, since the
rerank call is CPU-bound with no .await points.
2026-05-26 20:21:25 +02:00
Per Stark
1927149ce9
lint: inherit workspace clippy config in json-stream-parser and evaluations
...
Both crates were missing the [lints] workspace = true directive,
bypassing workspace clippy rules (unwrap_used, expect_used, etc.).
2026-05-26 20:21:25 +02:00
Per Stark
a52dc802de
refactor: simplify and improve testing for initialization
2026-05-26 20:21:24 +02:00
Per Stark
000852c94c
clippy: adhere to pedantic clippy, uniform test error handling
2026-05-26 20:21:13 +02:00
Per Stark
6a5d631287
chore: remove unused clap dep and fix test_session_table name
...
- Remove clap dependency from retrieval-pipeline (RetrievalStrategy
already has FromStr/Display; evaluations uses clap directly)
- Rename session table from test_session_table to session
2026-05-26 20:14:29 +02:00
Per Stark
b965c5a2e6
refactor: replace Box<dyn Error> with anyhow::Result
...
- ingestion_pipeline::run_worker_loop returns anyhow::Result<()>
- api_router::ApiState::new returns anyhow::Result<Self>
- html_router::HtmlState::new_with_resources is infallible, returns Self
- main/server/worker binary entry points return anyhow::Result<()>
2026-05-26 20:14:11 +02:00