Commit Graph

429 Commits

Author SHA1 Message Date
Per Stark d3443d4153 chore: centralize embedding errors, retrieval strategy, and test DB helpers.
Replace anyhow in embedding production code with EmbeddingError, move
RetrievalStrategy into common config, and deduplicate Surreal test setup
via common::test_utils.
2026-05-29 14:44:23 +02:00
Per Stark e3bb2935d0 chore: harden common storage bootstrap and slim embedded db assets
Unify embedding config, build providers from system settings, and fail
startup when index builds error or time out. Move Surreal assets under
common/db so embeds exclude crate source, and read storage via streams.
2026-05-29 14:44:23 +02:00
Per Stark 93d11b66eb test: cover system settings sync, validation, and ingestion prompts
Add tests for embedding provider sync, patch isolation, typed backend
serde, and DB-backed ingestion prompts.
2026-05-29 14:44:23 +02:00
Per Stark 125b856c49 chore: harden system settings and unify prompt usage
Validate settings updates, use typed embedding backends, and route
ingestion through DB-stored prompts so admin edits take effect.
2026-05-29 14:44:23 +02:00
Per Stark bc41a619ce chore: move serde helpers to common utils
Relocate SurrealDB serde helpers out of storage types so they can be
reused broadly, and align retrieval-pipeline test setup with configured
embedding dimensions.
2026-05-29 14:44:23 +02:00
Per Stark ba8c36da1e chore: harden text chunk embeddings and text content storage
Align text chunk embedding identity with knowledge entities (chunk id as record id, UNIQUE chunk_id index, dimension validation), make cascade deletes transactional, and improve text content patch/search reliability with tests.
2026-05-29 14:44:23 +02:00
Per Stark 5724f11dc1 chore: harden knowledge graph storage and clear common clippy warnings
Enforce stable 1:1 entity embeddings, relationship endpoint auth, and
user-scoped deletes; align schemas/migrations and resolve common crate
clippy findings.
2026-05-29 14:44:23 +02:00
Per Stark 189adb1a5f chore: harden analytics, conversation access, and per-user file dedup
Use UPSERT for analytics counters, enforce message ownership in SQL,
return
NotFound when patch_title updates nothing, scope file dedup by user_id
with a
composite unique index, and expand tests for auth, ordering, and edge
cases.
2026-05-29 14:44:23 +02:00
Per Stark 97beb91710 chore: optimize ingest payloads and add parallel task batch store
Parse content before building file payloads to move shared metadata when
possible, add create_all_and_add_to_db for concurrent stores, and extend
tests for batch persistence and payload edge cases.
2026-05-29 14:44:23 +02:00
Per Stark 85336d77a3 chore: harden common errors, fastembed blocking, and ingest ownership
Run FastEmbed inference on spawn_blocking, propagate Surreal take
failures,
add AppError::internal and typed ingest/embedding parse errors, and take
owned file lists in ingestion payload construction.
2026-05-29 14:44:23 +02:00
Per Stark 9d5e7cd794 chore: improved error handling 2026-05-28 19:58:14 +02:00
Per Stark 30bb59f243 chore: rename get_id to id, add doc comments, pre-allocate format_history 2026-05-27 18:06:16 +02:00
Per Stark 224a7db451 chore: lowercase all error messages and add # Errors doc sections
- Fix err-lowercase-msg: normalize all #[error(...)] display strings to
  lowercase (AppError, FileError, ApiErr) and update affected tests
- Fix err-doc-errors: add # Errors sections to 25+ fallible public
  functions across db.rs, store.rs, embedding.rs, indexes.rs,
  ingestion_task.rs, and ingest_limits.rs
2026-05-27 14:59:48 +02:00
Per Stark 4579725130 chore: resolve remaining uninlined_format_args clippy warnings 2026-05-27 14:34:37 +02:00
Per Stark 0b08801c90 chore: fix and reduce clippy allows in knowledge_entity.rs
- rm duplicate 'document' match arm (match_same_arms)
- .get(0) -> .first() (get_first)
- for entity in all_entities.iter() -> &all_entities (explicit_iter_loop)
- 2x error!("{}", err_msg) -> error!("{err_msg}") (uninlined_format_args)
- 2x test format!()/assert!() positional -> inlined (uninlined_format_args)
- removed 6 now-unnecessary allow attributes
2026-05-27 14:28:08 +02:00
Per Stark 45d13230a6 chore: add must_use to 27 non-Result public functions
- constructors: KnowledgeEntity, TextChunk, Scratchpad, IngestionTask,
  Conversation, KnowledgeRelationship, Message, TextContent,
  KnowledgeEntityEmbedding, TextChunkEmbedding
- accessors: Theme::as_str, Theme::initial_theme, TaskState::as_str,
  TaskState::display_label, StorageManager::backend_kind,
  StorageManager::local_base_path, EmbeddingProvider::backend_label,
  EmbeddingProvider::dimension, EmbeddingProvider::model_code
- queries: TaskState::is_terminal, IngestionTask::can_retry,
  KnowledgeEntityType::variants, StorageManager::resolve_local_path,
  resolve_base_dir, IngestionTask::lease_duration
- helpers: Message::format_history
- builders: StorageManager::with_backend
2026-05-27 14:23:56 +02:00
Per Stark 0acdba4f54 fix: replace manual embedding serialization with serde_json
- replaced write!() loops with serde_json::to_string in 4 re-embedding methods
- standardized SQL building to use write!() with proper error propagation
- eliminates manual f32 vector string building (memory waste + loop risk)
2026-05-27 14:13:19 +02:00
Per Stark 9609880cff fix: revoke_api_key sets NONE, remove unused bind, lowercase error msgs
- fix bug where revoke_api_key set literal 'test_string_nullish' instead of NONE
- remove unused table_name bind in update_timezone
- lowercase ~16 error messages across 4 crates
2026-05-27 13:56:32 +02:00
Per Stark 31d585b59f chore: removed anyhow from apperror for improved error handling 2026-05-27 13:33:02 +02:00
Per Stark 890a4b381d chore: index slicing and lowercase errors 2026-05-27 12:41:26 +02:00
Per Stark 2d630e2af9 chore: tightening and removing super fn 2026-05-27 11:23:39 +02:00
Per Stark 9ec11e1f79 chore: clippy and nix fmt 2026-05-27 11:23:08 +02:00
Per Stark c60db0fb56 perf: avoid small own clones and intermediate Vec allocations
- Derive Copy on 6 small enums (MessageRole, TaskState, StorageKind, EmbeddingBackend, PdfIngestMode, KnowledgeEntityType)
- Change create_ingestion_payload files param from Vec<FileInfo> to &[FileInfo]
- Remove 5 intermediate Vec allocations (4 embedding serialization + 1 format_history) using write! loop
- Remove 7 unnecessary .clone() calls exposed by Copy derive
2026-05-27 10:28:08 +02:00
Per Stark f5f0454904 fix: html-router dependency of json-stream-parser 2026-05-27 09:59:26 +02:00
Per Stark 18aadab8ee refactor: json-stream-parser aligned to clippy standard 2026-05-27 09:07:38 +02:00
Per Stark 414d2f5b34 chore: additional clippy fixes after rebasing 2026-05-27 07:37:18 +02:00
Per Stark 293440b0ee fix: pin surrealdb 2026-05-26 20:21:40 +02:00
Per Stark 041d9bd81f clippy: evaluations crate 2026-05-26 20:21:25 +02:00
Per Stark b4383bb227 perf: pre-allocate collections with known capacity in hot paths
- Use with_capacity for chunk_by_source, results, per_entity_traces,
  and selected_chunks in assemble() where bound is known
- Pre-allocate tokens/terms vectors in normalize_fts_query and
  extract_keywords based on input length
- Pre-allocate neighbor_ids, seen, and ordered in graph expansion
  based on relationship count
2026-05-26 20:21:25 +02:00
Per Stark 6c7b586fc5 perf: offload blocking calls to spawn_blocking
- Move headless_chrome PDF rasterization from async context to
  spawn_blocking, keeping tokio worker threads responsive.
- Switch RerankerPool from tokio::sync::Mutex to std::sync::Mutex
  and run TextRerank::rerank inside spawn_blocking, since the
  rerank call is CPU-bound with no .await points.
2026-05-26 20:21:25 +02:00
Per Stark 1927149ce9 lint: inherit workspace clippy config in json-stream-parser and evaluations
Both crates were missing the [lints] workspace = true directive,
bypassing workspace clippy rules (unwrap_used, expect_used, etc.).
2026-05-26 20:21:25 +02:00
Per Stark a52dc802de refactor: simplify and improve testing for initialization 2026-05-26 20:21:24 +02:00
Per Stark 000852c94c clippy: adhere to pedantic clippy, uniform test error handling 2026-05-26 20:21:13 +02:00
Per Stark 6a5d631287 chore: remove unused clap dep and fix test_session_table name
- Remove clap dependency from retrieval-pipeline (RetrievalStrategy
  already has FromStr/Display; evaluations uses clap directly)
- Rename session table from test_session_table to session
2026-05-26 20:14:29 +02:00
Per Stark b965c5a2e6 refactor: replace Box<dyn Error> with anyhow::Result
- ingestion_pipeline::run_worker_loop returns anyhow::Result<()>
- api_router::ApiState::new returns anyhow::Result<Self>
- html_router::HtmlState::new_with_resources is infallible, returns Self
- main/server/worker binary entry points return anyhow::Result<()>
2026-05-26 20:14:11 +02:00
Per Stark 79e46e9c09 refactor: extract serde helpers from stored_object! macro
Move FlexibleIdVisitor, deserialize_flexible_id, and four datetime serde helpers
from repeating inside every macro expansion into a shared
common/src/storage/types/serde_helpers.rs module.

14 macro invocations × 6 items = ~84 fewer redundant function definitions.
Fragile cross-module imports (file_info::deserialize_flexible_id etc.)
are updated to point to the canonical module.
2026-05-26 20:12:54 +02:00
Per Stark f22a1e5ba4 chore: devenv inconsistency, spawn server manually in dev 2026-02-15 18:31:43 +01:00
Per Stark 4d237ff6d9 release: 1.0.2 v1.0.2 2026-02-15 11:57:04 +01:00
Per Stark eb928cdb0e test: minio to devenv, improved testing s3 and relationships 2026-02-15 08:52:56 +01:00
Per Stark 1490852a09 chore: dep updates & kv-mem separation to test feature
docker builder update
2026-02-15 08:51:48 +01:00
Per Stark b0b01182d7 test: add admin auth integration coverage 2026-02-14 23:11:35 +01:00
Per Stark 679308aa1d feat: caching chat history & dto 2026-02-14 19:43:34 +01:00
Per Stark f93c06b347 fix: harden html responses and cache chat sidebar data
Use strict template response handling and sanitized template user context, then add an in-process conversation archive cache with mutation-driven invalidation for chat sidebar renders.
2026-02-14 17:47:14 +01:00
Per Stark a3f207beb1 fix: simplified admin checking 2026-02-13 23:04:01 +01:00
Per Stark e07199adfc fix: name harmonization of endpoints & ingestion security hardening 2026-02-13 22:36:00 +01:00
Per Stark f22cac891c fix: redact ingestion payload logs and update changelog 2026-02-13 12:06:18 +01:00
Per Stark b89171d934 fix: parameterize storage-layer queries and add injection tests 2026-02-12 21:42:46 +01:00
Per Stark 0133eead63 fix: border in navigation 2026-02-12 20:39:36 +01:00
Per Stark e5d2b6605f fix: browser back navigation from chat windows
addenum
2026-02-12 20:32:06 +01:00
Per Stark bbad91d55b fix: references bug
fix
2026-02-11 22:02:40 +01:00