- Use with_capacity for chunk_by_source, results, per_entity_traces,
and selected_chunks in assemble() where the upper bound is known
- Pre-allocate tokens/terms vectors in normalize_fts_query and
extract_keywords based on input length
- Pre-allocate neighbor_ids, seen, and ordered in graph expansion
based on relationship count
- Move headless_chrome PDF rasterization out of the async context and
into spawn_blocking, keeping tokio worker threads responsive
- Switch RerankerPool from tokio::sync::Mutex to std::sync::Mutex
and run TextRerank::rerank inside spawn_blocking, since the
rerank call is CPU-bound with no .await points
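
For reviewers, a minimal sketch of the pre-allocation pattern used in the first three items. It uses only the standard library. The function names, signatures, and data shapes below are hypothetical stand-ins, not the project's actual code; the point is only that each collection is sized from a bound known before the loop starts.

```rust
use std::collections::{HashMap, HashSet};

/// Groups (source_id, text) pairs by source.
/// Hypothetical stand-in for the chunk_by_source map built in assemble().
fn group_by_source<'a>(chunks: &[(u32, &'a str)]) -> HashMap<u32, Vec<&'a str>> {
    // Upper bound: there cannot be more distinct sources than chunks.
    let mut by_source: HashMap<u32, Vec<&'a str>> = HashMap::with_capacity(chunks.len());
    for &(source_id, text) in chunks {
        by_source.entry(source_id).or_default().push(text);
    }
    by_source
}

/// Splits a query into lowercase terms.
/// Hypothetical stand-in for the term extraction step.
fn extract_terms(query: &str) -> Vec<String> {
    // Assumption: terms are whitespace-separated, so a query of n bytes
    // holds at most (n + 1) / 2 terms; n / 2 + 1 covers that bound.
    let mut terms = Vec::with_capacity(query.len() / 2 + 1);
    for word in query.split_whitespace() {
        terms.push(word.to_lowercase());
    }
    terms
}

/// Collects the neighbors of `origin` in first-seen order, without duplicates.
/// Hypothetical stand-in for the graph expansion step.
fn ordered_neighbors(relationships: &[(u32, u32)], origin: u32) -> Vec<u32> {
    // Upper bound: each relationship contributes at most one neighbor.
    let mut seen: HashSet<u32> = HashSet::with_capacity(relationships.len());
    let mut ordered = Vec::with_capacity(relationships.len());
    for &(from, to) in relationships {
        let neighbor = if from == origin {
            to
        } else if to == origin {
            from
        } else {
            continue; // relationship does not touch the origin node
        };
        // insert() returns true only the first time a value is added.
        if seen.insert(neighbor) {
            ordered.push(neighbor);
        }
    }
    ordered
}
```

with_capacity only reserves space up front; it does not change the contents or length of the collection, so behavior is unchanged and the gain is limited to avoiding repeated reallocation while the loop runs.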