Switch extraction model to qwen2.5:1.5b, fix mem0migrations dims, update tests

- openmemory: use qwen2.5:1.5b instead of gemma3:1b for fact extraction
- test_pipeline.py: check qwen2.5:1.5b, fix SSE checks, fix Qdrant payload
  parsing, relax SearXNG threshold to 5s, improve marker word test
- potential-directions.md: ranked CPU extraction model candidates
- Root cause: mem0migrations collection had stale 1536-dim vectors causing
  silent dedup failures; recreate both collections at 768 dims

All 18 pipeline tests now pass.
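
The silent-failure mode described above (a collection created at 1536 dims receiving 768-dim embeddings) can be guarded against with an explicit dimension check before upsert. A minimal sketch; `EXPECTED_DIM` and `validate_embedding` are hypothetical names for illustration, not part of mem0 or openmemory:

```python
EXPECTED_DIM = 768  # dimensionality of the embedder the collections were recreated for

def validate_embedding(vector: list[float], collection_dim: int = EXPECTED_DIM) -> list[float]:
    """Fail loudly on a dim mismatch instead of letting dedup miss silently."""
    if len(vector) != collection_dim:
        raise ValueError(
            f"embedding has {len(vector)} dims but collection expects {collection_dim}; "
            "recreate the collection or switch embedders"
        )
    return vector

# A stale 1536-dim vector is rejected up front rather than scoring nonsense:
try:
    validate_embedding([0.0] * 1536)
except ValueError as e:
    print("rejected:", e)
```

Recreating the collections at the correct dimension fixes the stored side; a check like this catches the mismatch the next time the embedder and collection config drift apart.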

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author: Alvis
Date:   2026-02-23 05:11:29 +00:00
parent 66ab93aa37
commit 19e2c27976

3 changed files with 78 additions and 3 deletions

potential-directions.md (new file, 13 additions)

@@ -0,0 +1,13 @@
# Potential Directions
## CPU Extraction Model Candidates (mem0 / openmemory)
Candidates for replacing `gemma3:1b`, whose documented JSON/structured-output failures make it unreliable for mem0's extraction pipeline.
| Rank | Model | Size | CPU speed | JSON reliability | Notes |
|------|-------|------|-----------|-----------------|-------|
| 1 | `qwen2.5:1.5b` | ~934 MB | 2540 tok/s | Excellent | Best fit: fast + structured output, 18T token training |
| 2 | `qwen2.5:3b` | ~1.9 GB | 1525 tok/s | Excellent | Quality upgrade, same family |
| 3 | `llama3.2:3b` | ~2 GB | 1525 tok/s | Good | Highest IFEval score (77.4) in class |
| 4 | `smollm2:1.7b` | ~1.1 GB | 2535 tok/s | Moderate | Use temp=0; NuExtract-1.5-smol is fine-tuned variant |
| 5 | `phi4-mini` | ~2.5 GB | 1017 tok/s | Good | Function calling support, borderline CPU speed |