- openmemory: use qwen2.5:1.5b instead of gemma3:1b for fact extraction - test_pipeline.py: check qwen2.5:1.5b, fix SSE checks, fix Qdrant payload parsing, relax SearXNG threshold to 5s, improve marker word test - potential-directions.md: ranked CPU extraction model candidates - Root cause: mem0migrations collection had stale 1536-dim vectors causing silent dedup failures; recreate both collections at 768 dims All 18 pipeline tests now pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
843 B
843 B
Potential Directions
CPU Extraction Model Candidates (mem0 / openmemory)
Replacing gemma3:1b — documented JSON/structured output failures make it unreliable for mem0's extraction pipeline.
| Rank | Model | Size | CPU speed | JSON reliability | Notes |
|---|---|---|---|---|---|
| 1 | qwen2.5:1.5b |
~934 MB | 25–40 tok/s | Excellent | Best fit: fast + structured output, 18T token training |
| 2 | qwen2.5:3b |
~1.9 GB | 15–25 tok/s | Excellent | Quality upgrade, same family |
| 3 | llama3.2:3b |
~2 GB | 15–25 tok/s | Good | Highest IFEval score (77.4) in class |
| 4 | smollm2:1.7b |
~1.1 GB | 25–35 tok/s | Moderate | Use temp=0; NuExtract-1.5-smol is fine-tuned variant |
| 5 | phi4-mini |
~2.5 GB | 10–17 tok/s | Good | Function calling support, borderline CPU speed |