# Memory Search
Samantha LLM works without search — memories are loaded by priority during bootstrap. But as your memory collection grows, semantic search lets you (and your assistant) find exactly the right context fast.
Memory search is powered by QMD, a local semantic search engine. Everything runs on your machine — no API calls, no cloud indexing.
## Installing QMD
```bash
samantha-llm qmd install
```
This installs the Bun runtime and the QMD search engine. AI models (~2GB) are downloaded automatically on first use.
```bash
# Check installation status
samantha-llm qmd status

# Quick check (exit code 0 if installed)
samantha-llm qmd check
```
## Indexing Your Memories
Before searching, index your memory files:
```bash
samantha-llm memories index
```
This indexes all Markdown files across your memory directories:
- Short-term memory — recent interactions and decisions
- Long-term memory — permanent knowledge
- Current tasks — active projects
- Work experience — completed project archive
Only changed files are re-indexed on subsequent runs, so indexing stays fast as your collection grows.
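The incremental behavior comes down to change detection. Here is a minimal sketch of the general technique (mtime-based skip logic); QMD's actual mechanism is not documented here, and `files_to_reindex` is an invented name:

```python
import os

def files_to_reindex(paths, last_indexed):
    """Return only the files whose mtime changed since the previous run.

    `last_indexed` maps path -> mtime recorded at the last indexing run;
    it is updated in place. A generic sketch, not QMD's internals.
    """
    changed = []
    for path in paths:
        mtime = os.path.getmtime(path)
        if last_indexed.get(path) != mtime:
            changed.append(path)
            last_indexed[path] = mtime  # remember for the next run
    return changed
```

On the first run every file is "changed"; afterwards only edited files come back, which is why repeat indexing stays fast.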
## Searching
### From the Command Line
```bash
samantha-llm memories search "authentication decision"
```
Results show matching memory excerpts with context, ranked by relevance.
Options:

| Flag | Effect |
|---|---|
| `-n 20` | Return more results (default: 10) |
| `--json` | JSON output for scripting |
| `--text` | Plain text output |
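The `--json` flag makes results easy to post-process in scripts. A sketch of the pattern, using a hypothetical payload — the actual field names and structure of the `samantha-llm memories search --json` output are assumptions here:

```python
import json

# Hypothetical payload standing in for `samantha-llm memories search --json`
# output; the real schema may use different field names.
sample = """[
  {"file": "long-term/auth-decision.md", "score": 0.91,
   "excerpt": "We chose JWT over server sessions because..."},
  {"file": "short-term/standup-notes.md", "score": 0.42,
   "excerpt": "Discussed token rotation..."}
]"""

results = json.loads(sample)
# Keep only high-confidence hits, e.g. for piping into another tool.
top_files = [r["file"] for r in results if r["score"] >= 0.8]
print(top_files)  # → ['long-term/auth-decision.md']
```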
### During Sessions
Your assistant can search memories during a session using the same command. This is useful when a question comes up that might be answered by a past decision or learning — the assistant searches, reads the relevant memories, and incorporates that context into the conversation.
## Search Modes
QMD supports three search modes, each with different trade-offs:
### Hybrid (Default)
Combines keyword matching, semantic understanding, and AI re-ranking for the best results. This is what runs when you use `samantha-llm memories search`.
### Keyword
BM25 full-text search. Fastest option — good when you know the exact terms you’re looking for.
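BM25 ranks documents by how often the query terms appear, weighted against how common the terms are across the collection and how long each document is. A toy sketch of the standard scoring formula, purely for illustration (this is not QMD's implementation):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with plain BM25."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for t in tokenized if term in t)  # document frequency
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores
```

A document that never mentions a query term scores zero for it, which is exactly why keyword search excels when you know the precise wording and fails when you don't.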
### Semantic
Vector embedding similarity. Finds conceptually related memories even when the wording is different — useful for questions like “what did we decide about deployment?” when the memory uses words like “release process.”
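The mechanism behind that "deployment"/"release process" match is cosine similarity between embedding vectors. Sketched below with toy three-dimensional vectors — real embedding models produce hundreds of dimensions, and the numbers here are invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings: "deployment" and "release process" sit close together
# in vector space even though they share no keywords.
vectors = {
    "deployment": [0.90, 0.10, 0.20],
    "release process": [0.85, 0.15, 0.25],
    "grocery list": [0.10, 0.90, 0.05],
}
query = vectors["deployment"]
ranked = sorted(vectors, key=lambda k: cosine(query, vectors[k]), reverse=True)
print(ranked)  # → ['deployment', 'release process', 'grocery list']
```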
## Without QMD
QMD is optional. Without it:
- Bootstrap still loads memories by priority (critical, high-importance, project-specific, recent)
- Your assistant can still read memory files directly
- Index files provide fast scanning without full-text search
- The `samantha-llm memories search` command will prompt you to install QMD
The main thing you lose is the ability to search across hundreds of memories by meaning rather than filename. For small memory collections, you may not need it at all.
## Models

QMD uses three local AI models, downloaded to `~/.cache/qmd/models/` on first use:
| Model | Size | Purpose |
|---|---|---|
| Embedding | ~300MB | Converts text to vectors for semantic matching |
| Re-ranking | ~640MB | Scores and orders candidate results |
| Query expansion | ~1.1GB | Reformulates queries for better recall |
All models run locally. No data leaves your machine.
## Next Steps
- Memory System — How memories are structured and loaded
- Subconscious System — Automatic memory creation
- Installation & Setup — Full setup guide