Samantha LLM — Persistent Memory for AI Assistants

Documentation and website for samantha-llm


Memory Search

Samantha LLM works without search — memories are loaded by priority during bootstrap. But as your memory collection grows, semantic search lets you (and your assistant) find exactly the right context fast.

Memory search is powered by QMD, a local semantic search engine. Everything runs on your machine — no API calls, no cloud indexing.

Installing QMD

samantha-llm qmd install

This installs the Bun runtime and the QMD search engine. AI models (~2GB) are downloaded automatically on first use.

# Check installation status
samantha-llm qmd status

# Quick check (exit code 0 if installed)
samantha-llm qmd check
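Because `samantha-llm qmd check` signals installation through its exit code, scripts can gate other steps on it. A minimal Python sketch, with the command runner injected so the logic can be exercised even where the CLI is not installed:

```python
import subprocess

def qmd_installed(run=subprocess.run):
    """Return True if `samantha-llm qmd check` exits with code 0."""
    # The runner is injectable so this helper is testable without the CLI.
    result = run(["samantha-llm", "qmd", "check"])
    return result.returncode == 0
```

In a real script you would call `qmd_installed()` with no arguments and fall back to non-search behaviour when it returns False.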

Indexing Your Memories

Before searching, index your memory files:

samantha-llm memories index

This indexes all Markdown files across your memory directories. Only changed files are re-indexed on subsequent runs, so indexing stays fast as your collection grows.
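The incremental behaviour can be pictured as a content-hash check: a file is re-indexed only when its current hash differs from the one recorded at the last run. This is an illustration of the idea, not QMD's actual bookkeeping:

```python
import hashlib

def files_to_reindex(contents, seen_hashes):
    """Given {path: text} and {path: hash-from-last-run}, return the
    paths that are new or changed, and update the hash record."""
    changed = []
    for path, text in contents.items():
        digest = hashlib.sha256(text.encode()).hexdigest()
        if seen_hashes.get(path) != digest:
            changed.append(path)
            seen_hashes[path] = digest  # remember for the next run
    return changed
```

On the first run everything is "changed"; afterwards only edited files are returned, which is why repeated indexing stays cheap.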

Searching

From the Command Line

samantha-llm memories search "authentication decision"

Results show matching memory excerpts with context, ranked by relevance.

Options:

Flag      Effect
-n 20     Return more results (default: 10)
--json    JSON output for scripting
--text    Plain text output
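The `--json` flag makes results easy to post-process. The output schema is not documented here, so the field names below (`path`, `score`) are assumptions for illustration only:

```python
import json

def top_paths(json_text, limit=3):
    """Parse search output and return the highest-scoring memory paths.
    The `path`/`score` field names are assumed, not QMD's documented schema."""
    results = json.loads(json_text)
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [r["path"] for r in ranked[:limit]]

# Hypothetical output shape for demonstration:
sample = ('[{"path": "decisions/auth.md", "score": 0.91},'
          ' {"path": "notes/oauth.md", "score": 0.84}]')
```

Piping the real command's output into a helper like this lets scripts act on just the best matches.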

During Sessions

Your assistant can search memories during a session using the same command. This is useful when a question comes up that might be answered by a past decision or learning — the assistant searches, reads the relevant memories, and incorporates that context into the conversation.
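One way to wire this into tooling is to shell out to the search command and fold excerpts into the conversation context. A sketch with the runner injectable and an assumed `excerpt` field (the real JSON schema may differ):

```python
import json
import subprocess

def memory_context(query, run=subprocess.run):
    """Search memories and format excerpts as a context block.
    Assumes --json output with `path` and `excerpt` fields."""
    result = run(
        ["samantha-llm", "memories", "search", query, "--json"],
        capture_output=True, text=True,
    )
    hits = json.loads(result.stdout)
    return "\n".join(f"[{h['path']}] {h['excerpt']}" for h in hits)
```

The returned block can then be prepended to the assistant's working context before it answers.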

Search Modes

QMD supports three search modes, each with different trade-offs:

Hybrid (Default)

Combines keyword matching, semantic understanding, and AI re-ranking for the best results. This is what runs when you use samantha-llm memories search.

Keyword

BM25 full-text search. Fastest option — good when you know the exact terms you’re looking for.

Semantic

Vector embedding similarity. Finds conceptually related memories even when the wording is different — useful for questions like “what did we decide about deployment?” when the memory uses words like “release process.”
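The trade-offs between the modes can be made concrete with a toy hybrid scorer that blends a keyword-overlap score with a vector-similarity score. This is a conceptual sketch only, not QMD's actual ranking (which also includes AI re-ranking):

```python
import math

def keyword_score(query, doc):
    """Fraction of query terms found in the document (toy BM25 stand-in)."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def hybrid_score(query, doc, q_vec, d_vec, alpha=0.5):
    """Blend keyword and semantic signals into one relevance score."""
    return alpha * keyword_score(query, doc) + (1 - alpha) * cosine(q_vec, d_vec)
```

Note how a query like "deployment" scores zero on keywords against a memory about a "release process", yet can still rank highly if their embeddings are close: that is exactly the gap the semantic mode covers.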

Without QMD

QMD is optional. Without it, Samantha LLM still works: memories are loaded by priority during bootstrap, just without semantic search on top. The main thing you lose is the ability to search across hundreds of memories by meaning rather than filename. For small memory collections, you may not need it at all.
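For a small collection, a plain text scan is often enough. A sketch of a literal-match fallback over a directory of Markdown memories, using only the standard library (names are illustrative, not part of samantha-llm):

```python
from pathlib import Path

def grep_memories(root, term):
    """Return paths of Markdown files whose name or content contains `term`.
    A literal-match fallback -- no semantic matching, unlike QMD."""
    term = term.lower()
    hits = []
    for path in sorted(Path(root).rglob("*.md")):
        if term in path.name.lower() or term in path.read_text().lower():
            hits.append(str(path))
    return hits
```

This finds exact words only; "deployment" will not surface a memory that talks about a "release process", which is the gap semantic search fills.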

Models

QMD uses three local AI models, downloaded to ~/.cache/qmd/models/ on first use:

Model            Size    Purpose
Embedding        ~300MB  Converts text to vectors for semantic matching
Re-ranking       ~640MB  Scores and orders candidate results
Query expansion  ~1.1GB  Reformulates queries for better recall

All models run locally. No data leaves your machine.

Next Steps