qmd — An on-device search engine that finds your notes and transcripts without the cloud

A note you can't search is a note you don't have

Notes pile up every day, but when you actually need one, you can't find it. Dozens of meeting transcripts, document folders, scattered personal notes—all sitting on disk, none of them findable by filename alone. A note you can't search is effectively a note you don't have.

qmd takes this problem head-on. In one line: it's an on-device search engine for your markdown notes, transcripts, and documentation. It runs keyword search and natural-language search together, then re-ranks the results to push accuracy higher—all locally.

The problem it solves

Note search has traditionally been one or the other. Plain keyword grep is fast but knows nothing about meaning. Cloud semantic search is smart but requires uploading your notes to someone else's server. qmd combines the strengths of both while keeping everything on your machine.

The core idea is layering three kinds of search: BM25 full-text (keywords), vector semantic search (natural language), and LLM re-ranking. All of it runs on-device via node-llama-cpp with GGUF models. Your notes never leave your machine.
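To make the layering a bit more concrete, here is a minimal sketch of one common way to merge a keyword ranking with a vector ranking: reciprocal rank fusion (RRF). Treat it as an illustration of the general idea rather than qmd's actual implementation. The function name, the constant k=60, and the sample file names are all assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, not a qmd setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by doc ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword pass and a vector pass.
keyword_hits = ["deploy.md", "timeline.md", "standup.md"]
vector_hits = ["release-notes.md", "deploy.md", "timeline.md"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Documents that both passes agree on rise to the top, and a re-ranking model would then only need to look at the head of the fused list.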

How it works

qmd splits search into three modes: fast keyword search, semantic search, and a hybrid that fuses both and re-ranks. The commands make the difference obvious.

qmd search "project timeline"           # Fast keyword search
qmd vsearch "how to deploy"             # Semantic search
qmd query "quarterly planning process"  # Hybrid + reranking (best quality)

Where qmd truly diverges from other tools is its context feature. Attach a description to a collection or path, and that description is returned whenever a sub-document matches. Contexts work as a tree, so a matching document carries the descriptions attached above it, and an LLM can make far better choices about which documents to select. The README itself flags this with 'Don't sleep on it!'
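The tree behavior is easier to picture with a small sketch. The code below resolves which descriptions would apply to a matching document by walking its path prefixes. The mapping, the function name, and the prefix-matching rule are assumptions for illustration only; they are not taken from qmd's internals. Only the qmd://notes URI and its description come from the setup commands in this post.

```python
def contexts_for(doc_uri, contexts):
    """Return descriptions attached to doc_uri's ancestors, outermost first.

    `contexts` maps a collection or path URI to its description. This mapping
    and the prefix-matching rule are illustrative assumptions, not qmd internals.
    """
    matches = [
        (prefix, text)
        for prefix, text in contexts.items()
        if doc_uri == prefix or doc_uri.startswith(prefix.rstrip("/") + "/")
    ]
    # Shorter prefixes sit higher in the tree, so list them first.
    matches.sort(key=lambda item: len(item[0]))
    return [text for _, text in matches]


contexts = {
    "qmd://notes": "Personal notes and ideas",
    "qmd://notes/work": "Work-related notes",  # hypothetical sub-path
    "qmd://meetings": "Meeting transcripts",   # hypothetical description
}
found = contexts_for("qmd://notes/work/roadmap.md", contexts)
```

Here a hit on roadmap.md would come back with both the collection-level and the sub-path description, which is the extra signal an LLM can use when choosing between similar-looking chunks.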

Setup guide

Install globally with npm or Bun.

npm install -g @tobilu/qmd
# or
bun install -g @tobilu/qmd

Then register the folders you want to search as collections and generate embeddings for semantic search.

qmd collection add ~/notes --name notes
qmd collection add ~/Documents/meetings --name meetings
qmd context add qmd://notes "Personal notes and ideas"
qmd embed

Now your entire note set is searchable through both keywords and natural language.

A practical example — wiring search into agents

qmd shines in AI agent workflows. Its --json and --files output formats are designed to be consumed directly by an LLM.

qmd search "authentication" --json -n 10
qmd query "error handling" --all --files --min-score 0.4
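If you want to consume that JSON from a script instead of a shell pipeline, a rough sketch could look like the following. The command and flags are the ones shown above. The shape of the JSON (an array of objects with "file" and "score" keys) is an assumption, so check the real output on your machine before relying on those field names. The helper names are made up for this example.

```python
import json
import subprocess


def build_search_command(query, limit=10):
    """Build the argv for the keyword search shown above (flags from the article)."""
    return ["qmd", "search", query, "--json", "-n", str(limit)]


def parse_hits(raw_json, min_score=0.0):
    """Parse JSON output into (file, score) pairs at or above min_score.

    ASSUMPTION: the output is a JSON array of objects with "file" and "score"
    keys. Verify the real field names before relying on them.
    """
    hits = json.loads(raw_json)
    return [(h["file"], h["score"]) for h in hits if h.get("score", 0.0) >= min_score]


def search(query, limit=10):
    """Run qmd and return parsed hits; requires qmd to be installed and on PATH."""
    result = subprocess.run(
        build_search_command(query, limit),
        capture_output=True, text=True, check=True,
    )
    return parse_hits(result.stdout)


# Hypothetical sample output, so the parsing can be tried without qmd installed.
sample = '[{"file": "notes/auth.md", "score": 0.82}, {"file": "notes/misc.md", "score": 0.31}]'
top = parse_hits(sample, min_score=0.4)
```

Keeping the parsing separate from the subprocess call makes it easy to swap in the real field names once you have seen the actual output.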

For tighter integration, you can run an MCP server. In Claude Code, installing the plugin is the recommended path.

claude plugin marketplace add tobi/qmd
claude plugin install qmd@qmd

When the MCP server runs over HTTP transport, it loads the models once and stays up, so each call avoids the cost of reloading them.

When not to use it

qmd isn't a silver bullet. Running GGUF models locally means embedding generation and re-ranking cost time and memory depending on your hardware. If you have few notes or only need filename matching, that weight isn't worth it.

And if your goal is real-time collaborative search or a central index shared across a whole team, that's a different shape from an on-device design. qmd is optimized for 'my knowledge, on my machine'—closer to personal and agent search than team sharing.

Comparing alternatives

If you want pure semantic search, a common combo is a cloud embedding API plus a vector database. Smart, but your data leaves the building and costs accumulate. At the other extreme, tools like ripgrep are fully local and blazing fast but understand no meaning.

qmd sits between these extremes. It stays local, uses keywords and meaning together, and shores up accuracy with a final LLM re-ranking pass. When you want privacy and search quality at the same time, it's a reasonable pick.

What to check before wiring it into an agent

qmd is more interesting as an agent retrieval layer than as a standalone note search command. The README calls out the --json and --files outputs for agentic workflows. In practice, an agent can first ask qmd for the most relevant files, then retrieve only the selected documents with qmd get, instead of blindly scanning an entire notes folder.
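That two-step flow (list candidates, then fetch only what was picked) can be sketched roughly as below. The qmd query, --files, and qmd get names come from this post. Everything else is an assumption: the sketch treats each non-empty line of the --files output as one file path, which may not match the real format, and pick is a hypothetical callback standing in for whatever selection step your agent uses. A stub runner is included so the flow can be tried without qmd installed.

```python
import subprocess


def run_qmd(args):
    """Run a qmd subcommand and return stdout; requires qmd on PATH."""
    result = subprocess.run(["qmd", *args], capture_output=True, text=True, check=True)
    return result.stdout


def retrieve(query, pick, run=run_qmd):
    """Two-step retrieval: list candidate files, then fetch only the picked ones.

    ASSUMPTION: each non-empty line of the `--files` output is treated as one
    file path; adjust the parsing to the real output format.
    `pick` is a hypothetical callback (for example, an LLM call) that selects paths.
    """
    listing = run(["query", query, "--files"])
    candidates = [line.strip() for line in listing.splitlines() if line.strip()]
    return {path: run(["get", path]) for path in pick(candidates)}


def fake_run(args):
    """Stub standing in for qmd so the flow can be exercised offline."""
    if args[0] == "query":
        return "notes/a.md\nnotes/b.md\n"
    return "body of " + args[1]


# Pick only the first candidate, then fetch just that document.
docs = retrieve("error handling", lambda paths: paths[:1], run=fake_run)
```

The point of the split is that the agent only pays the token cost of full documents for the handful it actually selected.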

The context feature is also worth checking early. When you attach context to a collection, that context is returned with matching sub-documents. For an LLM, this is the difference between seeing an isolated chunk and understanding what knowledge area that chunk belongs to. If your agent often picks the wrong note or misses the reason a document matters, collection context may be the feature that improves the workflow.

FAQ

Is qmd a cloud search service?
No. Based on the project README, qmd runs BM25 full-text search, vector semantic search, and LLM re-ranking locally through node-llama-cpp and GGUF models. The point is to search private notes and documents without sending them to an external service.

Will it always be faster than simple grep?
No. Plain keyword search can still be faster for small folders or exact matches. qmd becomes more useful when you need semantic search, re-ranking, and agent-friendly retrieval across notes, meeting transcripts, documentation, and knowledge bases.

How does it connect to Claude Code?
You can call the CLI directly, but the README also provides an MCP server path and a Claude Code plugin installation path. That makes it practical to use qmd as a local retrieval tool inside an agent workflow rather than as a separate manual search step.

Wrap-up

A note that can't be searched holds no value. qmd finds those notes on your own device, through both keywords and natural language. If you're looking to give an AI agent search capability, connecting it to your workflow via the MCP server or SDK is a solid starting point. Pair it with the context feature and your search quality climbs another notch.

Seunghyeon's Agentic Lab