mgrep: The CLI Tool That Greps Everything, Semantically
Ever found yourself grepping through a codebase for a specific concept, only to realize you need to search through PDFs, images, or documents too? Or maybe you've tried to find "that one function" but can only remember what it does, not its exact name. Traditional grep hits a wall when the search needs to be about meaning, not just raw text.
Enter mgrep from mixedbread-ai. It's a CLI-native tool that brings semantic search to your terminal, letting you search not just code but virtually any file type by the meaning behind your words.
What It Does
mgrep is a command-line tool that performs semantic search across your files. You give it a natural language query (like "function that validates user login"), and it finds relevant content in your code, text files, PDFs, images, and more. It works by generating embeddings (vector representations) of both your query and your file contents, then finding the closest matches. It's grep, but for ideas and concepts.
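To make the "embed, then find the closest matches" idea concrete, here is a minimal sketch in Python. It is not mgrep's implementation: the `embed` function below is a toy bag-of-words stand-in for a real embedding model (so it only matches shared words, not meaning), and the names `embed`, `cosine`, and `search` are hypothetical. The ranking step, cosine similarity between a query vector and each document vector, is the part that carries over.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    A real semantic search tool would call a neural embedding model here.
    """
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, documents: dict, top_k: int = 3) -> list:
    """Return the top_k (score, name) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), name) for name, text in documents.items()]
    scored.sort(reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    docs = {
        "auth.py": "function that validates user login credentials",
        "db.py": "open database connection pool",
        "readme.txt": "project setup instructions",
    }
    for score, name in search("validates user login", docs):
        print(f"{score:.2f}  {name}")
```

Swapping the toy `embed` for a real model is what would turn this from keyword overlap into semantic matching; the ranking logic would stay the same.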
Why It's Cool
The magic of mgrep isn't just that it searches semantically; it's that it does so across mixed modalities from your terminal. You can point it at a directory and it will intelligently process different file types using the appropriate models.
- Truly Multi-Format: It uses dedicated models for different content. Code, text, PDF text, and images are all encoded into the same vector space, so you can search across all of them with one query.
- CLI-Native: It feels like a classic Unix tool. Pipe it, redirect it, use it in your scripts. It slots right into a developer's existing workflow.
- Offline-First (mostly): While it can use cloud embedding APIs for high performance, it also supports local, offline models, so your data can stay on your machine.
- Smart Chunking: It breaks down large documents and images into meaningful chunks before creating embeddings, so results point to the relevant passage or region rather than just a whole file.
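The chunking idea from the last bullet can be illustrated with a short sketch. This assumes the simplest possible strategy, fixed-size word windows with some overlap; how mgrep actually splits documents is not described here, and the function name `chunk_text` and its parameters are hypothetical.

```python
def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows.

    Overlap keeps a sentence that straddles a boundary visible in both
    neighbouring chunks, so it can still be matched by a query.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(12))
    for chunk in chunk_text(sample, max_words=5, overlap=2):
        print(chunk)
```

Each chunk would then be embedded separately, which is why a match can point at one passage instead of the entire file.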
Imagine searching your project for "database schema diagram" and having it return both the schema.sql file and the whiteboard screenshot you saved in your docs/ folder. That's the power mgrep unlocks.
How to Try It
Getting started is straightforward. You'll need Python (3.9+).
Install it:
pip install mgrep
Run your first semantic search: The simplest way is to use the default, free API (requires an internet con