Show HN: Semble – Fast code search for agents with near-transformer accuracy (github.com/minishlab)
7 points by stephantul 43 days ago
Hey HN! We've just open-sourced Semble, a fast and accurate code search library built for agents. We're also releasing potion-code-16M, a small code-specialized static embedding model that powers it.

Most embedding-based code search methods are either too slow to index on demand or need GPU infrastructure, while grep-style retrieval methods often cannot find the relevant content. Semble combines the speed of grep-style retrieval with the retrieval quality of embedding-based search, so agents waste less time and fewer tokens exploring.

Main features:

- Fast: indexes a full codebase in ~250 ms and answers queries in ~1.5 ms, all on CPU (roughly 200x faster indexing and 10x faster queries than a code-specialized transformer).

- Accurate: on par with code-specialized transformer models at a fraction of the size (see our benchmarks for more info).

- MCP server: drop-in tool for Claude Code, Cursor, Codex, OpenCode, and any other MCP-compatible CLI/agent. Repos are cloned and indexed on demand.

- Zero setup: runs on CPU with no API keys, GPU, or external services.
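For a rough intuition of why a static embedding model can index and query this quickly on CPU, here is a minimal sketch of the general technique. It is not Semble's actual code or API: the token table, `embed`, `build_index`, and `search` are all hypothetical names with toy values, standing in for a real model such as potion-code-16M. The idea is that each token has one fixed vector, so embedding a chunk is a table lookup plus an average, with no transformer forward pass.

```python
import math
import re

# Toy static embedding table: one fixed vector per token.
# Hypothetical values for illustration only; a real model has a far larger table.
TOKEN_VECTORS = {
    "read": [1.0, 0.0, 0.0],
    "file": [0.9, 0.1, 0.0],
    "open": [0.8, 0.2, 0.0],
    "http": [0.0, 1.0, 0.0],
    "request": [0.1, 0.9, 0.0],
    "sort": [0.0, 0.0, 1.0],
    "list": [0.0, 0.1, 0.9],
}
DIM = 3


def tokenize(text):
    # Lowercase and split on anything that is not a letter,
    # so "read_file" becomes ["read", "file"].
    return re.findall(r"[a-z]+", text.lower())


def embed(text):
    # Look up each known token, average the vectors, then L2-normalize.
    vecs = [TOKEN_VECTORS[t] for t in tokenize(text) if t in TOKEN_VECTORS]
    if not vecs:
        return [0.0] * DIM
    mean = [sum(col) / len(vecs) for col in zip(*vecs)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm else mean


def build_index(chunks):
    # Indexing is just embedding every chunk once.
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index, query, top_k=1):
    # Vectors are normalized, so the dot product is the cosine similarity.
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, v)), chunk) for chunk, v in index]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


chunks = [
    "def read_file(path): return open(path).read()",
    "def send_http_request(url): ...",
    "def sort_list(items): return sorted(items)",
]
index = build_index(chunks)
print(search(index, "open a file"))
```

In this toy setup, the query "open a file" ranks the `read_file` chunk first even though the query and the chunk only partly share tokens, which is the kind of match plain grep would miss.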

Install as an MCP server for Claude Code:

claude mcp add semble -s user -- uvx --from "semble[mcp]" semble

Or check our README for install instructions for Codex, OpenCode, Cursor, and other agents.

Semble: https://github.com/MinishLab/semble

Benchmarks: https://github.com/MinishLab/semble/tree/main/benchmarks

How it works: https://github.com/MinishLab/semble#how-it-works

Model: https://huggingface.co/minishlab/potion-code-16M
