GuildLM trains small specialist LLMs grouped into domain guilds and coordinated by a central brain. One master per domain. One brain to coordinate them all.
GuildLM is the antithesis of the "one giant model that knows everything" approach: dozens of sharp 3B–8B masters instead of a single 500B generalist.
A single monolithic model spreads itself thin across every domain. GuildLM does the opposite: many small specialists, each sharp at one thing, with a router that picks the right one.
model.generate(anything)
# hope it's good at this one
brain.route(request)
# → the master for the job
The brain classifies every request and routes it to a specialist — it never answers itself. Each guild holds a team of focused SLMs.
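To make the classify-then-delegate idea concrete, here is a minimal sketch of a router that never produces an answer of its own. Everything in it is an assumption for illustration: the `Brain` class, the keyword table standing in for a real intent classifier, and the specialist name `go_reviewer`. It is not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical specialist signature: takes a request, returns a response.
Specialist = Callable[[str], str]


@dataclass
class Brain:
    """Toy router: classifies a request and delegates; it never answers itself."""
    specialists: Dict[str, Specialist]
    keywords: Dict[str, str]  # keyword -> specialist name (stand-in for a classifier)

    def classify(self, request: str) -> str:
        text = request.lower()
        for keyword, name in self.keywords.items():
            if keyword in text:
                return name
        # No fallback answer: the brain only routes.
        raise LookupError("no specialist matches this request")

    def route(self, request: str) -> str:
        name = self.classify(request)
        return self.specialists[name](request)


# Toy specialists that just tag the request with their own name.
brain = Brain(
    specialists={
        "go_generator": lambda req: f"[go_generator] {req}",
        "go_reviewer": lambda req: f"[go_reviewer] {req}",
    },
    keywords={"review": "go_reviewer", "write": "go_generator"},
)
```

Raising on an unmatched request, rather than falling back to a generic reply, mirrors the rule that the brain routes and never answers.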
Every specialist is built by the same four domain-agnostic tools. The guild repos supply only the recipes; the engine never changes.
Discover → download → process → generate → build. Turns sources like GitHub into clean SFT datasets via a teacher model, with an offline mode that needs no GPU.
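The five stages can be pictured as plain function composition. The sketch below is only an illustration under stated assumptions: the function names, the placeholder sources, the JSONL output layout, and the template-based `offline_teacher` are all hypothetical and are not taken from the forge codebase.

```python
import json
from typing import Callable, Dict, List


def discover() -> List[str]:
    # Placeholder source identifiers; a real run would query a code host.
    return ["example/repo-a", "example/repo-b"]


def download(sources: List[str]) -> Dict[str, str]:
    # Stand-in for fetching files: every source yields the same tiny snippet.
    return {src: "func Add(a, b int) int { return a + b }  " for src in sources}


def process(raw: Dict[str, str]) -> List[str]:
    # Strip whitespace and drop duplicates while keeping first-seen order.
    return list(dict.fromkeys(text.strip() for text in raw.values()))


def generate(snippets: List[str], teacher: Callable[[str], str]) -> List[Dict[str, str]]:
    # The teacher writes an instruction for each snippet, giving an SFT pair.
    return [{"instruction": teacher(s), "response": s} for s in snippets]


def build(records: List[Dict[str, str]]) -> str:
    # One JSON object per line (JSONL), a common SFT dataset layout.
    return "\n".join(json.dumps(r) for r in records)


def offline_teacher(snippet: str) -> str:
    # Template-based stand-in so the sketch runs with no model and no GPU.
    name = snippet.split("(")[0].split()[-1]
    return f"Write a Go function named {name}."


dataset = build(generate(process(download(discover())), offline_teacher))
```

Swapping `offline_teacher` for a call to a real teacher model would leave the other four stages unchanged, which is the point of keeping each stage a separate function.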
forge run --config go/forge/go_generator.yaml
QLoRA supervised fine-tuning (and optional DPO) over a shared base model. Each specialist is a small LoRA adapter, mergeable and quantizable for serving.
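Why an adapter is small and mergeable follows from the standard LoRA formulation: the frozen weight W gets a low-rank update, so the merged weight is W + (alpha / r) * B * A. The pure-Python sketch below works through that arithmetic on a toy 2x2 matrix. The helper names are hypothetical, and real training would use a tensor library rather than nested lists.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    # Plain nested-list matrix product, fine for a toy example.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w: Matrix, b: Matrix, a: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(a)  # A has shape (r, k); B has shape (d, r)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def adapter_params(d: int, k: int, r: int) -> int:
    # B contributes d * r weights and A contributes r * k.
    return r * (d + k)


w = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight, d = k = 2
b = [[1.0], [2.0]]            # d x r with rank r = 1
a = [[0.5, 0.5]]              # r x k
merged = merge_lora(w, b, a, alpha=2.0)  # [[2.0, 1.0], [2.0, 3.0]]
```

For a 4096 x 4096 projection at rank 16, `adapter_params(4096, 4096, 16)` gives 131,072 trainable weights against 16,777,216 in the full matrix, a factor of 128 fewer.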
anvil-train --config configs/guilds/go_generator.yaml
Pluggable evaluators score each adapter: a sandboxed go_functional runner builds and tests real Go, while llm_judge and safety grade prose. Reports in JSON + Markdown.
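One common way to make evaluators pluggable is a name-to-function registry. The sketch below assumes that design, which may differ from what crucible actually does. The evaluators `non_empty` and `has_go_func` are toy stand-ins invented for this example; they are not the real go_functional, llm_judge, or safety evaluators.

```python
import json
from typing import Callable, Dict, List

Evaluator = Callable[[str], float]
EVALUATORS: Dict[str, Evaluator] = {}


def register(name: str) -> Callable[[Evaluator], Evaluator]:
    # Decorator that adds an evaluator to the registry under a given name.
    def wrap(fn: Evaluator) -> Evaluator:
        EVALUATORS[name] = fn
        return fn
    return wrap


@register("non_empty")
def non_empty(output: str) -> float:
    return 1.0 if output.strip() else 0.0


@register("has_go_func")
def has_go_func(output: str) -> float:
    # Crude textual check; a functional runner would build and test the code.
    return 1.0 if "func " in output else 0.0


def evaluate(output: str, names: List[str]) -> Dict[str, float]:
    # Run only the evaluators an eval suite asks for, by name.
    return {name: EVALUATORS[name](output) for name in names}


def to_markdown(scores: Dict[str, float]) -> str:
    rows = ["| evaluator | score |", "| --- | --- |"]
    rows += [f"| {name} | {score:.2f} |" for name, score in scores.items()]
    return "\n".join(rows)


scores = evaluate("func Add(a, b int) int { return a + b }", ["non_empty", "has_go_func"])
report_json = json.dumps(scores, indent=2)
report_md = to_markdown(scores)
```

Adding a new evaluator is then a matter of decorating one more function; the suite configuration only needs its name.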
crucible run go/crucible/go_generator.yaml
The brain classifies intent, routes to a specialist, hot-swaps its LoRA adapter, and runs multi-step pipelines like reviewer → generator → reviewer for bug fixes.
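A reviewer → generator → reviewer pipeline can be sketched as an ordered list of specialist names, with each step's output feeding the next. The code below is an assumption-laden illustration: `AdapterHost`, `PIPELINES`, the `bug_fix` intent name, and the toy specialists are hypothetical, and the "hot-swap" is modelled as nothing more than tracking which adapter is active.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical intent -> ordered specialist steps.
PIPELINES: Dict[str, List[str]] = {
    "bug_fix": ["go_reviewer", "go_generator", "go_reviewer"],
}


class AdapterHost:
    """Toy stand-in for one base model serving several swappable adapters."""

    def __init__(self, adapters: Dict[str, Callable[[str], str]]) -> None:
        self.adapters = adapters
        self.active: Optional[str] = None
        self.swaps = 0

    def run(self, name: str, text: str) -> str:
        if self.active != name:
            # Swap only when the requested adapter is not already loaded.
            self.active = name
            self.swaps += 1
        return self.adapters[name](text)


def run_pipeline(host: AdapterHost, intent: str, request: str) -> Tuple[str, List[str]]:
    text, trace = request, []
    for step in PIPELINES[intent]:
        text = host.run(step, text)  # each step consumes the previous output
        trace.append(step)
    return text, trace


host = AdapterHost({
    "go_reviewer": lambda t: t + " | reviewed",
    "go_generator": lambda t: t + " | patched",
})
result, trace = run_pipeline(host, "bug_fix", "fix race")
```

Here the second review pass sees the patched output, so the generator's fix is checked before anything is returned.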
brain ask "Fix the race condition in this Go code"
Four domain-agnostic core tools, the first guild, a template to build more, and this site. All Apache-2.0.
The umbrella — project hub, local serving, and a runnable offline end-to-end demo wiring all four core tools together.
Data pipeline — discover, download, clean, and generate SFT datasets for any domain.
Training infra — QLoRA SFT, DPO, LoRA merge, and quantization for any base model.
Evaluation — pluggable evaluators, sandboxed code execution, JSON/Markdown reports.
The router — classify intent, route to a specialist, orchestrate multi-step pipelines.
First Guild: The Code Guild. Go specialists (generator, reviewer, tester, explainer) with full forge/anvil/crucible recipes.
The Code Guild's sibling — SQL specialists (generator, reviewer, optimizer, explainer), built from the template to prove guilds just plug in.
Boilerplate + new_guild.sh generator to scaffold a new guild in minutes.
The four domain-agnostic tools, production-grade with pyproject packaging, pytest, and CI. Status: Shipped.
Four Go specialists with forge data recipes, anvil training recipes, and crucible eval suites, wired to the brain registry. Status: Spec complete.
A skeleton plus new_guild.sh that scaffolds a complete, schema-faithful guild from placeholder tokens. Status: Shipped.
New domains built from the template, and a brain trained to route across all of them. Status: Planned.
Build a guild from the template, sharpen the Code Guild, or improve the core tools. Everything is open and Apache-2.0.