Agent MCP Runtime

Compose AI skills.
Run them anywhere.

One binary pulls skill packs from Git repos, feeds them to any LLM you choose, and runs tools over MCP — all auto-composed for your project's framework.

agent-mcp-runtime --task "Your task here"

Gemini • OpenAI • Claude • Groq  •  Rails • Hanami auto-detection

Why this exists

The problem: skills and tools are scattered across repositories

AI agent skills live in separate Git repos — Ruby patterns, Rails workflows, Hanami idioms, planning disciplines. Each project needs a different combination. Manually composing them means copy-pasting prompts, remembering which skills overlap, and wiring tools by hand.

The runtime solves this: it auto-detects your framework, resolves skill conflicts, caches packs locally, and exposes everything to your LLM through one command.

Before: manual prompt assembly
- Find skills
- Resolve conflicts
- Craft prompt
- Wire tools
Result: fragile and project-specific

After: agent-mcp-runtime --task "..."
- Auto-detect framework
- Resolve & compose packs
- Run ReAct loop → answer
Result: repeatable and portable

The big picture

Six repositories, one runtime

Think of the runtime as a conductor. Four skill packs are the sheet music: each one is a Git repo full of Markdown files that teach an LLM how to perform specific tasks. The runtime reads them all, resolves overlaps, and hands a clean catalog to the LLM.

Git repositories
- ruby-core-skills: 15 shared Ruby skills. Always loaded.
- rails-agent-skills: 28 Rails skills + 9 agents
- hanakai-yaku: 35 Hanami/dry-rb skills
- agnostic-planning-skills: 10 planning skills

Orchestrator
- agent-mcp-runtime (Rust CLI): pack resolver, ReAct runner, MCP server, context merger

Outputs
- LLM APIs: Gemini • OpenAI • Claude • Groq
- MCP subprocess: JSON-RPC over stdin/stdout
- ruby-skill-bench: measures ROI of context

How it works

Three mechanisms, one session

The runtime does three things every time you invoke it. Understanding each one reveals how the whole system composes itself around your project.

1. Pack Resolution

Skill packs are resolved in layers. Each layer has a priority number, and a layer with a lower number overrides layers with higher numbers for same-named skills. This lets framework-specific packs specialize core skills while a local development pack overrides everything.

Priority 0: Local registry (pack development)
  --registry ./my-pack/ overrides every skill below it. Use when authoring or testing a pack locally.

Priority 10: Framework packs (auto-detected)
  rails or hanami. Detected from Gemfile. Overrides core skills of the same name. Optional with --pack flag.

Priority 20: Core pack (always loaded)
  ruby-core-skills. 15 shared Ruby skills loaded in every session. Provides the foundation framework packs build on.

Priority 30: Default stack
  planning. Language-agnostic planning skills. Loaded unless excluded.

Merge: all layers combine into a single skill catalog.
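The layering above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the runtime's actual resolver: the Pack struct, the resolve function, and the skill names "testing" and "refactoring" are hypothetical, and the only rule modeled is that a lower priority number wins for same-named skills.

```rust
use std::collections::HashMap;

/// A skill pack with its resolution priority (lower number = higher precedence).
/// Hypothetical type for illustration only.
#[derive(Debug, Clone)]
pub struct Pack {
    pub name: String,
    pub priority: u32,
    pub skills: Vec<String>,
}

/// Merge packs into a catalog that maps each skill name to the pack providing it.
/// Packs are applied from lowest precedence to highest, so a later insert
/// overwrites an earlier one that has the same skill name.
pub fn resolve(packs: &[Pack]) -> HashMap<String, String> {
    let mut ordered: Vec<&Pack> = packs.iter().collect();
    // Largest priority number first, i.e. lowest precedence first.
    ordered.sort_by(|a, b| b.priority.cmp(&a.priority));

    let mut catalog = HashMap::new();
    for pack in ordered {
        for skill in &pack.skills {
            catalog.insert(skill.clone(), pack.name.clone());
        }
    }
    catalog
}
```

With a core pack at priority 20 and a Rails pack at priority 10 that both define a skill called "testing", the catalog would point "testing" at the Rails pack while core-only skills stay with the core pack.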

2. ReAct Loop

The core execution engine. The LLM alternates between choosing a tool to call and observing its result. It keeps going until it has enough information for a final answer — or hits the step limit.

1. User supplies the task.
2. Build system prompt (tools + format rules).
3. LLM: ask_llm(prompt).
4. parse_react_step(response):
   - Final Answer: ... → return to user ✓
   - Action: tool name, Action Input: args → tool.call(input) → result → append "Observation: result" → back to step 3

Step ceiling: max_steps = 5. Prevents infinite loops and runaway LLM costs.
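To make the loop concrete, here is a small sketch of it in Rust. The names ask_llm, parse_react_step, and max_steps come from the description above; everything else (the ReactStep enum, the run_react function, the closure-based signatures, and the exact text format being parsed) is an assumption made for illustration and may differ from the real implementation.

```rust
/// One parsed LLM response: either a tool call or the final answer.
#[derive(Debug, PartialEq)]
pub enum ReactStep {
    Action { tool: String, input: String },
    FinalAnswer(String),
}

/// Parse a response written in the "Action / Action Input / Final Answer" format.
pub fn parse_react_step(response: &str) -> Option<ReactStep> {
    if let Some(idx) = response.find("Final Answer:") {
        let answer = response[idx + "Final Answer:".len()..].trim();
        return Some(ReactStep::FinalAnswer(answer.to_string()));
    }
    let mut tool = None;
    let mut input = String::new();
    for line in response.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("Action Input:") {
            input = rest.trim().to_string();
        } else if let Some(rest) = line.strip_prefix("Action:") {
            tool = Some(rest.trim().to_string());
        }
    }
    tool.map(|tool| ReactStep::Action { tool, input })
}

/// Run the loop until a final answer arrives or `max_steps` is reached.
/// `ask_llm` and `call_tool` are closures so the sketch needs no network.
pub fn run_react(
    task: &str,
    max_steps: usize,
    mut ask_llm: impl FnMut(&str) -> String,
    mut call_tool: impl FnMut(&str, &str) -> String,
) -> Result<String, String> {
    let mut prompt = format!("Task: {task}\n");
    for _ in 0..max_steps {
        let response = ask_llm(&prompt);
        match parse_react_step(&response) {
            Some(ReactStep::FinalAnswer(answer)) => return Ok(answer),
            Some(ReactStep::Action { tool, input }) => {
                let result = call_tool(&tool, &input);
                // Feed the tool result back so the next LLM call can see it.
                prompt.push_str(&response);
                prompt.push_str(&format!("\nObservation: {result}\n"));
            }
            None => return Err(format!("unparseable response: {response}")),
        }
    }
    Err(format!("step limit of {max_steps} reached without a final answer"))
}
```

Because the LLM and the tools are passed in as closures, a test can script the conversation: return an Action on the first call and a Final Answer on the second, then check that the loop stops. An LLM that never produces a final answer runs into the step ceiling and yields an error instead of looping forever.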

3. Tool System

Every capability the agent can use — built-in skills, external MCP tools, project context — implements the same Tool trait. The runner treats them all uniformly.

trait Tool: Send + Sync
  name() • description() • call(input) → Result

Built-in Skill Tools (backed by RegistryResolver)
  list_skills • use_skill • list_agents • use_agent • list_packs

Project Context (backed by ProjectContext)
  get_project_context: database schema • routes • models. Merged from HTTP MCP providers.

Remote MCP Tools (backed by McpClient)
  Any tool exposed by the spawned MCP server. McpTool wrapper • JSON-RPC.

Why a trait and not an enum?

Traits enable compile-time verification that every tool satisfies the contract. They also make testing trivial: swap in a MockTool that returns canned responses, and the entire ReAct loop becomes testable without networking.
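A rough sketch of what such a trait and a mock might look like follows. The method names name, description, and call and the Send + Sync bound come from the description above; the String error type, the MockTool fields, and the ToolRegistry type are assumptions chosen to keep the example dependency-free.

```rust
use std::collections::HashMap;

/// The uniform tool contract. The `Result<String, String>` return type is an
/// assumption; the real trait may use a richer error type.
pub trait Tool: Send + Sync {
    fn name(&self) -> &str;
    fn description(&self) -> &str;
    fn call(&self, input: &str) -> Result<String, String>;
}

/// Test double that ignores its input and returns a canned response.
pub struct MockTool {
    pub name: String,
    pub response: String,
}

impl Tool for MockTool {
    fn name(&self) -> &str {
        &self.name
    }
    fn description(&self) -> &str {
        "returns a canned response"
    }
    fn call(&self, _input: &str) -> Result<String, String> {
        Ok(self.response.clone())
    }
}

/// Hypothetical registry holding heterogeneous tools behind trait objects.
#[derive(Default)]
pub struct ToolRegistry {
    tools: HashMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    pub fn register(&mut self, tool: Box<dyn Tool>) {
        self.tools.insert(tool.name().to_string(), tool);
    }

    /// Look a tool up by name and call it; unknown names become errors.
    pub fn dispatch(&self, name: &str, input: &str) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => tool.call(input),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}
```

The registry never needs to know whether a tool is built in, backed by project context, or proxied to an MCP subprocess. New kinds of tools can be registered without touching the dispatch code, which is harder to arrange with a closed enum.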


Walkthrough

A concrete example

Here's what happens step-by-step when you ask the runtime to add a database column to a Rails model.

1. Parse CLI args
   Input: --task "Add full_name to User model"
   Provider: gemini (default)
   Model: gemini-1.5-flash

2. Auto-detect framework & resolve packs
   Reads Gemfile → detects Rails → loads core + rails packs → resolves to 43 skills, 9 agents

3. Query context providers
   Fetches database schema, routes, models, and gems from Rails AI Bridge (HTTP MCP) → ProjectContext

4. Register tools
   6 built-in tools (list_skills, use_skill, list_agents, use_agent, list_packs, get_project_context) + discovered MCP tools

5. ReAct Loop (up to 5 steps)
   Step 1: LLM calls list_skills → sees 43 available skills
   Step 2: LLM calls get_project_context → sees User schema, existing columns
   Step 3: LLM calls use_skill("write-migration") → receives migration template instructions
   Final Answer: "Here is the migration to add full_name to the users table: ..."

6. Print result
   Final answer printed to stdout. Process exits with code 0.
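Step 2 of the walkthrough depends on detecting the framework from the Gemfile. The source does not say how detection works, so the following is only a plausible sketch: it assumes detection means finding a gem declaration named rails or hanami, and the Framework enum and detect_framework function are hypothetical names.

```rust
/// Frameworks the sketch knows about.
#[derive(Debug, PartialEq)]
pub enum Framework {
    Rails,
    Hanami,
}

/// Scan Gemfile text for a `gem "rails"` or `gem "hanami"` declaration.
/// Commented-out lines are ignored because they do not start with `gem `.
pub fn detect_framework(gemfile: &str) -> Option<Framework> {
    let is_quote = |c: char| c == '"' || c == '\'';
    for line in gemfile.lines() {
        let Some(rest) = line.trim().strip_prefix("gem ") else {
            continue;
        };
        // The gem name is the first quoted token after `gem`.
        let unquoted = rest.trim_start().trim_start_matches(is_quote);
        let name = unquoted.split(is_quote).next().unwrap_or("");
        match name {
            "rails" => return Some(Framework::Rails),
            "hanami" => return Some(Framework::Hanami),
            _ => {}
        }
    }
    None
}
```

Matching on the exact gem name rather than a substring keeps gems such as rails-i18n from triggering a false positive. A project with neither gem yields None, in which case only the non-framework packs would apply.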

Why it's built this way

Key design decisions

Git packs over embedded skills

Skills change independently of the runtime. Embedding them requires recompilation on every update. Git repos let packs version independently, accept PRs, and stay fresh with a pull.

Subprocess MCP over HTTP MCP

stdin/stdout pipes have zero network overhead, no port conflicts, and a simpler lifecycle (spawn → communicate → kill). HTTP MCP is still used for context providers where the data source is remote.
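The spawn → communicate → kill lifecycle boils down to writing a JSON-RPC request to the child's stdin and reading the reply from its stdout. The sketch below shows only the communicate step, written against generic readers and writers so it runs without a real subprocess. The function names are hypothetical, JSON is assembled by hand to avoid dependencies, and messages are assumed to be newline-delimited as in MCP's stdio transport.

```rust
use std::io::{self, BufRead, Write};

/// Build a JSON-RPC 2.0 request. `params` must already be valid JSON text.
pub fn json_rpc_request(id: u64, method: &str, params: &str) -> String {
    format!(r#"{{"jsonrpc":"2.0","id":{id},"method":"{method}","params":{params}}}"#)
}

/// Send one request line and read one response line.
/// In a real client, `to_server` would wrap the child's stdin and
/// `from_server` a buffered reader over the child's stdout.
pub fn round_trip(
    to_server: &mut impl Write,
    from_server: &mut impl BufRead,
    request: &str,
) -> io::Result<String> {
    writeln!(to_server, "{request}")?;
    to_server.flush()?;
    let mut line = String::new();
    from_server.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}
```

To talk to an actual server, the same round_trip could be handed the pipes of a process started with std::process::Command using Stdio::piped() for stdin and stdout. Keeping the I/O generic also means tests can substitute in-memory buffers.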

Thin LLM trait over provider APIs

The ask_llm(prompt) interface keeps providers swappable. The tradeoff: provider-specific features like function calling and structured output are unavailable. The abstraction itself costs nothing measurable, since LLM latency dominates anyway.
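A thin interface of this kind might look like the sketch below. Only the ask_llm name comes from the text above; the Llm trait name, the String error type, and the CannedLlm and FailingLlm stand-ins are assumptions for illustration.

```rust
/// Provider-agnostic interface: one prompt in, one completion out.
pub trait Llm {
    fn ask_llm(&self, prompt: &str) -> Result<String, String>;
}

/// Stand-in provider that returns a fixed reply and needs no API key.
pub struct CannedLlm {
    pub reply: String,
}

impl Llm for CannedLlm {
    fn ask_llm(&self, _prompt: &str) -> Result<String, String> {
        Ok(self.reply.clone())
    }
}

/// Stand-in provider that always fails, useful for exercising error paths.
pub struct FailingLlm;

impl Llm for FailingLlm {
    fn ask_llm(&self, _prompt: &str) -> Result<String, String> {
        Err("provider unavailable".to_string())
    }
}

/// Caller code depends only on the trait, so providers are interchangeable.
pub fn first_line(llm: &dyn Llm, prompt: &str) -> Result<String, String> {
    let reply = llm.ask_llm(prompt)?;
    Ok(reply.lines().next().unwrap_or("").to_string())
}
```

Because the interface is plain text in and plain text out, structured tool calls have to be encoded in the prompt format itself, which is the tradeoff the paragraph above describes.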

Mockable traits over real I/O in tests

Every I/O boundary — LLM, git, MCP subprocess — has a mockable trait. This makes the entire runtime testable without network access, a Git remote, or an API key.

Frontmatter in Markdown for skill metadata

Skills are Markdown files with YAML frontmatter (like Jekyll/Hugo). Self-contained, human-readable, no separate metadata files. The cost: schema errors surface at runtime, when a pack is loaded, not at compile time.
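As an illustration of the format, the sketch below splits a skill file into frontmatter fields and a Markdown body. It is deliberately simplified: it handles only flat key: value lines rather than full YAML, and treating name as a required field is an assumption, not something the source specifies.

```rust
use std::collections::BTreeMap;

/// Split a skill file into (frontmatter fields, Markdown body).
/// Handles only flat `key: value` pairs, not full YAML.
pub fn parse_skill(source: &str) -> Result<(BTreeMap<String, String>, String), String> {
    let rest = source
        .strip_prefix("---\n")
        .ok_or("missing opening '---' line")?;
    let (header, body) = rest
        .split_once("\n---\n")
        .ok_or("missing closing '---' line")?;

    let mut fields = BTreeMap::new();
    for line in header.lines().filter(|l| !l.trim().is_empty()) {
        let (key, value) = line
            .split_once(':')
            .ok_or_else(|| format!("malformed frontmatter line: {line}"))?;
        fields.insert(key.trim().to_string(), value.trim().to_string());
    }
    // The schema check runs when the file is loaded, which is why
    // schema errors surface at runtime.
    if !fields.contains_key("name") {
        return Err("frontmatter is missing required field 'name'".to_string());
    }
    Ok((fields, body.to_string()))
}
```

A production parser would more likely hand the header to a YAML library and deserialize into a typed struct; the hand-rolled version here only shows where validation happens.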

Zero unsafe Rust

The crate sets unsafe_code = "deny" as a compiler lint, enforces strict clippy gates, and allows no .unwrap() in non-test code. Memory safety is enforced by the compiler, not by convention.


Guardrails

Safety and reliability

Step Ceiling
- Default max_steps = 5
- Prevents infinite loops
- Limits per-session cost

Graceful Degradation
- MCP errors = tool results
- Context provider failures ignored
- Missing packs = warning only

Key Security
- API keys from env vars only
- Never in files or CLI args
- No secret in source code
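Two of these guardrails are easy to picture in code. The sketch below shows one way tool errors could be turned into observations, failed context providers skipped, and API keys read through an injected lookup. All function names are hypothetical, and the variable name GEMINI_API_KEY used in the note afterwards is an assumed example rather than a documented setting.

```rust
/// Turn a tool outcome into the observation text fed back to the LLM.
/// An error becomes ordinary text, so the loop continues instead of aborting.
pub fn observation(outcome: Result<String, String>) -> String {
    match outcome {
        Ok(output) => output,
        Err(error) => format!("Tool error: {error}"),
    }
}

/// Merge context from several providers, silently skipping those that failed.
pub fn merge_context(results: Vec<Result<String, String>>) -> String {
    results
        .into_iter()
        .filter_map(Result::ok)
        .collect::<Vec<_>>()
        .join("\n")
}

/// Fetch an API key through `lookup`, rejecting missing or blank values.
/// In production the lookup would read the process environment.
pub fn api_key(var: &str, lookup: impl Fn(&str) -> Option<String>) -> Result<String, String> {
    match lookup(var) {
        Some(key) if !key.trim().is_empty() => Ok(key),
        _ => Err(format!("{var} is not set; export it before running")),
    }
}
```

A caller could pass |name| std::env::var(name).ok() as the lookup, for example with the assumed variable name GEMINI_API_KEY. Injecting the lookup keeps the key out of files and CLI arguments while letting tests supply a fake environment.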

Get started

Next steps

Install

curl -fsSL https://raw.githubusercontent.com/igmarin/agent-mcp-runtime/main/install.sh | bash

Run a task

agent-mcp-runtime --task "Add full_name to the User model"

Choose a provider & model

agent-mcp-runtime --provider openai --model gpt-4o --task "Refactor the service layer"

Attach an MCP server

agent-mcp-runtime --mcp-command npx --mcp-args -y @modelcontextprotocol/server-filesystem --mcp-args /tmp --task "List files"

Develop a pack locally

agent-mcp-runtime --registry ./my-pack --task "Test my new skill"

Deeper reading