Agent Capabilities
Diminuendo’s coding agents are not stateless prompt-response loops. They are persistent, stateful processes with structured planning, interactive question-answer flows, cross-session memory extraction, and deep project configuration awareness. These capabilities are shared across all TypeScript agents — the Sonnet Agent (agents/sonnet-agent/) and the Coding Agent TS (agents/coding-agent-ts/) — and are orchestrated through the gateway’s protocol and event system.
This page documents the four capability domains that elevate these agents from simple tool-calling loops to context-aware engineering partners.
Project Configuration: AGENTS.md and CLAUDE.md
Every coding session begins with context injection. Before the first LLM call, the agent discovers and loads project instruction files following a strict priority order:

User-Level Instructions
- ~/.claude/CLAUDE.md — global instructions that apply to all projects for this user
- ~/.claude/AGENTS.md — global agent-specific instructions

Project-Level Instructions
Checked into the codebase and shared across the team:
- CLAUDE.md or .claude/CLAUDE.md
- AGENTS.md, .claude/AGENTS.md, or .github/AGENTS.md

Private Local Instructions
- CLAUDE.local.md — user-private overrides, not committed to version control

The discovered files are assembled into a <system-reminder> block that wraps the project instructions, git status, and current date. This block is injected into the first user message of each session — not the system prompt — which means it participates in the conversation context and can be referenced by the model in subsequent turns.
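The discovery order and injection described above can be sketched roughly as follows. This is a simplified illustration, not the actual ProjectConfig implementation; the helper name buildContextInjection appears later in this page, but its signature here is an assumption.

```typescript
// Sketch: ordered discovery of instruction files and <system-reminder>
// assembly. Paths follow the priority order documented above; the
// function shape is an illustrative assumption.
type Scope = "user" | "project" | "local";

interface Candidate {
  relPath: string; // relative to home (~) or the project root
  scope: Scope;
}

const DISCOVERY_ORDER: Candidate[] = [
  { relPath: "~/.claude/CLAUDE.md", scope: "user" },
  { relPath: "~/.claude/AGENTS.md", scope: "user" },
  { relPath: "CLAUDE.md", scope: "project" },
  { relPath: ".claude/CLAUDE.md", scope: "project" },
  { relPath: "AGENTS.md", scope: "project" },
  { relPath: ".claude/AGENTS.md", scope: "project" },
  { relPath: ".github/AGENTS.md", scope: "project" },
  { relPath: "CLAUDE.local.md", scope: "local" },
];

// Given the set of files that actually exist on disk, wrap their
// contents plus git status and the date in a <system-reminder> block
// destined for the first user message.
function buildContextInjection(
  existing: Map<string, string>, // relPath -> file content
  gitStatus: string,
  date: string
): string {
  const parts: string[] = [];
  for (const c of DISCOVERY_ORDER) {
    const content = existing.get(c.relPath);
    if (content !== undefined) {
      parts.push(`# ${c.relPath} (${c.scope})\n${content}`);
    }
  }
  return `<system-reminder>\n${parts.join("\n\n")}\n\nGit status:\n${gitStatus}\nDate: ${date}\n</system-reminder>`;
}
```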
The _getAnnotation() method tags each file with its provenance: "project instructions, checked into the codebase", "user's private project instructions, not checked in", or "user's private global instructions for all projects". This gives the model explicit signal about which instructions are shared versus private, enabling it to reason about whether a preference is personal or team-wide.
Why AGENTS.md?
The AGENTS.md convention extends the established CLAUDE.md pattern with agent-specific guidance. Where CLAUDE.md contains general project context (architecture notes, coding standards, dependency decisions), AGENTS.md can contain instructions specifically targeted at automated coding agents — tool usage preferences, file modification boundaries, testing requirements, or review criteria. The three discovery paths (AGENTS.md, .claude/AGENTS.md, .github/AGENTS.md) align with existing conventions across different project structures.
AskUserQuestion: Structured Interactive Dialogue
The AskUserQuestion tool gives agents the ability to pause execution and present structured multi-choice questions to the user. This is not free-form text output — it is a typed protocol interaction that the client renders as a dedicated UI component with selectable options.
Protocol Mechanics
When the LLM invokes AskUserQuestion, the agent does not execute it through the standard tool registry. Instead, the agent intercepts the tool call at the agentic loop level:
- Validation — 1 to 4 questions, each with 2 to 4 options, unique labels, headers capped at 12 characters
- Event emission — tool.question_requested with a UUID request_id and the full question payload
- Suspension — a deferred Promise is stored in _pendingQuestions keyed by request_id; the tool loop blocks on this promise
- Gateway relay — the question_requested event is broadcast to the client via the existing PodiumEventMapper path
- Client rendering — the client displays the question dialog with selectable options (an “Other” free-text option is auto-provided)
- Answer flow — the user’s response arrives as an answer_question client message, which Podium delivers to the agent’s onAnswer() callback
- Resume — onAnswer() matches the request_id, resolves the pending promise, and the tool loop continues with the formatted answer as the tool result
This flow reuses the question_requested / answer_question protocol path that was already wired for approval flows. The key architectural insight is that the agent’s _approvalResolve pattern generalizes cleanly to a Map<string, resolver> — multiple concurrent questions could theoretically be outstanding, though in practice the sequential tool loop means only one is active at a time.
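The suspend/resume mechanics can be sketched as a small broker class. The _pendingQuestions and onAnswer names come from the text above; the surrounding class and emit callback are illustrative assumptions.

```typescript
import { randomUUID } from "crypto";

// Sketch of the deferred-promise pattern: the tool loop awaits a
// Promise whose resolver is parked in a Map keyed by request_id.
type Resolver = (answer: string) => void;

class QuestionBroker {
  private _pendingQuestions = new Map<string, Resolver>();

  // Called from the tool loop when AskUserQuestion is intercepted.
  // `emit` stands in for broadcasting tool.question_requested.
  askQuestion(emit: (event: { requestId: string }) => void): Promise<string> {
    const requestId = randomUUID();
    const pending = new Promise<string>((resolve) => {
      this._pendingQuestions.set(requestId, resolve);
    });
    emit({ requestId });
    return pending; // the tool loop blocks here until onAnswer fires
  }

  // Called when an answer_question client message arrives.
  onAnswer(requestId: string, answer: string): boolean {
    const resolve = this._pendingQuestions.get(requestId);
    if (!resolve) return false; // unknown or already-answered request
    this._pendingQuestions.delete(requestId);
    resolve(answer);
    return true;
  }
}
```

Because the map is keyed by request_id rather than holding a single resolver, a late or mismatched answer is simply rejected instead of resuming the wrong tool call.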
Schema
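A hedged sketch of the tool’s input shape, reconstructed from the validation rules stated above (1–4 questions, 2–4 options, unique labels, 12-character headers); the field names are assumptions, not the actual schema:

```typescript
// Sketch of the AskUserQuestion input and its validation, inferred
// from the documented rules; property names are illustrative.
interface QuestionOption {
  label: string;        // must be unique within a question
  description?: string;
}

interface Question {
  header: string;       // short label, capped at 12 characters
  question: string;
  options: QuestionOption[]; // 2 to 4 options
}

function validateQuestions(questions: Question[]): string | null {
  if (questions.length < 1 || questions.length > 4) return "1-4 questions required";
  for (const q of questions) {
    if (q.header.length > 12) return `header too long: ${q.header}`;
    if (q.options.length < 2 || q.options.length > 4) return "2-4 options required";
    const labels = new Set(q.options.map((o) => o.label));
    if (labels.size !== q.options.length) return "option labels must be unique";
  }
  return null; // valid
}
```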
Mode Availability
AskUserQuestion is available in explore, plan, and review modes. In implement mode (which has allowedTools: null, meaning all tools), it is available by default. This ensures agents can clarify intent regardless of their current operating mode.
Plan Lifecycle: Structured Task Decomposition
The plan system gives agents the ability to decompose work into discrete, trackable steps — and to communicate progress on those steps in real time. Unlike free-form markdown plans written to files, this is a typed, event-driven lifecycle with gateway-level persistence and client-side rendering.

Architecture
Three components collaborate:

PlanTracker
Agent-side state machine. Holds the active plan with ordered steps, each progressing through pending → in_progress → completed | skipped. Persisted to SonnetAgentState.activePlan for session resume.

Plan Tools
PlanCreate and PlanUpdate — intercepted at the tool loop level (same pattern as AskUserQuestion). The LLM calls them as tools; the agent delegates to PlanTracker and emits events.

Gateway Events
Four event types flow through PodiumEventMapper to the client: plan.created, plan.step_started, plan.step_completed, plan.revised. Created/revised events are persistent (stored to session.db); step progress events are ephemeral.

Event Flow
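A minimal sketch of the PlanTracker state machine and the events it emits; the step and event shapes here are assumptions, and the events array stands in for gateway emission:

```typescript
type StepStatus = "pending" | "in_progress" | "completed" | "skipped";

interface PlanStep { id: string; title: string; status: StepStatus; }
interface PlanEvent { type: string; payload: unknown; }

class PlanTracker {
  private planId: string | null = null;
  private steps: PlanStep[] = [];
  readonly events: PlanEvent[] = []; // stand-in for PodiumEventMapper

  create(planId: string, titles: string[]): void {
    this.planId = planId;
    this.steps = titles.map((title, i) => ({ id: `step-${i}`, title, status: "pending" }));
    this.events.push({ type: "plan.created", payload: { planId, steps: this.steps } });
  }

  update(stepId: string, status: StepStatus): void {
    const step = this.steps.find((s) => s.id === stepId);
    if (!step) throw new Error(`unknown step: ${stepId}`);
    step.status = status;
    const type = status === "in_progress" ? "plan.step_started" : "plan.step_completed";
    this.events.push({ type, payload: { planId: this.planId, stepId, status } });
  }

  // Replace the step list while preserving the plan ID.
  revise(titles: string[]): void {
    this.steps = titles.map((title, i) => ({ id: `step-${i}`, title, status: "pending" }));
    this.events.push({ type: "plan.revised", payload: { planId: this.planId, steps: this.steps } });
  }
}
```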
PlanCreate Schema
Each step receives a stable, sequentially generated ID (step-0, step-1, …) that the LLM references in subsequent PlanUpdate calls. The plan is mutable — PlanTracker.revise() replaces the step list while preserving the plan ID, emitting a plan.revised event.
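The schema itself is not reproduced here; a hedged sketch of what a JSON-Schema-style PlanCreate definition might declare (property names and descriptions are assumptions — the real schema lives in schemas/index.ts):

```typescript
// Illustrative tool definition for PlanCreate; not the actual schema.
const planCreateSchema = {
  name: "PlanCreate",
  description: "Create a structured plan of discrete, trackable steps.",
  input_schema: {
    type: "object",
    properties: {
      title: { type: "string", description: "Short name for the plan" },
      steps: {
        type: "array",
        minItems: 1,
        items: { type: "string", description: "One discrete unit of work" },
        description: "Ordered steps; each is assigned a stable id (step-0, step-1, ...)",
      },
    },
    required: ["title", "steps"],
  },
} as const;
```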
Mode Availability
PlanCreate and PlanUpdate are available in plan mode and implement mode (all tools). This allows agents to create plans during planning phases and track progress during execution.
Cross-Session Memory: LLM-Extracted Persistent Knowledge
The memory system extracts reusable facts from completed coding sessions and makes them available to future sessions. This is not file-based MEMORY.md (which the agent already supports via AutoMemory) — it is a structured, queryable, cross-session knowledge store backed by per-session SQLite databases.

Storage Architecture
Keeping memories in a dedicated memory.db file, separate from the session’s event database, is an architectural decision with concrete benefits:
- Zero write-lock contention — memory extraction writes (which happen at session cleanup) never block the event/message write path
- Independent WAL — memory.db can be checkpointed and replicated independently
- Lazy creation — the file is only created when the first insert_memory command arrives; short sessions that don’t extract memories incur zero filesystem overhead
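The lazy-creation behavior can be sketched as follows. This is a simplification: the real path goes through a writer worker and SQLite, so a plain in-process store stands in for both here.

```typescript
// Sketch: memory.db is only created on the first insert_memory
// command; the in-memory array stands in for the SQLite file.
interface MemoryRow { content: string; category: string; confidence: number; }

class SessionMemoryStore {
  private db: MemoryRow[] | null = null; // null until first insert

  get created(): boolean {
    return this.db !== null;
  }

  insertMemory(row: MemoryRow): void {
    if (this.db === null) {
      this.db = []; // first write: create the store lazily
    }
    this.db.push(row);
  }
}
```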
Extraction Pipeline
Memory extraction is LLM-powered, following the same Ensemble client pattern used by the agent’s CompactionService:
- Trigger — the agent’s cleanup() method fires when a session ends
- Guard — sessions with fewer than 4 conversation turns are skipped (insufficient signal)
- LLM call — the full conversation history is sent to the Ensemble client with a structured extraction prompt
- Parse — the response is parsed as a JSON array of memories, each with content, category, confidence, and tags
- Emit — a memory.extracted event carries the memories through the gateway
- Persist — the gateway’s event stream handler intercepts memory_extracted, delegates to MemoryService.saveExtracted(), which writes to the session’s memory.db via the writer worker
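The guard and parse steps of the pipeline can be sketched like this. The Ensemble call itself is not shown; the 4-turn threshold and the memory fields come from the text, while the validation details are assumptions.

```typescript
interface ExtractedMemory {
  content: string;
  category: string;
  confidence: number;
  tags: string[];
}

// Guard: skip extraction for short sessions (insufficient signal).
function shouldExtract(conversationTurns: number): boolean {
  return conversationTurns >= 4;
}

// Parse the LLM response as a JSON array of memories, dropping
// entries that are missing required fields or are malformed.
function parseExtraction(raw: string): ExtractedMemory[] {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(data)) return [];
  return data.filter(
    (m): m is ExtractedMemory =>
      typeof m === "object" && m !== null &&
      typeof (m as any).content === "string" &&
      typeof (m as any).category === "string" &&
      typeof (m as any).confidence === "number" &&
      Array.isArray((m as any).tags)
  );
}
```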
Memory Schema
Memories are assigned one of six categories: architecture, preferences, decisions, debugging, conventions, or general.
Confidence scoring: the extraction LLM assigns a 0.0–1.0 confidence score. Memories below 0.5 confidence receive a 30-day TTL (expires_at); higher-confidence memories are permanent. Expired memories are filtered at read time — no background cleanup process required.
Supersession: when a newer memory contradicts an older one, superseded_by creates a soft-delete chain. The active memory index (idx_memories_active) filters on superseded_by IS NULL.
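The confidence/TTL and supersession rules can be sketched as a write-time tag plus a read-time filter. Field names follow the schema described above (expires_at, superseded_by, rendered in camelCase); the function shapes are assumptions.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface MemoryRecord {
  content: string;
  confidence: number;        // 0.0-1.0, assigned by the extraction LLM
  createdAt: number;         // epoch ms
  expiresAt: number | null;  // null = permanent
  supersededBy: string | null;
}

// Write-time rule: memories below 0.5 confidence get a 30-day TTL;
// higher-confidence memories are permanent.
function withTtl(m: Omit<MemoryRecord, "expiresAt">): MemoryRecord {
  const expiresAt = m.confidence < 0.5 ? m.createdAt + 30 * DAY_MS : null;
  return { ...m, expiresAt };
}

// Read-time filter: drop expired and superseded memories.
// No background cleanup process is needed.
function activeMemories(all: MemoryRecord[], now: number): MemoryRecord[] {
  return all.filter(
    (m) => m.supersededBy === null && (m.expiresAt === null || m.expiresAt > now)
  );
}
```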
Cross-Session Recall
The MemoryService.recallForSession() method aggregates memories across sessions:
- Index lookup — the tenant’s registry database contains a memory_sessions table that tracks which sessions have extracted memories and when
- Targeted reads — only sessions with memories are queried (no directory scanning)
- Per-session reads — each session’s memory.db is opened read-only via the reader worker
- Deduplication — exact content matches are collapsed
- Ranking — memories are sorted by confidence × recency and the top N are returned
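The deduplication and ranking steps above might look like this in miniature. The confidence × recency weighting is stated in the text, but the exact recency function used here is an assumption.

```typescript
interface RecalledMemory { content: string; confidence: number; createdAt: number; }

// Collapse exact content duplicates (keeping the most recent copy),
// then rank by confidence x recency and return the top N.
function rankMemories(memories: RecalledMemory[], now: number, topN: number): RecalledMemory[] {
  const byContent = new Map<string, RecalledMemory>();
  for (const m of memories) {
    const prev = byContent.get(m.content);
    if (!prev || m.createdAt > prev.createdAt) byContent.set(m.content, m);
  }
  const recency = (m: RecalledMemory) => {
    const ageDays = (now - m.createdAt) / (24 * 60 * 60 * 1000);
    return 1 / (1 + ageDays); // assumed decay curve; the real weighting may differ
  };
  return [...byContent.values()]
    .sort((a, b) => b.confidence * recency(b) - a.confidence * recency(a))
    .slice(0, topN);
}
```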
Recall is triggered by a memory_recall client message; the gateway responds with memory_recalled containing the aggregated memories. The memory_forget client message allows users to explicitly remove memories that are no longer relevant.
Gateway Integration
The memory system adds two client message types and three server event types to the protocol:

| Client Message | Purpose |
|---|---|
| memory_recall | Request cross-session memory aggregation |
| memory_forget | Delete a specific memory |
| Server Event | Persistence | Purpose |
|---|---|---|
| memory.extracted | Persistent | N memories extracted at session end |
| memory.recalled | Persistent | Aggregated memories returned to client |
| memory.forgotten | Persistent | Confirmation of memory deletion |
Agent Modes and Tool Availability
The four capabilities integrate with the existing mode system. Here is the complete tool availability matrix after these additions:

| Tool | implement | explore | plan | review |
|---|---|---|---|---|
| Read, Glob, Grep, Bash | ✓ | ✓ | ✓ | ✓ |
| Edit, Write | ✓ | — | Write only | — |
| WebFetch, WebSearch | ✓ | ✓ | ✓ | — |
| GitHub (read) | ✓ | ✓ | ✓ | ✓ |
| GitHub (write) | ✓ | — | — | — |
| CodeOutline | ✓ | ✓ | ✓ | ✓ |
| AskUserQuestion | ✓ | ✓ | ✓ | ✓ |
| PlanCreate | ✓ | — | ✓ | — |
| PlanUpdate | ✓ | — | ✓ | — |
| Skill | ✓ | — | — | — |
Memory extraction does not appear in the matrix because it is not a tool; it fires from cleanup(), orthogonal to the mode system entirely.
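The matrix reduces to a simple allow-list check. A sketch, assuming a per-mode allowedTools config where null means "all tools" (the config shape is an assumption, and the lists below are abridged — GitHub and web tools are omitted for brevity):

```typescript
// Sketch: per-mode tool availability. allowedTools: null (as in
// implement mode) means every registered tool is available.
type Mode = "implement" | "explore" | "plan" | "review";

const MODE_TOOLS: Record<Mode, string[] | null> = {
  implement: null, // all tools
  explore: ["Read", "Glob", "Grep", "Bash", "CodeOutline", "AskUserQuestion"],
  plan: ["Read", "Glob", "Grep", "Bash", "Write", "CodeOutline", "AskUserQuestion", "PlanCreate", "PlanUpdate"],
  review: ["Read", "Glob", "Grep", "Bash", "CodeOutline", "AskUserQuestion"],
};

function isToolAllowed(mode: Mode, tool: string): boolean {
  const allowed = MODE_TOOLS[mode];
  return allowed === null || allowed.includes(tool);
}
```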
System Prompt Integration
Each capability adds a dedicated guidance section to the agent’s system prompt:

- AskUserQuestion — askUserQuestionGuidance() in Sections.ts: when to use the tool, option formatting conventions, the auto-provided “Other” option, and the explicit instruction not to use it for plan approval
- Plan tools — tool schemas in schemas/index.ts with descriptive parameter documentation; the LLM learns usage from schema descriptions rather than a dedicated prompt section
- Memory — extraction is implicit (fires at cleanup); recall can be injected into the system-reminder context alongside CLAUDE.md and git status on session start
- AGENTS.md — discovered and injected automatically via ProjectConfig.buildContextInjection(); no prompt section needed because the content is the prompt