Security

Diminuendo implements a defense-in-depth security architecture across authentication, authorization, transport, input validation, and error handling. No single layer is relied upon exclusively — each defense operates independently and provides protection even if adjacent layers are compromised or misconfigured.

Authentication

Auth0 JWT Verification

In production, every WebSocket connection must authenticate by sending an authenticate message with a JWT token. The gateway verifies tokens against Auth0’s JWKS endpoint:
const jwksUrl = new URL("/.well-known/jwks.json", authUrl)
const jwks = jose.createRemoteJWKSet(jwksUrl)

const { payload } = await jose.jwtVerify(jwt, jwks, {
  issuer: expectedIssuer,
  ...(expectedAudience ? { audience: expectedAudience } : {}),
})
The jose library handles JWKS rotation automatically — when a key ID in the token doesn’t match the cached keyset, it fetches the latest keys from the endpoint.

JWT Verification Cache

Asymmetric JWT verification (RS256) is computationally expensive. To avoid paying this cost on every message from a previously authenticated connection, the gateway maintains an LRU cache of verified tokens:
  • Cache key: SHA-256 hash of the raw JWT string (using Bun.CryptoHasher)
  • Cache size: 10,000 entries maximum
  • TTL: Derived from the token’s exp claim, capped at 5 minutes
  • Eviction: FIFO when at capacity; expired entries cleaned up every 60 seconds
const JWT_CACHE_MAX = 10_000
const jwtCache = new Map<string, { identity: AuthIdentity; expiresAt: number }>()

// Cache with TTL from exp claim (default 5min)
const ttlMs = result.exp ? (result.exp * 1000 - Date.now()) : 5 * 60 * 1000
if (ttlMs > 0) {
  if (jwtCache.size >= JWT_CACHE_MAX) {
    const firstKey = jwtCache.keys().next().value
    if (firstKey !== undefined) jwtCache.delete(firstKey)
  }
  jwtCache.set(cacheKey, {
    identity,
    expiresAt: Date.now() + Math.min(ttlMs, 5 * 60 * 1000),
  })
}

Dev Mode Bypass

When DEV_MODE=true, authentication is bypassed entirely. All connections are automatically authenticated as developer@example.com with tenant ID dev. This is intended exclusively for local development.
Dev mode must never be enabled in production. The gateway logs a clear message at startup: "Auth: Dev mode enabled -- all requests authenticated as developer@example.com". Monitor for this message in production logs as a misconfiguration signal.

Role-Based Access Control

Roles and Permissions

The RBAC system defines 5 roles with 12 granular permissions:
  • owner: all 12 permissions (full session, member, and billing control, plus tenant:admin)
  • admin: session:create, session:read, session:write, session:delete, session:archive, session:steer, member:read, member:write, billing:read
  • billing_admin: session:create, session:read, session:write, session:archive, session:steer, billing:read, billing:write
  • member: session:create, session:read, session:write, session:archive, session:steer
  • viewer: session:read only
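The role-to-permission mapping above can be encoded as a static lookup table. A minimal sketch — the permission strings come from the matrix, but the `ROLE_PERMISSIONS` shape and `hasPermission` body are illustrative assumptions, not the gateway's actual implementation:

```typescript
type Role = "owner" | "admin" | "billing_admin" | "member" | "viewer"
type Permission = string

// Permission sets per the matrix above (hypothetical encoding)
const SESSION_ALL = [
  "session:create", "session:read", "session:write",
  "session:delete", "session:archive", "session:steer",
]
const SESSION_BASIC = [
  "session:create", "session:read", "session:write",
  "session:archive", "session:steer",
]

const ROLE_PERMISSIONS: Record<Role, ReadonlySet<Permission>> = {
  owner: new Set([
    ...SESSION_ALL, "member:read", "member:write", "member:delete",
    "billing:read", "billing:write", "tenant:admin",
  ]),
  admin: new Set([...SESSION_ALL, "member:read", "member:write", "billing:read"]),
  billing_admin: new Set([...SESSION_BASIC, "billing:read", "billing:write"]),
  member: new Set(SESSION_BASIC),
  viewer: new Set(["session:read"]),
}

function hasPermission(role: Role, permission: Permission): boolean {
  return ROLE_PERMISSIONS[role].has(permission)
}
```

A static table like this makes authorization decisions O(1) and auditable at a glance.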

MembershipService

Role assignments are stored in per-tenant SQLite databases (tenants/{tenantId}/registry.db). The MembershipService provides CRUD operations on the tenant_members table:
CREATE TABLE IF NOT EXISTS tenant_members (
  tenant_id TEXT NOT NULL,
  user_id TEXT NOT NULL,
  role TEXT NOT NULL DEFAULT 'member',
  created_at INTEGER NOT NULL,
  updated_at INTEGER NOT NULL,
  PRIMARY KEY (tenant_id, user_id)
);

Owner Bootstrap

The first user to authenticate against a tenant is automatically bootstrapped as owner. Subsequent users are assigned the member role by default:
const members = yield* membership.memberCount(identity.tenantId)
const bootstrapRole: Role = members === 0 ? "owner" : "member"
yield* membership.setRole(identity.tenantId, identity.userId, bootstrapRole)

Last Owner Protection

The system prevents removing or demoting the last owner of a tenant. Both the removeMember and set_role operations check the owner count before proceeding:
if (existingMember?.role === "owner" && message.role !== "owner") {
  const ownerCount = members.filter((m) => m.role === "owner").length
  if (ownerCount <= 1) {
    return { kind: "respond", data: {
      type: "error",
      code: "LAST_OWNER_PROTECTED",
      message: "Cannot demote the last owner of a tenant",
    }}
  }
}

Permission Enforcement

The requirePermission function is the authorization checkpoint. It is called before every sensitive operation:
export function requirePermission(
  identity: AuthIdentityWithRole,
  permission: Permission,
): Effect.Effect<void, Unauthorized> {
  if (hasPermission(identity.role, permission)) {
    return Effect.void
  }
  return Effect.fail(new Unauthorized({ tenantId: identity.tenantId, resource: permission }))
}

CSRF Protection

Cross-Site Request Forgery protection uses a three-layer defense. Each layer is checked in order; the request passes if any layer succeeds:
1. Sec-Fetch-Site Header

If the browser sends Sec-Fetch-Site: same-origin or Sec-Fetch-Site: none, the request is same-origin and passes immediately. This header cannot be spoofed by JavaScript running in the same browser — it is set by the browser itself.

2. Origin Header

If the Origin header is present, it is checked against the ALLOWED_ORIGINS configuration list. If absent (non-browser clients such as CLIs and SDKs do not send Origin), the request passes — non-browser clients are not CSRF vectors.

3. Referer Header Fallback

If the Origin check fails, the Referer header is parsed and its origin component is checked against the same allowlist. This handles edge cases where some browsers strip the Origin header on certain redirect chains.
export function checkCsrf(
  req: Request,
  allowedOrigins: readonly string[],
  devMode = false,
): CsrfCheckResult {
  if (devMode) return { ok: true }

  const secFetchSite = req.headers.get("sec-fetch-site")
  if (secFetchSite === "same-origin" || secFetchSite === "none") return { ok: true }

  const origin = req.headers.get("origin")
  if (!origin) return { ok: true }  // Non-browser client

  if (allowedOrigins.includes(origin)) return { ok: true }

  // Fallback: check Referer
  const referer = req.headers.get("referer")
  if (referer) {
    try {
      if (allowedOrigins.includes(new URL(referer).origin)) return { ok: true }
    } catch { /* malformed referer */ }
  }

  return { ok: false, reason: `Origin '${origin}' is not in the allowed list` }
}

SSRF Guard

The assertSafeUrl function validates that outbound HTTP requests do not target internal networks. It inspects the URL’s hostname and blocks:

Private IPv4 Ranges

0.0.0.0/8, 10.0.0.0/8, 127.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16

IPv6 Addresses

Loopback (::1), unique local (fc00::/7), link-local (fe80::/10), IPv4-mapped (::ffff:127.0.0.1)

Cloud Metadata

169.254.169.254 (AWS/GCP), metadata.google.internal, metadata.google

Obfuscation Techniques

Bare integer IPs (http://2130706433), octal notation (0177.0.0.1), hex notation (0x7f.0.0.1)
The guard also restricts URL schemes to http: and https: only, blocking file:, ftp:, gopher:, and other protocol handlers.
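The obfuscated spellings can all be caught by normalizing the hostname to dotted-quad form before running the private-range check. A hedged sketch of that normalization step — the function name and exact regexes are illustrative, not the gateway's actual code:

```typescript
// Normalize bare-integer, octal, and hex IPv4 spellings to dotted-quad.
// Returns null if the hostname is not a numeric IPv4 form.
function normalizeIPv4(host: string): string | null {
  // Bare 32-bit integer, e.g. "2130706433"
  if (/^\d+$/.test(host)) {
    const n = Number(host)
    if (!Number.isInteger(n) || n > 0xffffffff) return null
    return [(n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff].join(".")
  }
  // Dotted form with possibly octal (leading 0) or hex (0x) octets
  const parts = host.split(".")
  if (parts.length !== 4) return null
  const octets = parts.map((p) => {
    if (/^0x[0-9a-f]+$/i.test(p)) return parseInt(p, 16)  // hex octet
    if (/^0\d+$/.test(p)) return parseInt(p, 8)           // octal octet
    if (/^\d+$/.test(p)) return parseInt(p, 10)           // decimal octet
    return NaN
  })
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return null
  return octets.join(".")
}
```

The normalized result can then be fed to the same private-range predicate used for ordinary dotted-quad hostnames.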

IPv4-Mapped IPv6 Detection

A sophisticated attacker might attempt to bypass IPv4 range checks by using IPv4-mapped IPv6 addresses. The guard handles both dotted-quad form (::ffff:127.0.0.1) and hex form (::ffff:7f00:1):
// Dotted-quad form
const v4MappedMatch = cleaned.match(/^::ffff:(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})$/)
if (v4MappedMatch) {
  return isPrivateIPv4(v4MappedMatch[1])
}

// Hex form — reconstruct the IPv4 address from hex groups
const hexMappedMatch = cleaned.match(/^::ffff:([0-9a-f]{1,4}):([0-9a-f]{1,4})$/)
if (hexMappedMatch) {
  const hi = parseInt(hexMappedMatch[1], 16)
  const lo = parseInt(hexMappedMatch[2], 16)
  const ip = `${(hi >> 8) & 0xff}.${hi & 0xff}.${(lo >> 8) & 0xff}.${lo & 0xff}`
  return isPrivateIPv4(ip)
}
The SSRF guard validates the hostname string, not the resolved IP address. This means it cannot protect against DNS rebinding attacks, where a hostname initially resolves to a public IP but is later re-resolved to a private IP. For full protection in high-security environments, validate the resolved IP at connect time using a custom DNS resolver or connect-time hook.
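One way to close the DNS rebinding gap is to resolve the hostname first (e.g. via `dns.promises.lookup(host, { all: true })`) and re-check every resolved address before connecting. A sketch of the address-checking half — the private-range patterns mirror the list above, but the function name and exact regexes are assumptions:

```typescript
// Reject a connection if any resolved address falls in a private range.
// `addresses` would come from a DNS lookup performed at connect time.
function assertPublicAddresses(addresses: string[]): void {
  const privateV4 = [
    /^0\./, /^10\./, /^127\./, /^169\.254\./, /^192\.168\./,
    /^172\.(1[6-9]|2\d|3[01])\./,
  ]
  for (const addr of addresses) {
    const lower = addr.toLowerCase()
    // IPv6 loopback, unique local (fc00::/7), link-local (fe80::/10)
    if (lower === "::1" || lower.startsWith("fc") || lower.startsWith("fd") || lower.startsWith("fe80")) {
      throw new Error(`Blocked private address: ${addr}`)
    }
    if (privateV4.some((re) => re.test(addr))) {
      throw new Error(`Blocked private address: ${addr}`)
    }
  }
}
```

Because the check runs on the addresses actually returned at connect time, a hostname that later re-resolves to a private IP is caught where hostname-string validation alone would not be.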

Security Headers

Every HTTP response — including WebSocket upgrade responses, health checks, and 404s — includes a comprehensive set of security headers:
export function securityHeaders(): Record<string, string> {
  return {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'; connect-src 'self' wss: ws:",
    "X-Frame-Options": "DENY",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    "X-DNS-Prefetch-Control": "off",
    "X-XSS-Protection": "0",
  }
}
  • Strict-Transport-Security: forces HTTPS for 1 year, including subdomains
  • Content-Security-Policy: restricts resource loading to same-origin; allows WebSocket connections
  • X-Frame-Options (DENY): prevents clickjacking via iframes
  • X-Content-Type-Options (nosniff): prevents MIME type sniffing
  • Referrer-Policy: sends origin only on cross-origin requests
  • Permissions-Policy: denies access to camera, microphone, and geolocation APIs
  • X-DNS-Prefetch-Control (off): disables DNS prefetching to prevent information leakage
  • X-XSS-Protection (0): disables the legacy XSS Auditor, which has known bypass quirks that can introduce vulnerabilities

Error Sanitization

All error messages sent to clients pass through sanitizeErrorMessage, which applies three transformations:
1. Strip Stack Traces

Removes lines matching the pattern \n\s*at\s+... — standard V8 stack trace lines that expose internal file paths.
2. Redact API Keys

Replaces matches for five secret patterns with [REDACTED]:
const SECRET_PATTERNS: RegExp[] = [
  /sk-ant-[a-zA-Z0-9-]+/g,   // Anthropic API keys
  /sk-[a-zA-Z0-9-]+/g,       // OpenAI API keys
  /ghp_[a-zA-Z0-9]+/g,       // GitHub personal access tokens
  /Bearer [a-zA-Z0-9._-]+/g, // Bearer tokens
  /token=[a-zA-Z0-9._-]+/g,  // Query string tokens
]
3. Truncate

Messages exceeding 500 characters are truncated with an ellipsis. This bounds the size of error responses and prevents verbose internal errors from leaking excessive detail.
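Combining the three transformations, a minimal sanitizer might look like the following. The secret patterns mirror the list above; the function shape and the stack-trace regex are illustrative assumptions:

```typescript
const SECRET_PATTERNS: RegExp[] = [
  /sk-ant-[a-zA-Z0-9-]+/g,   // Anthropic API keys
  /sk-[a-zA-Z0-9-]+/g,       // OpenAI API keys
  /ghp_[a-zA-Z0-9]+/g,       // GitHub personal access tokens
  /Bearer [a-zA-Z0-9._-]+/g, // Bearer tokens
  /token=[a-zA-Z0-9._-]+/g,  // Query string tokens
]
const MAX_MESSAGE_LENGTH = 500

function sanitizeErrorMessage(message: string): string {
  // 1. Strip V8 stack trace lines ("\n    at fn (file:line:col)")
  let out = message.replace(/\n\s*at\s+[^\n]*/g, "")
  // 2. Redact known secret patterns
  for (const pattern of SECRET_PATTERNS) {
    out = out.replace(pattern, "[REDACTED]")
  }
  // 3. Truncate to a bounded length
  if (out.length > MAX_MESSAGE_LENGTH) {
    out = out.slice(0, MAX_MESSAGE_LENGTH) + "…"
  }
  return out
}
```

Order matters here: stripping stack frames first ensures file paths never reach the redaction or truncation steps, and truncating last keeps the bound authoritative.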
Additionally, the message router maps known error tags to safe, generic messages:
const safeMessages: Record<string, string> = {
  Unauthenticated: "Authentication required",
  Unauthorized: "Insufficient permissions",
  SessionNotFound: "Session not found",
  PodiumConnectionError: "Failed to connect to agent",
  DbError: "Database operation failed",
  InsufficientCredits: "Insufficient credits",
}

Rate Limiting

Per-Connection Rate Limiter

Every WebSocket connection gets its own sliding-window rate limiter:
  • Limit: 60 messages per 10-second window
  • Enforcement: Checked before message parsing. Exceeding the limit returns an error event with code RATE_LIMITED
  • Cleanup: The limiter is removed from the tracking map when the WebSocket closes
class RateLimiter {
  private timestamps: number[] = []
  constructor(
    private readonly maxMessages: number,  // 60
    private readonly windowMs: number,     // 10,000
  ) {}
  allow(): boolean {
    const now = Date.now()
    this.timestamps = this.timestamps.filter((t) => t > now - this.windowMs)
    if (this.timestamps.length >= this.maxMessages) return false
    this.timestamps.push(now)
    return true
  }
}

Authentication Rate Limiter

A separate, IP-based rate limiter protects the authentication endpoint:
  • Max attempts: 10 per IP
  • Window: 60 seconds
  • Lockout duration: 5 minutes
  • Max tracked IPs: 10,000
The 10,000-IP bound prevents memory exhaustion from spoofed source addresses. When the map reaches capacity, the oldest entry is evicted (FIFO). On successful authentication, the IP’s record is cleared entirely:
recordSuccess(ip: string): void {
  this.records.delete(ip)
}
The auth rate limiter runs a cleanup() method every 60 seconds to prune expired entries and release lockouts. This periodic maintenance ensures the tracking map does not grow unbounded even if recordSuccess is never called for some IPs.
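Putting the parameters together, a minimal version of the IP-based limiter could look like this. The class shape, method names, and eviction details are illustrative assumptions built from the table above, not the gateway's actual source:

```typescript
interface AttemptRecord { count: number; windowStart: number; lockedUntil: number }

class AuthRateLimiter {
  private records = new Map<string, AttemptRecord>()
  constructor(
    private readonly maxAttempts = 10,
    private readonly windowMs = 60_000,
    private readonly lockoutMs = 5 * 60_000,
    private readonly maxIps = 10_000,
  ) {}

  allow(ip: string, now = Date.now()): boolean {
    const rec = this.records.get(ip)
    if (rec && now < rec.lockedUntil) return false  // still locked out
    if (!rec || now - rec.windowStart > this.windowMs) {
      if (!rec && this.records.size >= this.maxIps) {
        // FIFO eviction: Maps iterate in insertion order
        const oldest = this.records.keys().next().value
        if (oldest !== undefined) this.records.delete(oldest)
      }
      this.records.set(ip, { count: 1, windowStart: now, lockedUntil: 0 })
      return true
    }
    rec.count++
    if (rec.count > this.maxAttempts) {
      rec.lockedUntil = now + this.lockoutMs  // begin lockout
      return false
    }
    return true
  }

  recordSuccess(ip: string): void {
    this.records.delete(ip)  // clear the record entirely on success
  }

  cleanup(now = Date.now()): void {
    for (const [ip, rec] of this.records) {
      if (now - rec.windowStart > this.windowMs && now >= rec.lockedUntil) {
        this.records.delete(ip)
      }
    }
  }
}
```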

Input Validation

Schema Validation

Every incoming WebSocket message is validated against Effect Schema definitions before processing:
const decoded = Schema.decodeUnknownEither(ClientMessage)(parsed)
if (decoded._tag === "Left") {
  ws.send(JSON.stringify({
    type: "error",
    code: "INVALID_MESSAGE",
    message: "Message does not match any known schema",
  }))
  return
}
The ClientMessage schema is a union of 49 message types, each with its own field requirements. Messages that do not match any variant are rejected immediately.

Message Size Limit

Raw messages exceeding 1 MB are rejected before JSON parsing, preventing denial-of-service via oversized payloads:
if (raw.length > 1_048_576) {
  ws.send(JSON.stringify({
    type: "error",
    code: "MESSAGE_TOO_LARGE",
    message: "Message exceeds maximum allowed size (1MB)",
  }))
  return
}

Session ID Path Traversal Prevention

Session IDs are used to construct file paths for per-session SQLite databases. The resolveSessionDir function validates that the resolved path stays within the sessions base directory:
export function resolveSessionDir(sessionsBaseDir: string, sessionId: string): string {
  // sessionsBaseDir is expected to be an absolute, normalized path
  const resolved = path.resolve(sessionsBaseDir, sessionId)
  if (!resolved.startsWith(sessionsBaseDir + path.sep)) {
    throw new Error("Invalid session ID")
  }
  return resolved
}
A session ID like ../../etc/passwd would resolve to a path outside sessionsBaseDir and be rejected.

Tenant ID Validation

Tenant IDs extracted from JWT claims are validated against a strict pattern before use:
const TENANT_ID_PATTERN = /^[a-zA-Z0-9_-]+$/

export function isValidTenantId(tenantId: string): boolean {
  return TENANT_ID_PATTERN.test(tenantId)
}
Only alphanumeric characters, hyphens, and underscores are permitted. This prevents directory traversal and SQL injection through the tenant ID.

Transport Security

WebSocket Upgrade Validation

The WebSocket upgrade path (/ws) validates the Origin header before upgrading the connection. Non-browser clients (which don’t send Origin) are allowed through, but browser-based connections from unauthorized origins are rejected with a 403 response.

Connection Lifecycle

  • Max payload length: 2 MB (maxPayloadLength)
  • Idle timeout: 120 seconds
  • Per-message deflate: disabled (avoids CRIME-class compression attacks)
  • Heartbeat interval: 30 seconds
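As a rough sketch, these limits map onto Bun's websocket handler options roughly as follows (the option object is illustrative; note that Bun expresses idleTimeout in seconds, and the heartbeat would be a separate server-side ping loop):

```typescript
// Hypothetical Bun.serve websocket options matching the table above.
const websocketOptions = {
  maxPayloadLength: 2 * 1024 * 1024, // 2 MB
  idleTimeout: 120,                  // seconds of inactivity before close
  perMessageDeflate: false,          // keep compression off by default
}

const HEARTBEAT_INTERVAL_MS = 30_000 // server pings each socket on this cadence
```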

WAL Mode SQLite

All SQLite databases are opened with PRAGMA journal_mode = WAL. WAL mode makes concurrent access safe (readers do not block the writer, and the writer does not block readers, even when reader and writer workers access the same file) and provides crash recovery — incomplete transactions are rolled back automatically on the next open.
PRAGMA journal_mode = WAL;
PRAGMA synchronous = NORMAL;
PRAGMA busy_timeout = 5000;
synchronous = NORMAL provides a balance between durability and performance: data is safe after a process crash but may be lost on an OS crash or power failure. For a gateway managing ephemeral agent sessions, this trade-off is appropriate.

OpenBSD-Inspired Defensive Hardening

The gateway applies design principles from OpenBSD — privilege separation, capability restriction, fail-closed defaults, resource bounds, and defense in depth. Each hardening measure operates independently; no single layer is relied upon exclusively.

Fail-Closed Defaults

Principle: When something goes wrong, deny access. Never fail into a permissive state.

Role Resolution

When the membership database is unavailable or a user’s role cannot be determined, the gateway defaults to viewer (minimum privilege) rather than member. This prevents a database outage from silently granting elevated permissions:
// middleware/auth.ts
const role: Role = yield* membership.getRole(identity.tenantId, identity.userId).pipe(
  Effect.map((r) => r ?? ("viewer" as Role)),
  Effect.catchAll(() => Effect.succeed("viewer" as Role)),
)

Origin Policy

When ALLOWED_ORIGINS is empty in production, all browser-origin requests are rejected rather than accepted. Operators must explicitly configure allowed origins:
if (config.allowedOrigins.length === 0) return false  // Reject all browser origins

DevMode Production Guard

The gateway refuses to start with DEV_MODE=true if production environment indicators are detected (e.g., NODE_ENV=production, FLY_APP_NAME, AWS_EXECUTION_ENV, K_SERVICE). This prevents accidental bypass of authentication, CSRF protection, and rate limiting in production.
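A minimal sketch of such a startup guard — the indicator names come from the text above, but the function shape is an assumption:

```typescript
// Refuse DEV_MODE=true when production environment indicators are present.
const PRODUCTION_INDICATORS = ["FLY_APP_NAME", "AWS_EXECUTION_ENV", "K_SERVICE"]

function assertDevModeSafe(env: Record<string, string | undefined>): void {
  if (env.DEV_MODE !== "true") return
  const hit = env.NODE_ENV === "production"
    ? "NODE_ENV=production"
    : PRODUCTION_INDICATORS.find((k) => env[k] !== undefined)
  if (hit) {
    // Fail closed: crash at startup rather than run unauthenticated
    throw new Error(`Refusing to start: DEV_MODE=true with production indicator ${hit}`)
  }
}
```

Crashing at startup is the fail-closed choice: a gateway that never comes up is a visible incident, while one silently running without authentication is not.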

WebSocket Compression

WS_COMPRESSION defaults to false. Enabling compression on WebSocket connections exposes CRIME/BREACH-class side-channel attacks. Opt-in only via WS_COMPRESSION=true.

Resource Bounds

Principle: Every allocation must be bounded. Every data structure must have a maximum size.
  • JWT verification cache: 10,000 entries; FIFO eviction when at capacity
  • Auth rate limiter IPs: 10,000 entries; FIFO eviction plus periodic cleanup
  • HTTP rate limiter buckets: 50,000 entries; FIFO eviction
  • Active session tracking: 100,000 entries; stops tracking (session still functions)
  • Per-connection rate limiters: 100,000 entries; cleaned on WS close; skips creation at capacity
  • Tenant DB connections: 1,000 connections; LRU eviction plus 30-minute idle timeout
  • Podium event queue (per connection): 10,000 events; sliding window (oldest events dropped)
  • Webhook dedup IDs: 10,000 entries; time-bounded (5-minute TTL)
The TenantDbPool implements LRU eviction with idle timeout: connections unused for 30 minutes are closed, and when at the 1,000-connection capacity, the least recently accessed connection is evicted. The Podium event queue uses a sliding window (Queue.sliding(10_000)) rather than an unbounded queue. If an agent floods events faster than the gateway can process them, the oldest events are dropped — acceptable for streaming where only the latest state matters.
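The LRU-with-idle-timeout pattern described for TenantDbPool can be sketched with a plain Map, relying on its insertion order (names and shape here are illustrative, not the pool's actual code):

```typescript
// LRU pool with idle timeout, in the style described for TenantDbPool.
class LruPool<V> {
  private entries = new Map<string, { value: V; lastUsed: number }>()
  constructor(
    private readonly capacity: number,       // e.g. 1,000 connections
    private readonly idleMs: number,         // e.g. 30 minutes
    private readonly close: (v: V) => void,  // called on eviction
  ) {}

  get(key: string, now = Date.now()): V | undefined {
    const e = this.entries.get(key)
    if (!e) return undefined
    // Re-insert to move the key to the back of the Map's insertion order
    this.entries.delete(key)
    e.lastUsed = now
    this.entries.set(key, e)
    return e.value
  }

  set(key: string, value: V, now = Date.now()): void {
    if (!this.entries.has(key) && this.entries.size >= this.capacity) {
      // Least recently used entry sits at the front of insertion order
      const [lruKey, lru] = this.entries.entries().next().value!
      this.close(lru.value)
      this.entries.delete(lruKey)
    }
    this.entries.set(key, { value, lastUsed: now })
  }

  evictIdle(now = Date.now()): void {
    for (const [k, e] of this.entries) {
      if (now - e.lastUsed > this.idleMs) {
        this.close(e.value)
        this.entries.delete(k)
      }
    }
  }
}
```

A periodic timer would call evictIdle() to release idle connections between accesses.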

Capability Restriction

Principle: After initialization, declare exactly what resources are needed. Deny everything else.

Filesystem Confinement (FsGuard)

The FsGuard (src/security/fs-guard.ts) validates that all filesystem operations stay within declared writable directories. It is integrated into TenantDbPool to prevent directory traversal through tenant-controlled paths:
const fsGuard = buildFsGuard({ dataDir: config.dataDir })

// Before any filesystem I/O with tenant-derived paths:
fsGuard.assertWritable(tenantDir)  // Throws if outside data directory
This operates as a second line of defense behind the tenant ID regex validation — even if the regex is somehow bypassed, the filesystem guard rejects paths outside the data directory.

Network Confinement (OutboundGuard)

The OutboundGuard (src/security/outbound-guard.ts) validates outbound HTTP requests against a declared hostname allowlist. It prevents the gateway from making requests to unauthorized hosts — the TypeScript analog of OpenBSD’s unveil() for network access:
const guard = new OutboundGuard(['api.github.com', 'github.com', 'podium.internal', ...])
guard.assertAllowed('https://api.github.com/repos')  // OK
guard.assertAllowed('https://evil.com/exfil')         // Throws
The guard only affects outbound HTTP requests from the gateway process itself (upstream service clients). It does not affect:
  • Agent code running on E2B sandboxes (separate process/container with own network stack)
  • Bash commands executed on sandboxes
  • WebSocket connections from clients to the gateway
The allowed host list is automatically derived from configured service URLs (Podium, Ensemble, Auth0) plus GitHub API hosts.
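A minimal sketch of such a guard — exact-hostname matching against an allowlist, with a scheme check folded in (the class internals are illustrative, not the module's actual code):

```typescript
// Illustrative OutboundGuard: allowlist of exact hostnames for outbound HTTP.
class OutboundGuard {
  private readonly allowed: Set<string>
  constructor(hosts: readonly string[]) {
    this.allowed = new Set(hosts.map((h) => h.toLowerCase()))
  }
  assertAllowed(url: string): void {
    const { protocol, hostname } = new URL(url)
    if (protocol !== "https:" && protocol !== "http:") {
      throw new Error(`Blocked outbound scheme: ${protocol}`)
    }
    if (!this.allowed.has(hostname.toLowerCase())) {
      throw new Error(`Blocked outbound host: ${hostname}`)
    }
  }
}
```

Exact matching (rather than suffix matching) avoids the classic bypass where `evil-api.github.com.attacker.net` ends with an allowed suffix.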

Environment Variable Scrubbing

After AppConfig loads all configuration at startup, sensitive environment variables (AUTH_CLIENT_SECRET, PODIUM_API_KEY, ENSEMBLE_API_KEY, GITHUB_CLIENT_SECRET, GITHUB_WEBHOOK_SECRET, etc.) are deleted from process.env. This reduces the attack surface if arbitrary code execution occurs later — secrets are only accessible through the typed AppConfig service.
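The scrub itself can be as simple as deleting the named keys after configuration load. A sketch using the variable names from the text (the function shape is an assumption; in practice it would be called with process.env):

```typescript
// Delete sensitive variables from the environment after AppConfig loads.
const SENSITIVE_ENV_VARS = [
  "AUTH_CLIENT_SECRET",
  "PODIUM_API_KEY",
  "ENSEMBLE_API_KEY",
  "GITHUB_CLIENT_SECRET",
  "GITHUB_WEBHOOK_SECRET",
]

function scrubEnv(env: Record<string, string | undefined>): void {
  for (const name of SENSITIVE_ENV_VARS) {
    delete env[name]
  }
}
```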

Webhook Secret in Config

The GitHub webhook HMAC secret is loaded via AppConfig at startup (githubWebhookSecret), not read from process.env at request time. This ensures the secret is validated once at startup and is not accessible via the global environment after scrubbing.

Enhanced Input Validation

Principle: Never trust data that crosses a trust boundary. Validate everything at the point of entry.

Schema String Length Limits

All 30+ request body Schema definitions enforce explicit maxLength constraints on every string field:
  • Resource IDs (agentType, projectId, etc.): 128
  • Names (thread name, project name): 256
  • Descriptions: 2,048
  • Email addresses: 320
  • Tokens (access, refresh, invitation): 512–8,192
  • Message text (run_turn): 500,000
  • File content (base64 upload): 10,000,000
  • GitHub issue/PR body: 65,536

Array Length Limits

Arrays in GitHub-related schemas are bounded to prevent denial-of-service through arrays with millions of elements:
  • Labels: max 100 items
  • Assignees: max 50 items
  • Reviewers: max 50 items

Safe Integer Parsing

All parseInt() calls on URL parameters use a bounds-checked wrapper that returns null for NaN, negative values, or values exceeding 2^31 - 1:
function safeParseInt(value: string, min: number, max: number): number | null {
  const n = parseInt(value, 10)
  if (Number.isNaN(n) || n < min || n > max) return null
  return n
}
This prevents NaN propagation and out-of-bounds values from reaching GitHub API calls.

Resource ID Validation

All route parameters (:id, :userId, :service) are validated against a strict pattern (/^[a-zA-Z0-9_.-]{1,128}$/) before any service call. Invalid identifiers receive an immediate 400 response.

Tenant ID Length Bound

Tenant IDs are bounded to 64 characters (/^[a-zA-Z0-9_-]{1,64}$/), preventing unbounded string allocation from JWT claims.

Defense in Depth: Upstream Response Validation

Responses from upstream services (Podium, Ensemble) are validated against Effect Schema definitions before use. This prevents a compromised upstream service from injecting malicious payloads:
// PodiumClient createInstance response:
const CreateInstanceResponse = Schema.Struct({
  instance_id: Schema.String.pipe(Schema.maxLength(512)),
  deployment_id: Schema.String.pipe(Schema.maxLength(512)),
})

// EnsembleClient generate response:
const GenerateResponse = Schema.Struct({
  content: Schema.String,
  usage: Schema.Struct({
    input_tokens: Schema.Number,
    output_tokens: Schema.Number,
  }),
})

Security Audit Events

Security-relevant events are logged to the AuditService for forensic analysis. The following event types are tracked:
  • Authentication failure: security.auth_failure (WebSocket message handler)
  • Auth rate limit lockout: security.auth_rate_limited (WebSocket message handler)
  • WebSocket message rate limit: security.ws_rate_limited (WebSocket message handler)
  • CSRF rejection: security.csrf_rejected (HTTP router, WebSocket upgrade)
  • Oversized message: security.oversized_message (WebSocket message handler)
These events are stored in the per-tenant audit_log table alongside application events, enabling correlation between security incidents and application behavior.

Feature Gating

Principle: If you don’t need it, don’t include it. Unused features should not be discoverable.

Feature flags are auto-detected from the application configuration:
  • GitHub integration: enabled when GITHUB_CLIENT_ID is set; when disabled, /api/github/* and /api/oauth/github/* routes return 404
  • Prometheus metrics: enabled when METRICS_PROMETHEUS=true; when disabled, the /metrics endpoint is not served
In dev mode, all features are enabled regardless of configuration. In production, unconfigured features are invisible — their routes are not discoverable and return standard 404 responses.

Startup Configuration Validation

The gateway validates its configuration at startup and logs warnings for insecure or incomplete production configurations:
  • ALLOWED_ORIGINS empty in production (warning): all browser-origin requests will be rejected
  • AUTH_URL not set in production (error): all authentication will fail
  • WS_COMPRESSION enabled (warning): CRIME/BREACH-class side-channel risk
  • PODIUM_API_KEY empty (warning): Podium calls will be unauthenticated
  • GITHUB_WEBHOOK_SECRET empty (warning): webhook signature verification will reject all requests

Backpressure Awareness

The WebSocket drain callback tracks backpressure events via the ws_drain_events metric counter. When a client receives data too slowly, Bun buffers outbound frames; the drain callback fires when the buffer is flushed. This provides observability into slow-client conditions that could lead to memory pressure.

Runtime Role Validation

Role values from request bodies are validated against the isValidRole() function before being passed to the MembershipService. This prevents arbitrary strings from being stored as roles:
if (!isValidRole(body.role)) {
  return errorResponse({ code: "INVALID_REQUEST", message: `Invalid role: ${body.role}` })
}
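A compact way to get both the runtime check and TypeScript narrowing from one source of truth is a const array plus a type guard. A sketch (the role list comes from the RBAC section; the guard shape is an assumption):

```typescript
const ROLES = ["owner", "admin", "billing_admin", "member", "viewer"] as const
type Role = (typeof ROLES)[number]

// Narrowing type guard: arbitrary request-body strings become Role
// only after passing the runtime check.
function isValidRole(value: string): value is Role {
  return (ROLES as readonly string[]).includes(value)
}
```

After the `isValidRole(body.role)` check, the compiler treats body.role as Role, so the string can flow into MembershipService without a cast.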