Status: historical RFC. This RFC records the boundary that preceded the v1 alpha runtime, which is now published as @spfunctions/agent/v1. The alpha package exposes a model-loop surface, sessions, hooks, watch primitives, and Cursor-style compatibility. The original constraints still hold: no endpoint expansion, no live trading, no CLI shell-out, and no hosted trace/session backend.

Current boundary

@spfunctions/agent v0 is the governed direct tool runner. It provides:
  • strict manifest loading from GET /api/contracts/tools
  • canonical direct tool calls through call()
  • event streaming through stream()
  • policy gates for permissions, sideEffect, and costEffect
  • trace record/replay
  • API-key-first live execution
v1 is different. It is an alpha model-backed workflow runtime that plans and executes governed tool calls over several turns. It must not blur into the CLI, MCP, /api/tools, or live-trading automation.

Non-goals

These remain non-goals for the v1 alpha:
  • model provider SDK dependency
  • hosted run endpoint
  • background worker
  • MCP runtime
  • browser runtime
  • live trading
  • write/default Agent tools
  • events.*
  • market.related
  • auth.status
  • investigations.create
  • intents.propose
  • webhooks.create
The v1 runtime must not shell out to sf agent. The CLI may later reuse Agent SDK internals, but the Agent SDK must remain an embeddable library.

Proposed package boundary

@spfunctions/sdk remains the typed data and contract client. @spfunctions/agent v0 remains the governed direct runner:
await agent.call("world.read")
for await (const event of agent.stream("markets.search", { query: "Fed CPI" })) {
  console.log(event)
}
@spfunctions/agent v1 adds objective-oriented runtime APIs:
const runtime = new SimpleFunctionsRuntime({
  client: sf,
  model,
  policy,
  trace,
  tools: {
    mode: "manifest-search",
    preload: ["world.read", "markets.search"],
  },
})

const result = await runtime.run({
  objective: "Research Fed CPI repricing using read-only market tools.",
})

for await (const event of runtime.runStream({
  objective: "Monitor Fed cut markets and report read-only changes.",
})) {
  console.log(event)
}
run() and runStream() must be layered on top of the v0 direct runner. They must not bypass v0 policy, identity, trace, or canonical tool resolution.
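
The layering rule can be sketched as a loop whose only route to a tool is an injected direct-call function. This is a minimal illustration under stated assumptions, not the package's implementation: `DirectCall`, `PlannedAction`, `Planner`, and `runObjective` are hypothetical names, the planner stands in for a model adapter, and the default step limit is arbitrary.

```typescript
// Hypothetical sketch: none of these names come from @spfunctions/agent.
// The point is structural: the loop has no tool access except `direct`.

type DirectCall = (tool: string, input: unknown) => Promise<unknown>

type PlannedAction =
  | { kind: "tool"; tool: string; input: unknown }
  | { kind: "finish"; output: unknown }

// Stand-in for a model adapter: given prior tool results, pick the next action.
type Planner = (objective: string, observations: unknown[]) => Promise<PlannedAction>

async function runObjective(
  objective: string,
  plan: Planner,
  direct: DirectCall,
  maxSteps = 8, // assumed default, not taken from the RFC
): Promise<{ status: "completed" | "failed"; output?: unknown; toolCalls: number }> {
  const observations: unknown[] = []
  for (let step = 0; step < maxSteps; step++) {
    const action = await plan(objective, observations)
    if (action.kind === "finish") {
      return { status: "completed", output: action.output, toolCalls: observations.length }
    }
    // Every tool call goes through the v0-style runner, so its policy,
    // identity, trace, and canonical resolution still apply.
    observations.push(await direct(action.tool, action.input))
  }
  // The step limit was reached without the planner finishing.
  return { status: "failed", toolCalls: observations.length }
}
```

Because the runner is injected, a test can pass a recording stub for `direct` and assert that no tool was reached any other way.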

Interface sketches

export interface SimpleFunctionsRuntimeOptions {
  client: SimpleFunctions
  model: ModelAdapter
  policy?: AgentPolicy
  trace?: TraceStore
  tools?: RuntimeToolSelectionPolicy
  sessionStore?: SessionStore
}

export interface RuntimeRunInput {
  objective: string
  tools?: string[]
  context?: Record<string, unknown>
  sessionId?: string
  maxSteps?: number
}

export interface RuntimeRunResult {
  runId: string
  sessionId?: string
  status: "completed" | "failed" | "blocked" | "requires_approval"
  output?: unknown
  steps: RuntimeStep[]
  usage?: RuntimeUsage
}

export interface ModelAdapter {
  name: string
  complete(input: ModelCompleteInput): Promise<ModelCompleteResult>
  stream?(input: ModelCompleteInput): AsyncIterable<ModelEvent>
}
The model adapter is an interface only. This RFC does not add OpenAI, Anthropic, Cursor, or other provider packages.
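
Because the adapter is only an interface, a scripted implementation is enough to exercise a runtime without any provider package. The sketch below assumes shapes for `ModelCompleteInput` and `ModelCompleteResult`, which this RFC names but does not define; `scriptedModel` is a hypothetical helper, and the optional `stream` method is omitted.

```typescript
// Hypothetical shapes: the RFC names these types but does not define them.
interface ModelCompleteInput {
  objective: string
  toolNames: string[]
}

interface ModelCompleteResult {
  text?: string
  toolCall?: { tool: string; input: unknown }
}

// Same required members as the ModelAdapter sketch above.
interface ModelAdapter {
  name: string
  complete(input: ModelCompleteInput): Promise<ModelCompleteResult>
}

// Returns canned results in order and records every input it receives.
// It makes no network calls, so it is safe for package-local tests.
function scriptedModel(
  script: ModelCompleteResult[],
): ModelAdapter & { seen: ModelCompleteInput[] } {
  const seen: ModelCompleteInput[] = []
  let next = 0
  return {
    name: "scripted",
    seen,
    async complete(input: ModelCompleteInput): Promise<ModelCompleteResult> {
      seen.push(input)
      const result = script[next]
      if (result === undefined) throw new Error("scripted model exhausted")
      next += 1
      return result
    },
  }
}
```

The recorded `seen` inputs make it possible to assert what the runtime offered to the model, for example that only policy-approved tool names were included.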

Tool selection

v1 should not load every broad hosted or MCP tool into context. The tool source remains GET /api/contracts/tools, not /api/tools.
export type RuntimeToolSelectionMode =
  | "explicit"
  | "manifest-search"
  | "preload-only"

export interface RuntimeToolSelectionPolicy {
  mode: RuntimeToolSelectionMode
  preload?: string[]
  maxCandidateTools?: number
}
Rules:
  • explicit: only tools supplied in RuntimeRunInput.tools
  • preload-only: only configured preloaded canonical tools
  • manifest-search: search strict contract metadata, then select a small set of canonical candidates
Broad compatibility names such as get_world_state and get_regime_history remain invalid for v1 runtime planning.
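
The three rules can be sketched as a single selection function. This is an illustration under assumptions: `ManifestTool` is a hypothetical, simplified manifest entry, the keyword match is a stand-in for whatever search the runtime uses, merging `preload` into the `manifest-search` result is a guess based on the earlier configuration example, and the fallback limit of 5 is arbitrary.

```typescript
type RuntimeToolSelectionMode = "explicit" | "manifest-search" | "preload-only"

interface RuntimeToolSelectionPolicy {
  mode: RuntimeToolSelectionMode
  preload?: string[]
  maxCandidateTools?: number
}

// Hypothetical manifest entry; the real contract schema is not shown in this RFC.
interface ManifestTool {
  name: string
  description: string
}

function selectTools(
  policy: RuntimeToolSelectionPolicy,
  manifest: ManifestTool[],
  explicitTools: string[],
  query: string,
): string[] {
  const canonical = new Set(manifest.map((t) => t.name))
  // Names absent from the strict manifest, such as broad compatibility names, are dropped.
  const onlyCanonical = (names: string[]): string[] => names.filter((n) => canonical.has(n))

  switch (policy.mode) {
    case "explicit":
      return onlyCanonical(explicitTools)
    case "preload-only":
      return onlyCanonical(policy.preload ?? [])
    case "manifest-search": {
      // Naive keyword match over name and description (assumed search strategy).
      const terms = query.toLowerCase().split(/\s+/).filter(Boolean)
      const hits = manifest
        .filter((t) => terms.some((w) => `${t.name} ${t.description}`.toLowerCase().includes(w)))
        .map((t) => t.name)
      const merged = Array.from(new Set([...onlyCanonical(policy.preload ?? []), ...hits]))
      return merged.slice(0, policy.maxCandidateTools ?? 5) // 5 is an assumed fallback
    }
  }
}
```

Filtering every mode through the canonical name set means a broad name never reaches the model, regardless of where it was supplied.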

Session and run state

v1 needs stable IDs and resumable state before it can be considered alpha.
export interface RuntimeSession {
  sessionId: string
  createdAt: string
  updatedAt: string
  objective?: string
  policySummary: Record<string, unknown>
}

export interface RuntimeStep {
  stepId: string
  runId: string
  type: "model" | "tool" | "approval" | "handoff" | "system"
  status: "started" | "completed" | "failed" | "blocked"
  tool?: string
  traceId?: string
  startedAt: string
  completedAt?: string
}

export interface SessionStore {
  get(sessionId: string): Promise<RuntimeSession | null>
  put(session: RuntimeSession): Promise<void>
}
The initial v1 can use an in-memory session store for package-local dogfooding. Hosted or database-backed sessions require a later design.
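
A minimal in-memory store that satisfies the `SessionStore` interface above might look like the following. `InMemorySessionStore` is a hypothetical name, and the copy-on-read and copy-on-write behavior is a design choice of this sketch rather than something the RFC requires.

```typescript
interface RuntimeSession {
  sessionId: string
  createdAt: string
  updatedAt: string
  objective?: string
  policySummary: Record<string, unknown>
}

interface SessionStore {
  get(sessionId: string): Promise<RuntimeSession | null>
  put(session: RuntimeSession): Promise<void>
}

// Hypothetical process-local store. State is lost when the process exits.
class InMemorySessionStore implements SessionStore {
  private readonly sessions = new Map<string, RuntimeSession>()

  // Copies the top level and policySummary so callers cannot mutate
  // stored state through a shared reference.
  private static copy(session: RuntimeSession): RuntimeSession {
    return { ...session, policySummary: { ...session.policySummary } }
  }

  async get(sessionId: string): Promise<RuntimeSession | null> {
    const found = this.sessions.get(sessionId)
    return found ? InMemorySessionStore.copy(found) : null
  }

  async put(session: RuntimeSession): Promise<void> {
    this.sessions.set(session.sessionId, InMemorySessionStore.copy(session))
  }
}
```

Keeping the interface asynchronous even for the in-memory case means a later hosted or database-backed store could be swapped in without changing callers.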

Policy, budgets, and approvals

v1 must enforce the same policy gates as v0 before any tool call:
  • identity
  • canonical tool existence
  • tool status
  • agent.callable
  • deny list
  • allow list
  • maxSideEffect
  • maxCostEffect
  • user-data auth invariants
  • live-trade hard stop
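
The gate list above reads naturally as an ordered sequence of checks where the first failure blocks the call. The sketch below is illustrative only: `ToolMeta`, `GatePolicy`, and `checkGates` are hypothetical, numeric ranks stand in for the ordered sideEffect and costEffect levels (whose real names are not listed in this RFC), and the user-data auth and live-trade gates are left out because their inputs are not described here.

```typescript
// Hypothetical tool metadata. Ranks stand in for ordered sideEffect and
// costEffect levels; higher means more impact.
interface ToolMeta {
  name: string
  active: boolean
  callable: boolean
  sideEffectRank: number
  costEffectRank: number
}

// Hypothetical policy shape for this sketch only.
interface GatePolicy {
  deny: string[]
  allow?: string[] // when set, only these tools may run
  maxSideEffectRank: number
  maxCostEffectRank: number
}

type GateResult = { allowed: true } | { allowed: false; gate: string }

// Runs the gates in the order listed above and reports the first one that fails.
function checkGates(tool: ToolMeta | undefined, policy: GatePolicy, hasIdentity: boolean): GateResult {
  const block = (gate: string): GateResult => ({ allowed: false, gate })
  if (!hasIdentity) return block("identity")
  if (!tool) return block("canonical tool existence")
  if (!tool.active) return block("tool status")
  if (!tool.callable) return block("agent.callable")
  if (policy.deny.includes(tool.name)) return block("deny list")
  if (policy.allow && !policy.allow.includes(tool.name)) return block("allow list")
  if (tool.sideEffectRank > policy.maxSideEffectRank) return block("maxSideEffect")
  if (tool.costEffectRank > policy.maxCostEffectRank) return block("maxCostEffect")
  return { allowed: true }
}
```

Returning the name of the failing gate, rather than a bare boolean, gives the runtime something concrete to put in a blocked step or event.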
Budgeting should start as counters and hard limits over known local events. It should not estimate billing unless the platform exposes reliable per-call cost metadata.
export interface RuntimeBudgetPolicy {
  maxSteps?: number
  maxToolCalls?: number
  maxCostEffect?: CostEffect
  maxSideEffect?: SideEffect
  budgetUsd?: number
}

export interface ApprovalPolicy {
  requireForSideEffectAtOrAbove?: SideEffect
  requireForCostEffectAtOrAbove?: CostEffect
}
Approvals must block execution and emit events. They must not auto-approve writes, runtime actions, paper trades, or live trades.
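
Counter-based budgeting of the kind described above can be sketched with two local counters. `BudgetCounter` and `BudgetCheck` are hypothetical names, and only the `maxSteps` and `maxToolCalls` fields are modelled, since `CostEffect`, `SideEffect`, and the semantics of `budgetUsd` are not defined in this RFC.

```typescript
// Subset of RuntimeBudgetPolicy covering the purely local counters.
interface RuntimeBudgetPolicy {
  maxSteps?: number
  maxToolCalls?: number
}

type BudgetCheck = { ok: true } | { ok: false; reason: string }

// Hypothetical counter over local events; it never estimates billing.
class BudgetCounter {
  private readonly policy: RuntimeBudgetPolicy
  private steps = 0
  private toolCalls = 0

  constructor(policy: RuntimeBudgetPolicy) {
    this.policy = policy
  }

  // Call before starting a step. Counters advance only when the step is allowed,
  // so a refused step does not consume budget.
  tryStep(kind: "model" | "tool"): BudgetCheck {
    const { maxSteps, maxToolCalls } = this.policy
    if (maxSteps !== undefined && this.steps >= maxSteps) {
      return { ok: false, reason: "maxSteps reached" }
    }
    if (kind === "tool" && maxToolCalls !== undefined && this.toolCalls >= maxToolCalls) {
      return { ok: false, reason: "maxToolCalls reached" }
    }
    this.steps += 1
    if (kind === "tool") this.toolCalls += 1
    return { ok: true }
  }
}
```

Checking before execution rather than after keeps the limits hard: a run with `maxToolCalls: 1` performs at most one tool call, not two.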

Human escalation

Human escalation is a runtime event and state transition, not an endpoint in this RFC.
export type RuntimeEvent =
  | AgentEvent
  | { type: "runtime.started"; runId: string; sessionId?: string }
  | { type: "model.started"; runId: string; stepId: string }
  | { type: "model.completed"; runId: string; stepId: string }
  | { type: "approval.required"; runId: string; stepId: string; reason: string }
  | { type: "runtime.completed"; runId: string }
  | { type: "runtime.failed"; runId: string; error: { code: string; message: string } }
No hosted approval workflow is defined here.
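
Treating escalation as a state transition means a consumer can derive run status purely from the event stream. The sketch below folds events into a status; `RunStatus` and `reduceStatus` are hypothetical, `AgentEvent` is omitted from the union because its shape is not defined here, and making `requires_approval` sticky is an assumption of this sketch (resuming would need an explicit human decision that this RFC does not specify).

```typescript
// Subset of the RuntimeEvent union above, without AgentEvent.
type RuntimeEvent =
  | { type: "runtime.started"; runId: string; sessionId?: string }
  | { type: "model.started"; runId: string; stepId: string }
  | { type: "model.completed"; runId: string; stepId: string }
  | { type: "approval.required"; runId: string; stepId: string; reason: string }
  | { type: "runtime.completed"; runId: string }
  | { type: "runtime.failed"; runId: string; error: { code: string; message: string } }

type RunStatus = "running" | "completed" | "failed" | "requires_approval"

// Folds one event into the current status. Nothing here auto-approves:
// once approval is required, later events do not move the run forward.
function reduceStatus(current: RunStatus, ev: RuntimeEvent): RunStatus {
  if (current !== "running") return current // terminal and parked states are sticky
  switch (ev.type) {
    case "approval.required":
      return "requires_approval"
    case "runtime.completed":
      return "completed"
    case "runtime.failed":
      return "failed"
    default:
      return "running"
  }
}
```

A stream consumer could apply this with `events.reduce<RunStatus>(reduceStatus, "running")`, or incrementally inside a `for await` loop.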

Trace and replay

v1 must preserve v0 replay rules:
  • strict tool + inputHash matching
  • replay miss never calls live
  • input normalization is deterministic
  • traces redact secret-shaped fields
  • model prompts must not include raw API keys
  • trace entries must distinguish model steps from direct tool calls
Model replay is a separate problem and should not be implied by v0 tool replay.
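
The first three replay rules (strict tool + inputHash matching, no live fallback on a miss, deterministic normalization) can be illustrated together. This is a sketch under assumptions: `canonicalize`, `fnv1a`, and `ReplayStore` are hypothetical, inputs are assumed to be JSON-like values, and the 32-bit FNV-1a hash is a dependency-free stand-in; a real trace store would more plausibly use a cryptographic hash.

```typescript
// Deterministic normalization: object keys are sorted recursively so that
// key order in the caller's input never changes the hash.
function canonicalize(value: unknown): string {
  if (value === undefined) return "null"
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(",")}]`
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>
    const body = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`)
      .join(",")
    return `{${body}}`
  }
  return JSON.stringify(value)
}

// Stand-in hash (32-bit FNV-1a), used only to keep the sketch self-contained.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i)
    hash = Math.imul(hash, 0x01000193)
  }
  return (hash >>> 0).toString(16).padStart(8, "0")
}

class ReplayStore {
  private readonly recorded = new Map<string, unknown>()

  // The lookup key combines the exact tool name with the hash of the normalized input.
  private key(tool: string, input: unknown): string {
    return `${tool}:${fnv1a(canonicalize(input))}`
  }

  record(tool: string, input: unknown, output: unknown): void {
    this.recorded.set(this.key(tool, input), output)
  }

  // A miss throws instead of falling back to a live call.
  replay(tool: string, input: unknown): unknown {
    const k = this.key(tool, input)
    if (!this.recorded.has(k)) throw new Error(`replay miss for ${tool}`)
    return this.recorded.get(k)
  }
}
```

Redaction and the model-step versus tool-call distinction are separate concerns and are not covered by this sketch.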

Implementation entry criteria

Do not implement v1 until these are true:
| Entry criterion | Required evidence |
| --- | --- |
| v0 direct runner is package-stable | tests, pack smoke, live smoke |
| SDK preflight is stable | no-key/auth/cost/side-effect tests |
| strict manifest drift guards pass | /api/contracts/tools and package tests |
| trace redaction and replay tests pass | Agent trace suite |
| CLI/direct parity tests pass | CLI/manifest parity suite |
| alpha release checklist exists | SDK and Agent alpha checklist RFC |
| model adapter interface is approved | RFC review |
| tool selection plan is approved | RFC review |
| session and approval plan is approved | RFC review |
If any criterion is missing, continue hardening v0 instead.

Test plan for future implementation

When implementation is approved, add tests before runtime expansion:
  • v1 refuses to construct without an API-keyed client for live mode
  • v1 resolves only canonical contract tools
  • broad names are rejected
  • model adapter receives only policy-approved tool candidates
  • maxSideEffect and maxCostEffect are enforced before tool execution
  • approval-required runs stop before tool execution
  • replay-only mode never calls live
  • trace redaction covers model context and tool inputs
  • no live trading tool can be enabled

Stop line

This RFC is complete when it documents boundaries and entry criteria. It is not complete if it adds provider dependencies, endpoint code, model calls, or runtime execution behavior.