State Module Overview

The State Module implements a YAML-configurable state machine that manages conversation flow and agent behavior in ARKOS.

Core Concepts

The State Module provides a declarative way to define conversation flows using YAML configuration files, with automatic transition handling based on context.

What is a State?

A state represents a distinct phase in the conversation or agent processing pipeline. Each state:
  • Has a unique name
  • Defines specific behavior when active
  • Specifies possible transitions to other states
  • Can be marked as terminal (ending the flow)

Architecture

State Class

The base State class that all states extend:
# state_module/state.py
from typing import Any, Dict, Optional

class State:
    def __init__(self, name: str, config: Dict[str, Any]):
        self.name = name
        self.is_terminal: bool = False
        self.transition = config.get("transition", {})

    def check_transition_ready(self, context: Dict[str, Any]) -> bool:
        """
        Override to define when this state can transition.
        """
        raise NotImplementedError

    async def run(self, context: Dict[str, Any], agent=None) -> Optional[Dict[str, Any]]:
        """
        Override to define state behavior.
        """
        raise NotImplementedError

AgentState Enum

Predefined states available in the system:
from enum import Enum

class AgentState(Enum):
    """Central registry of all possible agent states."""

    GREET_USER = "greet_user"
    FETCH_PRODUCT = "fetch_product"
    SUMMARIZE_RESULT = "summarize_result"
    DONE = "done"
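Because each member's value is a plain state-name string, a name read from configuration can be resolved to its enum member with `Enum`'s value constructor (a subset of the registry is repeated here to keep the snippet self-contained):

```python
from enum import Enum

class AgentState(Enum):
    """Subset of the central registry, for illustration."""
    GREET_USER = "greet_user"
    DONE = "done"

# Resolve a state-name string to its enum member
state = AgentState("done")
```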

StateHandler

Manages the state machine lifecycle and transitions:
from state_module.state_handler import StateHandler

# Initialize with YAML configuration
handler = StateHandler(yaml_path="state_module/state_graph.yaml")

# Get initial state
initial_state = handler.get_initial_state()

# Get a specific state
state = handler.get_state("agent_reply")

# Get possible transitions from current state
transitions = handler.get_transitions(current_state, messages)
# Returns: {"tt": ["state1", "state2"], "td": ["description1", "description2"]}
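One plausible way `get_transitions` could derive that `{"tt": ..., "td": ...}` shape is to pair each name in the current state's `next` list with that target state's `description`. This sketch ignores the `messages` argument and uses a plain dict in place of the parsed YAML; it is an illustration, not the actual StateHandler implementation:

```python
# Stand-in for the parsed state_graph.yaml (the real handler loads the YAML file)
GRAPH = {
    "ask_user": {
        "description": "Waiting for user input",
        "transition": {"next": ["agent_reply"]},
    },
    "agent_reply": {
        "description": "AI reasoning and response generation",
        "transition": {"next": ["ask_user", "use_tool"]},
    },
    "use_tool": {
        "description": "Execute a specific tool",
        "transition": {"next": ["agent_reply", "ask_user"]},
    },
}

def get_transitions(graph, current_state_name):
    """Pair each next-state name ("tt") with that state's description ("td")."""
    names = graph[current_state_name]["transition"]["next"]
    return {"tt": names, "td": [graph[n]["description"] for n in names]}
```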

YAML Configuration

Basic State Graph

# state_module/state_graph.yaml
states:
  ask_user:
    description: "Waiting for user input"
    type: user
    transition:
      next: [agent_reply]

  agent_reply:
    description: "AI reasoning and response generation"
    type: agent
    transition:
      next: [ask_user, use_tool]

  use_tool:
    description: "Execute a specific tool"
    type: tool
    transition:
      next: [agent_reply, ask_user]

Transition Format

A transition lists the names of the states that may follow the current one:
transition:
  next: [agent_reply, ask_user, use_tool]

State Execution Flow

Integration with Agent

The Agent uses StateHandler for flow control:
class Agent:
    def __init__(self, ..., flow: StateHandler, ...):
        self.flow = flow
        self.current_state = self.flow.get_initial_state()

    async def step(self, messages, user_id=None):
        while not self.current_state.is_terminal:
            # Run current state
            context = self.get_context()
            update = await self.current_state.run(context, self)

            if update:
                self.add_context([update])

            # Check for transition
            if self.current_state.check_transition_ready(messages):
                transition_dict = self.flow.get_transitions(
                    self.current_state, messages
                )
                transition_names = transition_dict["tt"]

                if len(transition_names) == 1:
                    next_state_name = transition_names[0]
                else:
                    # LLM chooses between options
                    next_state_name = await self.choose_transition(
                        transition_dict, messages
                    )

                self.current_state = self.flow.get_state(next_state_name)
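The single-versus-multiple branch above can be isolated into a small helper. `pick_next` and its `chooser` argument are illustrative names, not part of the Agent API; in the Agent, the chooser role is played by the LLM call:

```python
def pick_next(transition_dict, chooser):
    """Take the sole candidate directly; defer to a chooser when ambiguous."""
    names = transition_dict["tt"]
    if len(names) == 1:
        return names[0]
    return chooser(transition_dict)
```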

Transition Selection

When multiple transitions are possible, the LLM chooses:
async def choose_transition(self, transitions_dict, messages):
    """Use LLM to choose the best next state."""

    # Create tuples of (state_name, description)
    transition_tuples = list(zip(
        transitions_dict["tt"],
        transitions_dict["td"]
    ))

    prompt = f"""Given the conversation context and these options:
    {transition_tuples}
    Choose the most appropriate next state."""

    # Create Pydantic model for structured output
    NextStates = self.create_next_state_class(transition_tuples)

    # Get LLM decision
    output = await self.call_llm(
        context=[SystemMessage(content=prompt)] + messages,
        json_schema=NextStates.model_json_schema()
    )

    return json.loads(output.content)["next_state"]
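`create_next_state_class` builds the structured-output model dynamically with Pydantic. A framework-free sketch of the JSON schema such a model would roughly produce is shown below; only the `next_state` field name is taken from the code above, the rest is an assumption about the shape:

```python
def next_state_schema(transition_tuples):
    """Constrain next_state to the candidate names; embed descriptions as hints."""
    return {
        "type": "object",
        "properties": {
            "next_state": {
                "type": "string",
                "enum": [name for name, _ in transition_tuples],
                "description": "; ".join(f"{n}: {d}" for n, d in transition_tuples),
            }
        },
        "required": ["next_state"],
    }
```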

Custom State Types

Creating a Custom State

(NOTE: all states must be registered with @register_state and follow state_[name].py convention)

from state_module.state import State
from state_module.state_registry import register_state
from model_module.ArkModelNew import AIMessage

@register_state
class GreetingState(State):
    def __init__(self, name: str, config: dict):
        super().__init__(name, config)
        self.greeting = config.get("greeting", "Hello!")

    def check_transition_ready(self, context) -> bool:
        # Always ready to transition after greeting
        return True

    async def run(self, context, agent=None):
        return AIMessage(content=self.greeting)
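Assuming the YAML `type` field is what maps a state to its registered class (the state name, type value, and `greeting` key here are illustrative), a matching graph entry could look like:

```yaml
greet_user:
  description: "Greet the user at conversation start"
  type: greeting          # assumed to resolve to GreetingState via the registry
  greeting: "Welcome to ARKOS!"
  transition:
    next: [agent_reply]
```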

Tool State Example

import json

from model_module.ArkModelNew import SystemMessage
from tool_module.tool_call import AuthRequiredError

from state_module.state import State
from state_module.state_registry import register_state


@register_state
class StateTool(State):
    type = "tool"

    def __init__(self, name: str, config: dict):
        super().__init__(name, config)
        self.is_terminal = False

    def check_transition_ready(self, context):
        return True

    async def choose_tool(self, context, agent):
        """
        Chooses tool to use based on the context and server
        """
        prompt = "Based on the user request, choose the best tool. Return JSON with tool_name field."
        instructions = context + [SystemMessage(content=prompt)]

        # Get available tools and build schema
        tool_option_class = await agent.create_tool_option_class()
        json_schema = {
            "type": "json_schema",
            "json_schema": {
                "name": "tool_choice",
                "schema": tool_option_class.model_json_schema(),
            },
        }

        # Get tool choice from LLM
        output = await agent.call_llm(instructions, json_schema)
        parsed = json.loads(output.content)
        tool_name = parsed["tool_name"]

        # Get tool spec
        server_name = agent.tool_manager._tool_registry[tool_name]
        all_tools = await agent.tool_manager.list_all_tools()
        tool_spec = all_tools[server_name][tool_name]

        # Get tool arguments
        args_prompt = f"Fill in arguments for tool '{tool_name}'. Return JSON matching the schema."
        args_context = context + [SystemMessage(content=args_prompt)]
        args_schema = {
            "type": "json_schema",
            "json_schema": {
                "name": "tool_args",
                "schema": tool_spec.get("inputSchema", {"type": "object", "properties": {}}),
            },
        }

        args_output = await agent.call_llm(args_context, args_schema)
        tool_args = json.loads(args_output.content)

        return {"tool_name": tool_name, "tool_args": tool_args}

    async def execute_tool(self, tool_call, agent):
        """
        Parses and fills args for chosen tool for tool call execution
        """
        tool_name = tool_call["tool_name"]
        tool_args = tool_call["tool_args"]

        tool_result = await agent.tool_manager.call_tool(
            tool_name=tool_name,
            arguments=tool_args,
            user_id=agent.current_user_id,
        )

        return tool_result

    async def run(self, context, agent=None):
        try:
            tool_arg_dict = await self.choose_tool(context=context, agent=agent)
            tool_result = await self.execute_tool(tool_call=tool_arg_dict, agent=agent)
            return SystemMessage(content=str(tool_result))

        except AuthRequiredError as e:
            # Return friendly message with connect link
            return SystemMessage(
                content=f"To complete this request, please connect your {e.service_info.get('name', e.service)}:\n\n"
                        f" {e.connect_url}\n\n"
                        f"After connecting, try your request again."
            )

State Types Reference

| Type     | Purpose               | Terminal |
| -------- | --------------------- | -------- |
| agent    | Generate AI response  | No       |
| user     | Wait for user input   | No       |
| tool     | Execute tool call     | No       |
| decision | Make routing decision | No       |
| terminal | End conversation      | Yes      |

Best Practices

State Design Guidelines

  1. Single Responsibility: Each state should have one clear purpose
  2. Clear Transitions: Define unambiguous transition conditions
  3. Terminal States: Ensure all paths eventually reach a terminal
  4. Descriptive Names: Use clear, action-oriented state names
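Guideline 3 can be checked mechanically before deploying a graph. This sketch walks the `next` lists from the initial state and reports whether a terminal-typed (or dead-end) state is reachable; the function name and the `type == "terminal"` convention are assumptions based on the state types table above:

```python
def reaches_terminal(graph, initial):
    """BFS over the next lists; True if a terminal-typed or dead-end state is reachable."""
    seen, frontier = set(), [initial]
    while frontier:
        name = frontier.pop()
        if name in seen or name not in graph:
            continue
        seen.add(name)
        node = graph[name]
        nxt = node.get("transition", {}).get("next", [])
        if node.get("type") == "terminal" or not nxt:
            return True
        frontier.extend(nxt)
    return False
```

Running this over each graph in version control catches dead loops before the LLM ever wanders into one.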

YAML Configuration Tips

# Good: Clear state names and descriptions
states:
  gather_requirements:
    type: user
    transition:
      next: [validate_input, ask_clarification]
     

# Avoid: Ambiguous transitions
states:
  process:
    type: unknown
    transition:
      next: [next1, next2]

Debugging States

Logging State Transitions

# Agent logs state changes
print(f"agent.py CURR STATE: {self.current_state}")
print(f"agent.py IS TERMINAL?: {self.current_state.is_terminal}")

Common Issues

Stuck in a state: check that check_transition_ready() returns True:
ready = state.check_transition_ready(messages)
print(f"Transition ready: {ready}")

Poor LLM transition choices: improve the description of each candidate state, since those descriptions are what get_transitions surfaces to the LLM, for example:
help:
  description: "Answer questions about available commands"
search:
  description: "Run a search against the product catalog"

Infinite loops: ensure there is always a path to a terminal state and that the MAX_ITER limit is respected.
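The MAX_ITER guard can be sketched as a bound on the state-machine loop; the value 20 and the `drive` helper are illustrative, not the Agent's actual configuration:

```python
MAX_ITER = 20  # illustrative cap; use whatever limit the Agent configures

def drive(states, start, max_iter=MAX_ITER):
    """Advance through the graph, raising instead of spinning forever."""
    current = start
    for _ in range(max_iter):
        if states[current]["is_terminal"]:
            return current
        current = states[current]["next"]
    raise RuntimeError(f"no terminal state reached within {max_iter} iterations")
```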

Example: Complete State Graph


initial: agent_reply

states:
  # === Core States ===
  ask_user:
    description: "Waiting for user input"
    type: user
    transition:
      next: [agent_reply]

  agent_reply:
    description: "AI reasoning and response generation"
    type: agent
    transition:
      next: [agent_reply, ask_user, use_tool, autonomous_plan, learn_skill]

  use_tool:
    description: "Execute a specific tool"
    type: tool
    transition:
      next: [agent_reply, ask_user]

  # === Autonomous Planning Loop (OpenClaw-style) ===
  autonomous_plan:
    description: "Create multi-step execution plan for complex tasks"
    type: planner
    transition:
      next: [execute_step, ask_user]  # ask_user if permission needed

  execute_step:
    description: "Execute current step in the plan"
    type: executor
    transition:
      next: [execute_step, verify_progress]  # loop or verify

  verify_progress:
    description: "Check if goal achieved or continue execution"
    type: verifier
    transition:
      next: [execute_step, autonomous_plan, agent_reply, ask_user]  # continue, replan, or finish

  # === Self-Improving Skills ===
  learn_skill:
    description: "Analyze and potentially create new capabilities"
    type: skill_builder
    transition:
      next: [agent_reply, autonomous_plan]  # use skill or plan with it


Next Steps

  • Implementation Details: deep dive into state implementation
  • Agent Module: learn about agent orchestration
  • Memory Module: explore memory integration
  • Model Module: understand LLM integration