State Module Overview

The State Module implements a YAML-configurable state machine that manages conversation flow and agent behavior in ARKOS.

Core Concepts

Conversation flows are defined declaratively in YAML configuration files. At runtime, each state reports when it is ready to transition, and the next state is picked from the transitions configured for it.

What is a State?

A state represents a distinct phase in the conversation or agent processing pipeline. Each state:
  • Has a unique name
  • Defines specific behavior when active
  • Specifies possible transitions to other states
  • Can be marked as terminal (ending the flow)

Architecture

State Class

The base State class, defined in the state_module.state module, which all states extend:
from typing import Any, Dict, Optional

class State:
    def __init__(self, name: str, config: Dict[str, Any]):
        self.name = name
        self.is_terminal: bool = False
        self.transition = config.get("transition", {})

    def check_transition_ready(self, context: Dict[str, Any]) -> bool:
        """
        Override to define when this state can transition.
        """
        raise NotImplementedError

    def run(self, context: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        """
        Override to define state behavior.
        """
        raise NotImplementedError

AgentState Enum

Predefined states available in the system:
from enum import Enum

class AgentState(Enum):
    """Central registry of all possible agent states."""

    GREET_USER = "greet_user"
    FETCH_PRODUCT = "fetch_product"
    SUMMARIZE_RESULT = "summarize_result"
    DONE = "done"
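Because each member's value is the state's string name, the enum can convert between names read from configuration and members used in code. The following is a small sketch of that lookup; it repeats the enum so it runs on its own, and `is_known_state` is a hypothetical helper, not part of ARKOS.

```python
from enum import Enum


class AgentState(Enum):
    """Same members as the registry above, repeated so this runs standalone."""

    GREET_USER = "greet_user"
    FETCH_PRODUCT = "fetch_product"
    SUMMARIZE_RESULT = "summarize_result"
    DONE = "done"


# Look up a member from its string value (e.g. a name read from config).
state = AgentState("fetch_product")

# List every registered state name, in definition order.
all_names = [member.value for member in AgentState]


def is_known_state(name: str) -> bool:
    """Hypothetical helper: True if `name` matches a registered state value."""
    return name in {member.value for member in AgentState}
```

Calling `AgentState(...)` with an unregistered name raises `ValueError`, which is one way to catch typos in state names early.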

StateHandler

Manages the state machine lifecycle and transitions:
from state_module.state_handler import StateHandler

# Initialize with YAML configuration
handler = StateHandler(yaml_path="state_module/state_graph.yaml")

# Get initial state
initial_state = handler.get_initial_state()

# Get a specific state
state = handler.get_state("agent_reply")

# Get possible transitions from current state
transitions = handler.get_transitions(current_state, messages)
# Returns a dict of parallel lists: "tt" holds the candidate state names, "td" their descriptions,
# e.g. {"tt": ["state1", "state2"], "td": ["description1", "description2"]}
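To make the shape of that return value concrete, here is a minimal sketch of how a handler could derive it from a parsed graph. `MiniStateHandler` and `GRAPH` are hypothetical stand-ins rather than the ARKOS implementation, and the sketch assumes that "td" carries the description of each target state.

```python
from typing import Any, Dict, List

# Stand-in for the parsed YAML file; the real handler reads it from yaml_path.
GRAPH: Dict[str, Any] = {
    "initial": "agent_reply",
    "states": {
        "ask_user": {
            "description": "Waiting for user input",
            "transition": {"next": ["agent_reply"]},
        },
        "agent_reply": {
            "description": "AI reasoning and response generation",
            "transition": {"next": ["ask_user"]},
        },
    },
}


class MiniStateHandler:
    """Hypothetical, simplified handler; not the ARKOS implementation."""

    def __init__(self, graph: Dict[str, Any]):
        self.initial = graph["initial"]
        self.states = graph["states"]

    def get_initial_state(self) -> str:
        return self.initial

    def get_transitions(self, current: str) -> Dict[str, List[str]]:
        # "tt": target state names; "td": description of each target (assumed).
        targets = self.states[current]["transition"]["next"]
        return {
            "tt": list(targets),
            "td": [self.states[t]["description"] for t in targets],
        }


handler = MiniStateHandler(GRAPH)
transitions = handler.get_transitions(handler.get_initial_state())
```

Keeping names and descriptions as parallel lists makes it easy to zip them into (name, description) pairs when the choice is handed to the LLM.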

YAML Configuration

Basic State Graph

# state_module/state_graph.yaml
states:
  ask_user:
    description: "Waiting for user input"
    type: user
    transition:
      next: [agent_reply]

  agent_reply:
    description: "AI reasoning and response generation"
    type: agent
    transition:
      next: [ask_user, use_tool]

  use_tool:
    description: "Execute a specific tool"
    type: tool
    transition:
      next: [agent_reply, ask_user]
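A graph like this is only usable if every name listed under `next` refers to a state that exists. The sketch below checks that on the dict a YAML parser would produce for the graph above; `find_unknown_targets` is a hypothetical helper, not an ARKOS function.

```python
from typing import Any, Dict, List

# The basic graph above, written as the dict a YAML parser would produce.
states: Dict[str, Any] = {
    "ask_user": {"type": "user", "transition": {"next": ["agent_reply"]}},
    "agent_reply": {"type": "agent", "transition": {"next": ["ask_user", "use_tool"]}},
    "use_tool": {"type": "tool", "transition": {"next": ["agent_reply", "ask_user"]}},
}


def find_unknown_targets(states: Dict[str, Any]) -> List[str]:
    """Return "source -> target" strings for targets that are not defined states."""
    problems = []
    for name, config in states.items():
        for target in config.get("transition", {}).get("next", []):
            if target not in states:
                problems.append(f"{name} -> {target}")
    return problems


unknown = find_unknown_targets(states)
```

Running such a check when the configuration is loaded surfaces a misspelled state name before the agent tries to transition to it.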

Transition Format

Each state lists the names of its possible next states under transition.next:
transition:
  next: [agent_reply, ask_user, use_tool]

State Execution Flow

Integration with Agent

The Agent uses StateHandler for flow control:
class Agent:
    def __init__(self, ..., flow: StateHandler, ...):
        self.flow = flow
        self.current_state = self.flow.get_initial_state()

    async def step(self, messages, user_id=None):
        while not self.current_state.is_terminal:
            # Run current state
            context = self.get_context()
            update = await self.current_state.run(context, self)

            if update:
                self.add_context([update])

            # Check for transition
            if self.current_state.check_transition_ready(messages):
                transition_dict = self.flow.get_transitions(
                    self.current_state, messages
                )
                transition_names = transition_dict["tt"]

                if len(transition_names) == 1:
                    next_state_name = transition_names[0]
                else:
                    # LLM chooses between options
                    next_state_name = await self.choose_transition(
                        transition_dict, messages
                    )

                self.current_state = self.flow.get_state(next_state_name)

Transition Selection

When multiple transitions are possible, the LLM chooses:
async def choose_transition(self, transitions_dict, messages):
    """Use LLM to choose the best next state."""

    # Create tuples of (state_name, description)
    transition_tuples = list(zip(
        transitions_dict["tt"],
        transitions_dict["td"]
    ))

    prompt = f"""Given the conversation context and these options:
    {transition_tuples}
    Choose the most appropriate next state."""

    # Create Pydantic model for structured output
    NextStates = self.create_next_state_class(transition_tuples)

    # Get LLM decision
    output = await self.call_llm(
        context=[SystemMessage(content=prompt)] + messages,
        json_schema=NextStates.model_json_schema()
    )

    return json.loads(output.content)["next_state"]
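The source builds a Pydantic model through `create_next_state_class`, whose body is not shown. As a rough illustration of the idea, the sketch below builds an equivalent JSON schema by hand with the standard library and validates a reply against it; `build_next_state_schema` and `parse_choice` are hypothetical names, and the schema layout is an assumption.

```python
import json
from typing import Any, Dict, List, Tuple


def build_next_state_schema(options: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Build a JSON schema whose next_state field only allows the given names."""
    return {
        "type": "object",
        "properties": {
            "next_state": {
                "type": "string",
                "enum": [name for name, _ in options],
                "description": "; ".join(f"{name}: {desc}" for name, desc in options),
            }
        },
        "required": ["next_state"],
    }


def parse_choice(raw: str, schema: Dict[str, Any]) -> str:
    """Parse the model's JSON reply and reject names outside the enum."""
    choice = json.loads(raw)["next_state"]
    allowed = schema["properties"]["next_state"]["enum"]
    if choice not in allowed:
        raise ValueError(f"{choice!r} is not one of {allowed}")
    return choice


options = [
    ("ask_user", "Waiting for user input"),
    ("use_tool", "Execute a specific tool"),
]
schema = build_next_state_schema(options)
chosen = parse_choice('{"next_state": "use_tool"}', schema)
```

Restricting `next_state` to an enum of the configured targets means a reply naming an undefined state is rejected before `get_state` is called.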

Custom State Types

Creating a Custom State

Note: every state class must be registered with @register_state and defined in a file that follows the state_[name].py naming convention.

from state_module.state import State
from state_module.state_registry import register_state
from model_module.ArkModelNew import AIMessage

@register_state
class GreetingState(State):
    def __init__(self, name: str, config: dict):
        super().__init__(name, config)
        self.greeting = config.get("greeting", "Hello!")

    def check_transition_ready(self, context) -> bool:
        # Always ready to transition after greeting
        return True

    # Declared async because the Agent awaits run()
    async def run(self, context, agent=None):
        return AIMessage(content=self.greeting)

Tool State Example

import json

from model_module.ArkModelNew import SystemMessage
from tool_module.tool_call import AuthRequiredError

from state_module.state import State
from state_module.state_registry import register_state


@register_state
class StateTool(State):
    type = "tool"

    def __init__(self, name: str, config: dict):
        super().__init__(name, config)
        self.is_terminal = False

    def check_transition_ready(self, context):
        return True

    async def choose_tool(self, context, agent):
        """
        Asks the LLM to choose a tool for the current context, then to fill in its arguments.
        """
        prompt = "Based on the user request, choose the best tool. Return JSON with tool_name field."
        instructions = context + [SystemMessage(content=prompt)]

        # Get available tools and build schema
        tool_option_class = await agent.create_tool_option_class()
        json_schema = {
            "type": "json_schema",
            "json_schema": {
                "name": "tool_choice",
                "schema": tool_option_class.model_json_schema(),
            },
        }

        # Get tool choice from LLM
        output = await agent.call_llm(instructions, json_schema)
        parsed = json.loads(output.content)
        tool_name = parsed["tool_name"]

        # Get tool spec
        server_name = agent.tool_manager._tool_registry[tool_name]
        all_tools = await agent.tool_manager.list_all_tools()
        tool_spec = all_tools[server_name][tool_name]

        # Get tool arguments
        args_prompt = f"Fill in arguments for tool '{tool_name}'. Return JSON matching the schema."
        args_context = context + [SystemMessage(content=args_prompt)]
        args_schema = {
            "type": "json_schema",
            "json_schema": {
                "name": "tool_args",
                "schema": tool_spec.get("inputSchema", {"type": "object", "properties": {}}),
            },
        }

        args_output = await agent.call_llm(args_context, args_schema)
        tool_args = json.loads(args_output.content)

        return {"tool_name": tool_name, "tool_args": tool_args}

    async def execute_tool(self, tool_call, agent):
        """
        Executes the chosen tool with the prepared arguments and returns its result.
        """
        tool_name = tool_call["tool_name"]
        tool_args = tool_call["tool_args"]

        tool_result = await agent.tool_manager.call_tool(
            tool_name=tool_name,
            arguments=tool_args,
            user_id=agent.current_user_id,
        )

        return tool_result

    async def run(self, context, agent=None):
        try:
            tool_arg_dict = await self.choose_tool(context=context, agent=agent)
            tool_result = await self.execute_tool(tool_call=tool_arg_dict, agent=agent)
            return SystemMessage(content=str(tool_result))

        except AuthRequiredError as e:
            # Return friendly message with connect link
            return SystemMessage(
                content=f"To complete this request, please connect your {e.service_info.get('name', e.service)}:\n\n"
                        f" {e.connect_url}\n\n"
                        f"After connecting, try your request again."
            )
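Both examples above rely on @register_state, whose implementation is not shown here. One plausible shape for such a decorator is a registry keyed by the class-level `type` attribute (as in `type = "tool"` above). The sketch below is an assumption along those lines; `STATE_REGISTRY`, `build_state`, and `StateGreeting` are hypothetical names, not ARKOS code.

```python
from typing import Dict, Type

# Hypothetical registry mapping a state's `type` string to its class.
STATE_REGISTRY: Dict[str, Type] = {}


def register_state(cls: Type) -> Type:
    """Class decorator: record `cls` under its `type` attribute and return it."""
    state_type = getattr(cls, "type", None)
    if state_type is None:
        raise ValueError(f"{cls.__name__} must define a class-level `type`")
    STATE_REGISTRY[state_type] = cls
    return cls


@register_state
class StateGreeting:
    type = "greeting"

    def __init__(self, name: str, config: dict):
        self.name = name
        self.greeting = config.get("greeting", "Hello!")


def build_state(name: str, config: dict):
    """Instantiate the class registered for config["type"]."""
    return STATE_REGISTRY[config["type"]](name, config)


state = build_state("greet_user", {"type": "greeting", "greeting": "Hi there"})
```

With a registry of this kind, the `type` field in the YAML graph is enough to decide which class to instantiate for each state.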

State Types Reference

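| Type     | Purpose               | Terminal |
|----------|-----------------------|----------|
| agent    | Generate AI response  | No       |
| user     | Wait for user input   | No       |
| tool     | Execute tool call     | No       |
| decision | Make routing decision | No       |
| terminal | End conversation      | Yes      |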

Best Practices

State Design Guidelines

  1. Single Responsibility: Each state should have one clear purpose
  2. Clear Transitions: Define unambiguous transition conditions
  3. Terminal States: Ensure all paths eventually reach a terminal
  4. Descriptive Names: Use clear, action-oriented state names

YAML Configuration Tips

# Good: Clear state names and descriptions
states:
  gather_requirements:
    type: user
    transition:
      next: [validate_input, ask_clarification]

# Avoid: Ambiguous transitions
states:
  process:
    type: unknown
    transition:
      next: [next1, next2]

Debugging States

Logging State Transitions

# Agent logs state changes
print(f"agent.py CURR STATE: {self.current_state}")
print(f"agent.py IS TERMINAL?: {self.current_state.is_terminal}")
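For longer debugging sessions, the same information can be routed through Python's standard `logging` module, so it can be filtered by level or silenced without editing the agent. This is a small sketch; the logger name `agent` and the `log_transition` helper are assumptions, not ARKOS code.

```python
import logging

logger = logging.getLogger("agent")


def log_transition(previous: str, current: str, is_terminal: bool) -> str:
    """Log one state change at DEBUG level and return the formatted message."""
    message = f"state {previous} -> {current} (terminal={is_terminal})"
    logger.debug(message)
    return message


# Show DEBUG messages on the console while investigating a flow.
logging.basicConfig(level=logging.DEBUG)
line = log_transition("agent_reply", "use_tool", False)
```

Raising the level to `logging.INFO` hides these messages again once the flow behaves as expected.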

Common Issues

If a state never transitions, check that check_transition_ready() returns True:
ready = state.check_transition_ready(messages)
print(f"Transition ready: {ready}")
If the LLM picks the wrong next state, improve the transition descriptions so the options are easier to tell apart:
transition:
  next: [help, search]
  t
If the agent loops indefinitely, ensure there is always a path to a terminal state and that the MAX_ITER limit is respected.

Example: Complete State Graph


initial: agent_reply

states:
  # === Core States ===
  ask_user:
    description: "Waiting for user input"
    type: user
    transition:
      next: [agent_reply]

  agent_reply:
    description: "AI reasoning and response generation"
    type: agent
    transition:
      next: [agent_reply, ask_user, use_tool, autonomous_plan, learn_skill]

  use_tool:
    description: "Execute a specific tool"
    type: tool
    transition:
      next: [agent_reply, ask_user]

  # === Autonomous Planning Loop (OpenClaw-style) ===
  autonomous_plan:
    description: "Create multi-step execution plan for complex tasks"
    type: planner
    transition:
      next: [execute_step, ask_user]  # ask_user if permission needed

  execute_step:
    description: "Execute current step in the plan"
    type: executor
    transition:
      next: [execute_step, verify_progress]  # loop or verify

  verify_progress:
    description: "Check if goal achieved or continue execution"
    type: verifier
    transition:
      next: [execute_step, autonomous_plan, agent_reply, ask_user]  # continue, replan, or finish

  # === Self-Improving Skills ===
  learn_skill:
    description: "Analyze and potentially create new capabilities"
    type: skill_builder
    transition:
      next: [agent_reply, autonomous_plan]  # use skill or plan with it

