
Quick Start Guide

This guide will help you set up ARKOS and create your first intelligent agent with persistent memory.

Prerequisites

Before you begin, ensure you have:
  • Python 3.8 or higher
  • Git
  • PostgreSQL database (Supabase recommended)
  • 8GB+ RAM recommended
  • GPU optional but recommended for local models

Installation

1. Clone the Repository

git clone https://github.com/SGIARK/arkos.git
cd arkos

2. Install Dependencies

pip install -r requirements.txt

3. Configure Your Environment

Create a .env file in the project root:
cp .env.example .env
Edit .env with your configuration:
# Database Connection (REQUIRED)
# Format: postgresql://user:password@host:port/database
DB_URL=postgresql://postgres:your-password@localhost:54322/postgres

# Hugging Face Token (OPTIONAL - for gated models)
HF_TOKEN=

# MCP Server Credentials (OPTIONAL - for tool integrations)
GOOGLE_OAUTH_CREDENTIALS=
GOOGLE_CALENDAR_MCP_TOKEN_PATH=
BRAVE_API_KEY=

# OpenAI API Key (REQUIRED by mem0, but can be placeholder)
OPENAI_API_KEY=sk-placeholder
The DB_URL environment variable is required. ARKOS uses PostgreSQL for storing conversation context and Supabase for vector memory.
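
To confirm the connection string works before going further, you can run a quick check from Python. This is a minimal sketch, not part of ARKOS itself, and assumes psycopg2 and python-dotenv are installed (pip install psycopg2-binary python-dotenv):

# check_db.py -- confirm that DB_URL points at a reachable PostgreSQL instance.
# Minimal sketch; not part of ARKOS. Assumes psycopg2-binary and python-dotenv.
import os

import psycopg2
from dotenv import load_dotenv

load_dotenv()  # pick up DB_URL from .env in the current directory

with psycopg2.connect(os.environ["DB_URL"]) as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print("Database reachable:", cur.fetchone())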

Starting the Inference Engine

Check if LLM Server is Already Running

Since ARKOS is often deployed on shared servers, check if the LLM server is already running:
# Check if port 30000 is in use
lsof -i :30000

# Or verify it's responding
curl http://localhost:30000/v1/models
If you see output, the LLM server is already running - you can skip starting it.

Starting the LLM Server (if not running)

The project uses SGLang to run the Qwen 2.5-7B-Instruct model:
bash model_module/run.sh
This starts the SGLang server on port 30000 in a Docker container with GPU support. Wait for the “server started” messages before continuing.
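
If you are scripting the setup, you can poll the server until it answers instead of watching the logs. A minimal sketch, assuming the requests library is installed; any HTTP client works:

# wait_for_llm.py -- poll the SGLang server until /v1/models responds.
# Minimal sketch; assumes the requests library is installed.
import time

import requests

URL = "http://localhost:30000/v1/models"
for _ in range(60):
    try:
        resp = requests.get(URL, timeout=2)
        if resp.ok:
            print("LLM server is up:", resp.json())
            break
    except requests.ConnectionError:
        pass
    time.sleep(5)  # model loading can take several minutes
else:
    raise SystemExit("LLM server did not come up on port 30000")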

Starting the Embedding Server (if not running)

The project uses Hugging Face Text Embeddings Inference (TEI) to run the Qwen 2 1.5B-Instruct embedding model:
bash model_module/run_tei.sh
This starts the TEI embedding server on port 4444 in a Docker container with GPU support. Wait for the server to report that it is ready.
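
To confirm the embedding server is serving vectors, you can send it a single string. A minimal sketch, assuming the standard TEI /embed route on port 4444:

# check_embeddings.py -- request one embedding from the TEI server.
# Minimal sketch; assumes the default TEI /embed route on port 4444.
import requests

resp = requests.post(
    "http://localhost:4444/embed",
    json={"inputs": "hello world"},
    timeout=10,
)
resp.raise_for_status()
embedding = resp.json()[0]  # one vector per input string
print(f"Embedding dimension: {len(embedding)}")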

Running Your First Agent

1. Start the API Server

python base_module/app.py
This starts the FastAPI server on port 1111, providing the /v1/chat/completions endpoint.

2. Run the Test Interface

In another terminal:
python base_module/main_interface.py
This provides an interactive CLI to test the agent. Type your messages and press Enter. Type exit or quit to stop.

Basic Usage Examples

Using the OpenAI-Compatible API

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1111/v1",
    api_key="not-needed"  # Local deployment
)

response = client.chat.completions.create(
    model="ark-agent",
    messages=[
        {"role": "user", "content": "Hello! What can you help me with?"}
    ]
)

print(response.choices[0].message.content)

Streaming Responses

stream = client.chat.completions.create(
    model="ark-agent",
    messages=[{"role": "user", "content": "Tell me about ARKOS"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Using Memory Directly

from memory_module.memory import Memory
from model_module.ArkModelNew import UserMessage, AIMessage, SystemMessage

# Initialize memory system
memory = Memory(
    user_id="alice",
    session_id=None,  # Auto-generates session ID
    db_url="postgresql://...",
    use_long_term=True  # Enable Mem0 vector memory
)

# Store a message
memory.add_memory(UserMessage(content="My favorite color is blue"))

# Retrieve recent context
context = memory.retrieve_short_memory(turns=5)

# Retrieve relevant long-term memories
long_term = memory.retrieve_long_memory(context=context)
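
To feed the retrieved context back into the agent, something along these lines could work. This is a hedged sketch continuing the example above; it assumes the returned message objects expose role and content attributes, so check memory_module and ArkModelNew for the actual types:

# Hedged sketch only: turn retrieved messages into an OpenAI-style history.
# Assumes each message object exposes .role and .content -- verify against
# the actual classes in model_module/ArkModelNew before relying on this.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1111/v1", api_key="not-needed")

history = [{"role": m.role, "content": m.content} for m in context]
history.append({"role": "user", "content": "What is my favorite color?"})

response = client.chat.completions.create(model="ark-agent", messages=history)
print(response.choices[0].message.content)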

Configuration Files

Main Configuration (config_module/config.yaml)

app:
  host: "0.0.0.0"
  port: 1111
  reload: false
  system_prompt: "You are a helpful AI assistant."

llm:
  base_url: "http://localhost:30000/v1"

database:
  url: "${DB_URL}"

memory:
  user_id: "default_user"
  use_long_term: false  # Set to true to enable Mem0

state:
  graph_path: "state_module/state_graph.yaml"

# MCP Server Configuration (optional)
mcp_servers:
  google-calendar:
    transport: stdio
    command: npx
    args: ["-y", "@anthropic/google-calendar-mcp"]

State Graph Configuration (state_module/state_graph.yaml)


initial: agent_reply

states:
  ask_user:
    description: "state used for input from user"
    type: user 
    transition:
      next: [agent_reply]

  agent_reply:
    description: "state used for your reasoning"
    type: agent
    transition:
      next: [ask_user, use_tool]

  use_tool:
    description: "state used for tool use"
    type: tool
    transition: 
      next: [agent_reply]
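
Since every transition must point at a defined state, it can be worth validating the graph after editing it. A minimal sketch, not part of ARKOS itself, assuming PyYAML is installed:

# validate_state_graph.py -- check that every transition targets a defined state.
# Minimal sketch; not part of ARKOS itself. Assumes PyYAML is installed.
import yaml

with open("state_module/state_graph.yaml") as f:
    graph = yaml.safe_load(f)

states = graph["states"]
assert graph["initial"] in states, "initial state must be defined under states"

for name, spec in states.items():
    for target in spec["transition"]["next"]:
        assert target in states, f"{name} -> {target}: undefined state"

print(f"State graph OK: {len(states)} states, initial = {graph['initial']}")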

Testing Your Setup

Health Check

curl http://localhost:1111/health
Expected response:
{"status": "ok", "llm_server": "running", "port": 1111}

Test Chat Completion

curl -X POST http://localhost:1111/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ark-agent",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Common Issues and Solutions

Database connection errors

Ensure your DB_URL is correctly set in .env:
# Check if PostgreSQL is accessible
psql $DB_URL -c "SELECT 1"
Make sure the conversation_context table exists in your database.

LLM server not responding

Check if the SGLang server is running on port 30000:
curl http://localhost:30000/v1/models
If it is not running, start it with bash model_module/run.sh.

CUDA or GPU not available

Ensure PyTorch is installed with CUDA support:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Port already in use

Find and kill the process using the port:
lsof -i :1111
kill -9 <PID>
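
If several things seem broken at once, a small script can check the services from this guide in one pass. A minimal sketch; the ports and endpoints are taken from the sections above, and the TEI /health route is an assumption:

# diagnose.py -- one-shot check of the services this guide starts.
# Minimal sketch; the TEI /health route is an assumption, adjust if needed.
import os

import requests
from dotenv import load_dotenv

load_dotenv()

checks = {
    "LLM server (SGLang, :30000)": "http://localhost:30000/v1/models",
    "Embedding server (TEI, :4444)": "http://localhost:4444/health",
    "ARKOS API (:1111)": "http://localhost:1111/health",
}

for name, url in checks.items():
    try:
        ok = requests.get(url, timeout=3).ok
    except requests.RequestException:
        ok = False
    print(f"{name:<32} {'OK' if ok else 'NOT RESPONDING'}")

print("DB_URL set:", bool(os.getenv("DB_URL")))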

Next Steps

Getting Help