Development Setup

This guide will help you set up a complete development environment for ARKOS, including all dependencies and configurations.

System Requirements

Minimum Requirements

  • OS: Ubuntu 20.04+, macOS 11+, or Windows 10+ with WSL2
  • Python: 3.8 or higher
  • RAM: 8GB
  • CPU: 4 cores
  • Storage: 20GB free space

Recommended Requirements

  • RAM: 16GB or more
  • CPU: 8+ cores for better performance
  • GPU: NVIDIA GPU with 8GB+ VRAM (for local models)
  • Storage: 50GB+ for model files
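Before continuing, it can help to confirm your machine meets the minimums. A small preflight sketch (the thresholds are the documented values above, not anything read from ARKOS itself):

```python
import platform
import shutil
import sys

# Preflight check against the documented minimums (illustrative only).
assert sys.version_info >= (3, 8), "Python 3.8+ is required"

free_gb = shutil.disk_usage(".").free / 1e9
print(f"OS: {platform.system()} {platform.release()}")
print(f"Python: {platform.python_version()}")
print(f"Free disk: {free_gb:.1f} GB (20+ GB needed)")
```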

Environment Setup

1. Python Environment

# Create virtual environment
python3 -m venv arkos-env

# Activate environment
# On Linux/Mac:
source arkos-env/bin/activate
# On Windows:
arkos-env\Scripts\activate

# Upgrade pip
pip install --upgrade pip

2. Clone Repository

# Clone the repository
git clone https://github.com/SGIARK/arkos.git
cd arkos

3. Install Dependencies

pip install -r requirements.txt

Dependencies Overview

Current Dependencies (requirements.txt)

openai>=1.61.0
pyyaml>=6.0.2
pydantic>=2.10.6
requests>=2.32.3
fastapi>=0.115.0
uvicorn>=0.32.0
psycopg2-binary>=2.9.11
vecs
mem0ai
sentence-transformers
python-dotenv>=1.0.0
aiohttp>=3.9.0
google-auth-oauthlib>=1.2.0

Dependency Details

  • openai: AsyncOpenAI client for LLM communication
  • pyyaml: YAML configuration parsing
  • pydantic: Data validation and message schemas
  • fastapi + uvicorn: API server
  • psycopg2-binary: PostgreSQL database adapter
  • mem0ai: Vector memory with Supabase
  • sentence-transformers: Embeddings for memory
  • python-dotenv: Environment variable loading
  • aiohttp: Async HTTP client
  • google-auth-oauthlib: Google OAuth for calendar

Configuration Files

1. Environment Variables (.env)

Create a .env file in the project root:
cp .env.example .env
Edit .env with your configuration:
# Database Connection (REQUIRED)
# Format: postgresql://user:password@host:port/database
DB_URL=postgresql://postgres:your-password@localhost:54322/postgres

# Hugging Face Token (OPTIONAL - for gated models)
HF_TOKEN=

# MCP Server Credentials (OPTIONAL - for tool integrations)
GOOGLE_OAUTH_CREDENTIALS=
GOOGLE_CALENDAR_MCP_TOKEN_PATH=
BRAVE_API_KEY=

# OpenAI API Key (REQUIRED by mem0, but can be placeholder)
OPENAI_API_KEY=sk-placeholder
The DB_URL environment variable is required. ARKOS uses PostgreSQL for conversation context storage.
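Since python-dotenv is already in requirements.txt, the same check can be done from Python. A minimal sketch (the credential-masking split is illustrative):

```python
import os

try:
    from dotenv import load_dotenv  # from python-dotenv in requirements.txt
    load_dotenv()  # reads .env from the current working directory
except ImportError:
    pass  # fall back to whatever is already in the environment

db_url = os.environ.get("DB_URL", "")
if not db_url:
    print("DB_URL is not set; copy .env.example to .env first")
else:
    # Avoid echoing credentials: print only the host:port/database part.
    print("Database:", db_url.rsplit("@", 1)[-1])
```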

2. YAML Configuration

The main configuration file is config_module/config.yaml:
app:
  host: "0.0.0.0"
  port: 1111
  reload: false
  system_prompt: "You are a helpful AI assistant."

llm:
  base_url: "http://localhost:30000/v1"

database:
  url: "${DB_URL}"

memory:
  user_id: "default_user"
  use_long_term: false

state:
  graph_path: "state_module/state_graph.yaml"

# MCP Server Configuration (optional)
mcp_servers:
  google-calendar:
    transport: stdio
    command: npx
    args: ["-y", "@anthropic/google-calendar-mcp"]

Database Setup

PostgreSQL (Required)

ARKOS requires PostgreSQL for conversation context storage.

Using Supabase

  1. Create a Supabase project
  2. Get your connection string from Settings > Database
  3. Set DB_URL in your .env file

Using Local PostgreSQL

# Install PostgreSQL
sudo apt-get install postgresql postgresql-contrib

# Create database
sudo -u postgres psql
CREATE DATABASE arkos;
CREATE USER arkos_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE arkos TO arkos_user;
\q

Database Schema

Create the required table:
CREATE TABLE conversation_context (
    id SERIAL PRIMARY KEY,
    user_id VARCHAR(255) NOT NULL,
    session_id VARCHAR(255) NOT NULL,
    role VARCHAR(50) NOT NULL,
    message TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_user_id ON conversation_context(user_id);
CREATE INDEX idx_session_id ON conversation_context(session_id);
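The table stores one row per message turn. A sketch of the access pattern (store_turn, recent_turns, and the SQL strings are illustrative names, not ARKOS's own code; conn is any DB-API connection, e.g. from psycopg2.connect(os.environ["DB_URL"])):

```python
# Column names match the schema above; %s placeholders are psycopg2-style.
SQL_INSERT = (
    "INSERT INTO conversation_context (user_id, session_id, role, message) "
    "VALUES (%s, %s, %s, %s)"
)
SQL_RECENT = (
    "SELECT role, message FROM conversation_context "
    "WHERE session_id = %s ORDER BY created_at DESC LIMIT %s"
)

def store_turn(conn, user_id, session_id, role, message):
    """Append one conversation turn."""
    with conn.cursor() as cur:
        cur.execute(SQL_INSERT, (user_id, session_id, role, message))
    conn.commit()

def recent_turns(conn, session_id, limit=20):
    """Fetch the most recent turns for a session, newest first."""
    with conn.cursor() as cur:
        cur.execute(SQL_RECENT, (session_id, limit))
        return cur.fetchall()
```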

Running SGLang

Check if Already Running

On shared servers, the LLM server may already be running:
# Check if port 30000 is in use
lsof -i :30000

# Or verify it's responding
curl http://localhost:30000/v1/models

Start the LLM Server

cd arkos
bash model_module/run.sh
This starts the SGLang server on port 30000 with the Qwen2.5-7B-Instruct model.
Only one instance can run on port 30000 at a time. Coordinate with your team on shared servers.
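Model loading can take a few minutes, so scripts that depend on the server may want to poll until it answers. A convenience sketch with the standard library, against the same /v1/models endpoint queried with curl above (the helper name is ours):

```python
import json
import time
import urllib.error
import urllib.request

def wait_for_server(url="http://localhost:30000/v1/models", timeout=300):
    """Poll the OpenAI-compatible /v1/models endpoint until it responds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                data = json.load(resp)
                return [m["id"] for m in data.get("data", [])]
        except (urllib.error.URLError, OSError):
            time.sleep(5)  # server not up yet; retry
    raise TimeoutError(f"no response from {url} within {timeout}s")
```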

Running the Application

1. Start the LLM Server

# Terminal 1
bash model_module/run.sh
Wait for “server started” messages.

2. Start the API Server

# Terminal 2
python base_module/app.py
This starts FastAPI on port 1111.

3. Start the CLI Interface (Optional)

# Terminal 3
python base_module/main_interface.py

Verify Setup

# Health check
curl http://localhost:1111/health

# Test chat
curl -X POST http://localhost:1111/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ark-agent", "messages": [{"role": "user", "content": "Hello!"}]}'

IDE Configuration

VS Code

Create .vscode/settings.json:
{
  "python.defaultInterpreterPath": "${workspaceFolder}/arkos-env/bin/python",
  "python.linting.enabled": true,
  "python.linting.flake8Enabled": true,
  "python.formatting.provider": "black",
  "python.testing.pytestEnabled": true,
  "python.testing.pytestArgs": ["tests"],
  "editor.formatOnSave": true
}

PyCharm

  1. Set Python interpreter to virtual environment
  2. Enable pytest as test runner
  3. Configure code style with Black formatter

Testing

Running Tests

# Run specific test files
python model_module/tests_arkmodel.py
python config_module/test_config_loader.py
python tool_module/test_tool_call.py

# Run with pytest (if available)
pytest tests/

Test Configuration

# pytest.ini (if using pytest)
[pytest]
testpaths = tests
python_files = test_*.py
asyncio_mode = auto

Debugging

Enable Debug Logging

import logging
logging.basicConfig(level=logging.DEBUG)

VS Code Launch Configuration

Create .vscode/launch.json:
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Debug API Server",
      "type": "python",
      "request": "launch",
      "program": "${workspaceFolder}/base_module/app.py",
      "console": "integratedTerminal"
    },
    {
      "name": "Debug CLI",
      "type": "python",
      "request": "launch",
      "program": "${workspaceFolder}/base_module/main_interface.py",
      "console": "integratedTerminal"
    }
  ]
}

Common Issues

Import errors (ModuleNotFoundError)

Add the project root to PYTHONPATH:
export PYTHONPATH="${PYTHONPATH}:$(pwd)"

Database connection failures

Verify your DB_URL in .env:
psql $DB_URL -c "SELECT 1"

LLM server not responding

Check if SGLang is running:
curl http://localhost:30000/v1/models
If it is not, start it with bash model_module/run.sh.

Port 1111 already in use

Find and kill the process:
lsof -i :1111
kill -9 <PID>

CUDA out of memory

Reduce batch size or use a smaller model:
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:512'

Missing environment variables

Ensure the .env file exists in the project root:
ls -la .env
cat .env

Project Structure

arkos/
├── agent_module/         # Agent orchestration
│   ├── __init__.py
│   └── agent.py
├── base_module/          # API server and CLI
│   ├── __init__.py
│   ├── app.py           # FastAPI server
│   ├── auth.py          # OAuth routes
│   └── main_interface.py # CLI
├── config_module/        # Configuration
│   ├── __init__.py
│   ├── loader.py        # Config loading
│   ├── config.yaml      # Main config
│   └── state_graph.yaml # State machine
├── memory_module/        # Memory system
│   ├── __init__.py
│   └── memory.py
├── model_module/         # LLM interface
│   ├── __init__.py
│   ├── ArkModelNew.py
│   └── run.sh           # SGLang startup
├── state_module/         # State machine
│   ├── __init__.py
│   ├── state.py
│   ├── state_handler.py
│   └── state_registry.py
├── tool_module/          # MCP tools
│   ├── __init__.py
│   ├── tool_call.py     # MCP client
│   ├── token_store.py   # OAuth tokens
│   └── transports/      # Transport layer
├── .env.example
├── .gitignore
├── README.md
└── requirements.txt

Next Steps