πŸ“– Documentation

🎯 What is ROMA?


ROMA is a meta-agent framework that uses recursive hierarchical structures to solve complex problems. By breaking tasks down into parallelizable components, ROMA enables agents to tackle sophisticated reasoning challenges while maintaining a transparency that makes context engineering and iteration straightforward. The framework offers parallel problem solving, where agents work simultaneously on different parts of a complex task; transparent development, with a clear structure for easy debugging; and proven performance, demonstrated through our search agent's strong benchmark results.

We've shown the framework's effectiveness, but this is just the beginning. As an open-source and extensible platform, ROMA is designed for community-driven development, allowing you to build and customize agents for your specific needs while benefiting from the collective improvements of the community.

πŸ—οΈ How It Works

ROMA framework processes tasks through a recursive plan–execute loop:

```python
def solve(task):
    if is_atomic(task):                     # Step 1: Atomizer
        return execute(task)                # Step 3: Executor
    else:
        subtasks = plan(task)               # Step 2: Planner
        results = []
        for subtask in subtasks:
            results.append(solve(subtask))  # Recursive call
        return aggregate(results)           # Step 4: Aggregator

# Entry point:
answer = solve(initial_request)
```
  1. Atomizer – Decides whether a request is atomic (directly executable) or requires planning.
  2. Planner – If planning is needed, the task is broken into smaller subtasks. Each subtask is fed back into the Atomizer, making the process recursive.
  3. Executors – Handle atomic tasks. Executors can be LLMs, APIs, or even other agents, as long as they implement an `agent.execute()` interface.
  4. Aggregator – Collects and integrates results from subtasks. Importantly, the Aggregator produces the answer to the original parent task, not just raw child outputs.
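The executor contract above can be sketched in a few lines. `SimpleExecutor` is a hypothetical name invented for this sketch; the only thing the framework relies on is the `execute()` method:

```python
# Minimal sketch of the executor contract described above.
# "SimpleExecutor" is a hypothetical name; only execute() matters.
class SimpleExecutor:
    def execute(self, task: str) -> str:
        # In a real executor this could call an LLM, an API, or another agent.
        return f"result for: {task}"

executor = SimpleExecutor()
print(executor.execute("summarize the report"))  # result for: summarize the report
```

Because the contract is duck-typed, swapping an LLM-backed executor for an API-backed one requires no changes elsewhere in the loop.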

πŸ“ Information Flow

This structure makes the system flexible, recursive, and dependency-aware: capable of decomposing complex problems into smaller steps while ensuring results are integrated coherently.
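As a rough illustration (not ROMA's actual implementation), the recursive loop can run independent subtasks concurrently. The `plan()` function below is a toy planner invented for this sketch:

```python
import asyncio

def plan(task: str) -> list[str]:
    # Toy planner invented for this sketch: a " + "-joined request
    # splits into independent subtasks; anything else is atomic.
    parts = task.split(" + ")
    return parts if len(parts) > 1 else []

async def solve(task: str) -> str:
    subtasks = plan(task)
    if not subtasks:                     # atomic: execute directly
        return f"done: {task}"
    # Independent subtasks run concurrently, then results are aggregated.
    results = await asyncio.gather(*(solve(s) for s in subtasks))
    return " | ".join(results)

print(asyncio.run(solve("research A + research B")))
# done: research A | done: research B
```

The key point is that `asyncio.gather` fans out sibling subtasks in parallel while recursion handles arbitrary nesting depth.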

```mermaid
flowchart TB
    A[Your Request] --> B{Atomizer}
    B -->|Plan Needed| C[Planner]
    B -->|Atomic Task| D[Executor]

    %% Planner spawns subtasks
    C --> E[Subtasks]
    E --> G[Aggregator]

    %% Recursion
    E -.-> B

    %% Execution + Aggregation
    D --> F[Final Result]
    G --> F

    style A fill:#e1f5fe
    style F fill:#c8e6c9
    style B fill:#fff3e0
    style C fill:#ffe0b2
    style D fill:#d1c4e9
    style G fill:#c5cae9
```

πŸš€ 30-Second Quick Start

```bash
git clone https://github.com/sentient-agi/ROMA.git
cd ROMA

# Run the automated setup
./setup.sh
```

Choose between Docker or native installation when prompted.

πŸ› οΈ Technical Stack

πŸ“¦ Installation Options

```bash
# Main setup (choose Docker or Native)
./setup.sh

# Optional: Setup E2B sandbox integration
./setup.sh --e2b

# Test E2B integration
./setup.sh --test-e2b
```

Command Line Options

```bash
./setup.sh --docker               # Run Docker setup directly
./setup.sh --docker-from-scratch  # Rebuild Docker images/containers from scratch (down -v, no cache)
./setup.sh --native               # Run native setup directly (macOS/Ubuntu/Debian)
./setup.sh --e2b                  # Setup E2B template (requires E2B_API_KEY + AWS creds)
./setup.sh --test-e2b             # Test E2B template integration
./setup.sh --help                 # Show all available options
```

Manual Installation

See setup docs for detailed instructions.

πŸ—οΈ Optional: E2B Sandbox Integration

For secure code execution capabilities, optionally set up E2B sandboxes:

```bash
# After main setup, configure E2B (requires E2B_API_KEY and AWS credentials in .env)
./setup.sh --e2b

# Test E2B integration
./setup.sh --test-e2b
```

E2B Features:

πŸ€– Pre-built Agents

Note: These agents are demonstrations built with ROMA's framework through simple vibe-prompting and minimal manual tuning. They showcase how easily you can create high-performance agents with ROMA, rather than being polished, production-ready solutions. Our mission is to empower the community to build, share, and get rewarded for creating innovative agent recipes and use-cases.

ROMA comes with example agents that demonstrate the framework's capabilities:

πŸ” General Task Solver

A versatile agent powered by ChatGPT Search Preview for handling diverse tasks:

Perfect for: General research, fact-checking, exploratory analysis, quick information gathering

πŸ”¬ Deep Research Agent

A comprehensive research system that breaks down complex research questions into manageable sub-tasks:

Perfect for: Academic research, market analysis, competitive intelligence, technical documentation

πŸ’Ή Crypto Analytics Agent

Specialized financial analysis agent with deep blockchain and DeFi expertise:

Perfect for: Token research, portfolio analysis, DeFi protocol evaluation, market trend analysis

All three agents demonstrate ROMA's recursive architecture in action, showing how complex queries that would overwhelm single-pass systems can be elegantly decomposed and solved. They serve as templates and inspiration for building your own specialized agents.

Your First Agent in 5 Minutes

```bash
./setup.sh  # Automated setup with Docker or native installation
```

Access all the pre-defined agents through the frontend on localhost:3000 after setting up the backend on localhost:5000. Please check out the Setup and Agents guides to get started!

```python
# Your first agent in 3 lines
from sentientresearchagent import SentientAgent

agent = SentientAgent.create()
result = await agent.run("Create a podcast about AI safety")
```

πŸ“Š Benchmarks

We evaluate a simple search system built with ROMA, called ROMA-Search, across three benchmarks: SEAL-0, FRAMES, and SimpleQA. Below are the performance graphs for each benchmark.

SEAL-0

SealQA is a new, challenging benchmark for evaluating search-augmented language models on fact-seeking questions where web search yields conflicting, noisy, or unhelpful results.

SEAL-0 Results


FRAMES

View full results

A comprehensive evaluation dataset designed to test the capabilities of Retrieval-Augmented Generation (RAG) systems across factuality, retrieval accuracy, and reasoning.

FRAMES Results


SimpleQA

View full results

A factuality benchmark that measures the ability of language models to answer short, fact-seeking questions.

SimpleQA Results

✨ Features

πŸ”„ Recursive Task Decomposition

Automatically breaks down complex tasks into manageable subtasks with intelligent dependency management. Runs independent sub-tasks in parallel.

πŸ€– Agent Agnostic

Works with any provider (OpenAI, Anthropic, Google, local models) through a unified interface: as long as your agent implements a `run()` method, you can use it!
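To illustrate the duck-typed contract (with `EchoAgent` as a hypothetical stand-in for a real provider wrapper), any object exposing `run()` satisfies the interface:

```python
# Hypothetical stand-in for an OpenAI/Anthropic/Google/local-model wrapper.
class EchoAgent:
    def run(self, prompt: str) -> str:
        return prompt.upper()

def dispatch(agent, prompt: str) -> str:
    # The framework only depends on the run() contract (duck typing),
    # so any provider wrapper exposing run() is interchangeable.
    return agent.run(prompt)

print(dispatch(EchoAgent(), "hello roma"))  # HELLO ROMA
```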

πŸ” Complete Transparency

Stage tracing shows exactly what happens at each step, so you can debug and optimize with full visibility.

πŸ”Œ Connect Any Tool

Seamlessly integrate external tools and protocols with configurable intervention points. Already includes production-grade connectors such as E2B, file-read-write, and more.

πŸ™ Acknowledgments

This framework would not have been possible without these amazing open-source contributions!

πŸ“š Citation

If you use the ROMA repo in your research, please cite:

```bibtex
@software{al_zubi_2025_17052592,
  author       = {Al-Zubi, Salah and
                  Nama, Baran and
                  Kaz, Arda and
                  Oh, Sewoong},
  title        = {SentientResearchAgent: A Hierarchical AI Agent
                   Framework for Research and Analysis
                  },
  month        = sep,
  year         = 2025,
  publisher    = {Zenodo},
  version      = {ROMA},
  doi          = {10.5281/zenodo.17052592},
  url          = {https://doi.org/10.5281/zenodo.17052592},
  swhid        = {swh:1:dir:69cd1552103e0333dd0c39fc4f53cb03196017ce
                   ;origin=https://doi.org/10.5281/zenodo.17052591;vi
                   sit=swh:1:snp:f50bf99634f9876adb80c027361aec9dff97
                   3433;anchor=swh:1:rel:afa7caa843ce1279f5b4b29b5d3d
                   5e3fe85edc95;path=salzubi401-ROMA-b31c382
                  },
}
```

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.