The most advanced AI retrieval system.
Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.
About
R2R (Reason to Retrieve) is an advanced AI retrieval system supporting Retrieval-Augmented Generation (RAG) with production-ready features. Built around a RESTful API, R2R offers multimodal content ingestion, hybrid search, knowledge graphs, and comprehensive document management.
R2R also includes a Deep Research API, a multi-step reasoning system that fetches relevant data from your knowledge base, the internet, or both to deliver richer, context-aware answers to complex queries.
Getting Started
Cloud Option: SciPhi Cloud
Access R2R through SciPhi's managed deployment with a generous free tier. No credit card required.
Self-Hosting Option
# Quick install and run in light mode
pip install r2r
export OPENAI_API_KEY=sk-...
python -m r2r.serve
# Or run in full mode with Docker
# git clone git@github.com:SciPhi-AI/R2R.git && cd R2R
# export R2R_CONFIG_NAME=full OPENAI_API_KEY=sk-...
# docker compose -f compose.full.yaml --profile postgres up -d
For detailed self-hosting instructions, see the self-hosting docs.
Demo
https://github.com/user-attachments/assets/173f7a1f-7c0b-4055-b667-e2cdcf70128b
Using the API
1. Install SDK & Setup
# Install SDK
pip install r2r # Python
# or
npm i r2r-js # JavaScript
# Setup API key
export R2R_API_KEY=pk_..sk_... # Get from SciPhi Cloud dashboard
2. Client Initialization
# Python
from r2r import R2RClient
client = R2RClient() # Use base_url=... for self-hosted
// JavaScript
const { r2rClient } = require('r2r-js');
const client = new r2rClient(); // Use baseURL=... for self-hosted
3. Document Operations
# Ingest sample or your own document
client.documents.create_sample(hi_res=True)
# client.documents.create(file_path="/path/to/file")
# List documents
client.documents.list()
4. Search & RAG
# Basic search
results = client.retrieval.search(query="What is DeepSeek R1?")
# RAG with citations
response = client.retrieval.rag(query="What is DeepSeek R1?")
# Agentic reasoning with RAG
response = client.retrieval.agent(
    message={"role": "user", "content": "What does DeepSeek R1 imply? Think about market, societal implications, and more."},
    rag_generation_config={
        "model": "anthropic/claude-3-7-sonnet-20250219",
        "extended_thinking": True,
        "thinking_budget": 4096,
        "temperature": 1,
        "top_p": None,
        "max_tokens_to_sample": 16000,
    },
    mode="research",  # for Deep Research style output
)
Key Features
- 📁 Multimodal Ingestion: Parse .txt, .pdf, .json, .png, .mp3, and more
- 🔍 Hybrid Search: Semantic + keyword search with reciprocal rank fusion
- 🔗 Knowledge Graphs: Automatic entity & relationship extraction
- 🤖 Agentic RAG: Reasoning agent integrated with retrieval
- 🔐 User & Access Management: Complete authentication & collection system
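
To make the hybrid search bullet above more concrete, here is a minimal sketch of reciprocal rank fusion (RRF) in plain Python. It is not R2R's implementation: the function name `reciprocal_rank_fusion`, the constant `k=60` (a commonly used default), and the document IDs are all assumptions made for illustration. Each ranked list contributes `1 / (k + rank)` to a document's score, so documents that rank well in both the semantic and the keyword list rise to the top.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Hypothetical helper for illustration only; not part of the R2R API.
    `rankings` is a list of lists, each ordered best-first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists standing in for semantic and keyword search output.
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because RRF uses only rank positions, it needs no score normalization between the two retrievers, which is one reason it is a popular choice for combining semantic and keyword results.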
Community & Contributing
- Join our Discord for support and discussion
- Submit feature requests or bug reports
- Open PRs for new features, improvements, or documentation
- Book a demo call with the SciPhi founders