Paul Spedding 0aeb00cbb1 File types for Python, JavaScript, TypeScript and CSS added. (#2073)
* Update with local changes

* removed unneeded files

* removed unneeded files

* removed all unneeded files

* update .gitignore to exclude specific env files and modify docker-compose to build r2r service

* Stop tracking docker/env directory. They're my API keys mang

* Stop tracking docker/env directory. hopefully not tracking now

* gitignore fix & stopped tracking custom compose.yaml

* Update .gitignore to stop tracking docker/compose.yaml and clean up redundant entries

* Remove README.md sym link and add PythonParser to core parsers; introduce PY document type

* Add CSS, JS, and TS parsers; update document types and parser exports. Untested.

* Add DockerfileParser to parse and structure Dockerfile content. still need to test all new parsers in running image.

* Add Dockerfile and Docker Compose parsers these are a test not sure if i will keep; update document types and exports

* Remove Dockerfile and Docker Compose parsers from core exports and ingestion provider; update document types accordingly. The test didn't work: a Dockerfile has no file extension, and a docker-compose file can be .yaml or .yml.

* recreate README.md

* update .gitignore to exclude 'paul/' and 'Todo/' directories; add Todo file for debugging notes, but removed in .gitignore

* Delete README.md

* add symlink to README.md pointing to ./py/README.md

* remove DockerfileParser and DockerComposeParser from text parsers

* remove ProfileRouter and update import paths in ingestion and llm modules

* Ignore .gitignore

* Delete .gitignore

* Delete Todo

My personal project todo list.

* Add .gitignore file to exclude unnecessary files and directories

* Fix import path for GenerationConfig in litellm.py

* Fix import statements in base.py to use correct paths

* Refactor parser imports and enhance pre-commit workflow to automatically commit changes

* Update pre-commit configuration to exclude virtual environment and improve print statement checks

* Add new file types support and update .gitignore for pre-commit configuration

* Fix regex pattern in JSParser to correctly match arrow functions

* Add global cleanup fixture to remove leftover documents after tests

* Add global cleanup fixture to remove leftover documents and collections after tests

* Refactor global cleanup fixture to use AsyncGenerator and enhance database cleanup logic

* Refactor global cleanup fixture to use AsyncGenerator and improve cleanup logic

* Remove automatic commit step from quality workflow

* Add TypeScript ignore comments for axios module declaration and GitHub flow build

* Remove cleanup function from conftest.py in the test/integration directory

* Remove empty __init__.py files from integration and unit test directories
2025-03-24 10:58:33 -07:00


The most advanced AI retrieval system.

Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.

About

R2R (Reason to Retrieve) is an advanced AI retrieval system supporting Retrieval-Augmented Generation (RAG) with production-ready features. Built around a RESTful API, R2R offers multimodal content ingestion, hybrid search, knowledge graphs, and comprehensive document management.

R2R also includes a Deep Research API, a multi-step reasoning system that fetches relevant data from your knowledge base and/or the internet to deliver richer, context-aware answers to complex queries.

Getting Started

Cloud Option: SciPhi Cloud

Access R2R through SciPhi's managed deployment with a generous free tier. No credit card required.

Self-Hosting Option

# Quick install and run in light mode
pip install r2r
export OPENAI_API_KEY=sk-...
python -m r2r.serve

# Or run in full mode with Docker
# git clone git@github.com:SciPhi-AI/R2R.git && cd R2R
# export R2R_CONFIG_NAME=full OPENAI_API_KEY=sk-...
# docker compose -f compose.full.yaml --profile postgres up -d

For detailed self-hosting instructions, see the self-hosting docs.

Demo

https://github.com/user-attachments/assets/173f7a1f-7c0b-4055-b667-e2cdcf70128b

Using the API

1. Install SDK & Setup

# Install SDK
pip install r2r  # Python
# or
npm i r2r-js    # JavaScript

# Setup API key
export R2R_API_KEY=pk_..sk_...  # Get from SciPhi Cloud dashboard
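
Before creating a client, it can help to confirm that the key exported above is actually visible to your Python process. The sketch below is only an illustration: `get_r2r_api_key` is a hypothetical helper, not part of the R2R SDK, and it assumes the key is supplied through the `R2R_API_KEY` environment variable as shown in the setup step.

```python
import os


def get_r2r_api_key(env=os.environ):
    """Return the R2R API key from the environment, or fail with a clear error.

    Hypothetical helper for illustration; it is not part of the r2r package.
    """
    key = env.get("R2R_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "R2R_API_KEY is not set; export it before creating a client"
        )
    return key


# Example with an explicit mapping so the snippet runs without a real key.
print(get_r2r_api_key({"R2R_API_KEY": "example-key"}))
```

Calling `get_r2r_api_key()` with no arguments reads from the real environment and raises early if the export step was skipped.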

2. Client Initialization

# Python
from r2r import R2RClient
client = R2RClient()  # Use base_url=... for self-hosted

// JavaScript
const { r2rClient } = require('r2r-js');
const client = new r2rClient();  // Use baseURL=... for self-hosted

3. Document Operations

# Ingest sample or your own document
client.documents.create_sample(hi_res=True)
# client.documents.create(file_path="/path/to/file")

# List documents
client.documents.list()

4. Search & RAG

# Basic search
results = client.retrieval.search(query="What is DeepSeek R1?")

# RAG with citations
response = client.retrieval.rag(query="What is DeepSeek R1?")

# Agentic reasoning with RAG
response = client.retrieval.agent(
  message={"role":"user", "content": "What does deepseek r1 imply? Think about market, societal implications, and more."},
  rag_generation_config={
    "model": "anthropic/claude-3-7-sonnet-20250219",
    "extended_thinking": True,
    "thinking_budget": 4096,
    "temperature": 1,
    "top_p": None,
    "max_tokens_to_sample": 16000,
  },
  mode="research" # for Deep Research style output
)

Key Features

  • 📁 Multimodal Ingestion: Parse .txt, .pdf, .json, .png, .mp3, and more
  • 🔍 Hybrid Search: Semantic + keyword search with reciprocal rank fusion
  • 🔗 Knowledge Graphs: Automatic entity & relationship extraction
  • 🤖 Agentic RAG: Reasoning agent integrated with retrieval
  • 🔐 User & Access Management: Complete authentication & collection system
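
To make the "reciprocal rank fusion" mentioned under Hybrid Search concrete, here is a minimal, generic sketch of the technique. It is not R2R's internal implementation: the function name, the document IDs, and the constant `k=60` (a commonly used default) are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1; the scores are summed across lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a semantic search and a keyword search.
semantic = ["doc_a", "doc_b", "doc_c"]
keyword = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

Documents that rank well in both lists rise to the top, while the constant `k` dampens the influence of any single list's top position.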

Community & Contributing

Our Contributors