d693f690a6
* improve ci/cd runtime * update prompt tests * improve ci/cd runtime (#1535) * improve ci/cd runtime * update prompt tests * Support Python ^3.10 (#1534) * add azure * up * up * spec out v3 api * checkin document router * adding chunk abstr * add list chunks * add chunk search * up * add users routes * up * checkin progress * add collections annotations * add indices * add user * checkin work * up * complete conversations CRUD * fix type errors * add graph router * add graphs * Update JS (#1563) * Feature/add graph to v3 (#1565) * complete simple tests, cleanup routers * up * Harmonize Pagination across endpoints (#1564) * Pagination * Add fixmes * Fix nested deletion filter bug (#1567) * Remove Mintlify docs (#1569) * Nolan/list collection (#1568) * Check in * More * Fix nested transactions issue in sqlite logger * Fix update collection return type * JS V3 (#1571) * Sync collections JS * More documents * Clean up messy code * list not List * Users first pass * User tests and fixmes * More * typo * More prompts * Pre-commit improvements * Remove prints * Cleanups on conversations * Branches response * Chunks * More work on the return types * Jest config * Fix branch creation time * Fix lock * Nolan/v3 tests (#1578) * Add deprecated command back * Add warning * Fix GraphRAG tests (#1579) * More cleanup (#1580) * More cleanup * More * Fix test * More cleanups * More cleanups * More * Merge main * Python SDK V3 (#1585) * Python SDK V3 * Fix * First pass (#1586) * More V3 (#1587) * Validation errors * Update js test * more * Fix sync methods on v2 sdk, add check for download files (#1588) * More CLI (#1589) * Print logs on failing tests (#1590) * Print logs on failing tests * MOre * cleanup * Again * Again * More JS testing (#1591) * More JS testing * Cleanup * More refactors for tests (#1592) * System Routes (#1594) * Fix type errors, pass collection id (#1595) * Hotfix: dict * V3 graph implmentations (#1593) * complete simple tests, cleanup routers * up * up * checkin * up * up * response models * checkin * up * checkin * up * up * up * up * up * up * v2 * up * up * up * up --------- Co-authored-by: emrgnt-cmplxty <owen@algofi.org> * Allow passing of collection id at document ingestion (#1596) * KG Response sync (#1597) * fix * Fix Prompt Override (#1599) * Fix Prompt Override * print * Caching * Fix * Updated Graph Models, Drop SID (#1598) * New Graph Models * Fix * minor tweaks * fix summary model (#1604) * incr progress * Add /users/me (#1605) * Add /users/me * oops * Resolve Merge Conflicts (#1607) * Fix conflicts * Clean up * Nolan/conflicts (#1608) * expose reset data to admin (#1602) * up (#1603) * up * up * wtf github is a piece of garbage --------- Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com> * wrapup walkthrough * Add delete user method, sync JS to camel case (#1609) * V3 graph testing (#1606) * up * up * up * graph crud * up * community endpts * up * up * up * up * up * up * up * up * add back routers * up * pre-commit * Fix Broken V2 Graphs, Better Response Models (#1612) * Increase test coverage * Fix v2 graphs, better response models * Remaining types * Add types to Python SDK * Typo * update tests * revert test change * up * Add types to package export (#1613) * Graph refactor (#1611) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * Create branch update (#1617) * Graph refactor (#1616) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * List collections (#1619) * up * up * up * up * Graph refactor (#1620) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * up * up * up * up * Nolan/update graph (#1621) * List collections * Update Graph JS SDK * up * up * cleanup * Graph refactor (#1622) * up * up * add back routers * up * pre-commit * update tests * revert test change * up * simplify * up * add the add/remove endpoints * up * include routers back * up * up * up * up * up * up * cleanup * up * up * up * remove unnecessary functions * up * up * complete document embedding workflow * working get command on graph * checkin progress * up * add entity and relationship deletions * no verif * up * up * up * up (#1636) * up * sync graph * up * up * fix relationship distance calc. * fix issue with faulty collection filter (#1637) * Patch/alternative fix logics 2 (#1638) * fix issue with faulty collection filter * further refinements, like fixing limits * up * fix logic around include metadata and scores * fix double collection assignment * up * fix communities * working clusters * up * add collection extraction * add collection extraction * up * prep for merge * Patch/alternative up with nolan (#1643) * SDK First pass * Add feature tracking * Typo * Check in * Rebase * Add Graph tests * Fix Agent empty message bug * Check in JS routes * More tests, examples * Sync python * Expose Entity/Relationship Params in Routes (#1640) * Expose Entity/Relationship Params * Descriptions * Modify create entities * Create relationships * set parent_id * Update entitiy * Update Relationships * Check in * Ellipsis fixes * More cleanup * Start CRUD on communities * Communities DB * Explicit working path * Once again * Fail fast false * Testing around community creation * Delete community test * Update community tests * Clean up type errors, cleaner code * More cleanup * More * remove chunk_entity * Delete bad, unused methods * More * fixup crud * rm pull --------- Co-authored-by: NolanTrem <34580718+NolanTrem@users.noreply.github.com> * Feature/fix graph permissions (#1645) * update docs / collections * up * Feature/fix auth checks (#1647) * update docs / collections * up * fix super user and more * up * up (#1648) * Feature/rm v2 api (#1649) * SDK First pass * Add feature tracking * Typo * Check in * Rebase * Add Graph tests * Fix Agent empty message bug * Check in JS routes * More tests, examples * Sync python * Expose Entity/Relationship Params in Routes (#1640) * Expose Entity/Relationship Params * Descriptions * Modify create entities * Create relationships * set parent_id * Update entitiy * Update Relationships * Check in * Ellipsis fixes * More cleanup * Start CRUD on communities * Communities DB * Explicit working path * Once again * Fail fast false * Testing around community creation * Delete community test * Update community tests * Clean up type errors, cleaner code * More cleanup * More * remove chunk_entity * Delete bad, unused methods * More * remove v2 api * rm kg router * cleanups * fixup delete by filter * fixup delete by filter * fixes * up * up --------- Co-authored-by: NolanTrem <34580718+NolanTrem@users.noreply.github.com> * Improved Data Structures (#1650) * Check in * Most tests fixed * fix tables * Once more * Move to a single community table * Don't modify existing migration script--keep them atomic * Migration * Migration, more clean up * All but deletion working * Up * Feature/tweaks for prod (#1651) * tweaks for prod * up * final tweaks * Nolan/deletion (#1652) * Check in * Most tests fixed * fix tables * Once more * Move to a single community table * Don't modify existing migration script--keep them atomic * Migration * Migration, more clean up * All but deletion working * Up * Fix deletion * Working migration (#1654) * Feature/production tweaks (#1653) * tweaks for prod * up * final tweaks * prod tweaks * fixed --------- Co-authored-by: NolanTrem <34580718+NolanTrem@users.noreply.github.com> * sort --------- Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com> Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
198 lines
5.7 KiB
Python
198 lines
5.7 KiB
Python
import asyncio
|
|
import logging
|
|
import random
|
|
import time
|
|
from abc import abstractmethod
|
|
from enum import Enum
|
|
from typing import Any, Optional
|
|
|
|
from litellm import AuthenticationError
|
|
|
|
from core.base.abstractions import VectorQuantizationSettings
|
|
|
|
from ..abstractions import (
|
|
ChunkSearchResult,
|
|
EmbeddingPurpose,
|
|
default_embedding_prefixes,
|
|
)
|
|
from .base import Provider, ProviderConfig
|
|
|
|
logger = logging.getLogger()
|
|
|
|
|
|
class EmbeddingConfig(ProviderConfig):
|
|
provider: str
|
|
base_model: str
|
|
base_dimension: int
|
|
rerank_model: Optional[str] = None
|
|
rerank_url: Optional[str] = None
|
|
batch_size: int = 1
|
|
prefixes: Optional[dict[str, str]] = None
|
|
add_title_as_prefix: bool = True
|
|
concurrent_request_limit: int = 256
|
|
max_retries: int = 8
|
|
initial_backoff: float = 1
|
|
max_backoff: float = 64.0
|
|
quantization_settings: VectorQuantizationSettings = (
|
|
VectorQuantizationSettings()
|
|
)
|
|
|
|
## deprecated
|
|
rerank_dimension: Optional[int] = None
|
|
rerank_transformer_type: Optional[str] = None
|
|
|
|
def validate_config(self) -> None:
|
|
if self.provider not in self.supported_providers:
|
|
raise ValueError(f"Provider '{self.provider}' is not supported.")
|
|
|
|
@property
|
|
def supported_providers(self) -> list[str]:
|
|
return ["litellm", "openai", "ollama"]
|
|
|
|
|
|
class EmbeddingProvider(Provider):
|
|
class PipeStage(Enum):
|
|
BASE = 1
|
|
RERANK = 2
|
|
|
|
def __init__(self, config: EmbeddingConfig):
|
|
if not isinstance(config, EmbeddingConfig):
|
|
raise ValueError(
|
|
"EmbeddingProvider must be initialized with a `EmbeddingConfig`."
|
|
)
|
|
logger.info(f"Initializing EmbeddingProvider with config {config}.")
|
|
|
|
super().__init__(config)
|
|
self.config: EmbeddingConfig = config
|
|
self.semaphore = asyncio.Semaphore(config.concurrent_request_limit)
|
|
self.current_requests = 0
|
|
|
|
async def _execute_with_backoff_async(self, task: dict[str, Any]):
|
|
retries = 0
|
|
backoff = self.config.initial_backoff
|
|
while retries < self.config.max_retries:
|
|
try:
|
|
async with self.semaphore:
|
|
return await self._execute_task(task)
|
|
except AuthenticationError as e:
|
|
raise
|
|
except Exception as e:
|
|
logger.warning(
|
|
f"Request failed (attempt {retries + 1}): {str(e)}"
|
|
)
|
|
retries += 1
|
|
if retries == self.config.max_retries:
|
|
raise
|
|
await asyncio.sleep(random.uniform(0, backoff))
|
|
backoff = min(backoff * 2, self.config.max_backoff)
|
|
|
|
def _execute_with_backoff_sync(self, task: dict[str, Any]):
|
|
retries = 0
|
|
backoff = self.config.initial_backoff
|
|
while retries < self.config.max_retries:
|
|
try:
|
|
return self._execute_task_sync(task)
|
|
except AuthenticationError as e:
|
|
raise
|
|
except Exception as e:
|
|
logger.warning(
|
|
f"Request failed (attempt {retries + 1}): {str(e)}"
|
|
)
|
|
retries += 1
|
|
if retries == self.config.max_retries:
|
|
raise
|
|
time.sleep(random.uniform(0, backoff))
|
|
backoff = min(backoff * 2, self.config.max_backoff)
|
|
|
|
@abstractmethod
|
|
async def _execute_task(self, task: dict[str, Any]):
|
|
pass
|
|
|
|
@abstractmethod
|
|
def _execute_task_sync(self, task: dict[str, Any]):
|
|
pass
|
|
|
|
async def async_get_embedding(
|
|
self,
|
|
text: str,
|
|
stage: PipeStage = PipeStage.BASE,
|
|
purpose: EmbeddingPurpose = EmbeddingPurpose.INDEX,
|
|
):
|
|
task = {
|
|
"text": text,
|
|
"stage": stage,
|
|
"purpose": purpose,
|
|
}
|
|
return await self._execute_with_backoff_async(task)
|
|
|
|
def get_embedding(
|
|
self,
|
|
text: str,
|
|
stage: PipeStage = PipeStage.BASE,
|
|
purpose: EmbeddingPurpose = EmbeddingPurpose.INDEX,
|
|
):
|
|
task = {
|
|
"text": text,
|
|
"stage": stage,
|
|
"purpose": purpose,
|
|
}
|
|
return self._execute_with_backoff_sync(task)
|
|
|
|
async def async_get_embeddings(
|
|
self,
|
|
texts: list[str],
|
|
stage: PipeStage = PipeStage.BASE,
|
|
purpose: EmbeddingPurpose = EmbeddingPurpose.INDEX,
|
|
):
|
|
task = {
|
|
"texts": texts,
|
|
"stage": stage,
|
|
"purpose": purpose,
|
|
}
|
|
return await self._execute_with_backoff_async(task)
|
|
|
|
def get_embeddings(
|
|
self,
|
|
texts: list[str],
|
|
stage: PipeStage = PipeStage.BASE,
|
|
purpose: EmbeddingPurpose = EmbeddingPurpose.INDEX,
|
|
) -> list[list[float]]:
|
|
task = {
|
|
"texts": texts,
|
|
"stage": stage,
|
|
"purpose": purpose,
|
|
}
|
|
return self._execute_with_backoff_sync(task)
|
|
|
|
@abstractmethod
|
|
def rerank(
|
|
self,
|
|
query: str,
|
|
results: list[ChunkSearchResult],
|
|
stage: PipeStage = PipeStage.RERANK,
|
|
limit: int = 10,
|
|
):
|
|
pass
|
|
|
|
@abstractmethod
|
|
async def arerank(
|
|
self,
|
|
query: str,
|
|
results: list[ChunkSearchResult],
|
|
stage: PipeStage = PipeStage.RERANK,
|
|
limit: int = 10,
|
|
):
|
|
pass
|
|
|
|
def set_prefixes(self, config_prefixes: dict[str, str], base_model: str):
|
|
self.prefixes = {}
|
|
|
|
for t, p in config_prefixes.items():
|
|
purpose = EmbeddingPurpose(t.lower())
|
|
self.prefixes[purpose] = p
|
|
|
|
if base_model in default_embedding_prefixes:
|
|
for t, p in default_embedding_prefixes[base_model].items():
|
|
if t not in self.prefixes:
|
|
self.prefixes[t] = p
|