* Feature/encapsulate orchestration (#1265)

* fully encapsulate orchestration

* fully encapsulate orchestration

* complete encapsulation

* revert import cmt

* making default r2r lighter (#1268)

* making default r2r lighter

* fix bug in ingest files

* checkin

* working update

* complete simple orch

* update docs

* up (#1273)

* up

* up

* merge (#1276)

* Postgres configuration settings (#1277)

* Improvements on Auth in JS, CLI (#1267)

* CLI Telemetry (#1266)

* check in

* working

* redundant

* JS auth improvements (#1263)

* Check in JS auth improvements

* Update login with token

* Fix to allow disabling telemetry

* fix lock

* Try to avoid merge conflicts

* Clean up collection bugs

* remove comments

* Add Postgres configuration settings

* Image

* bad github conflict

* merge (#1278)

* port KG to postgres (#1272)

* create + cluster

* local search

* up

* clean

* format

* basics

* add collection_id and paginate

* rename

* change api

* up

* kg_creation_status

* up

* up

* up

* Feature/cleanup docker (#1279)

* merge

* up

* rm neo4j refs and cleanup docker cmds

* fixup

* Patch/cleanup kg migration (#1281)

* cleanup kg migration

* up

* Kg testing (#1280)

* up

* up

* up

* up

* slay neo4j

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* add back poetry lock

* Default Collections (#1282)

* Default collections

* Naughty naughty need to follow the SRP

* Testing (#1284)

* CICD

* actions

* poetry

* poetry

* Add env vars

* name

* increase timeout

* add user to collection

* Kg testing (#1283)

* up

* up

* cleanup kg migration

* up

* up

* up

* Kg testing (#1280)

* up

* up

* up

* up

* rename

* project name

* up

* add chunk order

* fragments => extractions

* bug squash

* up

* up

* up

* change postgres project name

---------

Co-authored-by: emrgnt-cmplxty <owen@algofi.org>

* Feature/fix logic bugs (#1285)

* fixing minor logic bugs in dev branch

* fixing minor logic bugs in dev branch

* merge

* Application docs

* add image (#1287)

* Add version to CLI telemetry (#1288)

* add image

* Add version to cli telemetry

* KG hatchet orchestration (#1286)

* up

* up

* cleanup kg migration

* up

* up

* up

* Kg testing (#1280)

* up

* up

* up

* up

* rename

* project name

* up

* add chunk order

* fragments => extractions

* bug squash

* up

* up

* up

* change postgres project name

* up

* up

---------

Co-authored-by: emrgnt-cmplxty <owen@algofi.org>

* Feature/update documentation rebased (#1289)

* up

* merge

* rebase

* fix ingestion issues (#1291)

* fix ingestion issues

* fix lock file

* fix embedding

* Fix SDK KG Serialization (#1292)

* add image

* serialization

* cleanup cli (#1294)

* CLI serialization (#1295)

* add image

* Fix more serialization around kg

* Nolan/schemacreation (#1296)

* add image

* Fix more serialization around kg

* add quotes to prevent reserved keywords from failing

* Prevent errors if config name is reserved name in postgres (#1297)

* Prevent reserved words (#1298)

* Move default collection id method to utils (#1299)

* Allow json fallback (#1301)

* hotfix: import

* Fix description error (#1302)

* up (#1303)

* rename to `full` (#1304)

* rename to `full`

* add html parser

* Remove postgres vecs variables (#1306)

* Feature/rename ingest files (#1307)

* rename to `full`

* add html parser

* Vec Removal (#1308)

* Remove postgres vecs variables

* up

* change kg settings parsing (#1309)

* offset + limit (#1305)

* offset + limit

* fix order

* update query

* change entity offset

* leiden seed

---------

Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
emrgnt-cmplxty
2024-10-01 18:43:52 -07:00
committed by GitHub
parent 8644a086c3
commit 3721fcb7ad
257 changed files with 8113 additions and 8262 deletions
@@ -2,11 +2,40 @@ import logging
from abc import ABC, abstractmethod
from typing import Any, Optional
from pydantic import BaseModel
from .base import Provider, ProviderConfig
logger = logging.getLogger(__name__)
class PostgresConfigurationSettings(BaseModel):
    """
    Configuration settings with defaults defined by the PGVector docker image.
    These settings are helpful in managing the connections to the database.
    To tune these settings for a specific deployment, see https://pgtune.leopard.in.ua/
    """
    max_connections: Optional[int] = 100
    shared_buffers: Optional[int] = 16384
    effective_cache_size: Optional[int] = 524288
    maintenance_work_mem: Optional[int] = 65536
    checkpoint_completion_target: Optional[float] = 0.9
    wal_buffers: Optional[int] = 512
    default_statistics_target: Optional[int] = 100
    random_page_cost: Optional[float] = 4
    effective_io_concurrency: Optional[int] = 1
    work_mem: Optional[int] = 4096
    huge_pages: Optional[str] = "try"
    min_wal_size: Optional[int] = 80
    max_wal_size: Optional[int] = 1024
    max_worker_processes: Optional[int] = 8
    max_parallel_workers_per_gather: Optional[int] = 2
    max_parallel_workers: Optional[int] = 8
    max_parallel_maintenance_workers: Optional[int] = 2
class DatabaseConfig(ProviderConfig):
    """A base database configuration class"""
@@ -18,6 +47,11 @@ class DatabaseConfig(ProviderConfig):
    db_name: Optional[str] = None
    vecs_collection: Optional[str] = None
    project_name: Optional[str] = None
    postgres_configuration_settings: Optional[
        PostgresConfigurationSettings
    ] = None
    default_collection_name: str = "Default"
    default_collection_description: str = "Your default collection."
    def __post_init__(self):
        self.validate_config()
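
One way such settings might be consumed is to render them as `-c name=value` server options, which the `postgres` executable accepts. The sketch below is illustrative only and is not taken from this commit: `PostgresSettingsSketch` is a hypothetical stdlib dataclass standing in for a subset of `PostgresConfigurationSettings` (so the example runs without pydantic), and `to_postgres_flags` is an assumed helper name.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class PostgresSettingsSketch:
    """Hypothetical stand-in for a subset of PostgresConfigurationSettings."""

    max_connections: Optional[int] = 100
    shared_buffers: Optional[int] = 16384
    work_mem: Optional[int] = 4096
    huge_pages: Optional[str] = "try"
    checkpoint_completion_target: Optional[float] = 0.9


def to_postgres_flags(settings: PostgresSettingsSketch) -> List[str]:
    """Render every non-None setting as a `-c name=value` argument pair."""
    flags: List[str] = []
    for field in fields(settings):
        value = getattr(settings, field.name)
        if value is None:
            # A setting explicitly set to None is left to the server default.
            continue
        flags.extend(["-c", f"{field.name}={value}"])
    return flags


if __name__ == "__main__":
    # Override one value and unset another; the rest keep their defaults.
    tuned = PostgresSettingsSketch(max_connections=256, huge_pages=None)
    print(" ".join(to_postgres_flags(tuned)))
```

Treating `None` as "leave it to the server" is a design assumption here; it matches the `Optional[...]` typing of the fields but is not confirmed by the diff.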
@@ -66,3 +100,7 @@ class DatabaseProvider(Provider):
    @abstractmethod
    async def _initialize_relational_db(self) -> RelationalDBProvider:
        pass
    @abstractmethod
    def _get_table_name(self, base_name: str) -> str:
        pass
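
The diff only declares `_get_table_name` as abstract, so what a concrete provider returns is not shown. As a rough sketch under stated assumptions, a subclass could scope tables to the project and double-quote both identifiers, which would also sidestep the reserved-keyword failures mentioned in the commit message. `TableNamingSketch`, `ProjectScopedNaming`, and the schema-per-project layout are hypothetical names and choices, not the project's actual implementation.

```python
from abc import ABC, abstractmethod


def quote_identifier(identifier: str) -> str:
    """Double-quote a SQL identifier, escaping embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


class TableNamingSketch(ABC):
    """Hypothetical minimal base class mirroring the abstract hook in the diff."""

    @abstractmethod
    def _get_table_name(self, base_name: str) -> str:
        ...


class ProjectScopedNaming(TableNamingSketch):
    """Assumed concrete behavior: treat the project name as a schema."""

    def __init__(self, project_name: str) -> None:
        self.project_name = project_name

    def _get_table_name(self, base_name: str) -> str:
        # Quoting keeps reserved words such as "user" usable as names.
        return f"{quote_identifier(self.project_name)}.{quote_identifier(base_name)}"


if __name__ == "__main__":
    naming = ProjectScopedNaming("user")
    print(naming._get_table_name("documents"))
```

Because the method is abstract, a provider subclass that omits it cannot be instantiated, so every concrete provider has to define a naming rule.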