Feature/improve docs #1053

emrgnt-cmplxty · 2024-09-05T02:32:20Z

No description provided.

* Feature/remove extra r2r abstraction (#996)
* moving kg construction to enrich-graph (#984)
* checkin
* up
* done
* formatting
* Feature/update ingestion issues (#985)
* update ingestion issues
* keep unbounded limit support, but default to bounded
* fix
* fmt
* removes an unnecessary abstraction
* sync changes

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* first commit
* move towards orchestration
* tweaks
* check in working ingestion
* move
* kg enrichment
* update future, postgres compose
* hatchetize ingestion pipeline
* ready for prime time
* finish

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>

* moving kg construction to enrich-graph (#984)
* checkin
* up
* done
* formatting
* Feature/update ingestion issues (#985)
* update ingestion issues
* keep unbounded limit support, but default to bounded
* fix
* fmt
* Add support for CharacterTextSplitter (#986)
* Add support for CharacterTextSplitter

  Allows R2R client to override the text splitter. Example:

  ```python
  ingestion_response = client.ingest_files(
      file_paths=[file_path],
      metadatas=metadata,
      # optionally override chunking settings at runtime
      chunking_settings={
          "provider": "r2r",
          "method": "character",
          "extra_fields": {
              "separator": "---"
          },
      }
  )
  ```

* fixup! Add support for CharacterTextSplitter
* fixup! fixup! Add support for CharacterTextSplitter
* Patch/ollama base cli (#992)
* Dev (#990)
* moving kg construction to enrich-graph (#984)
* checkin
* up
* done
* formatting
* Feature/update ingestion issues (#985)
* update ingestion issues
* keep unbounded limit support, but default to bounded
* fix
* fmt
* Add support for CharacterTextSplitter (#986)
* Add support for CharacterTextSplitter

  Allows R2R client to override the text splitter. Example:

  ```python
  ingestion_response = client.ingest_files(
      file_paths=[file_path],
      metadatas=metadata,
      # optionally override chunking settings at runtime
      chunking_settings={
          "provider": "r2r",
          "method": "character",
          "extra_fields": {
              "separator": "---"
          },
      }
  )
  ```

* fixup! Add support for CharacterTextSplitter
* fixup! fixup! Add support for CharacterTextSplitter

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>

* fix ollama cli

---------

Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>

* Ingestion refactor (#991)
* fix test (#993)
* Increase Neo4j memory limits, add GDS plugin, and update LLM concurrency limit to 256.
* Update ingestion sample file, disable KG node extraction pipe, add community processing in clustering, and enhance graph clustering queries.
* Update runners (#1007)
* Refactor KG clustering process to simplify community processing and enhance entity-triple retrieval from Neo4j.
* Refactor Neo4j configuration for memory settings and update graph clustering logic in the KG provider.
* Fix pipeline by enabling node extraction and refactor community processing logic in KGClusteringPipe.
* hatchet works
* throw error if you run global search before enrichment
* Fix communities in local search
* turn off node desc embedding
* fix rag endpoint
* Increase hatchet msg size
* Update ingestion.py
* Refactor and clean up code formatting
* modified workflow
* Add graph creation functionality
* Refactor KG parameters and logging.
* review
* up

---------

Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com>
Co-authored-by: emrgnt-cmplxty <owen@algofi.org>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
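For readers unfamiliar with character-based splitting, here is a minimal sketch of what a `"method": "character"` splitter with `"separator": "---"` (as in the #986 commit message above) might do conceptually. The function name `split_on_separator` is hypothetical and this is not R2R's actual implementation; it only illustrates the idea of cutting text at a fixed separator and discarding empty pieces.

```python
from typing import List


def split_on_separator(text: str, separator: str = "---") -> List[str]:
    """Split text into chunks at every occurrence of `separator`.

    Hypothetical helper for illustration only; R2R's real splitter may
    also enforce chunk sizes and overlaps, which this sketch ignores.
    """
    if not separator:
        raise ValueError("separator must be a non-empty string")
    # Cut at each separator, trim surrounding whitespace,
    # and drop pieces that end up empty.
    pieces = text.split(separator)
    return [piece.strip() for piece in pieces if piece.strip()]


if __name__ == "__main__":
    document = "intro\n---\nbody\n---\n\n---\noutro"
    for chunk in split_on_separator(document):
        print(repr(chunk))
```

Under these assumptions, a document that uses `---` between sections yields one chunk per section, which is why overriding the separator at ingestion time can be useful for pre-structured files.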

* add update files workflow
* rm ingestion pipeline
* v0 restructure orch
* kg orchestration
* finish kg orchestration
* update service
* merge
* cleanups
* add hatchet api key setup
* cleanup
* add hatchet api key setup (#1037)
* add hatchet api key setup
* cleanup
* fix merge
* cleanups

* dockerfile
* Update ingestion file with new sample URL and enhance unstructured chunking configuration and error handling.
* clean up
* clean up dockerfile
* up
* Update sample file and clean code
* Add hatchet-sdk dependency in project.
* Update providers to include local option.

emrgnt-cmplxty and others added 23 commits August 29, 2024 17:14

Feature/add update files workflow (#1010)

44a741e

* add update files workflow
* rm ingestion pipeline

Feature/add enrichment flow (#1013)

fa88875

* add update files workflow
* rm ingestion pipeline
* v0 restructure orch

Feature/merged enrichment flow (#1016)

79eac6f

* add update files workflow
* rm ingestion pipeline
* v0 restructure orch
* kg orchestration
* finish kg orchestration
* update service
* merge
* cleanups

Feature/nolan logs refactored (#1041)

5d3515e

* Update runners (#1007)
* Check in logs

---------

Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>

Pull open PRs into dev (#1042)

131c968

* Pull in subnet and graph PR
* Add in templates

Add python files for templates in cli (#1043)

d280ce6

Merge branch 'main' into dev

bb08b60

working hatchet integration (#1046)

b5fb31b

Update local_llm_neo4j_kg.toml

1fc3927

Introduce File Provider (#1044)

a46a638

* Draft of file provider
* Some cleanup
* Regenerate lock
* Stream it
* Use document_id as primary key
* Pydantic v2
* File provider finished

Make 7272 the default port (#1045)

3ff56d5

Fix poetry.lock

39a4326

Precommit

6fde6e6

Enhance Dockerfile and add telemetry events (#1049)

ef0b6fd

Fix File Provider (#1050)

0be2767

* Fix
* Fix parsing pipeline
* working

improve documentation

8512200

merge doc changes

5e1c9e6

fix unstr

e8575e5

add ingestion

7d9564e

emrgnt-cmplxty closed this Sep 5, 2024

emrgnt-cmplxty deleted the feature/improve-docs branch September 6, 2024 18:38
