Feature/improve docs #1053

Closed
wants to merge 23 commits into from

Commits on Aug 30, 2024

  1. Feature/orchestration v0 (#1006)

    * Feature/remove extra r2r abstraction (#996)
    
    * moving kg construction to enrich-graph (#984)
    
    * checkin
    
    * up
    
    * done
    
    * formatting
    
    * Feature/update ingestion issues (#985)
    
    * update ingestion issues
    
    * keep unbounded limit support, but default to bounded
    
    * fix
    
    * fmt
    
    * removes an unnecessary abstraction
    
    * sync changes
    
    ---------
    
    Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
    
    * first commit
    
    * move towards orchestration
    
    * tweaks
    
    * check in working ingestion
    
    * move
    
    * kg enrichment
    
    * update future, postgres compose
    
    * hatchetize ingestion pipeline
    
    * ready for prime time
    
    * finish
    
    ---------
    
    Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
    emrgnt-cmplxty and shreyaspimpalgaonkar authored Aug 30, 2024
    62cd6f6
  2. Feature/add update files workflow (#1010)

    * add update files workflow
    
    * rm ingestion pipeline
    emrgnt-cmplxty authored Aug 30, 2024
    44a741e
  3. Feature/add enrichment flow (#1013)

    * add update files workflow
    
    * rm ingestion pipeline
    
    * v0 restructure orch
    emrgnt-cmplxty authored Aug 30, 2024
    fa88875
  4. Feature/merged enrichment flow (#1016)

    * add update files workflow
    
    * rm ingestion pipeline
    
    * v0 restructure orch
    
    * kg orchestration
    
    * finish kg orchestration
    
    * update service
    
    * merge
    
    * cleanups
    emrgnt-cmplxty authored Aug 30, 2024
    79eac6f

Commits on Sep 4, 2024

  1. Rm graspologic (#1034)

    * moving kg construction to enrich-graph (#984)
    
    * checkin
    
    * up
    
    * done
    
    * formatting
    
    * Feature/update ingestion issues (#985)
    
    * update ingestion issues
    
    * keep unbounded limit support, but default to bounded
    
    * fix
    
    * fmt
    
    * Add support for CharacterTextSplitter (#986)
    
    * Add support for CharacterTextSplitter
    
    Allows R2R client to override the text splitter. Example:
    
    ```python
    ingestion_response = client.ingest_files(
        file_paths=[file_path],
        metadatas=metadata,
        # optionally override chunking settings at runtime
        chunking_settings={
            "provider": "r2r",
            "method": "character",
            "extra_fields": {
                "separator": "---"
            },
        },
    )
    ```
    
    * fixup! Add support for CharacterTextSplitter
    
    * fixup! fixup! Add support for CharacterTextSplitter
    
    * Patch/ollama base cli (#992)
    
    * Dev (#990)
    
    * moving kg construction to enrich-graph (#984)
    
    * checkin
    
    * up
    
    * done
    
    * formatting
    
    * Feature/update ingestion issues (#985)
    
    * update ingestion issues
    
    * keep unbounded limit support, but default to bounded
    
    * fix
    
    * fmt
    
    * Add support for CharacterTextSplitter (#986)
    
    * Add support for CharacterTextSplitter
    
    Allows R2R client to override the text splitter. Example:
    
    ```python
    ingestion_response = client.ingest_files(
        file_paths=[file_path],
        metadatas=metadata,
        # optionally override chunking settings at runtime
        chunking_settings={
            "provider": "r2r",
            "method": "character",
            "extra_fields": {
                "separator": "---"
            },
        },
    )
    ```
    
    * fixup! Add support for CharacterTextSplitter
    
    * fixup! fixup! Add support for CharacterTextSplitter
    
    ---------
    
    Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
    Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
    
    * fix ollama cli
    
    ---------
    
    Co-authored-by: Shreyas Pimpalgaonkar <shreyas.gp.7@gmail.com>
    Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
    
    * Ingestion refactor (#991)
    
    * fix test (#993)
    
    * Increase Neo4j memory limits, add GDS plugin, and update LLM concurrency limit to 256.
    
    * Update ingestion sample file, disable KG node extraction pipe, add community processing in clustering, and enhance graph clustering queries.
    
    * Update runners (#1007)
    
    * Refactor KG clustering process to simplify community processing and enhance entity-triple retrieval from Neo4j.
    
    * Refactor Neo4j configuration for memory settings and update graph clustering logic in the KG provider.
    
    * Fix pipeline by enabling node extraction and refactor community processing logic in KGClusteringPipe.
    
    * hatchet works
    
    * throw error if you run global search before enrichment
    
    * Fix communities in local search
    
    * turn off node desc embedding
    
    * fix rag endpoint
    
    * Increase hatchet msg size
    
    * Update ingestion.py
    
    * Refactor and clean up code formatting
    
    * modified workflow
    
    * Add graph creation functionality
    
    * Refactor KG parameters and logging.
    
    * review
    
    * up
    
    ---------
    
    Co-authored-by: emrgnt-cmplxty <68796651+emrgnt-cmplxty@users.noreply.github.com>
    Co-authored-by: emrgnt-cmplxty <owen@algofi.org>
    Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
    Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
    5 people authored Sep 4, 2024
    cbe0f19
  2. Feature/add hatchet api key setup rebased (#1040)

    * add update files workflow
    
    * rm ingestion pipeline
    
    * v0 restructure orch
    
    * kg orchestration
    
    * finish kg orchestration
    
    * update service
    
    * merge
    
    * cleanups
    
    * add hatchet api key setup
    
    * cleanup
    
    * add hatchet api key setup (#1037)
    
    * add hatchet api key setup
    
    * cleanup
    
    * fix merge
    
    * cleanups
    emrgnt-cmplxty authored Sep 4, 2024
    291e2d3
  3. Feature/nolan logs refactored (#1041)

    * Update runners (#1007)
    
    * Check in logs
    
    ---------
    
    Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>
    emrgnt-cmplxty and NolanTrem authored Sep 4, 2024
    5d3515e
  4. Pull open PRs into dev (#1042)

    * Pull in subnet and graph PR
    
    * Add in templates
    NolanTrem authored Sep 4, 2024
    131c968
  5. d280ce6
  6. bb08b60
  7. b5fb31b
  8. 1fc3927
  9. Unstructured fixes (#1048)

    * dockerfile
    
    * Update ingestion file with new sample URL and enhance unstructured chunking configuration and error handling.
    
    * clean up
    
    * clean up dockerfile
    
    * up
    
    * Update sample file and clean code
    
    * Add hatchet-sdk dependency in project.
    
    * Update providers to include local option.
    shreyaspimpalgaonkar authored Sep 4, 2024
    ffab16a
  10. Introduce File Provider (#1044)

    * Draft of file provider
    
    * Some cleanup
    
    * Regenerate lock
    
    * Stream it
    
    * Use document_id as primary key
    
    * Pydantic v2
    
    * File provider finished
    NolanTrem authored Sep 4, 2024
    a46a638
  11. 3ff56d5
  12. Fix poetry.lock

    NolanTrem committed Sep 4, 2024
    39a4326
  13. Precommit

    NolanTrem committed Sep 4, 2024
    6fde6e6
  14. ef0b6fd

Commits on Sep 5, 2024

  1. Fix File Provider (#1050)

    * Fix
    
    * Fix parsing pipeline
    
    * working
    NolanTrem authored Sep 5, 2024
    0be2767
  2. improve documentation

    emrgnt-cmplxty committed Sep 5, 2024
    8512200
  3. merge doc changes

    emrgnt-cmplxty committed Sep 5, 2024
    5e1c9e6
  4. fix unstr

    emrgnt-cmplxty committed Sep 5, 2024
    e8575e5
  5. add ingestion

    emrgnt-cmplxty committed Sep 5, 2024
    7d9564e