Remove ChromaDB/NetworkX/BM25 remnants, update docs for Graphiti #12
Merged
github-actions[bot] merged 8 commits into main on Feb 15, 2026
added 8 commits
February 12, 2026 21:34
Update NEO4J_URI from old Cloud Run Neo4j service (bolt+s://neo4j-4aosg235qq-uc.a.run.app:443) to GCE VM (bolt://10.0.0.27:7687) for all production resources:
- confluence-sync, index-rebuild, sync-pipeline jobs
- slack-bot service
Switch NEO4J_PASSWORD from Secret Manager reference to random_password.neo4j_prod_password.result to match the actual password on the GCE VM.
Production Graphiti indexing was extremely slow (0.27 chunks/min vs staging's 4.5) due to missing LLM_PROVIDER, GOOGLE_GENAI_USE_VERTEXAI, and GRAPHITI_BULK_ENABLED env vars that staging already had. Without proper Gemini config, Graphiti produced malformed JSON responses triggering retries and exponential backoff.
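A startup check along these lines could surface the missing configuration before indexing begins. This is a minimal sketch: the three variable names come from the commit message above, but `missing_graphiti_env` is a hypothetical helper that is not part of this PR, and because the commit does not state the expected values, it only checks that each variable is set.

```python
import os
from typing import Mapping

# Variable names taken from the commit message; expected values are not stated there.
REQUIRED_GRAPHITI_ENV = (
    "LLM_PROVIDER",
    "GOOGLE_GENAI_USE_VERTEXAI",
    "GRAPHITI_BULK_ENABLED",
)


def missing_graphiti_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    return [name for name in REQUIRED_GRAPHITI_ENV if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_graphiti_env()
    if missing:
        raise SystemExit(f"Missing Graphiti env vars: {', '.join(missing)}")
```

Failing fast here would turn a slow, retry-heavy indexing run into an immediate and readable error.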
The circuit breaker was resetting consecutive_failures to 0 before the skip check could trigger (consecutive_failures >= MAX_RETRIES), causing chunks that always fail to retry forever. Added separate chunk_attempts counter that tracks retries per chunk independently of the circuit breaker reset.
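The interaction between the two counters can be sketched as follows. This is an illustration under assumptions, not the code from this commit: `consecutive_failures`, `chunk_attempts`, and `MAX_RETRIES` are named in the commit message, while `BREAKER_THRESHOLD`, `index_chunks`, and the numeric values are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Iterable

MAX_RETRIES = 3        # name from the commit message; the value is an assumption
BREAKER_THRESHOLD = 2  # hypothetical: consecutive failures before the breaker trips


def index_chunks(chunk_ids: Iterable[str], index_one: Callable[[str], None]) -> list[str]:
    """Index each chunk, skipping any chunk that fails MAX_RETRIES times.

    Returns the ids of the skipped chunks.
    """
    consecutive_failures = 0
    chunk_attempts: dict[str, int] = defaultdict(int)
    skipped: list[str] = []

    for chunk_id in chunk_ids:
        while True:
            try:
                index_one(chunk_id)
            except Exception:
                chunk_attempts[chunk_id] += 1
                consecutive_failures += 1
                if consecutive_failures >= BREAKER_THRESHOLD:
                    # Breaker trips: a real pipeline would back off here, then reset.
                    consecutive_failures = 0
                # The skip check reads the per-chunk counter, which the
                # breaker reset above never touches.
                if chunk_attempts[chunk_id] >= MAX_RETRIES:
                    skipped.append(chunk_id)
                    break
            else:
                consecutive_failures = 0
                break
    return skipped
```

With these values, a skip check on `consecutive_failures >= MAX_RETRIES` could never fire, because the breaker resets the counter at 2. Counting attempts per chunk bounds the retries regardless of when the breaker resets.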
Neo4j 5.x stores auth in the system database, not flat files. The previous auth reset (rm auth.ini/auth) was ineffective. Now deletes databases/system and transactions/system directories to force Neo4j to recreate auth from NEO4J_AUTH env var.
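The reset described above amounts to removing two directories under the Neo4j data directory. A minimal sketch, assuming Neo4j is stopped first and that the caller knows where the data directory lives (the commit does not say): `reset_neo4j_system_db` is a hypothetical helper, and only the two relative paths come from the commit message.

```python
import shutil
from pathlib import Path

# Relative paths from the commit message; the data directory itself is an assumption.
SYSTEM_DIRS = (Path("databases") / "system", Path("transactions") / "system")


def reset_neo4j_system_db(data_dir: Path) -> list[Path]:
    """Delete the system database directories under data_dir.

    Returns the directories that were actually removed. Assumes Neo4j is
    not running while this executes.
    """
    removed: list[Path] = []
    for rel in SYSTEM_DIRS:
        target = data_dir / rel
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

Only the `system` directories are touched, so user databases under `databases/` stay in place. Anything else kept in the system database is presumably lost along with the credentials, so this fits a setup where auth is fully defined by NEO4J_AUTH.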
All 6 intake-related Cloud Scheduler jobs were firing daily/weekly without being intentionally enabled, causing duplicate pipeline runs. Intake jobs should be run manually until a sync strategy is defined. Removed: confluence-sync-daily, parse-daily, metadata-generation-daily, index-rebuild-weekly, quality-scoring-daily, sync-pipeline-daily. Kept: scheduler service account + IAM (used by backup.tf schedulers).
…metadata-generation, confluence-sync
These standalone jobs are remnants from pre-pipeline architecture. The consolidated pipeline job (sync-pipeline) handles download+parse+index in a single process. The standalone jobs can't work in Cloud Run anyway because they need shared SQLite state between steps. Quality-scoring and metadata-generation are dead features not used by Graphiti search, and were burning Vertex AI Claude credits for nothing. Only sync-pipeline remains for manual intake runs.
Update README, ARCHITECTURE.md, ADRs, and GRAPH_DATABASE_PLAN to reflect the completed migration from ChromaDB/NetworkX/BM25 to Neo4j + Graphiti. Mark ADR-0002 and ADR-0005 as superseded, add ADR-0009 for Neo4j + Graphiti.
…aphiti
- Remove chromadb and rank-bm25 dependencies from pyproject.toml
- Remove 9 deprecated config settings (CHROMA_*, BM25_*, GRAPH_DUAL_WRITE)
- Delete graph_builder.py (NetworkX) and graph_retriever.py (NetworkX)
- Remove EntityExtractor class (replaced by Graphiti), GovernanceMetadata, Entity, and Relationship models (replaced by Neo4j)
- Remove deprecated lifecycle stubs and rebuild-bm25 CLI command
- Rename VectorIndexer -> GraphitiIndexer in all callers and test mocks
- Rename index_to_chromadb -> index_to_graphiti in downloader
- Update all ChromaDB references in comments/docstrings to Graphiti
- Clean up test files: remove tests for deleted code, fix imports
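A cleanup of this kind can be checked with a quick scan for leftover identifiers. The sketch below is an assumption about how one might verify the result, not something included in this PR: `find_remnants` is a hypothetical helper, and the pattern only lists names mentioned in the commit message (plus `rank_bm25`, the import name of the rank-bm25 package).

```python
import re
from pathlib import Path

# Identifiers this commit removes or renames; matched case-insensitively so
# that "ChromaDB" in comments and docstrings is caught as well.
REMNANT_PATTERN = re.compile(
    r"chromadb|rank[-_]bm25|rebuild-bm25|VectorIndexer|index_to_chromadb"
    r"|EntityExtractor|graph_builder|graph_retriever",
    re.IGNORECASE,
)


def find_remnants(root: Path, suffixes: tuple[str, ...] = (".py", ".toml")) -> list[tuple[Path, int, str]]:
    """Return (path, line number, line) for each line that mentions a removed name."""
    hits: list[tuple[Path, int, str]] = []
    for path in sorted(root.rglob("*")):
        if not (path.is_file() and path.suffix in suffixes):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if REMNANT_PATTERN.search(line):
                hits.append((path, lineno, line.strip()))
    return hits
```

An empty result from `find_remnants(Path("."))` would suggest the renames reached every caller, though it cannot prove that behavior is unchanged.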
Summary
Changes
Documentation (commit 1):
Code cleanup (commit 2):
- Remove chromadb and rank-bm25 from pyproject.toml
- Delete graph_builder.py (NetworkX) and graph_retriever.py (NetworkX)
- Remove EntityExtractor, GovernanceMetadata, Entity, Relationship classes
- Remove rebuild-bm25 CLI command
- Rename VectorIndexer -> GraphitiIndexer in all callers and test mocks
- Rename index_to_chromadb -> index_to_graphiti in downloader
Test plan