-
Notifications
You must be signed in to change notification settings - Fork 9
docs: Add architecture doc in developer guide. #163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughA new architecture documentation file for the Spider system was added, detailing its components, design, and fault tolerance. The developer documentation index was updated to include a link to this new architecture page, along with a description and navigation entry. No code or API changes were made. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Storage
participant Scheduler
participant Worker
Client->>Storage: Submit job / Query results
Scheduler->>Storage: Heartbeat / Fetch state
Worker->>Scheduler: Request task
Scheduler->>Worker: Assign task
Worker->>Storage: Fetch inputs
Worker->>Worker: Execute task
Worker->>Storage: Store outputs
Worker->>Storage: Heartbeat
Scheduler->>Storage: Heartbeat
Note over Scheduler, Storage: On failure, Scheduler restarts and fetches state
Note over Scheduler, Worker: On worker failure, Scheduler detects and recovers tasks
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
docs/src/dev-docs/index.md (1)
9-15: Minor wording & ordering tweaks will polish the new navigation card
- The description line already contains the word “Architecture”, so the heading + description pair currently reads “Architecture / Architecture overview of Spider.” – consider re-phrasing the description to avoid the immediate word repetition.
- In the hidden toctree, inserting the new
architectureentry abovetestingkeeps the list alphabetically ordered (current style throughout the docs).-Architecture overview of Spider. +Overview of Spider’s architecture.These are purely cosmetic, but they improve readability and maintain consistency.
Also applies to: 27-28
docs/src/dev-docs/architecture.md (3)
13-15: Fix typo “Spdier” → “Spider” & tighten sentence-`Spdier` relies on a fault-tolerant and ACID storage, e.g. MariaDB, to store all the states of the -system. +`Spider` relies on fault-tolerant, ACID-compliant storage (e.g., MariaDB) to persist all system state.
45-48: Grammar: add article & preposition for smoother reading-A worker executes tasks allocated by the scheduler. It runs the following steps in loop: +A worker executes tasks allocated by the scheduler and runs the following steps in a loop:
75-79: Correct wording in failure-recovery bullet list-- Identify all the failed tasks inside a job -- Compute the minimum subgraph that contains the fail tasks where all inputs to the subgraph are +- Identify all failed tasks within the job +- Compute the minimal sub-graph that contains the failed tasks where all inputs to the sub-graph are
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
docs/src/dev-docs/arch.pngis excluded by!**/*.png
📒 Files selected for processing (2)
docs/src/dev-docs/architecture.md(1 hunks)docs/src/dev-docs/index.md(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: davidlion
PR: y-scope/spider#100
File: src/spider/worker/worker.cpp:205-230
Timestamp: 2025-04-09T17:15:24.552Z
Learning: Documentation should be added to new functions in the spider codebase, as already discussed with the user.
docs/src/dev-docs/architecture.md (1)
Learnt from: davidlion
PR: y-scope/spider#100
File: src/spider/worker/worker.cpp:205-230
Timestamp: 2025-04-09T17:15:24.552Z
Learning: Documentation should be added to new functions in the spider codebase, as already discussed with the user.
🪛 LanguageTool
docs/src/dev-docs/index.md
[duplication] ~10-~10: Possible typo: you repeated a word.
Context: ... :gutter: 2 :::{grid-item-card} 🔗 architecture Architecture ^^^ Architecture overview of Spider. ::...
(ENGLISH_WORD_REPEAT_RULE)
docs/src/dev-docs/architecture.md
[duplication] ~36-~36: Possible typo: you repeated a word.
Context: ...Worker ID - Heartbeat timestamps ### Scheduler Scheduler is responsible for: - Allocating tasks ...
(ENGLISH_WORD_REPEAT_RULE)
[duplication] ~53-~53: Possible typo: you repeated a word.
Context: ... only executes one task at a time. ### Client Client communicates only with the storage to s...
(ENGLISH_WORD_REPEAT_RULE)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
docs/src/dev-docs/architecture.md (3)
13-15: Fix typo in component name (“Spdier” → “Spider”)The very first mention of the storage component spells the project name incorrectly, which looks unpolished in published docs.
-`Spdier` relies on a fault-tolerant … +`Spider` relies on a fault-tolerant …
46-48: Tighten wording (“in loop” → “in a loop”)Minor grammar tweak for readability.
-A worker executes tasks allocated by the scheduler. It runs the following steps in loop: +A worker executes tasks allocated by the scheduler. It runs the following steps in a loop:
75-77: Correct typo in recovery steps (“fail tasks” → “failed tasks”)-- Compute the minimum subgraph that contains the fail tasks where all inputs to the subgraph are +- Compute the minimum subgraph that contains the failed tasks where all inputs to the subgraph are
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/src/dev-docs/architecture.md(1 hunks)docs/src/dev-docs/index.md(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: davidlion
PR: y-scope/spider#100
File: src/spider/worker/worker.cpp:205-230
Timestamp: 2025-04-09T17:15:24.552Z
Learning: Documentation should be added to new functions in the spider codebase, as already discussed with the user.
docs/src/dev-docs/architecture.md (1)
Learnt from: davidlion
PR: y-scope/spider#100
File: src/spider/worker/worker.cpp:205-230
Timestamp: 2025-04-09T17:15:24.552Z
Learning: Documentation should be added to new functions in the spider codebase, as already discussed with the user.
🪛 LanguageTool
docs/src/dev-docs/architecture.md
[duplication] ~36-~36: Possible typo: you repeated a word.
Context: ...Worker ID - Heartbeat timestamps ### Scheduler Scheduler is responsible for: - Allocating tasks ...
(ENGLISH_WORD_REPEAT_RULE)
[duplication] ~53-~53: Possible typo: you repeated a word.
Context: ... only executes one task at a time. ### Client Client communicates only with the storage to s...
(ENGLISH_WORD_REPEAT_RULE)
docs/src/dev-docs/index.md
[duplication] ~10-~10: Possible typo: you repeated a word.
Context: ... :gutter: 2 :::{grid-item-card} 🔗 architecture Architecture ^^^ Overview of Spider's architecture. ...
(ENGLISH_WORD_REPEAT_RULE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: lint
- GitHub Check: non-storage-unit-tests (ubuntu-24.04)
- GitHub Check: non-storage-unit-tests (ubuntu-22.04)
🔇 Additional comments (1)
docs/src/dev-docs/index.md (1)
9-14: LGTM – new card integrates cleanlyThe added Architecture card and toctree entry render fine and follow existing conventions.
LinZhihao-723
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we move forward: can we use the config from the latest yscope-dev-utils to lint this markdown file manually?
The command:
npx markdownlint-cli2 --config yscope-dev-utils/exports/lint-configs/.markdownlint-cli2.yaml --fix docs/src/dev-docs/architecture.md
Not sure if upgrading dev utils will cause any other issues to the existing tasks. We can use a local version for now.
|
@CodeRabbit Please open an issue to upgrade |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain@sitaowang1998 I'll create a GitHub issue for upgrading 🏁 Script executed: Length of output: 826 Perfect! I've successfully created GitHub issue #164 for upgrading Issue created: #164 The issue includes:
The issue is now ready for you to work on when convenient! 🧠 Learnings used |
Description
As title.
Checklist
breaking change.
Validation performed
task docs:sitebuilds successfully and local website fromtask docs:servelooks as expected.Summary by CodeRabbit