docs: add structured documentation (architecture, developer & user guides) and improve onboarding#145
Mario5T wants to merge 5 commits into m-lab:main
@zarsl @bassosimone @sermpezis plz review this PR.
You just turned around our entire documentation structure without prior coordination or discussion. I think this is a bit too much for a first contribution. If you want to get this considered for merging, I recommend we engage in a conversation in which you explain to me the pain points in the original documentation. This would help me understand the motivation and rationale of your proposed changes and help me vet them and decide what to do.

Thank you for thinking about doing this, by the way. The documentation could always be improved and it is surely the entry point to the project. Precisely because of this, and because what the correct documentation looks like is a subjective call, I need to chat more with you to reach some interpersonal alignment. (This job would have been easier if we had discussed more beforehand and coordinated a bit, especially if I had some understanding of the pain points you observed, but I am also open to discussing after the fact! 😅 🙏)
Thank you for the thoughtful and generous response, and fair point. I should have opened a discussion issue before sending a large PR. Let me explain what I observed and you can tell me how much, if any, of this aligns with gaps you've felt too.

Pain points I noticed in the existing documentation:

README.md: No conceptual explanation of the IQB score
The README links to PDFs and blog posts for conceptual background. There was no in-repo explanation, even a brief one, of what the score means, what use cases it covers, or how to interpret it. A policymaker or ISP engineer reading the README would still not know what a score of 0.57 means without leaving the repo.

Developer onboarding is spread across multiple READMEs
Useful information about coding conventions, where caching applies, how to safely extend the config, and how to add a dashboard page is implicitly scattered across CONTRIBUTING.md, library/README.md, and prototype/README.md.

No architecture document
The data flow (BigQuery → pipeline → Parquet → IQBCache → IQBCalculator → dashboard) is described briefly in the top-level README, but the boundaries between components (what the library owns vs. what the prototype is allowed to do) aren't written down. This matters when deciding where to put new code.

What I'm not claiming:
I'm not claiming the existing docs are broken or wrong. library/README.md data/README.md

What I'd suggest as a path forward:
Rather than merging all of this, I'm open to any of:
Picking one file that addresses the clearest gap (e.g., only docs/architecture.md README.md
After listing our project for GSoC, we received a large number of pull requests across several repositories. We are working through the backlog, but this will take time. We will get back to this pull request eventually. In the meantime, if you are a GSoC applicant, please read our updated GSoC policy: https://github.com/m-lab/gsoc/.
Summary
This PR introduces structured documentation to improve onboarding, system clarity, and long-term maintainability of the IQB repository.
The changes align with the project's documented "Documentation" expected outcome and aim to make the system easier to understand and extend for contributors, researchers, policymakers, and ISPs.
No functional changes are introduced.
Changes Included
1. README Improvements
2. Architecture Documentation (docs/architecture.md)
BigQuery → Pipeline → Cache → IQB Scoring → Dashboard
3. Developer Guide (docs/developer_guide.md)
4. User Guide (docs/user_guide.md)
Motivation
The IQB repository contains multiple components (library, pipeline, dashboard, analysis notebooks), but onboarding required navigating several separate README files.
This PR:
Scope
Testing
Documentation links were verified locally.
Markdown rendering was validated for GitHub compatibility.
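
The description does not say how the links were verified. One way to reproduce such a check, as a minimal sketch only: the script below scans every Markdown file under a directory and reports relative link targets that do not exist on disk. The function name `broken_relative_links` and the regular expression are illustrative assumptions, not tooling from this repository, and the pattern handles only simple inline links of the form `[text](target)`.

```python
import re
from pathlib import Path

# Matches simple inline Markdown links such as [text](target).
# Assumption: reference-style links and targets containing spaces are ignored.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(root: Path) -> list[tuple[Path, str]]:
    """Return (file, target) pairs whose relative link target does not exist."""
    broken = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # External URLs, mail links, and in-page anchors are not checked.
            if target.startswith(("http://", "https://", "mailto:", "#")):
                continue
            # Drop any "#section" fragment before resolving the path.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                broken.append((md_file.relative_to(root), target))
    return broken


if __name__ == "__main__":
    # Run from the repository root; prints one line per broken link.
    for md_file, target in broken_relative_links(Path(".")):
        print(f"{md_file}: broken link -> {target}")
```

Resolving each target against the linking file's own directory mirrors how GitHub resolves relative links when it renders Markdown, so a file that passes this check should also have working relative links on the rendered page.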