docs: add structured documentation (architecture, developer & user guides) and improve onboarding #145

Open: Mario5T wants to merge 5 commits into m-lab:main from Mario5T:docs-improvement

Mario5T commented Feb 19, 2026

Summary

This PR introduces structured documentation to improve onboarding, system clarity, and long-term maintainability of the IQB repository.

The changes align with the project's documented "Documentation" expected outcome and aim to make the system easier to understand and extend for contributors, researchers, policymakers, and ISPs.

No functional changes are introduced.


Changes Included

1. README Improvements

  • Clearer explanation of IQB and its 0–1 scoring scale
  • Explicit clarification of how IQB differs from traditional speed tests
  • Improved Quick Start instructions with explicit environment requirements
  • Added library usage example
  • Added CLI usage examples
  • Centralized documentation links
  • Added clarification note for staging dashboard

2. Architecture Documentation (docs/architecture.md)

  • High-level system overview
  • Clear explanation of repository structure
  • End-to-end data flow description:
    BigQuery → Pipeline → Cache → IQB Scoring → Dashboard
  • Separation of scoring logic and visualization layer
  • Extensibility and scalability considerations

3. Developer Guide (docs/developer_guide.md)

  • Local setup instructions
  • How to run the prototype dashboard
  • How to add metrics and dashboard pages
  • Guidance for extending scoring logic safely
  • Testing and linting expectations
  • Contribution workflow best practices

4. User Guide (docs/user_guide.md)

  • Conceptual explanation of IQB
  • Interpretation of IQB score and percentiles
  • Explanation of latency, throughput, and packet loss
  • Guidance on comparing ISPs responsibly
  • Sample size considerations
  • Limitations of Internet performance measurement

Motivation

The IQB repository contains multiple components (library, pipeline, dashboard, analysis notebooks), but onboarding required navigating several separate README files.

This PR:

  • Improves discoverability
  • Clarifies system architecture
  • Supports future dashboard extensions
  • Reduces onboarding friction for new contributors
  • Strengthens documentation for GSoC-level feature expansion

Scope

  • Documentation only
  • No changes to scoring logic, pipeline, or prototype functionality
  • No dependency modifications

Testing

Documentation links were verified locally.
Markdown rendering was validated for GitHub compatibility.

Mario5T (Author) commented Feb 20, 2026

@zarsl @bassosimone @sermpezis plz review this PR.

@bassosimone bassosimone self-requested a review February 20, 2026 10:32
bassosimone (Collaborator) commented

You just turned around our entire documentation structure without prior coordination or discussion. I think this is a bit too much for a first contribution. If you want this to be considered for merging, I recommend we have a conversation in which you explain the pain points you found in the original documentation. That would help me understand the motivation and rationale for your proposed changes, vet them, and decide what to do.

Thank you for thinking about doing this, by the way. The documentation can always be improved, and it is surely the entry point to the project. Precisely because of this, and because what the correct documentation looks like is a subjective call, I need to chat more with you to reach some interpersonal alignment. (This job would have been easier if we had discussed and coordinated beforehand, especially if I had some understanding of the pain points we observed, but I am also open to discussing after the fact! 😅 🙏)

Mario5T (Author) commented Feb 20, 2026

Thank you for the thoughtful and generous response — and fair point. I should have opened a discussion issue before sending a large PR. Let me explain what I observed and you can tell me how much, if any, of this aligns with gaps you've felt too.

Pain points I noticed in the existing documentation:

README.md covered "what" but not "who for". The existing README described the repo structure well, but it didn't make clear at a glance which audience should read which part. A researcher arriving via a paper citation and a developer wanting to contribute have very different needs. The current entry point didn't route them.

No conceptual explanation of the IQB score. The README links to PDFs and blog posts for conceptual background. There was no in-repo explanation, even a brief one, of what the score means, what use cases it covers, or how to interpret it. A policymaker or ISP engineer reading the README would still not know what a score of 0.57 means without leaving the repo.

Developer onboarding is spread across multiple READMEs. Useful information about coding conventions, where caching applies, how to safely extend the config, and how to add a dashboard page is implicitly scattered across CONTRIBUTING.md, library/README.md, and prototype/README.md. There was no single place a new contributor could go to understand the full contribution surface.

No architecture document. The data flow (BigQuery → pipeline → Parquet → IQBCache → IQBCalculator → dashboard) is described briefly in the top-level README, but the boundaries between components (what the library owns vs. what the prototype is allowed to do) aren't written down. This matters when deciding where to put new code.
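
To illustrate the kind of boundary meant in the last point, here is a minimal sketch. Every name in it (`Measurements`, `load_from_cache`, `toy_score`, `render`) and every threshold is hypothetical and chosen only for illustration. None of it is the IQB library's actual API, and `toy_score` is a placeholder, not the IQB formula. The only point is that scoring decisions stay on the library side while the dashboard side just formats a number it is given.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Measurements:
    """Aggregated metrics as they might come out of a cache (hypothetical shape)."""

    download_mbps: float
    latency_ms: float
    packet_loss_pct: float


def load_from_cache(row: Mapping[str, float]) -> Measurements:
    """Library side: turn one cached pipeline row into a typed record."""
    return Measurements(
        download_mbps=row["download_mbps"],
        latency_ms=row["latency_ms"],
        packet_loss_pct=row["packet_loss_pct"],
    )


def toy_score(m: Measurements) -> float:
    """Library side: placeholder scoring, NOT the real IQB formula.

    Returns the fraction of illustrative thresholds that are met,
    so the result always lies in [0, 1].
    """
    checks = [
        m.download_mbps >= 25.0,   # illustrative threshold (assumption)
        m.latency_ms <= 100.0,     # illustrative threshold (assumption)
        m.packet_loss_pct <= 1.0,  # illustrative threshold (assumption)
    ]
    return sum(checks) / len(checks)


def render(score: float) -> str:
    """Dashboard side: formatting only, no scoring decisions."""
    return f"score: {score:.2f}"


if __name__ == "__main__":
    cached_row = {"download_mbps": 40.0, "latency_ms": 120.0, "packet_loss_pct": 0.5}
    # Two of the three illustrative thresholds are met.
    print(render(toy_score(load_from_cache(cached_row))))  # score: 0.67
```

Under this split, a new metric or threshold would be a change to the library-side functions only, and the dashboard-side `render` would not need to know about it.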

What I'm not claiming:

I'm not claiming the existing docs are broken or wrong. library/README.md is genuinely good. The data/README.md CLI reference is precise and complete. The docs/internals/ and docs/design/ directories show real care. The gaps I'm pointing to are mostly about cross-cutting concerns and audience routing, not within-component accuracy.

What I'd suggest as a path forward:

Rather than merging all of this, I'm open to any of:

  • Picking one file that addresses the clearest gap (e.g., only docs/architecture.md, or only improving README.md)
  • Closing this PR and opening a discussion issue first, then iterating
  • Going through the files together and cutting anything you feel is redundant with existing docs

Happy to work in whatever direction makes sense to you.

@Mario5T Mario5T requested a review from sermpezis as a code owner February 20, 2026 17:45
bassosimone (Collaborator) commented

After listing our project for GSoC, we received a large number of pull requests across several repositories. We are working through the backlog, but this will take time. We will get back to this pull request eventually. In the meantime, if you are a GSoC applicant, please read our updated GSoC policy: https://github.com/m-lab/gsoc/.
