
CONTRIBUTING.md


Welcome! Thank you for contributing to Apache Pekko HTTP!

We follow the standard GitHub fork & pull approach to pull requests. Just fork the official repo, develop in a branch, and submit a PR!

You're always welcome to submit your PR straight away and start the discussion (without reading the rest of this wonderful doc, or the README.md). The goal of these notes is to make your experience contributing to Pekko HTTP as smooth and pleasant as possible. We're happy to guide you through the process once you've submitted your PR.

Introduction

The README has details about the Pekko Community and building from source.

Navigating around the project & codebase

Branches summary

Depending on which version (or sometimes module) you want to work on, you should target a specific branch as explained below:

  • main – active development branch of pekko-http 1.0.0

Tags

Pekko HTTP uses tags to categorise issues into groups or mark their phase in development.

Most notably, many tags start with the t: prefix (as in topic:), which categorises issues by the module they relate to. Examples are:

In general, all issues are open for anyone to work on. However, if you're new to the project and looking for an issue that is a good starting point and whose fix is likely to be accepted, check out the following tags:

  • help wanted - identifies issues that the core team will likely not have time to work on, or that make a nice entry-level ticket. If you're not sure how to solve a ticket but would like to work on it, feel free to ask in the issue for clarification or tips.
  • nice-to-have (low-priority) - tasks which make sense but are not very high priority (in the face of other very high priority issues). If you see something interesting in this list, a contribution would be really wonderful!

Another group of tags are those which start with a number. They're used to signal which phase of development an issue is in:

  • 0 - new - is assigned when a ticket is unclear about its purpose or about whether it is valid. Sometimes the additional tag discuss is used to mark such tickets if they propose large-scale changes and need more discussion before moving into triaged (or being closed as invalid).
  • 1 - triaged - roughly speaking means "this ticket makes sense". Triaged tickets are safe to pick up for contributing in terms of likeliness of a patch for it being accepted. It is not recommended to start working on a ticket that is not triaged.
  • 2 - pick next - used to mark issues which are next up in the queue to be worked on. Sometimes it's also used to mark which PRs are expected to be reviewed/merged for the next release. The tag is non-binding, and mostly used as organisational helper.
  • 3 - in progress - means someone is working on this ticket. If a ticket has this tag but seems inactive, the tag may simply not have been removed; feel free to ask on the ticket whether it is still being worked on.

The last group of special tags indicate specific states a ticket is in:

  • bug - bugs take priority over features when it comes to fixing. The core team dedicates a number of days to working on bugs each sprint. Bugs which have reproducers are also great for community contributions as they're well isolated. Sometimes we're not lucky enough to have a reproducer; in that case a bugfix should also include a test reproducing the original error along with the fix.
  • failed - indicates a Jenkins failure (for example from a nightly build). These tickets usually start with the FAILED: ... message and include a stacktrace and a link to the Jenkins failure. The tickets are collected and worked on with priority to keep the build stable and healthy. Often it is a simple timeout issue (Jenkins boxes are slow), though sometimes real bugs are discovered this way.

Pull Request validation states:

  • validating => [tested | needs-attention] - signify pull request validation status

Pekko HTTP contributing guidelines

These guidelines apply to all Apache Pekko projects, by which we mean both the apache/pekko repository, as well as any plugins or additional repos located under the Pekko GitHub organisation, e.g. apache/pekko-http and others.

These guidelines are meant to be a living document that should be changed and adapted as needed. We encourage changes that make it easier to achieve our goals in an efficient way.

Please also note that we have a Code of Conduct in place which aims to keep our community a nice and helpful one. You can read its full text here: ASF Code of Conduct.

General Workflow

The below steps are how to get a patch into a main development branch. The steps are exactly the same for everyone involved in the project (be it core team, or first time contributor).

  1. To avoid duplicated effort, it might be good to check the issue tracker and existing pull requests for existing work.
    • If there is no ticket yet, feel free to create one to discuss the problem and the approach you want to take to solve it.
  2. Fork the project on GitHub. You'll need to create a feature branch for your work on your fork, as this way you'll be able to submit a Pull Request against the mainline Pekko HTTP.
  3. Create a branch on your fork and work on the feature. For example: git checkout -b wip-custom-headers-pekko-http
    • Please make sure to follow the general quality guidelines (specified below) when developing your patch.
    • Please write additional tests covering your feature and adjust existing ones if needed before submitting your Pull Request. The validatePullRequest sbt task (explained below) may come in handy to verify your changes are correct.
  4. Once your feature is complete, prepare the commit following our Creating Commits And Writing Commit Messages guidelines. For example, a good commit message would be: Adding compression support for Manifests #22222 (note the reference to the ticket it aims to resolve).
  5. Now it's finally time to submit the Pull Request!
  6. For large PRs, we may ask you to submit an Apache Software Foundation CLA.
  7. Now both committers and interested people will review your code. This process is to ensure the code we merge is of the best possible quality, and that no silly mistakes slip through. You're expected to follow up on these comments by adding new commits to the same branch. The commit messages of those commits can be looser, for example: Removed debugging using printline, as they will all be squashed into one commit before merging into the main branch.
    • The community and team are really nice people, so don't be afraid to ask follow up questions if you didn't understand some comment, or would like to clarify how to continue with a given feature. We're here to help, so feel free to ask and discuss any kind of questions you might have during review!
  8. After the review you should fix the issues as needed (pushing a new commit for new review etc.), iterating until the reviewers give their thumbs up, which is usually signaled by a comment saying LGTM ("Looks Good To Me").
    • In general a PR is expected to get 2 LGTMs from the team before it is merged. If the PR is trivial, or under special circumstances (such as most of the team being on vacation, a PR was very thoroughly reviewed/tested and surely is correct) one LGTM may be fine as well.
  9. If the code change needs to be applied to other branches as well (for example a bugfix needing to be backported to a previous version), one of the team will either ask you to submit a PR with the same commit to the old branch, or do this for you.
    • Backport pull requests such as these are marked using the phrase for validation in the title to make the purpose clear in the pull request list. They can be merged once validation passes without additional review (if no conflicts).
  10. Once everything is said and done, your Pull Request gets merged 🎉 Your feature will be available with the next “earliest” release milestone (i.e. if back-ported so that it will be in release x.y.z, find the relevant milestone for that release). And of course you will be given credit for the fix in the release stats during the release's announcement. You've made it!

The TL;DR version of the above very precise workflow is:

  1. Fork pekko-http
  2. Hack and test on your feature (on a branch)
  3. Submit a PR
  4. Sign the ASF CLA if necessary
  5. Keep polishing it until it has received enough LGTMs
  6. Profit!
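
As a rough local sketch of steps 2 to 4 of the detailed workflow, the commands below create a feature branch and a single commit whose message references a ticket. The throwaway repository, the file name, and the author identity are assumptions made only so that the snippet runs on its own; in practice you would run the equivalent commands inside a clone of your fork.

```shell
# Throwaway repository standing in for a local clone of your fork (assumption).
REPO="$(mktemp -d)"
git init -q "$REPO"

# Create the feature branch (branch name taken from the example above).
git -C "$REPO" checkout -q -b wip-custom-headers-pekko-http

# Stand-in for your real change (hypothetical file).
echo "example change" > "$REPO/example.txt"
git -C "$REPO" add example.txt

# One commit, imperative summary line, ticket number included.
# The identity is passed inline so no global git configuration is needed.
git -C "$REPO" \
  -c user.name="Example Contributor" \
  -c user.email="contributor@example.com" \
  -c commit.gpgsign=false \
  commit -q -m "Add compression support for Manifests #22222"

# Show the summary line of the commit that was just created.
git -C "$REPO" log -1 --format=%s

# In a real fork the next steps would be:
#   git push origin wip-custom-headers-pekko-http
# followed by opening the Pull Request on GitHub.
```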

Note that the pekko-http sbt project is not as large as the Pekko one, so sbt should be able to run with less heap than with the Pekko project. In case you need to increase the heap, you can specify it with the command line argument sbt -mem 2048, or in the SBT_OPTS environment variable as a regular JVM memory flag, for example SBT_OPTS=-Xmx2G. On some platforms you can also edit the global defaults for sbt in /usr/local/etc/sbtopts.
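
The two ways of raising the heap mentioned above might look like this in a POSIX shell. The 2G value is only an illustration, and the sbt invocations are left as comments so the snippet can be pasted without starting a build.

```shell
# Option 1: pass the heap size (in MB) on the command line.
#   sbt -mem 2048

# Option 2: set a regular JVM memory flag through SBT_OPTS,
# which the sbt launcher script reads from the environment.
export SBT_OPTS="-Xmx2G"
echo "SBT_OPTS is now: $SBT_OPTS"
#   sbt test
```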

The validatePullRequest task

The Pekko HTTP build includes a special task called validatePullRequest, which inspects the changes made, including dirty (uncommitted) changes in your local working directory, figures out which projects are impacted by those changes, and then runs tests only on those projects.

For example, changing something in pekko-http-core would cause tests to be run in all projects which depend on it (e.g. pekko-http-tests, pekko-http-marshallers-*, docs etc.).

To use the task, simply type validatePullRequest in the sbt shell. The output should include entries like those shown below:

> validatePullRequest
[info] Diffing [HEAD] to determine changed modules in PR...
[info] Detected uncommitted changes in directories (including in dependency analysis): [pekko-protobuf,project]
[info] Detected changes in directories: [docs, project, pekko-http-tests, pekko-protobuf, pekko-http-testkit, pekko-http, pekko-http-core, pekko-stream]

By default, changes are diffed with the main branch when working locally. If you want to validate against a different target PR branch, you can do so by setting the PR_TARGET_BRANCH environment variable for sbt:

PR_TARGET_BRANCH=origin/example sbt validatePullRequest

Developing against Pekko main

The Pekko HTTP codebase is currently built with snapshot versions of the Pekko core libraries.

Since Pekko HTTP is released separately to Pekko "core", yet some features require changes in Pekko itself, it is sometimes very useful to be able to develop Pekko HTTP with Pekko's sources used directly instead of the binary dependency. You can check out the Pekko repository and run sbt with -Dpekko.sources=$HOME/pekko to develop Pekko HTTP with Pekko as a source dependency instead of a binary one.

This allows simple and fast iteration on changes that need to be introduced in Pekko in order to develop a feature in Pekko HTTP.

Binary compatibility

Binary compatibility rules and guarantees are described in depth in the Binary Compatibility Rules section of the documentation.

Pekko HTTP uses MiMa (which is short for Lightbend Migration Manager) to validate binary compatibility of incoming Pull Requests. If your PR fails due to binary compatibility issues, you may see an error like this:

[info] pekko-stream: found 1 potential binary incompatibilities while checking against org.apache.pekko:pekko-stream_2.11:2.4.2  (filtered 222)
[error]  * method foldAsync(java.lang.Object,scala.Function2)org.apache.pekko.stream.scaladsl.FlowOps in trait org.apache.pekko.stream.scaladsl.FlowOps is present only in current version
[error]    filter with: ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.pekko.stream.scaladsl.FlowOps.foldAsync")

In such situations, it's good to consult with a core team member about whether the violation can be safely ignored (by adding the above snippet to the project's src/main/mima-filters), or whether it would indeed break binary compatibility.

Situations where it may be fine to ignore a MiMa-issued warning include:

  • if it is touching any class marked as private[pekko], /** INTERNAL API*/ or similar markers
  • if it is concerning internal classes (often recognizable by package names like dungeon, impl, internal etc.)
  • if it is adding API to classes / traits which are only meant for extension by Pekko itself, i.e. should not be extended by end-users
  • if it is touching any class marked with @InternalApi, @DoNotInherit, or @ApiMayChange. See API stability annotations and comments
  • other tricky situations

If it turns out that the change can be safely ignored, please add the filter to a file with the issue number in the filename, placed in the submodule's src/main/mima-filters/<last-released-version>.backwards.excludes directory.

You can run mimaReportBinaryIssues on the sbt console to check if you introduced a binary incompatibility or whether an incompatibility has been successfully ignored after adding it to the filter file.
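
As an illustration of the file layout described above, the following sketch writes a filter file containing the exclusion from the example error message. The module directory, the last released version, the issue number, and the file name are hypothetical placeholders; substitute the values that apply to your change.

```shell
# Hypothetical values, for illustration only.
MODULE_DIR="$(mktemp -d)/http-core"   # stands in for the affected submodule
LAST_RELEASED_VERSION="1.0.0"         # assumed last released version
ISSUE_NUMBER="1234"                   # assumed issue number

FILTER_DIR="$MODULE_DIR/src/main/mima-filters/${LAST_RELEASED_VERSION}.backwards.excludes"
FILTER_FILE="$FILTER_DIR/issue-${ISSUE_NUMBER}.excludes"

mkdir -p "$FILTER_DIR"

# The filter line is copied from the "filter with:" hint in the MiMa output.
cat > "$FILTER_FILE" <<'EOF'
# Describe here why the incompatibility is safe to ignore
ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.pekko.stream.scaladsl.FlowOps.foldAsync")
EOF

echo "Wrote $FILTER_FILE"

# Afterwards, re-run the check from the sbt console:
#   > mimaReportBinaryIssues
```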

Pull Request Requirements

For a Pull Request to be considered at all, it has to meet these requirements:

  1. Regardless of whether the code introduces new features or fixes bugs or regressions, it must have comprehensive tests.

  2. The code must be well documented in Lightbend's standard documentation format (see the ‘Documentation’ section below).

  3. The commit messages must properly describe the changes, see further below.

  4. All Apache projects must include Apache copyright notices. Each project can choose between one of two approaches:

    1. All source files in the project must have an Apache copyright notice in the file header.
    2. The Notices file for the project includes the Apache copyright notice and no other files contain copyright notices. See https://www.apache.org/legal/src-headers.html for instructions for managing this approach for copyrights.

    Pekko HTTP uses the first choice, having copyright notices in every file header.

Additional guidelines

Some additional guidelines regarding source code are:

  • keep the code DRY
  • apply the Boy Scout Rule whenever you have the chance to
  • Never delete or change existing copyright notices, just add additional info.
  • Do not use @author tags since it does not encourage Collective Code Ownership.
    • Contributors: each project should make sure that contributors get the credit they deserve, in a text file or page on the project website and in the release notes etc.

If these requirements are not met then the code should not be merged into main, or even reviewed - regardless of how good or important it is. No exceptions.

Whether or not a pull request (or parts of it) shall be back- or forward-ported will be discussed on the pull request discussion page; it shall therefore not be part of the commit messages. If desired, the intent can be expressed in the pull request description.

Documentation

All documentation must abide by the following maxims:

  • Example code should be run as part of an automated test suite.
  • Version should be programmatically specifiable to the build.
  • Generation should be completely automated and available for scripting.
  • When renaming Markdown files, add a rewrite rule to the .htaccess file to not break external links.

All documentation should preferably be written in Lightbend's standard documentation format, Paradox. The language used by Paradox is a superset of Markdown which supports most GitHub Flavored Markdown extensions as well as additional directives to facilitate writing documentation for software projects. Refer to its documentation to learn about the more advanced features it provides (including code etc).

To generate documentation you can:

> project docs
> paradox

The rendered documentation will be available under docs/target/paradox/site/main/index.html.

JavaDoc

Pekko HTTP generates JavaDoc-style API documentation using the genjavadoc sbt plugin, since the sources are written mostly in Scala.

Generating JavaDoc is not enabled by default, since it's not needed in day-to-day development and is expected to just work. If you'd like to check that your links and formatting look good in JavaDoc (and not only in ScalaDoc), you can generate it by running:

sbt -Dpekko.genjavadoc.enabled=true javaunidoc:doc

This will generate JavaDoc-style docs in ./target/javaunidoc/index.html

External Dependencies

All the external runtime dependencies for the project, including transitive dependencies, must have an open source license that is equal to, or compatible with, Apache 2.

This must be ensured by manually verifying the license for all the dependencies for the project:

  1. Whenever a committer to the project changes a version of a dependency (including Scala) in the build file.
  2. Whenever a committer to the project adds a new dependency.
  3. Whenever a new release is cut (public or private for a customer).

Which licenses are compatible with Apache 2 is defined in this doc, where you can see that the licenses listed under Category A are automatically compatible with Apache 2, while the ones listed under Category B need additional action:

Each license in this category requires some degree of reciprocity. This may mean that additional action is warranted in order to minimize the chance that a user of an Apache product will create a derivative work of a differently-licensed portion of an Apache product without being aware of the applicable requirements.

Each project must also create and maintain a list of all dependencies and their licenses, including all their transitive dependencies. This can be done either in the documentation or in the build file next to each dependency.

Creating Commits And Writing Commit Messages

Follow these guidelines when creating public commits and writing commit messages.

  1. If your work spans multiple local commits (for example, if you make save-point commits while working in a feature branch, or work in a branch for a long time doing merges/rebases etc.), then please do not commit it all as-is but rewrite the history by squashing the commits into a single big commit with a good commit message (as discussed in the following sections). For more info read this article: Git Workflow. Every commit should be usable in isolation, cherry-picked etc.

  2. The first line should be a descriptive sentence stating what the commit does, including the ticket number. It should be possible to fully understand what the commit does (but not necessarily how it does it) by reading just this single line. We follow the “imperative present tense” style for commit messages (more info here).

    It is not ok to only list the ticket number, type "minor fix" or similar. If the commit is a small fix, then you are done. If not, go to 3.

  3. Following the single line description should be a blank line followed by an enumerated list with the details of the commit.

  4. You can request review by a specific team member for your commit (depending on the degree of automation we reach, the list may change over time):

    • Review by @gituser - if you want to notify someone on the team. Others can, and are encouraged to, participate.

Example:

core: fix bug #1234

* Details 1
* Details 2
* Details 3
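
The mechanical parts of these rules can be checked locally. The following is a minimal sketch, not part of the Pekko HTTP build: the function name check_commit_message is hypothetical, and it only verifies that the summary line references a ticket as #<number> and that any body is separated from the summary by a blank line. Whether the summary is actually descriptive still needs a human reader.

```shell
# check_commit_message FILE   (hypothetical helper, not part of the build)
# Returns 0 when the message in FILE has
#   line 1: a summary that references a ticket as #<number>
#   line 2: blank, if the message has a second line at all
check_commit_message() {
  msg_file="$1"
  first_line="$(sed -n '1p' "$msg_file")"
  second_line="$(sed -n '2p' "$msg_file")"

  # The summary must mention a ticket number such as #1234.
  printf '%s\n' "$first_line" | grep -Eq '#[0-9]+' || {
    echo "summary line has no ticket reference" >&2
    return 1
  }

  # A body, if present, must be separated from the summary by a blank line.
  if [ -n "$second_line" ]; then
    echo "missing blank line after summary" >&2
    return 1
  fi
  return 0
}

# Usage: check the example message shown above.
msg="$(mktemp)"
printf 'core: fix bug #1234\n\n* Details 1\n* Details 2\n* Details 3\n' > "$msg"
check_commit_message "$msg" && echo "commit message looks fine"
```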

Applying code style to the project

The project uses scalafmt to enforce a consistent code style, which is automatically checked on every PR. If you would like to check for potential code style problems locally, you can run sbt checkCodeStyle, and if you want to apply the code style, you can run sbt applyCodeStyle.

Ignoring formatting commits in git blame

Throughout the history of the codebase, various formatting commits have been applied as the scalafmt style has evolved over time. If desired, you can set up git blame to ignore these commits. The hashes of these specific commits are stored in this file, so to configure git blame to ignore them you can execute the following:

git config blame.ignoreRevsFile .git-blame-ignore-revs

Pull request validation workflow details

Pekko HTTP uses GitHub Actions to validate PRs. The workflow builds the PR, runs the tests, and comments on the Pull Request in GitHub.

In order to speed up PR validation times, the Pekko HTTP build contains a special sbt task called validatePullRequest, which is smart enough to figure out which projects should be built if a PR only has changes in some parts of the project. For example, if your PR only touches pekko-http-testkit, no pekko-parsing tests need to be run; however, the task will validate all projects that depend on pekko-http-testkit (including samples). Also, tests tagged as PerformanceTest and the like are excluded from PR validation.

In order to force the validatePullRequest task to build the entire project, regardless of the dependency analysis of a PR's changes, one can use the special PLS BUILD ALL command (typed in a comment on GitHub, on the Pull Request), which will cause the validator to test all projects.

Source style

Scala style

Pekko HTTP uses scalafmt to enforce some of the code style rules.

Java style

Java code is currently not automatically reformatted by sbt (we expect to have a plugin to do this soon). Thus we ask Java contributions to follow these simple guidelines:

Preferred ways to use timeouts in tests

Avoid short test timeouts, since the Jenkins server may GC heavily, causing spurious test failures. A GC pause or other hiccup of 2 seconds is common in our CI environment. Please note that giving a larger timeout usually does not slow down the tests, since a call such as expectMessage usually completes quickly.

There are a number of ways timeouts can be defined in Pekko tests. The following ways to use timeouts are recommended (in order of preference):

  • remaining is first choice (requires within block)
  • remainingOrDefault is second choice
  • 3.seconds is third choice if not using testkit
  • lower timeouts must come with a very good reason (e.g. awaiting a Future that is known to be already completed)

Special care should be given to expectNoMessage calls, which will indeed wait for the entire timeout before continuing. A shorter timeout should therefore be used in those, for example 200.millis or 300.millis.

You can read up on remaining and friends in TestKit.scala

Supporting infrastructure

Continuous Integration

pekko-http currently uses GitHub Actions for continuous integration and release automation.

Snapshots

We do not yet publish snapshots.

Related links

Development tools

HTTP/2

A great tool to inspect HTTP/2 frames of real-world connections is Wireshark.

HTTPS

When analyzing an HTTPS connection, you need a way to see the plaintext payloads. You can enable logging those on the Pekko HTTP side with pekko.http.server.log-unencrypted-network-bytes = 100 or pekko.http.server.http2.log-frames = true (same for client-side).
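
As a sketch of how the two server-side settings named above might be placed in a configuration file, the snippet below writes them to an application.conf. The temporary location is an assumption made so the snippet runs anywhere; in a real project the file would typically live on the classpath, for example under src/main/resources.

```shell
# Hypothetical location, for illustration only.
CONF_FILE="$(mktemp -d)/application.conf"

cat > "$CONF_FILE" <<'EOF'
# Server side: log unencrypted network traffic, with a byte limit of 100
pekko.http.server.log-unencrypted-network-bytes = 100
# Server side: log HTTP/2 frames
pekko.http.server.http2.log-frames = true
EOF

cat "$CONF_FILE"
```

Both settings expose plaintext payloads in the logs, so they are best kept to local debugging sessions.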

To see the traffic in Wireshark, some clients can be configured to dump an SSLKEYLOGFILE that Wireshark can use to decrypt the traffic.
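
For instance, curl builds with key-logging support honour the SSLKEYLOGFILE environment variable, as do Firefox and Chrome. A minimal sketch, in which the key log path and the commented-out request to a local endpoint are hypothetical:

```shell
# Hypothetical location for the TLS key log.
export SSLKEYLOGFILE="$(mktemp)"
echo "TLS session keys will be appended to $SSLKEYLOGFILE"

# Any client started from this shell that honours SSLKEYLOGFILE now logs
# its session keys there, e.g. (hypothetical local Pekko HTTP server):
#   curl --http2 -k https://localhost:8443/
#
# In Wireshark, point Preferences > Protocols > TLS >
# "(Pre)-Master-Secret log filename" at the same file to decrypt the capture.
```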

mitmproxy is a nice tool to inspect HTTPS traffic, and supports dumping the SSLKEYLOGFILE. However, since it 'understands' HTTP, it might not be 'transparent' enough: especially when diagnosing protocol errors, adding this proxy might interfere with reproducing the problem.

sslsplit should be able to dump an intercepted stream to a pcap file directly, but currently its HTTPS proxy mode does not bridge the ALPN negotiation, so connections will downgrade to HTTP/1.1. This might be fixed when they add HTTP/2 support.

alpnpass can be used to intercept the plaintext traffic in Wireshark: you can set its InterceptorPort to the same value as the ReturnPort and then sniff the loopback interface and filter on that port.

golang

When testing against a Go application, running it with the environment variable GODEBUG set to http2debug=2 enables additional logging on the Go side.