Skip to content

Conversation

@asnare
Copy link

@asnare asnare commented Nov 17, 2025

Changes

This PR fixes the databricks labs list command so that all projects from Databricks Labs are listed: prior to this PR only the first 30 projects are listed.

The root cause of the problem is that the GitHub API being used is paginated, and defaults to only 30 results per page, and pagination was not implemented. This PR resolves the issue by:

  • updating the implementation to use the pagination mechanism; and
  • increasing the number of results per page from 30 (default) to 100 (max allowed).

Incidental changes include:

  • Some trace-level logging of the requests that are being made to the GitHub API, consistent with elsewhere in the codebase.
  • Some spelling fixes.

Out of scope for now:

  • Filtering the list of projects to only cover those that can be installed.

Why

The databricks labs list command was returning an incomplete list; projects that could be installed (eg. ucx) were not included in the list.

Tests

  • Additional test case to cover pagination of the GitHub API.
  • Manual testing.

This is currently incomplete because there are more projects now than the default pagination size, and pagination was not taking place. Here we implement support for pagination.
This reduces the number of API calls that we make.
@asnare asnare requested review from alexott and nfx as code owners November 17, 2025 15:23
@github-actions
Copy link

An authorized user can trigger integration tests manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 3935
  • Commit SHA: a4a43b6af53d6a8696ae8b6bd04905dff9507d22

Checks will be approved automatically on success.

Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the max result size is 100, then the change could also just set that instead of adding the pagination. It probably works but won't be exercised until we actually have more than 100 repositories under labs.

return url
}
}
return ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a unit test for this. It feels brittle.

@pietern pietern added the labs issues related to `labs` command group (assign to @nfx) label Nov 20, 2025
@asnare
Copy link
Author

asnare commented Nov 20, 2025

If the max result size is 100, then the change could also just set that instead of adding the pagination. It probably works but won't be exercised until we actually have more than 100 repositories under labs.

That's fair, and I'm on the fence on this: when the code was originally implemented we were at (I believe) about a third of what we have now. So not supporting it feels like waiting for it to break again. The options I see are:

  • Drop pagination support, accepting it will probably break at some point. Much less complexity, kicks the can down the road, but maybe long enough that breaking point is never reached.
  • Leave the page-size at its default, resulting in more API calls. Unauthenticated calls like this are currently rate-limited to 60/hour/IP, and we're sometimes seeing folk hit this, so there's a reason for avoiding this option. However it ensures the code is exercised. And given the 24-hour cache maybe this is preferable.
  • Support it, even though it's not a path currently taken, but ensure a test covers it. This is the current state. (I also tested it manually with page-size set to 10.)

@pietern: With it laid out like this, what's your opinion?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

labs issues related to `labs` command group (assign to @nfx)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants