[Internal] Update Jobs list function to support paginated responses #896

Merged on Feb 21, 2025 (3 commits)
1 change: 1 addition & 0 deletions NEXT_CHANGELOG.md
@@ -9,6 +9,7 @@
 ### Documentation

 ### Internal Changes
+* Update Jobs ListJobs API to support paginated responses ([#896](https://github.com/databricks/databricks-sdk-py/pull/896))
 * Introduce automated tagging ([#888](https://github.com/databricks/databricks-sdk-py/pull/888))
 * Update Jobs GetJob API to support paginated responses ([#869](https://github.com/databricks/databricks-sdk-py/pull/869)).

57 changes: 55 additions & 2 deletions databricks/sdk/mixins/jobs.py
@@ -1,11 +1,64 @@
-from typing import Optional
+from typing import Iterator, Optional

 from databricks.sdk.service import jobs
-from databricks.sdk.service.jobs import Job
+from databricks.sdk.service.jobs import BaseJob, Job


 class JobsExt(jobs.JobsAPI):

+    def list(self,
+             *,
+             expand_tasks: Optional[bool] = None,
+             limit: Optional[int] = None,
+             name: Optional[str] = None,
+             offset: Optional[int] = None,
+             page_token: Optional[str] = None) -> Iterator[BaseJob]:
+        """List jobs.
+
+        Retrieves a list of jobs. If the job has multiple pages of tasks, job_clusters, parameters or environments,
+        it will paginate through all pages and aggregate the results.
+
+        :param expand_tasks: bool (optional)
+          Whether to include task and cluster details in the response. Note that in API 2.2, only the first
+          100 elements will be shown. Use :method:jobs/get to paginate through all tasks and clusters.
+        :param limit: int (optional)
+          The number of jobs to return. This value must be greater than 0 and less than or equal to 100. The
+          default value is 20.
+        :param name: str (optional)
+          A filter on the list based on the exact (case insensitive) job name.
+        :param offset: int (optional)
+          The offset of the first job to return, relative to the most recently created job. Deprecated since
+          June 2023. Use `page_token` to iterate through the pages instead.
+        :param page_token: str (optional)
+          Use `next_page_token` or `prev_page_token` returned from the previous request to list the next or
+          previous page of jobs respectively.
+
+        :returns: Iterator over :class:`BaseJob`
+        """
+        # fetch jobs with limited elements in top level arrays
+        jobs_list = super().list(expand_tasks=expand_tasks,
+                                 limit=limit,
+                                 name=name,
+                                 offset=offset,
+                                 page_token=page_token)
+        if not expand_tasks:
+            yield from jobs_list
+
+        # fully fetch all top level arrays for each job in the list
+        for job in jobs_list:
+            if job.has_more:
+                job_from_get_call = self.get(job.job_id)
+                job.settings.tasks = job_from_get_call.settings.tasks
+                job.settings.job_clusters = job_from_get_call.settings.job_clusters
+                job.settings.parameters = job_from_get_call.settings.parameters
+                job.settings.environments = job_from_get_call.settings.environments
+            # Remove has_more fields for each job in the list.
+            # This field in Jobs API 2.2 is useful for pagination. It indicates if there are more than 100 tasks or job_clusters in the job.
+            # This function hides pagination details from the user, so the field does not play a useful role here.
+            if hasattr(job, 'has_more'):
+                delattr(job, 'has_more')
+            yield job
+
     def get_run(self,
                 run_id: int,
                 *,
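The aggregation pattern in the new `list` override can be sketched without a workspace. The snippet below is a minimal illustration, not the SDK's implementation: `FakeSettings`, `FakeJob`, `FakeJobsAPI`, `FakeJobsExt`, and `PAGE_SIZE` are hypothetical stand-ins invented here, only `tasks` is modelled, and the truncation limit is 2 instead of the 100 mentioned in the docstring. It also resets `has_more` to `None` rather than deleting the attribute, and uses an explicit `return` after the non-expanded path.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class FakeSettings:
    # Hypothetical stand-in for the job settings object; only `tasks` is modelled.
    tasks: List[str] = field(default_factory=list)


@dataclass
class FakeJob:
    # Hypothetical stand-in for BaseJob / Job.
    job_id: int
    settings: FakeSettings
    has_more: Optional[bool] = None


class FakeJobsAPI:
    """Hypothetical in-memory backend whose list() truncates each job's tasks."""

    PAGE_SIZE = 2  # assumption for the demo; the docstring above cites 100

    def __init__(self, all_tasks: Dict[int, List[str]]):
        self._all_tasks = all_tasks
        self.get_calls = 0  # counts how often the per-job fetch is needed

    def list(self, *, expand_tasks: Optional[bool] = None) -> Iterator[FakeJob]:
        for job_id, tasks in self._all_tasks.items():
            shown = tasks[:self.PAGE_SIZE] if expand_tasks else []
            truncated = bool(expand_tasks) and len(tasks) > self.PAGE_SIZE
            yield FakeJob(job_id, FakeSettings(list(shown)), has_more=truncated or None)

    def get(self, job_id: int) -> FakeJob:
        # Returns the job with every task, as the paginated get is described to do.
        self.get_calls += 1
        return FakeJob(job_id, FakeSettings(list(self._all_tasks[job_id])))


class FakeJobsExt(FakeJobsAPI):

    def list(self, *, expand_tasks: Optional[bool] = None) -> Iterator[FakeJob]:
        jobs_list = super().list(expand_tasks=expand_tasks)
        if not expand_tasks:
            yield from jobs_list
            return  # nothing to aggregate when details were not requested
        for job in jobs_list:
            if job.has_more:
                # Replace the truncated array with the fully fetched one.
                job.settings.tasks = self.get(job.job_id).settings.tasks
            job.has_more = None  # hide the pagination detail from callers
            yield job


api = FakeJobsExt({1: ["a", "b", "c"], 2: ["x"]})
expanded = list(api.list(expand_tasks=True))
summary = list(api.list())
```

Only job 1 exceeds the demo limit, so it is the only one that triggers a second fetch; job 2 is yielded as returned by the first call. Against a real workspace, the equivalent call would presumably be `jobs.list(expand_tasks=True)` on the SDK's jobs client, at the cost of one extra `get` per job that reports `has_more`.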