JobProgressListener is the SparkListener for the web UI.
As a SparkListener, it intercepts Spark events and collects information about jobs, stages, and tasks that the web UI uses to present the status of a Spark application.
JobProgressListener is interested in the following events:
Caution: FIXME What information does JobProgressListener track?
poolToActiveStages = HashMap[PoolName, HashMap[StageId, StageInfo]]()
poolToActiveStages …
Caution: FIXME
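The shape of the poolToActiveStages declaration above can be sketched as a map of pools to per-pool stage maps. This is a minimal sketch, assuming PoolName is an alias for String and StageId an alias for Int. StageInfoSketch is a hypothetical stand-in for Spark's StageInfo so the snippet runs without Spark on the classpath, and registerActive is an illustrative helper, not part of JobProgressListener.

```scala
import scala.collection.mutable.HashMap

object PoolToActiveStagesSketch {
  // Assumed aliases; the real definitions live in Spark's UI code.
  type PoolName = String
  type StageId = Int

  // Hypothetical stand-in for org.apache.spark.scheduler.StageInfo.
  final case class StageInfoSketch(stageId: StageId, name: String)

  // Same nesting as in the text: pool name -> (stage id -> stage info).
  val poolToActiveStages = HashMap[PoolName, HashMap[StageId, StageInfoSketch]]()

  // Illustrative helper: file an active stage under its scheduler pool.
  def registerActive(pool: PoolName, stage: StageInfoSketch): Unit = {
    val stages = poolToActiveStages.getOrElseUpdate(pool, HashMap[StageId, StageInfoSketch]())
    stages(stage.stageId) = stage
  }
}
```

Nesting by pool first makes it cheap to list the active stages of a single pool.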
onJobStart(jobStart: SparkListenerJobStart): Unit
When called, onJobStart reads the optional Spark Job group id (using SparkListenerJobStart.properties and the SparkContext.SPARK_JOB_GROUP_ID key).
It then creates a JobUIData (as jobData) based on the input jobStart. Its status attribute is JobExecutionStatus.RUNNING.
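The first two steps can be sketched as follows. This is a simplified model, not Spark's code: JobStartSketch, JobUIDataSketch, and Running are hypothetical stand-ins for SparkListenerJobStart, JobUIData, and JobExecutionStatus.RUNNING, and the key string is assumed to match the value of SparkContext.SPARK_JOB_GROUP_ID.

```scala
import java.util.Properties

object JobGroupSketch {
  // Assumed to equal SparkContext.SPARK_JOB_GROUP_ID.
  val SparkJobGroupId = "spark.jobGroup.id"

  // Hypothetical, trimmed-down stand-ins for Spark's types.
  sealed trait Status
  case object Running extends Status
  final case class JobStartSketch(jobId: Int, properties: Properties)
  final case class JobUIDataSketch(jobId: Int, jobGroup: Option[String], status: Status)

  // properties may be null and the key may be absent, hence the two Option wrappers.
  def jobGroupOf(jobStart: JobStartSketch): Option[String] =
    for {
      props <- Option(jobStart.properties)
      group <- Option(props.getProperty(SparkJobGroupId))
    } yield group

  // A freshly started job is always recorded as running.
  def newJobData(jobStart: JobStartSketch): JobUIDataSketch =
    JobUIDataSketch(jobStart.jobId, jobGroupOf(jobStart), Running)
}
```

Modelling the group id as an Option keeps jobs without a group from needing a sentinel value.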
The internal jobGroupToJobIds is updated with the job group and job ids.
The internal pendingStages is updated with the StageInfo for the stage id (for every StageInfo in the SparkListenerJobStart.stageInfos collection).
The numTasks attribute in jobData (the JobUIData instance created above) is set to the sum of tasks in every stage (from jobStart.stageInfos) for which the completionTime attribute is not set.
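The numTasks rule amounts to a filter followed by a sum. In this minimal sketch, StageInfoSketch is a hypothetical stand-in that carries only the fields the calculation needs, and pendingTaskCount is an illustrative name.

```scala
object NumTasksSketch {
  // Hypothetical stand-in for StageInfo with only the fields the sum needs.
  final case class StageInfoSketch(stageId: Int, numTasks: Int, completionTime: Option[Long])

  // Sum the task counts of the stages that have no completion time yet.
  def pendingTaskCount(stageInfos: Seq[StageInfoSketch]): Int =
    stageInfos.filter(_.completionTime.isEmpty).map(_.numTasks).sum
}
```

For example, with stages of 4, 8, and 2 tasks where only the 8-task stage has a completion time, the result is 6.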
The internal jobIdToData and activeJobs are updated with jobData for the current job.
The internal stageIdToActiveJobIds is updated with the stage id and job id (for every stage in the input jobStart).
The internal stageIdToInfo is updated with the stage id and StageInfo (for every StageInfo in jobStart.stageInfos).
A StageUIData is added to the internal stageIdToData for every StageInfo (in jobStart.stageInfos).
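The three per-stage updates can be sketched as one loop over the stages of the job. This is a simplified model under stated assumptions: StageInfoSketch and StageUIDataSketch are hypothetical stand-ins, the (stage id, attempt id) key for stageIdToData is an assumption of the sketch, and so is the choice to keep an existing entry rather than overwrite it.

```scala
import scala.collection.mutable

object StageRegistriesSketch {
  // Hypothetical stand-ins for StageInfo and StageUIData.
  final case class StageInfoSketch(stageId: Int, attemptId: Int)
  final class StageUIDataSketch

  val stageIdToActiveJobIds = mutable.HashMap[Int, mutable.HashSet[Int]]()
  val stageIdToInfo = mutable.HashMap[Int, StageInfoSketch]()
  // Keyed by (stage id, attempt id); that key shape is an assumption of this sketch.
  val stageIdToData = mutable.HashMap[(Int, Int), StageUIDataSketch]()

  def register(jobId: Int, stageInfos: Seq[StageInfoSketch]): Unit =
    for (info <- stageInfos) {
      // A stage can be shared by several jobs, so job ids accumulate in a set.
      stageIdToActiveJobIds.getOrElseUpdate(info.stageId, mutable.HashSet[Int]()) += jobId
      // Existing entries are kept (an assumption; the text only says "updated").
      stageIdToInfo.getOrElseUpdate(info.stageId, info)
      stageIdToData.getOrElseUpdate((info.stageId, info.attemptId), new StageUIDataSketch)
    }
}
```

Using a set of job ids per stage reflects that the same stage may appear in the stageInfos of more than one job.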
Note: onJobStart is a part of the SparkListener contract to handle…FIXME
The schedulingMode attribute is used to show the scheduling mode for the Spark application in the Spark UI.
Note: It corresponds to the spark.scheduler.mode setting.
When SparkListenerEnvironmentUpdate is received, JobProgressListener looks up the spark.scheduler.mode key in the Spark Properties map to set the internal schedulingMode field.
Note: It is used in the Jobs and Stages tabs.
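The spark.scheduler.mode lookup described above can be sketched without Spark as follows. The environmentDetails parameter is assumed to mimic the event payload (section name mapped to key/value pairs), and Mode is a hypothetical stand-in for Spark's scheduling-mode enumeration.

```scala
object SchedulingModeSketch {
  // Hypothetical stand-in for Spark's scheduling-mode enumeration.
  object Mode extends Enumeration { val FAIR, FIFO, NONE = Value }

  // Assumed payload shape: section name -> (key, value) pairs.
  def schedulingModeOf(
      environmentDetails: Map[String, Seq[(String, String)]]): Option[Mode.Value] =
    environmentDetails
      .getOrElse("Spark Properties", Seq.empty)
      .toMap
      .get("spark.scheduler.mode")
      .map(Mode.withName) // throws NoSuchElementException for an unknown mode name
}
```

Returning an Option leaves the field unset when the property is missing, so the UI can fall back to showing an unknown mode.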