Configuration properties are used to fine-tune Spark Structured Streaming applications.
You can set them for a SparkSession when it is created, using the config method:
import org.apache.spark.sql.SparkSession
val spark = SparkSession
  .builder
  .config("spark.sql.streaming.metricsEnabled", true)
  .getOrCreate()
Tip: Read up on SparkSession in The Internals of Spark SQL book.
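
Properties can also be changed on a session that already exists. The following is a minimal sketch, assuming a local master and a hypothetical application name; it uses the public `spark.conf` runtime configuration API to set the property and read it back. A changed value should only be expected to apply to streaming queries started afterwards.

```scala
import org.apache.spark.sql.SparkSession

object RuntimeConfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")             // assumption: local run, for illustration only
      .appName("runtime-conf-sketch") // hypothetical application name
      .getOrCreate()

    // Set the property on the live session rather than at build time
    spark.conf.set("spark.sql.streaming.metricsEnabled", true)

    // Read the current value back; RuntimeConfig.get returns a String
    val current: String = spark.conf.get("spark.sql.streaming.metricsEnabled")
    println(s"spark.sql.streaming.metricsEnabled = $current")

    spark.stop()
  }
}
```

Setting the property in the builder, as in the first snippet, keeps the configuration in one place; `spark.conf.set` is mostly convenient in interactive sessions such as spark-shell.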
Name | Description |
---|---|
|
Default: Supported values:
Used when StatefulAggregationStrategy execution planning strategy is executed (and plans a streaming query with an aggregate, which simply boils down to creating a StateStoreRestoreExec with the proper implementation version of StreamingAggregationStateManager). Among the checkpointed properties that are not supposed to be overridden after a streaming query has been started (and could later recover from a checkpoint after being restarted). |
|
|
|
|
|
|
|
(internal) A comma-separated list of fully-qualified class names of data source providers for which MicroBatchReadSupport is disabled. Reads from these sources will fall back to the V1 Sources. Default: Use SQLConf.disabledV2StreamingMicroBatchReaders to get the current value. |
|
|
|
(internal) The maximum number of batches which will be retained in memory to avoid loading from files. Default: Maximum count of versions a State Store implementation should retain in memory. The value adjusts a trade-off between memory usage vs cache miss:
Used exclusively when |
|
Flag whether Dropwizard CodaHale metrics are reported for active streaming queries. Default: Use SQLConf.streamingMetricsEnabled to get the current value. |
|
Default: Use SQLConf.minBatchesToRetain to get the current value |
|
Policy for calculating the global watermark value when there are multiple watermark operators in a streaming query. Default: Supported values:
Cannot be changed between query restarts from the same checkpoint location. |
|
(internal) How long to wait between two progress events when there is no data (in millis) when Default: Use SQLConf.streamingNoDataProgressEventInterval to get the current value |
|
Number of progress updates to retain for a streaming query Default: |
|
(internal) Time delay (in ms) before Default: |
|
The initial delay and how often to execute StateStore’s maintenance task. Default: |
|
(internal) The fully-qualified class name of the StateStoreProvider implementation that manages state data in stateful streaming queries. This class must have a zero-arg constructor. Default: HDFSBackedStateStoreProvider. Use SQLConf.stateStoreProviderClass to get the current value. |
|
(internal) When enabled ( Default: |
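
Several descriptions above refer to SQLConf accessors for reading the current value. The following is a rough sketch of how they might be reached from a SparkSession, assuming a Spark version in which `spark.sessionState` is accessible (it is marked as an unstable developer API and may change between releases). The local master and the application name are hypothetical and only there to make the example self-contained.

```scala
import org.apache.spark.sql.SparkSession

object SQLConfAccessorsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")               // assumption: local run, for illustration only
      .appName("sqlconf-accessors-sketch") // hypothetical application name
      .config("spark.sql.streaming.metricsEnabled", true)
      .getOrCreate()

    // Assumption: sessionState is accessible in your Spark version (unstable API)
    val sqlConf = spark.sessionState.conf

    // Accessors named in the table above; each returns a typed value
    val metricsEnabled: Boolean = sqlConf.streamingMetricsEnabled
    val minBatches: Int = sqlConf.minBatchesToRetain
    val providerClass: String = sqlConf.stateStoreProviderClass

    println(s"streamingMetricsEnabled = $metricsEnabled")
    println(s"minBatchesToRetain      = $minBatches")
    println(s"stateStoreProviderClass = $providerClass")

    spark.stop()
  }
}
```

The typed accessors avoid parsing strings by hand. Application code that wants to stay on the stable public API can use `spark.conf.get` with the property name instead.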