[spark] Disable compaction for data evolution table #6342

juntaozhang · 2025-09-26T12:18:53Z

Purpose

In Spark, a data evolution table can return inconsistent query results before and after compaction. Example:

CREATE TABLE s (id INT, b INT);
INSERT INTO s VALUES (1, 11), (2, 22);

CREATE TABLE t (id INT, b INT, c INT) TBLPROPERTIES ('row-tracking.enabled' = 'true', 'data-evolution.enabled' = 'true');
INSERT INTO t VALUES (2, 2, 2), (3, 3, 3);
MERGE INTO t
USING s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.b = s.b
WHEN NOT MATCHED THEN INSERT (id, b, c) VALUES (id, b, 0);
SELECT *, _ROW_ID, _SEQUENCE_NUMBER FROM t ORDER BY _ROW_ID ASC;
CALL sys.compact(table => 't');
SELECT *, _ROW_ID, _SEQUENCE_NUMBER FROM t ORDER BY _ROW_ID ASC;

before compaction:

+----+----+---+---------+------------------+
| id |  b | c | _ROW_ID | _SEQUENCE_NUMBER |
+----+----+---+---------+------------------+
|  2 | 22 | 2 |       0 |                2 |
|  3 |  3 | 3 |       1 |                2 |
|  1 | 11 | 0 |       2 |                2 |
+----+----+---+---------+------------------+

after compaction:

+--------+----+--------+---------+------------------+
|     id |  b |      c | _ROW_ID | _SEQUENCE_NUMBER |
+--------+----+--------+---------+------------------+
| <NULL> | 22 | <NULL> |       0 |                2 |
|      2 |  2 |      2 |       0 |                1 |
| <NULL> |  3 | <NULL> |       1 |                2 |
|      3 |  3 |      3 |       1 |                1 |
|      1 | 11 |      0 |       2 |                2 |
+--------+----+--------+---------+------------------+

This PR disables compaction for data evolution tables in Spark, to align with the Flink behavior (#6112).
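One way to read the two result sets above: after compaction, _ROW_ID values 0 and 1 each appear twice, once as the original row (_SEQUENCE_NUMBER 1) and once as a partial row carrying only the updated column b (_SEQUENCE_NUMBER 2). The Python sketch below is only an illustration of that reading, not Paimon's read or compaction code. It assumes a NULL in the newer row means "column not written by that file", and the helper names duplicated_row_ids and merge_by_row_id are hypothetical. Under that assumption, merging the rows that share a _ROW_ID (newest non-NULL value per column) reproduces the "before compaction" output.

```python
# Illustrative sketch only; this is not Paimon's actual read or compaction logic.
# Rows are copied from the "after compaction" output in this PR description.
# Assumption: None marks a column that the newer file did not write.
from collections import defaultdict

COLUMNS = ("id", "b", "c")

after_compaction = [
    {"id": None, "b": 22, "c": None, "_ROW_ID": 0, "_SEQUENCE_NUMBER": 2},
    {"id": 2, "b": 2, "c": 2, "_ROW_ID": 0, "_SEQUENCE_NUMBER": 1},
    {"id": None, "b": 3, "c": None, "_ROW_ID": 1, "_SEQUENCE_NUMBER": 2},
    {"id": 3, "b": 3, "c": 3, "_ROW_ID": 1, "_SEQUENCE_NUMBER": 1},
    {"id": 1, "b": 11, "c": 0, "_ROW_ID": 2, "_SEQUENCE_NUMBER": 2},
]


def duplicated_row_ids(rows):
    """Return the sorted _ROW_ID values that occur more than once."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["_ROW_ID"]] += 1
    return sorted(row_id for row_id, n in counts.items() if n > 1)


def merge_by_row_id(rows):
    """Merge rows sharing a _ROW_ID; the newest non-None value wins per column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["_ROW_ID"]].append(row)

    merged = []
    for row_id in sorted(groups):
        # Order the fragments of one logical row from newest to oldest.
        fragments = sorted(
            groups[row_id], key=lambda r: r["_SEQUENCE_NUMBER"], reverse=True
        )
        out = {
            "_ROW_ID": row_id,
            "_SEQUENCE_NUMBER": fragments[0]["_SEQUENCE_NUMBER"],
        }
        for col in COLUMNS:
            # Take the first (newest) fragment that actually carries this column.
            out[col] = next((f[col] for f in fragments if f[col] is not None), None)
        merged.append(out)
    return merged


if __name__ == "__main__":
    print(duplicated_row_ids(after_compaction))
    for row in merge_by_row_id(after_compaction):
        print(row)
```

A quick check for the same symptom on a real table is to group by _ROW_ID and look for counts greater than one.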

Tests

API and Format

Documentation

JingsongLi · 2025-10-10T04:05:49Z

+1

[spark] Disable compaction for data evolution table

64d79a8

juntaozhang force-pushed the fix-disable-compact branch from 99da4e8 to 64d79a8 on September 28, 2025 03:07

[spark] Disable compaction for data evolution table - fix UT

10d8a53

JingsongLi merged commit db56793 into apache:master Oct 10, 2025
25 checks passed
