HIVE-28578: Concurrency issue in updateTableColumnStatistics #5929
base: master
...e-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
There was another PR for the same issue: #5567. And since HMS provides -1 on the locking approach
We can use optimistic locking instead: it gives higher throughput and concurrency, and we retry if the update affects 0 rows. The same strategy is used in
As far as I know, in MySQL this will block others on the same row (the same as FOR UPDATE). If 0 rows are returned, it means we have acquired the X row lock. But this gives me an idea: we could alter the table while updating the column statistics, so we don't need to ask for the row lock.
After some research, I found it's hard to do this with DataNucleus without a pure "UPDATE" query:
mTable.setLastAccessTime((int) (System.currentTimeMillis() / 1000));
pm.flush(); // this might flush the old MTable into the data store
pm.refresh(mTable);
In
Please give me some time to review and reply.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
@nrg4878 @saihemanth-cloudera @zhangbutao @wecharyu could you check this PR as well, please?
saihemanth-cloudera
left a comment
LGTM +1
3df3cbf to c77f11f
Map<String, MPartitionColumnStatistics> oldStats = getPartitionColStats(table, statsDesc
    .getPartName(), colNames, colStats.getEngine());
lockForUpdate("PARTITIONS", "PART_ID", Optional.of("\"PART_ID\" = " + partitionIds.getFirst()));
change it to look up the partition by primary key
c77f11f to 207853c
-1
As far as I know, in MySQL this will block others on the same row (the same as FOR UPDATE). If 0 rows are returned, it means we have acquired the X tblId row lock, so why do we need to retry again?
Using retries instead of blocking aligns better with optimistic concurrency control, allowing transactions to proceed without waiting and only retry if a conflict actually occurs.
The same approach is applied in several areas, including Iceberg’s conflict resolution (HIVE-26882) and compaction.
I don't think retrying is a good approach; it introduces another issue, the
IMO, optimistic concurrency control is more suitable for read-skew scenarios; otherwise, retrying on conflict is a waste of resources.
Locking TBLS and PARTITIONS isn’t necessarily a better solution. In large-scale, highly concurrent systems, locking often becomes a scalability bottleneck. Instead, most systems adopt optimistic locking strategies. Each operation assumes no conflict and only validates at commit, reducing contention and avoiding blocking. The idea is to balance consistency with throughput. Extending this principle to stats import (which btw is not a core functionality) we could consider a mechanism based on versioning rather than traditional locking. Such an approach generally scales far better in multi-instance HMS environments than depending solely on table or partition locks.
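To make the versioning idea concrete, here is a minimal in-memory sketch, not HMS code. It assumes a hypothetical per-row version counter (HMS has no such column today), and the names `VersionedRow`, `updateIfVersionMatches`, and `OptimisticStatsUpdater` are invented for illustration. The conditional update stands in for an `UPDATE ... WHERE version = ?` statement and returns the affected row count, so a writer that sees 0 re-reads and retries.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** In-memory stand-in for a single versioned row. Hypothetical; not an HMS class. */
final class VersionedRow {
  /** Immutable view of the row as one reader saw it. */
  static final class Snapshot {
    final long version;
    final Set<String> columns;

    Snapshot(long version, Set<String> columns) {
      this.version = version;
      this.columns = Collections.unmodifiableSet(new TreeSet<>(columns));
    }
  }

  private long version = 0;
  private Set<String> columns = new TreeSet<>();

  synchronized Snapshot read() {
    return new Snapshot(version, columns);
  }

  /**
   * Models "UPDATE ... SET cols = ?, version = version + 1 WHERE version = ?".
   * Returns the affected row count: 1 on success, 0 if another writer got there first.
   */
  synchronized int updateIfVersionMatches(long expectedVersion, Set<String> newColumns) {
    if (version != expectedVersion) {
      return 0;
    }
    columns = new TreeSet<>(newColumns);
    version++;
    return 1;
  }
}

final class OptimisticStatsUpdater {
  /** Merges one column into the row, re-reading and retrying when 0 rows are affected. */
  static int addColumn(VersionedRow row, String column, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      VersionedRow.Snapshot snapshot = row.read();
      Set<String> merged = new TreeSet<>(snapshot.columns);
      merged.add(column);
      if (row.updateIfVersionMatches(snapshot.version, merged) == 1) {
        return attempt; // number of attempts it took
      }
    }
    throw new IllegalStateException("Gave up after " + maxAttempts + " attempts");
  }
}
```

Calling `OptimisticStatsUpdater.addColumn(row, "col_a", 3)` from several writers keeps every column, because a writer holding a stale snapshot is rejected and merges again from fresh state. The cost, as raised in this thread, is wasted work when conflicts are frequent.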
Currently HMS lacks versioning. RDBMSs nowadays provide MVCC so that writes do not block reads, so writes are the main concern here. Even without an explicit lock, there are still exclusive row locks behind the scenes. The RDBMS ensures the strong consistency and reliability that HMS benefits from and relies on, and with query execution times of milliseconds to seconds it can also deliver satisfactory throughput.
MVCC (Multi-Version Concurrency Control) and SELECT ... FOR UPDATE (S4U) operate at completely different conceptual levels; they’re not comparable mechanisms even though both relate to concurrency.
-- MVCC style (optimistic)
-- Pessimistic locking
Every Iceberg table commit involves an alter table operation, and it's non-blocking ATM.
That’s the opposite of what CU experienced on a highly loaded MySQL cluster with S4U on the NEXT_TXN_ID table.
Here we still obtained a row-level lock on
As I explained, this would cause a lost update: in pm.flush(), the old state of the table can overwrite the one in the data store, resulting in some columns missing from COLUMN_STATS_ACCURATE, for example:
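The original example is not preserved in this thread. As a purely hypothetical illustration of the same failure mode, the sketch below replays a lost update deterministically. It models COLUMN_STATS_ACCURATE as a plain set of column names (the real value is serialized differently), and the class, method, and column names are invented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class LostUpdateDemo {
  static final String KEY = "COLUMN_STATS_ACCURATE";

  /** Read-modify-write step with no lock and no version check. */
  static Set<String> withColumn(Set<String> snapshot, String column) {
    Set<String> copy = new TreeSet<>(snapshot);
    copy.add(column);
    return copy;
  }

  /** Replays the bad interleaving: both writers read before either one writes. */
  static Set<String> runInterleaved() {
    // Stand-in for the data store: parameter key -> columns marked accurate.
    Map<String, Set<String>> store = new HashMap<>();
    store.put(KEY, new TreeSet<>());

    Set<String> readByA = new TreeSet<>(store.get(KEY)); // writer A sees {}
    Set<String> readByB = new TreeSet<>(store.get(KEY)); // writer B sees {}

    store.put(KEY, withColumn(readByA, "col_a")); // A flushes {col_a}
    store.put(KEY, withColumn(readByB, "col_b")); // B flushes {col_b}, dropping col_a

    return store.get(KEY);
  }

  public static void main(String[] args) {
    System.out.println(runInterleaved()); // prints [col_b]
  }
}
```

Writer B never saw `col_a`, so its flush silently replaces A's result. Either a row lock held across the read and the write, or a conditional update that rejects B's stale state, would prevent this.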
The Iceberg commit relies on DB transaction atomicity, so it should involve a row lock behind the scenes, though the lock is quite narrow (TBL_ID, PARAM_KEY). If the table has multiple commits at the same time, only one is allowed to alter the
This is because every HMS request needs to lock the only one
3cbbf2c to 3499d20
I think I'm starting to get your optimistic concurrency control idea; let me try.
Filed another PR, #6159, to catch the nowait for update.
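What #6159 actually does is not shown in this thread. One possible reading of "catch the nowait" is sketched below, under assumptions: a unit of work issues `SELECT ... FOR UPDATE NOWAIT`, the database fails fast when the row is already locked, and the caller retries instead of blocking. `NoWaitRetry` and `LockedWork` are hypothetical names. Only PostgreSQL's SQLSTATE `55P03` (lock_not_available) is checked here; other databases signal this condition differently.

```java
import java.sql.SQLException;

final class NoWaitRetry {
  /** PostgreSQL SQLSTATE for lock_not_available, raised when NOWAIT cannot get the lock. */
  static final String PG_LOCK_NOT_AVAILABLE = "55P03";

  /** Hypothetical unit of work: lock the row with NOWAIT, then apply the update. */
  interface LockedWork<T> {
    T run() throws SQLException;
  }

  static <T> T runWithRetry(LockedWork<T> work, int maxAttempts, long backoffMillis)
      throws SQLException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    SQLException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return work.run();
      } catch (SQLException e) {
        if (!PG_LOCK_NOT_AVAILABLE.equals(e.getSQLState())) {
          throw e; // unrelated failure: do not retry
        }
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(backoffMillis * attempt); // linear backoff before the next attempt
        }
      }
    }
    throw last; // every attempt found the row locked
  }
}
```

In real use the whole transaction would have to be rolled back and restarted on each retry, since PostgreSQL aborts the current transaction after such an error.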



What changes were proposed in this pull request?
Why are the changes needed?
Does this PR introduce any user-facing change?
No
How was this patch tested?
Tested using Repro.java against MySQL and Postgres; I didn't see the issue anymore.