Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIVE-28658 Add Iceberg REST Catalog client support #5628

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zratkai
Copy link
Contributor

@zratkai zratkai commented Jan 31, 2025

Change-Id: I5bb2559f7ca602b71f8ca03c852e2deff1a1bc52

What changes were proposed in this pull request?

Iceberg REST client implementation added to support Iceberg REST server connection.

Why are the changes needed?

To support Iceberg REST server connection.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

How was this patch tested?

Unit test.

Class.forName("org.apache.iceberg.hive.HiveIcebergRESTCatalogClientAdapter", true, Utilities.getSessionSpecifiedClassLoader());
IMetaStoreClient restCatalogMetastoreClient = ReflectionUtils.newInstance(handlerClass, conf);
restCatalogMetastoreClient.reconnect();
return restCatalogMetastoreClient;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree with this approach. However, if we replace IMetaStoreClient, I'd like you to resolve HIVE-12679, which is more generic and allows users to use Glue or another service.

Committers have asked us to discover solutions to implement some of SessionHiveMetaStoreClient's traits.
#4444

That's why we should make configurable not IMetaStoreClient but ThriftHiveMetastore.Iface.
https://issues.apache.org/jira/browse/HIVE-28658?focusedCommentId=17905019&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17905019

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@okumin thanks for checking my PR. I see the point to make it configurable, but as I see the ticket you mentioned didn't get a conclusion and it was closed and not merged. So I would suggest to handle this separately and once your is merged we can refactor this. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@okumin could you please reopen your PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My PR has not reached any conclusion with which maintainers can agree. We have to reconsidering how we integrate some session features such as tmp tables.
This is the summary and my prososal. I'd like to hear your opinions as my recommendation is not aligned your PR

https://docs.google.com/document/d/1fFvB0DAXJvPYv27R8nLa3tOUT67-hY0owxBrDY-BxUA/edit?usp=sharing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the time has come for HIVE-12679. We should start a new discussion and proceed with the approach with more +1 votes.

@zhangbutao
Copy link
Contributor

@zratkai Thanks for your PR! Could you give an example of a test? Or the records & screenshots you've tested?

Change-Id: I5bb2559f7ca602b71f8ca03c852e2deff1a1bc52
@@ -333,6 +333,27 @@
<artifactId>slf4j-simple</artifactId>
<scope>test</scope>
</dependency>
<dependency>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need http libs in metastore-common?

@@ -887,6 +887,21 @@
<artifactId>RoaringBitmap</artifactId>
<version>${roaringbit.version}</version>
</dependency>
<dependency>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same thing, we need http libs here? client is in iceberg-catalog and it should bring dependencies transitively

<value>rest</value>
</property>
<property>
<name>iceberg.catalog.rest.type</name>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iceberg.catalog.type

<value>rest</value>
</property>
<property>
<name>iceberg.catalog.rest.uri</name>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iceberg.catalog.uri

@@ -60,4 +60,20 @@
<name>metastore.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
<property>
<name>iceberg.catalog</name>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems redundant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added for debugging purposes, I remove it.

<value>http://playground-gravitino:9001/iceberg</value>
</property>
<property>
<name>metastore.type</name>
Copy link
Member

@deniskuzZ deniskuzZ Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed?

</property>
<property>
<name>iceberg.catalog.rest.uri</name>
<value>http://playground-gravitino:9001/iceberg</value>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HMS Iceberg catalog would be merged soon, so i think it would be better to use that one as default

@@ -1054,7 +1054,7 @@ public void testSetSnapshotSummary() throws Exception {
}
assertThat(JsonUtil.mapper().writeValueAsString(summary).length()).isLessThan(4000);
Map<String, String> parameters = Maps.newHashMap();
ops.setSnapshotSummary(parameters, snapshot);
ops.getIcebergTableConverter().setSnapshotSummary(parameters, snapshot);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do an implicit translation?

@@ -5839,13 +5842,33 @@ public HiveMetaHook getHook(
}
};

if ("rest".equals(conf.get("metastore.type", "hms"))) {
Copy link
Member

@deniskuzZ deniskuzZ Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. conf.get(catalog-type)
    enum Catalog (HIVE, ICEBERG),
  2. conf.get(catalog-impl)
    enum CatalogType (HMS, HADOOP, REST)

note: REST would be supported for non-ACID as well via HIVE catalog


import static org.apache.iceberg.TableProperties.GC_ENABLED;

public class IcebergTableConverter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this converter? is this just a refactor? if yes, you should create an Iceberg PR first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a refactor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the separate PR, please review it, so I can rebase this PR on top of it:
#5676

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is in iceberg repo:
apache/iceberg#12461

if (conf.getBoolVar(ConfVars.METASTORE_FASTPATH)) {
return new SessionHiveMetaStoreClient(conf, hookLoader, allowEmbedded);
} else {
return RetryingMetaStoreClient.getProxy(conf, hookLoader, metaCallTimeMap,
SessionHiveMetaStoreClient.class.getName(), allowEmbedded);
}
}
private IMetaStoreClient getHiveIcebergRESTCatalog(HiveConf conf, HiveMetaHookLoader hookLoader) throws MetaException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we create IMetaStoreClientFactory and extract this logic there

if ("rest".equals(conf.get("metastore.type", "hms"))) {
return getHiveIcebergRESTCatalog(conf, hookLoader);
}

if (conf.getBoolVar(ConfVars.METASTORE_FASTPATH)) {
return new SessionHiveMetaStoreClient(conf, hookLoader, allowEmbedded);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note. SessionHiveMetaStoreClient has critical traits to achieve some features, e.g., temporary tables. My question: Do we want to include those capabilities when integrating with an Iceberg REST Catalog, or do we have valid reasons why we don't need to support them?
IMetaStoreClient might not be a good place to override if we need them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants