-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HIVE-28658 Add Iceberg REST Catalog client support #5628
base: master
Are you sure you want to change the base?
Conversation
|
Class.forName("org.apache.iceberg.hive.HiveIcebergRESTCatalogClientAdapter", true, Utilities.getSessionSpecifiedClassLoader()); | ||
IMetaStoreClient restCatalogMetastoreClient = ReflectionUtils.newInstance(handlerClass, conf); | ||
restCatalogMetastoreClient.reconnect(); | ||
return restCatalogMetastoreClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree with this approach. However, if we replace IMetaStoreClient, I'd like you to resolve HIVE-12679, which is more generic and allows users to use Glue or another service.
Committers have asked us to discover solutions to implement some of SessionHiveMetaStoreClient's traits.
#4444
That's why we should make configurable not IMetaStoreClient but ThriftHiveMetastore.Iface.
https://issues.apache.org/jira/browse/HIVE-28658?focusedCommentId=17905019&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17905019
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@okumin thanks for checking my PR. I see the point to make it configurable, but as I see the ticket you mentioned didn't get a conclusion and it was closed and not merged. So I would suggest to handle this separately and once your is merged we can refactor this. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@okumin could you please reopen your PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PR has not reached any conclusion with which maintainers can agree. We have to reconsidering how we integrate some session features such as tmp tables.
This is the summary and my prososal. I'd like to hear your opinions as my recommendation is not aligned your PR
https://docs.google.com/document/d/1fFvB0DAXJvPYv27R8nLa3tOUT67-hY0owxBrDY-BxUA/edit?usp=sharing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the time has come for HIVE-12679. We should start a new discussion and proceed with the approach with more +1 votes.
@zratkai Thanks for your PR! Could you give an example of a test? Or the records & screenshots you've tested? |
Change-Id: I5bb2559f7ca602b71f8ca03c852e2deff1a1bc52
ac2388f
to
444db9e
Compare
@@ -333,6 +333,27 @@ | |||
<artifactId>slf4j-simple</artifactId> | |||
<scope>test</scope> | |||
</dependency> | |||
<dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need http libs in metastore-common?
@@ -887,6 +887,21 @@ | |||
<artifactId>RoaringBitmap</artifactId> | |||
<version>${roaringbit.version}</version> | |||
</dependency> | |||
<dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same thing, we need http libs here? client is in iceberg-catalog and it should bring dependencies transitively
<value>rest</value> | ||
</property> | ||
<property> | ||
<name>iceberg.catalog.rest.type</name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iceberg.catalog.type
<value>rest</value> | ||
</property> | ||
<property> | ||
<name>iceberg.catalog.rest.uri</name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iceberg.catalog.uri
@@ -60,4 +60,20 @@ | |||
<name>metastore.metastore.event.db.notification.api.auth</name> | |||
<value>false</value> | |||
</property> | |||
<property> | |||
<name>iceberg.catalog</name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added for debugging purposes, I remove it.
<value>http://playground-gravitino:9001/iceberg</value> | ||
</property> | ||
<property> | ||
<name>metastore.type</name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
</property> | ||
<property> | ||
<name>iceberg.catalog.rest.uri</name> | ||
<value>http://playground-gravitino:9001/iceberg</value> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HMS Iceberg catalog would be merged soon, so i think it would be better to use that one as default
@@ -1054,7 +1054,7 @@ public void testSetSnapshotSummary() throws Exception { | |||
} | |||
assertThat(JsonUtil.mapper().writeValueAsString(summary).length()).isLessThan(4000); | |||
Map<String, String> parameters = Maps.newHashMap(); | |||
ops.setSnapshotSummary(parameters, snapshot); | |||
ops.getIcebergTableConverter().setSnapshotSummary(parameters, snapshot); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do an implicit translation?
@@ -5839,13 +5842,33 @@ public HiveMetaHook getHook( | |||
} | |||
}; | |||
|
|||
if ("rest".equals(conf.get("metastore.type", "hms"))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- conf.get(
catalog-type
)
enum Catalog (HIVE, ICEBERG), - conf.get(
catalog-impl
)
enum CatalogType (HMS, HADOOP, REST)
note: REST would be supported for non-ACID as well via HIVE catalog
|
||
import static org.apache.iceberg.TableProperties.GC_ENABLED; | ||
|
||
public class IcebergTableConverter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this converter? is this just a refactor? if yes, you should create an Iceberg PR first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the separate PR, please review it, so I can rebase this PR on top of it:
#5676
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is in iceberg repo:
apache/iceberg#12461
if (conf.getBoolVar(ConfVars.METASTORE_FASTPATH)) { | ||
return new SessionHiveMetaStoreClient(conf, hookLoader, allowEmbedded); | ||
} else { | ||
return RetryingMetaStoreClient.getProxy(conf, hookLoader, metaCallTimeMap, | ||
SessionHiveMetaStoreClient.class.getName(), allowEmbedded); | ||
} | ||
} | ||
private IMetaStoreClient getHiveIcebergRESTCatalog(HiveConf conf, HiveMetaHookLoader hookLoader) throws MetaException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we create IMetaStoreClientFactory and extract this logic there
if ("rest".equals(conf.get("metastore.type", "hms"))) { | ||
return getHiveIcebergRESTCatalog(conf, hookLoader); | ||
} | ||
|
||
if (conf.getBoolVar(ConfVars.METASTORE_FASTPATH)) { | ||
return new SessionHiveMetaStoreClient(conf, hookLoader, allowEmbedded); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note. SessionHiveMetaStoreClient
has critical traits to achieve some features, e.g., temporary tables. My question: Do we want to include those capabilities when integrating with an Iceberg REST Catalog, or do we have valid reasons why we don't need to support them?
IMetaStoreClient might not be a good place to override if we need them.
Change-Id: I5bb2559f7ca602b71f8ca03c852e2deff1a1bc52
What changes were proposed in this pull request?
Iceberg REST client implementation added to support Iceberg REST server connection.
Why are the changes needed?
To support Iceberg REST server connection.
Does this PR introduce any user-facing change?
No.
Is the change a dependency upgrade?
How was this patch tested?
Unit test.