HIVE-28658 Add Iceberg REST Catalog client support #5628

Closed

Conversation

zratkai
Contributor

@zratkai zratkai commented Jan 31, 2025

Change-Id: I5bb2559f7ca602b71f8ca03c852e2deff1a1bc52

What changes were proposed in this pull request?

Iceberg REST client implementation added to support Iceberg REST server connection.

Why are the changes needed?

To support Iceberg REST server connection.

Does this PR introduce any user-facing change?

No.

Is the change a dependency upgrade?

How was this patch tested?

Unit test.

@zhangbutao
Contributor

@zratkai Thanks for your PR! Could you share an example test, or the logs and screenshots from your own testing?

zratkai added 2 commits July 7, 2025 09:54
Change-Id: I803b1d1b523a4e77116b0227324afaaa7acf659f
Change-Id: I3aa2bd6571aa8fe6b314ac097483c0ed9ae30198
@zratkai zratkai force-pushed the HIVE-28658-IcebergRESTCatalogClient branch from 82328f7 to abe559e Compare July 7, 2025 07:58
@deniskuzZ deniskuzZ requested review from okumin, SourabhBadhya and ngsg and removed request for ngsg and SourabhBadhya July 7, 2025 08:29
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HiveIcebergRESTCatalogClientAdapter implements IMetaStoreClient {
Member

I think it should inherit from BaseMetaStoreClient. Is that right, @ngsg?

Contributor

Yes, similar to ThriftHiveMetaStoreClient, it would be better to inherit from BaseMetaStoreClient to simplify the implementation.

Member

Could be renamed to HiveIcebergRESTCatalogClient and should be moved to the client package

README.md Outdated
@@ -109,6 +109,50 @@ Upgrading from older versions of Hive
different database for your MetaStore you will need to provide
your own upgrade script.

Using Iceberg REST catalog
Contributor

It's strange to add these properties to README.md.

Maybe we should add this material to the Hive website instead, for example at https://hive.apache.org/docs/latest/apache-hive-4-0-x_282102245/

Contributor Author

@deniskuzZ WDYT? I think you suggested that I add documentation about the new configs.

Member

I think it could be moved to /data/conf/iceberg/llap/hive-site.xml, but I agree it doesn't belong in the main README.
Also, it would be nice if we had a dedicated page under hive-site for setting up the RestClient.

Map<String, String> properties = getCatalogPropertiesFromConf(conf);
String catalogName = properties.get(WAREHOUSE);
restCatalog = new RESTCatalog();
restCatalog.initialize(catalogName, properties);
Contributor

What do you think about connecting to a REST catalog in the constructor, similar to ThriftHiveMetaStoreClient? It feels a bit weird to require an explicit call to reconnect() before using this class.
Also, I think we should call close() before creating a new RESTCatalog instance.
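
A minimal sketch of the two suggestions above, assuming the catalog object is closeable: the client connects when it is constructed, and reconnect() closes the old instance before building a new one. EagerClient and the Supplier-based factory are hypothetical names used only for illustration; they are not classes from this PR, Hive, or Iceberg.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical client; the Closeable stands in for a catalog such as RESTCatalog.
final class EagerClient implements Closeable {
    private final Supplier<Closeable> catalogFactory;
    private Closeable catalog;

    // Connects on construction, so the instance is usable immediately.
    EagerClient(Supplier<Closeable> catalogFactory) {
        this.catalogFactory = catalogFactory;
        this.catalog = catalogFactory.get();
    }

    // Releases the previous catalog before replacing it, so it is not leaked.
    void reconnect() throws IOException {
        close();
        catalog = catalogFactory.get();
    }

    @Override
    public void close() throws IOException {
        if (catalog != null) {
            catalog.close();
            catalog = null;
        }
    }
}
```

With this shape, calling reconnect() repeatedly never holds more than one open catalog at a time.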

Contributor Author

Why not in the constructor?

Separation of concerns
Constructors should only initialize the object’s internal state, not perform complex logic like I/O operations.

Easier error handling
If a network connection fails in the constructor, you can’t catch the exception cleanly when the object is being created. This leads to brittle code or forced exception handling.

MyClient client = new MyClient(); // What if constructor throws IOException?

Testability
Classes that do heavy work in constructors (like opening sockets or database connections) are harder to test, mock, or even instantiate in unit tests.

Flexible lifecycle management
By separating the setup logic (init()) from object creation (constructor), you can retry, delay, or configure the connection after object construction.

Why prefer init() or a similar method?

You control when the connection happens.
It’s easier to handle and report errors.
It allows dependency injection or configuration before setup.
It aligns with the "construct → configure → initialize → use" lifecycle pattern.
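
The "construct → configure → initialize → use" lifecycle described above can be sketched as follows. LazyClient, Connector, and Connection are hypothetical names used only for illustration; none of them comes from this PR, Hive, or Iceberg. The Connector indirection is an assumption that lets the sketch simulate a failed connection without a network.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical stand-in for a remote catalog connection.
interface Connection extends Closeable {
}

// Hypothetical connector, injected so that tests can simulate failures.
interface Connector {
    Connection open(String uri) throws IOException;
}

final class LazyClient implements Closeable {
    private final String uri;
    private final Connector connector;
    private Connection connection; // stays null until init() succeeds

    // The constructor only records configuration; it performs no I/O.
    LazyClient(String uri, Connector connector) {
        this.uri = uri;
        this.connector = connector;
    }

    // Explicit initialization: the caller decides when to connect and may retry.
    void init() throws IOException {
        if (connection == null) {
            connection = connector.open(uri);
        }
    }

    boolean isConnected() {
        return connection != null;
    }

    @Override
    public void close() throws IOException {
        if (connection != null) {
            connection.close();
            connection = null;
        }
    }
}
```

A caller constructs the client, calls init() inside its own try/catch, and can call init() again after a failure because a failed attempt leaves the object unconnected but intact.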

@@ -73,6 +71,8 @@ public final class Catalogs {
public static final String SNAPSHOT_REF = "snapshot_ref";

private static final String NO_CATALOG_TYPE = "no catalog";
private static final String REST_CATALOG_TYPE = "rest";
Member

Iceberg already has constant for that: CatalogUtil.ICEBERG_CATALOG_TYPE_REST

@@ -275,7 +270,9 @@ private static Map<String, String> getCatalogProperties(Configuration conf, Stri
config.getValue());
}
});

if (REST_CATALOG_TYPE.equals(catalogType)) {
catalogProperties.put("type", "rest");
Member

catalogProperties.put(CatalogUtil.ICEBERG_CATALOG_TYPE, catalogType);

@@ -298,7 +295,8 @@ private static String getCatalogType(Configuration conf, String catalogName) {
return catalogType;
}
} else {
String catalogType = conf.get(CatalogUtil.ICEBERG_CATALOG_TYPE);
String catalogType = conf.get(InputFormatConfig.catalogPropertyConfigKey(
Member

Why was this changed?

Contributor Author

Thanks for the feedback. Some of the comments seem very detailed or explanatory. It might be more efficient to focus the review on issues like logic, readability, or potential bugs. If anything’s unclear, feel free to try running the code — that might help clarify things faster.


import java.util.concurrent.ConcurrentHashMap;

public class IMetaStoreClientFactory {
Member

This factory is redundant; there is already HiveMetaStoreClientBuilder.

Contributor Author

Oh, this ticket (https://issues.apache.org/jira/browse/HIVE-20189) was open for 7 years (Created: 16/Jul/18 20:45), and you merged it just 20 hours ago. Sorry, I hadn't checked the new commits. I will have a look at it!

iceberg/pom.xml Outdated
@@ -221,6 +221,21 @@
<artifactId>value</artifactId>
<version>${immutables.value.version}</version>
</dependency>
<dependency>
Member

@deniskuzZ deniskuzZ Jul 16, 2025

Where are these dependency declarations used? Why duplicate the root pom's dependencyManagement (DM)?

@deniskuzZ
Member

@zratkai, good luck with the PR. Frankly, I've had enough of the arrogance.

Change-Id: I74283d2010197675219ca45543b433413578c95f
@zratkai
Contributor Author

zratkai commented Jul 21, 2025

If you continue work on this and reuse any part of my work, add my name as an author.

@zratkai zratkai closed this Jul 21, 2025