Exploring Open Formats Beyond Apache Iceberg for AWS Table Buckets #607

soumilshah1995 · 2024-12-20T14:37:16Z

Currently, AWS table buckets natively use Apache Iceberg as the format for storing data. While Iceberg offers many benefits, there are concerns about potential lock-in, especially since it's tightly coupled with AWS services.

Is there any effort from the community or AWS itself to adopt more open formats like Apache XTable or others? By supporting multiple open-source formats, users would have the flexibility to choose the format that best fits their use case, rather than being constrained to a single option.

What are your thoughts on this? Would supporting multiple formats lead to better interoperability, or are there challenges that might make this difficult to implement?

Let's keep this professional and focus on how we can guide the industry in the right direction. We should discuss how users can gain the ability to choose the tools that best meet their needs, and explore whether and how we can incorporate XTable into S3 table buckets.

vinishjail97 · 2024-12-20T18:18:28Z

Collaborator

> Is there any effort from the community to adopt more open formats like Apache XTable or others?

There is an existing PR, #593 by @the-other-tim-brown, for running RunSyncTool in continuous mode at a scheduled interval.
Users can run this in their AWS account using the xtable-sync-lambda Lambda function you wrote.

If we are looking for something that requires minimal user configuration, one option is to build an async service that discovers new tables in an S3 bucket (using S3 events?) and then synchronizes each table's metadata to the other table formats, excluding the source format. Is this something AWS can integrate as part of S3 table buckets?
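To make the discovery idea concrete, here is a minimal sketch of the planning half of such a service, assuming a standard S3 bucket with event notifications enabled. Everything in it is illustrative rather than part of XTable: the function names `detect_table` and `plan_syncs` are hypothetical, the marker directories used to guess the source format are an assumption, and the output dictionary only mirrors the `sourceFormat` / `targetFormats` / `datasets` layout of a sync config as I understand it. It does not run any sync itself.

```python
from urllib.parse import unquote_plus

# Assumption: each format is recognised by a well-known metadata directory.
# Order matters: ".hoodie/metadata/" must be seen as Hudi, not Iceberg.
FORMAT_MARKERS = {
    "HUDI": "/.hoodie/",
    "DELTA": "/_delta_log/",
    "ICEBERG": "/metadata/",
}


def detect_table(bucket, key):
    """Return (source_format, table_base_path) if the key looks like
    table metadata, otherwise None."""
    path = "/" + key
    for fmt, marker in FORMAT_MARKERS.items():
        idx = path.find(marker)
        if idx != -1:
            base = path[:idx].strip("/")
            base_path = f"s3://{bucket}/{base}" if base else f"s3://{bucket}"
            return fmt, base_path
    return None


def plan_syncs(event):
    """Turn an S3 event notification into one sync plan per discovered table.

    Hypothetical helper: a real service would also need to remember which
    format is the true source, because metadata written by a sync would
    otherwise look like a new table in the target format.
    """
    plans = {}
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        found = detect_table(bucket, key)
        if found is None:
            continue
        source_format, base_path = found
        if base_path in plans:
            continue  # several metadata files for one table -> one plan
        plans[base_path] = {
            "sourceFormat": source_format,
            "targetFormats": [f for f in FORMAT_MARKERS if f != source_format],
            "datasets": [
                {
                    "tableBasePath": base_path,
                    "tableName": base_path.rsplit("/", 1)[-1],
                }
            ],
        }
    return list(plans.values())
```

Each returned plan could then be handed to whatever actually performs the sync, for example a scheduled RunSyncTool invocation. Deduplicating by base path keeps a burst of commit files from triggering repeated syncs of the same table.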

soumilshah1995 · 2024-12-20T18:59:05Z

Author

I am actually referring to S3 table buckets; I think you are referring to standard S3 buckets.

https://aws.amazon.com/blogs/aws/new-amazon-s3-tables-storage-optimized-for-analytics-workloads/
