[Protocol Change] Add the Redirection spec change proposal #3705

Open
wants to merge 1 commit into
base: master

kamcheungting-db
Contributor

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (Protocol Change Documentation)

Description

This PR describes redirection in detail, including the feature definition, the procedures for enabling and disabling it, and the query redirection flow.

This PR is created to submit all necessary documentation of the

How was this patch tested?

This is a documentation-only PR, so no test cases are needed.

Does this PR introduce any user-facing changes?

No.

* Redirect Destination Table: The Delta table that is created as a redirected table of an existing source table. Any access to the redirect source table would be dispatched to this redirect destination table.
## Overview
The proposal includes two distinct table features:
* **RedirectReaderWriter**: This feature enables both read and write operations to be redirected from the source to the destination for Modern Delta clients. It blocks all read and write operations from Legacy Delta clients.
Contributor

Can we remove the term "modern" everywhere? It is a really ambiguous term... modern compared to what?

Contributor

modern is defined in the terminology section above?

@@ -0,0 +1,116 @@
# Redirection
Contributor

Please update the table of proposed RFCs - https://github.com/delta-io/delta/tree/master/protocol_rfcs#table-of-rfcs

In fact, I realized that the instructions in that protocol_rfcs/README do not say to update the table. Can you please add that?

Contributor Author

Done

Contributor

@tdas left a comment

first pass

## Terminology:
* Modern Delta clients: These are Delta clients that are compatible with the latest Delta protocol and support the redirection feature.
* Legacy Delta clients: These refer to Delta clients that do not support the redirection feature, as they are not up-to-date with the latest Delta protocol.
* Redirect Source Table: The source Delta table that is redirected to a new storage location.
Contributor

Suggested change
- * Redirect Source Table: The source Delta table that is redirected to a new storage location.
+ * Redirect Source Table: The source Delta table that is redirected to a new storage location. The redirect table feature is added to this source table.
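
For readers following the "modern" versus "legacy" discussion, here is a minimal sketch of how a reader-writer table feature could end up blocking clients that do not know it. It assumes the usual table-features rule that a client must support every feature a table lists before it reads or writes. The `Protocol` and `Client` classes and the feature string `redirectReaderWriter` are hypothetical names used only for illustration, not the identifiers this PR defines.

```python
from dataclasses import dataclass

# Hypothetical feature name, borrowed from the property name quoted in this PR.
REDIRECT_READER_WRITER = "redirectReaderWriter"


@dataclass(frozen=True)
class Protocol:
    """Features a table requires (illustrative shape, not the real action)."""
    reader_features: frozenset
    writer_features: frozenset


@dataclass(frozen=True)
class Client:
    """A client described only by the set of features it supports."""
    supported: frozenset

    def can_read(self, protocol: Protocol) -> bool:
        # Reading requires support for every reader feature on the table.
        return protocol.reader_features <= self.supported

    def can_write(self, protocol: Protocol) -> bool:
        # Writing requires support for the reader and the writer features.
        return self.can_read(protocol) and protocol.writer_features <= self.supported


# A reader-writer feature is listed on both sides, so clients that do not
# know it are blocked from reads as well as writes.
source_table = Protocol(
    reader_features=frozenset({REDIRECT_READER_WRITER}),
    writer_features=frozenset({REDIRECT_READER_WRITER}),
)
modern = Client(supported=frozenset({REDIRECT_READER_WRITER}))
legacy = Client(supported=frozenset())
```

Under these assumptions, "modern" simply means "supports the redirect feature", which may be a less ambiguous way to phrase the terminology.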

* **State**: This value indicates the status of the redirect table. It has three possible values: **ENABLE-REDIRECT-IN-PROGRESS**, **READY** and **DROP-REDIRECT-IN-PROGRESS**.
* **ENABLE-REDIRECT-IN-PROGRESS**: This state indicates that the redirect process is still going on. The modern delta client can still read the table on the redirect source table but can't write data to the table. This state also accepts the transaction that updates the value of "delta.redirectReaderWriter" and "delta.redirectWriterOnly".
* **READY**: This state indicates that the redirect process is completed and the modern delta client can read and write the table. Both table-based and path-based read and write statements are redirected to the destination table.
* **DROP-REDIRECT-IN-PROGRESS**: The table redirection is under withdrawal and the redirection property is going to be removed from the delta table. In this state, the modern delta client stops redirecting new queries to redirect destination tables, and only accepts read-only queries to access the redirect source table. The on-going redirected write or metadata transactions, which are visiting redirect destinations, can not commit. This state accepts the transaction that updates the value of "delta.redirectReaderWriter" and "delta.redirectWriterOnly".
Contributor

Suggested change
- * **DROP-REDIRECT-IN-PROGRESS**: The table redirection is under withdrawal and the redirection property is going to be removed from the delta table. In this state, the modern delta client stops redirecting new queries to redirect destination tables, and only accepts read-only queries to access the redirect source table. The on-going redirected write or metadata transactions, which are visiting redirect destinations, can not commit. This state accepts the transaction that updates the value of "delta.redirectReaderWriter" and "delta.redirectWriterOnly".
+ * **DROP-REDIRECT-IN-PROGRESS**: The table redirection is under withdrawal and the redirection property is going to be removed from the delta table. In this state, the modern delta client stops redirecting new queries to redirect destination tables, and only accepts read-only queries to access the redirect source table. The on-going redirected write or metadata transactions, which are visiting redirect destinations, can not commit. This state accepts the transaction that updates the value of "delta.redirectReaderWriter" and "delta.redirectWriterOnly" properties.
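
The three states quoted above can be read as a small routing table. The sketch below is one possible reading, not the proposal's implementation: the `RedirectState`, `Op`, `ROUTES`, and `route` names are hypothetical, and only the behaviour stated in the quoted text is modelled.

```python
from enum import Enum


class RedirectState(Enum):
    ENABLE_REDIRECT_IN_PROGRESS = "ENABLE-REDIRECT-IN-PROGRESS"
    READY = "READY"
    DROP_REDIRECT_IN_PROGRESS = "DROP-REDIRECT-IN-PROGRESS"


class Op(Enum):
    READ = "read"
    WRITE = "write"
    # A transaction that updates delta.redirectReaderWriter / delta.redirectWriterOnly.
    UPDATE_REDIRECT_PROPERTY = "update-redirect-property"


# Which table serves each operation, per state. Missing entries are rejected.
# The quoted text does not say how READY treats property updates, so this
# sketch leaves that case out; a real client has to allow it somehow in order
# to start a drop.
ROUTES = {
    RedirectState.ENABLE_REDIRECT_IN_PROGRESS: {
        Op.READ: "source",
        Op.UPDATE_REDIRECT_PROPERTY: "source",
    },
    RedirectState.READY: {
        Op.READ: "destination",
        Op.WRITE: "destination",
    },
    RedirectState.DROP_REDIRECT_IN_PROGRESS: {
        Op.READ: "source",
        Op.UPDATE_REDIRECT_PROPERTY: "source",
    },
}


def route(state: RedirectState, op: Op) -> str:
    """Return 'source' or 'destination', or raise if the operation is blocked."""
    target = ROUTES[state].get(op)
    if target is None:
        raise PermissionError(f"{op.value} is not allowed in state {state.value}")
    return target
```

Writing the states out this way makes the gap visible: the spec text may want to state explicitly which transactions READY accepts on the source table.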

* Concurrent write transactions are aborted using the existing conflict checker mechanism. Therefore, state transitions should commit to both the redirect source and destination tables. This ensures that the Delta table's conflict checker can automatically manage concurrent write transactions.
Contributor

> Concurrent write transactions are aborted using the existing conflict checker mechanism.

aborted -> handled? Also, I did not fully understand this in the context of state. Could you elaborate a bit more?

Contributor

> state transitions should commit to both the redirect source and destination tables.

Does this mean we are adding the table features to both? Is this documented somewhere else?
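
To make the two questions above concrete, here is a toy model of optimistic concurrency that shows one way to read the quoted sentence: if a state transition lands as a metadata-changing commit on a table, any writer that started from an older snapshot of that table fails its conflict check. `ToyLog` and `CommitConflict` are hypothetical names, and the conflict rule is deliberately simplified; this is not Delta's actual conflict checker.

```python
class CommitConflict(Exception):
    """Raised when a commit loses to a concurrent metadata change."""


class ToyLog:
    """Toy commit log that records, per version, whether metadata changed."""

    def __init__(self):
        # Version 0 stands for table creation, which sets the metadata.
        self._metadata_changed = [True]

    @property
    def latest_version(self) -> int:
        return len(self._metadata_changed) - 1

    def commit(self, read_version: int, changes_metadata: bool = False) -> int:
        # Look at every commit that landed after the snapshot this
        # transaction read. A metadata change among them is a conflict.
        winners = self._metadata_changed[read_version + 1:]
        if any(winners):
            raise CommitConflict(
                f"metadata changed after version {read_version}; retry on a new snapshot"
            )
        self._metadata_changed.append(changes_metadata)
        return self.latest_version


destination = ToyLog()

# A redirected writer takes its snapshot before the state transition.
writer_snapshot = destination.latest_version

# The state transition is committed to this table as a metadata change.
transition_version = destination.commit(destination.latest_version, changes_metadata=True)
```

In this model the in-flight writer's later `destination.commit(writer_snapshot)` raises `CommitConflict`, while a writer that starts after the transition commits normally. That only works for a given table if the transition is actually committed there, which may be why the text asks for commits on both the source and the destination.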

* **NoRedirectRules**: This attribute contains a list of rules for allowing transactions on the redirect source table. It allows maintenance workloads to run on the redirect source tables, for instance, the REFRESH UNIFORM and VACUUM command on the redirect source table. Each rule includes two attributes:
* **AppName**: This is the name of the applications that are allowed to execute commands on the redirect source table. When the sessions with these AppName try to access the redirected table, the modern delta client disables table redirection and their queries would be run on the redirect source table.
Contributor

Suggested change
- * **AppName**: This is the name of the applications that are allowed to execute commands on the redirect source table. When the sessions with these AppName try to access the redirected table, the modern delta client disables table redirection and their queries would be run on the redirect source table.
+ * **AppName**: This is the name of the applications that are allowed to execute commands on the redirect source table. When a session with this AppName tries to access the redirected table, the modern delta client disables table redirection and their queries would be run on the redirect source table.
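
As a rough illustration of the AppName rule, the sketch below decides whether a session should be redirected. It assumes each rule is a mapping with an `AppName` key and that matching is an exact string comparison; both are assumptions, since the quoted text does not specify the rule format. The second rule attribute is not shown in this excerpt and is left out. The `should_redirect` helper and the `maintenance-job` application name are hypothetical.

```python
def should_redirect(session_app_name, no_redirect_rules):
    """Return False if any rule names this session's application, else True.

    Assumes rules look like {"AppName": "..."} and match by exact equality.
    """
    return all(rule.get("AppName") != session_app_name for rule in no_redirect_rules)


# Hypothetical rule list that lets a maintenance application stay on the source table.
rules = [{"AppName": "maintenance-job"}]

runs_on_source = not should_redirect("maintenance-job", rules)
runs_on_destination = should_redirect("ad-hoc-notebook", rules)
```

With an empty rule list every session is redirected, which matches the default behaviour described for the READY state.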

3 participants