diff --git a/modules/ROOT/pages/connect-clients-to-proxy.adoc b/modules/ROOT/pages/connect-clients-to-proxy.adoc index 985ef1a4..f4f31007 100644 --- a/modules/ROOT/pages/connect-clients-to-proxy.adoc +++ b/modules/ROOT/pages/connect-clients-to-proxy.adoc @@ -179,7 +179,7 @@ The following sample client applications demonstrate how to use the Java driver See your driver's documentation for code samples that are specific to your chosen driver, including cluster connection examples and statement execution examples. -You can use the provided sample client applications, in addition to your own client applications, to validate that your {product-proxy} deployment is orchestrating read and write requests as expected between the origin cluster, target cluster, and your client applications. +You can use the provided sample client applications, in addition to your own client applications, to validate that your {product-proxy} deployment orchestrates read and write requests as expected between the origin cluster, target cluster, and client applications. {product-demo}:: https://github.com/alicel/zdm-demo-client/[{product-demo}] is a minimal Java web application which provides a simple, stripped-down example of an application built to work with {product-proxy}. diff --git a/modules/ROOT/pages/deploy-proxy-monitoring.adoc b/modules/ROOT/pages/deploy-proxy-monitoring.adoc index 44193388..d1522d71 100644 --- a/modules/ROOT/pages/deploy-proxy-monitoring.adoc +++ b/modules/ROOT/pages/deploy-proxy-monitoring.adoc @@ -27,9 +27,9 @@ ubuntu@52772568517c:~$ . List (`ls`) the contents of the Ansible Control Host Docker container, and then find the `zdm-proxy-automation` directory. -. Change (`cd`) to the `zdm-proxy-automation/ansible` directory. +. Change (`cd`) to the `zdm-proxy-automation/ansible/vars` directory. -. List the contents of the `ansible` directory, and then find the following YAML configuration files: +. List the contents of the `vars` directory, and then find the following YAML configuration files: + * `zdm_proxy_container_config.yml`: Internal configuration for the proxy container itself. * `zdm_proxy_cluster_config.yml`: Configuration properties to connect {product-proxy} to the origin and target clusters. @@ -228,7 +228,13 @@ For more information, see xref:ROOT:manage-proxy-instances.adoc[]. Typically the advanced configuration variables don't need to be changed. Only modify the variables in `zdm_proxy_advanced_config.yml` if you have a specific use case that requires changing them. -If the following advanced configuration variables need to be changed, only do so _before_ deploying {product-proxy}: +[IMPORTANT] +==== +The following advanced configuration variables are immutable. +If you need to change these variables, {company} recommends that you do so _before_ deploying {product-proxy}. +Future changes require you to recreate your entire {product-proxy} deployment. +For more information, see xref:ROOT:manage-proxy-instances.adoc#change-immutable-configuration-variables[Change immutable configuration variables]. +==== Multi-datacenter clusters:: For xref:ROOT:deployment-infrastructure.adoc#multiple-datacenter-clusters[multi-datacenter origin clusters], specify the name of the datacenter that {product-proxy} should consider local. @@ -241,8 +247,8 @@ For information about downloading a region-specific {scb-short}, see xref:astra- [#ports] Ports:: -Each {product-proxy} instance listens on port 9042 by default, like a regular {cass-short} cluster. 
-This can be overridden by setting `zdm_proxy_listen_port` to a different value.
+Each {product-proxy} instance listens on port 9042 by default, like a standard {cass-short} cluster.
+You can override this by setting `zdm_proxy_listen_port` to your preferred port.
This can be useful if the origin nodes listen on a port that is not 9042 and you want to configure {product-proxy} to listen on that same port to avoid changing the port in your client application configuration.
+
{product-proxy} exposes metrics on port 14001 by default.
@@ -467,6 +473,8 @@ If you want to enable TLS after the initial deployment, you must rerun the deplo
After modifying all necessary configuration variables, you are ready to deploy your {product-proxy} instances.

. From your shell connected to the Control Host, make sure you are in the `ansible` directory at `/home/ubuntu/zdm-proxy-automation/ansible`.
++
+If you are in the `vars` directory, go up one level (`cd ..`) to the `ansible` directory.

. Run the deployment playbook:
+
diff --git a/modules/ROOT/pages/feasibility-checklists.adoc b/modules/ROOT/pages/feasibility-checklists.adoc index 4f33d3a3..f0f121c8 100644 --- a/modules/ROOT/pages/feasibility-checklists.adoc +++ b/modules/ROOT/pages/feasibility-checklists.adoc @@ -8,13 +8,98 @@ You might need to adjust your data model or application logic to ensure compatib
If you cannot meet these requirements, particularly the cluster and schema compatibility requirements, see xref:ROOT:components.adoc[] for alternative migration tools and strategies.

+[#supported-cassandra-native-protocol-versions]
== Supported {cass-short} Native Protocol versions

-{product-proxy} supports protocol versions `v3`, `v4`, `DSE_V1`, and `DSE_V2`.
+include::ROOT:partial$cassandra-protocol-versions.adoc[]

-{product-proxy} technically doesn't support `v5`.
-If `v5` is requested, the proxy handles protocol negotiation so that the client application properly downgrades the protocol version to `v4`.
-This means that you can use {product-proxy} with any client application that uses a driver version supporting protocol version `v5`, as long as the application doesn't use `v5`-specific functionality.
+When a specific protocol version is requested, {product-proxy} handles protocol negotiation to ensure the requested version is supported by both clusters.
+For example, to use protocol `V5` with {product-proxy}, both the origin and target clusters must support `V5`, such as {hcd-short} or open source {cass-reg} 4.0 or later.
+Otherwise, a lower protocol version must be used.
+
+If the requested version isn't mutually supported, then {product-proxy} can force the client application to downgrade to a mutually supported protocol version.
+If automatic forced downgrade isn't possible, then the connection fails, and you must modify your client application to request a different protocol version.
+
+.Determine your client application's supported and negotiated protocol versions
+[%collapsible]
+====
+Outside of a migration scenario (without {product-proxy}), the supported protocol versions depend on your origin cluster's version and client application's driver version.
+
+Generally, when connecting to a cluster, the driver requests the highest protocol version that it supports.
+If the cluster supports that version, then the connection uses that version.
+If the cluster doesn't support that version, then the driver progressively requests lower versions until it finds a mutually supported version.
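+
+To confirm which protocol version was actually negotiated, you can ask the driver at runtime.
+The following minimal Java driver 4.x sketch prints the negotiated version after connecting; the contact point and datacenter name are placeholder values:
+
+[source,java]
+----
+import java.net.InetSocketAddress;
+
+import com.datastax.oss.driver.api.core.CqlSession;
+
+public class NegotiatedProtocolVersion {
+  public static void main(String[] args) {
+    // Placeholder contact point and local datacenter; replace with your own values.
+    try (CqlSession session = CqlSession.builder()
+        .addContactPoint(new InetSocketAddress("127.0.0.1", 9042))
+        .withLocalDatacenter("dc1")
+        .build()) {
+      // The driver context exposes the protocol version negotiated during the handshake.
+      System.out.println("Negotiated protocol version: "
+          + session.getContext().getProtocolVersion());
+    }
+  }
+}
+----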
+
+For example, if the cluster and driver both support `V5`, then your client application uses `V5` automatically unless you explicitly disable `V5` in your driver configuration.
+
+If you upgrade your cluster, driver, or both to a version with a higher mutually supported protocol version, then the driver automatically starts using the higher version unless you explicitly disable it in your driver configuration.
+
+When you introduce {product-proxy}, the target cluster is integrated into the protocol negotiation process to ensure that the negotiated protocol version is supported by the origin cluster, target cluster, and driver.
+====
+
+=== Considerations and requirements for `V5`
+
+Required {product-proxy} version::
+Official support for `V5` requires {product-proxy} version 2.4.0 or later.
+
+Use cases requiring `V5`::
+You need to use `V5` only if your client application uses `V5`-specific functionality.
+
+Potential performance impact between `V5` and earlier versions::
+Protocol `V5` has improved integrity checks compared to earlier versions.
+This can cause slight performance degradation when your client application begins to use `V5` after using an earlier version.
++
+{company} performance tests showed potential throughput reductions ranging from 0 to 15 percent.
+This performance impact can occur with and without {product-proxy}.
++
+[TIP]
+====
+If your client application already uses `V5`, you have likely already adjusted to any potential performance impact, and the protocol version will have little or no impact on performance during your migration.
+====
++
+If you plan to upgrade to a `V5`-compatible driver before or during your migration, then the potential performance impact depends on which clusters support `V5`:
++
+--
+* **Neither cluster supports `V5`**: You won't notice any protocol-related performance impact before or during the migration because the driver and {product-proxy} cannot negotiate `V5` in this scenario.
+
+* **Only the target cluster supports `V5`**: You won't notice any protocol-related performance impact during the migration because {product-proxy} must negotiate a protocol version that is supported by both clusters.
+If the origin cluster doesn't support `V5`, then {product-proxy} cannot negotiate `V5` during the migration, and the driver cannot negotiate `V5` before the migration.
++
+However, you might experience a protocol-related performance impact at the end of the migration when you connect your client application directly to the target cluster.
+This phase removes {product-proxy} and the origin cluster from the protocol negotiation, allowing the driver to negotiate directly with the target cluster.
+If the target cluster supports `V5`, the driver can use `V5` automatically.
+
+* **Both clusters support `V5`**: Unless you <<disallow-or-explicitly-downgrade-the-protocol-version,disallow or explicitly downgrade the protocol version>>, you might experience performance impacts because the driver and {product-proxy} can use `V5` automatically in this scenario.
+Consider upgrading the driver before or after the migration so you can isolate the impact of that change without the added complexity of the migration.
+As a best practice for any significant version upgrade, run performance tests in lower environments to evaluate the potential impact before making the change in production.
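++
+For example, the following minimal Java driver 4.x sketch pins the protocol version to `V4` programmatically so that `V5` is never negotiated.
+This is an illustration only; the contact point and datacenter name are placeholders, and the equivalent setting is also available in the driver's configuration file:
++
+[source,java]
+----
+import java.net.InetSocketAddress;
+
+import com.datastax.oss.driver.api.core.CqlSession;
+import com.datastax.oss.driver.api.core.config.DefaultDriverOption;
+import com.datastax.oss.driver.api.core.config.DriverConfigLoader;
+
+public class PinnedProtocolVersion {
+  public static void main(String[] args) {
+    // Pin the protocol version so the driver never negotiates V5.
+    DriverConfigLoader loader = DriverConfigLoader.programmaticBuilder()
+        .withString(DefaultDriverOption.PROTOCOL_VERSION, "V4")
+        .build();
+    // Placeholder contact point and local datacenter; replace with your own values.
+    try (CqlSession session = CqlSession.builder()
+        .withConfigLoader(loader)
+        .addContactPoint(new InetSocketAddress("127.0.0.1", 9042))
+        .withLocalDatacenter("dc1")
+        .build()) {
+      System.out.println("Using protocol version: "
+          + session.getContext().getProtocolVersion());
+    }
+  }
+}
+----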
+-- + +[#disallow-or-explicitly-downgrade-the-protocol-version] +=== Disallow or explicitly downgrade the protocol version + +You can restrict protocol versions in the driver and {product-proxy} configuration: + +Driver configuration:: +You can explicitly downgrade the protocol version in your client application's driver configuration. +Make sure the enforced protocol version is supported by both clusters. ++ +Use this option if you need to enforce the protocol version outside of the migration. +For example: ++ +* Both clusters and the driver support `V5` but you don't want to use `V5`: Configure the protocol version in the driver before the migration if you haven't done so already. +* The origin cluster _doesn't_ support `V5` and you want to ensure `V5` isn't used automatically after the migration: Configure the protocol version in the driver at any point before the end of the migration when you connect your client application directly to the target cluster. +* You observe unacceptable performance degradation when using `V5` before the migration (without {product-proxy}): +Either mitigate the performance issues before the migration, or configure the protocol version in the driver before the migration. + +{product-proxy} configuration:: +You can use the `xref:ROOT:manage-proxy-instances.adoc#blocked-protocol-versions[blocked_protocol_versions]` configuration variable to block specific protocol versions at the proxy level. +Make sure at least one mutually supported protocol version isn't blocked. ++ +This option applies _only_ while {product-proxy} is in use. +It _doesn't_ persist after the migration. ++ +Use this option if you observe unacceptable performance degradation when {product-proxy} is active _and_ it negotiates `V5`. +If unacceptable performance degradation occurs _without_ {product-proxy}, then configure the protocol version in the driver instead. +However, be aware that {product-proxy} itself can have a performance impact, regardless of the protocol version. === Thrift isn't supported by {product-proxy} @@ -160,6 +245,11 @@ For more information, see xref:datastax-drivers:developing:query-idempotence.ado [#client-compression] == Client compression +[IMPORTANT] +==== +LZ4 and Snappy compression algorithms require {product-proxy} version 2.4.0 or later. +==== + The binary protocol used by {astra}, {dse-short}, {hcd-short}, and open-source {cass-short} supports optional compression of transport-level requests and responses that reduces network traffic at the cost of CPU overhead. When establishing connections from client applications, {product-proxy} responds with a list of compression algorithms supported by both clusters. diff --git a/modules/ROOT/pages/manage-proxy-instances.adoc b/modules/ROOT/pages/manage-proxy-instances.adoc index e6ed30a7..73fc440e 100644 --- a/modules/ROOT/pages/manage-proxy-instances.adoc +++ b/modules/ROOT/pages/manage-proxy-instances.adoc @@ -96,22 +96,25 @@ After you edit mutable variables in their corresponding configuration files (`va === Mutable variables in `vars/zdm_proxy_core_config.yml` -* `primary_cluster`: Determines which cluster is currently considered the xref:ROOT:faqs.adoc#what-are-origin-target-primary-and-secondary-clusters[primary cluster], either `ORIGIN` or `TARGET`. +`primary_cluster`:: +Determines which cluster is currently considered the xref:ROOT:faqs.adoc#what-are-origin-target-primary-and-secondary-clusters[primary cluster], either `ORIGIN` or `TARGET`. 
+
At the start of the migration, the primary cluster is the origin cluster because it contains all of the data.
After all the existing data has been transferred and validated/reconciled on the new cluster, you can switch the primary cluster to the target cluster.

-* `read_mode`: Determines how reads are handled by {product-proxy}:
+`read_mode`::
+Determines how reads are handled by {product-proxy}:
+
-** `PRIMARY_ONLY` (default): Reads are sent synchronously to the primary cluster only.
-** `DUAL_ASYNC_ON_SECONDARY`: Reads are sent synchronously to the primary cluster, and also asynchronously to the secondary cluster.
+* **`PRIMARY_ONLY` (default)**: Reads are sent synchronously to the primary cluster only.
+* **`DUAL_ASYNC_ON_SECONDARY`**: Reads are sent synchronously to the primary cluster, and also asynchronously to the secondary cluster.
See xref:enable-async-dual-reads.adoc[].
+
Typically, you only set `read_mode` to `DUAL_ASYNC_ON_SECONDARY` if the `primary_cluster` variable is set to `ORIGIN`.
This is because asynchronous dual reads are primarily intended to help test production workloads against the target cluster near the end of the migration.
When you are ready to switch `primary_cluster` to `TARGET`, revert `read_mode` to `PRIMARY_ONLY` because there is no need to send reads to both clusters at that point in the migration.

-* `log_level`: Set the {product-proxy} log level as `INFO` (default) or `DEBUG`.
+`log_level`::
+Set the {product-proxy} log level as `INFO` (default) or `DEBUG`.
+
Only use `DEBUG` while temporarily troubleshooting an issue.
Revert to `INFO` as soon as possible because the extra logging can impact performance slightly.
@@ -120,23 +123,23 @@ For more information, see xref:ROOT:troubleshooting-tips.adoc#proxy-logs[Check {

=== Mutable variables in `vars/zdm_proxy_cluster_config.yml`

-* Origin username and password
-
-* Target username and password
+In `vars/zdm_proxy_cluster_config.yml`, you can change the connection credentials (username and password) for the origin and target clusters.

=== Mutable variables in `vars/zdm_proxy_advanced_config.yml`

-* `zdm_proxy_max_clients_connections`: The maximum number of client connections that {product-proxy} can accept.
+`zdm_proxy_max_clients_connections`::
+The maximum number of client connections that {product-proxy} can accept.
Each client connection results in additional cluster connections and causes the allocation of several in-memory structures.
A high number of client connections per proxy instance can cause performance degradation, especially at high throughput.
Adjust this variable to limit the total number of connections on each instance.
+
Default: `1000`

-* `replace_cql_functions`: Whether {product-proxy} replaces standard `now()` CQL function calls in write requests with an explicit timeUUID value computed at proxy level.
+`replace_cql_functions`::
+Whether {product-proxy} replaces standard `now()` CQL function calls in write requests with an explicit timeUUID value computed at proxy level.
+
-If `false` (default), replacement of `now()` is disabled.
-If `true`, {product-proxy} replaces instances of `now()` in write requests with an explicit timeUUID value before sending the write to each cluster.
+* **`false` (default)**: Replacement of `now()` is disabled.
+* **`true`**: {product-proxy} replaces instances of `now()` in write requests with an explicit timeUUID value before sending the write to each cluster.
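++
+For illustration, enabling this option is similar to generating the timeUUID on the client side so that both clusters receive the same explicit value.
+The following minimal Java driver 4.x sketch shows that client-side equivalent; the table and column names are hypothetical:
++
+[source,java]
+----
+import com.datastax.oss.driver.api.core.CqlSession;
+import com.datastax.oss.driver.api.core.cql.SimpleStatement;
+import com.datastax.oss.driver.api.core.uuid.Uuids;
+
+public class ExplicitTimeUuidWrite {
+  // Hypothetical table: CREATE TABLE events (id timeuuid PRIMARY KEY, payload text)
+  static void insertEvent(CqlSession session, String payload) {
+    session.execute(SimpleStatement.newInstance(
+        "INSERT INTO events (id, payload) VALUES (?, ?)",
+        // Explicit timeUUID computed once on the client, instead of server-side now()
+        Uuids.timeBased(),
+        payload));
+  }
+}
+----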
+
[IMPORTANT]
====
@@ -153,7 +156,9 @@ If `now()` is used in any of your primary key columns, {company} recommends enab
For more information, see xref:ROOT:feasibility-checklists.adoc#cql-function-replacement[Server-side non-deterministic functions in the primary key].
====

-* [[zdm_proxy_request_timeout_ms]]`zdm_proxy_request_timeout_ms`: Global timeout in milliseconds of a request at proxy level.
+[[zdm_proxy_request_timeout_ms]]
+`zdm_proxy_request_timeout_ms`::
+Global timeout in milliseconds of a request at proxy level.
Determines how long {product-proxy} waits for one cluster (for reads) or both clusters (for writes) to reply to a request.
Upon reaching the timeout limit, {product-proxy} abandons the request and no longer considers it pending, which frees up internal resources to process other requests.
+
If the client has an especially high timeout because it routinely generates long
+
Default: `10000`

-* `origin_connection_timeout_ms` and `target_connection_timeout_ms`: Timeout in milliseconds for establishing a connection from the proxy to the origin or target cluster, respectively.
+`origin_connection_timeout_ms` and `target_connection_timeout_ms`::
+Timeout in milliseconds for establishing a connection from the proxy to the origin or target cluster, respectively.
+
Default: `30000`

-* `async_handshake_timeout_ms`: Timeout in milliseconds for the initialization (handshake) of the connection that is used solely for asynchronous dual reads between the proxy and the secondary cluster.
+`async_handshake_timeout_ms`::
+Timeout in milliseconds for the initialization (handshake) of the connection that is used solely for asynchronous dual reads between the proxy and the secondary cluster.
+
Upon reaching the timeout limit, the asynchronous reads aren't sent because the connection failed to be established.
This has no impact on the handling of synchronous requests: {product-proxy} continues to handle all synchronous reads and writes as normal against the primary cluster.
+
Default: `4000`

-* `heartbeat_interval_ms`: The interval in milliseconds that heartbeats are sent to keep idle cluster connections alive.
+`heartbeat_interval_ms`::
+The interval in milliseconds that heartbeats are sent to keep idle cluster connections alive.
This includes all control and request connections to the origin and the target clusters.
+
Default: `30000`

-* `metrics_enabled`: Whether to enable metrics collection.
+`metrics_enabled`::
+Whether to enable metrics collection.
+
-If `false`, {product-proxy} metrics collection is completely disabled.
+* **`true` (default)**: {product-proxy} collects and exposes metrics.
+* **`false`**: {product-proxy} metrics collection is completely disabled.
This isn't recommended.
-+
-Default: `true` (enabled)

[[zdm_proxy_max_stream_ids]]
-* `zdm_proxy_max_stream_ids`: Set the maximum pool size of available stream IDs managed by {product-proxy} per client connection.
+`zdm_proxy_max_stream_ids`::
+Set the maximum pool size of available stream IDs managed by {product-proxy} per client connection.
Use the same value as your driver's maximum stream IDs configuration.
+
In the CQL protocol, every request has a unique stream ID.
@@ -200,19 +209,58 @@ If you have a custom driver configuration with a higher value, make sure `zdm_pr
+
Default: `2048`

+[#blocked-protocol-versions]
+`blocked_protocol_versions`::
+This variable requires {product-proxy} and {product-automation} version 2.4.0 or later.
++
+Use `blocked_protocol_versions` to block specific {cass-short} Native Protocol versions that are supported by {product-proxy}.
+Blocking unsupported versions isn't necessary because {product-proxy} automatically rejects those versions.
++
+Allowed values include the following:
++
+* **Omitted or empty (default)**: All supported protocol versions are allowed.
+include::ROOT:partial$cassandra-protocol-versions.adoc[]
+* **One or more protocol versions**: Provide a case-insensitive, comma-separated list of one or more protocol versions that you want to block.
+The `v` prefix is optional.
++
+For example, all of the following configurations are valid:
++
+[source,yml]
+----
+# Block none
+blocked_protocol_versions:
+
+# Block one
+blocked_protocol_versions: V2
+blocked_protocol_versions: v2
+blocked_protocol_versions: 2
+
+# Block multiple
+blocked_protocol_versions: V2,V3,DSEV1
+blocked_protocol_versions: v2,v3,dsev1
+blocked_protocol_versions: 2,3,dse1
+----
++
+This variable can be useful if you notice performance degradation with specific protocol versions, and you want to disallow the protocol version at the proxy level instead of the driver level.
+For more information, see xref:ROOT:feasibility-checklists.adoc#supported-cassandra-native-protocol-versions[Supported {cass-short} Native Protocol versions].

=== Deprecated mutable variables

Deprecated variables will be removed in a future {product-proxy} release.
Replace them with their recommended alternatives as soon as possible.

-* `forward_client_credentials_to_origin`: Whether to use the credentials provided by the client application to connect to the origin cluster.
-If `false` (default), the credentials from the client application were used to connect to the target cluster.
-If `true`, the credentials from the client application were used to connect to the origin cluster.
+`forward_client_credentials_to_origin`::
+Whether to use the credentials provided by the client application to connect to the origin cluster.
++
+* **`false` (default)**: The credentials from the client application were used to connect to the target cluster.
+* **`true`**: The credentials from the client application were used to connect to the origin cluster.
+
+
This deprecated variable is no longer functional.
Instead, the expected credentials are based on the authentication requirements of the origin and target clusters.
For more information, see xref:ROOT:connect-clients-to-proxy.adoc#connect-applications-to-zdm-proxy[Connect applications to {product-proxy}].

+[#change-immutable-configuration-variables]
== Change immutable configuration variables

All configuration variables not listed in <> are _immutable_ and can only be changed by recreating the deployment with the xref:ROOT:deploy-proxy-monitoring.adoc[initial deployment playbook] (`deploy_zdm_proxy.yml`):
diff --git a/modules/ROOT/pages/troubleshooting-tips.adoc b/modules/ROOT/pages/troubleshooting-tips.adoc index 50b75650..70ec7336 100644 --- a/modules/ROOT/pages/troubleshooting-tips.adoc +++ b/modules/ROOT/pages/troubleshooting-tips.adoc @@ -314,9 +314,8 @@ For more information, see the documentation for your version of the Java driver:
These messages indicate that a protocol version downgrade happened because {product-proxy} or one of the clusters doesn't support the version requested by the client.
-`V5` downgrades are enforced by {product-proxy}.
-Any other downgrade results from a request by a cluster that doesn't support the version that the client requested.
-{product-proxy} supports `V3`, `V4`, `DSE_V1` and `DSE_V2`.
+Protocol version downgrades occur when a cluster doesn't support the version that the client requested.
+include::ROOT:partial$cassandra-protocol-versions.adoc[]

In the following example, notice that the `PROTOCOL ERROR` message is introduced by `level=debug`, indicating that it isn't a true error:
@@ -333,7 +332,7 @@ If you observe this behavior in your logs, <> s
=== Proxy fails to start due to invalid or unsupported protocol version

-If the {product-proxy} logs contain `debug` messages with `Invalid or unsupported protocol version: 3`, this means that one of the origin clusters doesn't support protocol version `V3` or later.
+If the {product-proxy} logs contain `debug` messages with `Invalid or unsupported protocol version: 3`, this means that the origin cluster doesn't support protocol `V3` or later.

.Invalid or unsupported protocol version logs
[%collapsible]
====
@@ -368,7 +367,7 @@ time="2022-10-01T19:58:15+01:00" level=error msg="Couldn't start proxy, retrying
====
Specifically, this happens with {cass-short} 2.0 and {dse-short} 4.6.
-{product-short} cannot be used for these migrations because the {product-proxy} control connections don't perform protocol version negotiation, they only attempt to use `V3`.
+{product-short} cannot be used for these migrations because the {product-proxy} control connections don't perform protocol version negotiation; they only attempt to use `V3`.

=== Authentication errors
diff --git a/modules/ROOT/partials/cassandra-protocol-versions.adoc b/modules/ROOT/partials/cassandra-protocol-versions.adoc new file mode 100644 index 00000000..c52b697c --- /dev/null +++ b/modules/ROOT/partials/cassandra-protocol-versions.adoc @@ -0,0 +1 @@ +{product-proxy} supports protocol versions `V2`, `V3`, `V4`, `V5`, `DSE_V1`, and `DSE_V2`. \ No newline at end of file