PD-5304: Databricks corrections
adrian-velonis1 committed Nov 27, 2024
1 parent b9e5d78 commit 5a9cafd
Showing 11 changed files with 179 additions and 153 deletions.
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@
</h3>
<p><b>Optional.</b>
</p>
<p>Specifies additional properties about the table. You can use this to specify new properties or replace existing ones.</p>
<p>Specifies additional properties. You can use this to specify new properties or replace existing ones.</p>
<p><code class="language-text">setExtendedTableProperties</code> has the following nested attributes:</p>
<ul>
<li><code class="language-text">tblProperties</code> (string) <b>(required)</b>: <MadCap:snippetText src="change-type-tbl-properties.flsnp"></MadCap:snippetText></li>
Original file line number Diff line number Diff line change
@@ -3,6 +3,6 @@
<head>
</head>
<body>
<p>The table properties you want to specify. Specify properties using the format <code class="language-text">'key'='value'</code>. Specify multiple properties in a comma-separated list.</p>
<p>The table properties you want to specify. Specify properties using the format <code class="language-text">'key'='value'</code>. Separate multiple values using commas.</p>
</body>
</html>
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="utf-8"?>
<html xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd">
<body>
<p class="note" MadCap:autonum="&lt;b&gt;Note: &lt;/b&gt;">You can either specify <code class="language-text">clusterColumns</code> or <code class="language-text">partitionColumns</code>, but not both.</p>
</body>
</html>
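The note added in this snippet says that `clusterColumns` and `partitionColumns` are mutually exclusive. A minimal sketch of the two alternatives follows. It reuses the `databricks:extendedTableProperties` tag and the `clusterColumns` attribute from the `createTable` example in this commit. It assumes that `partitionColumns` is set the same way, as an attribute, and that the `databricks:` prefix is declared on the enclosing `databaseChangeLog`. The changeSet ids, table names, and column names are hypothetical.

```xml
<!-- Fragment only: the databricks: namespace prefix is assumed to be
     declared on the enclosing databaseChangeLog element. -->

<!-- Option 1: a clustered table (clusterColumns set, partitionColumns omitted). -->
<changeSet id="example-clustered" author="your.name">
    <createTable tableName="clustered_example">
        <column name="id" type="INT" />
        <column name="region" type="STRING" />
        <databricks:extendedTableProperties clusterColumns="id" />
    </createTable>
</changeSet>

<!-- Option 2: a partitioned table (partitionColumns set, clusterColumns omitted).
     Assumption: partitionColumns is an attribute on the same tag. -->
<changeSet id="example-partitioned" author="your.name">
    <createTable tableName="partitioned_example">
        <column name="id" type="INT" />
        <column name="region" type="STRING" />
        <databricks:extendedTableProperties partitionColumns="region" />
    </createTable>
</changeSet>
```

Per the note, setting both attributes on the same `extendedTableProperties` tag is not supported.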
Original file line number Diff line number Diff line change
@@ -17,7 +17,7 @@
</ul>
<p><MadCap:variable name="General.Liquibase" />&#160;<MadCap:variable name="General.changetypes" />s that accept Databricks <MadCap:variable name="General.Param/Attribute" />s or sub-tags:</p>
<ul>
<li><code><MadCap:xref href="../../../../change-types/create-table.html">createTable</MadCap:xref></code>:&#160;create a table&#160;<ul><li>Databricks sub-tag: <code><MadCap:xref href="../../../../change-types/databricks/nested-tags/cluster-columns.html">clusterColumns</MadCap:xref></code>: create a clustered table</li><li>Databricks sub-tag: <code><MadCap:xref href="../../../../change-types/databricks/nested-tags/extended-table-properties.html">extendedTableProperties</MadCap:xref></code>: specify additional properties on a table, including partitions</li></ul></li>
<li><code><MadCap:xref href="../../../../change-types/create-table.html">createTable</MadCap:xref></code>:&#160;create a table&#160;<ul><li>Databricks sub-tag: <code><MadCap:xref href="../../../../change-types/databricks/nested-tags/extended-table-properties.html">extendedTableProperties</MadCap:xref></code>: specify additional properties on a table, including clusters and partitions</li></ul></li>
<li><code><MadCap:xref href="../../../../change-types/create-view.html">createView</MadCap:xref></code>: create a view on a Databricks table<ul><li>Databricks <MadCap:variable name="General.Param/Attribute" />: <code class="language-text">tblProperties</code>: similar to <code class="language-text">extendedTableProperties</code>, but for a view</li></ul></li>
</ul>
</body>
30 changes: 19 additions & 11 deletions Content/change-types/create-table.html
@@ -144,18 +144,10 @@ <h2 id="nested-tags">Nested tags</h2>
<td>all</td>
<td>yes</td>
</tr>
<tr MadCap:conditions="">
<td><code><MadCap:xref href="databricks/nested-tags/cluster-columns.html">clusterColumns</MadCap:xref></code>
</td>
<td>Creates a clustered table.</td>
<td>&#160;</td>
<td>databricks</td>
<td>no</td>
</tr>
<tr MadCap:conditions="">
<td><code><MadCap:xref href="databricks/nested-tags/extended-table-properties.html">extendedTableProperties</MadCap:xref></code>
</td>
<td>Specifies additional properties on a table you're creating.</td>
<td>Specifies additional properties on a table you're creating, such as whether to create clustered or partitioned columns.</td>
<td>&#160;</td>
<td>databricks</td>
<td>no</td>
@@ -202,6 +194,7 @@ <h2>Examples</h2>
author: your.name
changes:
- createTable:
tableName: test_table_complex_types
columns:
- column:
name: my_arrs
@@ -215,7 +208,11 @@ <h2>Examples</h2>
- column:
name: my_struct
type: 'STRUCT&lt;FIELD1: STRING NOT NULL, FIELD2: INT&gt;'
tableName: test_table_complex_types</code></pre>
extendedTableProperties:
clusterColumns: my_arrs, my_arrbi
tableFormat: delta
tableLocation: s3://databricks-external-folder/test_table_properties
tblProperties: 'this.is.my.key'=12,'this.is.my.key2'=true</code></pre>
</div>
<div id="json_example" class="js-tabcontent">
<p>General example:</p><pre xml:space="preserve"><code class="language-json" data-lang="json">{
@@ -256,6 +253,7 @@ <h2>Examples</h2>
"changes": [
{
"createTable": {
"tableName": "test_table_complex_types",
"columns": [
{
"column": {
@@ -282,7 +280,12 @@ <h2>Examples</h2>
}
}
],
"tableName": "test_table_complex_types"
"extendedTableProperties": {
"clusterColumns": "my_arrs, my_arrbi",
"tableFormat": "delta",
"tableLocation": "s3://databricks-external-folder/test_table_properties",
"tblProperties": "'this.is.my.key'=12,'this.is.my.key2'=true"
}
}
}
]
@@ -313,6 +316,11 @@ <h2>Examples</h2>
&lt;column name="my_arrbi" type="ARRAY&amp;lt;BIGINT&amp;gt;" /&gt;
&lt;column name="my_map" type="MAP&amp;lt;STRING, BIGINT&amp;gt;" /&gt;
&lt;column name="my_struct" type="STRUCT&amp;lt;FIELD1: STRING NOT NULL, FIELD2: INT&amp;gt;" /&gt;

&lt;databricks:extendedTableProperties clusterColumns="my_arrs, my_arrbi"
tableFormat="delta"
tableLocation="s3://databricks-external-folder/test_table_properties"
tblProperties="'this.is.my.key'=12,'this.is.my.key2'=true"/&gt;
&lt;/createTable&gt;
&lt;/changeSet&gt;

19 changes: 12 additions & 7 deletions Content/change-types/databricks/alter-cluster.html
@@ -9,9 +9,9 @@
<h1 id="alter-cluster"><code>alterCluster</code>
</h1>
<p><code class="language-text">alterCluster</code> is a <MadCap:variable name="General.changetypes" /> in the <a href="../../start/tutorials/databricks/home.htm"><MadCap:variable name="General.LBCommunity" /> Databricks extension</a> that alters a cluster on a table.</p>
<p>To create a cluster, see <code><MadCap:xref href="nested-tags/cluster-columns.html">clusterColumns</MadCap:xref></code>.</p>
<p>To create a new table with a cluster, see <code><MadCap:xref href="nested-tags/extended-table-properties.html">extendedTableProperties</MadCap:xref></code>.</p>
<h2>Uses</h2>
<p>Clustered columns can help optimize performance for some database queries. If you have previously created a table with one or more clustered columns, you can modify which columns are clustered using <code class="language-text">alterCluster</code>. Specify which columns to cluster using <code>clusterBy</code>.</p>
<p>Clustered columns can help optimize performance for some database queries. If you have previously created a table with one or more clustered columns, you can modify which columns are clustered using <code class="language-text">alterCluster</code>. Specify which columns to de-cluster by using <code>clusterBy</code>. You can also specify a new column to override existing clustering logic.</p>
<p>Changing which columns are clustered can be useful if your data changes significantly or if you begin using different filters to query your data. Better clustering can improve the read efficiency of the new queries.</p>
<p>Databricks does not allow you to drop tables containing clustered columns. You can use <code class="language-text">alterCluster</code> to remove clustering and then drop the table.</p>
<p>For more information, see <a href="https://docs.databricks.com/en/delta/clustering.html">Use liquid clustering for Delta tables</a> and <a href="https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-alter-table.html">ALTER&#160;TABLE</a>.</p>
@@ -48,15 +48,15 @@ <h3><code>clusterBy</code>
</h3>
<p><b>Optional.</b>
</p>
<p>Specifies how to cluster the table. Use this to remove clustering from a column.</p>
<p>Specifies how to cluster the table. Use this only to remove clustering from a column, not add clustering.</p>
<p><code class="language-text">clusterBy</code> has the following nested attributes:</p>
<ul>
<li><code class="language-text">none</code> (Boolean) <b>(required)</b>: if <code class="language-text">true</code>, turns <b>off</b> clustering for the table being altered. If <code class="language-text">false</code>, <MadCap:variable name="General.Liquibase" /> throws an error.</li>
</ul>
<h3><code>columns</code>/<code>column</code></h3>
<p><b>Optional.</b>
</p>
<p>An array of <code>column</code> objects that describes columns in the table. The column order does not matter.</p>
<p>An array of <code class="language-text">column</code> objects that describes columns in the table. The column order does not matter. Use this to overwrite an existing <code class="language-text">CLUSTER BY</code> SQL clause or add clustering to a column.</p>
<p><code class="language-text">column</code> has the following nested attributes:</p>
<ul>
<li><code class="language-text">name</code> (string) <b>(required)</b>: the name of the column to alter.</li>
@@ -139,9 +139,16 @@ <h2 id="examples">Examples</h2>
&lt;/databaseChangeLog&gt;</code></pre>
</div>
</div>
<MadCap:snippetBlock src="../../Z_Resources/Snippets/text/change-type-databricks-database-support.flsnp" />
<h2>Troubleshooting</h2>
<h3><code>clusterBy</code> parsing error</h3>
<p>If you set the <code class="language-text">clusterBy</code>&#160;<MadCap:variable name="General.Param/Attribute" />&#160;<code class="language-text">none=false</code>, <MadCap:variable name="General.Liquibase" /> throws this error:</p><pre xml:space="preserve"><code class="language-text">Unexpected error running Liquibase: Error parsing line 13 column 49 of generated.xml: cvc-enumeration-valid: Value 'false' is not facet-valid with respect to enumeration '[true]'. It must be a value from the enumeration.</code></pre>
<p>The purpose of this <MadCap:variable name="General.Param/Attribute" /> is to <b>remove</b> clustering from a column, so it only accepts <code class="language-text">none=true</code>.</p>
<p>If you were trying to <b>add</b> a clustered column to an existing table, you must simply use <code class="language-text">alterCluster</code> to specify <code class="language-text">column</code>. In this case, you must omit <code class="language-text">clusterBy</code>.</p>
<p>If you were trying to <b>create</b> a new table with clustering, you must use <code><MadCap:xref href="../create-table.html">createTable</MadCap:xref></code> to specify <code><MadCap:xref href="nested-tags/extended-table-properties.html">extendedTableProperties</MadCap:xref></code>. Then, you can use <code class="language-text">clusterColumns</code> to specify the columns you want to cluster.</p>
<h3><code>clusterBy</code> and <code>columns</code> both null</h3>
<p>If you don't specify either <code class="language-text">clusterBy</code> or <code class="language-text">columns</code>, <MadCap:variable name="General.Liquibase" /> throws this error:</p><pre><code class="language-text">Alter Cluster change require list of columns or element 'ClusterBy', please add at least one option.</code></pre>
<p>If you were trying to change the name of your table, you must use <code><MadCap:xref href="../rename-table.html">renameTable</MadCap:xref></code> instead.</p>
<MadCap:snippetBlock src="../../Z_Resources/Snippets/text/change-type-databricks-database-support.flsnp" />
<h2>Related links</h2>
<ul>
<li><a href="https://docs.databricks.com/en/delta/clustering.html">Databricks: Use liquid clustering for Delta tables</a>
@@ -150,8 +157,6 @@ <h2>Related links</h2>
</li>
<li><a href="https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-cluster-by.html">Databricks SQL&#160;Reference:&#160;CLUSTER&#160;BY clause (TABLE)</a>
</li>
<li><code><MadCap:xref href="nested-tags/cluster-columns.html">clusterColumns</MadCap:xref></code>
</li>
</ul>
</body>
</html>
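The revised `alterCluster` text above distinguishes two uses: listing `column` entries to change which columns are clustered, and setting `clusterBy` with `none=true` to remove clustering. A minimal sketch of both as XML changesets follows. The `tableName` attribute, the placement of `databricks:column` directly inside `databricks:alterCluster`, and the `databricks:` prefix binding on the enclosing `databaseChangeLog` are assumptions not shown in this diff. The changeSet ids, table name, and column name are hypothetical.

```xml
<!-- Fragment only: the databricks: namespace prefix is assumed to be
     declared on the enclosing databaseChangeLog element. -->

<!-- Change which columns are clustered: list column entries and omit clusterBy. -->
<changeSet id="example-alter-cluster-columns" author="your.name">
    <databricks:alterCluster tableName="example_table">
        <databricks:column name="event_date" />
    </databricks:alterCluster>
</changeSet>

<!-- Remove clustering: clusterBy accepts only none="true";
     none="false" fails schema validation, per the Troubleshooting section. -->
<changeSet id="example-alter-cluster-none" author="your.name">
    <databricks:alterCluster tableName="example_table">
        <databricks:clusterBy none="true" />
    </databricks:alterCluster>
</changeSet>
```

Omitting both `clusterBy` and the `column` entries would trigger the "please add at least one option" error quoted in the Troubleshooting section.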
4 changes: 2 additions & 2 deletions Content/change-types/databricks/analyze-table.html
@@ -42,7 +42,7 @@ <h2 id="available-attributes">Available <MadCap:variable name="General.Param/Att
<td><code class="language-text">analyzeColumns</code>
</td>
<td>String</td>
<td>Name of the column(s) to analyze. Specify multiple columns in a comma-separated list.</td>
<td>Name of the column(s) to analyze. Separate multiple values using commas.</td>
<td>Optional</td>
</tr>
</tbody>
@@ -99,7 +99,7 @@ <h2 id="examples">Examples</h2>
<MadCap:snippetBlock src="../../Z_Resources/Snippets/text/change-type-databricks-database-support.flsnp" />
<h2>Related links</h2>
<ul>
<li><a href="https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-aux-analyze-table.html">Databricks SQL&#160;Reference:&#160;ANALYZE</a>
<li><a href="https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-aux-analyze-table.html">Databricks SQL Reference: ANALYZE</a>
</li>
</ul>
</body>