feat: Javadoc and Source Generation (#367)
stanbrub authored Oct 18, 2024
1 parent c4aac88 commit 6f9b8e1
Showing 71 changed files with 250 additions and 105 deletions.
5 changes: 5 additions & 0 deletions .github/scripts/make-release-distro.sh
@@ -8,6 +8,8 @@ set -o nounset

# Create a tar file with the given version using the git project located in the
# working directory.
#
# ex. .github/scripts/make-release-distro.sh 0.33.4 .github/distro

if [[ $# != 2 ]]; then
echo "$0: Missing release version or distro source argument"
@@ -35,8 +37,11 @@ cp ${DISTRO_SOURCE}/* ${DISTRO_DEST}
rm ${DISTRO_DEST}/dependency-pom.xml
cp target/${ARTIFACT}.jar ${DISTRO_DEST}/libs/
cp target/${ARTIFACT}-tests.jar ${DISTRO_DEST}/libs/
cp target/${ARTIFACT}-sources.jar ${DISTRO_DEST}/libs/
cp target/${ARTIFACT}-javadoc.jar ${DISTRO_DEST}/libs/
echo "VERSION=${RELEASE_VERSION}" > ${DISTRO_DEST}/.env

cd ${DISTRO_DEST}
tar cvzf ../${ARTIFACT}.tar * .env
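
Review note: the script now copies four jars out of `target/`, and a missing `-sources` or `-javadoc` jar would only surface as a failed `cp`. A minimal sketch of a pre-flight check follows. The helper names `expected_jars` and `check_jars` are hypothetical and not part of this commit; the only assumption taken from the script is the `${ARTIFACT}[-suffix].jar` naming.

```shell
# Hypothetical helper (not in the commit): print the jar file names the
# distro step expects for a given artifact base name, one per line.
expected_jars() {
  for suffix in "" "-tests" "-sources" "-javadoc"; do
    echo "${1}${suffix}.jar"
  done
}

# Hypothetical helper: report every expected jar that is missing from the
# given directory on stderr and return non-zero if any is absent.
# Variables are global here to stay within plain POSIX sh.
check_jars() {
  check_dir="$1"
  check_missing=0
  for jar in $(expected_jars "$2"); do
    if [ ! -f "${check_dir}/${jar}" ]; then
      echo "missing: ${check_dir}/${jar}" >&2
      check_missing=1
    fi
  done
  return "${check_missing}"
}
```

Calling `check_jars target "${ARTIFACT}"` just before the `cp` lines would list every missing artifact at once. Whether a non-zero return stops the script depends on `errexit` being enabled, which is not visible in the hunks shown here.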


2 changes: 2 additions & 0 deletions .github/workflows/publish-benchmarks.yml
@@ -103,6 +103,8 @@ jobs:
deephaven-benchmark-${{env.VERSION}}.tar
deephaven-benchmark-${{env.VERSION}}-results.tar
release-notes.md
deephaven-benchmark-${{env.VERSION}}-sources.jar
deephaven-benchmark-${{env.VERSION}}-javadoc.jar
- name: Publish Github Release
if: ${{ github.ref_name == 'main' }}
36 changes: 35 additions & 1 deletion pom.xml
@@ -105,7 +105,41 @@
</executions>
</plugin>
<plugin>
<!-- For spotless to work on Windows, Set git config global 'core.autocrlf' to true -->
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<version>3.3.1</version>
<executions>
<execution>
<id>attach-sources</id>
<phase>install</phase>
<goals>
<goal>jar-no-fork</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>3.10.1</version>
<configuration>
<excludePackageNames>*.connect,*.controller,*.generator,*.jfr,*.metric,*.run,*.util</excludePackageNames>
<show>public</show>
<nohelp>true</nohelp>
</configuration>
<executions>
<execution>
<id>attach-javadocs</id>
<phase>install</phase>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<!-- For spotless to work on Windows, Set git config global
'core.autocrlf' to true -->
<groupId>com.diffplug.spotless</groupId>
<artifactId>spotless-maven-plugin</artifactId>
<version>2.43.0</version>
@@ -17,12 +17,12 @@
* A wrapper for the Bench api that allows running tests for the purpose of comparing Deephaven to other products that
* perform similar operations. It allows running Deephaven operations or using Deephaven as an agent to run command line
* python tests in the same environment (e.g. Docker).
* <p/>
* <p>
* One of two initializers must be called to set up which type of tests is desired; {@code initDeephaven()} or
* {@code initPython()}. Deephaven tests run queries inside of Deephaven like the standard benchmarks. Python tests use
* Deephaven as an agent to run python scripts from the command line by first installing required pip modules in a
* python virtual environment and then running each test from there.
* <p/>
* <p>
* Note: This runner requires test ordering, so it follows that tests in a single test class are meant to be run as a
* group. This violates the standard Benchmark convention that every test be able to be run by itself. This is done for
* practical purposes, though it is not ideal.
@@ -10,9 +10,9 @@
* Product comparison tests for the average by group operation. Tests read the same parquet data. To avoid an unfair
* advantage where some products may partition or group data during the read, parquet read time is included in the
* benchmark results.
* <p/>
* <p>
* Each test calculates two new average columns and groups by a string and an integer.
* <p/>
* <p>
* Data generation only happens in the first tests, the Deephaven test. Tests can be run individually, but only after
* the desired data has been generated.
*/
@@ -10,9 +10,9 @@
* Product comparison tests for the distinct (or select distinct) group operation. Tests read the same parquet data. To
* avoid an unfair advantage where some products may partition or group data during the read, parquet read time is
* included in the benchmark results.
* <p/>
* <p>
* Each test produces a table result that contains rows unique according to a string and an integer.
* <p/>
* <p>
* Data generation only happens in the first tests, the Deephaven test. Tests can be run individually, but only after
* the desired data has been generated.
*/
@@ -10,10 +10,10 @@
* Product comparison tests for filter (where) operations. Tests read the same parquet data. To avoid an unfair
* advantage where some products may partition or group data during the read, parquet read time is included in the
* benchmark results.
* <p/>
* <p>
* Each test produces a table result filtered by three criteria; value is an exact string, value > an integer, value <
* an integer.
* <p/>
* <p>
* Data generation only happens in the first tests, the Deephaven test. Tests can be run individually, but only after
* the desired data has been generated.
*/
@@ -9,11 +9,11 @@
* Product comparison tests for iterating and summing table columns. Tests read the same parquet data. To avoid an
* unfair advantage where some products may partition or group data during the read, parquet read time is included in
* the benchmark results.
* <p/>
* <p>
* Each test produces a table result containing one row with one column that is the total of the result of the sum of
* two columns for each row. ex. sum((r1c1 + r1c2)..(rNc1 + rNc2)). This is achieved without creating an extra column to
* hold the column sums.
* <p/>
* <p>
* Data generation only happens in the first test, the Deephaven test. Tests can be run individually, but only after the
* desired data has been generated.
*/
@@ -10,9 +10,9 @@
* Product comparison tests for inner join operations. Tests read the same parquet data. To avoid an unfair advantage
* where some products may partition or group data during the read, parquet read time is included in the benchmark
* results.
* <p/>
* <p>
* Each test produces a table that is the result of two tables intersected by a string and an integer.
* <p/>
* <p>
* Data generation only happens in the first tests, the Deephaven test. Tests can be run individually, but only after
* the desired data has been generated.
*/
@@ -9,9 +9,9 @@
/**
* Product comparison tests for sort operations. Tests read the same parquet data. To avoid an unfair advantage where
* some products may partition or group data during the read, parquet read time is included in the benchmark results.
* <p/>
* <p>
* Each test sorts a table by a string and an integer.
* <p/>
* <p>
* Data generation only happens in the first tests, the Deephaven test. Tests can be run individually, but only after
* the desired data has been generated.
*/
@@ -14,7 +14,7 @@
* boilerplate logic like imports, parquet reads, time measurement logic, etc. Each <code>test</code> runs two tests;
* one reading from a static parquet, and the other exercising ticking tables through the
* <code>AutotuningIncrementalReleaseFilter</code>.
* <p/>
* <p>
* Note: This class is for running tests in the <code>experimental</code> package. It will change as new experiments are
* added and may require external setup (i.e. parquet files) to work.
*/
@@ -112,7 +112,7 @@ public void table(String name, long rowCount) {
* Run the benchmark test according to the operation and the columns loaded from the source table. The name will
* show in the benchmark result output. The expected row count, since tests can scale, is an upper bound what result
* row count is expected.
* <p/>
* <p>
* This method assembles and runs two queries according to the settings provided previously: static and incremental
* release. Both runs are expected to produce the same resulting row count.
*
@@ -12,7 +12,7 @@
* This tests a sort operation based on generated rows of data. The expected row count is achieved by fully generating
* the data to a parquet file or partially generating the data and merging that by <code>tableFactor</code> to get row
* count.
* <p/>
* <p>
* Note: For best results, use base and row counts that are highly divisible and clear like 1,000,000 so that the
* <code>tableFactor</code> is a whole number.
*/
@@ -44,7 +44,7 @@ public void setup() {
* </ol>
* This test is identical to {@link #joinTwoTablesFromKafkaStream_Shorthand} except without the use of
* <code>bench_api_</code> functions for Kafka consumers and table waiting.
* <p/>
* <p>
* Properties (e.g. ${kafka.consumer.addr}) are automatically filled in during query execution.
*/
@Test
@@ -116,7 +116,7 @@ with exclusive_lock(table):
* </ol>
* This test is identical to {@link #joinTwoTablesFromKafkaStream_Longhand} except without the use of
* <code>bench_api_</code> functions for Kafka consumers and table waiting.
* <p/>
* <p>
* Properties (e.g. ${kafka.consumer.addr}) are automatically filled in during query execution.
*/
@Test
@@ -51,7 +51,7 @@ public Bench api() {
* Generate the given pre-defined tables according to the default data distribution defined by the
* <code>default.data.distribution</code> property. The first table name provided will be the main
* <code>source</code> table.
* <p/>
* <p>
* This method should only be called once per test.
*
* @param names the table names
@@ -81,7 +81,7 @@ public void table(String name, String distribution) {
/**
* Generate a pre-defined table and set a column grouping for the resulting table. The given table name will be used
* as the main table used by subsequent queries.
* <p/>
* <p>
*
* @param name the table name to generate
* @param groups
@@ -163,7 +163,7 @@ public void test(String name, String operation, String... loadColumns) {
* <li>If static test duration <code>&lt; scale.elapsed.time.target</code>, scale row count and do it again</li>
* <li>Run test with auto increment release filter according to the previously determined row count</li>
* <li>Assert that both static and incremental result tables have the same number of rows</li>
* <p/>
* <p>
*
* @param name the name of the test as it will show in the result file
* @param expectedRowCount the max row count expected from the operation regardless of scale, or zero if the count
@@ -7,7 +7,7 @@
/**
* Standard tests for the groupBy table operation. Ungroups column content. It is the inverse of groupBy.
* Ungroup unwraps columns containing Deephaven arrays or vectors.
* <p/>
* <p>
* Note: These tests do group then ungroup, since the data generator does not support arrays
*/
public class UngroupTest {
@@ -6,10 +6,10 @@

/**
* Standard tests for running user-defined functions. These tests are meant to be compared, and so use the same data.
* <p/>
* <p>
* Note: When scaling row count, vector size should not get bigger. That would cause more than one axis change and
* invalidate any expected comparisons.
* <p/>
* <p>
* Note: The "No Hints" tests have casts to make them equivalent to the hints tests, otherwise the return value would
* always be a PyObject and not really the same test. They use two formulas to achieve this, otherwise vectorization
* would not happen on "No Hints" benchmarks.
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a tick-based exponential moving maximum for specified
* columns and places the result into a new column for each row.
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_ticks</code> that are comparable to
* <code>EmMaxTimeTest</code>
*/
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a time-based exponential moving maximum for specified
* columns and places the result into a new column for each row.
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_time</code> that are comparable to <code>EmMaxTickTest</code>
*/
public class EmMaxTimeTest {
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a tick-based exponential moving minimum for specified
* columns and places the result into a new column for each row. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_time</code> that are comparable to <code>EmMinTickTest</code>
*/
public class EmMinTickTest {
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a time-based exponential moving minimum for specified
* columns and places the result into a new column for each row.
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_ticks</code> that are comparable to
* <code>EmMinTimeTest</code>
*/
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a tick-based exponential moving standard deviation for
* specified columns and places the result into a new column for each row. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_ticks</code> that are comparable to
* <code>EmStdTimeTest</code>
*/
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a time-based exponential moving standard deviation for
* specified columns and places the result into a new column for each row. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_time</code> that are comparable to <code>EmStdTickTest</code>
*/
public class EmStdTimeTest {
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a tick-based exponential moving average for specified
* columns and places the result into a new column for each row.
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_ticks</code> that are comparable to <code>EmaTimeTest</code>
*/
public class EmaTickTest {
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a time-based exponential moving average for specified
* columns and places the result into a new column for each row.
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_time</code> that are comparable to <code>EmaTickTest</code>
*/
public class EmaTimeTest {
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a tick-based exponential moving sum for specified columns
* and places the result into a new column for each row. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_ticks</code> that are comparable to <code>EmsTimeTest</code>
*/
public class EmsTickTest {
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Calculates a time-based exponential moving average for specified
* columns and places the result into a new column for each row. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>decay_time</code> that are comparable to <code>EmsTickTest</code>
*/
public class EmsTimeTest {
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Defines a tick-based rolling average. The result table contains
* additional columns with windowed rolling averages for each specified column in the source table. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>rev_ticks/fwd_ticks</code> that are comparable to
* <code>RollingAvgTimeTest</code>
*/
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Defines a time-based rolling average. The result table contains
* additional columns with windowed rolling averages for each specified column in the source table. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>rev_time/fwd_time</code> that are comparable to
* <code>RollingAvgTickTest</code>
*/
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Defines a tick-based rolling count. The result table contains
* additional columns with windowed rolling count1 for each specified column in the source table.
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>rev_ticks/fwd_ticks</code> that are comparable to
* <code>RollingCountTimeTest</code>
*/
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Defines a time-based rolling count. The result table contains
* additional columns with windowed rolling counts for each specified column in the source table. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>rev_time/fwd_time</code> that are comparable to
* <code>RollingCountTickTest</code>
*/
@@ -7,7 +7,7 @@
/**
* Standard tests for the updateBy table operation. Defines a tick-based rolling group. The result table contains
* additional columns with windowed rolling groups for each specified column in the source table. *
* <p/>
* <p>
* Note: This test must contain benchmarks and <code>rev_ticks/fwd_ticks</code> that are comparable to
* <code>RollingGroupTimeTest</code>
*/