Use Open Clover for code coverage analysis #48

Merged
Changes from 1 commit
68 commits
eb9dc8f
cleaning up the main branch for new version.
prodriguezdefino Jun 27, 2023
0695673
updates in ignore and readme files
prodriguezdefino Jun 27, 2023
e36cc44
prepping the pom addition, added parent's compliance tools
prodriguezdefino Jun 27, 2023
e167803
adding parent pom and the connector impl project pom
prodriguezdefino Jun 28, 2023
8043378
adding common functionalities
prodriguezdefino Jun 28, 2023
b455b4e
added the bigquery services wrapper and factories
prodriguezdefino Jun 28, 2023
27cd837
creates the split, its state and the enumerator state
prodriguezdefino Jun 28, 2023
4d4b60f
added configs, split reader and split assignment impls
prodriguezdefino Jun 28, 2023
0567b58
applying recommendations from sonartype-lift
prodriguezdefino Jun 28, 2023
9006714
adding the Datastream source implementation for BigQuery
prodriguezdefino Jun 28, 2023
d5d95bf
added Table API implementation for BigQuery
prodriguezdefino Jun 29, 2023
1263768
adding the example and shaded distro jar, fixes a NPE when the provid…
prodriguezdefino Jun 29, 2023
5743292
merge changes from master (previous pom deletion resolution)
prodriguezdefino Jul 6, 2023
cab9115
fixing the package name for the schema namespace
prodriguezdefino Jul 6, 2023
3375d54
Merge branch 'common_code_source' into bq_services_wrappers
prodriguezdefino Jul 10, 2023
849f769
merged main branch and took care of few lift comments
prodriguezdefino Jul 11, 2023
fd70b95
Merge branch 'bq_services_wrappers' into source_splits
prodriguezdefino Jul 11, 2023
0f18d14
merge from source_split
prodriguezdefino Jul 11, 2023
6b08119
fixing lift recommendations and spotless
prodriguezdefino Jul 11, 2023
5973d26
Merge branch 'split_assigner_and_reader' into source_functionality
prodriguezdefino Jul 11, 2023
2d9635d
Merge branch 'source_functionality' into table_api
prodriguezdefino Jul 11, 2023
135beeb
Merge branch 'table_api' into add_example_and_shadedsqljar
prodriguezdefino Jul 11, 2023
3ffb582
fixes namespace error in test and spotless
prodriguezdefino Jul 11, 2023
cc45b2f
Merge branch 'table_api' into add_example_and_shadedsqljar
prodriguezdefino Jul 11, 2023
f473d57
addressing comments from review
prodriguezdefino Jul 27, 2023
c178f83
merge from main
prodriguezdefino Aug 1, 2023
09eaaa4
merge from master
prodriguezdefino Aug 1, 2023
def3cc4
Merge branch 'source_splits' into split_assigner_and_reader
prodriguezdefino Aug 1, 2023
ceabb12
fixed type reference Int -> Long
prodriguezdefino Aug 1, 2023
0dc8875
Merge branch 'split_assigner_and_reader' into source_functionality
prodriguezdefino Aug 3, 2023
963c80b
Merge branch 'source_functionality' into table_api
prodriguezdefino Aug 3, 2023
5206ef1
Merge branch 'table_api' into add_example_and_shadedsqljar
prodriguezdefino Aug 3, 2023
1734bac
merge from main
prodriguezdefino Aug 8, 2023
e96ff59
addressing comments from review
prodriguezdefino Aug 8, 2023
3b78492
Merge branch 'split_assigner_and_reader' into source_functionality
prodriguezdefino Aug 8, 2023
c492f02
improved hashcode and equals readability
prodriguezdefino Aug 8, 2023
6a05498
Merge branch 'split_assigner_and_reader' into source_functionality
prodriguezdefino Aug 8, 2023
820fb3b
Merge branch 'source_functionality' into table_api
prodriguezdefino Aug 8, 2023
dd5be94
improve readibility for hashcode and equals
prodriguezdefino Aug 8, 2023
fbd07c6
Merge branch 'table_api' into add_example_and_shadedsqljar
prodriguezdefino Aug 8, 2023
9aae0af
changed tests to use google-truth instead of junit or assertj asserti…
prodriguezdefino Aug 9, 2023
61e5644
Merge branch 'split_assigner_and_reader' into source_functionality
prodriguezdefino Aug 9, 2023
517de82
added google-truth to tests
prodriguezdefino Aug 9, 2023
d916a1c
Merge branch 'source_functionality' into table_api
prodriguezdefino Aug 9, 2023
11fce0d
added google-truth to tests
prodriguezdefino Aug 9, 2023
099078e
Merge branch 'table_api' into add_example_and_shadedsqljar
prodriguezdefino Aug 10, 2023
9da7706
merge from master after #44
prodriguezdefino Oct 31, 2023
a10470c
Merge branch 'source_functionality' into table_api
prodriguezdefino Oct 31, 2023
6b28a0c
removing guava dependency from file
prodriguezdefino Nov 1, 2023
7160ff9
merge from master
prodriguezdefino Nov 1, 2023
87780c5
adding serializable autovalue annotation to avoid storing an Optional…
prodriguezdefino Nov 2, 2023
3f4d1be
addressing comments from review
prodriguezdefino Nov 2, 2023
c8eb789
addressed comments from review
prodriguezdefino Nov 2, 2023
fd79610
Merge branch 'table_api' into add_example_and_shadedsqljar
prodriguezdefino Nov 2, 2023
cce6567
partition list in 1024 chunks
prodriguezdefino Nov 3, 2023
f36f91f
Merge branch 'table_api' into add_example_and_shadedsqljar
prodriguezdefino Nov 3, 2023
5260073
Address review comments
jayehwhyehentee Nov 23, 2023
936076b
adding the example and shaded distro jar, fixes a NPE when the provid…
prodriguezdefino Jun 29, 2023
7977576
Merge branch 'add_example_and_shadedsqljar' of https://github.com/pro…
jayehwhyehentee Nov 24, 2023
1460af6
added covertura analysis through open clover and improved few tests
prodriguezdefino Jun 30, 2023
a8499c3
removed repeated mvn reference from command
prodriguezdefino Jul 11, 2023
229f33d
Merge branch 'main' into use_openclover_coverage
jayehwhyehentee Nov 24, 2023
68d0bf9
Merge branch 'main' into add_example_and_shadedsqljar
jayehwhyehentee Nov 24, 2023
608e3c7
Merge branch 'add_example_and_shadedsqljar' into use_openclover_coverage
jayehwhyehentee Nov 24, 2023
c9f73ee
Address review comments
jayehwhyehentee Nov 27, 2023
8c9cc26
Merge branch 'add_example_and_shadedsqljar' into use_openclover_coverage
jayehwhyehentee Nov 27, 2023
27c34d0
Make checkpoint interval configurable in example jar
jayehwhyehentee Nov 27, 2023
fa4f5a4
Merge branch 'add_example_and_shadedsqljar' into use_openclover_coverage
jayehwhyehentee Nov 27, 2023
Address review comments
jayehwhyehentee committed Nov 27, 2023
commit c9f73ee25423b272ad7cd1068c61cbc907680adf
@@ -32,19 +32,27 @@
 import org.slf4j.LoggerFactory;

 /**
- * A simple BigQuery table read example with Flink's DataStream API.
+ * A simple Flink application using DataStream API and BigQuery connector.
  *
- * <p>The Flink pipeline will try to read the specified BigQuery table, potentially limiting the
- * element count to the specified row restriction and limit count, returning {@link GenericRecord}
- * representing the rows, to finally prints out some aggregated values given the provided payload's
- * field.
+ * <p>The Flink pipeline will try to read the specified BigQuery table, limiting the element count
+ * to the specified row restriction and limit, returning {@link GenericRecord} representing the
+ * rows, and finally print out some aggregated values given the provided payload's field. The
+ * sequence of operations in this pipeline is: <i>source > flatMap > keyBy > max > print</i>.
  *
- * <p>Note on row restriction: In case of including a restriction with a temporal reference,
- * something like {@code "TIMESTAMP_TRUNC(ingestion_timestamp, HOUR) = '2023-06-20 19:00:00'"}, and
- * launching the job from Flink's Rest API is known the single quotes are not supported and will
- * make the pipeline fail. As a workaround for that case using \u0027 as a replacement will make it
- * work, example {@code "TIMESTAMP_TRUNC(ingestion_timestamp, HOUR) = \u00272023-06-20
- * 19:00:00\u0027"}.
+ * <p>Flink command line format to execute this application: <br>
+ * flink run {additional runtime params} {path to this jar}/BigQueryExample.jar <br>
+ * --gcp-project {required; project ID which contains the BigQuery table} <br>
+ * --bq-dataset {required; name of BigQuery dataset containing the desired table} <br>
+ * --bq-table {required; name of BigQuery table to read} <br>
+ * --agg-prop {required; record property to aggregate in Flink job} <br>
+ * --restriction {optional; SQL-like filter applied at the BigQuery table before reading} <br>
+ * --limit {optional; maximum records to read from BigQuery table}
+ *
+ * <p>Note on row restriction: In case a restriction relies on temporal reference, something like
+ * {@code "TIMESTAMP_TRUNC(ingestion_timestamp, HOUR) = '2023-06-20 19:00:00'"}, and if launching
+ * the job from Flink's Rest API, a known issue is that single quotes are not supported and will
+ * cause the pipeline to fail. As a workaround, using \u0027 instead of the quotes will work. For
+ * example {@code "TIMESTAMP_TRUNC(ingestion_timestamp, HOUR) = \u00272023-06-20 19:00:00\u0027"}.
  */
 public class BigQueryExample {
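
The Javadoc note on row restrictions describes replacing single quotes with the text \u0027 before submitting the job through Flink's REST API. A minimal sketch of that substitution follows. The class `RestrictionEscaper` and its method `escapeSingleQuotes` are hypothetical names introduced here for illustration; they are not part of this commit or of the connector, and the sketch assumes the REST payload should carry the six literal characters of the escape sequence, as in the Javadoc example.

```java
/** Hypothetical helper for illustration; not part of the connector or this PR. */
final class RestrictionEscaper {

    private RestrictionEscaper() {}

    /**
     * Replaces every single quote in the restriction with the six-character
     * sequence backslash, u, 0, 0, 2, 7, mirroring the Javadoc workaround.
     */
    static String escapeSingleQuotes(String restriction) {
        // The doubled backslash yields one literal backslash in the result and
        // keeps the compiler from reading the sequence as a Unicode escape.
        return restriction.replace("'", "\\u0027");
    }

    public static void main(String[] args) {
        String restriction =
                "TIMESTAMP_TRUNC(ingestion_timestamp, HOUR) = '2023-06-20 19:00:00'";
        // Prints the restriction with both quotes replaced by the escape text.
        System.out.println(escapeSingleQuotes(restriction));
    }
}
```

A restriction without single quotes passes through unchanged, so the helper could be applied unconditionally before building the REST request.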

@@ -59,8 +67,8 @@ public static void main(String[] args) throws Exception {
                     "Missing parameters!\n"
                             + "Usage: flink run <additional runtime params> BigQuery.jar"
                             + " --gcp-project <gcp-project> --bq-dataset <dataset name>"
-                            + " --bq-table <table name> --agg-prop <payload's property>"
-                            + " --restriction <single-quoted string with row predicate>"
+                            + " --bq-table <table name> --agg-prop <record property>"
+                            + " --restriction <optional: row filter predicate>"
                             + " --limit <optional: limit records returned>");
             return;
         }
@@ -114,7 +122,7 @@ private static void runFlinkJob(

         env.fromSource(bqSource, WatermarkStrategy.noWatermarks(), "BigQuerySource")
                 .flatMap(new FlatMapper(recordPropertyToAggregate))
-                .keyBy(t -> t.f0)
+                .keyBy(mappedTuple -> mappedTuple.f0)
                 .max("f1")
                 .print();
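
For readers less familiar with the `keyBy > max` step in the pipeline above, here is a plain-Java sketch of what a keyed rolling maximum computes. It does not use Flink or the connector. It assumes the mapper emits (key, value) pairs where `f0` is the grouping key and `f1` is an integer, and that the job runs in streaming execution mode, where the aggregate is emitted once per input element. The class `RollingMaxSketch` and the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration only; it does not use Flink or the BigQuery connector. */
final class RollingMaxSketch {

    private RollingMaxSketch() {}

    /**
     * For each input pair (key = f0, value = f1), emits the key together with
     * the largest value seen so far for that key, in input order.
     */
    static List<Map.Entry<String, Integer>> keyByAndMax(List<Map.Entry<String, Integer>> input) {
        Map<String, Integer> maxPerKey = new HashMap<>();
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (Map.Entry<String, Integer> pair : input) {
            // merge stores the value for a new key, or the larger of old and new.
            Integer current = maxPerKey.merge(pair.getKey(), pair.getValue(), Integer::max);
            emitted.add(Map.entry(pair.getKey(), current));
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input =
                List.of(Map.entry("a", 1), Map.entry("b", 5), Map.entry("a", 3), Map.entry("a", 2));
        // One line per input element, each showing the running maximum for its key.
        keyByAndMax(input).forEach(System.out::println);
    }
}
```

The last input pair carries a value of 2, but the emitted entry for key "a" stays at 3 because the maximum only ever grows; this is why the printed output of such a job can contain repeated values for the same key.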