
Commit

prepare release 0.22.0
davidrabinowitz committed Aug 10, 2021
1 parent 781b8e0 commit 47904b4
Showing 3 changed files with 30 additions and 21 deletions.
5 changes: 5 additions & 0 deletions CHANGES.md
@@ -1,5 +1,10 @@
# Release Notes

+## 0.22.0 - 2021-06-22
+* PR #404: Added support for BigNumeric
+* PR #430: Added HTTP and gRPC proxy support
+* Issue #273: Resolved the streaming write issue for Spark 3.x

## 0.21.1 - 2021-06-22
* PR #413: Pushing all filters to BigQuery Storage API
* Issue #412: Supporting WITH queries
43 changes: 23 additions & 20 deletions README.md
@@ -68,8 +68,8 @@ The latest version of the connector is publicly available in the following links

| version | Link |
| --- | --- |
-| Scala 2.11 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.21.1.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.21.1.jar)) |
-| Scala 2.12 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.21.1.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.21.1.jar)) |
+| Scala 2.11 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.22.0.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.22.0.jar)) |
+| Scala 2.12 | `gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.22.0.jar` ([HTTP link](https://storage.googleapis.com/spark-lib/bigquery/spark-bigquery-with-dependencies_2.12-0.22.0.jar)) |

The connector is also available from the
[Maven Central](https://repo1.maven.org/maven2/com/google/cloud/spark/)
@@ -78,8 +78,8 @@ repository. It can be used using the `--packages` option or the

| version | Connector Artifact |
| --- | --- |
-| Scala 2.11 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.21.1` |
-| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.21.1` |
+| Scala 2.11 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.22.0` |
+| Scala 2.12 | `com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.22.0` |

If you want to keep up with the latest version of the connector the following links can be used. Notice that for production
environments where the connector version should be pinned, one of the above links should be used.
@@ -703,13 +703,15 @@ Code examples:
**Scala:**

```
+import org.apache.spark.bigquery.BigNumeric
val df = spark.read
.format("bigquery")
-.load({project}.{dataset}.{table_name})
+.load("PROJECT.DATASET.TABLE")
val rows: Array[java.math.BigDecimal] = df
.collect()
-.map(row => row.get({columnPositionOfBignumeric}).asInstanceOf[BigNumeric].getNumber)
+.map(row => row.get("BIG_NUMERIC_COLUMN").asInstanceOf[BigNumeric].getNumber)
rows.foreach(value => System.out.println("BigNumeric value " + value.toPlainString))
```
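For intuition on why BigNumeric values surface as `java.math.BigDecimal` rather than doubles: a plain-Python sketch (using the standard `decimal` module, not the connector API) showing that a value with 38 digits of decimal scale, which I understand to be BIGNUMERIC's scale, cannot survive a round-trip through a 64-bit float:

```python
from decimal import Decimal

# An illustrative value with 38 digits after the decimal point
# (not taken from the docs above).
raw = "1.23456789012345678901234567890123456789"

exact = Decimal(raw)   # arbitrary precision: keeps every digit
approx = float(raw)    # 64-bit float: ~15-17 significant digits

assert str(exact) == raw              # Decimal preserves the full scale
assert Decimal(str(approx)) != exact  # the float round-trip lost digits
```

This is why the connector hands back `BigDecimal` objects instead of mapping the column to a Spark `DoubleType`.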
@@ -721,9 +723,9 @@ creating the job or added during runtime. See examples below:
1) Adding python files while launching pyspark
```
# use appropriate version for jar depending on the scala version
-pyspark --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.21.0.jar
---py-files gs://spark-lib/bigquery/spark-bigquery-support-0.21.0.zip
---files gs://spark-lib/bigquery/spark-bigquery-support-0.21.0.zip
+pyspark --jars gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.22.0.jar
+--py-files gs://spark-lib/bigquery/spark-bigquery-support-0.22.0.zip
+--files gs://spark-lib/bigquery/spark-bigquery-support-0.22.0.zip
```

2) Adding python files in Jupyter Notebook
@@ -732,9 +734,9 @@ from pyspark.sql import SparkSession
# use appropriate version for jar depending on the scala version
spark = SparkSession.builder\
.appName('BigNumeric')\
-.config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.21.0.jar')\
-.config('spark.submit.pyFiles', 'gs://spark-lib/bigquery/spark-bigquery-support-0.21.0.zip')\
-.config('spark.files', 'gs://spark-lib/bigquery/spark-bigquery-support-0.21.0.zip')\
+.config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.22.0.jar')\
+.config('spark.submit.pyFiles', 'gs://spark-lib/bigquery/spark-bigquery-support-0.22.0.zip')\
+.config('spark.files', 'gs://spark-lib/bigquery/spark-bigquery-support-0.22.0.zip')\
.getOrCreate()
```

@@ -743,10 +745,10 @@ spark = SparkSession.builder\
# use appropriate version for jar depending on the scala version
spark = SparkSession.builder\
.appName('BigNumeric')\
-.config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.21.0.jar')\
+.config('spark.jars', 'gs://spark-lib/bigquery/spark-bigquery-with-dependencies_2.11-0.22.0.jar')\
.getOrCreate()
-spark.sparkContext.addPyFile("gs://spark-lib/bigquery/spark-bigquery-support-0.21.0.zip")
+spark.sparkContext.addPyFile("gs://spark-lib/bigquery/spark-bigquery-support-0.22.0.zip")
```

Usage Example:
@@ -760,7 +762,8 @@ for row in data:
print(str(bigNumeric.number))
```

-Incase the above code throws ModuleNotFoundError, use the following code before reading the BigNumeric data.
+In case the above code throws ModuleNotFoundError, please add the following code
+before reading the BigNumeric data.

```
try:
```

@@ -848,7 +851,7 @@ using the following code:
```python
from pyspark.sql import SparkSession
spark = SparkSession.builder\
-.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.21.1")\
+.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.22.0")\
.getOrCreate()
df = spark.read.format("bigquery")\
.load("dataset.table")
@@ -857,15 +860,15 @@ df = spark.read.format("bigquery")\
**Scala:**
```scala
val spark = SparkSession.builder
-.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.21.1")
+.config("spark.jars.packages", "com.google.cloud.spark:spark-bigquery-with-dependencies_2.11:0.22.0")
.getOrCreate()
val df = spark.read.format("bigquery")
.load("dataset.table")
```
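The `spark.jars.packages` coordinate above encodes the Scala binary version in the artifact suffix. A small hypothetical helper (names are mine, not part of the connector) sketching that mapping:

```python
def connector_coordinate(scala_version: str, release: str = "0.22.0") -> str:
    """Map a full Scala version string to the matching Maven coordinate.

    The Scala *binary* version is the first two components, e.g.
    "2.12.14" -> "2.12".
    """
    binary = ".".join(scala_version.split(".")[:2])
    return ("com.google.cloud.spark:"
            f"spark-bigquery-with-dependencies_{binary}:{release}")

print(connector_coordinate("2.12.14"))
# com.google.cloud.spark:spark-bigquery-with-dependencies_2.12:0.22.0
```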

In case Spark cluster is using Scala 2.12 (it's optional for Spark 2.4.x,
mandatory in 3.0.x), then the relevant package is
-com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.21.1. In
+com.google.cloud.spark:spark-bigquery-with-dependencies_**2.12**:0.22.0. In
order to know which Scala version is used, please run the following code:

**Python:**
@@ -889,14 +892,14 @@ To include the connector in your project:
<dependency>
<groupId>com.google.cloud.spark</groupId>
<artifactId>spark-bigquery-with-dependencies_${scala.version}</artifactId>
-  <version>0.21.1</version>
+  <version>0.22.0</version>
</dependency>
```

### SBT

```sbt
-libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.21.1"
+libraryDependencies += "com.google.cloud.spark" %% "spark-bigquery-with-dependencies" % "0.22.0"
```

## FAQ
3 changes: 2 additions & 1 deletion build.sbt
@@ -21,7 +21,7 @@ lazy val grpcVersion = "1.37.1"
lazy val nettyVersion = "4.1.65.Final"
// should match the dependency in grpc-netty
lazy val nettyTcnativeVersion = "2.0.39.Final"
-lazy val artifactVersion = "0.21.2-SNAPSHOT"
+lazy val artifactVersion = "0.22.0"

lazy val commonSettings = Seq(
organization := "com.google.cloud.spark",
@@ -76,6 +76,7 @@ lazy val spark3support = (project in file("spark3support"))
.enablePlugins(BuildInfoPlugin)
.settings(
commonSettings,
+publishSettings,
name := "spark3support",
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.12" % "3.0.0" % "provided",
