Releases · G-Research/spark-dgraph-connector
v0.8.0 (Spark 2.4) - 2022-01-19
v0.7.0 (Spark 2.4) - 2021-10-02
Fixed
- Supports the latest Dgraph release, 21.03.0 (#101).
v0.7.0 (Spark 3.1) - 2021-10-01
v0.7.0 (Spark 3.0) - 2021-10-01
v0.6.0 (Spark 3.1) - 2021-03-05
Added
- Adds support for reading string predicates with language tags like `<http://www.w3.org/2000/01/rdf-schema#label@en>` (issue #63).
  This works with any source and mode except the node source in wide mode. Note that reading into GraphFrames is based on the wide mode, so only untagged strings can be read there. Filter pushdown is not supported for multi-language predicates yet (issue #68). See the sketch after this list.
- Adds a readable exception that suggests next steps when gRPC fails with the `RESOURCE_EXHAUSTED` code.
- A missing `maxLeaseId` in the cluster state response defaults to `1000L` to avoid an exception.
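A minimal sketch of reading language-tagged strings via the triples source. The fully-qualified source name matches the package given in the v0.5.0 notes below; the tagged predicate name `label@en` and the `predicate` column are assumptions, so check the README for the exact shape:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("dgraph-language-tags").getOrCreate()
import spark.implicits._

// read all triples from the Dgraph alpha at localhost:9080
val triples = spark.read
  .format("uk.co.gresearch.spark.dgraph.triples")
  .load("localhost:9080")

// assumption: a tagged string surfaces under its tagged predicate name,
// so a tag can be selected by filtering the predicate column
triples.where($"predicate" === "label@en").show()
```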
Changed
- Improves predicate partitioning on projection pushdown so that it creates full partitions.
- Fixes a bug where predicate value filters were not pushed down to Dgraph correctly, causing incorrect results (issue #82). See the sketch after this list.
- Fixes a bug in reading the `geo` and `password` data types.
- Tests against Dgraph 20.03, 20.07 and 20.11.
- Moved Java Dgraph client to 20.11.0.
- Upgraded all dependencies to latest versions.
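A sketch of the kind of query affected by the pushdown fix (issue #82): a value filter on a wide-mode node column, which the connector now pushes down to Dgraph correctly. The mode option key and the `name` column are assumptions:

```scala
// wide-mode nodes expose one column per predicate; a value filter on such
// a column is pushed down to Dgraph
val nodes = spark.read
  .format("uk.co.gresearch.spark.dgraph.nodes")
  .option("dgraph.nodes.mode", "wide")  // assumption: option key, see README
  .load("localhost:9080")

// this predicate value filter now yields correct results
nodes.where($"name" === "Alice").show()
```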
v0.6.0 (Spark 3.0) - 2021-03-05
Added
- Adds support for reading string predicates with language tags like `<http://www.w3.org/2000/01/rdf-schema#label@en>` (issue #63).
  This works with any source and mode except the node source in wide mode. Note that reading into GraphFrames is based on the wide mode, so only untagged strings can be read there. Filter pushdown is not supported for multi-language predicates yet (issue #68).
- Adds a readable exception that suggests next steps when gRPC fails with the `RESOURCE_EXHAUSTED` code.
- A missing `maxLeaseId` in the cluster state response defaults to `1000L` to avoid an exception.
Changed
- Improves predicate partitioning on projection pushdown so that it creates full partitions.
- Fixes a bug where predicate value filters were not pushed down to Dgraph correctly, causing incorrect results (issue #82).
- Fixes a bug in reading the `geo` and `password` data types.
- Tests against Dgraph 20.03, 20.07 and 20.11.
- Moved Java Dgraph client to 20.11.0.
- Upgraded all dependencies to latest versions.
v0.6.0 (Spark 2.4) - 2021-03-05
Added
- Adds support for reading string predicates with language tags like `<http://www.w3.org/2000/01/rdf-schema#label@en>` (issue #63).
  This works with any source and mode except the node source in wide mode. Note that reading into GraphFrames is based on the wide mode, so only untagged strings can be read there. Filter pushdown is not supported for multi-language predicates yet (issue #68).
- Adds a readable exception that suggests next steps when gRPC fails with the `RESOURCE_EXHAUSTED` code.
- A missing `maxLeaseId` in the cluster state response defaults to `1000L` to avoid an exception.
Changed
- Improves predicate partitioning on projection pushdown so that it creates full partitions.
- Fixes a bug where predicate value filters were not pushed down to Dgraph correctly, causing incorrect results (issue #82).
- Fixes a bug in reading the `geo` and `password` data types.
- Tests against Dgraph 20.03, 20.07 and 20.11.
- Moved Java Dgraph client to 20.11.0.
- Upgraded all dependencies to latest versions.
v0.5.0 (Spark 3.0) - 2020-10-21
Added
- Optionally reads all partitions within the same transaction. This guarantees a consistent snapshot of the graph (issue #6).
  However, concurrent mutations reduce the lifetime of such a transaction and cause an exception once that lifespan is exceeded. See the sketch after this list.
- Adds a Python API that mirrors the Scala API. The README.md fully documents how to load Dgraph data in PySpark.
- Fixed dependency conflicts between connector dependencies and Spark by shading the Java Dgraph client and all its dependencies.
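A sketch of opting into the single-transaction read. The option key and value below are assumptions, labeled as such; the README documents the exact names:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("dgraph-transaction").getOrCreate()

// assumption: a read option switches all partitions into one transaction;
// key and value here are illustrative, check the README for the real ones
val triples = spark.read
  .format("uk.co.gresearch.spark.dgraph.triples")
  .option("dgraph.transaction.mode", "read")
  .load("localhost:9080")
```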
Changed
- Refactored the connector API: renamed the `spark.read.dgraph*` methods to `spark.read.dgraph.*`. See the example after this list.
- Moved the `triples`, `edges` and `nodes` sources from package `uk.co.gresearch.spark.dgraph.connector` to `uk.co.gresearch.spark.dgraph`.
- Moved the Java Dgraph client to 20.03.1 and the Dgraph test cluster to 20.07.0.
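The rename and package move in practice, as a sketch: the fully-qualified source names come straight from the notes above, while the implicit import enabling `spark.read.dgraph.*` is an assumption based on the README:

```scala
import org.apache.spark.sql.SparkSession
import uk.co.gresearch.spark.dgraph.connector._

val spark = SparkSession.builder().appName("dgraph-read").getOrCreate()

// long form: the moved, fully-qualified source names used as Spark formats
val edges = spark.read
  .format("uk.co.gresearch.spark.dgraph.edges")
  .load("localhost:9080")

// short form after the rename (assumption: enabled by the connector import)
val nodes = spark.read.dgraph.nodes("localhost:9080")
```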
v0.5.0 (Spark 2.4) - 2020-10-21
Added
- Load data from a Dgraph cluster as a GraphFrames `GraphFrame`. See the sketch after this list.
- Optionally reads all partitions within the same transaction. This guarantees a consistent snapshot of the graph (issue #6).
  However, concurrent mutations reduce the lifetime of such a transaction and cause an exception once that lifespan is exceeded.
- Adds a Python API that mirrors the Scala API. The README.md fully documents how to load Dgraph data in PySpark.
- Fixed dependency conflicts between connector dependencies and Spark by shading the Java Dgraph client and all its dependencies.
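A sketch of loading the graph as a `GraphFrame`. The `graphframes` package and reader method name are assumptions that mirror the `spark.read.dgraph.*` pattern from these notes; the README has the exact call:

```scala
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame
import uk.co.gresearch.spark.dgraph.graphframes._

val spark = SparkSession.builder().appName("dgraph-graphframes").getOrCreate()

// assumption: the graphframes package adds a matching reader method
val graph: GraphFrame = spark.read.dgraph.graphframes("localhost:9080")
graph.vertices.show()
graph.edges.show()
```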
Changed
- Refactored the connector API: renamed the `spark.read.dgraph*` methods to `spark.read.dgraph.*`.
- Moved the `triples`, `edges` and `nodes` sources from package `uk.co.gresearch.spark.dgraph.connector` to `uk.co.gresearch.spark.dgraph`.
- Moved the Java Dgraph client to 20.03.1 and the Dgraph test cluster to 20.07.0.
v0.4.2 (Spark 3.0) - 2020-07-28
Fixed
- Fixed dependency conflicts between connector dependencies and Spark.