`spark-custom-addons`

Experimental setup for adding dependencies to the `spark-custom` Docker images. Images are built for both Debian and Alpine.

This adds the following:

- AWS Hadoop SDK JAR
  - Appends `spark.hadoop.fs.s3a.impl org.apache.hadoop.fs.s3a.S3AFileSystem` to `spark-defaults.conf`
- Google Cloud Storage SDK JAR
- MariaDB JDBC Connector JAR
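
The `spark-defaults.conf` change listed above amounts to appending a single key-value line. The following is a minimal sketch, not the build's actual script: the helper name `append_s3a_impl` is hypothetical, and the conf file path is passed in by the caller because the location used inside the images is not stated here.

```shell
# Hypothetical helper: append the S3A filesystem setting to a Spark
# defaults file, skipping the write if the exact line is already present.
append_s3a_impl() {
    conf_file="$1"
    line="spark.hadoop.fs.s3a.impl org.apache.hadoop.fs.s3a.S3AFileSystem"
    # Create the file if it does not exist yet
    touch "$conf_file"
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF "$line" "$conf_file" || printf '%s\n' "$line" >> "$conf_file"
}
```

Assuming a conventional Spark layout, it could be called as `append_s3a_impl "${SPARK_HOME}/conf/spark-defaults.conf"`. The `grep` guard keeps repeated runs from adding duplicate lines.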

Additionally, all Alpine builds have `gcompat` and `libc6-compat` installed to prevent issues with glibc shared libraries.

AWS Java SDK Version Derivation

The AWS Java SDK version depends on the Hadoop version. An example of how to derive this version number for Hadoop 3.1.0 is here:
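
One possible way to look the version up from the command line is to read it out of a Hadoop Maven POM. The sketch below makes several assumptions: the helper name `pom_property` is hypothetical, the property is assumed to be called `aws-java-sdk.version` and to live in the `hadoop-project` POM, and the element is assumed to sit on a single line.

```shell
# Hypothetical helper: print the value of a <name>value</name> property
# from a Maven POM file. Assumes the whole element is on one line.
pom_property() {
    pom_file="$1"
    name="$2"
    # Capture the text between the opening and closing tags; first match only
    sed -n "s|.*<${name}>\(.*\)</${name}>.*|\1|p" "$pom_file" | head -n 1
}

# Example usage (requires network access; URL follows the Maven Central layout):
#   curl -fsSL -o hadoop-project.pom \
#     https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-project/3.1.0/hadoop-project-3.1.0.pom
#   pom_property hadoop-project.pom aws-java-sdk.version
```

If the property name or POM location differs for a given Hadoop release, the function prints nothing, so an empty result should be treated as "not found" rather than as a version.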

Linux users can download Tera CLI v0.4 from https://github.com/guangie88/tera-cli/releases and place it in `PATH`.

Otherwise, you will need `cargo`, which can be installed via `rustup`.

Once `cargo` is installed, run `cargo install tera-cli --version=^0.4.0`.
