Contains a deployment of a custom Apache Tika 2.x server as a Docker image. Refer to https://github.com/puthurr/tika-fork for more details.
Based on Apache Tika version 2.7.0.
For breaking changes in Apache Tika 2.x, refer to the migration documentation: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0
To build the image locally:
docker build -f Dockerfile -t puthurr/tika2:2.7.0-20230704 .
To build the image in an Azure Container Registry (ACR):
ACR_NAME=<registry-name>
az acr build --image puthurr/tika2:2.7.0-20230704 --registry $ACR_NAME --file Dockerfile .
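The two build commands share the same image tag, so it can help to derive it once. The following is a minimal sketch, not part of the repository: `image_tag` is a hypothetical helper, and the `<version>-<yyyymmdd>` layout is only inferred from the tag `2.7.0-20230704` used above. The script prints the commands instead of running them so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose the image tag from a Tika version and a build date.
# The <version>-<yyyymmdd> layout is an assumption inferred from the existing tag.
image_tag() {
  local version="$1" build_date="$2"
  printf 'puthurr/tika2:%s-%s\n' "$version" "$build_date"
}

TIKA_VERSION=2.7.0
IMAGE="$(image_tag "$TIKA_VERSION" "$(date +%Y%m%d)")"

# Print the commands rather than executing them.
echo "docker build -f Dockerfile -t $IMAGE ."
echo "az acr build --image $IMAGE --registry \$ACR_NAME --file Dockerfile ."
```

Passing a fixed date, for example `image_tag 2.7.0 20230704`, reproduces the tag used in the commands above.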
Java 8 is downloaded from https://www.oracle.com/java/technologies/downloads/#java8
We use the Linux x64 Server JRE distribution, server-jre-8u311-linux-x64.tar.gz.
The checksums are published at https://www.oracle.com/a/tech/docs/8u311checksum.html
server-jre-8u311-linux-x64.tar.gz
sha256: 4132d53f500fea109386a5734dc156468558d792082cfbd39f0a097e6f55e710
md5: 01c29e7adf7eae704a3d04f0d353f624
ENV JAVA_VERSION=1.8.0_311 \
    JAVA_PKG=server-jre-8u311-linux-x64.tar.gz \
    JAVA_SHA256=4132d53f500fea109386a5734dc156468558d792082cfbd39f0a097e6f55e710 \
    JAVA_HOME=/usr/java/jdk-8
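Before building, the downloaded archive can be checked against the published SHA-256 digest. The following is a minimal sketch, assuming GNU coreutils `sha256sum` is available and the archive sits in the current directory. `verify_sha256` is a hypothetical helper, not something the repository provides; the variable names mirror the `ENV` values above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit status 0) only if the file matches the digest.
verify_sha256() {
  local file="$1" expected="$2"
  # sha256sum --check reads "<digest>  <filename>" lines from standard input;
  # --status suppresses output and reports the result via the exit status.
  echo "${expected}  ${file}" | sha256sum --check --status
}

JAVA_PKG=server-jre-8u311-linux-x64.tar.gz
JAVA_SHA256=4132d53f500fea109386a5734dc156468558d792082cfbd39f0a097e6f55e710

# Only check when the archive is actually present.
if [ -f "$JAVA_PKG" ]; then
  if verify_sha256 "$JAVA_PKG" "$JAVA_SHA256"; then
    echo "checksum OK: $JAVA_PKG"
  else
    echo "checksum MISMATCH: $JAVA_PKG" >&2
  fi
fi
```

A mismatch usually means an incomplete download or a different JRE build than the one pinned in `JAVA_SHA256`.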