Bigtop

A JDK is needed to run ./gradlew toolchain, even though that task itself installs a JDK. Alternatively, puppet apply can be run directly, as the toolchain task does:

$ sudo puppet apply --modulepath="/home/admin/srcs/bigtop:/etc/puppet/modules:/usr/share/puppet/modules:/etc/puppetlabs/code/modules:/etc/puppet/code/modules" -e "include bigtop_toolchain::installer"
sudo yum groupinstall 'Development Tools'
git clone https://github.com/apache/bigtop
cd bigtop
sudo bigtop_toolchain/bin/puppetize.sh
./gradlew toolchain-puppetmodules
./gradlew toolchain

Docker is used for testing deployment and running smoke-tests:

sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce docker-ce-cli containerd.io
sudo usermod -aG docker centos
sudo systemctl start docker
$ cd provisioner/docker
$ ./docker-hadoop.sh \
    --create 1 \
    --image bigtop/puppet:trunk-centos-8 \
    --memory 16g \
    --repo file:///bigtop-home/output \
    --disable-gpg-check \
    --stack hdfs,yarn,mapreduce
$ cd provisioner/docker
$ ./docker-hadoop.sh \
    --create 1 \
    --image bigtop/puppet:trunk-ubuntu-22.04 \
    --docker-compose-yml docker-compose-cgroupv2.yml \
    --docker-compose-plugin \
    --memory 16g \
    --repo file:///bigtop-home/output/apt \
    --disable-gpg-check \
    --stack hdfs,yarn,mapreduce

bigtop::bigtop_repo_apt_key must match the public key used for packaging. Otherwise, add --disable-gpg-check.

For DEB, available platforms are amd64, aarch64 and ppc64el.

$ cd provisioner/docker
$ ./docker-hadoop.sh \
    --create 1 \
    --image bigtop/puppet:3.2.1-ubuntu-22.04 \
    --docker-compose-yml docker-compose-cgroupv2.yml \
    --docker-compose-plugin \
    --memory 16g \
    --repo http://repos.bigtop.apache.org/releases/3.2.1/ubuntu/22.04/amd64 \
    --stack hdfs,yarn,mapreduce

For RPM, available platforms are x86_64, aarch64 and ppc64le.

$ cd provisioner/docker
$ ./docker-hadoop.sh \
    --create 1 \
    --image bigtop/puppet:3.1.1-rockylinux-8 \
    --docker-compose-yml docker-compose-cgroupv2.yml \
    --docker-compose-plugin \
    --memory 16g \
    --repo http://repos.bigtop.apache.org/releases/3.1.1/rockylinux/8/x86_64 \
    --stack hdfs,yarn,mapreduce,hbase

rockylinux-8:

# puppet --version
6.26.0

rockylinux-9:

# puppet --version
7.27.0

openeuler-22.03:

# puppet --version
7.22.0

fedora-38:

# puppet --version
8.3.1

fedora-40:

# puppet --version
8.5.1

ubuntu-22.04:

# puppet --version
5.5.22

ubuntu-24.04:

# puppet --version
8.4.0

debian-11:

# puppet --version
5.5.22

debian-12:

# puppet --version
7.23.0

Setting the environment variable DH_VERBOSE to a non-empty value makes dpkg-buildpackage more verbose. In Bigtop, dpkg-buildpackage is called in the following part of packages.gradle:

exec {
  workingDir DEB_BLD_DIR
  commandLine "dpkg-buildpackage -uc -us -sa -S".split(' ')
  environment "DH_VERBOSE", "1"
}
$ sudo /bin/bash -x -c 'export SHELLOPTS && SYSTEMCTL_SKIP_REDIRECT=true /etc/init.d/hadoop-httpfs start'

dh_strip_nondeterminism takes quite a long time when packaging hadoop as DEB. Adding an empty override_dh_strip_nondeterminism target to bigtop-packages/src/deb/hadoop/rules skips it:

override_dh_strip_nondeterminism:

Adding the local repository created by ./gradlew repo:

$ sudo bash -c 'echo "deb [trusted=yes] file:///home/admin/srcs/bigtop/output/apt bigtop contrib" > /etc/apt/sources.list.d/bigtop-home_output.list'
$ sudo apt update
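
The source line above hard-codes the path of one checkout. A minimal sketch that derives the same line from an arbitrary checkout path follows; the function name bigtop_apt_source_line and its parameter are illustrative and not part of Bigtop:

```shell
# Print the apt source line for a local Bigtop build tree.
# $1: absolute path of the Bigtop checkout (illustrative parameter).
bigtop_apt_source_line() {
  printf 'deb [trusted=yes] file://%s/output/apt bigtop contrib\n' "$1"
}

# Example (needs root, so it is left commented out):
# bigtop_apt_source_line /home/admin/srcs/bigtop \
#   | sudo tee /etc/apt/sources.list.d/bigtop-home_output.list
```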

You can leverage Docker for builds with the *-pkg-ind and repo-ind tasks:

$ ./gradlew hadoop-pkg-ind repo-ind -POS=ubuntu-22.04 -Pprefix=trunk -Pdocker-run-option="--privileged" -Pmvn-cache-volume=true
  • -Pdocker-run-option="--privileged" is currently needed on Fedora-35 and Ubuntu-22.04 (depending on the version of systemd).
  • -Pmvn-cache-volume=true attaches a Docker volume to reuse the local Maven repository (~/.m2), which makes repeated builds faster.
  • -Dbuildwithdeps=true, which triggers packaging of the dependencies of hadoop (such as bigtop-utils and zookeeper), cannot currently be used with *-ind tasks.
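
Since buildwithdeps is not available here, each component has to be built with its own *-pkg-ind invocation. The sketch below only prints the command line for a given component and OS so it can be reviewed before running; pkg_ind_command is a hypothetical helper name, and the fixed options are copied from the example above:

```shell
# Print the Gradle command line that builds one component in Docker.
# $1: component name (e.g. hadoop), $2: OS label (e.g. ubuntu-22.04).
# The command is printed rather than executed.
pkg_ind_command() {
  printf './gradlew %s-pkg-ind repo-ind -POS=%s -Pprefix=trunk -Pdocker-run-option="--privileged" -Pmvn-cache-volume=true\n' "$1" "$2"
}

# Example: print the commands for hadoop and two of its dependencies.
# for c in bigtop-utils zookeeper hadoop; do pkg_ind_command "$c" ubuntu-22.04; done
```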

You can deploy a cluster and run smoke-tests in containers with the Docker provisioner, which requires docker-compose:

$ cd provisioner/docker
$ ./docker-hadoop.sh \
    --create 3 \
    --image bigtop/puppet:trunk-ubuntu-22.04 \
    --docker-compose-yml docker-compose-cgroupv2.yml \
    --docker-compose-plugin \
    --memory 8g \
    --repo file:///bigtop-home/output/apt \
    --disable-gpg-check \
    --stack hdfs,yarn,mapreduce \
    --smoke-tests hdfs,yarn,mapreduce
  • --docker-compose-yml docker-compose-cgroupv2.yml is needed on cgroup v2.
  • --docker-compose-plugin uses the docker compose plugin instead of the standalone docker-compose.
  • For RPM instead of DEB, use --repo file:///bigtop-home/output.
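
The options in the list above depend on the host and on the package format. A rough sketch of choosing them follows; provisioner_options is a hypothetical helper, and it assumes that a cgroup v2 host reports the filesystem type cgroup2fs for /sys/fs/cgroup:

```shell
# Print the host-dependent docker-hadoop.sh options.
# $1: package format, "deb" or "rpm".
# $2: filesystem type of /sys/fs/cgroup, e.g. from: stat -fc %T /sys/fs/cgroup
provisioner_options() {
  opts=""
  # A cgroup v2 (unified hierarchy) host reports "cgroup2fs".
  if [ "$2" = "cgroup2fs" ]; then
    opts="--docker-compose-yml docker-compose-cgroupv2.yml "
  fi
  case "$1" in
    deb) opts="${opts}--repo file:///bigtop-home/output/apt" ;;
    rpm) opts="${opts}--repo file:///bigtop-home/output" ;;
    *)   echo "unknown package format: $1" >&2; return 1 ;;
  esac
  printf '%s\n' "$opts"
}

# Example:
# provisioner_options deb "$(stat -fc %T /sys/fs/cgroup)"
```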

You can log in to a node and inspect files if needed:

$ ./docker-hadoop.sh -dcp --exec 1 /bin/bash

Kerberos authentication can be enabled by `adding hiera variables to the generated site.yaml <https://github.com/apache/bigtop/blob/rel/3.3.0/provisioner/docker/docker-hadoop.sh#L154-L162>`_

 $ git diff
 diff --git a/provisioner/docker/docker-hadoop.sh b/provisioner/docker/docker-hadoop.sh
 index 38ece152..feadd8f7 100755
 --- a/provisioner/docker/docker-hadoop.sh
 +++ b/provisioner/docker/docker-hadoop.sh
 @@ -172,6 +172,13 @@ bigtop::bigtop_repo_gpg_check: $gpg_check
  hadoop_cluster_node::cluster_components: $3
  hadoop_cluster_node::cluster_nodes: [$node_list]
  hadoop::common_yarn::yarn_resourcemanager_scheduler_class: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
 +hadoop::hadoop_security_authentication: "kerberos"
 +kerberos::krb_site::domain: "bigtop.apache.org"
 +kerberos::krb_site::realm: "BIGTOP.APACHE.ORG"
 +kerberos::krb_site::kdc_server: "%{hiera('bigtop::hadoop_head_node')}"
 +kerberos::krb_site::kdc_port: "88"
 +kerberos::krb_site::admin_port: "749"
 +kerberos::krb_site::keytab_export_dir: "/var/lib/bigtop_keytabs"
  EOF
}

Then add kerberos to the list of stacks:

$ ./docker-hadoop.sh \
    --create 1 \
    --image bigtop/puppet:trunk-rockylinux-8 \
    --docker-compose-yml docker-compose-cgroupv2.yml \
    --docker-compose-plugin \
    --memory 16g \
    --repo http://repos.bigtop.apache.org/releases/3.0.1/centos/8/x86_64 \
    --stack kerberos,hdfs,yarn,mapreduce
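
To use a domain other than bigtop.apache.org, the same hiera keys can be generated instead of typed by hand. This is a sketch only: kerberos_hiera is a hypothetical helper, and deriving the realm as the upper-cased domain is an assumption that merely follows the example above:

```shell
# Print the Kerberos hiera settings for one DNS domain.
# $1: DNS domain (illustrative parameter); the realm is assumed to be
#     the upper-case form of the domain, as in the example above.
kerberos_hiera() {
  domain=$1
  realm=$(printf '%s' "$domain" | tr '[:lower:]' '[:upper:]')
  cat <<EOF
hadoop::hadoop_security_authentication: "kerberos"
kerberos::krb_site::domain: "$domain"
kerberos::krb_site::realm: "$realm"
kerberos::krb_site::kdc_server: "%{hiera('bigtop::hadoop_head_node')}"
kerberos::krb_site::kdc_port: "88"
kerberos::krb_site::admin_port: "749"
kerberos::krb_site::keytab_export_dir: "/var/lib/bigtop_keytabs"
EOF
}
```

The output is meant to be pasted into the heredoc of docker-hadoop.sh that writes site.yaml.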

Example of rockylinux-8 built by https://ci.bigtop.apache.org/job/Bigtop-3.2.1-aarch64/

BASEARCH corresponds to the Yum variable $basearch. Possible values are x86_64, aarch64 and ppc64le. It is also used in the name of the Jenkins job.

PLATFORM is the label assigned to the Jenkins agent. Possible values here are amd64-slave, aarch64-slave and ppc64le-slave.

$ export GPG_TTY=$(tty)
$ export VERSION=3.3.0
$ export OS=rockylinux
$ export OSVER=8
$ export BASEARCH=aarch64
$ export PLATFORM=aarch64-slave
$ mkdir -p releases/${VERSION}/${OS}/${OSVER}/${BASEARCH}
$ cd releases/${VERSION}/${OS}/${OSVER}/${BASEARCH}
$ for product in bigtop-groovy bigtop-jsvc bigtop-select bigtop-utils alluxio flink hadoop hbase hive kafka livy phoenix ranger solr spark tez zeppelin zookeeper
  do
    rm -rf ${product} &&
    curl -L -o ${product}.zip https://ci.bigtop.apache.org/job/Bigtop-${VERSION}-${BASEARCH}/DISTRO=${OS}-${OSVER},PLATFORM=${PLATFORM},PRODUCT=${product}/lastSuccessfulBuild/artifact/*zip*/archive.zip &&
    jar xf ${product}.zip &&
    mv archive/output/${product} . &&
    rmdir -p archive/output &&
    rm ${product}.zip
  done
$ find . -name '*.rpm' | xargs rpm --define '_gpg_name Masatake Iwasaki' --addsign

$ rm -rf repodata
$ createrepo .
$ gpg --detach-sign --armor repodata/repomd.xml

$ aws --profile iwasakims s3 sync --acl public-read . s3://repos.bigtop.apache.org/releases/${VERSION}/${OS}/${OSVER}/${BASEARCH}/
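
The artifact URL in the download loop above is assembled from the exported variables. A minimal sketch that prints it for one product can help verify the variables before starting a long download; artifact_url is a hypothetical helper name:

```shell
# Print the Jenkins URL of the zipped artifacts for one product.
# Reads VERSION, BASEARCH, OS, OSVER and PLATFORM from the environment.
# $1: product name (e.g. hadoop).
artifact_url() {
  printf 'https://ci.bigtop.apache.org/job/Bigtop-%s-%s/DISTRO=%s-%s,PLATFORM=%s,PRODUCT=%s/lastSuccessfulBuild/artifact/*zip*/archive.zip\n' \
    "$VERSION" "$BASEARCH" "$OS" "$OSVER" "$PLATFORM" "$1"
}

# Example: quote the result so the shell does not try to expand *zip*.
# curl -L -o hadoop.zip "$(artifact_url hadoop)"
```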

Example of debian-11 built by https://ci.bigtop.apache.org/job/Bigtop-3.2.1-x86_64/

ARCH is used as the $(ARCH) of deb. Possible values are amd64, arm64 and ppc64el, as shown by dpkg-architecture -L. Note that it is ppc64el for Deb packaging, while ppc64le is used for RPM packaging.

BASEARCH corresponds to the Yum variable $basearch. Possible values are x86_64, aarch64 and ppc64le. Since it is used in the name of the Jenkins jobs, it must be defined for Deb packaging too.

PLATFORM is the label assigned to the Jenkins agent. Possible values here are amd64-slave, aarch64-slave and ppc64le-slave.
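
Because the three variables must agree with each other, it may be less error-prone to derive ARCH and PLATFORM from BASEARCH. The sketch below encodes only the values listed above; the function names deb_arch_for and platform_for are illustrative:

```shell
# Map a Yum $basearch value to the Debian architecture name.
deb_arch_for() {
  case "$1" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    ppc64le) echo ppc64el ;;   # note the swapped "le" / "el"
    *) echo "unsupported BASEARCH: $1" >&2; return 1 ;;
  esac
}

# Map a Yum $basearch value to the Jenkins agent label.
platform_for() {
  case "$1" in
    x86_64)          echo amd64-slave ;;
    aarch64|ppc64le) echo "$1-slave" ;;
    *) echo "unsupported BASEARCH: $1" >&2; return 1 ;;
  esac
}

# Example:
# export BASEARCH=x86_64
# export ARCH=$(deb_arch_for "$BASEARCH") PLATFORM=$(platform_for "$BASEARCH")
```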

$ export GPG_TTY=$(tty)
$ export VERSION=3.3.0
$ export OS=debian
$ export OSVER=11
$ export ARCH=amd64
$ export BASEARCH=x86_64
$ export PLATFORM=amd64-slave
$ export SIGN_KEY=36243EECE206BB0D
$ mkdir -p releases/${VERSION}/${OS}/${OSVER}/${ARCH}
$ cd releases/${VERSION}/${OS}/${OSVER}/${ARCH}
$ for product in bigtop-groovy bigtop-jsvc bigtop-utils alluxio flink hadoop hbase hive kafka livy phoenix ranger solr spark tez zeppelin zookeeper
  do
    rm -rf ${product} &&
    curl -L -o ${product}.zip https://ci.bigtop.apache.org/job/Bigtop-${VERSION}-${BASEARCH}/DISTRO=${OS}-${OSVER},PLATFORM=${PLATFORM},PRODUCT=${product}/lastSuccessfulBuild/artifact/*zip*/archive.zip &&
    jar xf ${product}.zip &&
    mv archive/output/${product} . &&
    rmdir -p archive/output &&
    rm ${product}.zip
  done
$ find . -name '*.deb' | xargs dpkg-sig --cache-passphrase --sign builder --sign-changes force_full

$ mkdir -p conf

$ cat > conf/distributions <<__EOT__
Origin: Bigtop
Label: Bigtop
Suite: stable
Codename: bigtop
Version: ${VERSION}
Architectures: ${ARCH} source
Components: contrib
Description: Apache Bigtop
SignWith: ${SIGN_KEY}
__EOT__

$ cat > conf/options <<__EOT__
verbose
ask-passphrase
__EOT__

$ find . -name '*.deb' | xargs reprepro --ask-passphrase -Vb . includedeb bigtop
$ mkdir tmprepo
$ mv conf db dists pool tmprepo/

$ aws --profile iwasakims s3 sync --acl public-read ./tmprepo s3://repos.bigtop.apache.org/releases/${VERSION}/${OS}/${OSVER}/${ARCH}/

Tweak the file name and the download site of the source tarball:

$ git clone https://github.com/apache/bigtop
$ cd bigtop
$ vi bigtop.bom
$ git diff .
diff --git a/bigtop.bom b/bigtop.bom
index ff6d4e1..d4ce521 100644
--- a/bigtop.bom
+++ b/bigtop.bom
@@ -144,12 +144,12 @@ bigtop {
     'hadoop' {
       name    = 'hadoop'
       relNotes = 'Apache Hadoop'
-      version { base = '2.7.3'; pkg = base; release = 1 }
+      version { base = '2.7.4'; pkg = base; release = 1 }
       tarball { destination = "${name}-${version.base}.tar.gz"
-                source      = "${name}-${version.base}-src.tar.gz" }
+                source      = "${name}-${version.base}-RC0-src.tar.gz" }
       url     { download_path = "/$name/common/$name-${version.base}"
-                site = "${apache.APACHE_MIRROR}/${download_path}"
-                archive = "${apache.APACHE_ARCHIVE}/${download_path}" }
+                site = "http://home.apache.org/~shv/hadoop-2.7.4-RC0/"
+                archive = "" }
     }
     'ignite-hadoop' {
       name    = 'ignite-hadoop'

Build together with the components it depends on, then run smoke-tests:

$ ./gradlew hadoop-rpm yum -Dbuildwithdeps=true
$ ./docker-hadoop.sh \
    --create 3 \
    --image bigtop/puppet:trunk-centos-8 \
    --memory 8g \
    --repo file:///bigtop-home/output \
    --disable-gpg-check \
    --stack hdfs,yarn,mapreduce \
    --smoke-tests hdfs,yarn,mapreduce

systemd 237 or above checks the PID and the permissions of the PID file of non-root services as a fix for CVE-2018-16888. /sys/fs/cgroup must be mounted to run services via systemd inside containers.

The Red Hat article elaborates on the workaround.

BIGTOP-3302 addressed the issue.

CVE-2018-16888 affects init scripts run via systemd. runuser must be used instead of su (without - or -l) to pass the check of the PID file.

See BIGTOP-3302 for details.
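
Whether a given host is subject to the PID file check can be judged from its systemd version. The following is a sketch under the assumption that the first line of systemctl --version has the form "systemd NNN (...)"; pidfile_check_applies is a hypothetical helper name:

```shell
# Print "yes" if the given systemd version performs the PID file check
# described above (237 or above), otherwise "no".
# $1: numeric systemd version.
pidfile_check_applies() {
  if [ "$1" -ge 237 ]; then
    echo yes
  else
    echo no
  fi
}

# Example: take the version from the second field of the first line.
# pidfile_check_applies "$(systemctl --version | awk 'NR==1 {print $2}')"
```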

# systemctl cat hadoop-mapreduce-historyserver.service

# systemctl list-dependencies hadoop-mapreduce-historyserver.service

# SYSTEMD_LOG_LEVEL=debug systemctl status hadoop-mapreduce-historyserver.service

The following assumes openEuler 22.03 LTS SP3.

https://docs.openeuler.org/en/docs/22.03_LTS/docs/Container/installation-and-deployment-3.html

The docker-engine package provides all required resources:

$ sudo dnf install docker-engine
$ sudo usermod -aG docker openeuler
$ sudo systemctl start docker

Standalone docker-compose can be used as usual:

$ sudo curl -SL https://github.com/docker/compose/releases/download/v2.27.0/docker-compose-linux-aarch64 -o /usr/local/bin/docker-compose
$ sudo chmod a+x /usr/local/bin/docker-compose
$ sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
$ docker-compose --version
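
The download URL above is specific to aarch64. A small sketch that builds the URL for another version or architecture follows; compose_download_url is a hypothetical helper, and it assumes that the release asset suffix matches the output of uname -m, which holds for aarch64 and x86_64:

```shell
# Print the download URL of a standalone docker-compose binary.
# $1: release tag (e.g. v2.27.0), $2: machine architecture (e.g. aarch64).
compose_download_url() {
  printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-linux-%s\n' "$1" "$2"
}

# Example:
# sudo curl -SL "$(compose_download_url v2.27.0 "$(uname -m)")" -o /usr/local/bin/docker-compose
```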