Merge pull request #116 from arenadata/feature/ADH-5052-krb-test-stand
[ADH-5052] Support kerberos in docker test stand
iamlapa authored Sep 23, 2024
2 parents 9461749 + 9b03d18 commit 15c724e
Showing 24 changed files with 891 additions and 61 deletions.
2 changes: 2 additions & 0 deletions build-images.sh
@@ -78,6 +78,8 @@ case $CLUSTER_TYPE in
--build-arg="HADOOP_VERSION=${HADOOP_VERSION}" \
--build-arg="SSM_APP_VERSION=${SSM_APP_VERSION}" .

docker build -f ./supports/tools/docker/multihost/kerberos/Dockerfile-kdc -t cloud-hub.adsw.io/library/ssm-kdc-server:${HADOOP_VERSION} .

docker build -f ./supports/tools/docker/multihost/datanode/Dockerfile-hadoop-datanode -t cloud-hub.adsw.io/library/hadoop-datanode:${HADOOP_VERSION} .

docker build -f ./supports/tools/docker/multihost/namenode/Dockerfile-hadoop-namenode -t cloud-hub.adsw.io/library/hadoop-namenode:${HADOOP_VERSION} .
77 changes: 64 additions & 13 deletions supports/tools/docker/README.md
@@ -1,6 +1,7 @@
# Run Hadoop cluster with SSM in docker containers

There are two cluster types:

* singlehost
* multihost

@@ -19,7 +20,7 @@ Command to build docker images in singlehost cluster mode (from project root dir
```shell
./build-images.sh --cluster=singlehost --hadoop=3.3
```

Command to start docker containers

```shell
cd ./supports/tools/docker
```
@@ -32,6 +33,7 @@ cd ./supports/tools/docker
* Hadoop namenode, node manager, resource manager in container
* SSM Server container
* SSM metastore as postgres container
* Kerberos KDC container

Command to build docker images in multihost cluster mode (from project root dir)

@@ -46,17 +48,53 @@ cd ./supports/tools/docker

```shell
./start-demo.sh --cluster=multihost --hadoop=3.3
```
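
To verify that the stack came up, you can list the running containers. The exact container names depend on your local Docker setup, so treat this as a generic check rather than a guaranteed listing:

```shell
# List running containers; the SSM server, namenode, datanode,
# metastore and KDC containers should all be up
docker ps --format 'table {{.Names}}\t{{.Status}}'
```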

Use one of the following credentials to log in to the Web UI

| Login | Password | Type |
|----------------|-----------|----------|
| john | 1234 | static |
| krb_user1@DEMO | krb_pass1 | kerberos |
| krb_user2@DEMO | krb_pass2 | kerberos |

### Testing SPNEGO auth

In order to test the SPNEGO authentication provider, you need to:

1. Move the `supports/tools/docker/multihost/kerberos/krb5.conf` Kerberos configuration file to the `/etc` directory
(after backing up your old config file)
2. Obtain a ticket from the KDC with one of the Kerberos principals

```shell
kinit krb_user1
```

3. Add the following lines to the `/etc/hosts` file

```
127.0.0.1 ssm-server.demo
127.0.0.1 kdc-server.demo
```

4. Try to access any SSM resource. The following query should respond with code 200 and a JSON body (see also the verification sketch after this list):

```shell
curl --negotiate -u : http://ssm-server.demo:8081/api/v2/audit/events
```
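
A minimal way to double-check the setup, assuming the ticket was obtained in step 2 and the hostnames above resolve, is to inspect the ticket cache and print only the HTTP status code:

```shell
# Show the Kerberos ticket obtained with kinit in step 2
klist

# Print only the HTTP status code; --negotiate -u : makes curl pick up
# the ticket from the credential cache (200 is expected)
curl --negotiate -u : -s -o /dev/null -w '%{http_code}\n' \
  http://ssm-server.demo:8081/api/v2/audit/events
```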

# Run/Test SSM with Docker

Docker can greatly reduce the tedious work of installing and maintaining software on servers and developer machines. This document presents the basic workflow for running/testing SSM with Docker. [Docker Quick Start](https://docs.docker.com/get-started/)

## Necessary Components

### MetaStore (PostgreSQL) on Docker

#### Launch a PostgreSQL container

Pull the latest official PostgreSQL image from Docker Hub. You can use `postgres:tag` to specify the PostgreSQL version (`tag`) you want.

```
docker pull postgres
```
@@ -67,19 +105,22 @@ Launch a postgres container with a given {password} on 5432, and create a test database
```bash
docker run -p 5432:5432 --name {container_name} -e POSTGRES_PASSWORD={password} -e POSTGRES_DB={database_name} -d postgres:latest
```

**Parameters:**

- `container_name`: name of the container
- `password`: password of the default `postgres` superuser, used for login and access
- `database_name`: a new database/schema with this name is created on startup
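
For example, a throwaway metastore container could be started like this; the container name, password and database name below are placeholders, not values required by SSM:

```shell
# Placeholder values: adjust the name, password and database as needed
docker run -p 5432:5432 --name ssm-postgres \
  -e POSTGRES_PASSWORD=changeme \
  -e POSTGRES_DB=metastore \
  -d postgres:latest
```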

### HDFS on Docker

**Note that this part is not recommended on OSX (Mac), because the containers' network is limited on OSX.**

Pull a well-known third-party Hadoop image from Docker Hub. You can use `hadoop-docker:tag` to specify the Hadoop version (`tag`) you want.

#### Set up an HDFS Container

```bash
docker pull sequenceiq/hadoop-docker
```
@@ -89,15 +130,19 @@ Launch a Hadoop container with an exposed namenode.rpcserver.
```bash
docker run -it --add-host=moby:127.0.0.1 --ulimit memlock=2024000000:2024000000 -p 9000:9000 --name=hadoop sequenceiq/hadoop-docker /etc/bootstrap.sh -bash
```

Note that we launch an interactive Docker container. Use the following command to check HDFS status. We also set `memlock=2024000000` for the cache size.

```
cd $HADOOP_PREFIX
bin/hdfs dfs -ls /
```

#### Configure HDFS with multiple storage types and cache

Edit `$HADOOP_PREFIX/etc/hadoop/hdfs-site.xml` and add the property below. This turns off the permission check to avoid `Access denied for user ***. Superuser privilege is required`.

```
<property>
@@ -106,7 +151,8 @@ Edit `$HADOOP_PREFIX/etc/hadoop/hdfs-site.xml` and add the property below. This
</property>
```

Create `/tmp/hadoop-root/dfs/data1~3` for different storage types. Delete all content in `/tmp/hadoop-root/dfs/data` and `/tmp/hadoop-root/dfs/name`, then use `bin/hdfs namenode -format` to format HDFS.
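
Under the assumption that you are inside the Hadoop container from the previous steps, the directory preparation and re-format could look like this:

```shell
# Create separate directories to back the different storage types
mkdir -p /tmp/hadoop-root/dfs/data1 /tmp/hadoop-root/dfs/data2 /tmp/hadoop-root/dfs/data3

# Clear the existing data and name directories, then re-format HDFS
rm -rf /tmp/hadoop-root/dfs/data/* /tmp/hadoop-root/dfs/name/*
cd $HADOOP_PREFIX
bin/hdfs namenode -format
```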

Add the following properties to `$HADOOP_PREFIX/etc/hadoop/hdfs-site.xml`.

@@ -129,8 +175,8 @@ Add the following properties to `$HADOOP_PREFIX/etc/hadoop/hdfs-site.xml`.
</property>
```


Restart HDFS.

```
$HADOOP_PREFIX/sbin/stop-dfs.sh
$HADOOP_PREFIX/sbin/start-dfs.sh
```
@@ -151,14 +197,18 @@ Assuming you are in SSM root directory, modify `conf/druid.xml` to enable SSM to

```
<entry key="username">root</entry>
<entry key="password">{root_password}</entry>
```

Wait for at least 10 seconds. Then use `bin/start-smart.sh -format` to format (re-initialize) the database. You can also use this command to clear all data in the database during tests.

#### Stop/Remove Postgres container

You can use `docker stop {container_name}` to stop the Postgres container. The Postgres service then cannot be accessed until you start it again with `docker start {container_name}`. Note that `stop`/`start` will not remove any data from your Postgres container.

Use `docker rm {container_name}` to remove the Postgres container if it is no longer needed. If you don't remember the container's name, you can use `docker ps -a` to look it up.
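
Putting the lifecycle commands together, a typical session (with `{container_name}` replaced by your actual container name) might look like this:

```shell
# Find the container if you forgot its name
docker ps -a

# Stop and later restart the container without losing its data
docker stop {container_name}
docker start {container_name}

# Remove the container entirely once it is no longer needed
docker rm {container_name}
```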

### HDFS

@@ -167,6 +217,7 @@ Use `docker rm {container_name}` to remove postgres container, if this container
Configure `namenode.rpcserver` in `smart-site.xml`.

```xml

<configuration>
<property>
<name>smart.dfs.namenode.rpcserver</name>
```
2 changes: 1 addition & 1 deletion supports/tools/docker/multihost/Dockerfile-hadoop-base
@@ -13,7 +13,7 @@ ENV SSM_HOME=/opt/ssm
ENV HADOOP_URL https://archive.apache.org/dist/hadoop/core/hadoop-$HADOOP_VERSION/hadoop-$HADOOP_VERSION.tar.gz

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
net-tools curl wget netcat procps gnupg libsnappy-dev && rm -rf /var/lib/apt/lists/*
net-tools curl wget netcat procps gnupg libsnappy-dev krb5-user && rm -rf /var/lib/apt/lists/*

# Install SSH server
RUN apt-get update \
2 changes: 1 addition & 1 deletion supports/tools/docker/multihost/conf/agents
@@ -1 +1 @@
hadoop-datanode
hadoop-datanode.demo
15 changes: 14 additions & 1 deletion supports/tools/docker/multihost/conf/core-site.xml
@@ -2,11 +2,24 @@
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-namenode:8020</value>
<value>hdfs://hadoop-namenode.demo:8020</value>
</property>
<property>
<name>fs.hdfs.impl</name>
<value>org.smartdata.hadoop.filesystem.SmartFileSystem</value>
<description>The FileSystem for hdfs URL</description>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>

<property>
<name>smart.server.kerberos.principal</name>
<value>ssm/ssm-server.demo@DEMO</value>
</property>
</configuration>
2 changes: 1 addition & 1 deletion supports/tools/docker/multihost/conf/druid.xml
@@ -1,7 +1,7 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<entry key="url">jdbc:postgresql://ssm-metastore-db:5432/metastore</entry>
<entry key="url">jdbc:postgresql://ssm-metastore-db.demo:5432/metastore</entry>
<entry key="username">ssm</entry>
<entry key="password">ssm</entry>

54 changes: 46 additions & 8 deletions supports/tools/docker/multihost/conf/hdfs-site.xml
@@ -20,30 +20,68 @@
<!-- Put site-specific property overrides in this file. -->

<configuration>
<!-- Turn security off for tests by default -->
<property>
<name>hadoop.security.authentication</name>
<value>simple</value>
</property>
<!-- Disable min block size since most tests use tiny blocks -->
<property>
<name>dfs.namenode.fs-limits.min-block-size</name>
<value>0</value>
</property>
<property>
<name>smart.server.rpc.address</name>
<value>ssm-server:7042</value>
<value>ssm-server.demo:7042</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>[RAM_DISK]file://hadoop/dfs/ram-data,[SSD]file://hadoop/dfs/ssd-data,[DISK]file://hadoop/dfs/data,[ARCHIVE]file://hadoop/dfs/archive-data</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
<name>hadoop.user.group.static.mapping.overrides</name>
<value>ssm=supergroup;agent=supergroup</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>1048576</value>
</property>
<property>
<name>dfs.namenode.keytab.file</name>
<value>/etc/secrets/namenode.keytab</value>
</property>
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>namenode/_HOST@DEMO</value>
</property>
<property>
<name>dfs.namenode.delegation.token.max-lifetime</name>
<value>604800000</value>
<description>The maximum lifetime in milliseconds for which a delegation token is valid.</description>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/secrets/datanode.keytab</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>datanode/_HOST@DEMO</value>
</property>
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>

<!-- Set privileged ports -->
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:1004</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:1006</value>
</property>
<property>
<name>dfs.datanode.https.address</name>
<value>0.0.0.0:1007</value>
</property>
<property>
<name>dfs.datanode.ipc.address</name>
<value>0.0.0.0:1005</value>
</property>
</configuration>
2 changes: 1 addition & 1 deletion supports/tools/docker/multihost/conf/servers
@@ -1 +1 @@
ssm-server
ssm-server.demo