docker-compose-rabbitmq-mssql

Overview

This repository illustrates a reference implementation of Senzing using RabbitMQ as the queue and MSSQL as the underlying database.

The instructions show how to set up a system that:

  1. Reads JSON lines from a file on the internet.
  2. Sends each JSON line to a message queue.
    1. In this implementation, the queue is RabbitMQ.
  3. Reads messages from the queue and inserts into Senzing.
    1. In this implementation, Senzing keeps its data in an MSSQL database.
  4. Reads information from Senzing via Senzing REST API server.
  5. Views resolved entities in a web app.
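
As a rough illustration of steps 1 and 2 above, the following sketch reads JSON lines and hands each line to a publisher. The `publish` function is a hypothetical stand-in that only prints its argument; in the formation itself that job presumably falls to the senzing/stream-producer container writing to RabbitMQ. The sample records are made up.

```shell
# Hypothetical stand-in for a queue publisher: it only prints the message.
publish() {
  printf 'queued: %s\n' "$1"
}

# Read JSON lines from standard input and publish each non-empty line.
# The "|| [ -n "$line" ]" part keeps a final line that lacks a trailing newline.
produce() {
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] && publish "$line"
  done
  return 0
}

# Made-up sample records, one JSON document per line.
printf '%s\n' '{"RECORD_ID":"1","NAME_FULL":"Jane Doe"}' '{"RECORD_ID":"2","NAME_FULL":"John Roe"}' | produce
```

Each input line becomes exactly one message, which is the property the real producer and loader rely on.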

The following diagram shows the relationship of the docker containers in this docker composition. Arrows represent data flow.

Image of architecture

This docker formation brings up the following docker containers:

  1. bitnami/rabbitmq
  2. mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04
  3. mcr.microsoft.com/mssql-tools
  4. senzing/adminer
  5. senzing/console
  6. senzing/entity-web-search-app
  7. senzing/init-container
  8. senzing/jupyter
  9. senzing/redoer
  10. senzing/stream-producer
  11. senzing/senzing-api-server
  12. senzing/stream-loader

Contents

  1. Expectations
    1. Space
    2. Time
    3. Background knowledge
  2. Preparation
    1. Prerequisite software
    2. Clone repository
  3. Using docker-compose
    1. Volumes
    2. SSH port
    3. Set sshd password
    4. EULA
    5. Install Senzing
    6. Install Senzing license
    7. Install MS SQL driver
    8. Run docker formation
  4. View data
    1. View docker containers
    2. Use SSH
    3. View RabbitMQ
    4. View MSSQL
    5. View Senzing API
    6. View Senzing Entity Search WebApp
    7. View X-Term
  5. Cleanup
  6. Advanced
    1. Re-run docker formation
    2. Configuration
  7. Notes
    1. Running non-root

Legend

  1. 🤔 - A "thinker" icon means that a little extra thinking may be required. Perhaps you'll need to make some choices. Perhaps it's an optional step.
  2. ✏️ - A "pencil" icon means that the instructions may need modification before performing.
  3. ⚠️ - A "warning" icon means that something tricky is happening, so pay attention.

Expectations

Space

This repository and demonstration require 7 GB free disk space.
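
One way to check the space requirement before starting is sketched below. The function name `has_free_space` and the choice of `/` as the directory to inspect are assumptions; point it at whichever filesystem will hold the Senzing volume and the docker images.

```shell
# Succeeds when the filesystem holding DIR has at least MIN_GB gibibytes available.
# Usage: has_free_space DIR MIN_GB
has_free_space() {
  dir=$1
  min_gb=$2
  # df -P prints one POSIX-format line per filesystem; -k reports 1024-byte blocks.
  # Column 4 of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

if has_free_space / 7; then
  echo "At least 7 GB free"
else
  echo "Less than 7 GB free"
fi
```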

Time

Budget 2 hours to get the demonstration up and running, depending on CPU and network speeds.

Background knowledge

This repository assumes a working knowledge of:

  1. Docker
  2. Docker-compose

Preparation

Prerequisite software

The following software programs need to be installed:

  1. docker
  2. docker-compose
  3. git
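
A quick way to confirm the prerequisites are on the `PATH` is sketched below. The helper name `missing_commands` is hypothetical; `command -v` is the standard shell lookup.

```shell
# Print each named command that is not found on PATH; print nothing when all exist.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

missing=$(missing_commands docker docker-compose git)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found"
fi
```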

Clone repository

For more information on environment variables, see Environment Variables.

  1. Set these environment variable values:

    export GIT_ACCOUNT=senzing
    export GIT_REPOSITORY=docker-compose-demo
    export GIT_ACCOUNT_DIR=~/${GIT_ACCOUNT}.git
    export GIT_REPOSITORY_DIR="${GIT_ACCOUNT_DIR}/${GIT_REPOSITORY}"
  2. Follow steps in clone-repository to install the Git repository.
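
For orientation, the clone step usually amounts to something like the sketch below. It assumes the repository is hosted on GitHub and cloned over HTTPS, and the helper names `repository_url` and `clone_repository` are hypothetical; the linked clone-repository instructions remain the authoritative procedure.

```shell
# Build the clone URL from the account and repository names (assumes GitHub over HTTPS).
repository_url() {
  printf 'https://github.com/%s/%s.git\n' "$1" "$2"
}

# Clone only when the target directory does not already hold a checkout.
# Relies on the GIT_* variables exported in the previous step.
clone_repository() {
  if [ -d "${GIT_REPOSITORY_DIR}/.git" ]; then
    echo "Already cloned: ${GIT_REPOSITORY_DIR}"
  else
    mkdir -p "${GIT_ACCOUNT_DIR}"
    git clone "$(repository_url "${GIT_ACCOUNT}" "${GIT_REPOSITORY}")" "${GIT_REPOSITORY_DIR}"
  fi
}

# clone_repository   # uncomment to run; needs git and network access
```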

Using docker-compose

Volumes

  1. ✏️ Specify the directory where Senzing should be installed on the local host. Example:

    export SENZING_VOLUME=/opt/my-senzing
    1. ⚠️ macOS - File sharing must be enabled for SENZING_VOLUME.
    2. ⚠️ Windows - File sharing must be enabled for SENZING_VOLUME.
  2. Identify directories on the local host. Example:

    export SENZING_DATA_DIR=${SENZING_VOLUME}/data
    export SENZING_DATA_VERSION_DIR=${SENZING_DATA_DIR}/2.0.0
    export SENZING_ETC_DIR=${SENZING_VOLUME}/etc
    export SENZING_G2_DIR=${SENZING_VOLUME}/g2
    export SENZING_OPT_MICROSOFT_DIR=${SENZING_VOLUME}/opt-microsoft
    export SENZING_VAR_DIR=${SENZING_VOLUME}/var
    
    export MSSQL_DIR=${SENZING_VAR_DIR}/mssql
    export RABBITMQ_DIR=${SENZING_VAR_DIR}/rabbitmq
  3. Create directory for RabbitMQ persistence. Note: Although the RABBITMQ_DIR directory will have open permissions, the directories created within RABBITMQ_DIR will be restricted. Example:

    sudo mkdir -p ${RABBITMQ_DIR}
    sudo chmod 770 ${RABBITMQ_DIR}
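
Because the later docker-compose commands depend on these exported variables, it can help to confirm none were missed. The sketch below is one way to do that; the helper name `unset_vars` is hypothetical.

```shell
# Print the name of every listed variable that is unset or empty.
unset_vars() {
  for name in "$@"; do
    # Indirect lookup: copy the value of the variable called $name into "value".
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

missing=$(unset_vars SENZING_VOLUME SENZING_DATA_DIR SENZING_DATA_VERSION_DIR \
  SENZING_ETC_DIR SENZING_G2_DIR SENZING_OPT_MICROSOFT_DIR SENZING_VAR_DIR \
  MSSQL_DIR RABBITMQ_DIR)

if [ -n "$missing" ]; then
  echo "Not set:" $missing
else
  echo "All volume variables are set"
fi
```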

SSH port

🤔 Optional: If you do not plan to use the senzing/sshd container, the SSH sections can be skipped.

🤔 Port 22 is normally already in use by the host's SSH service, so the running docker container may need to publish a different port.

  1. 🤔 Optional: See if port 22 is already in use. Example:

    sudo lsof -i -P -n | grep LISTEN | grep ':22 '
  2. ✏️ Choose port for docker container. Example:

    export SENZING_SSHD_PORT=9181
  3. Construct parameter for docker run. Example:

    export SENZING_SSHD_PORT_PARAMETER="--publish ${SENZING_SSHD_PORT:-9181}:22"
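
The `${SENZING_SSHD_PORT:-9181}` expansion above falls back to 9181 when the variable is unset or empty. The sketch below shows both cases; the function name `sshd_port_parameter` and the port 2222 are only illustrative.

```shell
# Build the publish parameter, falling back to port 9181 when
# SENZING_SSHD_PORT is unset or empty.
sshd_port_parameter() {
  echo "--publish ${SENZING_SSHD_PORT:-9181}:22"
}

unset SENZING_SSHD_PORT
sshd_port_parameter   # prints: --publish 9181:22

SENZING_SSHD_PORT=2222
sshd_port_parameter   # prints: --publish 2222:22
```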

Set sshd password

🤔 Optional: The default password for the sshd container is senzingsshdpassword. It can be changed by setting the following variable.

✏️ Set the SENZING_SSHD_PASSWORD variable to the password used to access the sshd container. Example:

    export SENZING_SSHD_PASSWORD=<Pass_You_Want>

EULA

To use the Senzing code, you must agree to the End User License Agreement (EULA).

  1. ⚠️ This step is intentionally tricky and not simply copy/paste. This ensures that you make a conscious effort to accept the EULA. Example:

    export SENZING_ACCEPT_EULA="<the value from this link>"

Install Senzing

  1. If Senzing has not been installed, install Senzing. Example:

    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/senzing/docker-compose-senzing-installation.yaml up
    1. This will download and extract a 3 GB file. It may take 5-15 minutes, depending on network speeds.

Install Senzing license

Senzing comes with a trial license that supports 10,000 records.

  1. 🤔 Optional: If more than 10,000 records are desired, see Senzing license.

Install MS SQL driver

  1. Install MS SQL driver and initialize files. Example:

    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/mssql/docker-compose-mssql-driver.yaml up
  2. Wait until completion.

  3. Change directory permissions. Note: Although the MSSQL_DIR directory will have open permissions, the directories created within MSSQL_DIR will be restricted. Example:

    sudo chmod 777 ${MSSQL_DIR}

Run docker formation

  1. Launch docker-compose formation. Example:

    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/mssql/docker-compose-rabbitmq-mssql.yaml up
  2. Allow time for the components to come up and initialize.

    1. Some docker logs will show errors while containers wait for dependent services to become available. By default, docker-compose starts containers in dependency order but does not wait for a service to be ready.
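
Rather than watching the logs, a small retry loop can poll until a service answers. The sketch below is one possible approach, not part of the repository; the function name `retry` is hypothetical, and the commented example assumes curl is installed and uses the heartbeat URL listed under "View Senzing API".

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1   # give up after the last allowed attempt
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Example: try up to 30 times, 10 seconds apart.
# retry 30 10 curl --silent --fail --output /dev/null http://localhost:8250/heartbeat
```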

View data

Once the docker-compose formation is running, different aspects of the formation can be viewed.

Username and password for the following sites were either passed in as environment variables or are the default values seen in docker-compose-rabbitmq-mssql.yaml.

View docker containers

  1. A good tool to monitor individual docker logs is Portainer. When running, Portainer is viewable at localhost:9170.

Use SSH

Instructions to use the senzing/sshd container are viewable in the senzing/docker-sshd repository.

View RabbitMQ

  1. RabbitMQ is viewable at localhost:15672.
    1. Defaults: username: user, password: bitnami
  2. See additional tips for working with RabbitMQ.

View MSSQL

  1. MSSQL is viewable through Adminer at localhost:9177. Log in with:
    1. System: MS SQL (beta)
    2. Server: senzing-mssql
    3. Username: sa
    4. Password: Passw0rd
    5. Database: G2
  2. See additional tips for working with MSSQL.

View Senzing API

View results from Senzing REST API server. The server supports the Senzing REST API.

  1. OpenAPI Editor is viewable at localhost:9180.
  2. Example Senzing REST API request: localhost:8250/heartbeat
  3. See additional tips for working with Senzing API server.

View Senzing Entity Search WebApp

  1. Senzing Entity Search WebApp is viewable at localhost:8251.
  2. See additional tips for working with Senzing Entity Search WebApp.

View X-Term

The web-based Senzing X-term can be used to run Senzing command-line programs.

  1. Senzing X-term is viewable at localhost:8254.
  2. See additional tips for working with Senzing X-Term.
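
To see at a glance which of the web interfaces in this section are answering, a loop over the ports can help. The sketch below assumes curl is installed; the short labels in `services` are shorthand chosen here, while the ports are the ones listed above.

```shell
# "label:port" pairs; the ports come from the "View data" section.
services="portainer:9170 rabbitmq:15672 mssql:9177 openapi-editor:9180 senzing-api:8250 entity-search-webapp:8251 xterm:8254"

# Split a "label:port" entry into its two parts.
entry_name() { echo "${1%%:*}"; }
entry_port() { echo "${1##*:}"; }

# Report whether each service returns any HTTP response within 2 seconds.
check_services() {
  for entry in $services; do
    name=$(entry_name "$entry")
    port=$(entry_port "$entry")
    if curl --silent --output /dev/null --max-time 2 "http://localhost:${port}"; then
      echo "${name} is reachable on port ${port}"
    else
      echo "${name} is not reachable on port ${port}"
    fi
  done
}

# check_services   # uncomment to run once the formation is up
```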

Cleanup

When the docker-compose formation is no longer needed, it can be brought down and directories can be deleted.

  1. Bring down docker formation. Example:

    cd ${GIT_REPOSITORY_DIR}
    sudo docker-compose --file resources/senzing/docker-compose-senzing-installation.yaml down
    sudo docker-compose --file resources/mssql/docker-compose-mssql-driver.yaml down
    sudo docker-compose --file resources/mssql/docker-compose-rabbitmq-mssql.yaml down
    sudo docker-compose --file resources/mssql/docker-compose-rabbitmq-mssql-again.yaml down
  2. Remove directories from host system. The following directories were created during the demonstration:

    1. ${SENZING_VOLUME}
    2. ${GIT_REPOSITORY_DIR}

    They may be safely deleted.
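
Because a recursive delete driven by an unset variable can be destructive, a guarded helper is one cautious way to remove the directories. The function name `safe_remove_dir` is hypothetical, and this is a sketch rather than the repository's own cleanup procedure.

```shell
# Remove a directory only when the path is non-empty, absolute, and not "/".
# A path that does not exist is treated as already cleaned up.
safe_remove_dir() {
  dir=$1
  case "$dir" in
    "" | "/") echo "Refusing to remove '${dir}'" >&2; return 1 ;;
    /*) ;;   # absolute path: continue
    *) echo "Refusing relative path '${dir}'" >&2; return 1 ;;
  esac
  [ -d "$dir" ] || return 0
  rm -rf -- "$dir"
}

# Uncomment to run. Files written by the containers may be owned by root,
# in which case the removal may need to be run with sudo.
# safe_remove_dir "${SENZING_VOLUME:-}"
# safe_remove_dir "${GIT_REPOSITORY_DIR:-}"
```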

Advanced

The following topics discuss variations to the basic docker-compose demonstration.

Re-run docker formation

🤔 Optional: After the original docker formation has been launched and shut down, it can be brought up again without repeating the initialization steps.

  1. Launch docker-compose formation. Example:

    cd ${GIT_REPOSITORY_DIR}
    sudo \
      --preserve-env \
      docker-compose --file resources/mssql/docker-compose-rabbitmq-mssql-again.yaml up

Configuration

Configuration values are specified by environment variable or command-line parameter.

Notes

Running non-root

  1. The senzing/stream-loader and senzing/senzing-api-server containers run as user nobody (65534). When using ODBC, the selected UID needs to have a "home" directory. Rather than "hard-coding" docker images with a specific userid, an existing non-root userid is used. This is a known issue:
    1. github.com/microsoft/mssql-docker/issues/431.
  2. The practice of "hard-coding" docker images with a specific userid, specifically the use of useradd, is problematic with systems like OpenShift, which determine the UID of a docker container based on the project. See OpenShift: Why do my applications run as a random user ID?
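
To see which home directory a given UID maps to, the passwd database can be queried directly. The sketch below uses a hypothetical helper, `home_of_uid`, that reads a passwd-format file; run it inside a container to inspect that image's user database rather than the host's.

```shell
# Print the home directory recorded for a numeric UID in a passwd-format file.
# The file defaults to /etc/passwd; field 3 is the UID and field 6 the home directory.
# Prints nothing when the UID has no entry.
home_of_uid() {
  awk -F: -v uid="$1" '$3 == uid { print $6; exit }' "${2:-/etc/passwd}"
}

home_of_uid 65534
```

The result differs between distributions and images, so no particular output is shown here.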