This repository has been archived by the owner on Mar 8, 2023. It is now read-only.

837 lines (605 loc) · 36.4 KB

p-rep-installation-and-configuration-for-mainnet.md

P-Rep Installation and Configuration
General information about the ICON P-Rep election - https://icon.community/iconsensus/

This document is a guide to installing and operating a Public Representative (“P-Rep”) node on the MainNet using Docker. P-Reps are the consensus nodes that produce and verify blocks and participate in network policy decisions on the ICON Network.

Intended Audience

We recommend that all P-Rep candidates read this guide.

Pre-requisites

We assume that you have previous knowledge and experience in:

  • IT infrastructure management
  • Linux or UNIX system administration
  • Docker container
  • Linux server and docker service troubleshooting

HW Requirements for MainNet

The table below lists the minimum and recommended specifications.

| Description | Minimum Specifications | Recommended Specifications |
| --- | --- | --- |
| CPU model | Intel(R) Xeon(R) CPU @ 3.00GHz | Intel(R) Xeon(R) CPU @ 3.00GHz |
| vCPU (core) | 16 | 36 |
| RAM | 32G | 72G |
| Disk | 200G | 500G |
| Network | 1Gbps | 1Gbps |

SW Requirements

OS requirements

  • Linux (CentOS 7 or Ubuntu 16.04+)

Package requirements

  • Docker 18.x or higher

For your reference, an ICON node depends on the following packages. They are included in the P-Rep docker image that we provide, so you don't need to install them separately.

  • Python 3.6.5+ or 3.7.x
  • RabbitMQ 3.7 or higher

Network Diagram of P-Rep nodes

P-Rep Networking Model

The diagram above shows how P-Rep nodes interact with each other in the test environment.

  • Endpoint: https://ctz.solidwallet.io

    • The endpoint is the load balancer that accepts transaction requests from DApps and relays them to an available P-Rep node.
      In the test environment, the ICON foundation runs the endpoint. Each P-Rep can also set up its own endpoint to serve DApps directly (as depicted in PRep-Node4), but that configuration is outside the scope of this document.
  • Tracker: https://tracker.icon.foundation

    • A block and transaction explorer attached to the MainNet network.
  • IP List: https://download.solidwallet.io/conf/prep_iplist.json

    • The ICON foundation maintains the IP list of P-Reps in this JSON file. Configure your firewalls to allow inbound and outbound traffic from and to these IP addresses. The following TCP ports should be open.
    • Port 7100: Used by gRPC for peer to peer communication between nodes.
    • Port 9000: Used by JSON-RPC API server.
    • The IP whitelist will be automatically updated on a daily basis from the endpoint of the seed node inside the P-Rep Node Docker.
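
As a rough illustration of the firewall step, the sketch below turns an IP list into iptables commands for the two ports named above. It assumes the JSON file is a flat array of IP address strings, which this guide does not confirm, and `build_allow_rules` is a hypothetical helper name. It only builds inbound rules as text and neither downloads the list nor applies anything; outbound rules would mirror these.

```python
import ipaddress
import json

# TCP ports named in this guide: 7100 (gRPC, peer to peer) and 9000 (JSON-RPC).
PREP_PORTS = (7100, 9000)


def build_allow_rules(iplist_json: str, ports=PREP_PORTS) -> list:
    """Return iptables commands that accept inbound TCP from each listed IP.

    Assumption: the JSON document is a flat array of IP address strings.
    Check the real prep_iplist.json layout before relying on this.
    """
    rules = []
    for raw in json.loads(iplist_json):
        ip = ipaddress.ip_address(raw)  # raises ValueError on malformed entries
        for port in ports:
            rules.append(
                f"iptables -A INPUT -p tcp -s {ip} --dport {port} -j ACCEPT"
            )
    return rules


if __name__ == "__main__":
    # Example addresses taken from the sample outputs in this guide.
    sample = '["20.20.1.195", "20.20.1.87"]'
    for rule in build_allow_rules(sample):
        print(rule)
```

Printing the rules instead of executing them keeps the sketch safe to run and lets you review the result before applying it with your own tooling.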

If you need more detailed information, refer to the link below.

Inside a P-Rep Node

A process view of a P-Rep node

There are five processes, iconrpcserver, iconservice, loopchain, loop-queue, and loop-logger.

P-Rep Architecture Diagram

  • iconrpcserver

    • iconrpcserver handles JSON-RPC message requests
    • ICON RPC Server receives request messages from external clients and sends responses. When receiving a message, ICON RPC Server identifies the method that the request wants to invoke and transfers it to an appropriate component, either loopchain or ICON Service.
  • iconservice

    • ICON Service manages the state of ICON network (i.e., states of user accounts and SCOREs) using LevelDB.
    • Before processing transactions, ICON Service performs a syntax check on the request messages and pre-validates the status of accounts to see whether the transactions can be executed.

  • loopchain

    • loopchain is the high-performance Blockchain Consensus & Network engine of ICON.
  • loop-queue (RabbitMQ)

    • RabbitMQ is the most widely deployed open source message broker.
    • loopchain uses RabbitMQ as a message queue for inter-process communication.
  • loop-logger (Fluentd)

    • Fluentd is the open source data collector, which lets you unify the data collection and consumption.
    • Fluentd is included in the P-Rep node image. You can use Fluentd to systematically collect and aggregate the log data that other processes produce.

Which ports does a P-Rep Node use?

For external communication:

  • TCP 7100: gRPC port used for peer-to-peer connection between peer nodes.
  • TCP 9000: JSON-RPC or RESTful API port serving application requests.

For internal communication:

  • TCP 5672: RabbitMQ port for inter-process communication.

For RabbitMQ management console:

  • TCP 15672: RabbitMQ Management will listen on port 15672.
    • You can use RabbitMQ Management by enabling this port. It must be enabled before it is used.
    • You can access the management web console at http://{node-hostname}:15672/
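
To confirm that the ports listed above are reachable, a minimal sketch like the following can help. It uses only the Python standard library; `is_port_open` is a hypothetical helper name and the target host is a placeholder you would replace with your node's address.

```python
import socket

# External ports described in this guide and what they are used for.
EXTERNAL_PORTS = {7100: "gRPC peer-to-peer", 9000: "JSON-RPC / REST API"}


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    host = "127.0.0.1"  # placeholder target; replace with your node's address
    for port, purpose in EXTERNAL_PORTS.items():
        state = "open" if is_port_open(host, port) else "closed or filtered"
        print(f"TCP {port} ({purpose}): {state}")
```

A successful TCP connection only shows that something is listening; it does not prove that the service behind the port is healthy.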

P-Rep Installation using Docker

Please read the SW requirements above. The steps for installing Docker are outlined below.

If you have already installed docker and docker compose, you can skip the part below and go directly to Running a P-Rep Node on Docker Container.

Prerequisites - Docker & Docker Compose Installation

If you don't already have docker installed, you can download it here: https://www.docker.com/community-edition. Installation requires sudo privilege.

On CentOS 7

Step 1: Install Docker

## Install necessary packages:
$ sudo yum install -y yum-utils device-mapper-persistent-data lvm2

## Configure the docker-ce repo:
$ sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

## Install docker-ce:
$ sudo yum install docker-ce

## Add your user to the docker group with the following command.
$ sudo usermod -aG docker $(whoami)

## Set Docker to start automatically at boot time:
$ sudo systemctl enable docker.service

## Finally, start the Docker service:
$ sudo systemctl start docker.service

## Then we'll verify docker is installed successfully by checking the version:
$ docker version 

Step 2: Install Docker Compose

## Install Extra Packages for Enterprise Linux (EPEL)
$ sudo yum install epel-release

## Install python-pip
$ sudo yum install -y python-pip

## Then install Docker Compose:
$ sudo pip install docker-compose

## You will also need to upgrade your Python packages on CentOS 7 to get docker-compose to run successfully:
$ sudo yum upgrade 'python*'

## To verify the successful Docker Compose installation, run:
$ docker-compose version

On Ubuntu 16.04+

Step 1: Install Docker

## Update the apt package index:
$ sudo apt-get update

## Install necessary packages:
$ sudo apt-get install  -y systemd apt-transport-https ca-certificates curl gnupg-agent software-properties-common 

## Add Docker's official GPG key:
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

## Add the apt repository
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"

## Update the apt package index:
$ sudo apt-get update

## Install docker-ce:
$ sudo apt-get -y install docker-ce docker-ce-cli containerd.io

## Add your user to the docker group with the following command.
$ sudo usermod -aG docker $(whoami)

## Set Docker to start automatically at boot time:
$ sudo systemctl enable docker.service

## Finally, start the Docker service:
$ sudo systemctl start docker.service

## Then we'll verify docker is installed successfully by checking the version:
$ docker version

Step 2: Install Docker Compose

## Install python-pip
$ sudo apt-get install -y python-pip

## Then install Docker Compose:
$ sudo pip install docker-compose

## To verify the successful Docker Compose installation, run:
$ docker-compose version

Citizen node Configuration for Pre-voting

A citizen node is used for block sync and API load balancing and does not take part in network consensus. Non-P-Rep nodes act as citizen nodes and synchronize blocks. P-Reps can participate either by installing a citizen node or by connecting to https://ctz.solidwallet.io. To speed up block sync, we provide a snapshot, which is used when FASTEST_START is set to yes. The snapshot in the data volume is downloaded again once the zero-size (empty) file is deleted.

If the snapshot file already exists, a new file will not be downloaded. Docker and docker-compose must be installed beforehand.

Running a Citizen Node on Docker Container

Using docker-compose command (Recommended)

Open docker-compose.yml in a text editor and add the following content:

version: '3'
services:
   citizen:
      image: 'iconloop/citizen-node:1908271151xd2b7a4'
      network_mode: host
      environment:
         LOG_OUTPUT_TYPE: "file"
         LOOPCHAIN_LOG_LEVEL: "DEBUG"
         FASTEST_START: "yes"     # Restore from the latest snapshot DB

      volumes:
         - ./data:/data  # mount the data volume
         - ./keys:/citizen_pack/keys  # Automatically generate cert key files here
      ports:
         - 9000:9000

Run docker-compose

$ docker-compose up -d
citizen_1         |   [2019-08-20 15:36:18.933] Your IP: XXX.XXX.XXX.XXX
citizen_1         |   [2019-08-20 15:36:18.935] RPC_PORT: 9000 / RPC_WORKER: 3
citizen_1         |   [2019-08-20 15:36:18.937] DEFAULT_STORAGE_PATH=/data/loopchain/mainnet/.storage in Docker Container
citizen_1         |   [2019-08-20 15:36:18.939] scoreRootPath=/data/loopchain/mainnet/.score_data/score
citizen_1         |   [2019-08-20 15:36:18.940] stateDbRootPath=/data/loopchain/mainnet/.score_data/db
citizen_1         |   [2019-08-20 15:36:18.942] Citizen package version info - 1908271151xd2b7a4
citizen_1         | WARNING: You are using pip version 19.1.1, however version 19.2.2 is available.
citizen_1         | You should consider upgrading via the 'pip install --upgrade pip' command.
citizen_1         | iconcommons             1.1.1
citizen_1         | iconrpcserver           1.3.1.1
citizen_1         | iconservice             1.4.2
citizen_1         | loopchain               2.2.1.3
citizen_1         |   [2019-08-20 15:36:19.448] builtinScoreOwner = hx677133298ed5319607a321a38169031a8867085c
citizen_1         | 0
citizen_1         |   [2019-08-20 15:36:19.528] START FASTEST MODE : NETWORK_NAME=MainctzNet
citizen_1         |   [2019-08-20 15:36:19.777] Start download - https://s3.ap-northeast-2.amazonaws.com/icon-leveldb-backup/MainctzNet/20190820/MainctzNet_BH7344686_data-20190820_1500.tar.gz
citizen_1         |  [OK] CHECK=0,  Download 20190820/MainctzNet_BH7344686_data-20190820_1500.tar.gz(/data/loopchain/mainnet/)  to /data/loopchain/mainnet

Running a P-Rep Node on Docker Container

Once docker is installed, follow the steps below to install the P-Rep node.

Step 1. Pull the docker image

Pull the latest stable version of an image.

$ docker pull iconloop/prep-node:1907091410x2f8b2e

Step 2. Run the P-Rep Node as a Docker container

Using docker command

$ docker run -d  -p 9000:9000 -p 7100:7100 -v ${PWD}/data:/data iconloop/prep-node:1907091410x2f8b2e

Using docker-compose command (Recommended)

Open docker-compose.yml in a text editor and add the following content:

version: '3' 
services:    
     container:        
          image: 'iconloop/prep-node:1907091410x2f8b2e'        
          container_name: 'prep-node'        
          volumes:            
               - ./data:/data        
          ports:           
               - 9000:9000           
               - 7100:7100

Run docker-compose

$ docker-compose up -d

The options in the commands above do the following:

  1. Map container ports 7100 and 9000 to the host ports.

  2. Mount a volume into the docker container.

    • -v ${PWD}/data:/data sets up a bind mount volume that links the /data/ directory from inside the P-Rep Node container to the ${PWD}/data directory on the host machine.

    • data folder will have the following directory structure.

.
|---- data  
|     |---- PREP-MainNet   → Default ENV directory  
|          |---- .score_data  
|          |      |-- db      → root directory where the state DB file will be created
|          |      |-- score   → root directory where SCOREs will be installed
|          |---- .storage     → root directory where the block DB will be stored
|          |---- log          → root directory where log files will be stored
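
After the first start it can be handy to verify that the bind mount produced the layout in the tree above. The following is a minimal sketch under the assumption that the host folder is `./data` and the network directory is `PREP-MainNet`, as shown in this guide; `missing_dirs` is a hypothetical helper name.

```python
from pathlib import Path

# Sub-directories listed in the tree above, relative to the mounted data folder.
EXPECTED_DIRS = (
    "PREP-MainNet/.score_data/db",
    "PREP-MainNet/.score_data/score",
    "PREP-MainNet/.storage",
    "PREP-MainNet/log",
)


def missing_dirs(data_root: str) -> list:
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("./data")
    print("layout complete" if not missing else f"missing: {missing}")
```

Some directories may only appear after the node has run for a while, so an incomplete layout right after startup is not necessarily an error.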

P-Rep Node Operation and Configuration

Start Node

Run docker-compose up.

$ docker-compose up -d
prep_prep_1 is up-to-date

The docker ps command shows the list of running docker containers.

$ docker ps
CONTAINER ID   IMAGE                                                          COMMAND                CREATED              STATUS                          PORTS                                                                 NAMES
0de99e33cdc9     iconloop/prep-node:1907091410x2f8b2e    "/src/entrypoint.sh"      2 minutes ago        Up 2 minutes(healthy)    0.0.0.0:7100->7100/tcp, 0.0.0.0:9000->9000/tcp prep_prep_1

The meaning of each column in the docker ps result output is as follows.

| Column | Description |
| --- | --- |
| CONTAINER ID | Container ID |
| IMAGE | P-Rep Node's image name |
| COMMAND | The script that is executed whenever a P-Rep Node container is run |
| STATUS | Health check status. One of "starting", "healthy", "unhealthy" or "none" |
| PORTS | Exposed ports on the running container |
| NAMES | Container name |

You can read the container booting log from the log folder.

$ tail -f data/PREP-MainNet/log/booting_20190419.log
[2019-08-12 02:19:01.454] DEFAULT_STORAGE_PATH=/data/PREP-MainNet/.storage
[2019-08-12 02:19:01.459] scoreRootPath=/data/PREP-MainNet/.score_data/score
[2019-08-12 02:19:01.464] stateDbRootPath=/data/PREP-MainNet/.score_data/db
[2019-08-12 02:19:01.468] P-REP package version info - 1907091410x2f8b2e
[2019-08-12 02:19:02.125] iconcommons 1.0.5.2 iconrpcserver 1.3.1 iconservice 1.3.0 loopchain 2.1.7
[2019-08-12 02:19:07.107] Enable rabbitmq_management
[2019-08-12 02:19:10.676] Network: PREP-MainNet
[2019-08-12 02:19:10.682] Run loop-peer and loop-channel start
[2019-08-12 02:19:10.687] Run iconservice start!
[2019-08-12 02:19:10.692] Run iconrpcserver start!

Stop Node

$ docker-compose down

Stopping prep_prep_1 ... done
Removing prep_prep_1 ... done
Removing network prep_default

View Node Status

  • Check the current state and information of the prep-node

The current state of the node can be checked with /api/v1/status/peer and /api/v1/avail/peer. Both return the same response data, but the HTTP response code differs.

$ curl localhost:9000/api/v1/status/peer

{
    "made_block_count": 0,
    "status": "Service is online: 0",
    "state": "Vote",
    "peer_type": "0",
    "audience_count": "0",
    "consensus": "siever",
    "peer_id": "hx1787c2194f56bb550a8daba9bbaea00a4956ed58",
    "block_height": 184,
    "round": 1,
    "epoch_height": 186,
    "unconfirmed_block_height": 0,
    "total_tx": 93, 
    "unconfirmed_tx": 0,
    "peer_target": "20.20.1.195:7100",
    "leader_complaint": 185,
    "peer_count": 5,
    "leader": "hx7ff69280a1483c660695039c14ba954bb101bb66",
    "epoch_leader": "hx7ff69280a1483c660695039c14ba954bb101bb66",
    "mq": {
         "peer": {
               "message_count": 0
          },
         "channel": {
               "message_count": 0
         },
         "score": {
              "message_count": 0
         }
     }
}
  • /api/v1/avail/peer returns HTTP 503 (Service Unavailable) when the service is not available.

This is useful when a load balancer performs health checks based on the HTTP response code.

  • /api/v1/avail/peer returns HTTP 503 Service Unavailable while the node is in one of the following states: InitComponents, EvaluateNetwork, BlockSync, SubscribeNetwork

  • /api/v1/status/peer returns 200 OK at BlockSync
$ curl -i localhost:9000/api/v1/status/peer
HTTP/1.1 200 OK
Connection: close
Access-Control-Allow-Origin: *
Content-Length: 573
Content-Type: application/json
 
{
  "made_block_count": 0,
  "status": "Service is offline: block height sync",
  "state": "BlockSync",
  "service_available": false,
  "peer_type": "0",
  "audience_count": "0",
  "consensus": "siever",
  "peer_id": "hxa5c4ae316abcdd58baa3f7889e9c15c970c3cde6",
  "block_height": 5904,
  "round": -1,
  "epoch_height": -1,
  "unconfirmed_block_height": -1,
  "total_tx": 0,
  "unconfirmed_tx": 0,
  "peer_target": "20.20.1.87:7100",
  "leader_complaint": 1,
  "peer_count": 6,
  "leader": "hx0124b85ccaf05d4265d2e09ecfff19937d6ca063",
  "epoch_leader": "",
  "mq": {
    "peer": {
      "message_count": 0
    },
    "channel": {
      "message_count": 0
    },
    "score": {
      "message_count": 0
    }
  }
}
  • /api/v1/avail/peer returns 503 Service Unavailable at BlockSync
$ curl -i localhost:9000/api/v1/avail/peer
HTTP/1.1 503 Service Unavailable
Connection: close
Access-Control-Allow-Origin: *
Content-Length: 573
Content-Type: application/json
 
{
  "made_block_count": 0,
  "status": "Service is offline: block height sync",
  "state": "BlockSync",
  "service_available": false,
  "peer_type": "0",
  "audience_count": "0",
  "consensus": "siever",
  "peer_id": "hxa5c4ae316abcdd58baa3f7889e9c15c970c3cde6",
  "block_height": 5904,
  "round": -1,
  "epoch_height": -1,
  "unconfirmed_block_height": -1,
  "total_tx": 0,
  "unconfirmed_tx": 0,
  "peer_target": "20.20.1.87:7100",
  "leader_complaint": 1,
  "peer_count": 6,
  "leader": "hx0124b85ccaf05d4265d2e09ecfff19937d6ca063",
  "epoch_leader": "",
  "mq": {
    "peer": {
      "message_count": 0
    },
    "channel": {
      "message_count": 0
    },
    "score": {
      "message_count": 0
    }
  }
}
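
The load balancer style check described above can be sketched in a few lines. The example queries /api/v1/avail/peer and keeps both the HTTP status code and the JSON body. It assumes that HTTP 200 means the node is available, which this guide implies but does not state outright; `probe_avail` and `is_available` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request


def probe_avail(base_url: str = "http://localhost:9000", timeout: float = 5.0):
    """Query /api/v1/avail/peer and return (http_status, parsed_body).

    urllib raises HTTPError for non-2xx responses such as 503, so the error
    object is read like a normal response in order to keep the JSON body.
    """
    url = f"{base_url}/api/v1/avail/peer"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read())


def is_available(status_code: int) -> bool:
    """Treat only HTTP 200 as available (assumption, see the lead-in)."""
    return status_code == 200


if __name__ == "__main__":
    try:
        code, body = probe_avail()
        verdict = "available" if is_available(code) else "unavailable"
        print(code, body.get("state"), verdict)
    except urllib.error.URLError as exc:
        print(f"node not reachable: {exc.reason}")
```

Deciding on the status code alone, rather than parsing the body, matches how most load balancers evaluate a health check target.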

Node Status Detail

| Value name | Description | Reason or allowed value |
| --- | --- | --- |
| made_block_count | Number of blocks generated by the node | Reset when the node becomes a leader after a rotation or is restarted. |
| status | Service on/off status | "Service is online: 1": working (leader status); "Service is online: 0": working (not a leader status); "Service is offline: block height sync": block sync condition; "Service is offline: mq down": channel MQ issue |
| state | Node condition | Detailed information (see State Detail below) |
| service_available | Service condition | true or false |
| peer_type | Distinguishes the leader node from verifying nodes | "0": verifying node; "1": leader node |
| audience_count | Deprecated; it will be removed | |
| consensus | Consensus algorithm | "siever": current consensus algorithm; "LFT": consensus algorithm that will be adopted in the future |
| peer_id | Unique address of the node | 40 digit HEX string |
| block_height | Current block height of the node | |
| round | Number of rounds in the consensus process for the current block | |
| epoch_height | Height of the block being processed. For citizen nodes, the block height stops at the SubscribeNetwork block height because they don't participate in consensus. | |
| unconfirmed_block_height | Height of the unprocessed block | Same as epoch_height |
| total_tx | Total number of TXs up to the current block | |
| unconfirmed_tx | Number of unprocessed TXs held in the queue | If the leader holds them for more than one minute, a leader complaint occurs |
| peer_target | IP address and port of the node | "IP:PORT" |
| leader_complaint | Deprecated; it will be removed | |
| peer_count | Total number of nodes in the blockchain network | |
| leader | Unique address of the leader node | 40 digit HEX string |
| epoch_leader | Unique address of the leader node in the consensus process | 40 digit HEX string |
| mq.peer.message_count | Accumulated number of messages in the peer MQ | Shows -1 when an error occurs; details can be found in "error" |
| mq.channel.message_count | Accumulated number of messages in the channel MQ | Shows -1 when an error occurs; details can be found in "error" |
| mq.score.message_count | Accumulated number of messages in the SCORE MQ | Shows -1 when an error occurs; details can be found in "error" |

State Detail

| State Value | Description |
| --- | --- |
| InitComponents | Channel Service initial state |
| Consensus | Loopchain consensus begins. Changes to BlockHeightSync automatically |
| BlockHeightSync | Block height sync begins. Changes to EvaluateNetwork automatically |
| EvaluateNetwork | Evaluates the BlockSync state by checking the network status |
| BlockSync | Block sync loop |
| SubscribeNetwork | Determines the behavior depending on the node type. A citizen node requests block generation messages from a parent node |
| Watch | Citizen node default state. Relays transactions and syncs the blocks created by a parent node |
| Vote | Validating and voting on blocks created by the leader |
| LeaderComplain | Requesting a leader complaint against the current leader and waiting for the complaint to complete |
| BlockGenerate | The leader is creating a block |
| GracefulShutdown | End of process |
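
For monitoring, the fields described above can be condensed into a few flags. The sketch below works on an already parsed /api/v1/status/peer response; `summarize_status` and the flag names are hypothetical, and the set of unavailable states is the one this guide lists for the HTTP 503 case.

```python
# States in which, per this guide, /api/v1/avail/peer answers HTTP 503.
UNAVAILABLE_STATES = {
    "InitComponents",
    "EvaluateNetwork",
    "BlockSync",
    "SubscribeNetwork",
}


def summarize_status(status: dict) -> dict:
    """Condense a /api/v1/status/peer response into a few monitoring flags."""
    mq = status.get("mq", {})
    return {
        "state": status.get("state"),
        # peer_type "1" marks the leader node, "0" a verifying node.
        "is_leader": status.get("peer_type") == "1",
        "in_unavailable_state": status.get("state") in UNAVAILABLE_STATES,
        # A message_count of -1 signals an MQ error according to this guide.
        "mq_errors": sorted(
            name for name, info in mq.items() if info.get("message_count") == -1
        ),
    }


if __name__ == "__main__":
    sample = {
        "state": "Vote",
        "peer_type": "0",
        "mq": {
            "peer": {"message_count": 0},
            "channel": {"message_count": 0},
            "score": {"message_count": 0},
        },
    }
    print(summarize_status(sample))
```

Keeping the summary as a plain dictionary makes it easy to forward to whatever alerting or logging system you already use.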

Docker Environment Variables

If you want to change the time zone, open docker-compose.yml in a text editor and add the following content:

version: '3'
services:
   prep:
      image: 'iconloop/prep-node:1907091410x2f8b2e'
      container_name: "prep-node"
      network_mode: host
      environment:
         LOOPCHAIN_LOG_LEVEL: "DEBUG"
         DEFAULT_PATH: "/data/loopchain"
         LOCAL_TEST: "true"
         LOG_OUTPUT_TYPE: "file"
         TIMEOUT_FOR_LEADER_COMPLAIN: "120"
         MAX_TIMEOUT_FOR_LEADER_COMPLAIN: "600"
         LOAD_PEERS_FROM_IISS: "true"
         FIND_NEIGHBOR: "true"
         TZ: "America/Los_Angeles"
      cap_add:
         - SYS_TIME
      volumes:
         - ./data:/data
      ports:
         - 9000:9000
         - 7100:7100

The P-Rep Node image supports the following environment variables:

| Environment variable | Description | Default value | Allowed value |
| --- | --- | --- | --- |
| IPADDR | Setting the IP address | $EXT_IPADDR | |
| LOCAL_TEST | | false | |
| TZ | Setting the TimeZone Environment | Asia/Seoul | List of TZ name |
| NETWORK_ENV | | PREP | |
| SERVICE | Service Name | default | |
| SERVICE_API | SERVICE_API URI | https://${SERVICE}.net.solidwallet.io/api/v3 | URI |
| ENDPOINT_URL | ENDPOINT API URI | https://${SERVICE}.net.solidwallet.io | URI |
| NTP_SERVER | NTP server address | time.google.com | |
| NTP_REFRESH_TIME | NTP refresh time | 21600 | |
| FIND_NEIGHBOR | Find the fastest neighboring P-Rep | false | |
| DEFAULT_PATH | Setting the default root path | /data/${NETWORK_ENV} | |
| DEFAULT_LOG_PATH | Setting the logging path | ${DEFAULT_PATH}/log | |
| DEFAULT_STORAGE_PATH | Path where the block DB will be stored | ${DEFAULT_PATH}/.storage | |
| USE_NAT | Set if you want to use a NAT network | no | |
| NETWORK_NAME | | | |
| VIEW_CONFIG | For checking the deployment state | false | boolean (true/false) |
| USE_MQ_ADMIN | Enable RabbitMQ management Web interface. The management UI can be accessed using a Web browser at http://{node | false | |
| MQ_ADMIN | RabbitMQ management username | admin | |
| MQ_PASSWORD | RabbitMQ management password | iamicon | |
| LOOPCHAIN_LOG_LEVEL | loopchain log level | INFO | |
| ICON_LOG_LEVEL | iconservice log level | INFO | |
| LOG_OUTPUT_TYPE | loopchain's output log type | file | file, console, file\|console |
| outputType | iconservice's output log type | $LOG_OUTPUT_TYPE | file, console, file\|console |
| CONF_PATH | Setting the configure file path | /${APP_DIR}/conf | |
| CERT_PATH | Setting the certificate key file path | /${APP_DIR}/cert | |
| REDIRECT_PROTOCOL | | http | |
| SUBSCRIBE_USE_HTTPS | | false | |
| ICON_NID | Setting the ICON Network ID number | 0x50 | |
| ALLOW_MAKE_EMPTY_BLOCK | | false | |
| score_fee | | true | |
| score_audit | | false | |
| scoreRootPath | | ${DEFAULT_PATH}/.score_data/score | |
| stateDbRootPath | | ${DEFAULT_PATH}/.score_data/db | |
| iissDbRootPath | | ${DEFAULT_PATH}/.iissDb | |
| CHANNEL_BUILTIN | | true | boolean (true/false) |
| PEER_NAME | | uname | |
| PUBLIC_PATH | Public cert key location | ${CERT_PATH}/${IPADDR}_public.der | |
| PRIVATE_PATH | Private cert key location | ${CERT_PATH}/${IPADDR}_private.der | |
| PRIVATE_PASSWORD | Private cert key password | test | |
| LOAD_PEERS_FROM_IISS | | true | |
| CHANNEL_MANAGE_DATA_PATH | | ${CONF_PATH}/channel_manange_data.json | |
| CONFIG_API_SERVER | | https://download.solidwallet.io | |
| GENESIS_DATA_PATH | | ${CONF_PATH}/genesis.json | |
| BLOCK_VERSIONS | | | |
| NEXT_BLOCK_VERSION_HEIGHT | | | |
| FORCE_RUN_MODE | Setting the loopchain running parameter e.g. if FORCE_RUN_MODE is | | |
| configure_json | | ${CONF_PATH}/configure.json | |
| iconservice_json | | ${CONF_PATH}/iconservice.json | |
| iconrpcserver_json | | ${CONF_PATH}/iconrpcserver.json | |
| ICON_REVISION | | 4 | |
| ROLE_SWITCH_BLOCK_HEIGHT | | 1 | |
| mainPRepCount | | 22 | |
| mainAndSubPRepCount | | 100 | |
| decentralizeTrigger | | 0.002 | |
| RPC_PORT | Choose an RPC service port | 9000 | |
| RPC_WORKER | Setting the number of RPC workers | 3 | |
| RPC_GRACEFUL_TIMEOUT | RPC graceful timeout | 0 | |
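
Several defaults above are defined relative to DEFAULT_PATH. The sketch below shows how those templates resolve for a given root path. It is only an illustration built on Python's `string.Template`; how the image expands the variables internally is not shown in this guide, and `resolve_paths` is a hypothetical helper name.

```python
from string import Template

# Default templates copied from the table; only path-related variables are shown.
PATH_DEFAULTS = {
    "DEFAULT_LOG_PATH": "${DEFAULT_PATH}/log",
    "DEFAULT_STORAGE_PATH": "${DEFAULT_PATH}/.storage",
    "scoreRootPath": "${DEFAULT_PATH}/.score_data/score",
    "stateDbRootPath": "${DEFAULT_PATH}/.score_data/db",
    "iissDbRootPath": "${DEFAULT_PATH}/.iissDb",
}


def resolve_paths(default_path: str) -> dict:
    """Expand every template with the given DEFAULT_PATH value."""
    return {
        name: Template(template).substitute(DEFAULT_PATH=default_path)
        for name, template in PATH_DEFAULTS.items()
    }


if __name__ == "__main__":
    # /data/PREP-MainNet is the DEFAULT_PATH value shown in the booting log.
    for name, path in resolve_paths("/data/PREP-MainNet").items():
        print(f"{name}={path}")
```

With `/data/PREP-MainNet` as input, the resolved values line up with the paths printed in the booting log samples in this guide.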

Troubleshooting

Q: How to check if container is running or not

The docker ps command shows the list of running docker containers.

$ docker ps
CONTAINER ID   IMAGE                                                          COMMAND                CREATED              STATUS                          PORTS                                                                 NAMES
0de99e33cdc9     iconloop/prep-node:1907091410x2f8b2e    "/src/entrypoint.sh"      2 minutes ago        Up 2 minutes(healthy)    0.0.0.0:7100->7100/tcp, 0.0.0.0:9000->9000/tcp prep_prep_1

Look at the STATUS field to see whether the container is up and in a healthy state.

Inside the container, there is a healthcheck script running with the following configuration. It will return unhealthy when it fails.

| Healthcheck option | Value |
| --- | --- |
| retries | 4 |
| interval | 30s |
| timeout | 20s |
| start-period | 60s |
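
These settings determine how quickly a fault shows up as unhealthy. The sketch below gives a rough upper bound under general Docker health check semantics: the container is marked unhealthy after `retries` consecutive failed probes, and a probe that hangs counts as failed once `timeout` elapses. It is an estimate, not a measured value, and `worst_case_unhealthy_seconds` is a hypothetical helper name. Failures during the 60s start period are not counted, so detection right after startup can take longer.

```python
def worst_case_unhealthy_seconds(retries: int, interval: float, timeout: float) -> float:
    """Rough upper bound on the delay between a fault and the unhealthy status.

    Assumes every probe hangs until its timeout and that the next probe
    starts `interval` seconds later, so each failed attempt costs
    interval + timeout seconds.
    """
    return retries * (interval + timeout)


if __name__ == "__main__":
    # Values from the health check configuration in this guide.
    print(worst_case_unhealthy_seconds(retries=4, interval=30, timeout=20))
```

With the values above the estimate is 4 × (30 + 20) = 200 seconds; probes that fail immediately shorten the delay accordingly.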

The container can have three states:

  • starting - the container has just started
  • healthy - when the health check passes
  • unhealthy - when the health check fails
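
If you script around `docker ps`, the health state can be pulled out of the STATUS column text. The sketch below assumes the usual Docker wording, where the state appears in parentheses as `(healthy)`, `(unhealthy)` or `(health: starting)`; `health_from_status` is a hypothetical helper name.

```python
import re

# Matches the parenthesised health marker at any position in the STATUS text.
_HEALTH_RE = re.compile(r"\((healthy|unhealthy|health: starting)\)")


def health_from_status(status: str) -> str:
    """Map a docker ps STATUS string to starting, healthy, unhealthy or none."""
    match = _HEALTH_RE.search(status)
    if match is None:
        return "none"  # no health marker reported for this container
    marker = match.group(1)
    return "starting" if marker == "health: starting" else marker


if __name__ == "__main__":
    for text in ("Up 2 minutes (healthy)", "Up 5 seconds (health: starting)", "Up 3 hours"):
        print(f"{text!r} -> {health_from_status(text)}")
```

Searching for the marker anywhere in the string keeps the helper tolerant of spacing differences such as `Up 2 minutes(healthy)`.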

If the container does not start properly or goes down unexpectedly, check booting.log. The log messages below indicate a successful start.

$ cat data/PREP-MainNet/log/booting_${DATE}.log 

[2019-08-12 02:19:01.435] Your IP: 20.20.1.195
[2019-08-12 02:19:01.439] RPC_PORT: 9000 / RPC_WORKER: 3
[2019-08-12 02:19:01.444] DEFAULT_PATH=/data/PREP-MainNet in Docker Container
[2019-08-12 02:19:01.449] DEFAULT_LOG_PATH=/data/PREP-MainNet/log
[2019-08-12 02:19:01.454] DEFAULT_STORAGE_PATH=/data/PREP-MainNet/.storage
[2019-08-12 02:19:01.459] scoreRootPath=/data/PREP-MainNet/.score_data/score
[2019-08-12 02:19:01.464] stateDbRootPath=/data/PREP-MainNet/.score_data/db
[2019-08-12 02:19:01.468] P-REP package version info - 1907091410x2f8b2e
[2019-08-12 02:19:02.125] iconcommons 1.0.5.2 iconrpcserver 1.3.1 iconservice 1.3.0 loopchain 2.1.7
[2019-08-12 02:19:07.107] Enable rabbitmq_management
[2019-08-12 02:19:10.676] Network: PREP-MainNet
[2019-08-12 02:19:10.682] Run loop-peer and loop-channel start
[2019-08-12 02:19:10.687] Run iconservice start!
[2019-08-12 02:19:10.692] Run iconrpcserver start!

Q: How to find error

Error log messages example

Grep the ERROR messages from the log files to find the possible cause of the failure.

$ cat data/PREP-MainNet/log/booting_${DATE}.log | grep ERROR

[2019-08-12 02:08:48.746] [ERROR] Download Failed - http://20.20.1.149:5000/cert/20.20.1.195_public.der status_code=000

[2019-08-12 01:58:46.439] [ERROR] Unauthorized IP address, Please contact our support team
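
When the log grows, it can help to collect the ERROR entries in time order instead of reading raw grep output. The sketch below assumes the `[timestamp] [ERROR] message` line format seen in the samples above; `extract_errors` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Line format taken from the samples above: [YYYY-MM-DD HH:MM:SS.mmm] [ERROR] text
_LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\] \[ERROR\] (?P<msg>.*)$"
)


def extract_errors(lines):
    """Return (timestamp, message) tuples for ERROR lines, oldest first."""
    errors = []
    for line in lines:
        match = _LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            errors.append((stamp, match.group("msg")))
    return sorted(errors)


if __name__ == "__main__":
    sample = [
        "[2019-08-12 02:08:48.746] [ERROR] Download Failed - "
        "http://20.20.1.149:5000/cert/20.20.1.195_public.der status_code=000",
        "[2019-08-12 01:58:46.439] [ERROR] Unauthorized IP address, Please contact our support team",
        "[2019-08-12 02:19:10.676] Network: PREP-MainNet",
    ]
    for stamp, message in extract_errors(sample):
        print(stamp.isoformat(), message)
```

To run it against a real file, pass an open file object, for example `extract_errors(open(path))`, since the function accepts any iterable of lines.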

The Docker container generates the following log files:

  • booting.log
    • The log file contains the errors that occurred when the docker container starts up.
  • iconrpcserver.log
    • The log file contains information about the request/response message handling going through the iconrpcserver.
  • iconservice.log
    • The log file contains the internals of ICON Service
  • loopchain.channel-txcreator-icon_dex_broadcast.icon_dex.log
    • The log file contains information about TX broadcast from a node to other nodes
  • loopchain.channel-txcreator.icon_dex.log
    • The log file contains information about the process of confirming TX
  • loopchain.channel-txreceiver.icon_dex.log
    • The log file contains information about receiving the broadcasted TX from a node.
  • loopchain.channel.icon_dex.log
    • The log file contains information about internals of loopchain engine

Q: How to monitor resources

We recommend the following tools for resource monitoring

  1. Network monitoring - iftop, nethogs, vnstat
  2. CPU/Memory monitoring - top, htop
  3. Disk I/O monitoring - iostat, iotop
  4. Docker monitoring - docker stats, ctop