David Salek edited this page Apr 23, 2018 · 17 revisions

We choose to store and analyze data in the cloud. The complete solution consists of the following components. Click on the links below to see the installation instructions.

  • EC2 instance to run all the services and store data
  • Anaconda for the Python data analytics environment
  • Git to get the plant monitor code
  • PostgreSQL database to store data
  • Flask for the REST API endpoints to upload data to the database and files
  • Swagger to view the REST API documentation
  • JupyterHub to write Jupyter notebooks for data analysis and visualisations, supporting multiple users
  • Dash to view measurements in an interactive dashboard

Once everything is installed, all the services can be started after a restart of the EC2 instance using the following script from this repository.

ssh -i ~/.ssh/my_first_EC2_key.pem ec2-user@ec2-34-243-27-176.eu-west-1.compute.amazonaws.com

source plant_monitor_aws/ec2_start.sh

EC2 instance

The first step is to launch an Amazon Linux EC2 instance. A t2.micro instance is sufficient for this project.

The following inbound traffic has to be enabled in the EC2 security group:

  • SSH, port 22 - for the SSH access
  • HTTP, port 80, 0.0.0.0/0 - for the web access (will be used for Swagger)
  • custom TCP rule, port 5432, your IP address - for PostgreSQL
  • custom TCP rule, port 5000, your IP address - for Flask
  • custom TCP rule, port 8000, your IP address - for JupyterHub
  • custom TCP rule, port 8888, your IP address - for Jupyter
  • custom TCP rule, port 8050, your IP address - for Dash
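
The rules above can also be written down as data, which makes them easier to review or to apply with a script. The following is a minimal sketch, not part of the repository: it builds the list in the IpPermissions format that boto3's authorize_security_group_ingress accepts. The function name, the example address and the security group ID are hypothetical, and restricting SSH to your own IP address is an assumption, since the list above gives no source for port 22.

```python
def build_ingress_rules(my_ip_cidr):
    """Return the inbound rules listed above in boto3's IpPermissions format."""
    # (port, source CIDR, description) for each rule in the list above.
    # Assumption: SSH is limited to your own address; the list names no source for it.
    rules = [
        (22, my_ip_cidr, "SSH"),
        (80, "0.0.0.0/0", "HTTP (Swagger UI)"),
        (5432, my_ip_cidr, "PostgreSQL"),
        (5000, my_ip_cidr, "Flask"),
        (8000, my_ip_cidr, "JupyterHub"),
        (8888, my_ip_cidr, "Jupyter"),
        (8050, my_ip_cidr, "Dash"),
    ]
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": description}],
        }
        for port, cidr, description in rules
    ]


# 203.0.113.7 is a documentation-only address; replace it with your own IP.
permissions = build_ingress_rules("203.0.113.7/32")
for rule in permissions:
    print(rule["FromPort"], rule["IpRanges"][0]["CidrIp"])

# Applying the rules needs boto3 and AWS credentials (the group ID is hypothetical):
# import boto3
# boto3.client("ec2").authorize_security_group_ingress(
#     GroupId="sg-0123456789abcdef0", IpPermissions=permissions)
```

Only port 80 is open to everyone; every other port stays limited to a single address.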

The instance can be accessed with SSH using your private key.

ssh -i .ssh/my_first_EC2_key.pem ec2-user@ec2-34-243-177-120.eu-west-1.compute.amazonaws.com

Update packages.

sudo yum update

Besides the default user ec2-user, add new users: plant_monitor for development purposes, student01 for students and teacher01 for teachers. The following link gives some further information on how to do this. https://aws.amazon.com/premiumsupport/knowledge-center/new-user-accounts-linux-instance/

sudo adduser plant_monitor
sudo passwd plant_monitor

The new user will be used by students to access and analyse data and will not have sudo rights. Only ec2-user keeps sudo rights, as specified in /etc/sudoers.d/cloud-init

For vim users, I recommend downloading the .vimrc file below.

wget https://gist.githubusercontent.com/salekd/1ebae93b5d237daebaf2dfc101b92e19/raw/81d74c53f6ccc1db9f41433a678d4696a7663196/.vimrc

In case you need a break at some point, the good old terminal games may come in handy.

wget http://www.melvilletheatre.com/articles/el6/bsd-games-2.17-28.el6.x86_64.rpm
sudo yum install bsd-games-2.17-28.el6.x86_64.rpm

In particular, hangman is my favourite. In the light of this project, let us create a new file /home/ec2-user/hangman.words with a custom word list:

plantmonitor
raspberrypi

Back up the default word dictionary and put the new one in place.

sudo mv /usr/share/dict/words /usr/share/dict/words.bak
sudo ln -s /home/ec2-user/hangman.words /usr/share/dict/words

Play hangman and have fun ;-)


Anaconda

Anaconda provides a Python environment with pre-installed packages for data analytics. Get the version for Python 3 and install.

wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
bash Anaconda3-5.0.1-Linux-x86_64.sh

Specify /opt/anaconda3 as the installation path.

which python
export PATH=/opt/anaconda3/bin:$PATH
which python

The newly installed version of Python should be accessible to sudo users as well. The following link explains how to set the path for sudo commands. https://superuser.com/questions/927512/how-to-set-path-for-sudo-commands

Edit the sudoers file

sudo visudo

and add the Anaconda path there.

Defaults    secure_path = /opt/anaconda3/bin:/sbin:/bin:/usr/sbin:/usr/bin

Add export PATH="/opt/anaconda3/bin:$PATH" to /etc/profile for all users.


Git

We will use git to get the plant monitor code. Install and configure git.

sudo yum install git

git config --global user.name "..."
git config --global user.email ...

Add the following lines into .bash_profile in order to see the git branch in the command line prompt.

parse_git_branch() {
    git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/ (\1)/'
}
export PS1="\u@\h \[\033[32m\]\w\[\033[33m\]\$(parse_git_branch)\[\033[00m\] $ "

Clone the repository.

git clone https://github.com/salekd/plant_monitor_aws.git

PostgreSQL

The PostgreSQL database is used to store the sensor data. I followed the installation steps described here https://github.com/snowplow/snowplow/wiki/Setting-up-PostgreSQL

sudo yum install postgresql postgresql-server postgresql-devel postgresql-contrib postgresql-docs
sudo service postgresql initdb

Specify the user access in /var/lib/pgsql9/data/pg_hba.conf. You can directly copy the file from this repository https://github.com/salekd/plant_monitor_aws/blob/master/config/pg_hba.conf

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             flask_app       0.0.0.0/0               md5
host    all             plant_monitor   0.0.0.0/0               md5
# IPv6 local connections:
host    all             all             ::1/128                 md5

Make the database accessible from outside by making the following changes in /var/lib/pgsql9/data/postgresql.conf. You can directly copy the file from this repository https://github.com/salekd/plant_monitor_aws/blob/master/config/postgresql.conf

listen_addresses = '*'
port = 5432

Start the service.

sudo service postgresql start

Open a psql session as user postgres.

sudo su - postgres
psql -U postgres

Create users and passwords.

ALTER USER postgres WITH PASSWORD '...';
CREATE USER flask_app NOSUPERUSER;
ALTER USER flask_app WITH PASSWORD '...';
CREATE USER plant_monitor NOSUPERUSER;
ALTER USER plant_monitor WITH PASSWORD '...';

Create the database and the tables for the measurements and grant the access rights. User flask_app has all privileges and is used by the Flask REST API to upload data to the database tables. User plant_monitor has read-only access and is used by students to read data.

CREATE DATABASE plant_monitor_db;

\list
\connect plant_monitor_db

CREATE TABLE measurements (
     device VARCHAR(20),
     time TIMESTAMP,
     moisture REAL,
     temperature REAL,
     conductivity REAL,
     light REAL
);
GRANT ALL PRIVILEGES ON TABLE measurements TO flask_app;
GRANT SELECT ON TABLE measurements TO plant_monitor;

CREATE TABLE bme280 (
     device VARCHAR(20),
     time TIMESTAMP,
     temperature REAL,
     pressure REAL,
     humidity REAL
);
GRANT ALL PRIVILEGES ON TABLE bme280 TO flask_app;
GRANT SELECT ON TABLE bme280 TO plant_monitor;

CREATE TABLE si1145 (
     device VARCHAR(20),
     time TIMESTAMP,
     visible REAL,
     IR REAL,
     UV REAL
);
GRANT ALL PRIVILEGES ON TABLE si1145 TO flask_app;
GRANT SELECT ON TABLE si1145 TO plant_monitor;

CREATE TABLE pump (
     device VARCHAR(20),
     time TIMESTAMP,
     duration REAL
);
GRANT ALL PRIVILEGES ON TABLE pump TO flask_app;
GRANT SELECT ON TABLE pump TO plant_monitor;

This is an example of how to insert and read data.

INSERT INTO measurements (device, time, moisture, temperature, conductivity, light)
VALUES (0, CURRENT_TIMESTAMP, 0, 0, 0, 0);

SELECT * FROM measurements;

The database can be accessed remotely. For example, on a Mac, PostgreSQL (including the psql client) can be installed with Homebrew.

brew install postgres

Connect to the database in the EC2 instance

psql -h ec2-34-243-27-176.eu-west-1.compute.amazonaws.com -p 5432 -U plant_monitor -d plant_monitor_db

and get the measurements.

\dt
SELECT * FROM measurements;
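
The same query can be run from Python, for instance inside a notebook. The sketch below is an illustration under stated assumptions rather than code from the repository: fetch_measurements is a hypothetical helper that works with any DB-API connection, and an in-memory SQLite database stands in for the PostgreSQL server so that the example runs without network access. The commented lines show how a psycopg2 connection could be passed in instead.

```python
import sqlite3


def fetch_measurements(connection, limit=10):
    """Return the most recent rows of the measurements table as a list of tuples."""
    cursor = connection.cursor()
    # The limit is forced to an int and formatted in directly, which avoids the
    # different placeholder styles of psycopg2 (%s) and sqlite3 (?).
    cursor.execute(
        "SELECT device, time, moisture, temperature, conductivity, light "
        "FROM measurements ORDER BY time DESC LIMIT {:d}".format(int(limit))
    )
    rows = cursor.fetchall()
    cursor.close()
    return rows


# Stand-in database with the same columns as the measurements table above.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE measurements (device TEXT, time TEXT, moisture REAL, "
    "temperature REAL, conductivity REAL, light REAL)"
)
demo.execute(
    "INSERT INTO measurements VALUES "
    "('C4:7C:8D:65:BD:76', '2018-01-23 22:00:41', 0, 19.7, 0, 39)"
)
rows = fetch_measurements(demo)
print(rows)

# Against the real database (requires psycopg2; fill in the password):
# import psycopg2
# connection = psycopg2.connect(
#     host="ec2-34-243-27-176.eu-west-1.compute.amazonaws.com", port=5432,
#     user="plant_monitor", password="...", dbname="plant_monitor_db")
# print(fetch_measurements(connection))
```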

Flask

Flask is a framework to write RESTful microservices in Python. Install the following packages.

sudo pip install Flask
sudo pip install Flask-SQLAlchemy
sudo pip install psycopg2
sudo pip install flask-restful-swagger-2
sudo pip install flask-cors

A Flask REST API will be used to:

  • upload sensor measurements to the PostgreSQL database and to store the same data in a csv file,
  • save a snapshot from a camera.
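
To illustrate the CSV part of the first point, here is a minimal sketch of appending one measurement to a CSV file. It is not the code in flask_app.py: the helper name is hypothetical, the field names are taken from the JSON payload in the curl example further down this page, and the file name in the final comment is an assumption.

```python
import csv
import io

# Field names follow the JSON payload used in the curl example on this page.
FIELDS = ["device", "timestamp", "moisture", "temperature", "conductivity", "light"]


def append_measurement(csv_file, measurement):
    """Write one measurement dict as a CSV row to an open text file."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDS)
    # Write the header only when the file is still empty.
    if csv_file.tell() == 0:
        writer.writeheader()
    writer.writerow(measurement)


# An in-memory buffer stands in for a file under /data/measurements.
buffer = io.StringIO()
append_measurement(buffer, {
    "device": "C4:7C:8D:65:BD:76",
    "timestamp": "2018-01-23 22:00:41.062114",
    "moisture": 0,
    "temperature": 19.7,
    "conductivity": 0,
    "light": 39,
})
print(buffer.getvalue())

# With a real file (the file name is hypothetical):
# with open("/data/measurements/measurements.csv", "a", newline="") as handle:
#     append_measurement(handle, measurement)
```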

Read the following links to find out more about uploading files and security:

Create a new directory accessible to all users for storing the data and change the owner.

sudo mkdir /data
sudo chown ec2-user:ec2-user /data
mkdir /data/measurements
mkdir /data/images
cp /home/ec2-user/plant_monitor_rpi/measurements/* /data/measurements
cp /home/ec2-user/plant_monitor_rpi/images/* /data/images

Enter the correct password in the PostgreSQL URI in flask_app.cfg.

Run the Flask application.

cd /home/ec2-user/plant_monitor_aws
nohup python flask_app.py &

This is how to upload a measurement from the command line.

curl -H "Content-Type: application/json" -X POST -d '{"device":"C4:7C:8D:65:BD:76", "timestamp": "2018-01-23 22:00:41.062114", "moisture": 0, "temperature": 19.7, "conductivity": 0, "light": 39}' http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:5000/measurement
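
The same upload can be done from Python, for example from the device that takes the measurements. The sketch below uses only the standard library and mirrors the curl command above. The helper name build_measurement_request is hypothetical, and the actual sending is left commented out so that the example does not touch the network.

```python
import json
import urllib.request


def build_measurement_request(base_url, measurement):
    """Build a POST request carrying one measurement as JSON."""
    body = json.dumps(measurement).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/measurement",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_measurement_request(
    "http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:5000",
    {
        "device": "C4:7C:8D:65:BD:76",
        "timestamp": "2018-01-23 22:00:41.062114",
        "moisture": 0,
        "temperature": 19.7,
        "conductivity": 0,
        "light": 39,
    },
)
print(request.get_method(), request.full_url)

# Uncomment to send the request to the running Flask service:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```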

Swagger documentation for the Flask REST API can be generated with flask-restful-swagger-2. See this link for more information https://github.com/swege/flask-restful-swagger-2.0

Download the json file describing the RESTful service.

curl http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:5000/api/swagger.json -o /home/ec2-user/plant_monitor_aws/swagger.json
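
A quick way to check the downloaded file is to list the operations it describes. The following sketch assumes the file is a Swagger 2.0 document with a top-level "paths" object. The helper name and the inline example are hypothetical: only the /measurement endpoint is confirmed by the curl example above.

```python
import json

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def list_operations(spec):
    """Return sorted (METHOD, path) pairs from a Swagger 2.0 document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


# Hypothetical fragment standing in for the real swagger.json.
example_spec = {
    "swagger": "2.0",
    "paths": {"/measurement": {"post": {"summary": "..."}, "parameters": []}},
}
print(list_operations(example_spec))

# With the downloaded file:
# with open("/home/ec2-user/plant_monitor_aws/swagger.json") as handle:
#     print(list_operations(json.load(handle)))
```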

Alternatively, one can use flask-autodoc instead of Swagger for the RESTful service documentation. https://github.com/acoomans/flask-autodoc


Swagger

We will use Docker to install Swagger-UI in order to view the RESTful service documentation online. https://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html#install_docker

Install Docker.

sudo yum update -y
sudo yum install -y docker

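Start the service and add ec2-user to the docker group so that Docker commands can be run without sudo. Log out and back in for the group change to take effect.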

sudo service docker start
sudo usermod -a -G docker ec2-user

See the installation instructions for Swagger-UI here https://github.com/swagger-api/swagger-ui/blob/master/docs/usage/installation.md

Get the Docker image and run Swagger-UI on port 80.

docker pull swaggerapi/swagger-ui
docker run -p 80:8080 -e SWAGGER_JSON=/spec/swagger.json -e VALIDATOR_URL=null -v /home/ec2-user/plant_monitor_aws/:/spec swaggerapi/swagger-ui

See the documentation of our Flask RESTful service in Swagger-UI online here http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com/


JupyterHub

JupyterHub is a multi-user server for Jupyter notebooks. It will be configured below to give admin rights to user ec2-user. User plant_monitor will be able to create notebooks in their linux account home directory. The setup is inspired by the instructions found here.

Install JupyterHub.

sudo pip install jupyterhub

Install Node.js, npm and configurable-http-proxy.

sudo yum install nodejs npm --enablerepo=epel
sudo npm install -g configurable-http-proxy

Generate SSL certificates.

mkdir certs
cd certs
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem

Generate the configuration file.

jupyterhub --generate-config

Add the following lines into jupyterhub_config.py. You can directly copy the file from this repository to your home directory https://github.com/salekd/plant_monitor_aws/blob/master/config/jupyterhub_config.py

c.JupyterHub.port = 8000
c.JupyterHub.services = [
    {
        'name': 'cull-idle',
        'admin': True,
        'command': 'python cull_idle_servers.py --timeout=3600'.split(),
    }
]
c.JupyterHub.ssl_cert = '/home/ec2-user/certs/mycert.pem'
c.JupyterHub.ssl_key = '/home/ec2-user/certs/mykey.key'
c.Spawner.notebook_dir = '~/notebooks'
c.Authenticator.admin_users = {'ec2-user'}
c.Authenticator.whitelist = {'ec2-user', 'plant_monitor', 'student01', 'student02', 'student03', 'student04', 'student05', 'student06', 'student07', 'student08', 'student09', 'student10', 'teacher01', 'teacher02', 'teacher03'}
c.LocalAuthenticator.create_system_users = True
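
Since jupyterhub_config.py is a Python file, the long whitelist does not have to be typed out by hand. The sketch below shows one way to generate the same set of names. The helper numbered_users is hypothetical and not part of the repository's configuration file.

```python
def numbered_users(prefix, count):
    """Return names such as student01 ... student10 as a set."""
    return {"{}{:02d}".format(prefix, i) for i in range(1, count + 1)}


whitelist = (
    {"ec2-user", "plant_monitor"}
    | numbered_users("student", 10)
    | numbered_users("teacher", 3)
)
print(sorted(whitelist))

# In jupyterhub_config.py the result would be assigned as:
# c.Authenticator.whitelist = whitelist
```

Changing the number of students or teachers then only requires changing one argument.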

In order to save resources, the cull-idle service can be used to shut down idle single-user notebook servers. https://github.com/jupyterhub/jupyterhub/tree/master/examples/cull-idle

You will need to download the Python script into your home directory.

wget https://raw.githubusercontent.com/jupyterhub/jupyterhub/master/examples/cull-idle/cull_idle_servers.py

Check that the PAM authentication is working.

sudo python -m pamela -a plant_monitor

Create a new directory for Jupyter notebooks in the plant_monitor user home directory.

sudo mkdir /home/plant_monitor/notebooks
sudo chown plant_monitor:plant_monitor /home/plant_monitor/notebooks
sudo cp /home/ec2-user/plant_monitor_rpi/notebooks/*.ipynb /home/plant_monitor/notebooks

For the other users, you can use a for loop.

for i in {01..10}
do
    sudo mkdir /home/student${i}/notebooks
    sudo chown student${i}:student${i} /home/student${i}/notebooks
    sudo cp /home/ec2-user/plant_monitor_rpi/notebooks/*.ipynb /home/student${i}/notebooks
done
for i in {01..03}
do
    sudo mkdir /home/teacher${i}/notebooks
    sudo chown teacher${i}:teacher${i} /home/teacher${i}/notebooks
    sudo cp /home/ec2-user/plant_monitor_rpi/notebooks/*.ipynb /home/teacher${i}/notebooks
done

Start JupyterHub as a superuser.

sudo nohup jupyterhub &

JupyterHub will be available on port 8000.

https://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:8000

For troubleshooting, see https://github.com/jupyterhub/jupyterhub/issues/1377


Dash

Dash is a Python framework for building analytical web applications. It is used here to create an interactive dashboard. See the links below for more information.

Install the following packages.

sudo pip install dash==0.21.0  # The core dash backend
sudo pip install dash-renderer==0.11.3  # The dash front-end
sudo pip install dash-html-components==0.9.0  # HTML components
sudo pip install dash-core-components==0.20.2  # Supercharged components
sudo pip install plotly --upgrade  # Plotly graphing library used in examples

Dash will be available on port 8050.

http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:8050/
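
As an illustration of what such a dashboard could look like, here is a minimal sketch and not the dashboard from the repository. build_figure turns measurement rows into a plain Plotly figure dictionary with one line per device, and build_app wraps it in a Dash layout using the package versions installed above. Both function names and the sample rows are hypothetical. The Dash imports are kept inside build_app so that the figure helper can be tried without Dash installed.

```python
def build_figure(rows, column="moisture"):
    """Turn measurement dicts into a Plotly figure description (a plain dict)."""
    devices = sorted({row["device"] for row in rows})
    traces = []
    for device in devices:
        # One trace per device, ordered by time.
        device_rows = sorted(
            (row for row in rows if row["device"] == device),
            key=lambda row: row["time"],
        )
        traces.append({
            "type": "scatter",
            "mode": "lines+markers",
            "name": device,
            "x": [row["time"] for row in device_rows],
            "y": [row[column] for row in device_rows],
        })
    return {"data": traces, "layout": {"title": column, "xaxis": {"title": "time"}}}


def build_app(figure):
    """Wrap a figure in a one-page Dash application."""
    import dash
    import dash_core_components as dcc
    import dash_html_components as html

    app = dash.Dash()
    app.layout = html.Div([
        html.H1("Plant monitor"),
        dcc.Graph(id="measurements", figure=figure),
    ])
    return app


# Hypothetical sample rows, deliberately out of order.
sample_rows = [
    {"device": "C4:7C:8D:65:BD:76", "time": "2018-01-23 23:00:00", "moisture": 12.0},
    {"device": "C4:7C:8D:65:BD:76", "time": "2018-01-23 22:00:00", "moisture": 10.0},
]
figure = build_figure(sample_rows)
print(figure["data"][0]["y"])

# To serve the dashboard on port 8050:
# build_app(figure).run_server(host="0.0.0.0", port=8050)
```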
