AWS
We choose to store and analyze data in the cloud. The complete solution consists of the following components. Click on the links below to see the installation instructions.
- EC2 instance to run all the services and store data
- Anaconda for the Python data analytics environment
- Git to get the plant monitor code
- PostgreSQL database to store data
- Flask for the REST API endpoints that upload data to the database and to files
- Swagger to view the REST API documentation
- JupyterHub to write Jupyter notebooks for data analysis and visualisations, supporting multiple users
- Dash to view measurements in an interactive dashboard
Once everything is installed, all the services can be started after a restart of the EC2 instance using the following script from this repository.
ssh -i ~/.ssh/my_first_EC2_key.pem ec2-user@ec2-34-243-27-176.eu-west-1.compute.amazonaws.com
source plant_monitor_aws/ec2_start.sh
The first step is to launch an Amazon Linux EC2 instance; a t2.micro will be sufficient for this project.
The following inbound traffic has to be enabled in the EC2 security group:
- SSH, port 22 - for the SSH access
- HTTP, port 80, 0.0.0.0/0 - for the web access (will be used for Swagger)
- custom TCP rule, port 5432, your IP address - for PostgreSQL
- custom TCP rule, port 5000, your IP address - for Flask
- custom TCP rule, port 8000, your IP address - for JupyterHub
- custom TCP rule, port 8888, your IP address - for Jupyter
- custom TCP rule, port 8050, your IP address - for Dash
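The inbound rules listed above can also be written down as data. The following is only a sketch: build_ingress_rules is a hypothetical helper, the address 203.0.113.7/32 is a placeholder, and restricting SSH to your own IP is an assumption, since the list does not name a source for port 22. The dictionaries follow the IpPermissions shape accepted by boto3's authorize_security_group_ingress, but nothing is sent to AWS here.

```python
def build_ingress_rules(my_ip_cidr, ssh_cidr=None):
    """Return the inbound TCP rules for the plant monitor instance."""
    # Assumption: SSH is restricted to your own address unless stated otherwise.
    ssh_cidr = ssh_cidr or my_ip_cidr
    rules = [
        (22, ssh_cidr, "SSH"),
        (80, "0.0.0.0/0", "HTTP (Swagger-UI)"),
        (5432, my_ip_cidr, "PostgreSQL"),
        (5000, my_ip_cidr, "Flask"),
        (8000, my_ip_cidr, "JupyterHub"),
        (8888, my_ip_cidr, "Jupyter"),
        (8050, my_ip_cidr, "Dash"),
    ]
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": description}],
        }
        for port, cidr, description in rules
    ]


# Placeholder address from the documentation range, not a real client IP.
rules = build_ingress_rules("203.0.113.7/32")
print([rule["FromPort"] for rule in rules])
```

With boto3 installed and credentials configured, such a list could be passed as the IpPermissions argument of authorize_security_group_ingress together with the ID of your security group.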
The instance can be accessed with SSH using your private key.
ssh -i .ssh/my_first_EC2_key.pem ec2-user@ec2-34-243-177-120.eu-west-1.compute.amazonaws.com
Update packages.
sudo yum update
Besides the default user ec2-user, add new users: plant_monitor for development purposes, student01 for students and teacher01 for teachers.
The following link gives some further information on how to do this. https://aws.amazon.com/premiumsupport/knowledge-center/new-user-accounts-linux-instance/
sudo adduser plant_monitor
sudo passwd plant_monitor
The new user will be used by students to access and analyse data; it will not have sudo rights. We keep sudo rights only for ec2-user, as specified in /etc/sudoers.d/cloud-init
For vim users, I recommend downloading the .vimrc file below.
wget https://gist.githubusercontent.com/salekd/1ebae93b5d237daebaf2dfc101b92e19/raw/81d74c53f6ccc1db9f41433a678d4696a7663196/.vimrc
In case you need a break at some point, the good old terminal games may come in handy.
wget http://www.melvilletheatre.com/articles/el6/bsd-games-2.17-28.el6.x86_64.rpm
sudo yum install bsd-games-2.17-28.el6.x86_64.rpm
In particular, hangman is my favourite. In the light of this project, let us create a new file /home/ec2-user/hangman.words with a custom word list:
plantmonitor
raspberrypi
Back up the default word dictionary and put the new one in place.
sudo mv /usr/share/dict/words /usr/share/dict/words.bak
sudo ln -s /home/ec2-user/hangman.words /usr/share/dict/words
Play hangman and have fun ;-)
Anaconda provides a Python environment with pre-installed packages for data analytics. Get the version for Python 3 and install it.
wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh
bash Anaconda3-5.0.1-Linux-x86_64.sh
Specify /opt/anaconda3 as the installation path.
which python
export PATH=/opt/anaconda3/bin:$PATH
which python
The newly installed version of Python should be accessible to sudo users as well. The following link talks about setting the path for sudo commands. https://superuser.com/questions/927512/how-to-set-path-for-sudo-commands
Edit the sudoers file
sudo visudo
and add the Anaconda path there.
Defaults secure_path = /opt/anaconda3/bin:/sbin:/bin:/usr/sbin:/usr/bin
Add export PATH="/opt/anaconda3/bin:$PATH" to /etc/profile for all users.
We will use git to get the plant monitor code. Install and configure git.
sudo yum install git
git config --global user.name "..."
git config --global user.email ...
Add the following lines to .bash_profile in order to see the git branch in the command line prompt.
parse_git_branch() {
git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/ (\1)/'
}
export PS1="\u@\h \[\033[32m\]\w\[\033[33m\]\$(parse_git_branch)\[\033[00m\] $ "
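To make clear what the sed filter in parse_git_branch does, here is a small Python sketch of the same logic. It is an illustration only: format_branch and the sample branch names are made up for this example and are not part of the repository.

```python
def format_branch(git_branch_output):
    """Mimic the sed filter: keep the line marked with '*' and wrap the name."""
    for line in git_branch_output.splitlines():
        # `git branch` marks the current branch with a leading "* ".
        if line.startswith("* "):
            return " (" + line[2:] + ")"
    # Outside a repository git prints nothing on stdout, so the prompt stays plain.
    return ""


sample = "  develop\n* master\n  feature/dash\n"
print(repr(format_branch(sample)))
```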
Clone the repository.
git clone https://github.com/salekd/plant_monitor_aws.git
The PostgreSQL database is used to store the sensor data. I followed the installation steps described here https://github.com/snowplow/snowplow/wiki/Setting-up-PostgreSQL
sudo yum install postgresql postgresql-server postgresql-devel postgresql-contrib postgresql-docs
sudo service postgresql initdb
Specify the user access in /var/lib/pgsql9/data/pg_hba.conf. You can directly copy the file from this repository https://github.com/salekd/plant_monitor_aws/blob/master/config/pg_hba.conf
# TYPE DATABASE USER ADDRESS METHOD
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all flask_app 0.0.0.0/0 md5
host all plant_monitor 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::1/128 md5
Make the database accessible from outside by making the following changes in /var/lib/pgsql9/data/postgresql.conf. You can directly copy the file from this repository https://github.com/salekd/plant_monitor_aws/blob/master/config/postgresql.conf
listen_addresses = '*'
port = 5432
Start the service.
sudo service postgresql start
Run psql as user postgres.
sudo su - postgres
psql -U postgres
Create users and passwords.
ALTER USER postgres WITH PASSWORD '...';
CREATE USER flask_app NOSUPERUSER;
ALTER USER flask_app WITH PASSWORD '...';
CREATE USER plant_monitor NOSUPERUSER;
ALTER USER plant_monitor WITH PASSWORD '...';
Create the database and the tables for the measurements and grant the access rights. User flask_app has all privileges and will be used in the Flask REST API to upload data to the database tables. User plant_monitor will have read access only and will be used by students to read data.
CREATE DATABASE plant_monitor_db;
\list
\connect plant_monitor_db
CREATE TABLE measurements (
device VARCHAR(20),
time TIMESTAMP,
moisture REAL,
temperature REAL,
conductivity REAL,
light REAL
);
GRANT ALL PRIVILEGES ON TABLE measurements TO flask_app;
GRANT SELECT ON TABLE measurements TO plant_monitor;
CREATE TABLE bme280 (
device VARCHAR(20),
time TIMESTAMP,
temperature REAL,
pressure REAL,
humidity REAL
);
GRANT ALL PRIVILEGES ON TABLE bme280 TO flask_app;
GRANT SELECT ON TABLE bme280 TO plant_monitor;
CREATE TABLE si1145 (
device VARCHAR(20),
time TIMESTAMP,
visible REAL,
IR REAL,
UV REAL
);
GRANT ALL PRIVILEGES ON TABLE si1145 TO flask_app;
GRANT SELECT ON TABLE si1145 TO plant_monitor;
CREATE TABLE pump (
device VARCHAR(20),
time TIMESTAMP,
duration REAL
);
GRANT ALL PRIVILEGES ON TABLE pump TO flask_app;
GRANT SELECT ON TABLE pump TO plant_monitor;
This is an example of how to insert and read data.
INSERT INTO measurements (device, time, moisture, temperature, conductivity, light)
VALUES (0, CURRENT_TIMESTAMP, 0, 0, 0, 0);
SELECT * FROM measurements;
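From Python, the same insert is usually written as a parameterized statement rather than by pasting values into the SQL string. The sketch below only builds the statement and its parameters; build_insert is a hypothetical helper, and it assumes a driver that uses %s placeholders, as psycopg2 does. Table and column names are taken from fixed constants here and should never come from user input.

```python
MEASUREMENT_COLUMNS = (
    "device", "time", "moisture", "temperature", "conductivity", "light",
)


def build_insert(table, columns, row):
    """Return (sql, params) for a parameterized INSERT with %s placeholders."""
    placeholders = ", ".join(["%s"] * len(columns))
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders
    )
    # Values travel separately from the SQL text, so the driver can quote them.
    params = tuple(row[column] for column in columns)
    return sql, params


row = {
    "device": "C4:7C:8D:65:BD:76",
    "time": "2018-01-23 22:00:41.062114",
    "moisture": 0,
    "temperature": 19.7,
    "conductivity": 0,
    "light": 39,
}
sql, params = build_insert("measurements", MEASUREMENT_COLUMNS, row)
print(sql)
print(params)
```

With an open psycopg2 connection, the pair would be passed on as cursor.execute(sql, params).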
The database can be accessed remotely. For example, on a Mac, PostgreSQL can be installed with Homebrew.
brew install postgres
Connect to the database in the EC2 instance
psql -h ec2-34-243-27-176.eu-west-1.compute.amazonaws.com -p 5432 -U plant_monitor -d plant_monitor_db
and get the measurements.
\dt
SELECT * FROM measurements;
Flask is a framework to write RESTful microservices in Python. Install the following packages.
sudo pip install Flask
sudo pip install Flask-SQLAlchemy
sudo pip install psycopg2
sudo pip install flask-restful-swagger-2
sudo pip install flask-cors
A Flask REST API will be used to:
- upload sensor measurements to the PostgreSQL database and to store the same data in a csv file,
- save a snapshot from a camera.
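As an illustration of the first task, the CSV part could look roughly like the sketch below. This is not the code from flask_app.py: append_measurement is a hypothetical helper, and the column layout is an assumption based on the fields of the measurement JSON used on this page. An in-memory buffer stands in for a file under /data/measurements.

```python
import csv
import io

# Assumed column layout, following the fields of the measurement JSON.
FIELDS = ["device", "timestamp", "moisture", "temperature", "conductivity", "light"]


def append_measurement(csv_file, measurement):
    """Append one measurement as a CSV row, writing the header into an empty file."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDS)
    if csv_file.tell() == 0:
        writer.writeheader()
    writer.writerow(measurement)


# In-memory stand-in for a file opened with mode "a" and newline="".
buffer = io.StringIO()
append_measurement(buffer, {
    "device": "C4:7C:8D:65:BD:76",
    "timestamp": "2018-01-23 22:00:41.062114",
    "moisture": 0, "temperature": 19.7, "conductivity": 0, "light": 39,
})
append_measurement(buffer, {
    "device": "C4:7C:8D:65:BD:76",
    "timestamp": "2018-01-23 23:00:41.062114",
    "moisture": 1, "temperature": 19.5, "conductivity": 0, "light": 12,
})
print(buffer.getvalue())
```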
Read the following links to find out more about uploading files and security:
- http://docs.python-requests.org/en/master/user/quickstart/#post-a-multipart-encoded-file
- http://flask.pocoo.org/docs/0.12/patterns/fileuploads/
- https://stackoverflow.com/questions/48116820/writing-python-request-using-post-to-flask-server-as-binary-file
Create a new directory accessible to all users for storing the data and change the owner.
sudo mkdir /data
sudo chown ec2-user:ec2-user /data
mkdir /data/measurements
mkdir /data/images
cp /home/ec2-user/plant_monitor_rpi/measurements/* /data/measurements
cp /home/ec2-user/plant_monitor_rpi/images/* /data/images
Enter the correct password in the PostgreSQL URI in flask_app.cfg.
- In case the password contains special characters, make sure they are percent-encoded. The percent sign has to be entered twice as %%. See https://en.wikipedia.org/wiki/Percent-encoding and https://docs.python.org/3.1/library/urllib.parse.html#urllib.parse.quote_plus
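The two steps, percent-encoding and doubling the percent sign, can be tried out in Python. This is a sketch under assumptions: the password is made up, and the [database] section and uri key are hypothetical, since the real layout of flask_app.cfg may differ. It also assumes the doubling is needed because the file is read with configparser-style interpolation, where %% stands for a literal %.

```python
import configparser
from urllib.parse import quote_plus

password = "s3cr&t/p@ss%"              # made-up password with special characters
encoded = quote_plus(password)         # percent-encode the special characters
doubled = encoded.replace("%", "%%")   # escape every '%' for the config file

# Hypothetical file content; only the doubled password matters here.
cfg_text = (
    "[database]\n"
    "uri = postgresql://flask_app:{}@localhost:5432/plant_monitor_db\n"
).format(doubled)

parser = configparser.ConfigParser()
parser.read_string(cfg_text)
uri = parser.get("database", "uri")    # interpolation turns '%%' back into '%'

print(doubled)
print(uri)
```

The value written to the file is the doubled form, while the application ends up with the singly encoded URI.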
Run the Flask application.
cd /home/ec2-user/plant_monitor_aws
nohup python flask_app.py &
This is how to upload a measurement from the command line.
curl -H "Content-Type: application/json" -X POST -d '{"device":"C4:7C:8D:65:BD:76", "timestamp": "2018-01-23 22:00:41.062114", "moisture": 0, "temperature": 19.7, "conductivity": 0, "light": 39}' http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:5000/measurement
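The same upload can be prepared from Python with the standard library only. The sketch below mirrors the curl call; build_measurement_request is a hypothetical helper, and the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_measurement_request(base_url, measurement):
    """Build (but do not send) a JSON POST request for the /measurement endpoint."""
    body = json.dumps(measurement).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/measurement",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


measurement = {
    "device": "C4:7C:8D:65:BD:76",
    "timestamp": "2018-01-23 22:00:41.062114",
    "moisture": 0,
    "temperature": 19.7,
    "conductivity": 0,
    "light": 39,
}
request = build_measurement_request(
    "http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:5000", measurement
)
print(request.get_method(), request.full_url)
```

To actually send it, pass the object to urllib.request.urlopen(request) while the Flask application is running.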
Flask also supports Swagger documentation; see this link for more information: https://github.com/swege/flask-restful-swagger-2.0
Download the json file describing the RESTful service.
curl http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:5000/api/swagger.json -o /home/ec2-user/plant_monitor_aws/swagger.json
Alternatively, one can use flask-autodoc instead of Swagger for the RESTful service documentation. https://github.com/acoomans/flask-autodoc
We will use Docker to install Swagger-UI in order to view the RESTful service documentation online. https://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html#install_docker
Install Docker.
sudo yum update -y
sudo yum install -y docker
Start the service.
sudo service docker start
sudo usermod -a -G docker ec2-user
See the installation instructions for Swagger-UI here https://github.com/swagger-api/swagger-ui/blob/master/docs/usage/installation.md
Get the Docker image and run Swagger-UI on port 80.
docker pull swaggerapi/swagger-ui
docker run -p 80:8080 -e SWAGGER_JSON=/spec/swagger.json -e VALIDATOR_URL=null -v /home/ec2-user/plant_monitor_aws/:/spec swaggerapi/swagger-ui
See the documentation of our Flask RESTful service in Swagger-UI online here http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com/
JupyterHub is a multi-user server for Jupyter notebooks. It will be configured below to give admin rights to user ec2-user. User plant_monitor will be able to create notebooks in their Linux account home directory. The setup is inspired by the instructions found here.
- https://github.com/jupyterhub/jupyterhub/wiki/Deploying-JupyterHub-on-AWS
- https://stackoverflow.com/questions/8205369/installing-npm-on-aws-ec2
Install JupyterHub.
sudo pip install jupyterhub
Install Node.js, npm and configurable-http-proxy.
sudo yum install nodejs npm --enablerepo=epel
sudo npm install -g configurable-http-proxy
Generate SSL certificates.
mkdir certs
cd certs
sudo openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout mykey.key -out mycert.pem
Generate configuration file.
jupyterhub --generate-config
Add the following lines to jupyterhub_config.py. You can directly copy the file from this repository to your home directory https://github.com/salekd/plant_monitor_aws/blob/master/config/jupyterhub_config.py
c.JupyterHub.port = 8000
c.JupyterHub.services = [
{
'name': 'cull-idle',
'admin': True,
'command': 'python cull_idle_servers.py --timeout=3600'.split(),
}
]
c.JupyterHub.ssl_cert = '/home/ec2-user/certs/mycert.pem'
c.JupyterHub.ssl_key = '/home/ec2-user/certs/mykey.key'
c.Spawner.notebook_dir = '~/notebooks'
c.Authenticator.admin_users = {'ec2-user'}
c.Authenticator.whitelist = {'ec2-user', 'plant_monitor', 'student01', 'student02', 'student03', 'student04', 'student05', 'student06', 'student07', 'student08', 'student09', 'student10', 'teacher01', 'teacher02', 'teacher03'}
c.LocalAuthenticator.create_system_users = True
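Since jupyterhub_config.py is ordinary Python, the long list of numbered accounts could also be generated instead of typed out. The following is a sketch; numbered_users is a hypothetical helper and not part of the configuration file in the repository.

```python
def numbered_users(prefix, count):
    """Return names such as student01 ... student10 for the given prefix."""
    return {"{}{:02d}".format(prefix, i) for i in range(1, count + 1)}


whitelist = (
    {"ec2-user", "plant_monitor"}
    | numbered_users("student", 10)
    | numbered_users("teacher", 3)
)
print(sorted(whitelist))

# Inside jupyterhub_config.py, where `c` is provided by JupyterHub, one could write:
# c.Authenticator.whitelist = whitelist
```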
In order to save resources, the cull-idle service can be used to shut down idle single-user notebook servers. https://github.com/jupyterhub/jupyterhub/tree/master/examples/cull-idle
You will need to download the Python script into your home directory.
wget https://raw.githubusercontent.com/jupyterhub/jupyterhub/master/examples/cull-idle/cull_idle_servers.py
Check that the PAM authentication is working.
sudo python -m pamela -a plant_monitor
Create a new directory for Jupyter notebooks in the plant_monitor user home directory.
sudo mkdir /home/plant_monitor/notebooks
sudo chown plant_monitor:plant_monitor /home/plant_monitor/notebooks
sudo cp /home/ec2-user/plant_monitor_rpi/notebooks/*.ipynb /home/plant_monitor/notebooks
For the other users, you can use a for loop.
for i in {01..10}
do
sudo mkdir /home/student${i}/notebooks
sudo chown student${i}:student${i} /home/student${i}/notebooks
sudo cp /home/ec2-user/plant_monitor_rpi/notebooks/*.ipynb /home/student${i}/notebooks
done
for i in {01..03}
do
sudo mkdir /home/teacher${i}/notebooks
sudo chown teacher${i}:teacher${i} /home/teacher${i}/notebooks
sudo cp /home/ec2-user/plant_monitor_rpi/notebooks/*.ipynb /home/teacher${i}/notebooks
done
Start JupyterHub as the superuser.
sudo nohup jupyterhub &
JupyterHub will be available on port 8000.
https://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:8000
For troubleshooting, see https://github.com/jupyterhub/jupyterhub/issues/1377
Dash is a Python framework for building analytical web applications. It is used here to create an interactive dashboard. See the links below for more information.
Install the following packages.
sudo pip install dash==0.21.0 # The core dash backend
sudo pip install dash-renderer==0.11.3 # The dash front-end
sudo pip install dash-html-components==0.9.0 # HTML components
sudo pip install dash-core-components==0.20.2 # Supercharged components
sudo pip install plotly --upgrade # Plotly graphing library used in examples
Dash will be available on port 8050.
http://ec2-34-243-27-176.eu-west-1.compute.amazonaws.com:8050/
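A dashboard graph is described by a Plotly figure, which can be given as a plain dictionary. The sketch below shows how rows read from the measurements table might be turned into such a figure with one line per device. It is an assumption about how a dashboard could be built, not the code of the plant monitor dashboard; moisture_figure and the sample rows are made up.

```python
def moisture_figure(rows):
    """Build a Plotly figure dict with one moisture line per device."""
    by_device = {}
    for device, time, moisture in rows:
        xs, ys = by_device.setdefault(device, ([], []))
        xs.append(time)
        ys.append(moisture)
    return {
        "data": [
            {"type": "scatter", "mode": "lines", "name": device, "x": xs, "y": ys}
            for device, (xs, ys) in sorted(by_device.items())
        ],
        "layout": {
            "title": "Soil moisture",
            "xaxis": {"title": "time"},
            "yaxis": {"title": "moisture"},
        },
    }


# Made-up rows in the shape (device, time, moisture).
rows = [
    ("plant_a", "2018-01-23 22:00:00", 31.0),
    ("plant_b", "2018-01-23 22:00:00", 18.0),
    ("plant_a", "2018-01-23 23:00:00", 30.5),
]
figure = moisture_figure(rows)
print([trace["name"] for trace in figure["data"]])
```

In a Dash layout, a dictionary of this shape can be passed as the figure argument of the Graph component from dash-core-components.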