Most of the Python prerequisites for AmCAT are installed automatically by pip (see below). To install the non-Python requirements on Ubuntu, run:
$ sudo apt-get install antiword unrtf rabbitmq-server python-pip python-dev libxml2-dev libxslt-dev lib32z1-dev postgresql postgresql-server-dev-9.4 postgresql-contrib-9.4
It is probably best to install AmCAT in a virtual environment. On Ubuntu, run the following commands to set up and activate a virtual environment for AmCAT:
$ sudo apt-get install python-virtualenv
$ virtualenv amcat-env
$ source amcat-env/bin/activate
If you use a virtual environment, you need to repeat the source line to load the environment every time you start working with AmCAT. If you don't use a virtual environment, you will need to run most of the pip commands below using sudo.
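Because forgetting to re-activate the environment is an easy mistake, a quick check can help. The following is a minimal sketch; the function name venv_status is hypothetical and not part of AmCAT. It relies only on the VIRTUAL_ENV variable that virtualenv's activate script exports.

```shell
# Report whether a virtual environment is active in the current shell.
# virtualenv's activate script exports VIRTUAL_ENV; if it is unset or
# empty, no environment is active.
venv_status() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "inactive"
    fi
}
```

Running venv_status before any pip command shows whether packages will land in the virtual environment or in the system Python.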
AmCAT requires a database to store its documents in. The default settings look for a postgres database 'amcat' on localhost. To set up the current user as a superuser in postgres and create the database, use:
$ sudo -u postgres createuser -s $USER
$ createdb amcat
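To confirm that the database was created, you can inspect the output of psql -lqt, which lists databases one per line with pipe-separated columns. The helper below is a sketch, not part of AmCAT; the name db_listed is hypothetical. It reads that listing from standard input and succeeds only if the first column exactly matches the given name.

```shell
# Succeed if the database named $1 appears in a `psql -lqt` listing
# read from standard input. Only the first column (the database name)
# is compared, and the match must be exact.
db_listed() {
    cut -d '|' -f 1 | tr -d ' ' | grep -qx -- "$1"
}

# Usage (requires a running PostgreSQL server):
#   psql -lqt | db_listed amcat && echo "database amcat exists"
```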
AmCAT uses elasticsearch for searching articles. Since we use a custom similarity to provide hit counts instead of relevance, this needs to be installed 'by hand'. You can probably skip this and rely on a pre-packaged elasticsearch if you don't care about hit counts, although you still need to install the elasticsearch plugins.
First, install Oracle Java. Instructions for Java 7 are at http://www.webupd8.org/2012/01/install-oracle-java-jdk-7-in-ubuntu-via.html and for Java 8 at http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.html. Run only the installer line for the version you want:
$ sudo add-apt-repository ppa:webupd8team/java
$ sudo apt-get update
$ sudo apt-get install oracle-java7-installer #for java 7
$ sudo apt-get install oracle-java8-installer #for java 8
Next, download and extract elasticsearch and our custom hitcount jar, and install the required plugins:
cd /tmp
# Download and install elasticsearch
wget "https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.4.deb"
sudo dpkg -i elasticsearch-1.4.4.deb
# Install plugins
cd /usr/share/elasticsearch
# sudo bin/plugin -install elasticsearch/elasticsearch-lang-python/2.4.1 (no longer needed for master)
sudo bin/plugin -install elasticsearch/elasticsearch-analysis-icu/2.4.2
sudo bin/plugin -install mobz/elasticsearch-head
sudo wget http://hmbastiaan.nl/martijn/amcat/hitcount.jar
# Allow dynamic scripting (no longer needed for master)
# cd /etc/elasticsearch
# echo -e "\nscript.disable_dynamic: false" | sudo tee -a elasticsearch.yml
# Make sure elasticsearch detects hitcount.jar
sudo editor /etc/init.d/elasticsearch
# Add after ES_HOME:
ES_CLASSPATH=$ES_HOME/hitcount.jar
export ES_CLASSPATH
# Add to DAEMON_OPTS:
-Des.index.similarity.default.type=nl.vu.amcat.HitCountSimilarityProvider
# Save file and close editor
# Restart elasticsearch
sudo service elasticsearch restart
cd
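After the restart, it is worth checking that Elasticsearch answers and reports the expected version. The sketch below assumes Elasticsearch listens on its default HTTP port 9200 and that the root endpoint returns JSON containing a "number" field under "version". The function name es_version is hypothetical.

```shell
# Extract the version number from the JSON returned by the
# Elasticsearch root endpoint, read from standard input.
# Prints nothing if no "number" field is found.
es_version() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (assumes the default port 9200):
#   curl -s http://localhost:9200/ | es_version
```

If the command prints nothing, the service is probably not running or not reachable on that port.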
Now you are ready to install AmCAT. The easiest way to do this is to pip install it directly from GitHub. This is not advised unless you use a virtual environment.
pip install git+https://github.com/amcat/amcat.git
Alternatively, clone the project from GitHub and pip install the requirements. If you plan to make changes to AmCAT, this is probably the best option.
git clone https://github.com/amcat/amcat.git
pip install -r amcat/requirements.txt
If you install AmCAT by cloning, be sure to add the new directory to the PYTHONPATH (the example below assumes the clone is in your home directory). Also set the AMCAT_ES_LEGACY_HASH environment variable. If you add these lines to amcat-env/bin/activate, they will be set automatically whenever you activate the environment.
export PYTHONPATH=$PYTHONPATH:$HOME/amcat
export AMCAT_ES_LEGACY_HASH=N
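If you script the step of adding these lines to the activate file, it helps to make it idempotent so that repeated runs do not duplicate the exports. The following is a minimal sketch; the function name append_once is hypothetical.

```shell
# Append a line to a file only if that exact line is not already present.
# $1 = file, $2 = line to add.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage (single quotes keep the variables unexpanded until activation):
#   append_once amcat-env/bin/activate 'export PYTHONPATH=$PYTHONPATH:$HOME/amcat'
#   append_once amcat-env/bin/activate 'export AMCAT_ES_LEGACY_HASH=N'
```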
AmCAT uses bower to install JavaScript/CSS libraries. On Ubuntu, you need to install the legacy version of nodejs first, and then install bower using npm:
sudo apt-get install nodejs-legacy npm
sudo npm install -g bower
On older Ubuntu versions, if the above does not work, try installing nodejs via Chris Lea's PPA:
sudo add-apt-repository ppa:chris-lea/node.js
sudo apt-get update
sudo apt-get install nodejs
sudo apt-get upgrade nodejs
sudo npm install -g bower
Then, in the top-level directory of AmCAT itself, run:
bower install
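If bower install fails, a common cause is that one of the command-line tools from the earlier steps is not on the PATH. The sketch below prints the names of any tools that cannot be found. The function name missing_tools is hypothetical, and the tool list in the usage line is only a suggestion.

```shell
# Print the name of every given command that is not found on the PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage:
#   missing_tools node npm bower git pip
```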
Whichever way you installed AmCAT, you need to call the syncdb command to populate the database and set the elasticsearch mapping:
python -m amcat.manage syncdb
For debugging, it is easiest to start AmCAT using runserver:
python -m amcat.manage runserver
Finally, to use the query screen you need to start a celery worker. In a new terminal, type:
DJANGO_SETTINGS_MODULE=settings celery -A amcat.amcatcelery worker -l info -Q amcat
(If you are using a virtual environment, make sure to activate it first.)
The main configuration parameters for AmCAT reside in the settings folder. In many places, these settings are defaults that can be overridden with environment variables.