CrowdEval is an experimental crowdsourced fact-checking application, created as part of my MComp Computer Science dissertation project. The idea for this project was provided by the project supervisor, Carolina Scarton.
CrowdEval ships with two Docker environments:
- a development environment with containerised infrastructure, but the app and frontend run natively
- a fully containerised production environment
Dependencies:
- Python 3.9.4
- Poetry
To set up the development environment:
- cd into the webapp directory
- Run poetry install
  - On macOS Big Sur, if NumPy fails to install, ensure the environment variable SYSTEM_VERSION_COMPAT is set to 1 and try again
- cp .env.dev .env
  - Enter a SECRET_KEY, which can be any string
  - Enter the TWITTER_ API keys. This has to be a new-style Twitter project because we use v2.0 of the Twitter API
    - The callback URL for development should be localhost:5000/login/twitter/authorized
  - Enter the RECAPTCHA_ API keys
  - The remainder of the file is already configured for the development environment and shouldn't need to be changed
- Run docker-compose up to start
- Once the Elasticsearch service has started, run:
  $ poetry run flask create-index -i posts -c infrastructure/elasticsearch/posts.json
  This will create the required posts index.
- Install frontend dependencies with npm install
- To start asset compilation and the Flask dev server, run
  $ npm run start
  The app will be started on localhost:5000
- Migrate the database with poetry run flask db upgrade
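The create-index step in the development setup only works once Elasticsearch is accepting requests. Below is a minimal Python sketch of waiting for that. It assumes the development compose file publishes Elasticsearch on localhost at its default port, 9200, which this README does not confirm. The helper name wait_for_http is hypothetical and not part of CrowdEval.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=120.0, interval=2.0,
                  opener=urllib.request.urlopen,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass.

    Hypothetical helper for illustration. `opener`, `sleep` and `clock`
    are injectable only so the function can be exercised without a
    running service.
    """
    deadline = clock() + timeout
    while True:
        try:
            with opener(url) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused or an HTTP error: the service is not ready yet.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling wait_for_http("http://localhost:9200") (assumed address) and checking that it returns True before running the create-index command avoids racing the container start-up.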
To set up the production environment, working from the root directory:
- cp webapp/.env.prod webapp/.env
  - Enter a SECRET_KEY, which should be a random ~32-character secret
  - Enter the TWITTER_ API keys. This has to be a new-style Twitter project because we use v2.0 of the Twitter API
    - The callback URL should be <your hostname>/login/twitter/authorized
  - Enter the RECAPTCHA_ API keys
  - The remainder of the file is already configured for the production environment and shouldn't need to be changed
- docker-compose up
The application should now be built and started via Gunicorn.
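For the production SECRET_KEY, one way to produce a random 32-character value is Python's standard secrets module. This is only a sketch. generate_secret_key is a hypothetical helper name and not part of CrowdEval.

```python
import secrets


def generate_secret_key(num_bytes=24):
    """Return a URL-safe random string.

    24 random bytes encode to exactly 32 base64 characters
    (letters, digits, '-' and '_'), with no padding.
    """
    return secrets.token_urlsafe(num_bytes)


# Print one candidate value to paste into webapp/.env as SECRET_KEY.
print(generate_secret_key())
```

Each run prints a different value. Keep the chosen key out of version control.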
In development, the commands below should be prefixed with poetry run (or run poetry shell once to activate the environment, then run them as is). In production, attach to the crowdeval service, i.e.:
$ docker-compose run --entrypoint "bash -l" crowdeval
and then run the commands, although note that flask needs to be run from ./venv/bin/flask to ensure the correct version is used.
The system can import Kochkina et al.'s PHEME dataset, which has been pre-processed and stored in /seeds/kochkina_et_al_PHEME. To import it:
$ flask import-tweet-seeds seeds/kochkina_et_al_PHEME
This will create a .veracities.json file, which can then be used to seed random (but biased towards the dataset's veracity) ratings with
$ flask seed-ratings
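To illustrate what "random but biased towards the dataset's veracity" can mean, here is a small Python sketch of weighted sampling. It is not the project's actual seed-ratings implementation. The 1 to 5 rating scale, the veracity labels used as keys, the weights and the function name biased_rating are all assumptions made for the example.

```python
import random

# Hypothetical rating scale (1 = least credible, 5 = most credible).
RATINGS = [1, 2, 3, 4, 5]

# Hypothetical weights per veracity label; CrowdEval's real values may differ.
WEIGHTS = {
    "false": [5, 4, 2, 1, 1],
    "unverified": [1, 2, 4, 2, 1],
    "true": [1, 1, 2, 4, 5],
}


def biased_rating(veracity, rng=random):
    """Draw one rating at random, skewed towards the given veracity label."""
    return rng.choices(RATINGS, weights=WEIGHTS[veracity], k=1)[0]


# Example: ratings for a "true" post cluster near the top of the scale.
rng = random.Random(0)
sample = [biased_rating("true", rng) for _ in range(10)]
print(sample)
```

Any single rating can still land anywhere on the scale, but over many draws the average follows the label, which is the kind of bias described above.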
The explore-by-rating pages are served from Redis and must be manually regenerated:
$ flask recache-explore
In production, this is probably best run as a scheduled task on the host machine via cron, with an entry such as (note the hard-coded path to the docker-compose.yml file):
*/5 * * * * /usr/local/bin/docker-compose -f /data/crowdeval/docker-compose.yml run --entrypoint "venv/bin/flask recache-explore" crowdeval >> crowdeval-cron.log 2>&1
Tests and linting can be run with
$ flask test
$ flask lint