Crittenden, Matthew, "SCOPE: Building and Testing an Integrated Manual-Automated Event Extraction Tool for Online Text-Based Media Sources" (2021). Undergraduate Honors Theses. Paper 1651. https://scholarworks.wm.edu/honorstheses/1651
SCOPE data collection+coding platform (2GZM demo)
WARNING (THIS README NEEDS TO BE UPDATED)
The SCOPE data collection 2GZM demo website is implemented in a Python web framework called Django. All you need is a laptop, a Python installation, and a browser for testing the website.
- Install Python 3, if you haven't already done so:
  - For Windows: Download and run the Python 3.11.2 installer.
  - For macOS: Download and run the Python 3.11.2 installer.
- Install Git, if you haven't already done so:
  - For Windows: Download and run the Git 2.39.2 installer.
  - For macOS: Git should already be installed on your system. If not, run `brew install git` to install it.
- Once Python and Git are installed, open a command prompt or terminal and run `python3 --version` and `git --version` to verify the installations.
- Clone the repository by running the following in your console: `git clone https://github.com/wmgeolab/scope.git`
- Install the dependencies. If you're on Windows, first download and install the MariaDB Connector/C library. Then open a command prompt or terminal, navigate to the root of this repository (the main folder for the project), and run the command for your platform:
  - Windows: `pip install mariadb`
  - macOS: `brew install mysql`
- Make note of the local path where you copied the `scope` project folder; this is the folder you'll be working in. Personally, I like to keep my coding projects in the `Documents` folder.
- Connect to the website database. The website connects to an external MySQL database and requires the database login credentials to be stored in a file called `db_config.cnf`. Contact @joegenius98 to get access to this file, and save it in the top-level project folder.
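For orientation, here is a minimal sketch of how a Django settings module can pick up credentials from a MySQL option file such as `db_config.cnf`. It relies on the `read_default_file` option of Django's MySQL backend. The helper name `build_databases`, the path handling, and the `[client]` layout shown in the docstring are assumptions for illustration; the project's actual settings and the real contents of `db_config.cnf` may differ.

```python
from pathlib import Path


def build_databases(project_root):
    """Return a Django DATABASES setting that reads credentials from db_config.cnf.

    Assumption: db_config.cnf is a MySQL option file in the top-level project
    folder, with a [client] section along these lines (values are placeholders):

        [client]
        host = HOST
        database = NAME
        user = USER
        password = PASSWORD
    """
    config_path = Path(project_root) / "db_config.cnf"
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            # The MySQL driver reads the connection details from this option file.
            "OPTIONS": {"read_default_file": str(config_path)},
        }
    }


# Hypothetical usage inside a settings module:
# DATABASES = build_databases("/path/to/scope")
```

Keeping the credentials in a separate file like this means they never need to appear in the repository itself.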
To run the backend server:
- Change your directory to the `backend` folder
- Run `python manage.py runserver`
- Note: Both frontend and backend servers must be running to make queries.
That should be all you need to work on developing and testing the website codebase.
To run the frontend server:
- Install Node.js. (Last version used: `16.7.1`)
- Navigate to the `frontend` folder
- Run `npm install` (Last version used: `8.15.0`)
- Run `npm start`
As an alternative to `npm`, you can use `yarn`. I'm not sure how it compares to `npm`, but I have heard it can be better.
- Install `yarn` by running `curl -o- -L https://yarnpkg.com/install.sh | bash`
- Check that `yarn` installed successfully with `yarn --version` (Last version used: `1.22.19`)
- Navigate to the `frontend` folder
- Run `yarn` (which will automatically install from `package.json`)
- Run `yarn run`
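The local setup above depends on several command-line tools being on your `PATH`. As a quick sanity check, a small script along these lines can report which ones are visible. This is only a sketch: the tool lists and the function name `find_tools` are assumptions based on the steps above, and on Windows the Python launcher may be called `python` rather than `python3`.

```python
import shutil

# Tools named in the setup steps above. yarn is only needed if you use it
# instead of npm. On Windows, "python3" may be installed as "python".
REQUIRED_TOOLS = ["python3", "git", "node", "npm"]
OPTIONAL_TOOLS = ["yarn"]


def find_tools(names):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(REQUIRED_TOOLS + OPTIONAL_TOOLS).items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

A `NOT FOUND` entry for a required tool usually means the corresponding install step was skipped or the terminal needs to be reopened so that `PATH` is refreshed.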
If you have Docker installed you can run everything in a containerized development environment without installing Node, Python, or MySQL libraries locally.
- Clone this repository to your local machine
- Run `docker compose build`
- Run `docker compose run frontend npm install`
  - This step installs the node modules in the Docker image. In production this is handled automatically.
- Run `docker compose up`
At this point you'll have the frontend running at http://localhost:3000 and the Python backend components running at http://localhost:8080. Both servers will use your local clone of this repository, but will use containerized versions of Node and Python. To update those versions, edit the Dockerfile in `frontend/` and the one in the root directory, respectively.
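To confirm that both containers are actually serving requests, you can probe the two URLs mentioned above. The following is a minimal sketch using only the standard library; the function name `is_reachable` and the timeout value are illustrative choices, not part of the project.

```python
import urllib.error
import urllib.request

# URLs taken from the Docker instructions above.
URLS = {
    "frontend": "http://localhost:3000",
    "backend": "http://localhost:8080",
}

# Bypass any system-wide HTTP proxy, since these are local addresses.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_reachable(url, timeout=2.0):
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with a 4xx/5xx status.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


if __name__ == "__main__":
    for name, url in URLS.items():
        print(f"{name}: {'up' if is_reachable(url) else 'down'} ({url})")
```

An HTTP error status still counts as "up" here, because the goal is only to check that something is listening on the port.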
The `scope` project folder contains all the files and folders needed to make the SCOPE website/platform work. Here's an overview of the top-level folder structure that you'll need to know about:
- `scope`
  - `core`: This folder controls the workings and settings of the "main" website, such as the home or about pages.
  - `apps`: This folder contains one or more "apps" -- self-contained mini websites that are nested within the main website. In our case, each "app" is a SCOPE workflow "module", since these are independent website components that may or may not be included for different SCOPE data coding projects.
  - `examples`: ...
  - `resources`: ...
Before submitting a new commit, it's sometimes a good idea to check that your code works locally on your computer and see how your changes have affected the website. This requires starting the Django development web server, which you can do as follows:
- Open a command-line window, navigate to the `backend` folder, and type `python manage.py runserver`. This will run some basic checks of the website code and, if successful, should display `Quit the server with CTRL-BREAK`.
- Open a web browser and go to the URL `http://127.0.0.1:8000/`. This should show your local version of the website. The data contents of the website, however (such as rows in tables), come from the same external MySQL database shared between all collaborators.
As long as you keep the command-line window open, you'll be able to interact with the website. Plus, any changes you make to the code will immediately take effect (except when you make changes to the database, see below).
Whenever you have made changes to any of the data models in `models.py`, you will need to make sure these changes are registered to the central MySQL database:
- In the command-line window, type `python manage.py makemigrations` (if the web server is running, you have to stop it first by pressing CTRL-C). This command tells Django to make a record of what changes need to be made in the database, based on your changes in `models.py` (this is recorded in a file inside the `migrations` folder).
- Again in the command-line window, type `python manage.py migrate`. This is what actually applies the changes from the previous step to the database.
- Start up the web server again to see that your changes are reflected in the website. However, in most cases when you make changes to the database, the website will "break" and you'll see some errors. You'll need to update the code in whichever scripts rely on those data models. Read the errors -- they'll help you identify what needs fixing.
If all goes well, whatever changes you made locally in `models.py` will be reflected in the external MySQL database shared by everyone.
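If you run the two migration commands often, the sequence above can be wrapped in a small helper so that `migrate` only runs when `makemigrations` succeeds. This is a sketch under the assumption that `manage.py` lives in the `backend` folder, as described earlier; the function names are hypothetical and not part of the project.

```python
import subprocess
import sys
from pathlib import Path

# The two manage.py subcommands from the steps above, in the order they must run.
MIGRATION_STEPS = ["makemigrations", "migrate"]


def migration_commands(python=sys.executable):
    """Build the argument lists for each migration step."""
    return [[python, "manage.py", step] for step in MIGRATION_STEPS]


def run_migrations(backend_dir):
    """Run each step inside backend_dir, stopping at the first failure."""
    for cmd in migration_commands():
        # check=True raises CalledProcessError if the command exits non-zero,
        # so "migrate" is skipped when "makemigrations" fails.
        subprocess.run(cmd, cwd=Path(backend_dir), check=True)
```

Calling `run_migrations("backend")` from the repository root would then perform both steps. Because the database is shared by all collaborators, it is worth stopping the web server and reviewing the generated migration file before applying it.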
NameError: name '_mysql' is not defined
If you run into this error, it could be that certain packages are incompatible on your machine. I was able to fix it by running `pip install mysqlclient==1.4.6`.
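One way to check whether the MySQL driver imports cleanly, before and after reinstalling it, is a short diagnostic like the one below. It is a sketch only: `check_mysqlclient` is a hypothetical helper, and it assumes the driver in use is `mysqlclient`, which installs the `MySQLdb` module.

```python
import importlib


def check_mysqlclient():
    """Try to import MySQLdb and report the outcome as (ok, message)."""
    try:
        module = importlib.import_module("MySQLdb")
    except Exception as exc:  # ImportError, NameError, or anything else at import time
        return False, f"{type(exc).__name__}: {exc}"
    version = getattr(module, "version_info", "unknown")
    return True, f"MySQLdb imported, version_info={version}"


if __name__ == "__main__":
    ok, message = check_mysqlclient()
    print("OK" if ok else "FAILED", "-", message)
```

If the script reports a failure with the same `NameError`, the problem lies in the driver installation rather than in the SCOPE code.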