Getting started with development

This is the LAGOON (acronym...) project source code.

Note that ./lagoon_cli.py is a CLI for running common LAGOON functions.

Run pip install -r requirements.txt to ensure your Python environment has LAGOON's dependencies.
Also ensure you have Docker installed.
Run ./lagoon_cli.py dev up to launch an appropriately configured Postgres DB (and any other services required by LAGOON).
Either use a pre-populated database or build one from scratch (see the two sections below).
Run ./lagoon_cli.py ui to browse around visually.
Run ./lagoon_cli.py shell to interact with the database in a CLI.
To run machine learning experiments:
1. Run pip install -r requirements-ml.txt.
2. Clone the lagoon-artifacts repository as a sibling to this repository.
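
The prerequisites above can be checked before launching anything. The following is a minimal sketch and not part of LAGOON: the helper name missing_prerequisites is hypothetical, and it only checks that a docker executable is on the PATH and that the files named in the steps above exist in the given directory.

```python
import shutil
from pathlib import Path


def missing_prerequisites(repo_root, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper, not part of lagoon_cli.py.
    """
    problems = []
    # Docker is needed by `./lagoon_cli.py dev up`.
    if which("docker") is None:
        problems.append("docker executable not found on PATH")
    # These files are referenced by the setup steps above.
    for name in ("lagoon_cli.py", "requirements.txt"):
        if not (Path(repo_root) / name).is_file():
            problems.append(f"{name} not found in {repo_root}")
    return problems
```

Calling missing_prerequisites(".") from the repository root and printing the result gives a quick readiness check. It does not verify that the Docker daemon is actually running.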

Using a pre-populated database

This method is preferred, as it saves a lot of time.

Retrieve a backup of the database from Google Drive; backups are named like lagoon-db-backup-DATE.
Run ./lagoon_cli.py dev backup-restore path/to/backup to restore the database.

Building a database from scratch

Run ./lagoon_cli.py db reset to delete / create / set up the database.
Clone e.g. the CPython repository somewhere.
Run ./lagoon_cli.py ingest git load <path/to/cpython> to extract information from git into the LAGOON database. This took just under two hours on my laptop.
Run ./lagoon_cli.py ingest ocean_pickle load ~/Downloads/python.pck to extract information from OCEAN data.
Run ./lagoon_cli.py ingest python_peps load to extract information regarding Python PEPs into the LAGOON database.
Run ./lagoon_cli.py ingest toxicity_badwords compute to compute bad-word-based toxicity on messages and git commits, and put that information in the LAGOON database.
Run ./lagoon_cli.py ingest toxicity_nlp compute to compute toxicity scores from natural language processing models on messages and git commits, and put that information in the LAGOON database. This step requires the following:
1. Run pip install -r requirements-ml.txt.
2. Download pre-trained NLP models from Google Drive and place them inside ml/nlp_models/.
Run ./lagoon_cli.py ingest hibp load-breaches (and, optionally, ./lagoon_cli.py ingest hibp load-pastes) to extract the number of breaches (and pastes) from Have I Been Pwned for emails in the LAGOON database.
Run ./lagoon_cli.py fusion run to fuse entities and re-compute caches.
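
Because the steps above must run in order and some of them take a long time, it can help to script them. The following is a minimal sketch under stated assumptions, not a tool shipped with LAGOON: build_commands and run_all are hypothetical names, the commands are copied from the list above, and the sketch assumes each command runs non-interactively and signals failure through a non-zero exit code.

```python
import subprocess

CLI = "./lagoon_cli.py"


def build_commands(cpython_repo, ocean_pickle, with_pastes=False):
    """Return the from-scratch ingestion commands, in the order listed above."""
    commands = [
        [CLI, "db", "reset"],
        [CLI, "ingest", "git", "load", cpython_repo],
        [CLI, "ingest", "ocean_pickle", "load", ocean_pickle],
        [CLI, "ingest", "python_peps", "load"],
        [CLI, "ingest", "toxicity_badwords", "compute"],
        # Needs requirements-ml.txt and the models in ml/nlp_models/.
        [CLI, "ingest", "toxicity_nlp", "compute"],
        [CLI, "ingest", "hibp", "load-breaches"],
    ]
    if with_pastes:
        # Optional step, as noted in the list above.
        commands.append([CLI, "ingest", "hibp", "load-pastes"])
    # Fusion is the final step in the list above.
    commands.append([CLI, "fusion", "run"])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; when dry_run is False, run it and stop on failure."""
    for command in commands:
        print("$", " ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
```

With the default dry_run=True the commands are only printed, which makes it easy to review the sequence first. Arguments are passed without a shell, so a path such as ~/Downloads/python.pck should be expanded with os.path.expanduser before it is passed in.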

During development, run ./lagoon_cli.py fusion recache after any change that affects attributes in the database, so that the latest attribute set is re-cached.

Documentation

Building the documentation requires a few additional packages, which may be installed with pip install -r requirements-dev.txt.

System documentation may be built with the following commands:

$ cd docs
$ make html
$ open _build/html/index.html
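
The open command in the last step is specific to macOS. On other platforms, the built page can be opened from Python's standard library instead. This is a minimal sketch: the function names are hypothetical, and the output path is taken from the commands above.

```python
import webbrowser
from pathlib import Path


def docs_index_uri(repo_root="."):
    """Return a file:// URI for the built documentation index page."""
    index = Path(repo_root).resolve() / "docs" / "_build" / "html" / "index.html"
    return index.as_uri()


def open_docs(repo_root="."):
    """Open the built documentation in the default web browser."""
    return webbrowser.open(docs_index_uri(repo_root))
```

Run open_docs() from the repository root after make html has finished.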

Troubleshooting

Docker crashes

If the Postgres Docker container holding the database crashes, no worries. The actual database files are stored in the folder ../deploy/dev/db, so as long as that folder still exists, the database is not truly deleted. To recover, run docker stop <container_id>, and then run ./lagoon_cli.py dev up again. You may also want to restart VSCode.
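
Before restarting the container, it may be reassuring to confirm that the database files are still on disk. The following is a minimal sketch with hypothetical helper names; it assumes that ../deploy/dev/db is resolved relative to the directory passed as repo_root, which the text above does not state explicitly.

```python
from pathlib import Path


def dev_db_dir(repo_root="."):
    """Resolve ../deploy/dev/db relative to repo_root (an assumed base)."""
    return (Path(repo_root) / ".." / "deploy" / "dev" / "db").resolve()


def database_files_survive(repo_root="."):
    """Return True if the database folder exists and is not empty."""
    db_dir = dev_db_dir(repo_root)
    return db_dir.is_dir() and any(db_dir.iterdir())
```

A True result only shows that files are present; it says nothing about whether Postgres can read them.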

Upgrading versions

The database schema is occasionally upgraded. To bring your database up to the latest version, run:

$ ./lagoon_cli.py alembic -- upgrade head

Postgres: using pgadmin

pgadmin is a popular tool for investigating PostgreSQL installations. To launch an instance of it pointing at the development database, run:

$ ./lagoon_cli.py db pgadmin

It may take up to a minute to actually open a browser tab.
