gold-crowd

A platform to create crowd-sourced gene function gold standards with Amazon Mechanical Turk

Installation

Make sure you have all requirements: Python 2, pipenv, and Java (tested on OpenJDK 1.8; Java is needed to run NobleCoder).
Download the repository.
Change into it and run pipenv install to install the Python dependencies.
Launch NobleCoder from tools/NobleCoder-1.0.jar and import the Gene Ontology (download from here) under the name go. The process.py script runs NobleCoder on your abstracts and tells it to use the ontology named "go", so if you choose a different name you will have to adapt the script.
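
Before starting, it can help to confirm that the required tools are on your PATH. The following is a minimal sketch, not part of the repository; the helper names find_executable and missing_tools are made up for illustration, and the executable names python2, pipenv, and java are assumptions that may differ on your system.

```python
from __future__ import print_function

import os


def find_executable(name, path=None):
    """Return the full path of `name` if it is an executable on PATH, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(names, path=None):
    """Return the subset of `names` that could not be found, in the given order."""
    return [name for name in names if find_executable(name, path) is None]


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to your installation.
    missing = missing_tools(["python2", "pipenv", "java"])
    if missing:
        print("Missing requirements: " + ", ".join(missing))
    else:
        print("All requirements found.")
```

This only checks that the programs exist; it does not verify their versions.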

Put the PubMed IDs of the abstracts you're interested in into data/pmid_list.txt.
Run the script with pipenv run python process.py.
The output is written to data/abstracts and data/brat-input. Copy all files from both folders into a single folder of your brat installation. That folder also needs an annotation.conf file, which could look like this (more information here):
```
[entities]

Gene
Function

[relations]

Does	Arg1:Gene, Arg2:Function
Does	Arg1:Function, Arg2:Gene
DoesNot	Arg1:Function, Arg2:Gene
DoesNot	Arg1:Gene, Arg2:Function

[attributes]

[events]
```
There will also be a file data/statistics.cvs containing the number of words, genes, and functions for each abstract.
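
Copying the output of data/abstracts and data/brat-input into one brat folder can be scripted. The sketch below is only an illustration and is not part of the repository: the function name collect_for_brat and the target folder in the usage comment are hypothetical, and the code assumes that file names in the two output folders do not clash (a later file would overwrite an earlier one with the same name).

```python
from __future__ import print_function

import os
import shutil


def collect_for_brat(source_dirs, target_dir):
    """Copy every regular file from each source folder into target_dir.

    Returns the list of copied file names, in the order they were copied.
    """
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    copied = []
    for source in source_dirs:
        for name in sorted(os.listdir(source)):
            path = os.path.join(source, name)
            if os.path.isfile(path):  # ignore subfolders
                shutil.copy(path, os.path.join(target_dir, name))
                copied.append(name)
    return copied


# Example call (the target path is hypothetical; use a folder inside
# your own brat data directory):
# collect_for_brat(["data/abstracts", "data/brat-input"],
#                  "/path/to/brat/data/gold-crowd")
```

The annotation.conf file still has to be placed in the target folder separately.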
