github_crawl

A script that crawls GitHub by username or repository URL and collects .py files for building NLP datasets.

Installation

To install, clone the repository and install its dependencies from inside the cloned directory:

$ git clone https://github.com/lupks/github_crawl

$ cd github_crawl

$ pip install -r requirements.txt

To mine .py files from a single repository, pass the repository URL and an output directory:

$ python src/repo_crawl.py {url} {output directory}
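
For readers curious how a repository URL can be turned into a list of .py files, here is a minimal sketch. It is not taken from src/repo_crawl.py and may differ from what that script does. The helper names `parse_repo_url` and `python_paths` are hypothetical, and the entry shape (`path`, `type`) assumes the GitHub REST API's git trees response, where files are reported with type "blob".

```python
from urllib.parse import urlparse


def parse_repo_url(url):
    """Split a GitHub repository URL into (owner, repo). Hypothetical helper."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 2 or not all(parts[:2]):
        raise ValueError(f"not a repository URL: {url}")
    owner, repo = parts[0], parts[1]
    # Clone URLs often carry a trailing ".git"; drop it.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo


def python_paths(tree_entries):
    """Keep only file entries (type "blob") whose path ends in .py."""
    return [
        entry["path"]
        for entry in tree_entries
        if entry.get("type") == "blob" and entry.get("path", "").endswith(".py")
    ]


if __name__ == "__main__":
    print(parse_repo_url("https://github.com/lupks/github_crawl"))
    # Hand-written sample entries, not real API output.
    sample = [
        {"path": "src", "type": "tree"},
        {"path": "src/repo_crawl.py", "type": "blob"},
        {"path": "requirements.txt", "type": "blob"},
    ]
    print(python_paths(sample))
```

Keeping the filtering step separate from any network call makes it easy to check against hand-written sample data before pointing it at a real repository.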

To mine .py files from every repository belonging to a user, pass the username and an output directory:

$ python src/repo_crawl.py {username} {output directory}
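
Crawling by username generally means listing the user's public repositories first and then processing each one. The sketch below illustrates that first step under stated assumptions. It is not taken from this project's source. The function names are hypothetical, the endpoint is the GitHub REST API's "list repositories for a user" route, and the `html_url` and `fork` fields are assumed to be present on each repository object as that API returns them.

```python
def user_repos_endpoint(username, page=1, per_page=100):
    """Build the REST API URL for one page of a user's public repositories."""
    return (
        f"https://api.github.com/users/{username}/repos"
        f"?per_page={per_page}&page={page}"
    )


def repo_urls(repos_payload, include_forks=False):
    """Extract repository page URLs from a decoded JSON list of repositories."""
    return [
        repo["html_url"]
        for repo in repos_payload
        # Forks are skipped by default to avoid mining duplicated code.
        if include_forks or not repo.get("fork", False)
    ]


if __name__ == "__main__":
    print(user_repos_endpoint("lupks"))
    # Hand-written sample payload, not real API output.
    sample = [
        {"html_url": "https://github.com/lupks/github_crawl", "fork": False},
        {"html_url": "https://github.com/lupks/some_fork", "fork": True},
    ]
    print(repo_urls(sample))
```

Skipping forks is a design choice for dataset building, since forked repositories tend to repeat files that already appear upstream.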

For help or issues involving github_crawl, please submit a GitHub issue.

License: MIT