A script that crawls GitHub by username or repo and mines .py files for building NLP datasets.
To install the package, follow these steps:
- Clone the repo:
$ git clone https://github.com/lupks/github_crawl
- Install any missing dependencies:
$ pip install -r requirements.txt
To mine for .py files by repo, pass the following arguments:
- URL of repo
- Output directory where each Python script is saved as a text file
$ python src/repo_crawl.py {url} {output directory}
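As a rough illustration of what mining a repo involves, the sketch below splits a repo URL into its owner and repo name and picks the .py paths out of a file listing. It is not taken from src/repo_crawl.py; the helper names `parse_repo_url` and `filter_python_paths` and the sample paths are assumptions made for this example.

```python
from urllib.parse import urlparse


def parse_repo_url(url):
    """Split a GitHub repo URL into (owner, repo). Hypothetical helper."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repo URL: {url}")
    owner, repo = parts[0], parts[1]
    # Drop a trailing ".git" suffix, as found in clone URLs.
    if repo.endswith(".git"):
        repo = repo[:-4]
    return owner, repo


def filter_python_paths(paths):
    """Keep only the paths that end in .py. Hypothetical helper."""
    return [p for p in paths if p.endswith(".py")]


if __name__ == "__main__":
    print(parse_repo_url("https://github.com/lupks/github_crawl"))
    # Sample listing invented for illustration.
    print(filter_python_paths(["src/repo_crawl.py", "README.md"]))
```

Keeping the URL parsing and the file filtering in separate pure functions makes both easy to check without touching the network.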
To mine for .py files by username, pass the following arguments:
- GitHub username (e.g. 'lupks')
- Output directory where each Python script is saved as a text file
$ python src/repo_crawl.py {username} {output directory}
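Mining by username first requires finding that user's repos. The sketch below shows one possible way to do this with the public GitHub REST API endpoint `/users/{username}/repos`, which returns a JSON list of repo objects, each with an `html_url` field. This is an assumption about the approach, not a description of how github_crawl works internally; the function names are hypothetical, and only the first page of results is read.

```python
import json
from urllib.request import urlopen

API_ROOT = "https://api.github.com"


def user_repos_endpoint(username):
    """Build the REST API URL that lists a user's public repos."""
    return f"{API_ROOT}/users/{username}/repos"


def repo_urls_from_listing(listing):
    """Extract the browser URL of each repo from a decoded API response."""
    return [repo["html_url"] for repo in listing]


def fetch_user_repo_urls(username):
    """Fetch the first page of a user's repos (unauthenticated, rate limited)."""
    with urlopen(user_repos_endpoint(username)) as response:
        listing = json.load(response)
    return repo_urls_from_listing(listing)


if __name__ == "__main__":
    # Requires network access; each URL could then be mined as a single repo.
    for url in fetch_user_repo_urls("lupks"):
        print(url)
```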
For help or issues involving github_crawl, please submit a GitHub issue.
License: MIT