Python script to crawl through a repo looking for Python scripts

lupks/github_crawl

github_crawl

Script that crawls GitHub by username or by repo and collects .py files for building NLP datasets.

Installation

To install the package, follow these steps:

  • Clone the repo:
$ git clone https://github.com/lupks/github_crawl
  • Install any missing dependencies (run this from inside the cloned github_crawl directory):
$ pip install -r requirements.txt

Usage

repo_crawl

To mine .py files from a single repo, pass the following arguments:

  • URL of the repo
  • Output directory where each Python script is saved as a text file
$ python src/repo_crawl.py {url} {output directory}
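
The internals of repo_crawl.py are not shown in this README. As a rough illustration of the "save each Python script as a text file" step, here is a minimal sketch that assumes the repo has already been cloned to a local directory. The function name `save_py_as_text` and the flattened `dir__file.txt` naming scheme are hypothetical and not taken from the script itself.

```python
from pathlib import Path


def save_py_as_text(repo_dir, out_dir):
    """Copy every .py file under repo_dir into out_dir as a .txt file.

    Hypothetical helper for illustration; not part of github_crawl.
    """
    repo_dir, out_dir = Path(repo_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(repo_dir.rglob("*.py")):
        if not src.is_file():
            continue
        # Flatten the relative path (pkg/mod.py -> pkg__mod.txt) so that
        # files with the same name in different folders do not collide.
        rel = src.relative_to(repo_dir)
        dest = out_dir / ("__".join(rel.with_suffix("").parts) + ".txt")
        # Replace undecodable bytes instead of failing on odd encodings.
        text = src.read_text(encoding="utf-8", errors="replace")
        dest.write_text(text, encoding="utf-8")
        written.append(dest)
    return written
```

Flattening the path is only one possible design choice; mirroring the repo's folder layout inside the output directory would work equally well.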

user_crawl

To mine .py files from all of a user's repos, pass the following arguments:

  • GitHub username (e.g. 'lupks')
  • Output directory where each Python script is saved as a text file
$ python src/user_crawl.py {username} {output directory}
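
How the script finds a user's repos is not documented here. One plausible approach, sketched below, is to page through the GitHub REST API endpoint `GET /users/{username}/repos` and collect each repo's `clone_url`. The helper names (`repos_url`, `clone_urls`, `list_user_repos`) are hypothetical, and skipping forks is an assumption made to reduce duplicated code in a dataset, not documented behavior of github_crawl.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def repos_url(username, page=1, per_page=100):
    """Build the URL for one page of a user's public repos."""
    return f"{API_ROOT}/users/{username}/repos?per_page={per_page}&page={page}"


def clone_urls(payload):
    """Extract clone URLs from a decoded repo-list response, skipping forks."""
    return [repo["clone_url"] for repo in payload if not repo.get("fork")]


def list_user_repos(username):
    """Fetch every page of results until the API returns an empty list."""
    urls, page = [], 1
    while True:
        req = urllib.request.Request(
            repos_url(username, page),
            headers={"Accept": "application/vnd.github+json"},
        )
        with urllib.request.urlopen(req) as resp:
            payload = json.load(resp)
        if not payload:
            return urls
        urls.extend(clone_urls(payload))
        page += 1
```

Unauthenticated requests to the GitHub API are rate limited, so a crawl over many users would likely need an access token.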

Contact/Issues

For help or issues involving github_crawl, please submit a GitHub issue.

License

MIT
