GitHub Miner is a tool designed to scrape and analyze GitHub data to provide insightful statistics and information about repository metadata, pull requests, issues, and team collaboration. Here's a quick overview of how to set up and use the tool. This project was generated using the dxworks-template-node-ts repository template.
GitHub Miner is designed to help data scientists, researchers, and curious minds fetch, analyze, and visualize GitHub data. It uses the GitHub GraphQL API to retrieve the relevant information and provides a suite of functions to analyze the data and derive statistics from it.
- Repository Analysis
- Issue Analysis
- Pull Request Analysis
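As a rough illustration of the kind of GitHub GraphQL API request that such an analysis could start from, here is a minimal sketch. It is not GitHub Miner's actual code: the function name `buildRepositoryRequest`, the `GraphQLRequest` shape, and the selection of fields are assumptions made for this example. Only the endpoint URL and the queried fields come from the public GitHub GraphQL API.

```typescript
// Hypothetical sketch, not GitHub Miner's actual implementation.
const GITHUB_GRAPHQL_URL = "https://api.github.com/graphql";

// Asks for basic repository metadata plus issue and pull request counts.
const REPOSITORY_QUERY = `
  query ($owner: String!, $name: String!) {
    repository(owner: $owner, name: $name) {
      nameWithOwner
      stargazerCount
      issues { totalCount }
      pullRequests { totalCount }
    }
  }
`;

interface GraphQLRequest {
  url: string;
  init: {
    method: "POST";
    headers: Record<string, string>;
    body: string;
  };
}

// Builds the HTTP request without sending it, so it can be inspected offline.
function buildRepositoryRequest(owner: string, name: string, token: string): GraphQLRequest {
  return {
    url: GITHUB_GRAPHQL_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ query: REPOSITORY_QUERY, variables: { owner, name } }),
    },
  };
}

// Usage (Node.js 18+ ships a global fetch):
//   const request = buildRepositoryRequest("some-owner", "some-repo", token);
//   const response = await fetch(request.url, request.init);
//   const data = await response.json();
```

Keeping the request construction separate from the network call makes it easy to check the query and variables without spending any of the API rate limit.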
Before you begin, ensure you have met the following requirements:
- You have installed the latest version of Node.js and npm.
- You have installed the latest version of .NET.
- You have a Windows/Linux/Mac machine.
- You have read the GitHub API rate limiting rules.
- You have a GitHub personal access token.
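To make the rate limiting requirement more concrete, the sketch below shows one way a client could decide how long to pause before sending more requests. The GitHub GraphQL API exposes a `rateLimit` object with `limit`, `remaining`, and `resetAt` fields. The helper `millisecondsToWait` and its `minRemaining` threshold are assumptions made for this example and are not part of GitHub Miner.

```typescript
// Hypothetical sketch, not GitHub Miner's actual implementation.

// Query that returns the current rate limit status for the authenticated token.
const RATE_LIMIT_QUERY = `query { rateLimit { limit remaining resetAt } }`;

// Shape of the rateLimit object returned by the query above.
interface RateLimit {
  limit: number;
  remaining: number;
  resetAt: string; // ISO 8601 timestamp
}

// Returns how long to wait (in milliseconds) before sending more requests.
// Returns 0 while more than `minRemaining` points are still available.
function millisecondsToWait(rateLimit: RateLimit, minRemaining: number, now: Date): number {
  if (rateLimit.remaining > minRemaining) {
    return 0;
  }
  const resetTime = new Date(rateLimit.resetAt).getTime();
  // Never return a negative delay if the reset time has already passed.
  return Math.max(0, resetTime - now.getTime());
}
```

Passing the current time as a parameter keeps the function deterministic, which makes the waiting logic straightforward to verify.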
To install GitHub Miner, follow these steps:
- Clone the repository.
- Install the requirements for the Node.js project:
    cd github-miner-2/github-miner
    npm install
To use GitHub Miner, follow these steps:
- Generate a GitHub personal access token.
- Complete the necessary information in the configuration file. To specify the repository targeted for data extraction, provide its name and owner.
- Start the data extraction process by running in the terminal:
    npm run start
- After the data extraction process is completed, a new folder named exports is created, containing three files: issues.json, pullRequests.json and repositoryInfo.json. Move or copy these files to the GithubAnalyzer folder.
- Start the data analysis process by running in the terminal:
    cd ../GithubAnalyzer
    dotnet run ./GithubAnalyzer.csproj
- The final result file is created in the GithubAnalyzer folder, in JSON and CSV formats.
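The manual step of moving the exported files into the GithubAnalyzer folder could be scripted. The following is a minimal sketch under stated assumptions: the three file names come from the steps above, while the helper `copyExports` and the directory paths passed to it are hypothetical and should be adapted to wherever the exports folder is actually created.

```typescript
// Hypothetical helper, not part of GitHub Miner.
import * as fs from "fs";
import * as path from "path";

// File names produced by the data extraction step.
const EXPORT_FILES = ["issues.json", "pullRequests.json", "repositoryInfo.json"];

// Copies the three export files into the analyzer folder.
// Throws if any file is missing; returns the destination paths on success.
function copyExports(exportsDir: string, analyzerDir: string): string[] {
  const missing = EXPORT_FILES.filter((file) => !fs.existsSync(path.join(exportsDir, file)));
  if (missing.length > 0) {
    throw new Error(`Missing export files: ${missing.join(", ")}`);
  }
  return EXPORT_FILES.map((file) => {
    const destination = path.join(analyzerDir, file);
    fs.copyFileSync(path.join(exportsDir, file), destination);
    return destination;
  });
}

// Usage, assuming the script runs from the folder that contains exports:
//   copyExports("exports", "../GithubAnalyzer");
```

Checking for all three files before copying avoids starting the analysis with a partial data set.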
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.