This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

CASICS Collector

The CASICS Collector is a repository crawler and scraper that extracts data about projects and stores it in the CASICS (Comprehensive and Automated Software Inventory Creation System) database.

Authors: Michael Hucka
Repository: https://github.com/casics/collector
License: Unless otherwise noted, this content is licensed under the GPLv3 license.

☀ Introduction

CASICS (the Comprehensive and Automated Software Inventory Creation System) is a proof-of-concept project that uses machine learning techniques to analyze source code in software repositories and classify those repositories. As part of this project, we need to obtain data about software project repositories on GitHub and (eventually) other hosting systems such as SourceForge. This module, the CASICS Collector, is designed to gather that data.

The Collector module queries hosting services via APIs (and for some purposes, also scrapes project web pages) and writes the data to the CASICS Database. It is designed as a separate module so that one or more instances can be started and run simultaneously. It does not download copies of repository files; that task is left to a separate module, the CASICS Downloader.
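As a rough illustration of the query-then-store flow described above, here is a minimal sketch in Python. It is not the Collector's actual code: the function names (`repo_url`, `fetch_repo`, `to_record`, `store`), the choice of fields to keep, and the use of a plain dictionary as a stand-in for the CASICS Database are all assumptions made for this example. The only external interface it relies on is the public GitHub REST endpoint for a single repository.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def repo_url(owner, name):
    """Build the GitHub REST API URL for one repository."""
    return f"{GITHUB_API}/repos/{owner}/{name}"


def fetch_repo(owner, name):
    """Fetch repository metadata as a dict (performs a network request)."""
    request = urllib.request.Request(
        repo_url(owner, name),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_record(api_data):
    """Reduce an API response to the fields this sketch chooses to keep."""
    return {
        "id": api_data["id"],
        "full_name": api_data["full_name"],
        "description": api_data.get("description"),
        "language": api_data.get("language"),
        "is_fork": api_data.get("fork", False),
    }


def store(database, record):
    """Insert or overwrite a record, keyed by repository id.

    `database` is a plain dict here, a hypothetical stand-in for the
    real CASICS Database.
    """
    database[record["id"]] = record
```

In this sketch, a call such as `store(db, to_record(fetch_repo("casics", "collector")))` would add one entry to `db`. Keying records by the repository id means that re-collecting the same repository overwrites its entry rather than duplicating it, which is one simple way to keep repeated or overlapping runs from producing duplicate records.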

The CASICS Collector is written in Python.

⁇ Getting help and support

If you find an issue, please submit it in the GitHub issue tracker for this repository.

♬ Contributing — info for developers

Much remains to be done on CASICS in many areas, and we would be happy to receive your help and participation. Please feel free to contact the developers via GitHub or the mailing list casics-team@googlegroups.com.

Everyone is asked to read and respect the code of conduct when participating in this project.

❤️ Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant Number 1533792 (Principal Investigator: Michael Hucka). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


             
