This tool imports GitHub metadata from repositories into the Software Observatory database. It identifies the GitHub repositories listed in the database entries, retrieves metadata for each repository using the GitHub metadata API, and stores the retrieved metadata back in the database.
If you are looking for a tool to import metadata from a GitHub repository, you can directly use the GitHub metadata importer tool. More specifically, use this endpoint.
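The first step described above, identifying which database entries point at GitHub repositories, might look roughly like the sketch below. It is illustrative only and is not the tool's actual code: the function names and the `repository` field holding a list of URLs are assumptions, and the real entry schema may differ.

```python
from typing import Optional
from urllib.parse import urlparse


def parse_github_repo(url: str) -> Optional[tuple]:
    """Return (owner, repo) if `url` points at a GitHub repository, else None."""
    url = url.strip()
    # Tolerate URLs stored without a scheme, e.g. "github.com/owner/repo".
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    if parsed.hostname not in ("github.com", "www.github.com"):
        return None
    # Keep only non-empty path segments; the first two are owner and repo.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo


def github_repos(entries):
    """Collect unique (owner, repo) pairs from a list of entry dicts.

    Assumes (hypothetically) that each entry may carry a `repository`
    field containing a list of URL strings.
    """
    found = set()
    for entry in entries:
        for url in entry.get("repository", []):
            repo = parse_github_repo(url)
            if repo:
                found.add(repo)
    return sorted(found)
```

Entries without a `repository` field, or with URLs on other hosts, are skipped, so only GitHub repositories would be passed on to the metadata retrieval step.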
The tool is written in Python 3.12 and requires the packages in the file requirements.txt. You can install the required packages using the following command:
pip install -r requirements.txt
The tool requires the following environment variables to be set:
MONGO_HOST: the hostname of the MongoDB server.
MONGO_PORT: the port of the MongoDB server.
MONGO_USER: the username for the MongoDB server.
MONGO_PWD: the password for the MongoDB server.
MONGO_AUTH_SRC: the authentication source for the MongoDB server.
MONGO_DB: the name of the MongoDB database.
REPOSITORIES: the name of the database where the gathered metadata will be stored.
PRETOOLS: the name of the Pretools database. The tool will read the list of repositories from this database.
GITHUB_TOKEN: the user GitHub token to use for the GitHub metadata API. The token must have read:packages enabled.
Put these environment variables in a .env file in the root directory of the project.
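As a rough illustration of how these variables fit together, the sketch below checks that all of them are set and assembles a standard MongoDB connection URI from the connection-related ones. It is not the tool's actual code: `load_settings` and `mongo_uri` are hypothetical helpers, and the sketch assumes the values from .env have already been loaded into the process environment (for example by a library such as python-dotenv; whether the tool uses it is not stated here).

```python
import os
from urllib.parse import quote_plus

# The variable names listed above.
REQUIRED_VARS = (
    "MONGO_HOST", "MONGO_PORT", "MONGO_USER", "MONGO_PWD",
    "MONGO_AUTH_SRC", "MONGO_DB", "REPOSITORIES", "PRETOOLS",
    "GITHUB_TOKEN",
)


def load_settings(env=None):
    """Return the required variables as a dict, failing early if any is unset."""
    if env is None:
        env = os.environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def mongo_uri(settings):
    """Build a MongoDB connection URI from the settings.

    The username, password, and auth source are percent-encoded so that
    characters such as '@' or ':' do not break the URI.
    """
    user = quote_plus(settings["MONGO_USER"])
    pwd = quote_plus(settings["MONGO_PWD"])
    auth_src = quote_plus(settings["MONGO_AUTH_SRC"])
    host = settings["MONGO_HOST"]
    port = settings["MONGO_PORT"]
    return f"mongodb://{user}:{pwd}@{host}:{port}/?authSource={auth_src}"
```

Failing early with the full list of missing names tends to be easier to debug than a connection error raised later in the run.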
To run the tool, execute the following command:
python3 main.py