An automatically generated registry of LinkML schemas discovered from GitHub repositories.
- Documentation: https://linkml.github.io/linkml-registry/registry/
- Registry Data: linkml_registry.yaml
The LinkML Registry is automatically generated using GitHub's Search API to discover repositories that use LinkML. The registry is updated through automated scripts that search for, analyze, and catalog LinkML projects.
The main discovery script (src/scripts/discover_linkml_repos.py
) performs the following operations:
- Repository searches: Queries for repositories containing "linkml", "linkml-project-cookiecutter", "linkml in:description", and "LinkML in:description"
- File content searches: Searches for "linkml" in specific files including:
pyproject.toml
requirements.txt
setup.py
,setup.cfg
poetry.lock
,Pipfile
- README files (
.md
,.rst
)
Note: GitHub's code search API does not index all repositories or files immediately. Some repositories, particularly newer ones or those with lower activity, may not have their files indexed and therefore won't be discovered through file content searches. If your LinkML project is not appearing in the registry, it may be due to GitHub's indexing limitations rather than the search criteria.
The following filters are applied to discovered repositories:
- Fork exclusion: Skips forked repositories to focus on original projects
- LinkML organization exclusion: Excludes repositories from the
linkml/
organization itself to focus on external usage - Minimum stars: Configurable minimum GitHub star count (default: 0)
- Archive status: Filters out archived repositories during analysis
For each discovered repository, the script extracts:
- Basic metadata: Title, description, license, GitHub URL
- Domain classification: Categorizes projects by domain (Biology, Computer Science, Chemistry, etc.)
- Status determination: Assesses if projects are active, inactive, or maintenance mode
- Topics extraction: Collects relevant GitHub topics, filtering out generic ones
- Schema URL detection: Attempts to identify documentation URLs (GitHub Pages, w3id.org)
- Contact information: Extracts repository owner/organization
- Primary sort: By GitHub star count (descending)
- Secondary sort: Alphabetically by repository title within same star count
The registry is automatically updated through GitHub Actions workflows:
- Trigger: Push to main branch or manual dispatch
- Frequency: Can be run on-demand via the GitHub Actions interface
- Process:
- Runs discovery script to update
linkml_registry.yaml
- Generates documentation from the updated registry
- Commits changes back to the repository
- Deploys updated documentation to GitHub Pages
- Runs discovery script to update
To manually update the registry:
# Discover and update the registry
make discover
# Generate documentation
make gendoc
# View locally
make testdoc
- Discovery script:
src/scripts/discover_linkml_repos.py
- Documentation generator:
src/scripts/generate_registry_docs.py
- Template files:
src/doc_templates/
- Makefile targets:
discover
,gendoc
,testdoc
.github/workflows/regenerate-registry.yaml
: Updates registry data.github/workflows/deploy-docs.yaml
: Deploys documentation to GitHub Pages.github/workflows/test-pr.yaml
: Runs tests on pull requests
The registry serves as a comprehensive catalog of the LinkML ecosystem, helping users discover projects, tools, and schemas that utilize the LinkML framework.