first commit for the open-sourced version
Valentin Zulkower committed Oct 31, 2024
0 parents commit d85a13b
Showing 20 changed files with 790 additions and 0 deletions.
12 changes: 12 additions & 0 deletions .bumpversion.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
[bumpversion]
current_version = 0.0.3
tag = True
commit = True

[bumpversion:file:pyproject.toml]

[bumpversion:file:ginkgo_ai_client/__init__.py]

[bumpversion:file:CHANGES.md]
search = Unreleased
replace = {new_version} ({utcnow.year}-{utcnow.month:0>2}-{utcnow.day:0>2})
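
The `replace` template above uses Python format-string syntax. As a rough illustration of how it would render, here is a minimal sketch that applies `str.format` to the same template. It assumes bumpversion fills the template this way; the helper name `render_changelog_heading` and the release date are hypothetical.

```python
from datetime import datetime, timezone

# The `replace` template from .bumpversion.cfg, copied verbatim.
TEMPLATE = "{new_version} ({utcnow.year}-{utcnow.month:0>2}-{utcnow.day:0>2})"


def render_changelog_heading(new_version: str, utcnow: datetime) -> str:
    """Render the text that would replace "Unreleased" in CHANGES.md.

    Hypothetical helper for illustration; not part of bumpversion.
    """
    # `0>2` right-aligns the number in a field of width 2, padded with zeros.
    return TEMPLATE.format(new_version=new_version, utcnow=utcnow)


# Hypothetical release: version 0.0.4 cut on 5 November 2024 (UTC).
heading = render_changelog_heading("0.0.4", datetime(2024, 11, 5, tzinfo=timezone.utc))
print(heading)  # 0.0.4 (2024-11-05)
```

The rendered heading has the same shape as the existing entries in `CHANGES.md`, such as `0.0.3 (2024-10-28)`.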
29 changes: 29 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,29 @@
name: Test

on: [push, pull_request]

permissions:
  contents: read

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.8", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: pip
          cache-dependency-path: pyproject.toml
      - name: Install dependencies
        run: |
          pip install '.[test]'
      - name: Run tests
        env:
          GINKGOAI_API_KEY: ${{ secrets.GINKGOAI_API_KEY }}
        run: |
          pytest
34 changes: 34 additions & 0 deletions .gitignore
@@ -0,0 +1,34 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]

# Virtual environment directories
env/
venv/
ENV/
.venv/
.ENV/

# Distribution / packaging
build/
dist/
*.egg-info/
*.egg

# Log files
*.log

# Unit test / coverage reports
*.coverage
.coverage.*
.cache

# Jupyter Notebook checkpoints
.ipynb_checkpoints/

# IDE specific files
.vscode/
.idea/

# NOTE(vz): Personal pattern for keeping a non-shared sandbox directory in projects
_sandbox/
34 changes: 34 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,34 @@
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: check-added-large-files
      - id: check-ast
      - id: check-builtin-literals
      - id: check-case-conflict
      - id: check-docstring-first
      - id: check-json
      - id: check-merge-conflict
      - id: check-shebang-scripts-are-executable
      - id: check-symlinks
      - id: check-toml
      - id: check-xml
      - id: check-yaml
      - id: debug-statements
      - id: detect-private-key
      - id: end-of-file-fixer
      - id: trailing-whitespace
        exclude: |
          (?x)^(
              .bumpversion.cfg
          )$
  # Fast Python linter and formatter - replaces flake8, isort, and black
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.1.9
    hooks:
      # Run the Ruff linter
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
      # Run the Ruff formatter
      - id: ruff-format
19 changes: 19 additions & 0 deletions CHANGES.md
@@ -0,0 +1,19 @@
# Changelog

Major changes to `ginkgo-ai-client` are documented here.
Version numbers for the project follow the conventions described in [PEP 440](https://peps.python.org/pep-0440/)
and [Semantic Versioning 2.0.0](http://semver.org/), with the exceptions that:

- versions below `1.0.0` will be numbered as `0.major.minor-or-patch`

- versions above `1.0.0` will be numbered as `major.minor.patch`, as is
typical

## 0.0.3 (2024-10-28)

- Added UTR model and ESM
- Added batch inference

## 0.0.2 (2024-10-18)

- First version, AA0 only
7 changes: 7 additions & 0 deletions LICENCE
@@ -0,0 +1,7 @@
Copyright 2024 Ginkgo Bioworks

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
54 changes: 54 additions & 0 deletions README.md
@@ -0,0 +1,54 @@
# ginkgo-ai-client

**Work in progress: this repo was just made public and we are still working on integration**

A Python client for [Ginkgo's AI model API](https://models.ginkgobioworks.ai/), to run inference on public and Ginkgo-proprietary models.
Learn more in the [Model API announcement](https://www.ginkgobioworks.com/2024/09/17/ginkgo-model-api-ai-research/).

## Prerequisites

Register at https://models.ginkgobioworks.ai/ to get credits and an API key (of the form `b396553a-326a-4478-22eb-223e6ef9ee49`).
Store the API key in the `GINKGOAI_API_KEY` environment variable.
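
Before creating a client, it can help to confirm that the key is actually visible to Python. The sketch below is one way to do that with the standard library only. The helper names `read_api_key` and `looks_like_api_key` are hypothetical, and the expected key shape is an assumption inferred from the single example key above; the service may accept other formats.

```python
import os
import re

# Shape inferred from the example key in this README (8-4-4-4-12 hex groups).
# This is an assumption, not a documented guarantee of the service.
KEY_SHAPE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def looks_like_api_key(value: str) -> bool:
    """Return True if the value has the same shape as the example key."""
    return KEY_SHAPE.fullmatch(value.strip().lower()) is not None


def read_api_key(env=os.environ) -> str:
    """Return the key stored in GINKGOAI_API_KEY, or raise if it is missing."""
    key = env.get("GINKGOAI_API_KEY")
    if not key:
        raise RuntimeError("GINKGOAI_API_KEY is not set")
    return key


# Demonstration with an explicit mapping instead of the real environment.
demo_env = {"GINKGOAI_API_KEY": "b396553a-326a-4478-22eb-223e6ef9ee49"}
print(looks_like_api_key(read_api_key(demo_env)))  # True
```

Calling `read_api_key()` with no argument reads the real environment.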

## Installation

Install the Python client with pip:

```bash
pip install ginkgo-ai-client
```

## Usage

**Note: This is an alpha version of the client and its interface may change in the future.**

The client requires an API key and defaults to `os.environ.get("GINKGOAI_API_KEY")` if none is explicitly provided.

```python
from ginkgo_ai_client import GinkgoAIClient, aa0_masked_inference_params

client = GinkgoAIClient()
prediction = client.query(aa0_masked_inference_params("MPK<mask><mask>RRL"))
# prediction["sequence"] == "MPKYLRRL"

predictions = client.batch_query([
    aa0_masked_inference_params("MPK<mask><mask>RRL"),
    aa0_masked_inference_params("M<mask>RL"),
    aa0_masked_inference_params("MLLM<mask><mask>R"),
])
# predictions[0]["result"]["sequence"] == "MPKYLRRL"
```
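
The masked-inference queries above take a sequence in which the positions to predict are replaced by `<mask>`. As a convenience, a masked string can be built from a full sequence and a list of positions. This is a minimal sketch; `mask_positions` is a hypothetical helper and not part of the client.

```python
from typing import Iterable

MASK_TOKEN = "<mask>"  # token used in the README examples


def mask_positions(sequence: str, positions: Iterable[int]) -> str:
    """Replace the residues at the given 0-based positions with the mask token.

    Hypothetical helper for illustration; not provided by ginkgo_ai_client.
    """
    to_mask = set(positions)
    out_of_range = sorted(p for p in to_mask if not 0 <= p < len(sequence))
    if out_of_range:
        raise IndexError(f"positions out of range: {out_of_range}")
    return "".join(
        MASK_TOKEN if i in to_mask else residue
        for i, residue in enumerate(sequence)
    )


masked = mask_positions("MPKYLRRL", [3, 4])
print(masked)  # MPK<mask><mask>RRL
```

The resulting string can then be passed to `aa0_masked_inference_params`, as in the example above.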

## Available models

See the reference docs for more details on usage and parameters.

| Model | Description | Reference | Supported queries | Versions |
| ----- | ------------------------------------------- | -------------------------------------------------------------------------------------------- | ---------------------------- | -------- |
| ESM2  | Large protein language model from Meta      | [GitHub](https://github.com/facebookresearch/esm?tab=readme-ov-file#esmfold)                 | Embeddings, masked inference | 3B, 650M |
| AA0 | Ginkgo's proprietary protein language model | [Announcement](https://www.ginkgobioworks.com/2024/09/17/aa-0-protein-llm-technical-review/) | Embeddings, masked inference | 650M |
| 3UTR | Ginkgo's proprietary 3'UTR language model | [Preprint](https://www.biorxiv.org/content/10.1101/2024.10.07.616676v1) | Embeddings, masked inference | 650M |

## License

This project is licensed under the MIT License. See the `LICENCE` file for details.
23 changes: 23 additions & 0 deletions docs/Makefile
@@ -0,0 +1,23 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

livehtml:
	sphinx-autobuild -n "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
3 changes: 3 additions & 0 deletions docs/README.md
@@ -0,0 +1,3 @@
# Docs

Note: you can get live docs running with `make livehtml`.
35 changes: 35 additions & 0 deletions docs/make.bat
@@ -0,0 +1,35 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.https://www.sphinx-doc.org/
	exit /b 1
)

if "%1" == "" goto help

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
3 changes: 3 additions & 0 deletions docs/source/_static/custom.css
@@ -0,0 +1,3 @@
.w-64 {
  width: 26rem;
}
37 changes: 37 additions & 0 deletions docs/source/conf.py
@@ -0,0 +1,37 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
import os
import sys

sys.path.insert(0, os.path.abspath("../../"))

project = "ginkgo_ai_client"
copyright = "2024, Ginkgo Bioworks"
author = "Ginkgo Bioworks"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration


extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "myst_parser",  # for markdown support
]

templates_path = ["_templates"]
exclude_patterns = []


# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "shibuya"
html_static_path = ["_static"]
html_css_files = ["custom.css"]
28 changes: 28 additions & 0 deletions docs/source/index.rst
@@ -0,0 +1,28 @@
.. ginkgo_ai_client documentation master file, created by
   sphinx-quickstart on Wed Oct 30 17:29:16 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.
.. include:: ../../README.md
   :parser: myst_parser.sphinx_

----

API Documentation
-----------------

GinkgoAIClient
~~~~~~~~~~~~~~

.. automodule:: ginkgo_ai_client.client
   :members:

Query Parameters
~~~~~~~~~~~~~~~~

.. automodule:: ginkgo_ai_client.query_parameters
   :members:

.. .. toctree::
.. :maxdepth: 2
.. :caption: Contents:
22 changes: 22 additions & 0 deletions ginkgo_ai_client/__init__.py
@@ -0,0 +1,22 @@
__version__ = "0.0.3"

from .client import GinkgoAIClient

from .query_parameters import (
    aa0_masked_inference_params,
    aa0_mean_embedding_params,
    esm_mean_embedding_params,
    esm_masked_inference_params,
    three_utr_masked_inference_params,
    three_utr_mean_embedding_params,
)

__all__ = [
    "GinkgoAIClient",
    "aa0_masked_inference_params",
    "aa0_mean_embedding_params",
    "esm_mean_embedding_params",
    "esm_masked_inference_params",
    "three_utr_masked_inference_params",
    "three_utr_mean_embedding_params",
]