Skip to content

Latest commit

 

History

History
347 lines (241 loc) · 9.81 KB

File metadata and controls

347 lines (241 loc) · 9.81 KB

Contributing guide

Development environment

Development tasks are managed with mise. Run mise tasks to see all available tasks.

Prerequisites

Setup

# Trust the mise configuration and install tools
mise trust
mise install

# Create Python virtualenv and install dependencies
uv venv
uv sync --all-groups

Build & install

mise run build

# symlink the binary to /usr/local/bin
sudo mise run install                     

After making changes, run mise run build to rebuild and it will get picked up by the symlink.

Common tasks

# Run all tests
mise run test:go
mise run test:python
mise run test:rust

# Run specific tests
mise run test:go -- ./pkg/config
uv run tox -e py312-tests -- python/tests/server/test_http.py -k test_name

# Format code (all languages)
mise run fmt:fix

# Lint code (all languages)
mise run lint

Run mise tasks for the complete list of available tasks.

If you encounter any errors, see the troubleshooting section below.

Project structure

As much as possible, this is attempting to follow the Standard Go Project Layout.

  • cmd/ - The root cog command.
  • pkg/cli/ - CLI commands.
  • pkg/config - Everything cog.yaml related.
  • pkg/docker/ - Low-level interface for Docker commands.
  • pkg/dockerfile/ - Creates Dockerfiles.
  • pkg/image/ - Creates and manipulates Cog Docker images.
  • pkg/predict/ - Runs predictions on models.
  • pkg/util/ - Various packages that aren't part of Cog. They could reasonably be separate re-usable projects.
  • python/ - The Cog Python library.
  • integration-tests/ - Go-based integration tests using testscript.
  • tools/compatgen/ - Tool for generating CUDA/PyTorch/TensorFlow compatibility matrices.

For deeper architectural understanding, see the architecture documentation.

Updating compatibility matrices

The CUDA base images and framework compatibility matrices in pkg/config/ are checked into source control and only need to be regenerated when adding support for new versions of CUDA, PyTorch, or TensorFlow.

To regenerate the compatibility matrices, run:

# Regenerate all matrices
mise run generate:compat

# Or regenerate specific matrices
mise run generate:compat cuda
mise run generate:compat torch
mise run generate:compat tensorflow

The generated files are:

  • pkg/config/cuda_base_images.json - Available NVIDIA CUDA base images
  • pkg/config/torch_compatibility_matrix.json - PyTorch/CUDA/Python compatibility
  • pkg/config/tf_compatibility_matrix.json - TensorFlow/CUDA/Python compatibility

CI tool dependencies

Development tools are managed in two places that must be kept in sync:

  1. mise.toml — Tool versions for local development (uses aqua backend for prebuilt binaries)
  2. .github/workflows/ci.yaml — Tool installation for CI (uses dedicated GitHub Actions)

CI deliberately avoids aqua downloads from GitHub Releases to prevent transient 502 failures. Instead, it uses dedicated actions (taiki-e/install-action, go install, PyO3/maturin-action, etc.) that are more reliable.

Tools disabled in CI are listed in MISE_DISABLE_TOOLS in ci.yaml.

When updating a tool version, update both:

  • The version in mise.toml (for local dev)
  • The corresponding version pin in .github/workflows/ci.yaml (for CI)

See the CI Tool Dependencies section in AGENTS.md for the full mapping of tools to their CI installation methods.

Concepts

There are a few concepts used throughout Cog that might be helpful to understand.

  • Config: The cog.yaml file.
  • Image: Represents a built Docker image that serves the Cog API, containing a model.
  • Input: Input from a prediction, as key/value JSON object.
  • Model: A user's machine learning model, consisting of code and weights.
  • Output: Output from a prediction, as arbitrarily complex JSON object.
  • Prediction: A single run of the model, that takes input and produces output.
  • Predictor: Defines how Cog runs predictions on a model.

Running tests

To run the entire test suite:

mise run test:go
mise run test:python
mise run test:rust

To run just the Go unit tests:

mise run test:go

To run just the Python tests:

mise run test:python

[!INFO] This runs the Python test suite across all supported Python versions (3.10-3.13) using tox.

Integration Tests

Integration tests are in integration-tests/ using testscript. Each test is a self-contained .txtar file in integration-tests/tests/, with some specialized tests as Go test functions in subpackages.

# Run all integration tests
mise run test:integration

# Run a specific test
mise run test:integration string_predictor

# Run fast tests only (skip slow GPU/framework tests)
cd integration-tests && go test -short -v

# Run with a custom cog binary
COG_BINARY=/path/to/cog mise run test:integration

Writing Integration Tests

When adding new functionality, add integration tests in integration-tests/tests/. They are:

  • Self-contained (embedded fixtures in .txtar files)
  • Faster to run (parallel execution with automatic cleanup)
  • Easier to read and write (simple command script format)

Example test structure:

# Test string predictor
cog build -t $TEST_IMAGE
cog predict $TEST_IMAGE -i s=world
stdout 'hello world'

-- cog.yaml --
build:
  python_version: "3.12"
predict: "predict.py:Predictor"

-- predict.py --
from cog import BasePredictor

class Predictor(BasePredictor):
    def predict(self, s: str) -> str:
        return "hello " + s

For testing cog serve, use cog serve and the curl command:

cog build -t $TEST_IMAGE
cog serve
curl POST /predictions '{"input":{"s":"test"}}'
stdout '"output":"hello test"'

Advanced Test Commands

For tests that require subprocess initialization or async operations, use retry-curl:

retry-curl - HTTP request with automatic retries:

# Make HTTP request with retry logic (useful for subprocess initialization delays)
# retry-curl [method] [path] [body] [max-attempts] [retry-delay]
retry-curl POST /predictions '{"input":{"s":"test"}}' 30 1s
stdout '"output":"hello test"'

Example: Testing predictor with subprocess in setup

cog build -t $TEST_IMAGE
cog serve

# Use generous retries since setup spawns a background process
retry-curl POST /predictions '{"input":{"s":"test"}}' 30 1s
stdout '"output":"hello test"'

-- predict.py --
class Predictor(BasePredictor):
    def setup(self):
        self.process = subprocess.Popen(["./background.sh"])
    
    def predict(self, s: str) -> str:
        return "hello " + s

Test Conditions

Use conditions to control when tests run based on environment:

[short] - Skip slow tests in short mode:

[short] skip 'requires GPU or long build time'

cog build -t $TEST_IMAGE
# ... rest of test

Run with go test -short to skip these tests.

[linux] / [!linux] - Platform-specific tests:

[!linux] skip 'requires Linux'

# Linux-specific test
cog build -t $TEST_IMAGE

[amd64] / [!amd64] - Architecture-specific tests:

[!amd64] skip 'requires amd64 architecture'

# amd64-specific test
cog build -t $TEST_IMAGE

[linux_amd64] - Combined platform and architecture:

[!linux_amd64] skip 'requires Linux on amd64'

# Test that requires both Linux and amd64
cog build -t $TEST_IMAGE

Combining conditions:

Conditions can be negated with !. Examples:

  • [short] - True when go test -short is used (skip this test in short mode)
  • [!short] - True when NOT running with -short flag (only run this in full test mode)
  • [!linux] - True when NOT on Linux
  • [linux_amd64] - True when on Linux AND amd64

See existing tests in integration-tests/tests/, especially setup_subprocess_*.txtar, for more examples.

Running the docs server

To run the docs website server locally:

mise run docs:serve

Publishing a release

Releases are managed by GitHub Actions workflows. See .github/workflows/README.md for full details.

All packages use lockstep versioning from crates/Cargo.toml. There are three release types:

Type Example tag Branch rule PyPI/crates.io?
Stable v0.17.0 Must be on main Yes
Pre-release v0.17.0-alpha3 Must be on main Yes
Dev v0.17.0-dev1 Any branch No

Stable / Pre-release

# 1. Update crates/Cargo.toml version (e.g. "0.17.0" or "0.17.0-alpha3")
# 2. Merge to main
# 3. Tag and push
git tag v0.17.0
git push origin v0.17.0
# 4. Wait for release-build.yaml to create a draft release
# 5. Review the draft in GitHub UI, then click "Publish release"
#    This triggers release-publish.yaml -> PyPI + crates.io

Dev release

# From any branch:
# 1. Update crates/Cargo.toml version (e.g. "0.17.0-dev1")
# 2. Commit and push
# 3. Tag and push
git tag v0.17.0-dev1
git push origin v0.17.0-dev1
# 4. Done. Artifacts are built and published as a GH pre-release.
#    No PyPI/crates.io. No manual approval.

Troubleshooting

cog command not found

The compiled cog binary will be installed in $GOPATH/bin/cog, e.g. ~/go/bin/cog. Make sure that Golang's bin directory is present on your system PATH by adding it to your shell config (.bashrc, .zshrc, etc):

export PATH=~/go/bin:$PATH

Still having trouble? Please open an issue on GitHub.