
Add healthcheck on serve (fixes #2041) #2475

Merged
mikebonnet merged 2 commits into containers:main from rhatdan:2041
Apr 3, 2026

Conversation

Member

@rhatdan rhatdan commented Feb 27, 2026

When running ramalama serve with container and detach (-d), wait for the server to become healthy before returning, using the existing wait_for_healthy() and is_healthy() logic.

Made-with: Cursor

Fixes: #2041

Summary by Sourcery

Bug Fixes:

  • Ensure ramalama serve with --container and --detach waits for the server to be healthy before exiting, avoiding races with dependent operations.
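The wait described above can be sketched as a simple polling loop. This is a minimal, hypothetical stand-in for ramalama's existing wait_for_healthy()/is_healthy() helpers, not their actual signatures:

```python
import time

def is_healthy(check):
    # `check` stands in for whatever probe the transport uses
    # (an HTTP GET against a health endpoint, a container
    # healthcheck status, etc.); any exception counts as unhealthy.
    try:
        return bool(check())
    except Exception:
        return False

def wait_for_healthy(check, timeout=30.0, interval=0.5, sleep=time.sleep):
    # Poll `check` until it succeeds or `timeout` seconds elapse.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_healthy(check):
            return True
        sleep(interval)
    raise TimeoutError("server did not become healthy in time")
```

The `sleep` parameter is injected only so the loop is easy to test; the real helpers block on wall-clock time.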

Contributor

sourcery-ai Bot commented Feb 27, 2026

Reviewer's Guide

Adds a post-exec healthcheck to serve when running in containerized, detached mode so the CLI waits for the server to become healthy before returning.

File-Level Changes

Change: Run a healthcheck after starting the server in containerized detached mode.
Details:
  • After executing the server command, check CLI args for container, detach, and not dryrun
  • Invoke the existing wait_for_healthy(args) to block until the server reports healthy in this mode
  • Leave behavior unchanged for non-container, non-detached, or dry-run executions
  • Ensure cleanup and error handling still flow through the existing exception block
File: ramalama/transports/base.py

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the ramalama serve command by integrating a health check for containerized and detached server processes. Previously, the command would return immediately after starting the container, potentially before the server was fully operational. The new logic ensures that the command waits for the server to report as healthy, improving reliability and preventing issues where subsequent operations might interact with an unready server.

Highlights

  • Health Check for Detached Containers: Implemented a health check mechanism for the ramalama serve command when executed with containerization and detachment (-d), ensuring the server is healthy before the command returns.


Changelog
  • ramalama/transports/base.py
    • Added a health check call for detached containerized servers.
Activity
  • No specific activity (comments, reviews) has been recorded for this pull request yet.

Contributor

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've left some high-level feedback:

  • The repeated getattr(args, ...) checks inside the if condition make the logic a bit hard to scan; consider pulling these into clearly named local booleans (e.g., is_container, is_detached, is_dry_run) before the if for readability.
  • If wait_for_healthy(args) can block for a noticeable time, you may want to surface some minimal user feedback or logging around this wait so that ramalama serve -d doesn't appear to hang silently when starting the container.
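One way to act on both suggestions (named booleans plus user feedback around the wait) can be sketched as follows; the helper name, logger name, and return value are illustrative, not taken from the PR:

```python
import logging

logger = logging.getLogger("ramalama.serve")

def wait_if_detached_container(transport, args):
    # Named booleans instead of repeated getattr() calls in the `if`,
    # as the review suggests.
    is_container = getattr(args, "container", False)
    is_detached = getattr(args, "detach", False)
    is_dry_run = getattr(args, "dryrun", False)
    if is_container and is_detached and not is_dry_run:
        # Surface feedback so `ramalama serve -d` does not appear
        # to hang silently while the container starts up.
        logger.info("waiting for server to become healthy...")
        transport.wait_for_healthy(args)
        logger.info("server is healthy")
        return True
    return False
```

Under this shape the guard reads at a glance and the potentially long blocking call is bracketed by log lines.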

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request aims to add a health check when running ramalama serve in detached container mode. The intention is correct, but the current implementation has a critical flaw that makes the new health check code unreachable. I've provided a detailed comment and a code suggestion to fix this underlying issue and correctly implement the desired behavior.

Comment thread ramalama/transports/base.py Outdated
Comment on lines 792 to 802
        try:
            self.execute_command(cmd, args)
            if (
                getattr(args, "container", False)
                and getattr(args, "detach", False)
                and not getattr(args, "dryrun", False)
            ):
                self.wait_for_healthy(args)
        except Exception as e:
            self._cleanup_server_process(args.server_process)
            raise e
Contributor


Severity: critical

The current implementation has a fundamental issue: the new health check logic is unreachable. The call to self.execute_command(cmd, args) on line 793 leads to sys.exit(), terminating the program before the health check can run for detached containers.

To fix this and correctly implement the health check, the logic for detached container runs needs to be handled differently to avoid exiting. I suggest refactoring this try...except block to handle detached container runs separately, using run_cmd which doesn't exit the process.

        try:
            # For detached container runs, we need to avoid `exec_cmd` which calls `sys.exit`.
            if (
                getattr(args, "container", False)
                and getattr(args, "detach", False)
            ):
                # This logic is similar to `exec_model_in_container` but uses `run_cmd`.
                if len(cmd) > 0 and isinstance(cmd[0], ContainerEntryPoint):
                    cmd = cmd[1:]

                self.setup_container(args)
                self.setup_mounts(args)
                self.engine.add([args.image] + cmd)

                if not getattr(args, "dryrun", False):
                    run_cmd(self.engine.exec_args, stdout2null=args.noout)
                    self.wait_for_healthy(args)
                else:
                    self.engine.dryrun()
            else:
                self.execute_command(cmd, args)
        except Exception as e:
            self._cleanup_server_process(args.server_process)
            raise e

Collaborator

@mikebonnet mikebonnet Mar 31, 2026


This review looks correct to me. execute_command() calls exec_model_in_container(), which calls engine.exec(), which calls exec_cmd(), which calls sys.exit() unconditionally. For this to work, the call to engine.exec() would likely need to be replaced with engine.run() or engine.run_process().
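The control-flow problem described in this thread can be reproduced with a toy example. Here exec_cmd(), serve(), and wait_for_healthy() are stand-ins for the real helpers, reduced to the one behavior that matters: the unconditional sys.exit() at the bottom of the call chain makes everything after it unreachable.

```python
import sys

healthcheck_ran = False

def wait_for_healthy():
    # Record that the healthcheck actually executed.
    global healthcheck_ran
    healthcheck_ran = True

def exec_cmd(cmd):
    # Stand-in for the real helper: it hands control to the server
    # process and then exits the Python process unconditionally.
    sys.exit(0)

def serve(cmd):
    exec_cmd(cmd)
    wait_for_healthy()  # never reached: exec_cmd() raised SystemExit
```

Calling serve() raises SystemExit before wait_for_healthy() runs, which is why the review proposes routing detached container runs through a non-exiting run helper instead.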

When running ramalama serve with container and detach (-d), wait for the
server to become healthy before returning, using the existing
wait_for_healthy() and is_healthy() logic. The healthcheck is run from
the inference plugin after execute_command() starts the container.

Made-with: Cursor
Signed-off-by: Daniel J Walsh <[email protected]>
@rhatdan
Member Author

rhatdan commented Mar 30, 2026

@olliewalsh @mikebonnet @engelmi PTAL, this should be ready to go in.

Collaborator

@mikebonnet mikebonnet left a comment


LGTM!

@mikebonnet mikebonnet temporarily deployed to macos-installer April 3, 2026 17:42 — with GitHub Actions Inactive
@mikebonnet mikebonnet enabled auto-merge April 3, 2026 17:42
@mikebonnet
Collaborator

@sourcery-ai dismiss

@mikebonnet mikebonnet disabled auto-merge April 3, 2026 20:21
@mikebonnet mikebonnet merged commit effe375 into containers:main Apr 3, 2026
44 of 50 checks passed


Development

Successfully merging this pull request may close these issues.

Add healthcheck on serve

2 participants