Skip to content

test: add comprehensive tests for gemini-cli skills setup#401

Merged
IsmailMehdi merged 7 commits into
GoogleCloudPlatform:mainfrom
minzznguyen:unit_test_gemini_cli
May 26, 2026
Merged

test: add comprehensive tests for gemini-cli skills setup#401
IsmailMehdi merged 7 commits into
GoogleCloudPlatform:mainfrom
minzznguyen:unit_test_gemini_cli

Conversation

@minzznguyen
Copy link
Copy Markdown
Contributor

Add a new integration test to verify that skill contents are preserved when copied to the fake home sandbox.

TAG=agy
CONV=a151643e-4527-4bbe-9908-a1f4f33be7e3

Add a new integration test to verify that skill contents are preserved
when copied to the fake home sandbox.

TAG=agy
CONV=a151643e-4527-4bbe-9908-a1f4f33be7e3
@minzznguyen minzznguyen requested a review from IsmailMehdi as a code owner May 21, 2026 22:07
Copy link
Copy Markdown
Collaborator

@IsmailMehdi IsmailMehdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice instinct adding test coverage for _setup_skills — there's not much for the generator boundary today, and the integration test in particular (test_skill_content_preserved) is a good idea since tmp_path lets you exercise real file ops safely. A few things should be fixed before this lands.

Blocking

  1. Missing sys.path setup — the file likely doesn't import from the repo root. Every other test in evalbench/test/ (e.g. trajectory_matcher_test.py:7, tool_naming_test.py:9) prepends the parent dir to sys.path so from generators.models... resolves:

    sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

    Without this, pytest evalbench/test/gemini_cli_test.py from the repo root will fail with ModuleNotFoundError: No module named 'generators'. Please add it, or run the test yourself from the repo root (uv run pytest evalbench/test/gemini_cli_test.py) to confirm it imports.

  2. print() statements throughout. There are ~40 prints used as narration. pytest captures stdout by default, so they only appear on failure or with -s, and they make the test body 3× longer than it needs to be. Behavior-level asserts already say what's being checked. Please strip them — for the rare case where a future debugger needs them back, pytest -v plus a good assertion message does the same job.

  3. try/except AssertionError: raise e around mock assertions. pytest's introspection already produces a clear diff for assert_called_once_with. Wrapping in try/except adds noise and raise e (vs bare raise) truncates the traceback. Replace with the bare assertion call.

Medium

  1. Log-message coupling is brittle. mock_logging.info.assert_any_call("Syncing skill: my-single-skill") will break the next time someone tweaks that log line. The mock_copytree.assert_called_once_with(...) assertion above it is the behavioral check that matters — drop the log assertion.

  2. Hardcoded /home/jamesamn in test_setup_single_skill_string. Looks like a leftover from local iteration. Use a generic value like /fake/real_home (or str(tmp_path)) — accidentally fine, but a reviewer notices.

Minor

  1. Inconsistent env-mocking style. Test 1 uses patch.dict(os.environ, {"HOME": ...}), test 2 uses monkeypatch.setenv. Pick one — monkeypatch is the pytest-native idiom and is already in test 2.

  2. test_skill_content_preserved is really a setup-integration test. It does exercise content preservation, but it relies on _setup_npm_auth and _setup not blowing up either — worth a docstring noting the broader scope, or split out a narrower variant that calls _setup_skills directly with a pre-built generator if you want a true unit test of the copy step.

  3. Worth covering at least one of the dict skill_config branches (link/install/enable/disable/uninstall) — they're the lion's share of _setup_skills and currently untested. subprocess.run is already mocked, so adding one is cheap.

The integration approach in test 2 is good — once you trim the prints and add the sys.path fix, that's a solid sandbox for further skill-setup tests.


Cleaned-up version

Here's the file with blocking + medium fixes applied — drop in if it's easier than editing piecemeal:

import os
import sys

import pytest
from unittest.mock import patch, MagicMock

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

from generators.models.gemini_cli import GeminiCliGenerator


@patch('generators.models.gemini_cli.logging')
@patch('generators.models.gemini_cli.subprocess.run')
@patch('generators.models.gemini_cli.os.path.exists')
@patch('generators.models.gemini_cli.shutil.copytree')
@patch('generators.models.gemini_cli.shutil.rmtree')
def test_setup_single_skill_string(
    mock_rmtree,
    mock_copytree,
    mock_exists,
    mock_run,
    mock_logging,
):
    """A string entry in setup.skills triggers a copytree from real to fake home."""
    real_home = "/fake/real_home"
    real_skill_path = os.path.join(real_home, ".gemini", "skills", "my-single-skill")

    mock_exists.side_effect = lambda path: path == real_skill_path
    mock_run.return_value = MagicMock(returncode=0, stdout="fake-token")

    config = {"setup": {"skills": ["my-single-skill"]}}

    with patch('generators.models.gemini_cli.os.makedirs'), \
         patch('generators.models.gemini_cli.open', create=True), \
         patch.dict(os.environ, {"HOME": real_home}):
        GeminiCliGenerator(config)

    expected_fake_home = os.path.abspath(os.path.join(".venv", "fake_home"))
    expected_fake_skill_path = os.path.join(
        expected_fake_home, ".gemini", "skills", "my-single-skill"
    )
    mock_copytree.assert_called_once_with(real_skill_path, expected_fake_skill_path)


@patch('generators.models.gemini_cli.subprocess.run')
def test_skill_content_preserved(mock_run, tmp_path, monkeypatch):
    """End-to-end: a skill file's contents survive copy into the fake-home sandbox.

    Exercises the full GeminiCliGenerator.__init__ -> _setup -> _setup_skills
    path with real filesystem operations confined to tmp_path. Only subprocess
    (gcloud auth) is mocked.
    """
    real_home = tmp_path / "real_home"
    real_home.mkdir()

    skill_name = "my-content-skill"
    real_skill_dir = real_home / ".gemini" / "skills" / skill_name
    real_skill_dir.mkdir(parents=True)

    skill_file = real_skill_dir / "secret.txt"
    expected_content = "hello world"
    skill_file.write_text(expected_content)

    mock_run.return_value = MagicMock(returncode=0, stdout="fake-token")

    monkeypatch.chdir(tmp_path)
    monkeypatch.setenv("HOME", str(real_home))

    GeminiCliGenerator({"setup": {"skills": [skill_name]}})

    expected_fake_skill_file = (
        tmp_path / ".venv" / "fake_home" / ".gemini" / "skills" / skill_name / "secret.txt"
    )
    assert expected_fake_skill_file.exists(), (
        f"Copied skill file not found at {expected_fake_skill_file}"
    )
    assert expected_fake_skill_file.read_text() == expected_content

(Also swapped the "password is xyz" sample content for a benign "hello world" so the file doesn't read like a credential.)

Comment thread evalbench/test/gemini_cli_test.py
Comment thread evalbench/test/gemini_cli_test.py Outdated
Comment thread evalbench/test/gemini_cli_test.py
@IsmailMehdi
Copy link
Copy Markdown
Collaborator

Also take a look at the style issues.

@minzznguyen minzznguyen changed the title test: add gemini cli skill content preservation test [WIP] test: add gemini cli skill content preservation test May 21, 2026
@minzznguyen minzznguyen force-pushed the unit_test_gemini_cli branch from d264465 to bcc4886 Compare May 26, 2026 02:34
TAG=agy
CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
@minzznguyen minzznguyen force-pushed the unit_test_gemini_cli branch from bcc4886 to 35d1d73 Compare May 26, 2026 02:36
TAG=agy
CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
@minzznguyen minzznguyen changed the title [WIP] test: add gemini cli skill content preservation test [WIP] test: add comprehensive tests for gemini-cli skills setup May 26, 2026
@minzznguyen minzznguyen changed the title [WIP] test: add comprehensive tests for gemini-cli skills setup test: add comprehensive tests for gemini-cli skills setup May 26, 2026
TAG=agy
CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
@IsmailMehdi IsmailMehdi self-requested a review May 26, 2026 18:07
IsmailMehdi
IsmailMehdi previously approved these changes May 26, 2026
@IsmailMehdi
Copy link
Copy Markdown
Collaborator

/gcbrun

@IsmailMehdi IsmailMehdi merged commit b76e928 into GoogleCloudPlatform:main May 26, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants