Skip to content

feat: shp buildrun gather subcommand to extract the details of the failed BuildRun#380

Open
kaizakin wants to merge 3 commits intoshipwright-io:mainfrom
kaizakin:feat/shp-gather
Open

feat: shp buildrun gather subcommand to extract the details of the failed BuildRun#380
kaizakin wants to merge 3 commits intoshipwright-io:mainfrom
kaizakin:feat/shp-gather

Conversation

@kaizakin
Copy link
Contributor

@kaizakin kaizakin commented Mar 21, 2026

Changes

  • added dynamic client support in params to fetch taskrun objects
  • added buildrun gather command that collects buildrun taskrun pod Objects and individual pod logs
  • --output flag determines the output location, default value is . currect directory.
  • --archive flag determines the output to be compressed as .tar.gz

Fixes #250


Since the issue's main concern was about TaskRun, gather now only collects diagnostics for taskruns
for PipelineRun it collects only buildrun object and stores it to buildrun.yaml
and returns this error message

BuildRun %q uses executor kind %q, which gather does not support yet", c.name, executorKind
  • Should i add support for PipelineRuns as well?
  • and tests for this feature?

@IrvingMg @hasanawad94 @SaschaSchwarze0


Type of PR

/kind feature

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Kind label has been set
  • Release notes block has been filled in, or marked NONE
2026-03-22.20-23-03.mp4

Release Notes

Add "shp buildrun gather" subcommand

Signed-off-by: karthik balasubramanian <karthikbalasubramanian08@gmail.com>
@pull-request-size pull-request-size bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 21, 2026
@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note labels Mar 21, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 21, 2026

@kaizakin: The label(s) kind/your-label-here cannot be applied, because the repository doesn't have them.

Details

In response to this:

Changes

Fixes #250

Related Issue

Fixes #

Type of PR

/kind your-label-here

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Kind label has been set
  • Release notes block has been filled in, or marked NONE

See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.

Release Notes

your release note here

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 21, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign saschaschwarze0 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 22, 2026
@kaizakin kaizakin changed the title Feat: add shp buildrun gather subcommand feat: shp buildrun gather subcommand to extract the details of the failed BuildRun Mar 22, 2026
Signed-off-by: karthik balasubramanian <karthikbalasubramanian08@gmail.com>
Signed-off-by: karthik balasubramanian <karthikbalasubramanian08@gmail.com>
@kaizakin kaizakin marked this pull request as ready for review March 22, 2026 16:03
Copilot AI review requested due to automatic review settings March 22, 2026 16:03
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 22, 2026
@openshift-ci openshift-ci bot requested review from HeavyWombat and qu1queee March 22, 2026 16:04
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new shp buildrun gather subcommand to collect BuildRun diagnostics (BuildRun/TaskRun/Pod objects + pod logs) into a directory or .tar.gz archive, to help debug failed BuildRuns (Fixes #250).

Changes:

  • Add a dynamic.Interface client to CLI params for fetching Tekton objects.
  • Introduce shp buildrun gather subcommand with --output and --archive flags to export YAMLs and logs.
  • Add generated CLI docs for the new subcommand and link it from shp buildrun docs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
pkg/shp/params/params.go Adds dynamic client support for retrieving Tekton resources (TaskRuns).
pkg/shp/cmd/buildrun/gather.go Implements the new gather subcommand, YAML/log collection, and optional tar.gz archiving.
pkg/shp/cmd/buildrun/buildrun.go Registers gather under the buildrun command group.
docs/shp_buildrun_gather.md Adds generated documentation for shp buildrun gather.
docs/shp_buildrun.md Links the new gather subcommand in the BuildRun command index.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +8 to +17
Gather collects the BuildRun object, the TaskRun created for it, the Pod created
for that TaskRun, and all the container logs into a single directory.

By default the command writes:

buildrun.yaml
taskrun.yaml
pod.yaml
logs/*.log

Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated CLI docs currently describe gather as always collecting TaskRun/Pod/logs, but the implementation errors out for PipelineRun executors (and may not produce all files). Please update the docs to reflect the current limitation (TaskRun-only) and what gets written in the unsupported cases.

Suggested change
Gather collects the BuildRun object, the TaskRun created for it, the Pod created
for that TaskRun, and all the container logs into a single directory.
By default the command writes:
buildrun.yaml
taskrun.yaml
pod.yaml
logs/*.log
Gather collects the BuildRun object and, for BuildRuns that use the TaskRun
executor, the TaskRun created for it, the Pod created for that TaskRun, and all
the container logs into a single directory.
BuildRuns that use other executors (for example, PipelineRun) are not fully
supported. For those BuildRuns, the command fails after writing only the
BuildRun object and does not reliably collect TaskRun/Pod resources or logs.
By default, for supported (TaskRun-based) BuildRuns the command writes:
buildrun.yaml
taskrun.yaml
pod.yaml
logs/*.log
For unsupported executors, only buildrun.yaml is guaranteed to be written before
the command exits with an error.

Copilot uses AI. Check for mistakes.
Comment on lines 45 to 49
type Params struct {
clientset kubernetes.Interface // kubernetes api-client, global instance
buildClientset buildclientset.Interface // shipwright api-client, global instance
dynamicClient dynamic.Interface
pw *reactor.PodWatcher // pod-watcher global instance
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Params now contains a dynamicClient, but unlike the kube and Shipwright clients there’s no Options helper (or NewParamsForTest wiring) to inject a dynamic client. This makes the new gather functionality hard to unit test without a real kubeconfig; consider adding a WithDynamicClient(dynamic.Interface) option (and/or extending WithClientset/NewParamsForTest) so tests can pass dynamicfake.NewSimpleDynamicClient.

Copilot uses AI. Check for mistakes.
Comment on lines +42 to +54
const gatherLongDesc = `
Gather collects the BuildRun object, the TaskRun created for it, the Pod created
for that TaskRun, and all the container logs into a single directory.

By default the command writes:

buildrun.yaml
taskrun.yaml
pod.yaml
logs/*.log

Use --archive to package the gathered files as a .tar.gz archive.
`
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command help/long description doesn’t mention that only TaskRun executors are supported right now (and PipelineRun returns an error). This is user-facing behavior; please document the limitation in the long description so users aren’t surprised.

Copilot uses AI. Check for mistakes.
Comment on lines +98 to +115
func (c *GatherCommand) Run(p *params.Params, ioStreams *genericclioptions.IOStreams) error {
ctx := c.cmd.Context()
namespace := p.Namespace()

shpClient, err := p.ShipwrightClientSet()
if err != nil {
return err
}

kubeClient, err := p.ClientSet()
if err != nil {
return err
}

buildRun, err := shpClient.ShipwrightV1beta1().BuildRuns(namespace).Get(ctx, c.name, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("failed to get BuildRun %q: %w", c.name, err)
}
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New gather functionality isn’t covered by unit tests. Other BuildRun subcommands in this package (e.g. cancel, logs) have tests, so it would be good to add tests for at least: writing buildrun/taskrun/pod YAML, handling missing TaskRun/Pod, and --archive path behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +89 to +93
return fmt.Errorf("Buildrun name is required")
}
if c.outputDir == "" {
return fmt.Errorf("output directory cannot be empty")
}
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation error strings should follow Go conventions (start with lowercase, consistent resource casing). For example, "Buildrun name is required" should be "buildrun name is required" (or "BuildRun" consistently if treating it as a proper noun). Also the indentation in this block looks off and should be gofmt’d.

Suggested change
return fmt.Errorf("Buildrun name is required")
}
if c.outputDir == "" {
return fmt.Errorf("output directory cannot be empty")
}
return fmt.Errorf("buildrun name is required")
}
if c.outputDir == "" {
return fmt.Errorf("output directory cannot be empty")
}

Copilot uses AI. Check for mistakes.
}

logPath := filepath.Join(logsDir, fmt.Sprintf("%s.log", container.Name))
if err := os.WriteFile(logPath, []byte(logText), 0o644); err != nil {
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log files are written with mode 0644, but pod logs often contain sensitive information (tokens, env vars, etc.). Since the YAML outputs are written 0600, consider using similarly restrictive permissions for log files (e.g., 0600) to avoid leaking data to other local users.

Suggested change
if err := os.WriteFile(logPath, []byte(logText), 0o644); err != nil {
if err := os.WriteFile(logPath, []byte(logText), 0o600); err != nil {

Copilot uses AI. Check for mistakes.
Comment on lines +324 to +327
defer sourceFile.Close()

_, err = io.Copy(tarWriter, sourceFile)
return err
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createTargz defers sourceFile.Close() inside the filepath.Walk callback. Because the callback runs many times, those defers won’t execute until the entire walk completes, which can leak file descriptors and fail on larger outputs. Close the file explicitly at the end of each iteration (or wrap the per-file logic in an inner function).

Suggested change
defer sourceFile.Close()
_, err = io.Copy(tarWriter, sourceFile)
return err
_, err = io.Copy(tarWriter, sourceFile)
if err != nil {
_ = sourceFile.Close()
return err
}
if err := sourceFile.Close(); err != nil {
return err
}
return nil

Copilot uses AI. Check for mistakes.
Copy link
Member

@IrvingMg IrvingMg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @kaizakin!

I have a few suggestions for the first pass:

  1. Please address Copilot’s review comments.
  2. Fix the linter issues reported by CI.
  3. Feel free to squash the commits as well.

Regarding your questions:

  1. IIUC, supporting PipelineRun would add more complexity. So, in my opinion, it would be better to keep this PR focused on BuildRun, but I’d like to hear others’ opinions.
  2. It would be nice to have unit tests, at least for basic scenarios.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[FEATURE] Debug utility command: shp buildrun gather

4 participants