feat: shp buildrun gather subcommand to extract the details of the failed BuildRun#380
feat: shp buildrun gather subcommand to extract the details of the failed BuildRun#380kaizakin wants to merge 3 commits intoshipwright-io:mainfrom
Conversation
Signed-off-by: karthik balasubramanian <karthikbalasubramanian08@gmail.com>
|
@kaizakin: The label(s) DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Signed-off-by: karthik balasubramanian <karthikbalasubramanian08@gmail.com>
c761cbd to
8307693
Compare
There was a problem hiding this comment.
Pull request overview
Adds a new shp buildrun gather subcommand to collect BuildRun diagnostics (BuildRun/TaskRun/Pod objects + pod logs) into a directory or .tar.gz archive, to help debug failed BuildRuns (Fixes #250).
Changes:
- Add a
dynamic.Interfaceclient to CLI params for fetching Tekton objects. - Introduce
shp buildrun gathersubcommand with--outputand--archiveflags to export YAMLs and logs. - Add generated CLI docs for the new subcommand and link it from
shp buildrundocs.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/shp/params/params.go | Adds dynamic client support for retrieving Tekton resources (TaskRuns). |
| pkg/shp/cmd/buildrun/gather.go | Implements the new gather subcommand, YAML/log collection, and optional tar.gz archiving. |
| pkg/shp/cmd/buildrun/buildrun.go | Registers gather under the buildrun command group. |
| docs/shp_buildrun_gather.md | Adds generated documentation for shp buildrun gather. |
| docs/shp_buildrun.md | Links the new gather subcommand in the BuildRun command index. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| Gather collects the BuildRun object, the TaskRun created for it, the Pod created | ||
| for that TaskRun, and all the container logs into a single directory. | ||
|
|
||
| By default the command writes: | ||
|
|
||
| buildrun.yaml | ||
| taskrun.yaml | ||
| pod.yaml | ||
| logs/*.log | ||
|
|
There was a problem hiding this comment.
The generated CLI docs currently describe gather as always collecting TaskRun/Pod/logs, but the implementation errors out for PipelineRun executors (and may not produce all files). Please update the docs to reflect the current limitation (TaskRun-only) and what gets written in the unsupported cases.
| Gather collects the BuildRun object, the TaskRun created for it, the Pod created | |
| for that TaskRun, and all the container logs into a single directory. | |
| By default the command writes: | |
| buildrun.yaml | |
| taskrun.yaml | |
| pod.yaml | |
| logs/*.log | |
| Gather collects the BuildRun object and, for BuildRuns that use the TaskRun | |
| executor, the TaskRun created for it, the Pod created for that TaskRun, and all | |
| the container logs into a single directory. | |
| BuildRuns that use other executors (for example, PipelineRun) are not fully | |
| supported. For those BuildRuns, the command fails after writing only the | |
| BuildRun object and does not reliably collect TaskRun/Pod resources or logs. | |
| By default, for supported (TaskRun-based) BuildRuns the command writes: | |
| buildrun.yaml | |
| taskrun.yaml | |
| pod.yaml | |
| logs/*.log | |
| For unsupported executors, only buildrun.yaml is guaranteed to be written before | |
| the command exits with an error. |
| type Params struct { | ||
| clientset kubernetes.Interface // kubernetes api-client, global instance | ||
| buildClientset buildclientset.Interface // shipwright api-client, global instance | ||
| dynamicClient dynamic.Interface | ||
| pw *reactor.PodWatcher // pod-watcher global instance |
There was a problem hiding this comment.
Params now contains a dynamicClient, but unlike the kube and Shipwright clients there’s no Options helper (or NewParamsForTest wiring) to inject a dynamic client. This makes the new gather functionality hard to unit test without a real kubeconfig; consider adding a WithDynamicClient(dynamic.Interface) option (and/or extending WithClientset/NewParamsForTest) so tests can pass dynamicfake.NewSimpleDynamicClient.
| const gatherLongDesc = ` | ||
| Gather collects the BuildRun object, the TaskRun created for it, the Pod created | ||
| for that TaskRun, and all the container logs into a single directory. | ||
|
|
||
| By default the command writes: | ||
|
|
||
| buildrun.yaml | ||
| taskrun.yaml | ||
| pod.yaml | ||
| logs/*.log | ||
|
|
||
| Use --archive to package the gathered files as a .tar.gz archive. | ||
| ` |
There was a problem hiding this comment.
The command help/long description doesn’t mention that only TaskRun executors are supported right now (and PipelineRun returns an error). This is user-facing behavior; please document the limitation in the long description so users aren’t surprised.
| func (c *GatherCommand) Run(p *params.Params, ioStreams *genericclioptions.IOStreams) error { | ||
| ctx := c.cmd.Context() | ||
| namespace := p.Namespace() | ||
|
|
||
| shpClient, err := p.ShipwrightClientSet() | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| kubeClient, err := p.ClientSet() | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| buildRun, err := shpClient.ShipwrightV1beta1().BuildRuns(namespace).Get(ctx, c.name, metav1.GetOptions{}) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to get BuildRun %q: %w", c.name, err) | ||
| } |
There was a problem hiding this comment.
New gather functionality isn’t covered by unit tests. Other BuildRun subcommands in this package (e.g. cancel, logs) have tests, so it would be good to add tests for at least: writing buildrun/taskrun/pod YAML, handling missing TaskRun/Pod, and --archive path behavior.
| return fmt.Errorf("Buildrun name is required") | ||
| } | ||
| if c.outputDir == "" { | ||
| return fmt.Errorf("output directory cannot be empty") | ||
| } |
There was a problem hiding this comment.
The validation error strings should follow Go conventions (start with lowercase, consistent resource casing). For example, "Buildrun name is required" should be "buildrun name is required" (or "BuildRun" consistently if treating it as a proper noun). Also the indentation in this block looks off and should be gofmt’d.
| return fmt.Errorf("Buildrun name is required") | |
| } | |
| if c.outputDir == "" { | |
| return fmt.Errorf("output directory cannot be empty") | |
| } | |
| return fmt.Errorf("buildrun name is required") | |
| } | |
| if c.outputDir == "" { | |
| return fmt.Errorf("output directory cannot be empty") | |
| } |
| } | ||
|
|
||
| logPath := filepath.Join(logsDir, fmt.Sprintf("%s.log", container.Name)) | ||
| if err := os.WriteFile(logPath, []byte(logText), 0o644); err != nil { |
There was a problem hiding this comment.
The log files are written with mode 0644, but pod logs often contain sensitive information (tokens, env vars, etc.). Since the YAML outputs are written 0600, consider using similarly restrictive permissions for log files (e.g., 0600) to avoid leaking data to other local users.
| if err := os.WriteFile(logPath, []byte(logText), 0o644); err != nil { | |
| if err := os.WriteFile(logPath, []byte(logText), 0o600); err != nil { |
| defer sourceFile.Close() | ||
|
|
||
| _, err = io.Copy(tarWriter, sourceFile) | ||
| return err |
There was a problem hiding this comment.
createTargz defers sourceFile.Close() inside the filepath.Walk callback. Because the callback runs many times, those defers won’t execute until the entire walk completes, which can leak file descriptors and fail on larger outputs. Close the file explicitly at the end of each iteration (or wrap the per-file logic in an inner function).
| defer sourceFile.Close() | |
| _, err = io.Copy(tarWriter, sourceFile) | |
| return err | |
| _, err = io.Copy(tarWriter, sourceFile) | |
| if err != nil { | |
| _ = sourceFile.Close() | |
| return err | |
| } | |
| if err := sourceFile.Close(); err != nil { | |
| return err | |
| } | |
| return nil |
IrvingMg
left a comment
There was a problem hiding this comment.
Thanks for the PR @kaizakin!
I have a few suggestions for the first pass:
- Please address Copilot’s review comments.
- Fix the linter issues reported by CI.
- Feel free to squash the commits as well.
Regarding your questions:
- IIUC, supporting
PipelineRunwould add more complexity. So, in my opinion, it would be better to keep this PR focused onBuildRun, but I’d like to hear others’ opinions. - It would be nice to have unit tests, at least for basic scenarios.
Changes
buildruntaskrunpodObjects and individual pod logs--outputflag determines the output location, default value is.currect directory.--archiveflag determines the output to be compressed as.tar.gzFixes #250
Since the issue's main concern was about
TaskRun, gather now only collects diagnostics for taskrunsfor
PipelineRunit collects only buildrun object and stores it tobuildrun.yamland returns this error message
PipelineRunsas well?@IrvingMg @hasanawad94 @SaschaSchwarze0
Type of PR
/kind feature
Submitter Checklist
2026-03-22.20-23-03.mp4
Release Notes