style: make_units, launch_runs#29
Open
7PintsOfCherryGarcia wants to merge 6 commits into
Open
Conversation
…endencies add fallback defaults for missing out-path wildcards (project, library, flowcell_pos, flowcell, lane, date, workflow_ver, extra_file_md5, sample) warn when out-path fields are missing from parsed metadata and defaults are applied normalize date after metadata injection and replace invalid dates with today to keep formatting stable make workflow version detection resilient when GitPython is unavailable by falling back to git describe (and finally unknown) replace assert on missing --extra-file with argparse parser error for a clean, user-friendly failure message
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
Contributor
|
It might also be nice to support running |
Author
|
Good point. I agree that running Snakemake locally while submitting jobs to Slurm would be useful. I’d prefer to keep this PR focused, so I can follow up with a separate change for that if that sounds good |
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 1, 2026
| f"Jobs will be submitted to the {args.run} HPC, on account '{hpc_job_account}' and partition '{hpc_job_partition}'." | ||
| ) | ||
| extra_args += ( | ||
| f" --executor {args.run} --default-resources slurm_account={hpc_job_account} slurm_partition={hpc_job_partition}" |
Contributor
There was a problem hiding this comment.
This behavior is not kept.
fgvieira
reviewed
Jun 1, 2026
fgvieira
reviewed
Jun 2, 2026
| logging.info(f"Running jobs locally") | ||
| elif args.run == "slurm": | ||
| logging.info( | ||
| f"Jobs will be submitted to the {args.run} HPC, on account '{hpc_job_account}' and partition '{hpc_job_partition}'." |
Contributor
There was a problem hiding this comment.
hpc_job_account and hpc_job_partition are not defined.
fgvieira
reviewed
Jun 2, 2026
| out_path.mkdir(parents=True, exist_ok=args.force) | ||
| # Save units.tsv file | ||
| units.drop(["extra_file_md5"], axis=1).dropna(axis=1, how="all").to_csv( | ||
| units.dropna(axis=1, how="all").to_csv( |
Contributor
There was a problem hiding this comment.
Why include the config md5 in units.tsv?
fgvieira
reviewed
Jun 2, 2026
| logging.warning( | ||
| "Some date values could not be parsed. Using today's date for invalid rows." | ||
| ) | ||
| units.loc[units["date"].isna(), "date"] = out_path_defaults["date"] |
Contributor
There was a problem hiding this comment.
This feels a bit dangerous.
Suggested change
| units.loc[units["date"].isna(), "date"] = out_path_defaults["date"] |
fgvieira
reviewed
Jun 2, 2026
Comment on lines
+271
to
+277
| missing_out_path_fields = [ | ||
| key for key in out_path_wildcards if key not in units.columns.values | ||
| ] | ||
| if missing_out_path_fields: | ||
| logging.warning( | ||
| "Missing out-path fields in parsed metadata: %s. Applying defaults.", | ||
| ", ".join(sorted(missing_out_path_fields)), |
Contributor
There was a problem hiding this comment.
Won;t this clash with lines 263-268?
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull request makes several improvements and refactors to the
make_units.pyandlaunch_runs.pyscripts, focusing on workflow configuration flexibility, error handling, and robustness in metadata processing. The changes enhance user experience, make the scripts more adaptable to different environments, and improve maintainability.Major improvements and new features:
Workflow configuration and flexibility:
launch_runs.pyto add new arguments for workflow and environment selection (--pixi-path,--workflow-path,--scheduler,--partition,--account,--jobs). The script now builds commands more flexibly based on these options, supporting both local and Slurm-based execution and allowing users to override paths and scheduler settings. [1] [2] [3]Metadata and output handling:
make_units.py, added logic to inject default values for missing metadata fields required by output path formatting, ensuring that all required fields are present and logging warnings when defaults are used. [1] [2]workflow_ver(workflow version): now prefers GitPython but gracefully falls back to usinggit describevia subprocess if unavailable, making the script more robust and portable.Error handling and user feedback:
make_units.pyby usingparser.errorwith a clear message instead of an assertion.Other improvements:
--dryrunis set, making dry runs safer and clearer.These changes collectively make the scripts more user-friendly, configurable, and robust for diverse workflow execution environments.