added multiprocessing to Bayesian sampling #251

Open
rozyczko wants to merge 18 commits into develop from bayesian_mp

@rozyczko

Member

This pull request adds multiprocessing support for Bayesian DREAM sampling with the BUMPS minimizer, so the population can be evaluated in parallel during MCMC sampling. A new `n_workers` parameter on the sampling API sets the number of worker processes. Serialization logic allows fit problems to be sent to worker processes and handles objects that are not trivially picklable. The PR also adds tests and a benchmarking script to demonstrate and validate the new multiprocessing functionality.

Multiprocessing support for BUMPS DREAM sampling

  • Added a new `n_workers` parameter to the `sample` method in both `minimizer_bumps.py` and `multi_fitter.py`, allowing users to specify the number of worker processes for parallel population evaluation. If `n_workers` is greater than 1, a custom multiprocessing pool mapper is used; otherwise, the default sequential evaluation is retained.

  • Implemented the `BumpsPoolMapper` class and associated serialization/deserialization helpers in `minimizer_bumps.py` to transfer fit problems and closures to worker processes, including special handling for weak references and `scipp.Variable` objects. Added a dependency on `cloudpickle` for advanced serialization.
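
The dispatch between sequential and pooled evaluation described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the PR's code: `evaluate_population` and `toy_nllf` are hypothetical names, and the sketch relies on a module-level function being importable by the workers, whereas the PR serializes the fit problem with `cloudpickle` inside `BumpsPoolMapper`.

```python
import multiprocessing as mp


def toy_nllf(point):
    """Hypothetical stand-in for one negative log-likelihood evaluation."""
    return sum(x * x for x in point)


def evaluate_population(func, population, n_workers=1):
    """Evaluate func on every population member, optionally in parallel.

    n_workers == 1 keeps the plain sequential path; n_workers > 1 fans the
    work out to a process pool. func must be importable by the workers.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    if n_workers == 1:
        # Sequential path: no processes are started.
        return [func(p) for p in population]
    with mp.Pool(processes=n_workers) as pool:
        # Pool.map preserves input order, so results line up with population.
        return pool.map(func, population)
```

When trying the parallel path, the call (for example `evaluate_population(toy_nllf, population, n_workers=2)`) should sit under an `if __name__ == "__main__":` guard so that start methods which re-import the main module do not re-run it in each worker.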

Error handling and API improvements

  • Improved error messages and input validation for multiprocessing, including checks that `n_workers` is at least 1 and that the fit problem is serializable. The documentation for the `sample` methods was updated to describe the new parameter and possible exceptions.

  • Added a benchmarking script `sampling_mpi.py` to the `tools/benchmarks` directory, allowing users to measure wall-clock speedup from multiprocessing in a realistic sampling scenario.
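
The validation checks described above might look roughly like the sketch below. It is an assumption-laden illustration: `validate_sampling_inputs` is a hypothetical helper, the exception types are guesses rather than the PR's actual choices, and the standard-library `pickle` stands in for `cloudpickle`, which accepts more objects (lambdas and closures, for instance).

```python
import pickle


def validate_sampling_inputs(problem, n_workers):
    """Reject bad worker counts and problems that cannot be serialized.

    Hypothetical helper: the PR's real checks and exception types may differ.
    """
    if isinstance(n_workers, bool) or not isinstance(n_workers, int):
        raise TypeError("n_workers must be an integer")
    if n_workers < 1:
        raise ValueError(f"n_workers must be at least 1, got {n_workers}")
    if n_workers > 1:
        # Only the parallel path needs to ship the problem to other processes.
        try:
            pickle.dumps(problem)
        except Exception as exc:
            raise ValueError(
                "fit problem cannot be serialized for worker processes"
            ) from exc
    return n_workers
```

Failing early in the parent process gives the user a single clear error instead of a traceback raised from inside a worker.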

@rozyczko rozyczko added the [scope] enhancement, [area] fitting, and [priority] high labels May 21, 2026
@codecov

codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 59.73154% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.98%. Comparing base (aadbd48) to head (b4ced13).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../easyscience/fitting/minimizers/minimizer_bumps.py | 59.73% | 60 Missing ⚠️ |
Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #251      +/-   ##
===========================================
- Coverage    82.65%   81.98%   -0.68%     
===========================================
  Files           62       62              
  Lines         5023     5169     +146     
===========================================
+ Hits          4152     4238      +86     
- Misses         871      931      +60     
| Flag | Coverage Δ |
|---|---|
| integration | 43.39% <27.51%> (-0.49%) ⬇️ |
| unittests | 81.46% <59.06%> (-0.68%) ⬇️ |

| Files with missing lines | Coverage Δ |
|---|---|
| src/easyscience/fitting/fitter.py | 97.27% <ø> (ø) |
| .../easyscience/fitting/minimizers/minimizer_bumps.py | 82.18% <59.73%> (-13.77%) ⬇️ |

@rozyczko

Member Author

This involves quite a few steps for pickling/serializing legacy objects that rely on weakrefs and the global object.
Reflectometry currently requires these objects, but we are moving away from them.
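
To illustrate why weakref-based objects need extra steps, here is a minimal sketch of one possible approach: send only plain data to the worker and rebuild the object there against the worker's own registry. All names (`Registry`, `LegacyParameter`, `to_payload`, `from_payload`) are hypothetical and are not taken from the PR or from EasyScience.

```python
import pickle
import weakref


class Registry:
    """Hypothetical stand-in for a process-wide global object."""


class LegacyParameter:
    """Hypothetical legacy object holding a weak reference to a registry."""

    def __init__(self, name, value, registry):
        self.name = name
        self.value = value
        # weakref.ref objects cannot be pickled directly.
        self._registry_ref = weakref.ref(registry)


def to_payload(param):
    """Serialize only the plain data; the weak reference is left behind."""
    return pickle.dumps({"name": param.name, "value": param.value})


def from_payload(payload, registry):
    """Rebuild the parameter on the receiving side against its own registry."""
    state = pickle.loads(payload)
    return LegacyParameter(state["name"], state["value"], registry)
```

Pickling the object's attribute dictionary as-is fails with a `TypeError` because of the weak reference, which is why the payload carries only the name and value.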

rozyczko and others added 2 commits June 5, 2026 21:49
Integrate the squash-merged Bayesian DREAM feature from develop (PR #244)
while retaining this branch's multiprocessing and pickling additions.

Reconciliation notes:
- Adopt develop's polymorphic `Fitter.mcmc_sample` (replaces the branch's
  separate `MultiFitter.sample`) and develop's API naming
  (`mcmc_sample`, return key `internal_bumps_object`) and input validation.
- Extend `Fitter.mcmc_sample` with `chains`, `seed`, and `n_workers`,
  forwarding them to the BUMPS minimizer so multiprocessing works for both
  single- and multi-dataset fitters.
- Keep the minimizer-level multiprocessing machinery: `BumpsPoolMapper`,
  worker helpers, cloudpickle-based problem serialization, the `chains`
  alias / `seed` handling, and `n_workers` wiring with mapper cleanup.
- Merge the test suites: develop's mcmc_sample tests plus the branch's MP,
  pickling, alias, and seed tests (adapted to the mcmc_sample API). Make the
  n_workers subprocess test import the model by file path instead of via the
  shadowable `tests` namespace package.
- Update the sampling benchmark tool to call `mcmc_sample`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@rozyczko rozyczko changed the base branch from bayesian to develop June 5, 2026 20:23
@rozyczko rozyczko marked this pull request as ready for review June 8, 2026 09:41