master into nix-next #1715
Merged
Ericson2314 merged 23 commits into nix-next on May 6, 2026
There is an upstream recursive-nix issue that causes this test to fail intermittently in CI. Use `HARNESS-RETRY 5` so yath retries it rather than failing the whole suite.
The queue runner now supports systemd socket activation for both its REST and gRPC servers (`--rest-bind -` and `--grpc-bind -`). It uses `LISTEN_FDNAMES` to map socket names (`rest`, `grpc`) to fd indices. The HTTP and gRPC servers are refactored to accept pre-bound listeners, and binding is done centrally in `main`. The NixOS module now unconditionally uses socket units, eliminating port races and enabling the standard systemd socket activation lifecycle. The test harness (`QueueRunnerContext.pm`) binds sockets with port 0 and passes them to the queue runner via `LISTEN_FDS`, so tests get OS-assigned ports with no race condition.
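The name-to-descriptor mapping described above can be sketched as follows. This is a minimal illustration of the systemd convention (inherited descriptors start at 3 and `LISTEN_FDNAMES` is a colon-separated list in the same order), not the queue runner's actual code; the function name `map_fd_names` is an assumption made for this example.

```rust
use std::collections::HashMap;

/// First inherited descriptor under the systemd socket activation protocol.
const SD_LISTEN_FDS_START: i32 = 3;

/// Map each socket name to its file descriptor number.
///
/// `listen_fds` is the value of `LISTEN_FDS` and `listen_fdnames` the value of
/// `LISTEN_FDNAMES`. Returns `None` if the count does not parse or does not
/// match the number of names.
/// (Hypothetical helper for illustration only.)
fn map_fd_names(listen_fds: &str, listen_fdnames: &str) -> Option<HashMap<String, i32>> {
    let count: usize = listen_fds.trim().parse().ok()?;
    let names: Vec<&str> = listen_fdnames.split(':').collect();
    if names.len() != count {
        return None;
    }
    Some(
        names
            .into_iter()
            .enumerate()
            // The i-th name refers to descriptor 3 + i.
            .map(|(i, name)| (name.to_string(), SD_LISTEN_FDS_START + i as i32))
            .collect(),
    )
}

fn main() {
    // A real service would read these from the environment; they are
    // hard-coded here so the sketch runs anywhere.
    let fds = map_fd_names("2", "rest:grpc").expect("valid socket activation environment");
    println!("rest -> fd {}, grpc -> fd {}", fds["rest"], fds["grpc"]);
}
```

Looking sockets up by name rather than by position means the REST and gRPC servers do not depend on the order in which the socket units are declared.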
Allow `dyn-drv-non-trivial` test to retry up to 5 times
…ivation Add systemd socket activation to queue runner
The `build_log` gRPC handler discarded the `io::Error` with `|_|`, reporting only `"Failed to write log file."` regardless of whether the cause was a disk issue, a broken stream from a crashed builder, or something else. This made intermittent CI failures impossible to diagnose. Include the error in the gRPC status message so the builder logs show the real cause (e.g. `"broken pipe"`, `"No space left on device"`).
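The difference between the two closures can be sketched with plain `std::io` types. The helper names below are hypothetical and a `String` stands in for the gRPC status message; the real handler's types are not shown in this PR.

```rust
use std::io;

/// Build the status message, keeping the underlying cause.
/// (Hypothetical helper; a `String` stands in for the gRPC status.)
fn log_write_failure(err: &io::Error) -> String {
    format!("Failed to write log file: {err}")
}

/// Simulated log write that fails the way a crashed builder's stream might.
fn write_log() -> Result<(), io::Error> {
    Err(io::Error::new(io::ErrorKind::BrokenPipe, "broken pipe"))
}

fn main() {
    // Before: `|_|` throws the cause away.
    let before = write_log().map_err(|_| "Failed to write log file.".to_string());
    // After: the cause is carried into the message.
    let after = write_log().map_err(|e| log_write_failure(&e));
    println!("{before:?}");
    println!("{after:?}");
}
```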
… reporting

The Perl web controllers were falling back to reading `/etc/nix/machines` (the old Nix remote systems file) when `queue_runner_endpoint` wasn't configured. But this doesn't really make sense because Hydra won't actually fall back to using those machines: the Rust queue runner uses a completely different gRPC self-registration mechanism. `getMachines()` now always queries the queue runner's `/status/machines` HTTP endpoint, and warns if `queue_runner_endpoint` is missing.

Additionally, the `SystemStatus` DB table was being used as an intermediary for status reporting: the queue runner would write a JSON dump there, and `hydra-queue-runner --status` would read it back. This is no longer needed since the queue runner has a REST API that serves the same data directly. All consumers (`queue-runner-status` page, `hydra-send-stats`) now query the HTTP endpoint instead.

Removed:
- Legacy `/etc/nix/machines` file parsing from `getMachines()`
- `SystemStatus` DB reads from `Root.pm` (the Perl controller)
- The `--status` CLI flag and `dump_status_loop` from the Rust queue runner (no more writing status to the database)
- `upsert_status`, `get_status`, `notify_dump_status`, `notify_status_dumped` from the DB crate
- The `POST /dump_status` HTTP endpoint
- The `buildMachinesFiles` NixOS option (with `mkRemovedOptionModule` deprecation message) and `NIX_REMOTE_SYSTEMS` env var
- The machines file parsing test and related test scaffolding

`hydra-send-stats` and the `queue-runner-status` page now query the queue runner's REST API directly. The `SystemStatus` table is marked for removal in a future migration. The `hydra-send-stats` test now exercises both the no-endpoint and with-endpoint code paths.
queue-runner: include actual error in `build_log` failure messages
Remove legacy build machine discovery and `SystemStatus`-based status…
Bumps [nixpkgs](https://github.com/NixOS/nixpkgs) from `fcf5160` to `41a7bd5`.
- [Commits](NixOS/nixpkgs@fcf5160...41a7bd5)

---
updated-dependencies:
- dependency-name: nixpkgs
  dependency-version: 41a7bd591bc47320390c829b9f16bfc59b25843c
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [nix](https://github.com/NixOS/nix) from `ee1ce88` to `24a0724`.
- [Commits](NixOS/nix@ee1ce88...24a0724)

---
updated-dependencies:
- dependency-name: nix
  dependency-version: 24a072442bee6ad4c3a2da92abafff987881da26
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Instead of forking Starman to add socket activation support, carry a `Net::Server::Systemd::PreFork` personality inside hydra's own Perl libs and monkey-patch `@Starman::Server::ISA` in `Hydra::Script::Server` when `LISTEN_FDS` is set. This avoids a forked dependency while the upstream PR (miyagawa/Starman#156) is pending. I am working on a forked foreman for optional local testing, but for now the VM test checks that socket activation works correctly.
Add systemd socket activation for `hydra-server` (Starman)
build(deps): bump nix from `ee1ce88` to `24a0724`
build(deps): bump nixpkgs from `fcf5160` to `41a7bd5`
It is not needed by the test suite.
The two changes are:
- `build.job` appears to be a string, so `build.job.name` should just be `build.job`
- `git clone` doesn't copy over orphaned commits, so if the branch moves and a commit we're interested in is abandoned, we have to add a call to `git fetch` to get that specific commit
The web UI's machine status page needs this to fetch data from the queue runner's REST API. Without it, `/machines` shows no machines.
Add a `Socketfile` that declares sockets for `hydra-server` and `hydra-queue-runner`, so foreman can pass sockets to them just like systemd socket activation would. This means that development with foreman once again exercises the same code paths that production will use.

The foreman start scripts are updated accordingly:
- `hydra-server` switches from `hydra-dev-server --port` to `hydra-server -f` (Starman with fork mode), picking up the socket via the `Net::Server::Systemd::PreFork` personality.
- `hydra-queue-runner` gets `--rest-bind - --grpc-bind -` to use socket activation for both its REST and gRPC servers.
- All services now `wait_for_hydra_db` before starting. With socket activation the listen port is open before the server is ready, so `wait_for_hydra_server` alone is no longer sufficient.

The foreman fork (Ericson2314/foreman@socketfile, PR ddollar/foreman#816) adds the `Socketfile` feature. Foreman is packaged from source in `packaging/foreman/` because it was not clear how to override the Nixpkgs version, which gets its source from rubygems.

I would not use a forked project in production when I am far from sure that upstream would accept the change, but foreman is just a developer convenience in the dev shell, which to me makes this OK.
Use forked foreman to pass sockets like systemd in production
Update the "reproduce locally" script
Update harmonia to latest main, where `DrvOutput` uses a `StorePath` instead of a derivation hash. This lets us construct `Realisation` objects directly from data we already have (resolved drv path + output name + concrete output path) without calling `static_output_hashes` through the nix C++ FFI.

Remove the realisation query FFI (`query_raw_realisation`, `InternalRealisation`, `realisation.rs`, `realisation.cpp`) and `static_output_hashes` FFI entirely. The binary-cache crate's `copy_realisation` (which queried the local store via FFI) is replaced with `write_realisation` that accepts a pre-constructed `Realisation`.

In `succeed_step`, after a build step with CA floating outputs succeeds, write a `Realisation` for each output to all S3 binary caches. The realisations are signed with the cache's secret keys.

TODO: also write realisations to the local store's SQLite `Realisations` table (for non-S3 stores) once nix is updated to 2.35, which uses path-based `DrvOutput` matching the new harmonia types.
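To illustrate why a path-based `DrvOutput` removes the need for the FFI call, here is a rough sketch of building one realisation per output from the resolved drv path, the output name, and the concrete output path. The structs are simplified, hypothetical stand-ins that use plain strings; they are not harmonia's real types, the store paths are placeholders, and signing is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in: identifies an output by drv path and name.
#[derive(Debug, Clone, PartialEq)]
struct DrvOutput {
    drv_path: String,
    output_name: String,
}

/// Hypothetical, simplified stand-in: ties a `DrvOutput` to its concrete path.
#[derive(Debug, Clone, PartialEq)]
struct Realisation {
    id: DrvOutput,
    out_path: String,
}

/// Build one realisation per output using only data the build step already has.
/// `outputs` maps output name to concrete output path.
fn realisations_for_step(
    resolved_drv_path: &str,
    outputs: &BTreeMap<String, String>,
) -> Vec<Realisation> {
    outputs
        .iter()
        .map(|(name, path)| Realisation {
            id: DrvOutput {
                drv_path: resolved_drv_path.to_string(),
                output_name: name.clone(),
            },
            out_path: path.clone(),
        })
        .collect()
}

fn main() {
    // Placeholder paths, not real store paths.
    let mut outputs = BTreeMap::new();
    outputs.insert("out".to_string(), "/nix/store/PLACEHOLDER-hello".to_string());
    outputs.insert("dev".to_string(), "/nix/store/PLACEHOLDER-hello-dev".to_string());

    for r in realisations_for_step("/nix/store/PLACEHOLDER-hello.drv", &outputs) {
        println!("{r:?}");
    }
}
```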
Write CA realisations to binary cache in pure Rust