Skip to content

refactor(l1): unify discovery servers into single DiscoveryServer actor#6449

Open
ElFantasma wants to merge 11 commits intomainfrom
refactor/unify-discovery-servers
Open

refactor(l1): unify discovery servers into single DiscoveryServer actor#6449
ElFantasma wants to merge 11 commits intomainfrom
refactor/unify-discovery-servers

Conversation

@ElFantasma
Copy link
Copy Markdown
Contributor

Motivation

Resolves #5990. The discovery protocol uses four GenServers (DiscoveryMultiplexer + Discv4Server + Discv5Server + PeerTable) with significant inter-GenServer communication overhead and ~30-40% code duplication between v4 and v5 servers. Since PeerTable is already shared and the multiplexer already discriminates packets, merging the three discovery GenServers into a single DiscoveryServer simplifies the architecture.

Description

Merges DiscoveryMultiplexer, discv4::DiscoveryServer, and discv5::DiscoveryServer into a single unified discovery::DiscoveryServer actor.

Architecture: 4 actors → 2 actors (DiscoveryServer + PeerTable)

Before:                              After:
┌─────────────────────┐              ┌──────────────────────────────────┐
│ DiscoveryMultiplexer│              │        DiscoveryServer           │
└──────────┬──────────┘              │  ┌─────────────┬──────────────┐  │
           │                         │  │ Discv4State │ Discv5State  │  │
     ┌─────┴─────┐                   │  └─────────────┴──────────────┘  │
     ▼           ▼                   │  • Single UDP receive loop       │
┌─────────┐ ┌─────────┐             │  • Inline packet discrimination  │
│ Discv4  │ │ Discv5  │             │  • Unified timers                │
│ Server  │ │ Server  │             └───────────────┬──────────────────┘
└────┬────┘ └────┬────┘                             ▼
     └─────┬─────┘                           ┌───────────┐
           ▼                                 │ PeerTable │
    ┌───────────┐                            └───────────┘
    │ PeerTable │
    └───────────┘

Key changes:

  • New: discovery/server.rs — unified actor struct, protocol, startup, periodic task handlers, packet discrimination
  • New: discovery/discv4_handlers.rs — all discv4 handler logic as impl DiscoveryServer methods
  • New: discovery/discv5_handlers.rs — all discv5 handler logic as impl DiscoveryServer methods
  • Modified: discv4/server.rs — replaced old actor with Discv4State data struct + Discv4Message type
  • Modified: discv5/server.rs — replaced old actor with Discv5State data struct + Discv5Message type
  • Modified: network.rs — spawns single DiscoveryServer instead of 3 actors
  • Removed: discovery/multiplexer.rs — no longer needed
  • Moved INITIAL_LOOKUP_INTERVAL_MS, LOOKUP_INTERVAL_MS, lookup_interval_function to discovery/mod.rs (shared by discv4, discv5, and RLPx initiator)
  • DiscoveryConfig now includes initial_lookup_interval field

No behavioral changes — same protocols, same packet formats, same peer discovery semantics. Net -696 lines.

Closes #5990

Checklist

All items addressed — no Store schema changes.

@ElFantasma ElFantasma requested a review from a team as a code owner April 7, 2026 13:13
@github-actions github-actions bot added the L1 Ethereum client label Apr 7, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

Lines of code report

Total lines added: 1648
Total lines removed: 1950
Total lines changed: 3598

Detailed view
+-----------------------------------------------------------+-------+-------+
| File                                                      | Lines | Diff  |
+-----------------------------------------------------------+-------+-------+
| ethrex/cmd/ethrex/initializers.rs                         | 667   | +1    |
+-----------------------------------------------------------+-------+-------+
| ethrex/crates/networking/p2p/discovery/discv4_handlers.rs | 478   | +478  |
+-----------------------------------------------------------+-------+-------+
| ethrex/crates/networking/p2p/discovery/discv5_handlers.rs | 718   | +718  |
+-----------------------------------------------------------+-------+-------+
| ethrex/crates/networking/p2p/discovery/mod.rs             | 33    | +17   |
+-----------------------------------------------------------+-------+-------+
| ethrex/crates/networking/p2p/discovery/server.rs          | 434   | +434  |
+-----------------------------------------------------------+-------+-------+
| ethrex/crates/networking/p2p/discv4/server.rs             | 52    | -648  |
+-----------------------------------------------------------+-------+-------+
| ethrex/crates/networking/p2p/discv5/server.rs             | 240   | -1268 |
+-----------------------------------------------------------+-------+-------+
| ethrex/crates/networking/p2p/network.rs                   | 621   | -34   |
+-----------------------------------------------------------+-------+-------+

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

🤖 Codex Code Review

  1. discv5 external-IP/ENR updates are now dead code. PONG still records votes in discv5_handlers.rs:384, but Discv5State::record_ip_vote and cleanup_stale_entries only ever call finalize_ip_vote_round(None) in server.rs:104 and server.rs:134. That means the IpUpdater branch in server.rs:140 is never reached, so the node never updates local_node.ip or bumps its ENR sequence from discv5 IP voting anymore. The tests were also weakened to accept that behavior in discv5_server_tests.rs:285, which masks the regression instead of catching it.

  2. The refactor serializes all discovery packet handling behind one actor and then blocks inside that actor for each packet. handle_raw_packet processes packets inline in discovery/server.rs:273, and both protocol entrypoints immediately call tokio::task::block_in_place(...block_on(...)) in discv4_handlers.rs:26 and discv5_handlers.rs:41. Compared with the previous multiplexer + per-protocol actors, this creates head-of-line blocking between discv4, discv5, timers, and peer-table/store awaits. Under discovery churn or malformed-packet spam, that can turn into missed handshake/lookup timeouts and materially worse resilience. I’d keep packet routing non-blocking and let the async handlers run without block_in_place, or preserve actor separation.

I did not find an EVM/consensus-facing issue in this PR; the clear problems are in discovery correctness and discovery-path robustness.

I couldn’t run cargo check/targeted tests in this environment because rustup failed to create temp files on the read-only filesystem.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 7, 2026

Greptile Summary

This PR consolidates DiscoveryMultiplexer, discv4::DiscoveryServer, and discv5::DiscoveryServer into a single DiscoveryServer actor, reducing inter-actor messaging overhead and removing ~696 lines while keeping the same wire-level semantics. The refactor is well-structured with good test coverage.

Confidence Score: 5/5

Safe to merge; all findings are P2 style and quality issues with no protocol behavior regressions.

No P0 or P1 regressions identified. The block_in_place concern is a design quality issue on multi-threaded runtimes. The IP voting non-update is likely a pre-existing incomplete feature since the PR claims no behavioral changes. Both findings are P2.

discv5/server.rs (IP vote updater wiring), discovery/discv5_handlers.rs (deprecated RNG), discovery/server.rs (block_in_place pattern)

Important Files Changed

Filename Overview
crates/networking/p2p/discovery/server.rs Core unified actor; clean startup and routing logic; sync route_packet dispatch requires block_in_place workaround
crates/networking/p2p/discovery/discv4_handlers.rs Discv4 handlers ported faithfully; block_in_place+block_on bridging is the only concern
crates/networking/p2p/discovery/discv5_handlers.rs Discv5 handlers; one deprecated rand::thread_rng() call; IP-vote path collects votes but never applies them
crates/networking/p2p/discv4/server.rs Correctly reduced to Discv4State data struct and Discv4Message; no behavioral issues
crates/networking/p2p/discv5/server.rs Discv5State data struct; finalize_ip_vote_round always receives None updater so IP update never fires
crates/networking/p2p/discovery/mod.rs Shared constants and DiscoveryConfig; initial_lookup_interval field always overridden in network.rs
crates/networking/p2p/network.rs Spawns single DiscoveryServer; correctly overrides initial_lookup_interval from context
test/tests/p2p/discovery/discv5_server_tests.rs Tests migrated to unified DiscoveryServer; good coverage of rate limiting and PONG-triggered ENR updates

Sequence Diagram

sequenceDiagram
    participant UDP as UDP Socket
    participant DS as DiscoveryServer
    participant V4 as Discv4Handlers
    participant V5 as Discv5Handlers
    participant PT as PeerTable

    UDP->>DS: RawPacket via UdpFramed stream
    DS->>DS: route_packet(data, from)
    alt discv4 packet (keccak hash match)
        DS->>DS: Discv4Packet::decode(data)
        DS->>V4: discv4_process_message(msg)
        V4->>PT: insert_if_new / record_ping / get_contact
        V4->>UDP: send_to Pong/Neighbors/ENRResponse
    else discv5 packet
        DS->>DS: Discv5Packet::decode(data, node_id)
        DS->>V5: discv5_handle_packet(msg)
        V5->>PT: get_session_info / set_session_info
        V5->>UDP: send_to WhoAreYou/Handshake/Ordinary
    end
    Note over DS: Shared periodic timers
    DS->>V4: discv4_revalidate every 1s
    DS->>V5: discv5_revalidate every 1s
    DS->>V4: discv4_lookup adaptive interval
    DS->>V5: discv5_lookup adaptive interval
    DS->>DS: prune every 5s
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/networking/p2p/discovery/discv4_handlers.rs
Line: 26-31

Comment:
**`block_in_place` + `block_on` inside an already-async handler**

This sync wrapper calls `block_in_place` to enter async, but `handle_raw_packet` (the actor handler) is already `async`. `block_in_place` blocks the OS thread for the entire handler duration and panics on a `current_thread` Tokio runtime. Since `#[send_handler]` supports `async`, making `route_packet` and its callees `async fn` and awaiting them would eliminate `block_in_place` entirely. The same pattern appears in `discv5_handle_packet` (discv5_handlers.rs lines 41-46).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/networking/p2p/discv5/server.rs
Line: 140-169

Comment:
**IP auto-detection votes are collected but never applied**

`finalize_ip_vote_round` is designed to take an `IpUpdater` to write the winning IP back, but every call site passes `None`: `record_ip_vote` (line 134) and `cleanup_stale_entries` (line 104). The function docstring says 'When called from the unified server, `local_node` is passed so the IP can be updated', and the test comment says 'the server's prune/periodic handler would do that' — but no such periodic call with an `IpUpdater` exists in `discovery/server.rs`. The PONG-based IP self-detection feature silently collects votes but never updates the node's external IP.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/networking/p2p/discovery/discv5_handlers.rs
Line: 149-150

Comment:
**`rand::thread_rng()` is deprecated**

All other RNG usages in this file use `OsRng`, but this site still calls the deprecated `rand::thread_rng()`. Suggested fix:
```suggestion
        let ephemeral_key = SecretKey::new(&mut OsRng);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: crates/networking/p2p/discovery/mod.rs
Line: 27

Comment:
**`initial_lookup_interval` is always overridden in `network::start_network`**

`start_network` unconditionally overwrites `config.initial_lookup_interval` with `context.initial_lookup_interval`, making the default set in `DiscoveryConfig::default()` dead code for every production caller. Consider removing the field from `DiscoveryConfig` or adding a doc-comment that `start_network` always overrides it.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "refactor(l1): unify discovery servers in..." | Re-trigger Greptile

Comment on lines +26 to +31
pub(crate) fn discv4_process_message(&mut self, msg: Discv4Message) {
let _ = tokio::task::block_in_place(|| {
tokio::runtime::Handle::current().block_on(self.discv4_process_message_inner(msg))
})
.inspect_err(|e| error!(protocol = "discv4", err=?e, "Error Handling Discovery message"));
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 block_in_place + block_on inside an already-async handler

This sync wrapper calls block_in_place to enter async, but handle_raw_packet (the actor handler) is already async. block_in_place blocks the OS thread for the entire handler duration and panics on a current_thread Tokio runtime. Since #[send_handler] supports async, making route_packet and its callees async fn and awaiting them would eliminate block_in_place entirely. The same pattern appears in discv5_handle_packet (discv5_handlers.rs lines 41-46).

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/discovery/discv4_handlers.rs
Line: 26-31

Comment:
**`block_in_place` + `block_on` inside an already-async handler**

This sync wrapper calls `block_in_place` to enter async, but `handle_raw_packet` (the actor handler) is already `async`. `block_in_place` blocks the OS thread for the entire handler duration and panics on a `current_thread` Tokio runtime. Since `#[send_handler]` supports `async`, making `route_packet` and its callees `async fn` and awaiting them would eliminate `block_in_place` entirely. The same pattern appears in `discv5_handle_packet` (discv5_handlers.rs lines 41-46).

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 506ec7f — made route_packet, route_to_discv4, and route_to_discv5 async, removing the block_in_place + block_on bridge entirely. The #[send_handler] on handle_raw_packet is already async, so the handlers now just .await directly.

Comment thread crates/networking/p2p/discv5/server.rs Outdated
Comment on lines 140 to 169
fn finalize_ip_vote_round(&mut self, updater: Option<&mut IpUpdater<'_>>) {
let winner = self
.ip_votes
.iter()
.map(|(ip, voters)| (*ip, voters.len()))
.max_by_key(|(_, count)| *count);

if let Some((winning_ip, vote_count)) = winner {
// Only update if we have minimum votes and IP differs
if vote_count >= IP_VOTE_THRESHOLD && winning_ip != self.local_node.ip {
info!(
protocol = "discv5",
old_ip = %self.local_node.ip,
new_ip = %winning_ip,
votes = vote_count,
"External IP detected via PONG voting, updating local ENR"
);
self.update_local_ip(winning_ip);
}
if let Some((winning_ip, vote_count)) = winner
&& let Some(updater) = updater
&& vote_count >= IP_VOTE_THRESHOLD
&& winning_ip != updater.local_node.ip
{
info!(
protocol = "discv5",
old_ip = %updater.local_node.ip,
new_ip = %winning_ip,
votes = vote_count,
"External IP detected via PONG voting, updating local ENR"
);
update_local_ip(
updater.local_node,
updater.local_node_record,
updater.signer,
winning_ip,
);
}

// Reset for next round
self.ip_votes.clear();
self.ip_vote_period_start = Some(Instant::now());
self.first_ip_vote_round_completed = true;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 IP auto-detection votes are collected but never applied

finalize_ip_vote_round is designed to take an IpUpdater to write the winning IP back, but every call site passes None: record_ip_vote (line 134) and cleanup_stale_entries (line 104). The function docstring says 'When called from the unified server, local_node is passed so the IP can be updated', and the test comment says 'the server's prune/periodic handler would do that' — but no such periodic call with an IpUpdater exists in discovery/server.rs. The PONG-based IP self-detection feature silently collects votes but never updates the node's external IP.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/discv5/server.rs
Line: 140-169

Comment:
**IP auto-detection votes are collected but never applied**

`finalize_ip_vote_round` is designed to take an `IpUpdater` to write the winning IP back, but every call site passes `None`: `record_ip_vote` (line 134) and `cleanup_stale_entries` (line 104). The function docstring says 'When called from the unified server, `local_node` is passed so the IP can be updated', and the test comment says 'the server's prune/periodic handler would do that' — but no such periodic call with an `IpUpdater` exists in `discovery/server.rs`. The PONG-based IP self-detection feature silently collects votes but never updates the node's external IP.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 506ec7f — removed IpUpdater entirely. record_ip_vote and cleanup_stale_entries now return Option<IpAddr> (the winning IP to apply). The caller in discv5_handle_pong and the prune() handler apply the update directly where they have access to self.local_node / self.local_node_record / self.signer.

Comment on lines +149 to +150
let ephemeral_key = SecretKey::new(&mut rand::thread_rng());
let ephemeral_pubkey = ephemeral_key.public_key(secp256k1::SECP256K1).serialize();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 rand::thread_rng() is deprecated

All other RNG usages in this file use OsRng, but this site still calls the deprecated rand::thread_rng(). Suggested fix:

Suggested change
let ephemeral_key = SecretKey::new(&mut rand::thread_rng());
let ephemeral_pubkey = ephemeral_key.public_key(secp256k1::SECP256K1).serialize();
let ephemeral_key = SecretKey::new(&mut OsRng);
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/discovery/discv5_handlers.rs
Line: 149-150

Comment:
**`rand::thread_rng()` is deprecated**

All other RNG usages in this file use `OsRng`, but this site still calls the deprecated `rand::thread_rng()`. Suggested fix:
```suggestion
        let ephemeral_key = SecretKey::new(&mut OsRng);
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 506ec7f — replaced with OsRng.

pub struct DiscoveryConfig {
pub discv4_enabled: bool,
pub discv5_enabled: bool,
pub initial_lookup_interval: f64,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 initial_lookup_interval is always overridden in network::start_network

start_network unconditionally overwrites config.initial_lookup_interval with context.initial_lookup_interval, making the default set in DiscoveryConfig::default() dead code for every production caller. Consider removing the field from DiscoveryConfig or adding a doc-comment that start_network always overrides it.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/networking/p2p/discovery/mod.rs
Line: 27

Comment:
**`initial_lookup_interval` is always overridden in `network::start_network`**

`start_network` unconditionally overwrites `config.initial_lookup_interval` with `context.initial_lookup_interval`, making the default set in `DiscoveryConfig::default()` dead code for every production caller. Consider removing the field from `DiscoveryConfig` or adding a doc-comment that `start_network` always overrides it.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — the default in DiscoveryConfig::default() is mainly useful for tests and standalone usage. In production, start_network always overrides it from P2PContext::initial_lookup_interval. Changed the override pattern to use struct update syntax (DiscoveryConfig { initial_lookup_interval: context.initial_lookup_interval, ..config }) to make the intent clearer.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

🤖 Claude Code Review

Here is my review of PR #6449:


Review: refactor(l1): unify discovery servers into single DiscoveryServer actor

Overall impression: The architectural consolidation is clean and well-motivated — removing 30-40% code duplication and one inter-actor hop is a meaningful improvement. The new file layout is logical. However, there is one clear behavioral regression and a few design concerns worth addressing before merge.


Bug — External IP Update Is Broken (Regression)

crates/networking/p2p/discv5/server.rs:104,134 and crates/networking/p2p/discv5/server.rs:140-169

finalize_ip_vote_round is called with None everywhere in the codebase:

// cleanup_stale_entries (line 104)
self.finalize_ip_vote_round(None);

// record_ip_vote (line 134)
self.finalize_ip_vote_round(None);

When updater is None, the function computes the winning IP but never applies it — the update_local_ip call is dead. The IpUpdater struct is never instantiated in any production code path:

fn finalize_ip_vote_round(&mut self, updater: Option<&mut IpUpdater<'_>>) {
    // ...
    if let Some((winning_ip, vote_count)) = winner
        && let Some(updater) = updater  // always None → update never happens
        && vote_count >= IP_VOTE_THRESHOLD
        && winning_ip != updater.local_node.ip
    {
        update_local_ip(...);
    }
    // ...
}

The PR description says this is "no behavioral change," but the discv5 external IP detection via PONG voting is silently broken. The PR commit message says "actual IP update is done by the server's periodic handler" but no such periodic handler exists in server.rshandle_revalidate_v5 only sends pings.

Fix: Either wire up IpUpdater at the call sites in record_ip_vote / cleanup_stale_entries by passing &mut self.local_node, &mut self.local_node_record, and &self.signer through a borrow split, or introduce a dedicated IP finalization handler called from the prune timer.


Design Concern — block_in_place in Sync Packet Handlers

discv4_handlers.rs:27-30, discv5_handlers.rs:42-45

Both entry points are sync fns that block the Tokio worker thread:

pub(crate) fn discv4_process_message(&mut self, msg: Discv4Message) {
    let _ = tokio::task::block_in_place(|| {
        tokio::runtime::Handle::current().block_on(self.discv4_process_message_inner(msg))
    })
    ...
}

handle_raw_packet (the caller in server.rs:272) is already an async fn. The only reason for this sync wrapper and block_in_place bridge is that route_packet is sync. Making route_packetroute_to_discv4 / route_to_discv5 → the message handler all async would remove this indirection and avoid blocking a worker thread for the duration of crypto + network I/O on each packet. At high packet rates this can starve other tasks on the same thread.


Fragile expect Invariant Propagation

discv5_handlers.rs, multiple locations (e.g. lines 85, 133, 191, 262, 508, 517, 562, 605, 614, 632, 708)

There are ~10 occurrences of:

let discv5 = self.discv5.as_mut().expect("discv5 state must exist");

The invariant "discv5 is Some when a discv5 handler is called" is currently maintained by the routing guard in route_to_discv5:

fn route_to_discv5(&mut self, data: &[u8], from: SocketAddr) {
    if self.discv5.is_none() { return; }
    ...
}

This is a single layer of defense. Any future direct call to a discv5_* method (e.g., from a new timer handler, from a test, or from new_for_test with discv5 None) will panic at runtime. Prefer if let Some(discv5) = &mut self.discv5 { ... } else { return Ok(()); } or extract a helper that returns Result — both are panic-free.


Visibility: pub Fields on DiscoveryServer

server.rs:82-91

pub struct DiscoveryServer {
    pub local_node: Node,
    pub local_node_record: NodeRecord,
    pub(crate) signer: SecretKey,
    ...
    pub discv4: Option<Discv4State>,
    pub discv5: Option<Discv5State>,
}

discv4 and discv5 state (including session keys, pending challenges, nonce counters) are pub. This is likely driven by the test helper in discv5_server_tests.rs:

fn discv5(server: &mut DiscoveryServer) -> &mut Discv5State {
    server.discv5.as_mut().expect("discv5 state must exist")
}

pub(crate) would give the same test access (tests in test/ use the public API, so those would need pub, but discv4 could be pub(crate) or hidden behind accessors). Worth tightening since Discv5State contains live cryptographic session material.


Minor — get_lookup_interval Takes &mut self Unnecessarily

server.rs:430

pub(crate) async fn get_lookup_interval(&mut self) -> Duration {

The function only reads self.initial_lookup_interval and calls peer_table.target_peers_completion(). The peer_table field uses interior mutability (it's an actor handle), so &self should suffice. The &mut self signature propagates unnecessarily into caller signatures.


Minor — find_node_message Expiration Risk

discv4/server.rs:35-43

Discv4State::random_message bakes the expiration timestamp into the cached BytesMut at creation time using EXPIRATION_SECONDS = 20. The message is refreshed every CHANGE_FIND_NODE_MESSAGE_INTERVAL = 5s. The first message sent is valid, but if the actor is slow to start (e.g., delayed started() hook), the initial 20s window begins at construction, not at first use. This is a narrow window and low severity, but worth noting.


Nit — Inconsistent Logging Format

discv4_handlers.rs:289, discv5_handlers.rs:45

discv4 uses err=?e (Debug) while discv5 uses err=%e (Display) in symmetric log lines. Consistent use of ? (Debug) everywhere would match the rest of the codebase.


Summary

Severity Item
Bug IP update via PONG voting dead code — finalize_ip_vote_round never called with Some(updater)
Medium block_in_place in sync packet handler blocks Tokio worker thread
Medium Invariant panics on discv5 state must exist — single point of failure
Low pub visibility on live cryptographic session state
Nit get_lookup_interval unnecessarily takes &mut self
Nit Inconsistent ? vs % in tracing log fields

Automated review by Claude (Anthropic) · sonnet · custom prompt

@ElFantasma
Copy link
Copy Markdown
Contributor Author

Addressed all review feedback in 506ec7f:

Fixed:

  1. IP voting regressionrecord_ip_vote and cleanup_stale_entries now return Option<IpAddr> when a round ends with a winning IP. The caller (discv5_handle_pong and prune()) applies the update directly where it has access to local_node/local_node_record/signer. Removed the unused IpUpdater struct.
  2. block_in_place removedroute_packet, route_to_discv4, route_to_discv5 are now async fn and awaited directly from the #[send_handler]. No more sync bridge.
  3. rand::thread_rng()OsRng — consistent with the rest of the file.
  4. get_lookup_interval takes &self instead of &mut self.
  5. initial_lookup_interval override — uses struct update syntax for clarity.

Acknowledged but not changing:

  • pub fields on DiscoveryServer — needed for external test crate access (test/tests/p2p/discovery/). The old discv5 server had the same fields as pub.
  • expect("discv5 state must exist") panics — the invariant is maintained by the routing guard (if self.discv5.is_none() { return; }). Making every handler return early with if let would add noise for a situation that can't happen in normal operation. The new_for_test constructor always sets discv5 = Some(...).

Ok(())
}

pub fn new_for_test(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably shouldn't be here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 672fa03 — moved the test helper behind #[cfg(any(test, feature = "test-utils"))]. The ethrex-test crate now enables the test-utils feature on its ethrex-p2p dependency. Not in the production build path anymore.

Comment on lines +201 to +202
discv4: None,
discv5: Some(Discv5State::default()),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only has discv5 but the name or the (nonexisting) documentation doesn't mention it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 672fa03 — renamed to new_for_discv5_test and added a docstring making it explicit that only discv5 state is initialized. The config now also explicitly sets discv4_enabled: false, discv5_enabled: true.

Comment on lines +88 to +89
pub(crate) initial_lookup_interval: f64,
pub(crate) config: DiscoveryConfig,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't initial_lookup_interval already in the DiscoveryConfig?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 672fa03 — removed the standalone initial_lookup_interval field and now reads it from self.config.initial_lookup_interval directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

L1 Ethereum client

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Unify discovery protocol GenServers into a single DiscoveryServer

3 participants