Fix chanmon_consistency fuzz test for splice RBF#4536
Fix chanmon_consistency fuzz test for splice RBF#4536jkczyz wants to merge 5 commits intolightningdevkit:mainfrom
chanmon_consistency fuzz test for splice RBF#4536Conversation
…ency The process_events! macro only handled DiscardFunding events with FundingInfo::Contribution, but splice RBF replacements can produce DiscardFunding with FundingInfo::Tx when the original splice transaction is discarded. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The SplicePending event handler was immediately confirming splice transactions, which caused force-closes when RBF splice replacements were also confirmed for the same channel. Since both transactions spend the same funding UTXO, only one can exist on a real chain. Model this properly by adding a mempool-like pending pool to ChainState. Splice transactions are added to the pending pool instead of being confirmed immediately. Conflicting RBF candidates coexist in the pool until chain-sync time, when one is selected deterministically (by txid sort order) and the rest are rejected as double-spends. If a conflicting transaction was already confirmed, new candidates are dropped. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
|
👋 Thanks for assigning @wpaulino as a reviewer! |
|
This should fix the test for the following inputs: |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4536 +/- ##
==========================================
- Coverage 87.12% 86.99% -0.14%
==========================================
Files 163 163
Lines 108740 108659 -81
Branches 108740 108659 -81
==========================================
- Hits 94735 94523 -212
- Misses 11520 11653 +133
+ Partials 2485 2483 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I've thoroughly reviewed every hunk in this diff, examining:
No issues found. |
| fn is_outpoint_spent(&self, outpoint: &bitcoin::OutPoint) -> bool { | ||
| self.blocks.iter().any(|(_, txs)| { | ||
| txs.iter().any(|tx| { | ||
| tx.input.iter().any(|input| input.previous_output == *outpoint) |
There was a problem hiding this comment.
note that because hashes in fuzzing can collide, this implies some splices will never lock in - is that an issue for chanmon_consistency?
There was a problem hiding this comment.
Hmm... maybe we can mostly avoid some occurrences by only looking back six blocks?
fuzz/src/chanmon_consistency.rs
Outdated
| if msg.data.contains("Disconnecting due to timeout awaiting response") | ||
| )); | ||
| if msg.data.contains("Disconnecting due to timeout awaiting response") | ||
| || msg.data.contains("cannot RBF") |
There was a problem hiding this comment.
Huh? i don't see why some of these should be reachable? Messages should always be delivered in order, so, for example, it still should not be possible to attempt to RBF a splice for which a node has already sent a lock message. But honestly I don't see why we should allow most of these.
There was a problem hiding this comment.
Agreed. It seems like we're not failing locally in certain cases, so we see the counterparty rightfully reject them.
There was a problem hiding this comment.
Sorry, this commit should have been dropped. All but one of the test inputs are actually fixed by the previous commit. And the remaining input is fixed by the last commit. Need to verify if there were other inputs not listed here fixed by the first commit.
| events::Event::DiscardFunding { | ||
| funding_info: events::FundingInfo::Contribution { .. }, | ||
| funding_info: events::FundingInfo::Contribution { .. } | ||
| | events::FundingInfo::Tx { .. }, |
There was a problem hiding this comment.
Wouldn't it be better to always emit the Contribution variant?
There was a problem hiding this comment.
That's part of #4498, right? I guess we should prioritize that, too. 😬
fuzz/src/chanmon_consistency.rs
Outdated
| let mut txs = std::mem::take(&mut self.pending_txs); | ||
| txs.sort_by_key(|tx| tx.compute_txid()); | ||
| for tx in txs { | ||
| self.confirm_tx(tx); |
There was a problem hiding this comment.
Can we make all of them confirm in the same block? This will push 6 blocks for each transaction.
There was a problem hiding this comment.
Sure, though it shouldn't affect splices since we only take one RBF tx.
fuzz/src/chanmon_consistency.rs
Outdated
| if msg.data.contains("Disconnecting due to timeout awaiting response") | ||
| )); | ||
| if msg.data.contains("Disconnecting due to timeout awaiting response") | ||
| || msg.data.contains("cannot RBF") |
There was a problem hiding this comment.
Agreed. It seems like we're not failing locally in certain cases, so we see the counterparty rightfully reject them.
64205b8 to
28c4672
Compare
Under fuzz hashing, txids have only 8 effective bits, so outpoints from unrelated transactions in old blocks frequently collide. This causes is_outpoint_spent to produce false positives, silently preventing legitimate transactions from being added to the pending pool or confirmed. Limit is_outpoint_spent to the last 6 blocks. Confirmed transactions are always followed by 5 empty blocks, so the scan window covers exactly one block with transactions. This catches real double-spends (same-channel splice RBF candidates) while avoiding false positives from older blocks. Also remove the is_outpoint_spent check from add_pending_tx so that filtering decisions are made at confirmation time (in confirm_pending_txs) rather than at add time. This ensures all candidates participate in the deterministic sort before any are rejected. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
When process_msg_events processes a single message for node B (via OneMessage or OnePendingMessage mode), remaining events are passed to push_excess_b_events for routing. Timer ticks can generate BroadcastChannelUpdate events from marking channels disabled on disconnect, which may be left over after the single processed message. Since it is a broadcast not directed at a specific peer, it can be safely skipped. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
28c4672 to
3ce31ba
Compare
The
chanmon_consistencyfuzz target panicked on various inputs involvingsplice RBF because it immediately confirmed splice transactions in the
SplicePendinghandler. When RBF replacements were also confirmed for thesame channel, both transactions ended up in the chain despite spending the
same funding UTXO -- an impossible scenario that caused force-closes.
ChainState. Splice transactions are held in the pool until chain-synctime, when one is selected deterministically (by txid sort order) among
conflicting candidates. Double-spends with already-confirmed transactions
are rejected.
DiscardFundingwithFundingInfo::Txvariant, which is producedwhen a splice transaction is discarded during RBF replacement.
HandleErrorassertion,which can occur when the fuzzer delivers messages in unusual orders.
BroadcastChannelUpdateinpush_excess_b_events, which can beleft over when
process_msg_eventsprocesses a limited number of messages.See #4504