fix(h2): return Poll::Pending when poll_capacity is not ready in UpgradedSendStreamTask #4050

Open
abbshr wants to merge 1 commit into hyperium:master from abbshr:fix/poll_capacity_bypass_backpressure

abbshr commented Apr 9, 2026

Fix #4049

Fix a backpressure bypass bug in UpgradedSendStreamTask::tick() where poll_capacity() returning Poll::Pending caused a 'break 'capacity' that fell through to rx.poll_next() -> send_data(), pushing data into the h2 send buffer without available capacity. This broke the HTTP/2 flow control chain, causing unbounded memory growth (OOM) when downstream consumers were slower than upstream producers.

The fix changes 'break 'capacity' to 'return Poll::Pending', which correctly suspends the task until a WINDOW_UPDATE frame restores send capacity. The now-unused 'capacity label is also removed.
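The control-flow difference can be illustrated with a small std-only simulation. This is only a sketch under stated assumptions, not hyper's actual code: `FakeSendStream`, `tick`, and `bytes_over_window` are hypothetical names, the peer is assumed never to send WINDOW_UPDATE, and wakers are left out entirely.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Hypothetical stand-in for an h2 send stream: tracks the remaining
/// flow-control window and the bytes queued without window to cover them.
struct FakeSendStream {
    window: usize,
    over_window: usize,
}

impl FakeSendStream {
    /// Mimics the shape of a capacity poll: Pending when the window is empty.
    fn poll_capacity(&self) -> Poll<usize> {
        if self.window == 0 {
            Poll::Pending
        } else {
            Poll::Ready(self.window)
        }
    }

    /// Queues data; anything beyond the window only accumulates in memory.
    fn send_data(&mut self, len: usize) {
        let covered = len.min(self.window);
        self.window -= covered;
        self.over_window += len - covered;
    }
}

/// One tick of the simulated task. `early_return` selects the fixed behaviour.
fn tick(tx: &mut FakeSendStream, rx: &mut VecDeque<usize>, early_return: bool) -> Poll<()> {
    if tx.poll_capacity().is_pending() && early_return {
        // Fixed behaviour: stop here until the window reopens.
        return Poll::Pending;
    }
    // The buggy behaviour reaches this point even with an empty window.
    match rx.pop_front() {
        Some(len) => {
            tx.send_data(len);
            Poll::Pending
        }
        None => Poll::Ready(()),
    }
}

/// Runs up to ten ticks against a peer that never reopens the window and
/// returns the number of bytes buffered beyond the window.
fn bytes_over_window(early_return: bool) -> usize {
    let mut tx = FakeSendStream { window: 10, over_window: 0 };
    // Five chunks of 10 bytes each are waiting on the producer side.
    let mut rx: VecDeque<usize> = std::iter::repeat(10).take(5).collect();
    for _ in 0..10 {
        if tick(&mut tx, &mut rx, early_return).is_ready() {
            break;
        }
    }
    tx.over_window
}

fn main() {
    println!("fall-through: {} bytes over window", bytes_over_window(false));
    println!("early return: {} bytes over window", bytes_over_window(true));
}
```

In this model the fall-through variant drains the whole producer queue into the send buffer, while the early-return variant leaves the remaining chunks with the producer.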

This bug was introduced in hyper v1.8.0 (PR #3967) and affects v1.8.0, v1.8.1, and v1.9.0. A single HTTP/2 CONNECT tunnel with asymmetric upstream/downstream speeds could trigger OOM within seconds.

Add four integration tests covering H2 CONNECT backpressure scenarios:

  • h2_connect_backpressure_respected: small window + large data transfer
  • h2_connect_zero_window_then_release: normal path regression guard
  • h2_connect_reset_during_backpressure: RST_STREAM error propagation
  • h2_connect_backpressure_bidirectional: bidirectional data + backpressure


abbshr commented Apr 9, 2026

cc @seanmonstar
Thanks for reviewing.

  )));
  }
- Poll::Pending => break 'capacity,
+ Poll::Pending => return Poll::Pending,
Member


I think the comment at L71-L74 is relevant to why this previously did not return early. We want to make sure the waker is registered with each of the futures, so that if one side "cancels", the task can clean up quickly.

  • We want to notice when capacity has become available.
  • Or when the remote has sent a RST_STREAM (or other error)
  • Or when our bytes sender (on the me.rx side) has closed and no longer expects to send more data.

Said another way, if we're waiting for capacity, and the user drops the Upgraded type (meaning they no longer want to write), this UpgradedSendStreamTask will not notice and will hang around until capacity is eventually given (if the peer ever gives it), and only then hang up.

I get what you're trying to do, but I think the types or channels would need to be adjusted a little to handle those cases.
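The dropped-writer case described above can be reproduced in a small std-only model. This is a sketch under assumptions, not hyper's implementation: `Task`, `tick_early_return`, `tick_check_all`, and `notices_drop` are hypothetical names, a `std::sync::mpsc` channel stands in for the byte channel, and a plain bool stands in for the capacity poll.

```rust
use std::sync::mpsc::{channel, Receiver, TryRecvError};
use std::task::Poll;

/// Hypothetical simplified task: `has_capacity` stands in for the result of
/// polling the h2 stream for capacity, `rx` for the user-facing byte channel.
struct Task {
    has_capacity: bool,
    rx: Receiver<Vec<u8>>,
}

impl Task {
    /// Early-return variant: never looks at `rx` while capacity is missing.
    fn tick_early_return(&mut self) -> Poll<&'static str> {
        if !self.has_capacity {
            return Poll::Pending;
        }
        self.check_rx()
    }

    /// Variant that looks at `rx` on every tick, regardless of capacity.
    /// In the real task this is also where data would be consumed without
    /// capacity; the sketch only shows the close-detection side.
    fn tick_check_all(&mut self) -> Poll<&'static str> {
        self.check_rx()
    }

    fn check_rx(&mut self) -> Poll<&'static str> {
        match self.rx.try_recv() {
            // Data would be handed to the h2 stream here.
            Ok(_) => Poll::Pending,
            Err(TryRecvError::Empty) => Poll::Pending,
            // The writer half was dropped: the task can finish.
            Err(TryRecvError::Disconnected) => Poll::Ready("sender dropped"),
        }
    }
}

/// Drops the writer while the task has no capacity and reports whether a
/// single tick notices it.
fn notices_drop(early_return: bool) -> bool {
    let (tx, rx) = channel::<Vec<u8>>();
    let mut task = Task { has_capacity: false, rx };
    drop(tx);
    let polled = if early_return {
        task.tick_early_return()
    } else {
        task.tick_check_all()
    };
    polled.is_ready()
}

fn main() {
    println!("early return notices drop: {}", notices_drop(true));
    println!("check-all notices drop: {}", notices_drop(false));
}
```

The model shows the tension in the thread: the variant that keeps checking `rx` sees the drop immediately, but in the real task the same poll is what pulls data without capacity.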

Author


Thanks for replying.
I re-read the comments and the implementation. As the comments say, there are three sub-tasks within tick: h2_tx.poll_capacity(), h2_tx.poll_reset(), and rx.poll_next().

I agree with the case you described: "when the remote has sent a RST_STREAM (or other error)" should be noticed as soon as possible, because it concerns the h2 stream itself. If the h2 channel can no longer work, the whole task should be dropped.

But I think rx.poll_next() is the other half of the transaction and comes after the h2 side: if no writes are permitted on the h2 channel, rx should not be polled.

So I think the change should look something like this:

// check capacity
let h2_has_capacity = loop {
    match me.h2_tx.poll_capacity(cx) {
        ...
        // just break out of the loop with a "no capacity" flag
        Poll::Pending => break false,
    }
};

// handle the h2_tx RST_STREAM case
match me.h2_tx.poll_reset(cx) {
    ...
}

// no capacity: do not poll rx, wait until capacity is restored
if !h2_has_capacity {
    return Poll::Pending;
}

// handle rx side poll data
match me.rx.as_mut().poll_next(cx) {
    Poll::Ready(Some(cursor)) => {
        me.h2_tx
            .send_data(SendBuf::Cursor(cursor), false)
            .map_err(crate::Error::new_body_write)?;
    }
    Poll::Ready(None) => {
        me.h2_tx
            .send_data(SendBuf::None, true)
            .map_err(crate::Error::new_body_write)?;
        return Poll::Ready(Ok(()));
    }
    Poll::Pending => {
        return Poll::Pending;
    }
}

Correct me if I'm wrong.
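The ordering proposed above can be checked in isolation with a std-only model. This is a sketch under assumptions rather than a drop-in patch: `State` and `tick` are hypothetical, each field stands in for the result of one real poll, and an empty queue is treated as a closed channel for simplicity.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Hypothetical inputs for one tick; each field stands in for the result of
/// polling one of the three sources named in the thread.
struct State {
    capacity: Poll<usize>,     // stand-in for h2_tx.poll_capacity()
    reset: Poll<&'static str>, // stand-in for h2_tx.poll_reset()
    rx: VecDeque<Vec<u8>>,     // stand-in for the rx channel
    sent: usize,               // bytes handed to the h2 stream so far
}

fn tick(s: &mut State) -> Poll<Result<(), &'static str>> {
    // 1. Check capacity, but remember the answer instead of returning.
    let has_capacity = s.capacity.is_ready();

    // 2. Always check for a reset, even without capacity.
    if let Poll::Ready(reason) = s.reset {
        return Poll::Ready(Err(reason));
    }

    // 3. Without capacity, stop before touching rx.
    if !has_capacity {
        return Poll::Pending;
    }

    // 4. Only now pull data from rx (empty queue is treated as closed).
    match s.rx.pop_front() {
        Some(chunk) => {
            s.sent += chunk.len();
            Poll::Pending
        }
        None => Poll::Ready(Ok(())),
    }
}

fn main() {
    let mut s = State {
        capacity: Poll::Pending,
        reset: Poll::Pending,
        rx: VecDeque::from(vec![vec![0u8; 4]]),
        sent: 0,
    };
    // No capacity: the chunk stays in rx.
    println!("{:?}, queued chunks: {}", tick(&mut s), s.rx.len());
    // A reset is still observed while capacity is missing.
    s.reset = Poll::Ready("reset");
    println!("{:?}", tick(&mut s));
}
```

In this model a reset is reported while capacity is missing and no data leaves `rx`. Because `rx` is not polled in that state, a closed `rx` is not observed until capacity returns, which is the remaining case raised in the review comment.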

Development

Successfully merging this pull request may close these issues.

HTTP/2 CONNECT Upgraded stream bypasses H2 flow-control backpressure, causing unbounded memory growth (OOM)