Allow set-body to replace origin server response bodies#12879
Allow set-body to replace origin server response bodies#12879bryancall wants to merge 9 commits intoapache:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends the header_rewrite plugin's set-body operator to work with origin server responses, not just ATS-generated error responses. Previously, TSHttpTxnErrorBodySet() only affected responses generated by ATS itself, as the origin body would stream through the HTTP tunnel. Now, when internal_msg_buffer is set, ATS drains the origin response body (when feasible for connection reuse) and uses setup_internal_transfer() to send the plugin-provided body instead.
Changes:
- Added logic in
HttpSM::handle_api_return()to detect when a plugin has set an internal message buffer and divert to internal transfer instead of tunneling the origin response - Implemented
HttpSM::do_drain_server_response_body()to synchronously consume origin response bodies from the buffer to enable connection reuse - Extended
HttpSM::release_server_session()to allow pooling connections after successful body drains - Enabled
TS_HTTP_READ_RESPONSE_HDR_HOOKfor theset-bodyoperator - Added comprehensive test coverage and documentation
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
src/proxy/http/HttpSM.cc |
Core implementation: added internal_msg_buffer checks in TRANSFORM_READ and SERVER_READ paths, implemented do_drain_server_response_body() for connection reuse, and extended release_server_session() pooling logic |
include/proxy/http/HttpSM.h |
Added declaration for new do_drain_server_response_body() method |
plugins/header_rewrite/operators.cc |
Registered TS_HTTP_READ_RESPONSE_HDR_HOOK as an allowed hook for the set-body operator |
doc/admin-guide/plugins/header_rewrite.en.rst |
Updated set-body documentation with origin response use cases, connection reuse behavior explanation, and practical examples |
tests/gold_tests/pluginTest/header_rewrite/header_rewrite_set_body_origin.test.py |
Test entry point for replay-based test |
tests/gold_tests/pluginTest/header_rewrite/header_rewrite_set_body_origin.replay.yaml |
Comprehensive test scenarios covering both hooks, multiple status codes, and edge cases |
tests/gold_tests/pluginTest/header_rewrite/rules/rule_set_body_origin_send_resp.conf |
Test rule for SEND_RESPONSE_HDR_HOOK |
tests/gold_tests/pluginTest/header_rewrite/rules/rule_set_body_origin_read_resp.conf |
Test rule for READ_RESPONSE_HDR_HOOK |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Did you look at how |
|
FYI last time we tried doing something similar in txn_box we ran into this issue when the origin response was empty: #8081 |
|
Yes, I looked at
The ERROR approach in
Another important difference is how With
This PR takes the cleaner approach: check |
Correction: summary of all three commits in this PRCommit 1: Allow set-body to replace origin server response bodies Added Commit 2: Standardize set-body-from to preserve origin response headers Changed Commit 3: Skip origin connection when set-body is used at remap hook When New Rewrote
Updated All three test suites pass on both regular and ASAN builds. |
Previously, header_rewrite's set-body operator (which calls TSHttpTxnErrorBodySet) only worked for ATS-generated responses because internal_msg_buffer was only consumed by setup_internal_transfer(). When the origin sent a real response, the body was streamed through the tunnel and internal_msg_buffer was ignored. This change adds a check for internal_msg_buffer in the SERVER_READ and TRANSFORM_READ paths of handle_api_return(). When a plugin has set the internal message buffer, ATS now drains the origin response body (if possible) and uses setup_internal_transfer() instead of the tunnel. Key changes: - Check internal_msg_buffer in SERVER_READ and TRANSFORM_READ paths - Add do_drain_server_response_body() to synchronously drain small origin bodies so the server connection can be reused - Widen release_server_session() pooling condition to allow connection reuse after body drain (not just 304/HEAD) - Add TS_HTTP_READ_RESPONSE_HDR_HOOK to set-body's allowed hooks - Add autest covering origin body replacement at both hooks - Update set-body documentation with origin response examples
Change set-body-from to reenable with TS_EVENT_HTTP_CONTINUE instead of TS_EVENT_HTTP_ERROR, so the origin response headers and status code are preserved when the body is replaced. This aligns set-body-from behavior with set-body. Changes: - handleFetchEvents: reenable with TS_EVENT_HTTP_CONTINUE on success - Remove TSHttpTxnStatusSet workaround that was needed for error path - Add TS_HTTP_SEND_RESPONSE_HDR_HOOK support to set-body-from - Update test to expect 200 OK (was 500 INKApi Error) - Update documentation to reflect new behavior
When set-body (or any plugin using TSHttpTxnErrorBodySet()) is used at a pre-origin hook like REMAP_PSEUDO_HOOK, ATS now skips the origin connection entirely and serves a synthetic response. Previously, ATS would still open a TCP connection to the origin, exchange headers, and then throw away the body -- wasting resources. The change adds a check at the top of how_to_open_connection() in HttpTransact.cc: if internal_msg_buffer is already set, build a synthetic response and return INTERNAL_CACHE_NOOP. This works with set-status to support any status code (defaults to 200 OK). Also improves set-body-from test coverage: - Add SEND_RESPONSE_HDR_HOOK tests (previously untested) - Replace fragile gold-file comparisons with ContainsExpression/ ExcludesExpression verification of status codes and body content - Add new set-body-remap replay-based test for the origin-skip behavior - Update documentation with new synthetic response scenario and example
When remap fails with remap_required=1 but a plugin tunnel exists (TSHttpTxnServerIntercept or TSHttpTxnIntercept), the error response body from build_error_response() was left in internal_msg_buffer even though the error response headers were properly cleared. This caused how_to_open_connection() to see the stale buffer and short-circuit the plugin tunnel, serving the remap error page instead of the plugin-provided response. Fix: free internal_msg_buffer when discarding the error response in EndRemapRequest. Also add null guards around server_txn in the SERVER_READ and TRANSFORM_READ internal_msg_buffer paths, and revert the release_server_session() condition change which could allow server session reuse without body draining.
…sg_buffer In EndRemapRequest, the previous fix unconditionally cleared internal_msg_buffer in the else branch (remap success path). This destroyed the body set by set-body at remap hooks when no set-status was used, causing ATS to contact the origin instead of serving the synthetic response. Fix: Only clear internal_msg_buffer when a plugin tunnel is active (the original intent - clearing stale error bodies from build_error_response during remap failures for plugin tunnels). Also clear internal_msg_buffer in redirect_request() to prevent stale error bodies from build_error_response (e.g., connection failures) from being mistaken for plugin-set synthetic bodies by how_to_open_connection() during redirect/retry flows like the escalate plugin.
32087de to
0ed3d54
Compare
…docs - Add api_server_request_body_set guard in how_to_open_connection() to prevent TSHttpTxnServerRequestBodySet from triggering the synthetic response short-circuit (Scout finding: dual-use internal_msg_buffer) - Add chunked origin response test to header_rewrite_set_body_origin (Quinn finding: HTTP_UNDEFINED_CL drain branch untested) - Document header preservation behavior: operators should use rm-header to strip sensitive origin headers when using set-body (Sam finding) - Document set-body-from failure behavior: fetch failures forward the original origin response unmodified (Sam finding) - Document transform precedence: set-body takes priority over response transforms (Maya/Reese finding) - Fix misleading comment in rule_set_body_from_read_resp_fail.conf
- Fix Content-Length from 39 to 40 to match actual body size - Use PV-compatible chunked syntax (Transfer-Encoding header + size) instead of unsupported encoding: chunked directive
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.
Comments suppressed due to low confidence (1)
tests/gold_tests/pluginTest/header_rewrite/rules/rule_set_body_from_read_resp.conf:21
- For readability/consistency with the other header_rewrite rule files in this directory, consider indenting the operator line under the
cond(even though whitespace isn’t semantically significant for the parser). Right nowset-body-fromis not indented here but is indented in most other rules.
tests/gold_tests/pluginTest/header_rewrite/rules/rule_set_body_from_send_resp.conf
Outdated
Show resolved
Hide resolved
- Add api_server_request_body_set guard in TRANSFORM_READ and SERVER_READ branches of handle_api_return() to prevent TSHttpTxnServerRequestBodySet from triggering the internal transfer bypass (same fix as how_to_open_connection) - Fix docs: Content-Type is overwritten to text/html by setup_internal_transfer when set-body/set-body-from pass nullptr as mimetype. Document that operators should use set-header Content-Type to restore the desired type. - Fix docs: connection reuse -> clean connection shutdown (release_server_session only pools 304/auth-HEAD connections, not typical 200/403 responses) - Clarify set-body-from failure docs: certain fetch connection failures can cause ATS to generate an internal error page that FetchSM treats as a successful fetch, replacing the origin body - Fix indentation in all rule_set_body_from*.conf files for consistency
Test 6 exercises set-body at READ_RESPONSE_HDR_HOOK with a chunked origin response. The drain code detects HTTP_UNDEFINED_CL (chunked/unknown length) and sets NO_KEEPALIVE to close the connection rather than attempting synchronous drain. The replacement body is still served correctly to the client.
Summary
The
header_rewriteplugin'sset-bodyoperator (which callsTSHttpTxnErrorBodySet()) previously only worked for ATS-generated responses. When the origin returned a real response, the body was streamed through the HTTP tunnel andinternal_msg_bufferwas ignored. This meant plugins could not useset-bodyto replace or sanitize origin response bodies (e.g., stripping sensitive information from error pages).This PR adds a check for
internal_msg_bufferin theSERVER_READandTRANSFORM_READpaths ofHttpSM::handle_api_return(). When a plugin has set the internal message buffer, ATS now drains the origin response body (if possible for connection reuse) and usessetup_internal_transfer()instead of the tunnel.Changes
HttpSM::handle_api_return()— Checkinternal_msg_bufferinSERVER_READandTRANSFORM_READcases; divert tosetup_internal_transfer()when setHttpSM::do_drain_server_response_body()— New function that synchronously consumes the origin body from the buffer when fully available, enabling connection reuse. Falls back to closing the connection for chunked, unknown-length, or partially-received bodiesHttpSM::release_server_session()— Widen the pooling condition to allow connection reuse after successful body drain (not just 304/HEAD responses)header_rewrite operators.cc— AddTS_HTTP_READ_RESPONSE_HDR_HOOKtoset-body's allowed hooks so plugins can inspect origin response headers and replace the bodyset-bodydocs with origin response use case, connection reuse behavior, and example ruleREAD_RESPONSE_HDR_HOOKandSEND_RESPONSE_HDR_HOOK, including 403, 200, and empty-body casesExample Use Case
Sanitize a 403 response from an origin that leaks account information:
Test Plan
header_rewrite_set_body_origincovering 4 scenarios (403/200 at both hooks, empty body)curlagainst live proxy