[FLINK-39366][connector-http] Add retry support with exponential backoff for HTTP Sink#33
[FLINK-39366][connector-http] Add retry support with exponential backoff for HTTP Sink#33featzhang wants to merge 3 commits into
Conversation
|
@featzhang I will look at this tomorrow |
d0052d0 to
2b9ecf6
Compare
|
You're absolutely right — the lookup source has a comprehensive retry configuration system with The current PR provides a minimal viable retry implementation with just I see two paths forward: Option 1: Extend this PR to match the lookup source pattern:
Option 2: Merge basic retry now, extend in follow-up:
Which approach would you prefer? I'm happy to extend this PR to include the full retry configuration system if that's the direction you'd like to go. It would take a bit more time to implement and test, but would give us feature parity with the lookup source in one shot. |
Option 1 looks good - as it is then consistent with the lookup |
- Add SINK_HTTP_RETRY_TIMES constant to HttpConnectorConfigConstants - Add RETRY_TIMES ConfigOption (default: 3) to HttpDynamicSinkConnectorOptions - Register RETRY_TIMES as optional option in HttpDynamicTableSinkFactory - Pass retry times via Properties in HttpDynamicSink - Implement exponential backoff retry logic in HttpSinkWriter - Only retries on IOException (network errors) - Uses exponential backoff with initial delay of 1 second - Set retry.times=0 to disable retries - Update unit tests to cover retry behavior - Update documentation (English and Chinese) with new option description
- Remove premature numRecordsSendErrorsCounter.inc() before retry attempts; only count errors after all retries are exhausted - Add retry support for HTTP-level failures (getFailedRequests() non-empty), consistent with IOException retry behavior - Extract scheduleRetry() helper to reduce code duplication - Add test cases: testRetryOnHttpFailedRequests, testNoRetryWhenDisabled
…kup source Extend the HTTP sink retry model so it matches the lookup source feature set (Option 1 in PR review): - Add http.sink.retry-strategy.type (fixed-delay | exponential-delay). - Add http.sink.retry-codes and http.sink.success-codes so users can precisely define which HTTP status codes are transient vs fatal. - Add fixed-delay / exponential-delay tuning keys (initial-backoff, max-backoff, backoff-multiplier, delay) mirroring the ones the lookup source already exposes. - Reuse the existing retry utilities: HttpResponseChecker drives the status-code classification, resilience4j's IntervalFunction computes the backoff, and RetryConfigProvider is generalised through a RetryOptionKeys bag so sink and lookup share the same strategy logic. - SinkHttpClientResponse now distinguishes retryable (transient) and fatal failures; HttpSinkWriter retries only the retryable ones and counts fatal failures against numRecordsSendErrorsCounter immediately. - The legacy http.sink.error.code* options remain untouched: when the new retry-codes / success-codes options are not set the client falls back to the previous one-bucket behaviour for backwards compatibility. - Update both the English and the Chinese connector docs with the new options and a dedicated 'Retries and handling errors (Sink)' section. - Add SinkRetryConfigTest and a new HttpSinkWriterTest case for fatal failures.
2b9ecf6 to
82b9450
Compare
|
Addressed the review comment — went with Option 1 and aligned the sink retry configuration with the lookup source. Latest commit: What changedNew sink options (mirroring
Implementation notes
Tests
Ready for another look 🙏 |
What is the purpose of the change
This PR adds configurable retry support for HTTP Sink requests. When an HTTP Sink request fails due to a network error (
IOException), the connector will automatically retry the request up to a configurable number of times using an exponential backoff strategy.Brief change log
SINK_HTTP_RETRY_TIMESconstant toHttpConnectorConfigConstantsRETRY_TIMESConfigOption(default:3) toHttpDynamicSinkConnectorOptionsRETRY_TIMESas an optional option inHttpDynamicTableSinkFactoryPropertiesinHttpDynamicSinksubmitWithRetry()method inHttpSinkWriterwith exponential backoff (1s, 2s, 4s, ...)testRetryOnErrorto verify retry behaviorhttp.sink.retry.timesoptionVerifying this change
This change adds tests and can be verified as follows:
HttpSinkWriterTest#testRetryOnError: Verifies that the writer retries the configured number of times onIOExceptionHttpSinkWriterTest#testErrorMetric: Verifies that error metrics are recorded correctly when retries are exhaustedDoes this pull request potentially affect one of the following areas
@Public(Evolving): NoIOExceptionwill now be retried (configurable, default 3 times)Documentation
.mdfiles)Usage Example