Commit 5db0452

[SPARK-56146] Add e2e test for resourceRetainPolicy: OnFailure with a failed SparkApp
### What changes were proposed in this pull request?

This PR adds an e2e test for `resourceRetainPolicy: OnFailure` with a failed `SparkApplication`. The test submits a `SparkApplication` with a non-existent main class (`NonExistentClass`) and `resourceRetainPolicy: OnFailure`, then asserts that the state transition history ends with `Failed` followed by `TerminatedWithoutReleaseResources` (i.e., resources are retained on failure).

### Why are the changes needed?

Currently, the e2e tests only cover `resourceRetainPolicy: Always`. This PR increases test coverage by verifying that the `OnFailure` policy correctly retains resources when an application fails.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs with the newly added e2e test: `tests/e2e/resource-retain-on-failure/chainsaw-test.yaml`.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (claude-opus-4-6)

Closes #579 from dongjoon-hyun/SPARK-56146.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent eb5b90d commit 5db0452

4 files changed

Lines changed: 126 additions & 0 deletions

.github/workflows/build_and_test.yml

Lines changed: 5 additions & 0 deletions
@@ -93,6 +93,7 @@ jobs:
 - python
 - state-transition
 - resource-retain-duration
+- resource-retain-on-failure
 - resource-selector
 - watched-namespaces
 - driver-start-timeout
@@ -105,6 +106,8 @@ jobs:
   test-group: state-transition
 - mode: dynamic
   test-group: resource-retain-duration
+- mode: dynamic
+  test-group: resource-retain-on-failure
 - mode: dynamic
   test-group: resource-selector
 - mode: dynamic
@@ -121,6 +124,8 @@ jobs:
   test-group: state-transition
 - mode: selector
   test-group: resource-retain-duration
+- mode: selector
+  test-group: resource-retain-on-failure
 - mode: selector
   test-group: watched-namespaces
 - mode: selector
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+apiVersion: spark.apache.org/v1
+kind: SparkApplication
+metadata:
+  name: ($SPARK_APPLICATION_NAME)
+  namespace: ($SPARK_APP_NAMESPACE)
+status:
+  stateTransitionHistory:
+    (*.currentStateSummary):
+      - "Submitted"
+      - "DriverRequested"
+      - "DriverStarted"
+      - "DriverReady"
+      - "RunningHealthy"
+      - "Failed"
+      - "TerminatedWithoutReleaseResources"
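
The assertion file above projects `currentStateSummary` out of every entry in `status.stateTransitionHistory` and compares the result with a fixed list. As a rough illustration of that check (not how Chainsaw evaluates it internally), here is a minimal Python sketch. It assumes the history is a mapping from integer-like keys to state objects; the helper names `summaries_in_order` and `history_matches` are hypothetical and exist only for this example.

```python
# Illustrative only: mimics the "(*.currentStateSummary)" comparison in plain Python.
# Assumption: the history is a dict keyed by integer-like strings ("0", "1", ...).

EXPECTED_HISTORY = [
    "Submitted",
    "DriverRequested",
    "DriverStarted",
    "DriverReady",
    "RunningHealthy",
    "Failed",
    "TerminatedWithoutReleaseResources",
]


def summaries_in_order(history):
    """Return the currentStateSummary of each entry, ordered by numeric key."""
    return [history[key]["currentStateSummary"] for key in sorted(history, key=int)]


def history_matches(history, expected=EXPECTED_HISTORY):
    """True only if the observed summaries equal the expected list exactly."""
    return summaries_in_order(history) == list(expected)
```

Under these assumptions, a history that ends in any state other than `TerminatedWithoutReleaseResources` fails the comparison, which is the behaviour the e2e test relies on.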
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+apiVersion: chainsaw.kyverno.io/v1alpha1
+kind: Test
+metadata:
+  name: resource-retain-on-failure-test
+spec:
+  scenarios:
+  - bindings:
+    - name: TEST_NAME
+      value: failed-on-failure-policy
+    - name: APPLICATION_FILE_NAME
+      value: spark-example-retain-on-failure.yaml
+    - name: SPARK_APPLICATION_NAME
+      value: spark-example-retain-on-failure
+  steps:
+  - try:
+    - script:
+        env:
+        - name: FILE_NAME
+          value: ($APPLICATION_FILE_NAME)
+        content: kubectl apply -f $FILE_NAME
+    - assert:
+        bindings:
+        - name: SPARK_APP_NAMESPACE
+          value: default
+        timeout: 120s
+        file: "../assertions/spark-application/spark-state-transition-failed-with-retain-check.yaml"
+    catch:
+    - describe:
+        apiVersion: spark.apache.org/v1
+        kind: SparkApplication
+        namespace: default
+    finally:
+    - script:
+        env:
+        - name: SPARK_APPLICATION_NAME
+          value: ($SPARK_APPLICATION_NAME)
+        timeout: 120s
+        content: |
+          kubectl delete sparkapplication $SPARK_APPLICATION_NAME --ignore-not-found=true
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+apiVersion: spark.apache.org/v1
+kind: SparkApplication
+metadata:
+  name: spark-example-retain-on-failure
+  namespace: default
+spec:
+  mainClass: "org.apache.spark.examples.NonExistentClass"
+  jars: "local:///opt/spark/examples/jars/spark-examples.jar"
+  applicationTolerations:
+    resourceRetainPolicy: OnFailure
+  sparkConf:
+    spark.executor.instances: "1"
+    spark.kubernetes.container.image: "apache/spark:{{SPARK_VERSION}}-scala"
+    spark.kubernetes.authenticate.driver.serviceAccountName: "spark"
+  runtimeVersions:
+    sparkVersion: "4.1.1"
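
The manifest above sets `resourceRetainPolicy: OnFailure` and points at a main class that does not exist, so the application is expected to fail and keep its resources. The decision being exercised can be sketched roughly as follows. This is a simplified model under stated assumptions, not the operator's implementation: the commit confirms only the `Always` and `OnFailure` policies and the `TerminatedWithoutReleaseResources` state, while the `Never` policy and the `ResourceReleased` state name are assumptions made for the sake of the example, as is the function name `terminal_state`.

```python
from enum import Enum


class RetainPolicy(Enum):
    ALWAYS = "Always"
    ON_FAILURE = "OnFailure"
    NEVER = "Never"  # assumption: not mentioned in this commit


def terminal_state(policy: RetainPolicy, failed: bool) -> str:
    """Return the terminal state summary for a finished application.

    Resources are retained when the policy is Always, or when the policy
    is OnFailure and the application actually failed.
    """
    retain = policy is RetainPolicy.ALWAYS or (
        policy is RetainPolicy.ON_FAILURE and failed
    )
    if retain:
        return "TerminatedWithoutReleaseResources"
    return "ResourceReleased"  # assumption: name of the released-resources state
```

In this model, `terminal_state(RetainPolicy.ON_FAILURE, failed=True)` yields `TerminatedWithoutReleaseResources`, which is the final entry the new e2e test asserts on.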
