Commit 6aa577b

docs: Add docs about global singletons to development guide (#3809)
1 parent 7a87461 commit 6aa577b

1 file changed

Lines changed: 68 additions & 0 deletions

docs/source/contributor-guide/development.md

@@ -101,6 +101,74 @@ The runtime is created once per executor JVM in a `Lazy<Runtime>` static:
| Storing `JNIEnv` in an operator | **No** | `JNIEnv` is thread-specific |
| Capturing state at plan creation time | Yes | Runs on executor thread, store in struct |

## Global singletons

Comet code runs in both the driver and executor JVM processes, and different parts of the
codebase run in each. Global singletons have **process lifetime**: they are created once and
never dropped until the JVM exits. Since multiple Spark jobs, queries, and tasks share the same
process, it is difficult to reason about what state a singleton holds and whether it is
still valid.

### How to recognize them

**Rust:** `static` variables using `OnceLock`, `LazyLock`, `OnceCell`, `Lazy`, or `lazy_static!`:

```rust
static TOKIO_RUNTIME: OnceLock<Runtime> = OnceLock::new();
static TASK_SHARED_MEMORY_POOLS: Lazy<Mutex<HashMap<i64, PerTaskMemoryPool>>> = Lazy::new(..);
```

**Java:** `static` fields, especially mutable collections:

```java
private static final HashMap<Long, HashMap<Long, ScalarSubquery>> subqueryMap = new HashMap<>();
```

**Scala:** `object` declarations holding mutable state (every `object`, including a companion object, is a JVM singleton):

```scala
object MyCache {
  private val cache = new ConcurrentHashMap[String, Value]()
}
```

### Why they are dangerous

- **Credential staleness.** A singleton caching an authenticated client will hold stale
  credentials after token rotation, causing silent failures mid-job.
- **Unbounded growth.** A cache keyed by file path or configuration grows with every query
  but never shrinks. Over hours of process uptime this becomes a memory leak.
- **Cross-job contamination.** Different Spark jobs on the same process may use different
  configurations. A singleton initialized by the first job silently serves wrong state to
  subsequent jobs.
- **Testing difficulty.** Global state persists across test cases, making tests
  order-dependent.
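
The cross-job contamination case can be reproduced in a few lines. This is a minimal sketch using only the standard library, not Comet code: `BATCH_SIZE` and `batch_size_for_job` are hypothetical names, and the batch size stands in for any setting that may differ between jobs.

```rust
use std::sync::OnceLock;

// Hypothetical process-wide cache of a job-level setting (not a real Comet static).
static BATCH_SIZE: OnceLock<usize> = OnceLock::new();

/// Returns the cached batch size. Only the first caller's value is ever stored;
/// later callers get that value back regardless of what they pass in.
fn batch_size_for_job(configured: usize) -> usize {
    *BATCH_SIZE.get_or_init(|| configured)
}
```

If the first job calls `batch_size_for_job(8192)` and a later job in the same process calls `batch_size_for_job(1024)`, both calls return `8192`, and nothing signals the mismatch to the second job.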

### When a singleton is acceptable

Some state genuinely has process lifetime:

| Singleton                                     | Why it is safe                                      |
| --------------------------------------------- | --------------------------------------------------- |
| `TOKIO_RUNTIME`                               | One runtime per executor, no configuration variance |
| `JAVA_VM` / `JVM_CLASSES`                     | One JVM per process, set once at JNI load           |
| `OperatorRegistry` / `ExpressionRegistry`     | Immutable after initialization                      |
| Compiled `Regex` patterns (`LazyLock<Regex>`) | Stateless and immutable                             |

### When to avoid a singleton

If any of these apply, do **not** use a global singleton:

- The state depends on configuration that can vary between jobs or queries
- The state holds credentials or authenticated connections that can expire or be rotated without the cached copy being invalidated
- The state grows proportionally to the number of queries or files processed
- The state needs cleanup or refresh during process lifetime

Instead, scope state to the plan or task by adding the cache as a field in an existing session or context object.
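
One way to picture this scoping is a context struct that owns both its configuration and its cache. The sketch below is illustrative only: `TaskContext`, its fields, and its methods are hypothetical and do not correspond to an existing Comet type, and the path length stands in for an expensive lookup such as reading file metadata.

```rust
use std::collections::HashMap;

/// Hypothetical per-task context (illustrative only, not a Comet type).
/// Everything it owns is dropped when the task finishes.
struct TaskContext {
    /// Configuration captured for this task only.
    batch_size: usize,
    /// Cache scoped to the task: path -> derived value.
    path_len_cache: HashMap<String, usize>,
}

impl TaskContext {
    fn new(batch_size: usize) -> Self {
        TaskContext {
            batch_size,
            path_len_cache: HashMap::new(),
        }
    }

    fn batch_size(&self) -> usize {
        self.batch_size
    }

    /// Returns the cached value for `path`, computing it on first use.
    fn cached_len(&mut self, path: &str) -> usize {
        *self
            .path_len_cache
            .entry(path.to_string())
            .or_insert_with(|| path.len())
    }

    /// Number of entries currently held by this task's cache.
    fn cache_entries(&self) -> usize {
        self.path_len_cache.len()
    }
}
```

Because each context owns its cache, two tasks with different settings cannot observe each other's entries, and the memory is released when the context is dropped rather than at JVM exit.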

If a singleton is truly needed, add a comment explaining why `static` is the right lifetime,
whether the cache is bounded, and how credential refresh is handled (if applicable).
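
Such a comment might look like the following. This is a sketch under stated assumptions: `AVAILABLE_CPUS` and `available_cpus` are hypothetical names chosen for illustration, and the CPU count is treated as fixed for the life of the process.

```rust
use std::sync::OnceLock;
use std::thread;

// Why `static`: in this sketch the CPU count visible to the process is treated
// as fixed, and it does not depend on any job, query, or task configuration.
// Bounded: holds a single `usize` and never grows.
// Credentials: none are stored, so no refresh is needed.
static AVAILABLE_CPUS: OnceLock<usize> = OnceLock::new();

/// Returns the CPU count, computed once per process.
/// Falls back to 1 if the count cannot be determined.
fn available_cpus() -> usize {
    *AVAILABLE_CPUS.get_or_init(|| {
        thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1)
    })
}
```

The three comment lines answer the three questions in order, so a reviewer can check the justification without reading the call sites.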

## Development Setup

1. Make sure `JAVA_HOME` is set and points to a JDK version listed in the [support matrix](../user-guide/latest/installation.md)
