Affected module
Backend
Describe the bug
In multi-pod Kubernetes deployments, TypeRegistry is a JVM-level singleton backed by ConcurrentHashMap that loads custom property definitions from the database once at startup and is never refreshed. When a custom property is created via the API, only the pod that handles the request updates its in-memory cache. Other pods remain unaware of the new property until they are restarted.
This causes custom property validation to fail on (N-1) of N pods. PATCH requests to set custom property values on entities succeed only ~1/N of the time (e.g., ~33% with 3 replicas), depending on which pod the load balancer routes to. The failing pods return:
{"code": 400, "message": "Unknown custom field <fieldName>"}
Root cause:
TypeRegistry.java maintains three static ConcurrentHashMap fields (TYPES, CUSTOM_PROPERTIES, CUSTOM_PROPERTY_SCHEMAS) that are populated during initialize() at startup and updated locally via addType() when a custom property is created on that pod. There is no:
- DB fallback on cache miss
- TTL-based expiry or periodic refresh
- Cross-pod cache invalidation
The validation path in EntityRepository.validateExtension() calls TypeRegistry.instance().getSchema(), which returns null on stale pods, triggering the error.
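The failure mode can be modeled in a few lines of plain Java. This is an illustrative stand-in, not OpenMetadata code — each "pod" is its own JVM with its own static map, addType() mutates only the local one, and the lookup on the other pod returns null:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal model of the stale-cache failure across two pods (illustrative only).
public class StaleRegistryDemo {
    static class Pod {
        final Map<String, String> schemas = new ConcurrentHashMap<>();
        // Mirrors addType(): updates only this pod's in-memory map.
        void addType(String name, String schema) { schemas.put(name, schema); }
        // Mirrors getSchema(): no DB fallback, so a miss is a hard failure.
        String getSchema(String name) { return schemas.get(name); }
    }

    public static void main(String[] args) {
        Pod podA = new Pod();
        Pod podB = new Pod();
        podA.addType("color", "{\"type\":\"string\"}"); // creation request lands on pod A
        System.out.println(podA.getSchema("color"));     // pod A resolves the schema
        System.out.println(podB.getSchema("color"));     // pod B: null -> "Unknown custom field color"
    }
}
```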
This contrasts with other caches in the codebase (e.g., SettingsCache, SubjectCache, BotTokenCache) that use Guava LoadingCache with TTL-based expiry and automatic DB reload on cache miss.
To Reproduce
- Deploy OpenMetadata with 2+ replicas on Kubernetes
- Create a custom property on any entity type (e.g., via PUT /v1/metadata/types/{id})
- Immediately send a PATCH request setting that custom property value on an entity (e.g., PATCH /v1/tables/{id} with /extension/<propertyName>)
- Repeat the PATCH request multiple times — it will fail on pods that didn't handle the creation request
Expected behavior
Custom property creation should be reflected across all pods without requiring a restart. PATCH/GET requests involving custom properties should succeed regardless of which pod handles the request.
Version:
- OpenMetadata version: Confirmed on 1.6.0 through 1.12.1-SNAPSHOT (latest main). Unfixed as of current main branch.
- Deployment: Kubernetes with multiple replicas
Additional context
I have a fix ready that adds a DB fallback with TTL-based staleness detection to TypeRegistry, following the same LoadingCache-style pattern used by SettingsCache and SubjectCache. Will open a PR shortly.
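As a rough sketch of that direction (all names and the TTL value below are illustrative assumptions, not the actual patch): on a cache miss the registry consults the database before rejecting the field, and a coarse TTL clears entries so stale definitions eventually reload even without a miss.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative sketch only; field and method names do not match the real patch.
public class FallbackRegistrySketch {
    private static final long TTL_MILLIS = 60_000; // assumed refresh interval
    private static final Map<String, String> SCHEMAS = new ConcurrentHashMap<>();
    private static volatile long lastRefreshMillis = System.currentTimeMillis();
    // Stand-in for the DAO/JDBI lookup; injectable so the sketch is self-contained.
    private static volatile Function<String, String> dbLookup = name -> null;

    public static void setDbLookup(Function<String, String> lookup) { dbLookup = lookup; }

    public static String getSchema(String name) {
        long now = System.currentTimeMillis();
        if (now - lastRefreshMillis > TTL_MILLIS) {
            SCHEMAS.clear();                 // TTL-based staleness: drop possibly stale entries
            lastRefreshMillis = now;
        }
        String schema = SCHEMAS.get(name);
        if (schema == null) {
            schema = dbLookup.apply(name);   // DB fallback on cache miss
            if (schema != null) {
                SCHEMAS.put(name, schema);   // warm the local cache for next time
            }
        }
        return schema;
    }
}
```

Guava's LoadingCache with expireAfterWrite and a DB-backed CacheLoader gives the same behavior with less hand-rolled code, which is presumably why SettingsCache and SubjectCache already follow that pattern.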