feat(31): shared invariants module in quickwit-dst #6246

Merged: g-talbot merged 55 commits into main from gtt/phase-31-execute on Apr 8, 2026
Conversation

@g-talbot (Contributor)

Summary

  • Adds invariants/ module to quickwit-dst — the single source of truth for invariant definitions shared across the verification pyramid (TLA+ specs, stateright models, production code)
  • InvariantId enum catalogs all 20 invariants (SS-1..5, TW-1..3, CS-1..3, MC-1..4, DM-1..5) with Display and descriptions
  • Shared pure functions: window_start_secs(), is_valid_window_duration(), compare_with_null_ordering()
  • check_invariant! macro always evaluates the condition (debug and release), implementing both Layer 3 (Prevention — debug_assert! panic in debug) and Layer 4 (Production — pluggable InvariantRecorder for Datadog metrics)
  • set_invariant_recorder(fn) wires up statsd at process startup; no recorder = no-op (single OnceLock load)
  • stateright is now an optional dependency behind model-checking feature — production crates pull only the invariant definitions, not the model checker
  • quickwit-parquet-engine uses shared functions and check_invariant! macro instead of inline debug_assert!
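The shared pure functions can be sketched as below. Signatures are assumptions inferred from the PR summary, not the actual quickwit-dst code; in particular, the NULL-last ordering direction is an assumed convention.

```rust
use std::cmp::Ordering;

// Sketch of the shared pure helpers named in this PR; exact signatures
// and semantics are assumptions, not the actual quickwit-dst code.

/// Snap a timestamp down to the start of its compaction window.
fn window_start_secs(timestamp_secs: i64, window_duration_secs: i64) -> i64 {
    // rem_euclid keeps pre-epoch (negative) timestamps in the right window.
    timestamp_secs - timestamp_secs.rem_euclid(window_duration_secs)
}

/// A window duration must be strictly positive to be usable.
fn is_valid_window_duration(window_duration_secs: i64) -> bool {
    window_duration_secs > 0
}

/// SS-2 style comparison: ascending, with NULLs (None) ordered last.
fn compare_with_null_ordering<T: Ord>(left: &Option<T>, right: &Option<T>) -> Ordering {
    match (left, right) {
        (Some(l), Some(r)) => l.cmp(r),
        (None, None) => Ordering::Equal,
        (None, Some(_)) => Ordering::Greater, // NULL sorts after any value
        (Some(_), None) => Ordering::Less,
    }
}
```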

Stacks on gtt/phase-31-pg-metastore (PR #6245).

Test plan

  • cargo test -p quickwit-dst — 11 invariant unit tests + 2 doctests pass
  • cargo test -p quickwit-dst --features model-checking — all 24 tests pass (stateright models use shared functions)
  • cargo test -p quickwit-parquet-engine — all 153 tests pass
  • cargo test -p quickwit-metastore --all-features — all 187 tests pass (including postgres)
  • cargo test -p quickwit-indexing --all-features — 234 pass (8 pre-existing Pulsar/GCP PubSub flakes skipped)
  • cargo test -p quickwit-integration-tests --all-features — all 46 pass
  • cargo clippy -p quickwit-dst --all-features --tests — clean
  • cargo clippy -p quickwit-parquet-engine --all-features --tests — clean

🤖 Generated with Claude Code

@g-talbot force-pushed the gtt/phase-31-pg-metastore branch from c8836d6 to eccf657 on March 31, 2026 at 20:41
@g-talbot force-pushed the gtt/phase-31-execute branch 2 times, most recently from 3eb3316 to c4418dd on March 31, 2026 at 20:55
@g-talbot force-pushed the gtt/phase-31-pg-metastore branch 2 times, most recently from 6eeaecd to f2113e5 on March 31, 2026 at 21:03
@g-talbot force-pushed the gtt/phase-31-execute branch 2 times, most recently from aa5ff45 to 142ddd6 on March 31, 2026 at 21:08
@g-talbot force-pushed the gtt/phase-31-pg-metastore branch from f2113e5 to 0d561b4 on March 31, 2026 at 21:08
@g-talbot force-pushed the gtt/phase-31-execute branch from 142ddd6 to 972bd20 on March 31, 2026 at 21:26
@g-talbot force-pushed the gtt/phase-31-pg-metastore branch from 0d561b4 to 8ba9201 on March 31, 2026 at 21:26
@g-talbot force-pushed the gtt/phase-31-execute branch from 972bd20 to 73c90c4 on March 31, 2026 at 21:29
@g-talbot force-pushed the gtt/phase-31-pg-metastore branch from 8ba9201 to 0761d11 on March 31, 2026 at 21:30
@g-talbot force-pushed the gtt/phase-31-execute branch from 73c90c4 to 44bd6bb on March 31, 2026 at 21:31
@g-talbot force-pushed the gtt/phase-31-pg-metastore branch from 0761d11 to b551b13 on March 31, 2026 at 21:31
g-talbot and others added 22 commits April 6, 2026 21:11
Resolve merge conflicts by taking main's versions of otel_metrics.rs
and arrow_metrics.rs (the PR didn't modify these files — conflicts
came from the base branch divergence). Kept PR's table_config module
export in quickwit-parquet-engine/src/lib.rs.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Pre-existing splits were serialized before the parquet_file field was
added, so their JSON doesn't contain it. Adding #[serde(default)]
makes deserialization fall back to empty string for old splits.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
When the commit timeout fires and the accumulator contains only
zero-column batches, union_fields is empty and concat_batches fails
with "must either specify a row count or at least one column".
Now flush_internal treats empty union_fields the same as empty
pending_batches — resets state and returns None.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Resolve Cargo.lock/Cargo.toml merge conflicts
- P1 (sort column lookup): Already addressed by sort fields tag_ prefix
  fix — sort field names now match Parquet column names
- P2 (window_start at epoch 0): Remove time_range.start_secs > 0 guard
  so window_start is computed for all batches when window_duration > 0

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Resolve writer.rs conflict: keep META-07 self-describing roundtrip test
- P1 (create_timestamp serde): Add #[serde(default)] to
  StoredMetricsSplit.create_timestamp for backward-compatible reads
  of pre-existing file-backed index JSON
- P1 (compaction window overlap): No change needed — Bound::Included
  vs Bound::Excluded already handles half-open interval semantics
  correctly, and the edge case (zero duration) is impossible
- fields.rs: No change — Matt noted it resolves with wide schema rebase

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
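The Bound::Included vs Bound::Excluded point can be made concrete with std::ops::Bound: a compaction window is the half-open interval [start, start + duration), so adjacent windows share no timestamp and a zero-duration edge case never arises. The helper names below are illustrative only, not the PR's code:

```rust
use std::ops::{Bound, RangeBounds};

/// Half-open compaction window [start, start + duration):
/// Included lower bound, Excluded upper bound.
fn window_bounds(start_secs: i64, duration_secs: i64) -> (Bound<i64>, Bound<i64>) {
    (Bound::Included(start_secs), Bound::Excluded(start_secs + duration_secs))
}

/// A tuple of Bounds implements RangeBounds, so `contains` comes for free.
fn window_contains(window: &(Bound<i64>, Bound<i64>), timestamp_secs: i64) -> bool {
    window.contains(&timestamp_secs)
}
```

With these semantics, timestamp 60 falls in the [60, 120) window, never in [0, 60), which is exactly why no extra overlap handling was needed.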
Base automatically changed from gtt/phase-31-pg-metastore to main April 8, 2026 15:24
g-talbot and others added 2 commits April 8, 2026 14:45
- Resolve postgres.rs conflict: keep check_invariant! macros, add
  window_duration_secs consistency check
- Group setup_dogstatsd_exporter + setup_invariant_recorder into
  single setup_metrics() function (fulmicoton-dd review)
- Rename `id` to `invariant_id` in invariant_recorder (fulmicoton-dd review)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@g-talbot merged commit 777a338 into main on Apr 8, 2026
8 checks passed
@g-talbot deleted the gtt/phase-31-execute branch on April 8, 2026 at 19:02
g-talbot added a commit that referenced this pull request Apr 9, 2026
UPSTREAM-CANDIDATES.md incorrectly stated TLA+ specs and Stateright
models don't exist. They do (contributed in #6246): ParquetDataModel.tla,
SortSchema.tla, TimeWindowedCompaction.tla, plus quickwit-dst invariants
and Stateright model tests. Updated to accurately reflect that the
remaining aspirational piece is the simulation infrastructure (SimClock,
FaultInjector, etc.).

Also removed the /sesh-mode aspirational entry — it's actively being
used and the underlying specs/models are real.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
g-talbot added a commit that referenced this pull request Apr 13, 2026
* feat: replace fixed MetricDataPoint fields with dynamic tag HashMap

* feat: replace ParquetField enum with constants and dynamic validation

* feat: derive sort order and bloom filters from batch schema

* feat: union schema accumulation and schema-agnostic ingest validation

* feat: dynamic column lookup in split writer

* feat: remove ParquetSchema dependency from indexing actors

* refactor: deduplicate test batch helpers

* lint

* feat(31): sort schema foundation — proto, parser, display, validation, window, TableConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: rustdoc link errors — use backticks for private items

* feat(31): compaction metadata types — extend split metadata, postgres model, field lookup

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): wire TableConfig into sort path, add compaction KV metadata

Wire TableConfig-driven sort order into ParquetWriter and add
self-describing Parquet file metadata for compaction:

- ParquetWriter::new() takes &TableConfig, resolves sort fields at
  construction via parse_sort_fields() + ParquetField::from_name()
- sort_batch() uses resolved fields with per-column direction (ASC/DESC)
- SS-1 debug_assert verification: re-sort and check identity permutation
- build_compaction_key_value_metadata(): embeds sort_fields, window_start,
  window_duration, num_merge_ops, row_keys (base64) in Parquet kv_metadata
- SS-5 verify_ss5_kv_consistency(): kv_metadata matches source struct
- write_to_file_with_metadata() replaces write_to_file()
- prepare_write() shared method for bytes and file paths
- ParquetWriterConfig gains to_writer_properties_with_metadata()
- ParquetSplitWriter passes TableConfig through
- All callers in quickwit-indexing updated with TableConfig::default()
- 23 storage tests pass including META-07 self-describing roundtrip

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): PostgreSQL migration 27 + compaction columns in stage/list/publish

Add compaction metadata to the PostgreSQL metastore:

Migration 27:
- 6 new columns: window_start, window_duration_secs, sort_fields,
  num_merge_ops, row_keys, zonemap_regexes
- Partial index idx_metrics_splits_compaction_scope on
  (index_uid, sort_fields, window_start) WHERE split_state = 'Published'

stage_metrics_splits:
- INSERT extended from 15 to 21 bind parameters for compaction columns
- ON CONFLICT SET updates all compaction columns

list_metrics_splits:
- PgMetricsSplit construction includes compaction fields (defaults from JSON)

Also fixes pre-existing compilation errors on upstream-10b-parquet-actors:
- Missing StageMetricsSplitsRequestExt import
- index_id vs index_uid type mismatches in publish/mark/delete
- IndexUid binding (to_string() for sqlx)
- ListMetricsSplitsResponseExt trait disambiguation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): close port gaps — split_writer metadata, compaction scope, publish validation

Close critical gaps identified during port review:

split_writer.rs:
- Store table_config on ParquetSplitWriter (not just pass-through)
- Compute window_start from batch time range using table_config.window_duration_secs
- Populate sort_fields, window_duration_secs, parquet_files on metadata before write
- Call write_to_file_with_metadata(Some(&metadata)) to embed KV metadata in Parquet
- Update size_bytes after write completes

metastore/mod.rs:
- Add window_start and sort_fields fields to ListMetricsSplitsQuery
- Add with_compaction_scope() builder method

metastore/postgres/metastore.rs:
- Add compaction scope filters (AND window_start = $N, AND sort_fields = $N) to list query
- Add replaced_split_ids count verification in publish_metrics_splits
- Bind compaction scope query parameters

ingest/config.rs:
- Add table_config: TableConfig field to ParquetIngestConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): final gap fixes — file-backed scope filter, META-07 test, dead code removal

- file_backed_index/mod.rs: Add window_start and sort_fields filtering
  to metrics_split_matches_query() for compaction scope queries
- writer.rs: Add test_meta07_self_describing_parquet_roundtrip test
  (writes compaction metadata to Parquet, reads back from cold file,
  verifies all fields roundtrip correctly)
- fields.rs: Remove dead sort_order() method (replaced by TableConfig)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): correct postgres types for window_duration_secs and zonemap_regexes

Gap 1: Change window_duration_secs from i32 to Option<i32> in both
PgMetricsSplit and InsertableMetricsSplit. Pre-Phase-31 splits now
correctly map 0 → NULL in PostgreSQL, enabling Phase 32 compaction
queries to use `WHERE window_duration_secs IS NOT NULL` instead of
the fragile `WHERE window_duration_secs > 0`.

Gap 2: Change zonemap_regexes from String to serde_json::Value in
both structs. This maps directly to JSONB in sqlx, avoiding ambiguity
when PostgreSQL JSONB operators are used in Phase 34/35 zonemap pruning.

Gap 3: Add two missing tests:
- test_insertable_from_metadata_with_compaction_fields: verifies all 6
  compaction fields round-trip through InsertableMetricsSplit
- test_insertable_from_metadata_pre_phase31_defaults: verifies pre-Phase-31
  metadata produces window_duration_secs: None, zonemap_regexes: json!({})

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
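Gap 1's sentinel-to-NULL mapping is a small Option conversion. A sketch with a hypothetical helper name (the PR performs this mapping inside PgMetricsSplit/InsertableMetricsSplit rather than a free function); sqlx binds an Option<i32> of None as SQL NULL:

```rust
/// Map the in-memory sentinel (0 = "no compaction window") to SQL NULL so
/// Phase 32 can filter with `WHERE window_duration_secs IS NOT NULL`.
/// Helper name is illustrative, not the PR's actual code.
fn window_duration_for_pg(window_duration_secs: i32) -> Option<i32> {
    (window_duration_secs > 0).then_some(window_duration_secs)
}
```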

* style: rustfmt

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* test(31): add metrics split test suite to shared metastore_test_suite! macro

11 tests covering the full metrics split lifecycle:
- stage (happy path + non-existent index error)
- stage upsert (ON CONFLICT update)
- list by state, time range, metric name, compaction scope
- publish (happy path + non-existent split error)
- mark for deletion
- delete (happy path + idempotent non-existent)

Tests are generic and run against both file-backed and PostgreSQL backends.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): read compaction columns in list_metrics_splits, fix cleanup_index FK

* fix(31): correct error types for non-existent metrics splits

- publish_metrics_splits: return NotFound (not FailedPrecondition) when
  staged splits don't exist
- delete_metrics_splits: succeed silently (idempotent) for non-existent
  splits instead of returning FailedPrecondition
- Tests now assert the correct error types on both backends

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: rustfmt metastore tests and postgres

* fix(31): address PR review — align metrics_splits with splits table

- Migration 27: add maturity_timestamp, delete_opstamp, node_id columns
  and publish_timestamp trigger to match the splits table (Paul's review)
- ListMetricsSplitsQuery: adopt FilterRange<i64> for time_range (matching
  log-side pattern), single time_range field for both read and compaction
  paths, add node_id/delete_opstamp/update_timestamp/create_timestamp/
  mature filters to close gaps with ListSplitsQuery
- Use SplitState enum instead of stringly-typed Vec<String> for split_states
- StoredMetricsSplit: add create_timestamp, node_id, delete_opstamp,
  maturity_timestamp so file-backed metastore can filter on them locally
- File-backed filter: use FilterRange::overlaps_with() for time range and
  window intersection, apply all new filters matching log-side predicate
- Postgres: intersection semantics for window queries, FilterRange-based
  SQL generation for all range filters
- Fix InsertableMetricsSplit.window_duration_secs from Option<i32> to i32
- Rename two-letter variables (ws, sf, dt) throughout

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: fix rustfmt nightly formatting

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): add shared invariants module to quickwit-dst

Extract duplicated invariant logic into a shared `invariants/` module
within `quickwit-dst`. This is the "single source of truth" layer in
the verification pyramid — used by stateright models, production
debug_assert checks, and (future) Datadog metrics emission.

Key changes:
- `invariants/registry.rs`: InvariantId enum (20 variants) with Display
- `invariants/window.rs`: shared window_start_secs(), is_valid_window_duration()
- `invariants/sort.rs`: generic compare_with_null_ordering() for SS-2
- `invariants/check.rs`: check_invariant! macro wrapping debug_assert
- stateright gated behind `model-checking` feature (optional dep)
- quickwit-parquet-engine uses shared functions and check_invariant!

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): check invariants in release builds, add pluggable recorder

The check_invariant! macro now always evaluates the condition — not just
in debug builds. This implements Layer 4 (Production) of the verification
stack: invariant checks run in release, with results forwarded to a
pluggable InvariantRecorder for Datadog metrics emission.

- Debug builds: panic on violation (debug_assert, Layer 3)
- All builds: evaluate condition, call recorder (Layer 4)
- set_invariant_recorder() wires up statsd at process startup
- No recorder registered = no-op (single OnceLock load)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
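A minimal sketch of the always-evaluate macro and the OnceLock-backed recorder described above. The real InvariantRecorder and InvariantId types are assumptions here: a plain fn pointer and a &str stand in for them.

```rust
use std::sync::OnceLock;

// Sketch only: the PR's InvariantRecorder trait and InvariantId enum are
// approximated with a fn pointer and a &str.
type InvariantRecorder = fn(invariant_id: &str, holds: bool);

static RECORDER: OnceLock<InvariantRecorder> = OnceLock::new();

/// Wire up a recorder at process startup; the first caller wins.
fn set_invariant_recorder(recorder: InvariantRecorder) {
    let _ = RECORDER.set(recorder);
}

macro_rules! check_invariant {
    ($invariant_id:expr, $cond:expr) => {{
        let holds = $cond; // always evaluated, in debug AND release builds
        if let Some(recorder) = RECORDER.get() {
            recorder($invariant_id, holds); // Layer 4: production metrics
        }
        // Layer 3: panic in debug builds only. With no recorder registered,
        // the single OnceLock load above is the whole release-mode cost.
        debug_assert!(holds, "invariant {} violated", $invariant_id);
    }};
}
```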

* feat(31): wire invariant recorder to DogStatsD metrics

Emit cloudprem.pomsky.invariant.checked and .violated counters with
invariant label via the metrics crate / DogStatsD exporter at process
startup, completing Layer 4 of the verification stack.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: license headers + cfg(not(test)) for quickwit-dst and quickwit-cli

* chore: regenerate third-party license file

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: fix rustfmt nightly formatting for quickwit-dst and quickwit-parquet-engine

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: add CLAUDE.md and docs/internals architecture documentation

Ports CLAUDE.md (development guide, coding standards, known pitfalls) and
the full docs/internals tree including ADRs, gap analyses, TLA+ specs,
verification guides, style references, and compaction architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: split CLAUDE.md into repo context + opt-in /sesh-mode skill

- Move verification-first workflow (TLA+, DST, formal specs) to /sesh-mode skill
- Keep repo knowledge in CLAUDE.md (pitfalls, reliability rules, testing, docker, commands)
- Remove Crate Map (derivable from filesystem)
- Remove Coding Style bullet summary (CODE_STYLE.md is linked)
- Fix relative links in SKILL.md for .claude/skills/sesh-mode/ path

Co-Authored-By: Claude <[email protected]>

* docs: add machete, cargo doc, and fmt details to CI checklist in CLAUDE.md

* review: parquet_file singular, proto doc link, fix metastore accessor

* style: fix rustfmt nightly comment wrapping in split metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: use plain code span for proto reference to avoid broken rustdoc link

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <[email protected]>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <[email protected]>

* style: rustfmt long match arm in default_sort_fields

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: make parquet_file field backward-compatible in MetricsSplitMetadata

Pre-existing splits were serialized before the parquet_file field was
added, so their JSON doesn't contain it. Adding #[serde(default)]
makes deserialization fall back to empty string for old splits.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: handle empty-column batches in accumulator flush

When the commit timeout fires and the accumulator contains only
zero-column batches, union_fields is empty and concat_batches fails
with "must either specify a row count or at least one column".
Now flush_internal treats empty union_fields the same as empty
pending_batches — resets state and returns None.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: rustfmt check_invariant macro argument

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* revert: move code changes to separate PR

Reverts c8bf8d7, cafcac5, a088f53 — these are code changes
(delete_metrics_splits error handling, doc comment tweaks) that
don't belong in a docs-only PR. They will land in a separate PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* remove ADR-004 from upstream PR — moving to DataDog/pomsky

This ADR contains company-specific information and should live
in the private fork, not in the upstream quickwit-oss repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: rebrand for upstream — remove Datadog/Pomsky/Quickhouse references

- Rewrite CLAUDE.md as generic Quickwit AI development guide
- Replace Quickhouse-Pomsky -> Quickwit branding across all docs
- Replace "Datadog" observability references with generic
  "production observability" language
- Remove "Husky (Datadog)" qualifier from gap docs (keep Husky
  citations — the blog post is public)
- Generalize internal knowledge (query rate numbers, product-specific
  lateness guarantees)
- Remove PomChi reference, private Google Doc link
- Add docs/internals/UPSTREAM-CANDIDATES.md for pomsky tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: remove ClickHouse references, track aspirational items

- Remove all ClickHouse/ClickStack references from gap docs and ADRs
  (keep Prometheus, Mimir, InfluxDB, Husky as prior art)
- Restore gap-005 Option C (compaction-time dedup) without ClickHouse citation
- Mark /sesh-mode reference in CLAUDE.md as aspirational
- Add aspirational items section to UPSTREAM-CANDIDATES.md tracking
  items described in docs but not yet implemented (TLA+ specs, DST,
  Kani, Bloodhound, performance baselines, benchmark binaries)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: fix aspirational items — TLA+ specs and Stateright models exist

UPSTREAM-CANDIDATES.md incorrectly stated TLA+ specs and Stateright
models don't exist. They do (contributed in #6246): ParquetDataModel.tla,
SortSchema.tla, TimeWindowedCompaction.tla, plus quickwit-dst invariants
and Stateright model tests. Updated to accurately reflect that the
remaining aspirational piece is the simulation infrastructure (SimClock,
FaultInjector, etc.).

Also removed the /sesh-mode aspirational entry — it's actively being
used and the underlying specs/models are real.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* chore: add .planning to .gitignore

Prevents GSD planning artifacts from being committed to the repository.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* revert: remove pomsky-specific Makefile changes

Reverts test env vars (CP_ENABLE_REVERSE_CONNECTION) and
load-cloudprem-ui target — these are pomsky-specific and
don't belong in upstream.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: add pitfall rule against silently swallowing unexpected state

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

---------

Co-authored-by: Matthew Kim <[email protected]>
Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
Co-authored-by: Verdonk Lucas <[email protected]>
g-talbot added a commit that referenced this pull request Apr 13, 2026
* feat: replace fixed MetricDataPoint fields with dynamic tag HashMap

* feat: replace ParquetField enum with constants and dynamic validation

* feat: derive sort order and bloom filters from batch schema

* feat: union schema accumulation and schema-agnostic ingest validation

* feat: dynamic column lookup in split writer

* feat: remove ParquetSchema dependency from indexing actors

* refactor: deduplicate test batch helpers

* lint

* feat(31): sort schema foundation — proto, parser, display, validation, window, TableConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: rustdoc link errors — use backticks for private items

* feat(31): compaction metadata types — extend split metadata, postgres model, field lookup

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): wire TableConfig into sort path, add compaction KV metadata

Wire TableConfig-driven sort order into ParquetWriter and add
self-describing Parquet file metadata for compaction:

- ParquetWriter::new() takes &TableConfig, resolves sort fields at
  construction via parse_sort_fields() + ParquetField::from_name()
- sort_batch() uses resolved fields with per-column direction (ASC/DESC)
- SS-1 debug_assert verification: re-sort and check identity permutation
- build_compaction_key_value_metadata(): embeds sort_fields, window_start,
  window_duration, num_merge_ops, row_keys (base64) in Parquet kv_metadata
- SS-5 verify_ss5_kv_consistency(): kv_metadata matches source struct
- write_to_file_with_metadata() replaces write_to_file()
- prepare_write() shared method for bytes and file paths
- ParquetWriterConfig gains to_writer_properties_with_metadata()
- ParquetSplitWriter passes TableConfig through
- All callers in quickwit-indexing updated with TableConfig::default()
- 23 storage tests pass including META-07 self-describing roundtrip

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): PostgreSQL migration 27 + compaction columns in stage/list/publish

Add compaction metadata to the PostgreSQL metastore:

Migration 27:
- 6 new columns: window_start, window_duration_secs, sort_fields,
  num_merge_ops, row_keys, zonemap_regexes
- Partial index idx_metrics_splits_compaction_scope on
  (index_uid, sort_fields, window_start) WHERE split_state = 'Published'

stage_metrics_splits:
- INSERT extended from 15 to 21 bind parameters for compaction columns
- ON CONFLICT SET updates all compaction columns

list_metrics_splits:
- PgMetricsSplit construction includes compaction fields (defaults from JSON)

Also fixes pre-existing compilation errors on upstream-10b-parquet-actors:
- Missing StageMetricsSplitsRequestExt import
- index_id vs index_uid type mismatches in publish/mark/delete
- IndexUid binding (to_string() for sqlx)
- ListMetricsSplitsResponseExt trait disambiguation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): close port gaps — split_writer metadata, compaction scope, publish validation

Close critical gaps identified during port review:

split_writer.rs:
- Store table_config on ParquetSplitWriter (not just pass-through)
- Compute window_start from batch time range using table_config.window_duration_secs
- Populate sort_fields, window_duration_secs, parquet_files on metadata before write
- Call write_to_file_with_metadata(Some(&metadata)) to embed KV metadata in Parquet
- Update size_bytes after write completes

metastore/mod.rs:
- Add window_start and sort_fields fields to ListMetricsSplitsQuery
- Add with_compaction_scope() builder method

metastore/postgres/metastore.rs:
- Add compaction scope filters (AND window_start = $N, AND sort_fields = $N) to list query
- Add replaced_split_ids count verification in publish_metrics_splits
- Bind compaction scope query parameters

ingest/config.rs:
- Add table_config: TableConfig field to ParquetIngestConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): final gap fixes — file-backed scope filter, META-07 test, dead code removal

- file_backed_index/mod.rs: Add window_start and sort_fields filtering
  to metrics_split_matches_query() for compaction scope queries
- writer.rs: Add test_meta07_self_describing_parquet_roundtrip test
  (writes compaction metadata to Parquet, reads back from cold file,
  verifies all fields roundtrip correctly)
- fields.rs: Remove dead sort_order() method (replaced by TableConfig)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): correct postgres types for window_duration_secs and zonemap_regexes

Gap 1: Change window_duration_secs from i32 to Option<i32> in both
PgMetricsSplit and InsertableMetricsSplit. Pre-Phase-31 splits now
correctly map 0 → NULL in PostgreSQL, enabling Phase 32 compaction
queries to use `WHERE window_duration_secs IS NOT NULL` instead of
the fragile `WHERE window_duration_secs > 0`.

Gap 2: Change zonemap_regexes from String to serde_json::Value in
both structs. This maps directly to JSONB in sqlx, avoiding ambiguity
when PostgreSQL JSONB operators are used in Phase 34/35 zonemap pruning.

Gap 3: Add two missing tests:
- test_insertable_from_metadata_with_compaction_fields: verifies all 6
  compaction fields round-trip through InsertableMetricsSplit
- test_insertable_from_metadata_pre_phase31_defaults: verifies pre-Phase-31
  metadata produces window_duration_secs: None, zonemap_regexes: json!({})

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: rustfmt

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* test(31): add metrics split test suite to shared metastore_test_suite! macro

11 tests covering the full metrics split lifecycle:
- stage (happy path + non-existent index error)
- stage upsert (ON CONFLICT update)
- list by state, time range, metric name, compaction scope
- publish (happy path + non-existent split error)
- mark for deletion
- delete (happy path + idempotent non-existent)

Tests are generic and run against both file-backed and PostgreSQL backends.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(31): read compaction columns in list_metrics_splits, fix cleanup_index FK

* fix(31): correct error types for non-existent metrics splits

- publish_metrics_splits: return NotFound (not FailedPrecondition) when
  staged splits don't exist
- delete_metrics_splits: succeed silently (idempotent) for non-existent
  splits instead of returning FailedPrecondition
- Tests now assert the correct error types on both backends

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: rustfmt metastore tests and postgres

* fix(31): address PR review — align metrics_splits with splits table

- Migration 27: add maturity_timestamp, delete_opstamp, node_id columns
  and publish_timestamp trigger to match the splits table (Paul's review)
- ListMetricsSplitsQuery: adopt FilterRange<i64> for time_range (matching
  log-side pattern), single time_range field for both read and compaction
  paths, add node_id/delete_opstamp/update_timestamp/create_timestamp/
  mature filters to close gaps with ListSplitsQuery
- Use SplitState enum instead of stringly-typed Vec<String> for split_states
- StoredMetricsSplit: add create_timestamp, node_id, delete_opstamp,
  maturity_timestamp so file-backed metastore can filter on them locally
- File-backed filter: use FilterRange::overlaps_with() for time range and
  window intersection, apply all new filters matching log-side predicate
- Postgres: intersection semantics for window queries, FilterRange-based
  SQL generation for all range filters
- Fix InsertableMetricsSplit.window_duration_secs from Option<i32> to i32
- Rename two-letter variables (ws, sf, dt) throughout

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: fix rustfmt nightly formatting

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): add shared invariants module to quickwit-dst

Extract duplicated invariant logic into a shared `invariants/` module
within `quickwit-dst`. This is the "single source of truth" layer in
the verification pyramid — used by stateright models, production
debug_assert checks, and (future) Datadog metrics emission.

Key changes:
- `invariants/registry.rs`: InvariantId enum (20 variants) with Display
- `invariants/window.rs`: shared window_start_secs(), is_valid_window_duration()
- `invariants/sort.rs`: generic compare_with_null_ordering() for SS-2
- `invariants/check.rs`: check_invariant! macro wrapping debug_assert
- stateright gated behind `model-checking` feature (optional dep)
- quickwit-parquet-engine uses shared functions and check_invariant!

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): check invariants in release builds, add pluggable recorder

The check_invariant! macro now always evaluates the condition — not just
in debug builds. This implements Layer 4 (Production) of the verification
stack: invariant checks run in release, with results forwarded to a
pluggable InvariantRecorder for Datadog metrics emission.

- Debug builds: panic on violation (debug_assert, Layer 3)
- All builds: evaluate condition, call recorder (Layer 4)
- set_invariant_recorder() wires up statsd at process startup
- No recorder registered = no-op (single OnceLock load)
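
The layering above can be sketched as follows. The real macro and recorder live in quickwit-dst's invariants/check.rs; the `fn`-pointer recorder type and the exact signatures here are assumptions.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Hypothetical recorder signature: invariant id plus whether it held.
type InvariantRecorder = fn(invariant_id: &str, holds: bool);

static RECORDER: OnceLock<InvariantRecorder> = OnceLock::new();

/// Register a recorder once at process startup; later calls are ignored.
fn set_invariant_recorder(recorder: InvariantRecorder) {
    let _ = RECORDER.set(recorder);
}

macro_rules! check_invariant {
    ($id:expr, $cond:expr) => {{
        // Layer 4: always evaluate, in debug and release builds alike,
        // and forward the result to the recorder if one is registered
        // (a single OnceLock load; no recorder means a no-op).
        let holds = $cond;
        if let Some(recorder) = RECORDER.get() {
            recorder($id, holds);
        }
        // Layer 3: panic on violation in debug builds only.
        debug_assert!(holds, "invariant violated: {}", $id);
    }};
}

static CHECKS: AtomicUsize = AtomicUsize::new(0);

fn main() {
    set_invariant_recorder(|_id, _holds| {
        CHECKS.fetch_add(1, Ordering::Relaxed);
    });
    check_invariant!("SS-1", 1 + 1 == 2);
    check_invariant!("TW-1", 2 > 1);
    assert_eq!(CHECKS.load(Ordering::Relaxed), 2);
}
```

`OnceLock::set` returning `Err` on a second registration is what makes "wire up statsd once at startup" safe even if called twice.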

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat(31): wire invariant recorder to DogStatsD metrics

Emit cloudprem.pomsky.invariant.checked and .violated counters with
invariant label via the metrics crate / DogStatsD exporter at process
startup, completing Layer 4 of the verification stack.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: license headers + cfg(not(test)) for quickwit-dst and quickwit-cli

* chore: regenerate third-party license file

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: fix rustfmt nightly formatting for quickwit-dst and quickwit-parquet-engine

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: add CLAUDE.md and docs/internals architecture documentation

Ports CLAUDE.md (development guide, coding standards, known pitfalls) and
the full docs/internals tree including ADRs, gap analyses, TLA+ specs,
verification guides, style references, and compaction architecture.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: split CLAUDE.md into repo context + opt-in /sesh-mode skill

- Move verification-first workflow (TLA+, DST, formal specs) to /sesh-mode skill
- Keep repo knowledge in CLAUDE.md (pitfalls, reliability rules, testing, docker, commands)
- Remove Crate Map (derivable from filesystem)
- Remove Coding Style bullet summary (CODE_STYLE.md is linked)
- Fix relative links in SKILL.md for .claude/skills/sesh-mode/ path

Co-Authored-By: Claude <[email protected]>

* docs: add machete, cargo doc, and fmt details to CI checklist in CLAUDE.md

* review: parquet_file singular, proto doc link, fix metastore accessor

* style: fix rustfmt nightly comment wrapping in split metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: use plain code span for proto reference to avoid broken rustdoc link

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <[email protected]>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <[email protected]>

* style: rustfmt long match arm in default_sort_fields

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: make parquet_file field backward-compatible in MetricsSplitMetadata

Pre-existing splits were serialized before the parquet_file field was
added, so their JSON doesn't contain it. Adding #[serde(default)]
makes deserialization fall back to empty string for old splits.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: handle empty-column batches in accumulator flush

When the commit timeout fires and the accumulator contains only
zero-column batches, union_fields is empty and concat_batches fails
with "must either specify a row count or at least one column".
Now flush_internal treats empty union_fields the same as empty
pending_batches — resets state and returns None.
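
The shape of the guard, sketched without arrow dependencies: `Batch` stands in for arrow's RecordBatch, and the names here are illustrative rather than the real flush_internal signature.

```rust
#[derive(Clone)]
struct Batch {
    columns: Vec<String>,
}

fn flush_internal(pending_batches: Vec<Batch>) -> Option<Batch> {
    // Union of all column names across the pending batches.
    let mut union_fields: Vec<String> = pending_batches
        .iter()
        .flat_map(|batch| batch.columns.iter().cloned())
        .collect();
    union_fields.sort();
    union_fields.dedup();
    // Only zero-column batches pending: the union is empty, and a concat
    // over zero columns would fail. Treat it the same as no pending
    // batches at all: reset and return None.
    if pending_batches.is_empty() || union_fields.is_empty() {
        return None;
    }
    Some(Batch { columns: union_fields })
}

fn main() {
    assert!(flush_internal(vec![]).is_none());
    assert!(flush_internal(vec![Batch { columns: vec![] }]).is_none());
}
```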

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: rustfmt check_invariant macro argument

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* revert: move code changes to separate PR

Reverts c8bf8d7, cafcac5, a088f53 — these are code changes
(delete_metrics_splits error handling, doc comment tweaks) that
don't belong in a docs-only PR. They will land in a separate PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* remove ADR-004 from upstream PR — moving to DataDog/pomsky

This ADR contains company-specific information and should live
in the private fork, not in the upstream quickwit-oss repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: rebrand for upstream — remove Datadog/Pomsky/Quickhouse references

- Rewrite CLAUDE.md as generic Quickwit AI development guide
- Replace Quickhouse-Pomsky -> Quickwit branding across all docs
- Replace "Datadog" observability references with generic
  "production observability" language
- Remove "Husky (Datadog)" qualifier from gap docs (keep Husky
  citations — the blog post is public)
- Generalize internal knowledge (query rate numbers, product-specific
  lateness guarantees)
- Remove PomChi reference, private Google Doc link
- Add docs/internals/UPSTREAM-CANDIDATES.md for pomsky tracking

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: remove ClickHouse references, track aspirational items

- Remove all ClickHouse/ClickStack references from gap docs and ADRs
  (keep Prometheus, Mimir, InfluxDB, Husky as prior art)
- Restore gap-005 Option C (compaction-time dedup) without ClickHouse citation
- Mark /sesh-mode reference in CLAUDE.md as aspirational
- Add aspirational items section to UPSTREAM-CANDIDATES.md tracking
  items described in docs but not yet implemented (TLA+ specs, DST,
  Kani, Bloodhound, performance baselines, benchmark binaries)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: fix aspirational items — TLA+ specs and Stateright models exist

UPSTREAM-CANDIDATES.md incorrectly stated TLA+ specs and Stateright
models don't exist. They do (contributed in #6246): ParquetDataModel.tla,
SortSchema.tla, TimeWindowedCompaction.tla, plus quickwit-dst invariants
and Stateright model tests. Updated to accurately reflect that the
remaining aspirational piece is the simulation infrastructure (SimClock,
FaultInjector, etc.).

Also removed the /sesh-mode aspirational entry — it's actively being
used and the underlying specs/models are real.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* chore: add .planning to .gitignore

Prevents GSD planning artifacts from being committed to the repository.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* revert: remove pomsky-specific Makefile changes

Reverts test env vars (CP_ENABLE_REVERSE_CONNECTION) and
load-cloudprem-ui target — these are pomsky-specific and
don't belong in upstream.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* docs: add pitfall rule against silently swallowing unexpected state

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* feat: enforce physical column ordering in Parquet files for two-GET streaming merge (#6281)

* feat: enforce physical column ordering in Parquet files

Sort schema columns are written first (in their configured sort order),
followed by all remaining data columns in alphabetical order. This
physical layout enables a two-GET streaming merge during compaction:
the footer GET provides the schema and offsets, then a single streaming
GET from the start of the row group delivers sort columns first —
allowing the compactor to compute the global merge order before data
columns arrive.
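
The ordering rule above, sketched as a standalone function; the function name and signature are illustrative, not the actual quickwit-parquet-engine code.

```rust
/// Physical column order: sort-schema columns first, in their configured
/// order, then the remaining data columns alphabetically.
fn physical_column_order(sort_columns: &[&str], all_columns: &[&str]) -> Vec<String> {
    let mut ordered: Vec<String> = sort_columns
        .iter()
        .copied()
        .filter(|col| all_columns.contains(col))
        .map(|col| col.to_string())
        .collect();
    let mut rest: Vec<String> = all_columns
        .iter()
        .copied()
        .filter(|col| !sort_columns.contains(col))
        .map(|col| col.to_string())
        .collect();
    rest.sort(); // alphabetical order for the non-sort data columns
    ordered.extend(rest);
    ordered
}

fn main() {
    let order = physical_column_order(
        &["service", "host"],
        &["timestamp", "host", "value", "service"],
    );
    assert_eq!(order, ["service", "host", "timestamp", "value"]);
}
```

Because the layout is deterministic, the compactor knows before the streaming GET begins exactly which byte ranges hold the sort columns.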

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* test: verify input column order is actually scrambled

The sanity check only asserted presence, not ordering. Now it
verifies that host appears before service in the scrambled input,
which is the opposite of the sort-schema order (service before host).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* style: rustfmt test code

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: collapse nested if to satisfy clippy::collapsible_if

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>

---------

Co-authored-by: Matthew Kim <[email protected]>
Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
Co-authored-by: Verdonk Lucas <[email protected]>