CSHARP-5992: Add LINQ translation benchmark suite#2004
15 benchmarks covering filter, projection, and IQueryable composition translation paths. New LinqBench category added to AllCategories and the perf-test runner. BulkWriteBench also added to AllCategories.
Document benchmark inventory, code path coverage, interpretation guidance, and provisional thresholds. Record targeted injection test results validating selective sensitivity of each benchmark to its target translator code path.
PartialEvaluator injection test showed OrChainFilter is still affected (evaluator traverses all expressions). Reframed as a sensitivity amplifier rather than diagnostic isolator.
…methods LinqBench uses translations/second instead of MB/s since there is no data throughput to measure. Add Unit and MetricName to BenchmarkResult so exporters label scores correctly. Change all benchmark methods to return their values to prevent JIT dead-code elimination.
The composite score loop was using default MB/s labels for all categories including LinqBench. Now correctly labels LinqBench composites as translations_per_second.
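The translations/second score described in the two commits above can be read as the reciprocal of the mean time per translation. A minimal sketch, assuming the benchmark mean is available in nanoseconds; `TranslationScore` and `TranslationsPerSecond` are hypothetical names used for illustration, not the PR's `BenchmarkResult` code:

```csharp
using System;

// Hypothetical helper for illustration only; not taken from the PR.
public static class TranslationScore
{
    // Converts a mean time per translation (nanoseconds) into translations per second.
    public static double TranslationsPerSecond(double meanNanoseconds)
    {
        if (meanNanoseconds <= 0)
            throw new ArgumentOutOfRangeException(nameof(meanNanoseconds), "Mean time must be positive.");

        // One second is 1e9 nanoseconds.
        return 1_000_000_000.0 / meanNanoseconds;
    }
}
```

For a benchmark with a mean of about 7 µs (7000 ns), `TranslationScore.TranslationsPerSecond(7000)` comes out at roughly 143,000 translations per second.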
…d-to-end benchmarks; update regression thresholds
- Redesign LinqTranslationBenchmark.cs: 15 individual feature benchmarks → 10 representative user queries covering distinct translator code paths (MultiFieldSearch, OrStatusFilter, BatchLookup, ArrayElementQuery, FieldSelection, AggregationProjection, ProjectionSentinel, UpdatePipeline, QueryablePipeline, GroupByAggregation)
- Add LinqEndToEndBenchmark.cs: one-off characterization of translation overhead vs pre-built BsonDocument queries on a live collection; not wired into CI
- Update README regression thresholds based on 7-run M1 Max drift characterization: tight bucket (15%) for MultiFieldSearch/UpdatePipeline/BatchLookup/ArrayElementQuery; wider bucket (30%) for OrStatusFilter/FieldSelection/AggregationProjection/QueryablePipeline/GroupByAggregation
BorisDog
left a comment
Looks very good overall.
| // --- OrStatusFilter: simple OR filter --- | ||
| [Benchmark] | ||
| public List<OrderDocument> OrStatusFilterLinq() |
minor: I think we need more descriptive naming, not tied to a specific field name. Something like OrFilterLinq, or FindOrFilterLinq
| { | ||
| var pipeline = PipelineDefinition<OrderDocument, BsonDocument>.Create(_groupByPipeline); | ||
| return _collection.Aggregate(pipeline).ToList(); | ||
| } |
I would also add an additional case with all of the following:
- Select/Projection
- something with arrays, like $in, $all, ...
- OrderBy.ThenBy, Take
| { | ||
| return _collection.Find(x => | ||
| x.Status == _statusFilter && | ||
| x.CustomerName.StartsWith(_prefix) && |
Does StartsWith create the exact same regex as in the raw version?
It might make sense to run a first translation somewhere in GlobalSetup and compare the produced MQL, so that if we change the translation in the future the benchmark will throw.
| { | ||
| private const string DatabaseName = "linqbench"; | ||
| private const string CollectionName = "orders"; | ||
| private const int SeedCount = 500; |
Consider creating indexes in the database such that server time is minimized and translation time changes will be more apparent.
| public List<BsonDocument> GroupByLinq() | ||
| { | ||
| return _collection.Aggregate() | ||
| .Group(x => x.Status, g => new { Status = g.Key, Count = g.Count(), TotalRevenue = g.Sum(x => x.Total) }) |
There is projection to an anonymous type here, followed by creation of the BSON document. The raw example does not project to an anonymous type.
…quivalence fixes; rename OrStatusFilter to OrFilter
Pull request overview
Adds a new LINQ-focused benchmark suite to the driver benchmarks project, enabling perf-job tracking and composite scoring for LINQ translation performance (plus an optional end-to-end comparison suite).
Changes:
- Introduces `LinqTranslationBenchmark` (translation-only, no query execution) and `LinqEndToEndBenchmark` (LINQ vs raw query plans with real DB execution).
- Wires new `LinqBench` category into perf-job filtering and composite-score export (including score units/metric names).
- Extends benchmark category constants and composite export output to include `LinqBench` (and now `BulkWriteBench`) in `AllCategories`.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| evergreen/run-perf-tests.sh | Adds LinqBench to Evergreen perf category filter. |
| benchmarks/MongoDB.Driver.Benchmarks/Linq/README.md | Documents the LINQ benchmark inventory and interpretation guidance. |
| benchmarks/MongoDB.Driver.Benchmarks/Linq/LinqTranslationBenchmark.cs | Adds translation-focused benchmarks across filter/field/projection/update/IQueryable entry points. |
| benchmarks/MongoDB.Driver.Benchmarks/Linq/LinqEndToEndBenchmark.cs | Adds end-to-end LINQ vs raw benchmarks that seed data and run against a live server. |
| benchmarks/MongoDB.Driver.Benchmarks/Exporters/LocalExporter.cs | Emits per-category and per-benchmark units (MB/s vs translations/s). |
| benchmarks/MongoDB.Driver.Benchmarks/Exporters/EvergreenExporter.cs | Emits per-category and per-benchmark metric names for Evergreen (MB/s vs translations/s). |
| benchmarks/MongoDB.Driver.Benchmarks/DriverBenchmarkCategory.cs | Adds LinqBench and includes it (and BulkWriteBench) in composite category list. |
| benchmarks/MongoDB.Driver.Benchmarks/BenchmarkResult.cs | Adds unit/metric metadata and computes translations/s scoring for LinqBench. |
| | Benchmark | Expression | Translator path exercised | | ||
| |---|---|---| | ||
| | `MultiFieldSearch` | `x.Status == s && x.CustomerName.StartsWith(prefix) && x.ShippingAddress.City == city && x.CreatedAt > cutoff && !x.IsPaid` | And → Comparison, MethodCall (StartsWith), nested MemberAccess, Not + boolean MemberAccess | | ||
| | `OrStatusFilter` | 4-way `==` OR with literal constants | Or → Comparison. No closures — fastest filter, most sensitive to small regressions | |
| [GlobalSetup] | ||
| public void Setup() | ||
| { | ||
| _client = MongoConfiguration.CreateClient(); |
| private void SetupQueryableExpressions(string statusFilter) | ||
| { | ||
| var mongoUri = Environment.GetEnvironmentVariable("MONGODB_URI"); | ||
| var settings = mongoUri != null ? MongoClientSettings.FromConnectionString(mongoUri) : new MongoClientSettings(); | ||
| settings.ClusterSource = DisposingClusterSource.Instance; | ||
| _queryClient = new MongoClient(settings); | ||
| var collection = _queryClient.GetDatabase("linqbench").GetCollection<OrderDocument>("orders"); | ||
| var queryable = collection.AsQueryable(); | ||
| **Allocation changes** are often more actionable than time changes. A new allocation in a hot path is a real regression even if the time delta is within noise. The `[MemoryDiagnoser]` columns (`Gen0`, `Allocated`) make allocation regressions visible. | ||
| **`OrStatusFilter`** is the fastest filter (~7µs, ~5x faster than others) because it uses literal constants instead of captured variables, producing a simpler expression tree with less work at every stage. This makes it the most sensitive filter benchmark — small translator regressions that would be lost in the noise on slower benchmarks show up clearly here. |
| | Bucket | Threshold | Benchmarks | Observed range | | ||
| |---|---|---|---| | ||
| | Tight | 15% | `MultiFieldSearch`, `UpdatePipeline`, `BatchLookup`, `ArrayElementQuery` | 9–15% | | ||
| | Wider | 30% | `OrStatusFilter`, `FieldSelection`, `AggregationProjection`, `QueryablePipeline`, `GroupByAggregation` | 20–29% | | ||
| | Sentinel | 100% | `ProjectionSentinel` | 5% | | ||
| `OrStatusFilter` and `FieldSelection` land in the wider bucket for different reasons: `OrStatusFilter` is ~7µs and proportional noise is large at that scale; `FieldSelection` is ~2µs with similar characteristics. The three complex benchmarks (`AggregationProjection`, `QueryablePipeline`, `GroupByAggregation`) show higher drift because they allocate more and exercise more GC pressure. |
| public void Cleanup() | ||
| { | ||
| _client.DropDatabase(DatabaseName); | ||
| _client.Dispose(); |
In v3 MongoDB.Driver this will not unregister the cluster. If we really want to clean up we should follow the clean up guide.
| { | ||
| private const string DatabaseName = "linqbench"; | ||
| private const string CollectionName = "orders"; | ||
| private const int SeedCount = 500; |
Do we really need 500 documents? As far as I understood the point is to measure the LINQ translation overhead, so I would say it should be enough to have even a single document in the collection to reduce the server/network/serialization overhead.
Summary
Adds a 10-benchmark `LinqBench` suite that exercises the LINQ-to-aggregation translation layer in isolation (no DB I/O), plus a small `LinqEndToEnd` suite that compares LINQ vs hand-built `BsonDocument` query plans end-to-end. Wires `LinqBench` into the perf-job category filter and the composite-score path. Designed to give us a defensible signal when the translator (especially `SerializerFinder`, `AstSimplifier`, and the various method-call sub-translators) moves.
Cross-commit runs against pre-/post-SerializerFinder and against an in-flight optimization PR (#1961) show the suite catches the kinds of regressions and improvements we'd want it to catch; see Validation below.
Motivation
The driver has spec-driven benchmarks for I/O-heavy patterns (`Find`, `BulkWrite`, `GridFS`, BSON encode/decode) but nothing for the LINQ translator. As internal code paths shift (`SerializerFinder` overhauls, `AstSimplifier` changes, new visitor support), we currently have no signal on translation-cost movement. When CSHARP-5572 introduced `SerializerFinder` (#1700), we had no way to quantify what it cost us. This PR closes that gap.
What this adds
- `benchmarks/MongoDB.Driver.Benchmarks/Linq/LinqTranslationBenchmark.cs`
- `benchmarks/MongoDB.Driver.Benchmarks/Linq/LinqEndToEndBenchmark.cs`
- `benchmarks/MongoDB.Driver.Benchmarks/Linq/README.md`
- `benchmarks/MongoDB.Driver.Benchmarks/DriverBenchmarkCategory.cs`: … `LinqBench` const; `LinqBench` and `BulkWriteBench` added to `AllCategories` so composite scores are emitted
- `benchmarks/MongoDB.Driver.Benchmarks/BenchmarkResult.cs` + `Exporters/*.cs`: … `LinqBench`
- `evergreen/run-perf-tests.sh`: … `LinqBench` to the `--anyCategories` filter
Cedar/SPS auto-discovers new composite categories; no dashboard config required.
Design decisions
- `LinqTranslationBenchmark` calls into translator entry points directly (`LinqProviderAdapter.TranslateExpressionTo*`, `ExpressionToExecutableQueryTranslator.Translate`). No DB execution. This isolates translator regressions from network and serialization noise. End-to-end coverage lives in `LinqEndToEndBenchmark` for visibility but is not what we'd alert on first.
- … `SerializerFinderVisit*.cs` file they exercised. Pushed back to "what users actually write" framing; matrix coverage remains an option for targeted gaps in the future.
- `LinqBench` does not cross-tag with the spec categories (`DriverBench`, `BsonBench`, etc.). Keeps LINQ numbers out of the spec composite averages. `LinqBench` does land in `AllCategories` so its own composite is emitted.
- `BulkWriteBench` composite addition bundled in this PR. Was previously excluded with a "not part of the benchmarking spec" comment. Team wanted its composite tracked too; doing it here avoids a tiny standalone PR.
- `[MemoryDiagnoser]` on everything. Allocation regressions matter independently of time and surface earlier than time regressions on noisy hardware.
Benchmark inventory
Translation suite (10 benchmarks)
Filters (4), `TranslateExpressionToFilter`:
- `MultiFieldSearch`: `Status == s && CustomerName.StartsWith(prefix) && ShippingAddress.City == city && CreatedAt > cutoff && !IsPaid`
- `OrFilter`: 4-way `==` OR with literal constants
- `BatchLookup`: `ids.Contains(x.Id)`; `$in`
- `ArrayElementQuery`: `x.Items.Any(i => i.Price > t)`; `$elemMatch`, `@<elem>` symbol
Field (1), `TranslateExpressionToField`:
- `FieldSelection`: `x => x.Items[0].ProductId`; `GetItemMethodToFilterFieldTranslator` path. `List<T>[0]` is a `MethodCallExpression` in C#, not `IndexExpression`.
Projections (2), `TranslateExpressionToProjection`:
- `AggregationProjection`: `Subtotal + Tax - Discount` and `Items.Select(i => i.ProductId)`; `ExpressionToAggregationExpressionTranslator` + `AstSimplifier`
- `ProjectionSentinel`: `x => x`
Update (1), `TranslateExpressionToSetStage`:
- `UpdatePipeline`: `ExpressionToSetStageTranslator`, MemberInit pattern matching
IQueryable (2), `ExpressionToExecutableQueryTranslator.Translate`:
- `QueryablePipeline`: `Where → Select → OrderBy → Take`
- `GroupByAggregation`: `GroupBy → Select` with Count + Sum; `GroupByMethodToPipelineTranslator`, `$group`, `IGroupingSerializer`, accumulators
End-to-end suite (12 benchmarks)
Six patterns (`MultiFieldSearch`, `OrFilter`, `GroupBy`, `Projection`, `InFilter`, `PagedQuery`), each run twice: once written in LINQ, once as a hand-built `BsonDocument` / pipeline. Seeds 500 documents plus secondary indexes on `Status`, `CreatedAt`, and `ShippingAddress.City` into a local mongod in `[GlobalSetup]`. The LINQ and Raw versions of each pair render to byte-equivalent BSON (verified via `LinqProviderAdapter`), so LINQ−Raw delta isolates translator + provider overhead from query-shape differences. Useful for the "how much of the user-visible latency is translation?" question. Not part of the regression-alert path.
Validation
Five lines of evidence that the suite produces actionable signal.
1. Within-run noise
Across all perf-hw runs (n=10 each, multiple commits), BDN-reported within-run StdDev is <1% on every non-sentinel benchmark (typically 0.2-0.9%). The `ProjectionSentinel` and `FieldSelection` micro-benchmarks land at 0.3-1.4%. Each individual run is a well-converged measurement.
2. Selectivity: targeted regression injection
`Thread.SpinWait(300)` (~10 µs on M1) injected into four specific translator code paths in turn:
- `SerializerFinder.FindSerializers()`
- `GetItemMethodToFilterFieldTranslator`: `FieldSelection` only
- `NotExpressionToFilterTranslator`: `MultiFieldSearch` (Not), `OrFilter` (chains Comparison dispatch)
- `GroupByMethodToPipelineTranslator`: `GroupByAggregation` only
Result: each injection moved only the benchmarks that should have moved. `ProjectionSentinel` stayed flat across all injections. The suite has clean per-path selectivity: when something regresses, the benchmarks that move tell you which translator moved.
3. Cross-run drift on the actual perf-job hardware (n=10 on `rhel90-dbx-perf-large`)
Submitted via `evergreen patch` against `test-csharp-spec-benchmarks` with the runner looping 10× through `--anyCategories "LinqBench"`. .NET 8.0, X64 RyuJit. All runs in a single perf-task invocation on the same host.
Benchmarks measured: `MultiFieldSearch`, `OrFilter`, `BatchLookup`, `ArrayElementQuery`, `FieldSelection`, `AggregationProjection`, `ProjectionSentinel`, `UpdatePipeline`, `QueryablePipeline`, `GroupByAggregation`.
All non-sentinel benchmarks land in 2-4.5% range, CV ≤1.5%. That's a ~5× compression of the M1 drift bands we'd characterized earlier (which sat at 12-38% range on the heavier benchmarks). Only `FieldSelection` (~6µs benchmark, ~7% range) and `ProjectionSentinel` (~37 ns, ~5% range) drift wider; both are fast enough that small absolute drift looks proportionally large.
Two consequences:
For reference, M1 Max characterization on the same suite (n=7) showed ranges of 5-30%, with the heavier IQueryable benchmarks at CV 8-11%. M1 numbers are kept as a development-machine guide; perf-hw numbers are what we'd actually alert on.
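The range and CV figures quoted in this section can be reproduced from the per-run means of one benchmark. A minimal sketch, assuming "range" means (max − min) / mean and CV means sample standard deviation / mean; the PR does not spell out either definition, and `DriftStats` is a hypothetical helper:

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; the definitions below are assumptions.
public static class DriftStats
{
    // Spread between the slowest and fastest run, as a percentage of the mean.
    public static double RangePercent(double[] runMeans)
    {
        RequireSamples(runMeans);
        double mean = runMeans.Average();
        return (runMeans.Max() - runMeans.Min()) / mean * 100.0;
    }

    // Coefficient of variation: sample standard deviation (n - 1) over the mean, in percent.
    public static double CvPercent(double[] runMeans)
    {
        RequireSamples(runMeans);
        double mean = runMeans.Average();
        double sumOfSquares = runMeans.Sum(v => (v - mean) * (v - mean));
        double sampleStdDev = Math.Sqrt(sumOfSquares / (runMeans.Length - 1));
        return sampleStdDev / mean * 100.0;
    }

    private static void RequireSamples(double[] runMeans)
    {
        if (runMeans == null || runMeans.Length < 2)
            throw new ArgumentException("At least two run means are required.", nameof(runMeans));
    }
}
```

For run means of 98, 100 and 102 (in any time unit), this gives a 4% range and a 2% CV.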
4. Cross-commit reality check on perf-hardware — does the suite catch real translator changes?
The suite was transplanted onto pinned commits and run on `rhel90-dbx-perf-large` (n=10 per commit). Set A captures a known historical regression (the SerializerFinder introduction in #1700). Set B captures an in-flight optimization (PR #1961). Together these answer "does the suite detect what we want it to detect?"
Set A: SerializerFinder cost-of-introduction. Parent of #1700 (`46640eac98`) vs the #1700 merge (`59c9d34180`). Both commits are from 2026-02-09; the only meaningful diff between them is the SerializerFinder introduction.
Benchmarks compared: `MultiFieldSearch`, `OrFilter`, `BatchLookup`, `ArrayElementQuery`, `FieldSelection`, `AggregationProjection`, `ProjectionSentinel`, `UpdatePipeline`, `QueryablePipeline`, `GroupByAggregation`.
Every non-sentinel benchmark moves above the 2-7% drift band. `ProjectionSentinel` correctly stays close to flat (the fast-path detection bypasses SerializerFinder entirely). The smallest non-sentinel delta is `MultiFieldSearch` at +8.5%/+10.0%, which still clears its 2.8% drift band cleanly. Most benchmarks cluster around 2-3× regressions, consistent with adding a per-translation serializer-discovery pass over the full expression tree.
`UpdatePipeline` is a structural outlier (+626% time, +898% allocation, 9 KB → 91 KB), and the suite revealed why. Reading the two commits side by side:
- `TranslateExpressionToFilter`, `TranslateExpressionToProjection`, and `ExpressionToExecutableQueryTranslator.Translate` all preprocess the lambda once at the top (`LinqExpressionPreprocessor.Preprocess(expression)`) and then let `TranslationContext.Create` run `SerializerFinder` on the canonicalized tree.
- `TranslateExpressionToSetStage` does it the other way around: it runs `SerializerFinder` on the raw lambda first and only preprocesses each field's value expression later, inside `ExpressionToSetStageTranslator.TranslateNewWithOptionalMemberInitializers`.
Since `SerializerFinder` is a fixed-point visitor that loops until it stops making progress, running it over an un-preprocessed tree (partial-evaluated constants not collapsed, CLR-compat rewrites not applied) costs disproportionately more passes for expressions like our `Total = x.Subtotal + x.Tax - x.Discount` arithmetic chain. The same factor shows up on M1 (+735%) and perf-hw (+626%), ruling out machine-specific artifacts. This is a real driver asymmetry; follow-up ticket noted below. (Note that the design isn't a bug: SetStage's dispatch pattern-matches on `body is NewExpression | MemberInitExpression`, which `PartialEvaluator` would collapse if applied at the top. Any fix has to preserve that dispatch shape.)
Set B: In-flight optimization PR #1961 (`SerializerFinderVisitMethodCall` switch → `MethodInfo`-keyed lookup table, plus a follow-up `Lookup performance optimization` commit). Base (`66780341e7` = current main) vs head (`54973d039a`). Same perf-task host, runs interleaved with Set A.
Benchmarks compared: `MultiFieldSearch`, `OrFilter`, `BatchLookup`, `ArrayElementQuery`, `FieldSelection`, `AggregationProjection`, `ProjectionSentinel`, `UpdatePipeline`, `QueryablePipeline`, `GroupByAggregation`.
Honest reading: at perf-hw resolution, the PR delivers measurable allocation wins on IQueryable benchmarks (`ArrayElementQuery` -12.4%, `QueryablePipeline` -4.5%, `BatchLookup` -3.3%, `GroupByAggregation` -3.3%), all of which walk method-call subtrees that `SerializerFinderVisitMethodCall` dispatches through. The two largest time deltas (`QueryablePipeline` -2.6%, `GroupByAggregation` -2.4%) are at the edge of their respective 2.9-3.4% drift bands: directionally consistent but not slam-dunk on a single n=10 sample. Everything else sits inside noise.
The takeaway for the suite isn't "PR #1961 is great" or "PR #1961 is marginal"; it's that the suite resolves what kind of improvement this is: a real allocation reduction in the method-call dispatch path, with smaller time effects. That distinction was invisible at M1 noise levels (where everything looked like noise or a 5% time improvement). At perf-hw, allocation deltas are the cleaner metric for changes this size.
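The comparisons in this section all follow the same pattern: compute a percent delta between two commits, then ask whether it clears the benchmark's cross-run drift band. A minimal sketch of that check, assuming the drift band is treated as a symmetric ± threshold on the absolute delta; `DeltaCheck` and its members are hypothetical names, not part of the suite:

```csharp
using System;

// Hypothetical helper for illustration only; the symmetric-band rule is an assumption.
public static class DeltaCheck
{
    // Signed percent change from baseline to candidate (positive = slower or more allocation).
    public static double PercentDelta(double baseline, double candidate)
    {
        if (baseline <= 0)
            throw new ArgumentOutOfRangeException(nameof(baseline), "Baseline must be positive.");

        return (candidate - baseline) / baseline * 100.0;
    }

    // True when the delta is larger in magnitude than the observed drift band.
    public static bool ClearsDriftBand(double percentDelta, double driftBandPercent)
    {
        return Math.Abs(percentDelta) > driftBandPercent;
    }
}
```

With the numbers quoted above, `DeltaCheck.ClearsDriftBand(8.5, 2.8)` is true (the `MultiFieldSearch` delta in Set A), while `DeltaCheck.ClearsDriftBand(-2.6, 2.9)` is false (the `QueryablePipeline` time delta in Set B sits inside its band).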
5. End-to-end overhead — Atlas dev cluster, perf-hardware, n=10
To answer "how much of user-visible LINQ latency is translator cost?", the e2e suite runs six patterns twice each (LINQ-translated vs hand-built `BsonDocument` / pipeline) against a live Atlas dev cluster from the perf host. 500 documents seeded per run with secondary indexes on `Status`, `CreatedAt`, and `ShippingAddress.City`; dropped at cleanup. Each LINQ-Raw pair renders to byte-equivalent BSON (verified by running each LINQ expression through `LinqProviderAdapter` and diffing the resulting filter/pipeline), so LINQ−Raw delta reflects translator and provider overhead, not query-shape differences.
Patterns measured: `MultiFieldSearch`, `OrFilter`, `GroupBy`, `Projection`, `InFilter`, `PagedQuery`.
Translator share is the fraction of user-visible LINQ time that disappears if you write raw `BsonDocument` instead. It is an upper bound on translator cost (the LINQ delta also includes provider overhead like cursor construction and command serialization).
- (`Projection` 51%, `MultiFieldSearch` 40%): translator is ~40-50% of user-visible latency on indexed selective queries. `Projection` is the upper-bound case: LINQ has to translate a `Where` filter and construct a serializer for the anonymous-type projection. A 10% translator regression here is a ~5% user-visible regression.
- (`GroupBy` 27%, `InFilter` 29%, `PagedQuery` 26%): translator is ~25-30%. Each does meaningful server-side work (`$group`, `$in`, sort-skip-limit) so server time partially offsets translator overhead.
- (`OrFilter` 1.2%): translator is ~1% because the 4-way OR matches a large fraction of documents (no index helps), and the result set serializes ~700 KB. A 10% translator regression here is invisible to users.
- `Projection` allocates 3.5× more in LINQ (125 KB vs 36 KB), `GroupBy` 3.2× more (65 KB vs 21 KB). Projected-type-serializer and `IGroupingSerializer` construction is allocation-heavy; this is the metric that would catch translator-side allocation regressions even when network time masks the time delta.
Caveat: Atlas dev cluster across the internet from the perf host. Per-run range on individual benchmarks is wide, typically ±15-30%, up to ±50% on patterns where server time dominates (`GroupByRaw`, `ProjectionRaw`). The translator-share ratios are more stable than the absolute times because LINQ and Raw runs on the same iteration see correlated network noise. Single-batch result, 500 docs; multi-batch cursor iteration may behave differently.
Regression-alert thresholds (perf-hardware-calibrated)
Calibrated to observed drift on `rhel90-dbx-perf-large`:
- `MultiFieldSearch`, `BatchLookup`, `ArrayElementQuery`, `AggregationProjection`, `UpdatePipeline`, `QueryablePipeline`, `GroupByAggregation`, `OrFilter`
- `FieldSelection`
- `ProjectionSentinel`
Allocation thresholds should be even tighter: observed allocation drift is 0-1.2% across all benchmarks. A 5% allocation threshold would catch real allocation regressions while ignoring observed noise.
These replace the M1-calibrated 15%/30%/100% bands used during development.
Follow-ups (not in this PR)
- `TranslateExpressionToSetStage` preprocessing asymmetry surfaced by Set A. The current design (preprocess each field's value individually, after dispatching on the un-preprocessed body shape) is intentional, since `PartialEvaluator` would collapse `MemberInitExpression` if applied at the top, but it causes `SerializerFinder` to do disproportionate work on arithmetic update expressions. The fix is likely either a partial preprocess that preserves dispatch shape, or a SerializerFinder optimization on un-preprocessed trees. Not a one-line change.
- … `Lookup/Join`, `Distinct`, `SelectMany`, `Cast`, `$expr` fallback paths.
Test plan
- `dotnet run -c Release -- --filter "*LinqTranslation*"` from `benchmarks/MongoDB.Driver.Benchmarks/`: all 10 benchmarks run, no build errors, results land in `BenchmarkDotNet.Artifacts/results/`
- `dotnet run -c Release -- --filter "*LinqEndToEnd*"`: requires a reachable mongod (local or remote), all 12 benchmarks run
- `dotnet run -c Release -- --driverBenchmarks --anyCategories "LinqBench"`: full perf-task path, composite score is emitted for `LinqBench`
- `evergreen patch` against `test-csharp-spec-benchmarks` / `driver-performance-tests`: task succeeds, results uploadable
- … `LinqBench` composite without dashboard changes (verified by team after merge)
Caveats reviewers should know
- … `rhel90-dbx-perf-large` runs (n=10 per commit). Cross-run drift is 2-7% on time, 0-1.2% on allocation. Set A and Set B results survive that noise floor as described per-row.
- … `subPathRoot` parameter was added to `TranslateExpressionToField` between Feb and April, so the `FieldSelection` benchmark drops that arg at A1/A2. This isolates the comparison to driver code differences, not benchmark code differences.
- … (`54973d039a`) is not on `mongodb/mongo-csharp-driver` main's history. Submitting via `evergreen patch-file --base <sha>` silently runs the project's mainline task instead; we hit this once. Re-submitted via `evergreen patch --uncommitted` from a worktree at `54973d039a` so the PR1961 commits land inside the patch diff. Documented for anyone who needs to do this in the future.
- `LinqEndToEndBenchmark` uses a real `MongoClient`. Without a reachable cluster it fails at `[GlobalSetup]`. The translation suite has no such dependency.