tsgo backend (--tsgo): spike with JS-symbols hybrid #99
Draft
johnsoncodehk wants to merge 19 commits into
…prototype shim
Lays the plumbing for tsgo (`@typescript/native-preview`) integration
without introducing it as a hard dependency.
Adapter (lib/tsgo-backend.ts):
- Wraps `Project.program` + `Project.checker` as a `ts.Program` /
`ts.TypeChecker`-shape that satisfies LinterContext's Program-thunk
contract from PR #98.
- Per-file batched prepass: walks the SF locally, resolves every
Identifier via `Checker.getSymbolAtPosition([positions])` in a single
sync RPC. Position-based primary because `getSymbolAtLocation` returns
undefined for identifiers in import/export specifier position (~76% of
identifiers in a type-heavy file). Location-based fallback for the
small remainder.
- Symbol identity is rock-solid via tsgo's `symbol.id`, so
`Map<Symbol, X>` patterns (compat-eslint's `variableBySymbol`) work
unchanged.
- Patches `getStart` / `getEnd` / `getText` / `getFullStart` /
`getFullText` / `getWidth` / `getFullWidth` onto tsgo's `RemoteNode`
prototype (located by walking the prototype chain, because the
Remote{Node,SourceFile} classes aren't in the package exports map).
Uses real ts's `skipTrivia`, giving bit-identical positions vs ts.Node.
CLI wiring (lib/worker.ts, index.ts):
- `--tsgo` flag dispatches setup() into the tsgo branch.
- Rejects --vue/--mdx/--astro/--vue-vine/--ts-macro projects with a
clear error (Volar host injection isn't available on tsgo).
- Skips BuilderProgram drain / layer-2 affected-file classification (no
JS API on tsgo); treats every file as affected → cached type-aware
entries re-validate rather than serve from a stale snapshot.
- `lint()` calls `tsgoBackend.prepareFile(name)` before rules run.
Optional peer dep: `@typescript/native-preview`. Users without it never
import the backend (the require path is gated behind --tsgo).
Verification:
- packages/cli/test/tsgo-backend.test.ts: 12 checks (skips when peer dep
absent): SF fetch + AST walk, batched symbol prepass, Symbol identity
collapse, Map<Symbol> compatibility, program method coverage.
- Existing tests unchanged: program-host (17), cache-flow (50),
meta-frameworks (8).
- End-to-end: --tsgo --project
packages/{types,core,compat-eslint}/tsconfig.json runs to completion;
full repo lint 1.66s vs 1.45s on ts.Program (small parity gap;
dominated by per-project snapshot init).
Known gap (next commit): rule code calls `ts.isXxx(node)` and
`ts.forEachChild(node, cb)` from real ts; these dispatch on
`ts.SyntaxKind` values. tsgo Node kinds are offset-shifted
(Identifier=79 vs 80), so dispatch tables miss → rules silently no-op.
SyntaxKind remapping or ts module facade required to make rules fire.
Validation fixture (a single `import { Foo } from './foo'` where Foo is
type-only): @typescript-eslint/consistent-type-imports rule now fires on
tsgo path with the same diagnostic text + range as ts.Program path.
The mechanism — why a facade and not in-place mutation:
Rule code (compat-eslint, typescript-eslint utilities, custom rules)
binds `ts` once at module load: `const ts = require('typescript')`. Every
`ts.SyntaxKind.X` / `ts.isXxx(node)` / `ts.forEachChild(node, cb)` call
inside that module dispatches against that bound module's enum values
and type-guard functions. tsgo Node.kind=79 (Identifier); real
ts.SyntaxKind.Identifier=80 → dispatch tables miss → silent no-op.
We can't mutate ts.SyntaxKind in place: the worker / cache-flow / core
imported ts BEFORE setup() runs and need real ts internals (skipTrivia,
createSemanticDiagnosticsBuilderProgram, …) keyed by real ts values.
Selective substitution: install a facade in `Module._cache` AFTER the
worker's module-load `import ts = require('typescript')` has bound real
ts, BUT BEFORE the dynamic `import(configFile)` pulls in compat-eslint
and rules. Those late-loaded modules see tsgo's enums + type guards via
the facade; the worker keeps its real ts reference intact.
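A minimal sketch of the dispatch miss and the copy-then-overlay facade described above. Everything here is a toy stand-in: `realTsToy`, `tsgoToy` and `buildToyFacade` are hypothetical names, and only the Identifier values (80 vs 79) come from the commit message. It does not touch `Module._cache` or the real packages.

```typescript
// Toy stand-ins only: two "modules" that number Identifier differently,
// mirroring the 80 (real ts) vs 79 (tsgo) mismatch described above.
interface KindedNode { kind: number }

const realKinds = { Identifier: 80 };
const tsgoKinds = { Identifier: 79 };

const realTsToy = {
  SyntaxKind: realKinds,
  isIdentifier: (node: KindedNode): boolean => node.kind === realKinds.Identifier,
  version: "real-ts", // a real-ts-only member the facade must keep
};

const tsgoToy = {
  SyntaxKind: tsgoKinds,
  isIdentifier: (node: KindedNode): boolean => node.kind === tsgoKinds.Identifier,
};

// Copy every own property of the real module first, then overlay the
// tsgo-side enums and guards, and finally add the debugging sentinel.
function buildToyFacade(
  real: Record<string, unknown>,
  overlay: Record<string, unknown>,
): Record<string, unknown> {
  const facade: Record<string, unknown> = {};
  for (const key of Object.getOwnPropertyNames(real)) facade[key] = real[key];
  for (const key of Object.getOwnPropertyNames(overlay)) facade[key] = overlay[key];
  facade["__tsgoFacade__"] = true;
  return facade;
}

const toyFacade = buildToyFacade(realTsToy, tsgoToy);
const tsgoIdentifierNode: KindedNode = { kind: 79 };

// A module bound to real ts misses the node; one bound to the facade sees it.
const seenByRealTs = realTsToy.isIdentifier(tsgoIdentifierNode);
const seenByFacade = (toyFacade["isIdentifier"] as (n: KindedNode) => boolean)(
  tsgoIdentifierNode,
);
```

Because the facade is a plain object whose members are all own properties, helpers that enumerate `Object.getOwnPropertyNames(mod)` still see the real-ts-only members (here `version`) alongside the overlaid ones.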
Implementation (lib/tsgo-typescript-facade.ts):
- Build facade as a plain object. Copy every own-property of real ts
first (so CJS interop helpers like tslib's `__importStar`, which iterate
`Object.getOwnPropertyNames(mod)` and miss prototype-inherited fallbacks,
still see complete shape — typescript-estree crashes on `ts.Extension`
otherwise). Overlay tsgo's enums (SyntaxKind, NodeFlags, ModifierFlags,
ScriptKind, ScriptTarget, TokenFlags, …), all type guards from /ast/is,
visitor helpers, factory namespace, and tsgo-only enums (SymbolFlags,
TypeFlags, DiagnosticCategory) from /api/sync.
- Free-function `forEachChild(node, cb, cbNodes?)` wraps tsgo Node's own
`.forEachChild` method (rule code expects free-function shape).
- Sentinel `__tsgoFacade__: true` for debugging.
`installFacade()` (lib/tsgo-typescript-facade.ts):
- Pnpm-aware: tsgo + tsslint + tsslint-eslint each pull in typescript via
their own resolution chains, all landing on different physical paths in
`.pnpm/`. Cache-slot substitution via `Module._cache[require.resolve('typescript')]`
alone covers our cwd's slot, not the others. Instead route every
`require('typescript')` to a single canonical synthetic id by overriding
`Module._resolveFilename`, then prime that one cache slot.
- Idempotent — second call short-circuits.
Worker integration (lib/worker.ts):
- `if (useTsgo) require('./tsgo-typescript-facade.js').installFacade()`
runs at top of setup(), before `await import(configFile)`.
State on full repo:
- 11/59 files pass through tsgo path; 48 files now crash with a different,
more advanced gap inside compat-eslint's ts-ast-scan (next iteration).
- Validation fixture: rule fires, diagnostic matches.
- Existing tests unchanged: program-host (17), cache-flow (50),
meta-frameworks (8), tsgo-backend (12).
…cker stubs
Two surgical follow-ups after the facade landed:
1. Type prototype patches. typescript-eslint's no-unnecessary-type-assertion
(and several compat-eslint paths) call `type.isLiteral()` /
`type.isStringLiteral()` / `type.isUnion()` / `type.getFlags()` etc.
ts.Type provides these as instance methods on its prototype. tsgo's
TypeObject only exposes data fields + `getSymbol()`.
Patch the missing predicates onto tsgo's TypeObject prototype, located
by walking up from a sample Type the first time `getTypeAtLocation`
returns one (TypeObject isn't in the package exports map). Predicates
close over tsgo's TypeFlags enum at patch time — flag values differ
from ts.TypeFlags, so we can't blanket-copy ts.Type prototype methods
(their hardcoded flag constants would mis-mask).
2. `getSymbolsInScope` / `getExportSpecifierLocalTargetSymbol` change
from throwing to returning safe defaults (`[]` / `undefined`).
compat-eslint's two callsites (parameter-property shadowing,
ExportSpecifier alias unwrap) have fallback paths that handle empties
gracefully — degrades scope-manager precision in those edges but the
pipeline keeps moving. Methods that *would* produce wrong diagnostics
on no-op stay throwing (`getBaseConstraintOfType`, `getApparentType`,
`getContextualType`).
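The prototype patching in item 1 can be sketched as follows, assuming a class that is reachable only through its instances. `ToyFlags`, `createHiddenType` and `patchFromSample` are hypothetical, and the flag values are made up rather than tsgo's; the toy also needs only one `Object.getPrototypeOf` step where the real adapter walks the chain.

```typescript
// Toy flag values; tsgo's real TypeFlags numbers are NOT reproduced here.
const ToyFlags = { Any: 1, StringLiteral: 2, NumberLiteral: 4, Union: 8 };

// Stand-in for a class missing from the package exports: callers only get instances.
const createHiddenType: (flags: number) => { flags: number } = (() => {
  class HiddenTypeObject {
    flags: number;
    constructor(flags: number) {
      this.flags = flags;
    }
  }
  return (flags: number) => new HiddenTypeObject(flags);
})();

interface PatchedToyType {
  flags: number;
  getFlags(): number;
  isUnion(): boolean;
  isLiteral(): boolean;
}

let toyPrototypePatched = false;

// Locate the prototype from a sample instance and add the missing predicates.
// The predicates close over ToyFlags, so they mask with this enum's values.
function patchFromSample(sample: { flags: number }): void {
  if (toyPrototypePatched) return; // idempotent: patch once
  const proto = Object.getPrototypeOf(sample) as Record<string, unknown>;
  proto["getFlags"] = function (this: { flags: number }): number {
    return this.flags;
  };
  proto["isUnion"] = function (this: { flags: number }): boolean {
    return (this.flags & ToyFlags.Union) !== 0;
  };
  proto["isLiteral"] = function (this: { flags: number }): boolean {
    return (this.flags & (ToyFlags.StringLiteral | ToyFlags.NumberLiteral)) !== 0;
  };
  toyPrototypePatched = true;
}

patchFromSample(createHiddenType(ToyFlags.Any));

// Instances created later share the patched prototype.
const unionLike = createHiddenType(ToyFlags.Union) as PatchedToyType;
const literalLike = createHiddenType(ToyFlags.StringLiteral) as PatchedToyType;
```

Copying methods from another library's prototype would bake in that library's flag constants, which is why the sketch builds the predicates against the local enum instead.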
Status: 12 files pass through tsgo path on the full repo (was 11). Next
gaps surface concretely:
- `Symbol.declarations[i].getSourceFile is not a function` —
`declarations` on tsgo Symbol is `NodeHandle[]`, not `Node[]`
- `scanner.setTextPos is not a function` — tsgo Scanner shape diff
Existing tests unchanged.
Six concrete gaps closed. Each was a pinpoint API-shape mismatch between
tsgo's RPC-class hierarchy and the ts.* shape rule code expects.
NodeHandle (Symbol.declarations[i]):
- `getSourceFile()` → routes to `project.program.getSourceFile(this.path)`
(tsgo NodeHandle has `path` directly).
- `parent` lazy getter → resolves the handle once via `resolve(project)`,
caches `_resolvedNode` on the instance, returns resolved.parent.
- Multi-project safe: hooks close over a mutable `currentProjectRef`
rebound per `createTsgoBackend()`. Avoids the `SyncRpcChannel is closed`
errors caused when project A's NodeHandles were touched after project A
was disposed.
Symbol prototype:
- `getDeclarations()` / `getName()` / `getEscapedName()` / `getFlags()` —
thin getters over the data fields tsgo Symbol already carries. Plus
`escapedName` getter alias for typescript-estree's direct field access.
Type prototype:
- `types` getter delegates to tsgo's `getTypes()` — typescript-eslint's
ts-api-utils (`unionConstituents`) reads `type.types` directly on
Union/Intersection types.
- `getCallSignatures()` / `getConstructSignatures()` / `getProperties()` /
`getProperty(name)` / `getBaseTypes()` / `getNonNullableType()` —
instance shims that delegate to the project Checker via the
`currentProjectRef` holder. Patched on first checker query that
surfaces a Type sample.
Scanner:
- `setTextPos(pos)` aliased to `resetTokenState(pos)` — tsgo renamed
the method but kept the same semantics (position the scanner head).
Wrapped at the facade's `createScanner` callsite so consumers
(compat-eslint/lib/tokens.ts) work without per-callsite changes.
SourceFile:
- `getLineAndCharacterOfPosition(pos)` + `getLineStarts()` — lazy
computation of line offsets from `text`, cached on the instance. tsgo
doesn't expose ts's lineMap caching; the binary-search version is
fast enough for diagnostic span rendering.
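A sketch of the lazily cached line table with a binary search, under the assumption that only `\n`, `\r\n` and `\r` count as line breaks (real ts also recognises a few Unicode separators). `ToySourceText` and the two helper functions are illustrative names, not the adapter's.

```typescript
// Offsets at which each line begins; line 0 always starts at 0.
function computeLineStartsSketch(text: string): number[] {
  const starts: number[] = [0];
  for (let i = 0; i < text.length; i++) {
    const ch = text.charCodeAt(i);
    if (ch === 13 /* \r */) {
      if (text.charCodeAt(i + 1) === 10 /* \n */) i++; // treat \r\n as one break
      starts.push(i + 1);
    } else if (ch === 10 /* \n */) {
      starts.push(i + 1);
    }
  }
  return starts;
}

// Binary search for the last line start that is <= pos.
function lineAndCharacterSketch(
  lineStarts: number[],
  pos: number,
): { line: number; character: number } {
  let lo = 0;
  let hi = lineStarts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (lineStarts[mid]! <= pos) lo = mid;
    else hi = mid - 1;
  }
  return { line: lo, character: pos - lineStarts[lo]! };
}

class ToySourceText {
  readonly text: string;
  private lineStartsCache: number[] | undefined;

  constructor(text: string) {
    this.text = text;
  }

  // Computed on first use, then cached on the instance.
  getLineStarts(): number[] {
    if (!this.lineStartsCache) this.lineStartsCache = computeLineStartsSketch(this.text);
    return this.lineStartsCache;
  }

  getLineAndCharacterOfPosition(pos: number): { line: number; character: number } {
    return lineAndCharacterSketch(this.getLineStarts(), pos);
  }
}

const toySource = new ToySourceText("ab\ncd\r\nef");
const toyLineStarts = toySource.getLineStarts();
const toyPosFour = toySource.getLineAndCharacterOfPosition(4);
const toyPosSeven = toySource.getLineAndCharacterOfPosition(7);
```

Each lookup costs O(log lines) after a single O(length) scan per file.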
NodeList:
- `[Symbol.species] = Array` on `RemoteNodeList` (Array subclass).
Without this, `statements.map(fn)` constructs a new RemoteNodeList
via the species protocol, hits the binary-view getter without view
data, and crashes with `this.view.getUint32 is not a function`.
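The species behaviour can be reproduced with two toy Array subclasses. `ViewBackedList` and `PlainSubclassList` are hypothetical names; the sketch shows only which constructor `map` picks, not tsgo's binary view.

```typescript
// Array subclass that opts out of the species protocol: derived arrays
// produced by map/filter/slice become plain Arrays.
class ViewBackedList<T> extends Array<T> {
  static get [Symbol.species](): ArrayConstructor {
    return Array;
  }
}

// Same subclass without the override, for contrast: map() constructs
// another PlainSubclassList through the default species getter.
class PlainSubclassList<T> extends Array<T> {}

const backedList = new ViewBackedList<number>();
backedList.push(1, 2, 3);
const mappedBacked = backedList.map((n) => n * 2);

const plainList = new PlainSubclassList<number>();
plainList.push(1, 2, 3);
const mappedPlain = plainList.map((n) => n * 2);

const backedMapsToPlainArray = mappedBacked.constructor === Array;
const plainMapsToSubclass = mappedPlain instanceof PlainSubclassList;
```

Without the override, a subclass whose constructor or getters expect backing data would be instantiated by `map` with none, which matches the crash described above.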
Checker proxy expansion:
- 16 new direct forwards (getTypeOfSymbol / getDeclaredTypeOfSymbol /
getSignaturesOfType / getResolvedSignature / getReturnTypeOfSignature
/ getTypePredicateOfSignature / getNonNullableType / getBaseTypes /
getPropertiesOfType / getIndexInfosOfType / getTypeArguments /
getWidenedType / getTypeFromTypeNode / getContextualType /
typeToString / isArrayLikeType).
- `getBaseConstraintOfType` routes type-parameter inputs to tsgo's
`getConstraintOfTypeParameter`, returns undefined for non-parameters
(matches ts behaviour).
- `getApparentType` soft-fallback returns the input type — primitives
rarely hit this path in lint rules; rule code's downstream property
lookups go via `getPropertiesOfType` which is forwarded.
Status on full repo (`tsslint --tsgo --project '{tsconfig.json,packages/*/tsconfig.json}' --force`):
- 19 / 59 files complete (was 12).
- ~9 distinct crash-class TypeErrors remain — `isTypeAssignableTo`,
`sig.getReturnType`, `Cannot read properties of undefined (reading 'kind')`.
- ~455 diagnostic messages now fire from rules with partial type info.
Many likely false-positives until checker shims are tightened (e.g.
`getApparentType` soft-fallback over-approximates).
- Existing tests unchanged: tsgo-backend (12), program-host (17),
cache-flow (50), meta-frameworks (8).
…y, Signature shims
Four batches that close every crash category except the lone
ts-api-utils `isInConstContext` walk-up edge case.
1. SourceFile.getPositionOfLineAndCharacter (line, char) → position.
compat-eslint's ESLint→TSSLint report converter (index.ts:266+) calls
this to map ESTree's loc-based descriptors back to file offsets, all
wrapped in a swallowing try/catch that defaults `start=end=0` on
failure. Without the shim every diagnostic collapsed to (line=1, col=1).
Now spans render correctly: caret highlights match ts.Program path on
the validation fixture.
2. Null-safety wrap on every `is*` predicate in the typescript module
facade. tsgo emits `is.generated.js` with bare `node.kind === SK.X`
accesses; ts's versions tolerate undefined. ESLint rule code that walks
`node.parent.parent.parent…` and tests on the result was crashing at
SourceFile-root and beyond. Wrap to short-circuit `false` on falsy
input.
3. Signature prototype shims (`getReturnType` / `getDeclaration` /
`getTypeParameters` / `getParameters`). tsgo Signature carries the data
fields but lacks ts.Signature's accessor-method facade. `getReturnType`
delegates via `currentProjectRef.project.checker`; the rest read
existing fields.
4. Soft `isTypeAssignableTo: () => false`. tsgo doesn't expose subtype
checking. Conservative `false` keeps type-safety rules on their "can't
prove assignable, leave alone" branch, which matches ts behaviour when
the checker can't decide. May suppress some legitimate diagnostics until
upstream surfaces this.
Status:
- 20 / 59 files complete (was 19).
- Crash classes: 9 → 1. Lone remaining: ts-api-utils'
`isInConstContext` walks `current.parent` past SourceFile root →
`undefined.kind` crash. Affects no-unnecessary-type-assertion on a small
set of files only; pre-existing in ts-api-utils itself (assumes
`current.parent` is always defined).
- Diagnostic span rendering now correct: caret offsets match ts.Program.
- 684 messages: most are false-positives from rules running with
approximated checker calls (`getApparentType`, `isTypeAssignableTo`);
none are crashes anymore.
- Existing tests unchanged: tsgo-backend (12), program-host (17),
cache-flow (50), meta-frameworks (8).
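The null-safety wrap on `is*` predicates might look roughly like this. It is a sketch over a toy guard module: `bareGuards` and `wrapGuardsNullSafe` are hypothetical, and the kind value 300 for SourceFile is invented for the example (only Identifier=79 comes from the earlier commit).

```typescript
interface MaybeKinded { kind: number }

// Toy guard module: guards read node.kind without checking for undefined.
const bareGuards: Record<string, unknown> = {
  isIdentifier: (node: MaybeKinded): boolean => node.kind === 79,
  isSourceFile: (node: MaybeKinded): boolean => node.kind === 300, // made-up value
  guardCount: 2, // non-function members pass through untouched
};

// Wrap every function whose name starts with "is" so that a falsy first
// argument short-circuits to false instead of throwing on `.kind`.
function wrapGuardsNullSafe(mod: Record<string, unknown>): Record<string, unknown> {
  const wrapped: Record<string, unknown> = {};
  for (const key of Object.keys(mod)) {
    const member = mod[key];
    if (key.startsWith("is") && typeof member === "function") {
      const guard = member as (node: unknown, ...rest: unknown[]) => unknown;
      wrapped[key] = (node: unknown, ...rest: unknown[]): unknown =>
        node ? guard(node, ...rest) : false;
    } else {
      wrapped[key] = member;
    }
  }
  return wrapped;
}

type LooseGuard = (node: MaybeKinded | undefined) => boolean;

const safeGuards = wrapGuardsNullSafe(bareGuards);
const safeIsIdentifier = safeGuards["isIdentifier"] as LooseGuard;
const bareIsIdentifier = bareGuards["isIdentifier"] as LooseGuard;

// The unwrapped guard throws on undefined; the wrapped one returns false.
let bareGuardThrew = false;
try {
  bareIsIdentifier(undefined);
} catch {
  bareGuardThrew = true;
}
const safeOnUndefined = safeIsIdentifier(undefined);
const safeOnIdentifier = safeIsIdentifier({ kind: 79 });
```

A name-prefix filter is a blunt selector; a real facade may need an allow-list if some `is*` exports are not node guards.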
Probed `no-unnecessary-type-assertion` false-positive flood (684/684
diagnostic messages on full repo). Root cause is upstream, not adapter:
tsgo's `getTypeAtLocation` returns the same Type id (id=1 / flags=Any)
for both the outer asserted type AND the inner expression in nested
assertions. Example:
  (globalThis as any)[COUNTS_KEY] as Map<string, number> | undefined
  ↓
  outerType.id = t01 (Any), innerType.id = t01 (Any)
vs. ts returning two different Type instances (one Map | undefined, one
Any). The rule's `uncast === cast` shortcut takes the wrong branch
because tsgo's objectRegistry caches the universal-Any singleton, so two
"any" results round-trip to the same instance.
Switching `getApparentType` from soft-identity to throw didn't help: the
rule's path doesn't reach getApparentType. Reverted.
The path forward is upstream (tsgo's checker semantics for nested
assertions through `any`), not an adapter wrap. Adapter-side options are
sledgehammers:
(a) wrap every Type to break identity equality: suppresses real
`unnecessaryAssertion` findings on equivalent types in addition to the
false positives;
(b) shadow noUnnecessaryTypeAssertion specifically: intrusive plumbing
for one rule.
Status unchanged: 20 / 59 files complete; one residual crash class
(ts-api-utils' `isInConstContext` walk-up edge case); the 684
false-positive flood comes from tsgo's checker, not our shims.
…eNode
Found the right tsgo API entry instead of declaring an upstream bug. It
is not a bug; `getTypeAtLocation(asExpr)` and
`getTypeFromTypeNode(asExpr.type)` have different semantic contracts:
- ts.getTypeAtLocation(asExpr) → asserted target type (the type the
expression evaluates to AFTER `as`)
- tsgo.getTypeAtLocation(asExpr) → underlying expression's type (BEFORE
`as`)
typescript-eslint's `no-unnecessary-type-assertion` rule depends on the
ts contract — without routing, `castType === uncastType` was trivially
true (both = inner type), and every assertion in the codebase fired as
"unnecessary" (684 false-positive messages).
Re-route in our adapter: when `getTypeAtLocation` is called on an
`AsExpression` / `TypeAssertionExpression` / `SatisfiesExpression`,
delegate to `getTypeFromTypeNode(node.type)` instead. Other node kinds
keep tsgo's default semantics (which IS what's wanted everywhere else).
Status:
- 24 / 59 files complete (was 20).
- 433 messages (was 684) — a ~250-message drop, all from the rule
taking the right code-path now on most assertions.
- Residual false-positives concentrate on UnionTypeNode targets like
`as Map<string, number> | undefined`. tsgo's
`getTypeFromTypeNode(unionTypeNode)` returns only the first member
(`Map<string, number>`), not the constructed union — so the rule
again sees outer/inner as effectively the same. Likely needs a
different tsgo API entry; investigation continues.
- Existing tests unchanged.
…on routing
The "API semantic divergence" angle generalises beyond AsExpression. Each
node kind whose ts.Type semantics differ from tsgo gets its own routing.
- AsExpression / TypeAssertion / Satisfies → getTypeFromTypeNode(.type)
(asserted target type)
- CallExpression / NewExpression →
getReturnTypeOfSignature(getResolvedSignature(node))
(call's return type, not the function type)
- NonNullExpression →
getNonNullableType(getTypeAtLocation(.expression))
(post-`!` type, not pre-`!` union with undefined)
Without these, on the validation fixture `const x = map.get('foo')!`:
- ts.Program: rule sees `number | undefined` for the call → isNullableType
iterates union constituents → returns true → rule does NOT fire (correct).
- tsgo (pre-fix): getTypeAtLocation(callExpr) returns the FUNCTION type
`(key) => number | undefined`, not the call result. isNullableType
returns false → rule fires as "unnecessary" (false positive).
Why these and not all kinds: variable references, member accesses, and
literals all have ts and tsgo agreeing on what `getTypeAtLocation`
returns (the value's type). Only the asserted/post-effect kinds diverge.
CallExpression has a fallback chain. tsgo's `getResolvedSignature` panics
on some method-call sites (`Map.get` etc.) — synthetic node identity may
not match the parsed program's call site. Catch the panic and fall back
to `getCallSignatures(funcType)[0].getReturnType()`, which uses two
sync calls but doesn't trigger the panic. Slightly less precise (picks
the first overload) but sound for lint purposes.
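The routing and the CallExpression fallback chain can be sketched over a toy checker in which types are plain strings. `RawToyChecker`, `wrapToyChecker` and the node shape are assumptions made for illustration; only the method names mirror the ones mentioned above.

```typescript
interface RoutedNode {
  kind: string;
  typeNode?: RoutedNode;   // for `x as T`: the T node
  expression?: RoutedNode; // for `x!`: the inner expression
}
interface ToySignature { returnType: string }
interface RawToyChecker {
  getTypeAtLocation(node: RoutedNode): string;
  getTypeFromTypeNode(node: RoutedNode): string;
  getResolvedSignature(node: RoutedNode): ToySignature; // may throw ("panic")
  getCallSignatures(funcType: string): ToySignature[];
  getNonNullableType(type: string): string;
}

function wrapToyChecker(raw: RawToyChecker): { getTypeAtLocation(node: RoutedNode): string } {
  function routedGetType(node: RoutedNode): string {
    switch (node.kind) {
      case "AsExpression":
      case "TypeAssertionExpression":
      case "SatisfiesExpression":
        return node.typeNode ? raw.getTypeFromTypeNode(node.typeNode) : raw.getTypeAtLocation(node);
      case "CallExpression":
      case "NewExpression":
        try {
          return raw.getResolvedSignature(node).returnType;
        } catch {
          // Fallback: first call signature of the function type (first overload).
          const funcType = raw.getTypeAtLocation(node);
          const first = raw.getCallSignatures(funcType)[0];
          return first ? first.returnType : funcType;
        }
      case "NonNullExpression":
        // Recurse through the routed function so inner calls get their own routing.
        return node.expression
          ? raw.getNonNullableType(routedGetType(node.expression))
          : raw.getTypeAtLocation(node);
      default:
        return raw.getTypeAtLocation(node);
    }
  }
  return { getTypeAtLocation: routedGetType };
}

// Toy checker for `map.get('foo')!`: the raw call query yields the function type.
const TOY_FUNC_TYPE = "(key: string) => number | undefined";
const toyCallNode: RoutedNode = { kind: "CallExpression" };
const toyNonNullNode: RoutedNode = { kind: "NonNullExpression", expression: toyCallNode };
const rawToyChecker: RawToyChecker = {
  getTypeAtLocation: (node) => (node.kind === "CallExpression" ? TOY_FUNC_TYPE : "unknown"),
  getTypeFromTypeNode: () => "unknown",
  getResolvedSignature: () => {
    throw new Error("simulated panic");
  },
  getCallSignatures: (funcType) =>
    funcType === TOY_FUNC_TYPE ? [{ returnType: "number | undefined" }] : [],
  getNonNullableType: (type) =>
    type.split(" | ").filter((part) => part !== "undefined" && part !== "null").join(" | "),
};

const routedChecker = wrapToyChecker(rawToyChecker);
const routedCallType = routedChecker.getTypeAtLocation(toyCallNode);
const routedNonNullType = routedChecker.getTypeAtLocation(toyNonNullNode);
```

In this toy run the call resolves to `number | undefined` through the fallback, and the `!` expression to `number`, because the NonNull branch recurses through the routed function rather than the raw checker.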
Status:
- 27 / 59 files complete (was 24).
- 427 messages (was 433). Tail of false-positives now from rules whose
remaining checker shims (getApparentType identity-fallback,
isTypeAssignableTo always-false) over- or under-approximate.
- ts-api-utils' `isInConstContext` walk-up edge case still 1 occurrence.
- All existing tests pass.
Two refinements after tracing remaining false positives:
1. PropertyAccessExpression callee: tsgo's
`getSymbolAtLocation(propAccess)` returns the LEFT (receiver) symbol;
for `info.languageService.getProgram`, sym is `languageService`, not
`getProgram`. ts returns the property's symbol (the rhs). Probe the
property name node first (`callee.name`) before falling back to the
callee itself for identifier-callee cases.
2. NonNullExpression's `getTypeAtLocation(.expression)` recursion was
bypassing the adapter by calling `project.checker.getTypeAtLocation`
directly. For nested cases like `someMap.get(k)!`, the inner
CallExpression needs its OWN routing applied to surface the call's
return type before `getNonNullableType` strips nullability. Recurse
through the wrapped checker so the dispatch compounds correctly.
Status:
- 29 / 59 files complete (was 27).
- Tail of false positives now concentrates on `.pop()!` style assertions
where Plan A panics, Plan B's funcType isn't the full method type,
Plan C's symbol-via-name probe doesn't resolve. Likely needs another
tsgo API entry to surface the exact method signature
(`getResolvedSignature` is the canonical one but panics; known tsgo
issue, not adapter shape).
- ts-api-utils' `isInConstContext` walk-up edge case still 1 occurrence
(pre-existing in the third-party utility).
- All existing tests pass.
Two attempts at making the checker shims more accurate, with mixed
results dogfooded against the full repo:
getApparentType: identity-fallback → flag-routed:
- TypeParameter input → getConstraintOfTypeParameter (fall through
to the input if no constraint)
- StringLiteral / NumberLiteral / BooleanLiteral / BigIntLiteral /
EnumLiteral input → getWidenedType
- Otherwise input
Sound, but with no measurable effect: 432 → 431 messages on dogfood.
The rule paths that were calling it weren't producing the visible FPs;
the cleanup is defensible regardless.
isTypeAssignableTo: tried structural cover (id equality, any/unknown/
never sentinels, union decomposition, literal widening), reverted
to always-false:
- Cover impl returned `true` for cases TS would say `false` (no
full structural compat without porting the checker subtype
machinery), which then made the rule's contextually-unnecessary
path fire on assertions where the receiver type wouldn't
actually accept the source type.
- Net effect: +26 new FPs on dogfood (different message variant —
"receiver accepts the original type"), zero reduction in the
dominant 433 "does not change the type" FPs.
- Conservative `() => false` keeps the rule on its
"can't prove assignable, leave alone" branch — same branch ts
takes when the checker can't decide. Better than partial
truth.
Honest accounting of FP source — which I overclaimed earlier:
- getResolvedSignature panic on generic method calls: confirmed
tsgo upstream (minimal repro independent of adapter)
- getTypeAtLocation on PropertyAccess returning wrong narrowed
type in real-repo context: minimal repro shows correct behaviour
(`string | undefined`), but instrumented dogfood shows wrong
types (`string`, `string[]`) on the same source. Interaction
effect with adapter prepass / session state — root cause
unidentified. Could be adapter, could be tsgo, can't say.
- Other FPs: distributed across yet-untraced rule paths.
Conclusion: the dominant FPs aren't dispatched by the two stubs we
just adjusted. They originate further up — likely in
getTypeAtLocation results that misalign with ts in real-codebase
contexts but match in isolation. Diagnosing requires per-FP
instrumentation.
Reverted the previous revert. Implementing each shim faithfully is the
right metric, not "produces few messages on our codebase". Conservative
`() => false` was lying for cases that ARE assignable, suppressing
diagnostics that should fire; the apparent reduction in FPs included
silenced true positives.
Reinstate the structural cover (id eq, any/unknown/never sentinels,
union decomposition, literal widening). Returns `false` for the
long-tail structural-compat cases that need the full checker subtype
machinery: sound (no false `true`) over those, consumers treat unknown
as "can't prove" rather than "definitely false".
Sticking with the principle: implement to API contract. Verifying
correctness per-rule is downstream work; suppressing visible noise via
shim dishonesty is not.
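A sketch of what such a structural cover could look like over a toy type model. `CoverType` and `coverIsAssignable` are hypothetical and much simpler than real checker types; the point is the order of the cheap checks and the conservative `false` at the end.

```typescript
type CoverTag =
  | "any" | "unknown" | "never" | "undefined"
  | "string" | "number" | "boolean"
  | "literal" | "union" | "object";

interface CoverType {
  id: number;
  tag: CoverTag;
  base?: "string" | "number" | "boolean"; // for literals: the widened primitive
  members?: CoverType[];                  // for unions
}

function coverIsAssignable(source: CoverType, target: CoverType): boolean {
  if (source.id === target.id) return true;        // identity
  if (source.tag === "never") return true;         // never fits anywhere
  if (target.tag === "any" || target.tag === "unknown") return true;
  if (source.tag === "any") return target.tag !== "never";
  if (source.tag === "union") {
    // Every constituent of the source must fit the target.
    return (source.members ?? []).every((m) => coverIsAssignable(m, target));
  }
  if (target.tag === "union") {
    // The source must fit at least one constituent of the target.
    return (target.members ?? []).some((m) => coverIsAssignable(source, m));
  }
  if (source.tag === "literal") return source.base === target.tag; // literal widening
  return false; // long tail: would need the real subtype machinery
}

const tString: CoverType = { id: 1, tag: "string" };
const tNumber: CoverType = { id: 2, tag: "number" };
const tUndefined: CoverType = { id: 3, tag: "undefined" };
const tAny: CoverType = { id: 4, tag: "any" };
const tNever: CoverType = { id: 5, tag: "never" };
const litFoo: CoverType = { id: 10, tag: "literal", base: "string" };
const stringOrUndefined: CoverType = { id: 11, tag: "union", members: [tString, tUndefined] };
const stringOrNumber: CoverType = { id: 12, tag: "union", members: [tString, tNumber] };
const objectA: CoverType = { id: 20, tag: "object" };
const objectB: CoverType = { id: 21, tag: "object" };

const literalIntoUnion = coverIsAssignable(litFoo, stringOrUndefined);
const unionIntoString = coverIsAssignable(stringOrNumber, tString);
const anyIntoNever = coverIsAssignable(tAny, tNever);
const objectIntoObject = coverIsAssignable(objectA, objectB);
```

Two distinct object types fall through to `false` here even if they happen to be structurally compatible, which is the "can't prove" answer rather than a claim of incompatibility.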
tsgo's `getTypeAtLocation(propAccess)` returns wrong types in some
real-codebase contexts. Investigation:
- Real cli/index.ts has 3 occurrences of `project.configFile!` where
the declared type is `string | undefined`. tsgo returns:
line 455 → `string[]` (the type of the PREVIOUS argument)
line 463 → `string` (no undefined)
line 575 → `string` (no undefined)
- ts.Program returns `string | undefined` for all 3 (correct).
- Querying `getTypeAtPosition(file, propAccess.end)` on the SAME
positions returns correct `string | undefined`. Same checker, same
snapshot, different API entry, different answer.
The bug is in tsgo's node-based `getTypeAtLocation` for PropertyAccess
in some surrounding-context cases — minimal repros don't trigger; the
divergence needs the full file's complexity. Position-based query
isn't affected.
Adapter routing: for PropertyAccessExpression / ElementAccessExpression,
delegate to `getTypeAtPosition(file, end)`. Sound — same checker, same
type — and produces ts-aligned results in cases that were FPing before.
Effect: 432 → 377 messages on dogfood (~55 fewer FPs). One file moved
from "passed clean" to "has some message" (likely a true positive that
was previously masked, or a different FP shape revealed; not yet
classified). All existing tests pass.
Verifies the --tsgo slowdown vs ts.Program is structural, not from
adapter mistakes. On Dify web/ (~5000 .tsx, single project):
createBackend (updateSnapshot cold) 0.8s 4%
prepass total (3600 prepared × ~3ms/file) 11.1s 60%
rule exec + per-rule checker queries 6.5s 36%
----------------------------------------- ---- ----
Wall 18.4s 100%
ts.Program wall on the same setup 10.4s
The 8s gap concentrates in prepass — necessary tsgo cost: each lint()
must batch-resolve ~300 identifiers/file via sync RPC because per-call
checker queries cost ~78us each (vs in-process ts.Program: zero IPC).
Without prepass, rules hitting per-id symbol queries → 5000×300×78us =
~117s. Prepass at 3ms/file is already near the IPC floor (DataView
serialization + cross-process batched query).
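The back-of-envelope numbers above can be reproduced directly. The inputs are the figures quoted in this message; the variable names are only for illustration.

```typescript
// Figures taken from the measurements quoted above.
const lintedFileCount = 5000;
const identifiersPerFile = 300;
const microsPerCheckerCall = 78;
const preparedFileCount = 3600;
const prepassMillisPerFile = 3;

// One sync RPC per identifier, no batching: 5000 × 300 × 78us = 117s.
const perIdentifierSeconds =
  (lintedFileCount * identifiersPerFile * microsPerCheckerCall) / 1e6;

// One batched RPC per prepared file: 3600 × 3ms = 10.8s, in line with the
// ~11.1s measured for the prepass stage.
const batchedPrepassSeconds = (preparedFileCount * prepassMillisPerFile) / 1000;

// Ratio between the unbatched estimate and the batched estimate.
const batchingSpeedup = perIdentifierSeconds / batchedPrepassSeconds;
```

The ratio comes out at roughly an order of magnitude, which is the case for batching whenever rules really do query most identifiers.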
What we ruled out:
- duplicate prepass (preparedFiles Set guards)
- redundant RPC (location-fallback only on the ~10% position-based
misses)
- heavy adapter-internal work (negligible vs RPC time)
What's structural:
- cross-process tsgo means every checker query is sync RPC
- lint workloads ask many type questions per file
- in-process ts.Program has zero IPC for the same questions
Toggling: set TSSLINT_TIME_TSGO=1 to print createBackend, per-100-file
prepare cumulative, and final-summary stage breakdown. Default off; the
overhead is a single env-var read per call when off.
Conclusion (verified, not assumed): the 1.7× slowdown isn't from
something we did wrong. It's the IPC floor of running a cross-process
type checker for an N-files-with-M-queries-each workload.
Verifies what symbol resolution actually costs on the tsgo path. The
batched `getSymbolAtPosition` prepass turns out to be a workload-
dependent optimization, not a free win.
Why it can be skipped: tsgo's Symbol comes from the binder, not the
checker. The binder runs at `updateSnapshot` time on the Go side; every
`getSymbolAtPosition` call is just a hash-table lookup. The 3ms/file
cost we saw is almost entirely cross-process plumbing — msgpack
serialize, sync RPC, Symbol-object construction — not type-checking work.
Bench (3 runs each, median wall, --force):
Dify web/ (~5000 .tsx, single rule react-x/no-leaked-conditional-rendering)
  prepass on   18.8s   2982 passed · 1611 errors · 55 messages
  prepass off   8.5s   2982 passed · 1611 errors · 55 messages
  → identical correctness, 2.2× faster
tsslint own repo (7 projects, full ts-eslint via importESLintRules)
  prepass on   36.8s   28 passed · 376 messages
  prepass off  37.0s   26 passed · 391 messages
  → ~equivalent perf, slight correctness shift
The Dify result shows prepass costs 11s of upfront RPC for queries the
configured rule barely makes. The adapter's per-Node `nodeToSymbol`
WeakMap already caches lazy `getSymbolAtLocation` results, so on
heavy-symbol workloads (tsslint repo) lazy mode catches up via cache
hits rather than upfront batching.
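A per-node WeakMap cache of this kind can be sketched as below. `makeLazySymbolLookup` is a hypothetical helper, and caching misses (an `undefined` result) is an assumption of the sketch, not something the commit states about the adapter.

```typescript
// Wraps an expensive per-node query (standing in for a sync RPC) with a
// WeakMap so each node object is queried at most once.
function makeLazySymbolLookup<N extends object, S>(
  queryOverRpc: (node: N) => S | undefined,
): (node: N) => S | undefined {
  // Results are boxed so that a cached `undefined` is distinguishable from "not cached".
  const nodeToSymbol = new WeakMap<N, { symbol: S | undefined }>();
  return (node) => {
    const cached = nodeToSymbol.get(node);
    if (cached) return cached.symbol;
    const symbol = queryOverRpc(node);
    nodeToSymbol.set(node, { symbol });
    return symbol;
  };
}

let simulatedRpcCalls = 0;
const lookupSymbol = makeLazySymbolLookup((node: { text: string }) => {
  simulatedRpcCalls++;
  return node.text === "known" ? { symbolName: "known" } : undefined;
});

const knownNode = { text: "known" };
const missingNode = { text: "missing" };
const firstLookup = lookupSymbol(knownNode);
const secondLookup = lookupSymbol(knownNode); // served from the WeakMap
lookupSymbol(missingNode);
lookupSymbol(missingNode);                    // the miss is cached too
```

Because the keys are held weakly, cached entries do not keep nodes alive once the AST itself is dropped.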
Defaulting to prepass-on for now because the slight tsslint-repo
correctness drift (~2 passed files, +15 messages) is unexplained — a
silent shift in lint output from a perf flag is the wrong default until
diagnosed. `TSSLINT_NO_PREPASS=1` opens the toggle for users who want
to trade that risk for the 2× speedup on rule mixes that don't query
many symbols.
Answers the user's question: no, the slowdown isn't checker work; it's
IPC overhead for symbol queries we don't always need.
…LS=1)
Replaces tsgo's `getSymbolAtPosition` IPC for in-file symbol resolution
with a real ts-side bind + scope walker. The Symbol returned is a
genuine ts.Symbol (not a tsgo Symbol with prototype shims), and stays
identity-stable across queries.
Architecture
------------
tsgo remains the source of truth for AST and Type. Symbol resolution
splits by query level:
Layer A (variable refs, declaration names, in-file specifiers,
in-file type refs)
→ ts.createSourceFile + ts.bindSourceFile + scope walker
→ in-process, ~0.36ms/file, real ts.Symbol
Layer C (property names on imported types, lib globals, and any
cross-file resolution)
→ fall back to tsgo's checker via existing IPC path
The fallback fires when the JS-side scope walker returns undefined for
a given identifier — no upfront classification needed, the lookup tells
us itself.
Implementation
--------------
- `lib/real-ts.ts`: captures the genuine `typescript` module reference
before the tsgo facade installs its `Module._resolveFilename` hook.
Worker eager-imports it at top-level so internal CLI code that needs
real ts behaviour (parser/binder/scanner) bypasses the facade.
- `lib/tsgo-js-symbols.ts`: parse + bind a file via real ts, build a
`(pos, end, kind)` position map for tsgo Node → JS Node lookup, and
do a scope walk on Identifier resolve. Kind remap covers tsgo's
offset-shifted SyntaxKind enum vs ts's.
- `lib/tsgo-backend.ts`:
- `prepareFile` under TSSLINT_JS_SYMBOLS=1 skips the tsgo IPC
prepass and just binds the JS-side SF (lazy: scope walks happen
per-query, not eagerly).
- `getSymbolAtLocation` tries JS-side resolver first, falls
through to tsgo on miss.
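The "JS side first, tsgo on miss" lookup can be sketched with a span-keyed map. This collapses the real flow (tsgo Node → JS Node → scope walk) into a single map lookup for brevity; `spanKey`, `makeHybridResolver` and the toy remap are hypothetical, and only the 79 → 80 Identifier pair comes from the earlier commit.

```typescript
interface SpanNode { pos: number; end: number; kind: number }

// Key built from (pos, end, kind), with kind already in real-ts numbering.
const spanKey = (pos: number, end: number, kind: number): string => `${pos}:${end}:${kind}`;

function makeHybridResolver<S>(
  jsSymbolsBySpan: Map<string, S>,
  remapKind: (tsgoKind: number) => number,
  tsgoFallback: (node: SpanNode) => S | undefined,
): (node: SpanNode) => S | undefined {
  return (node) => {
    const local = jsSymbolsBySpan.get(spanKey(node.pos, node.end, remapKind(node.kind)));
    // The lookup itself tells us whether the fallback is needed.
    return local !== undefined ? local : tsgoFallback(node);
  };
}

// Toy data: the JS side bound one identifier at [7, 10) with ts kind 80.
const jsSideSymbols = new Map<string, string>([[spanKey(7, 10, 80), "js:Foo"]]);
const toyKindRemap = (tsgoKind: number): number => (tsgoKind === 79 ? 80 : tsgoKind);

let hybridFallbackCalls = 0;
const resolveHybrid = makeHybridResolver(jsSideSymbols, toyKindRemap, (node) => {
  hybridFallbackCalls++;
  return node.pos === 20 ? "tsgo:Map" : undefined;
});

const inFileSymbol = resolveHybrid({ pos: 7, end: 10, kind: 79 });     // JS-side hit
const crossFileSymbol = resolveHybrid({ pos: 20, end: 23, kind: 79 }); // falls through
```

Only the second query reaches the fallback, so in-file identifiers never pay for a round-trip in this sketch.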
Bench (Dify web/, ~5000 .tsx; median of 2 runs, --force):
Mode                        Wall    Passed  Errors  Messages
A. tsgo prepass on          16.8s   2982    1611    55
B. tsgo no-prepass           8.0s   2982    1611    55
C. tsgo JS-symbols (this)    9.0s   2983    1621    50
D. ts.Program                9.5s   2961    1867     9
Bench (tsslint own repo, 7 projects with full ts-eslint; median of 3 runs):
Mode                        Wall    Passed  Messages
A. tsgo prepass on          37.1s   29      396
B. tsgo no-prepass          37.0s   27      407
C. tsgo JS-symbols (this)   37.9s   29      251  ← -145 FPs vs A
D. ts.Program                1.6s   60       12
C trade-offs
------------
- vs A (default): much faster on Dify (-7.8s), same speed on tsslint,
better correctness on both (more passed / fewer messages).
- vs B (no-prepass): +1s on Dify (cost of eager bind even when rules
don't query symbols), tied on tsslint, but ~10 more true positives
detected and -156 messages on tsslint repo (407 → 251; the JS-side
scope walker satisfies queries that B's lazy tsgo answers were
imprecise on).
- vs D (ts.Program): on Dify faster by 0.5s; on tsslint dramatically
slower because 7 projects × 0.8s tsgo setup cost dominates a small
codebase.
C is now the best-correctness tsgo mode. Defaulting OFF until the
trade-off is examined per-codebase.
Tests unchanged: tsgo-backend (12), program-host (17), cache-flow (50),
meta-frameworks (8) all pass.
Removes the original tsgo `getSymbolAtPosition` batched IPC prepass
entirely. Symbol resolution now defaults to:
1. real-ts `bindSourceFile` per file (~0.36ms/file in-process)
2. lazy JS scope walker on getSymbolAtLocation
3. tsgo IPC fallback (position-based first, then node-based) only
when JS resolver returns undefined
Architecture rationale (validated by Dify-scale measurement):
- Symbol is binder output, not checker output. tsgo's binder runs at
`updateSnapshot` on the Go side; every getSymbolAtPosition was a
cross-process round-trip serializing pre-computed binder data.
- real ts in-process binder + scope walker gives the same answer for
Layer A (variable refs, declarations, in-file specifiers, type
refs). Returns real ts.Symbol with stable identity — no
prototype-shim wrapper needed.
- Layer C (property names on imported types, lib globals, anything
cross-file) falls through to tsgo's checker, position-based first
to recover the prepass's specifier coverage.
Removed:
- TSSLINT_NO_PREPASS env (always-off path is irrelevant — there is
no batched prepass to disable)
- TSSLINT_JS_SYMBOLS env (now the only path)
- 50 LOC of position+location batched prepass + fallback code in
`prepareFile`
- `idKind` capture in `createTsgoBackend` (prepareFile no longer
walks the SF to collect identifiers)
- `_prepareWalk` / `_prepareBatchSym` / `_prepareFallbackSym` timing
counters (only `_prepareGetSF` and new `_prepareBind` remain)
Numbers (median of 3, --force):
Dify web/ (~5000 .tsx, single rule):
before this PR (prepass-on) 17.0s 2982 passed · 1611 errors · 55 msg
after this PR (JS-symbols) 9.2s 2983 passed · 1621 errors · 50 msg
ts.Program baseline 10.4s 2961 passed · 1867 errors · 9 msg
tsslint own repo (7 projects, full ts-eslint):
before this PR (prepass-on) 37.1s 29 passed · 396 messages
after this PR (JS-symbols) 37.6s 29 passed · 247 messages
Wins on both correctness and speed at Dify scale; on tsslint repo wins
on FP count (-149 messages) at parity speed. The 13% Layer A recall gap
the JS walker has against whole-program ts (mainly globals) is covered
by the tsgo IPC fallback in wrapChecker, so end-to-end recall through
the adapter matches the previous prepass.
All test suites pass: tsgo-backend (12), program-host (17),
cache-flow (50), meta-frameworks (8).
… kind fallback

Closes the three tsslint-side gaps the JS-symbols spike left open.

(1) Memory lifetime: per-backend caches

The bound-SourceFile + position-map caches in `tsgo-js-symbols.ts`
moved from module-level singletons into the closure of
`createJsSymbolResolver`. Each `createTsgoBackend` call constructs
its own resolver and registers it via `jsSymbolResolverRef.current`;
`backend.close()` calls `resolver.clear()` and unregisters.
Multi-project worker setups no longer share stale binds across
snapshots, and long-running CLI invocations don't accumulate cached
ASTs across projects.

(2) --fix invalidation

`getJsSourceFile` now compares the cached SF's `.text` against the
incoming text on every call. On mismatch (post-`--fix` rewrite), the
old SF and its position maps are dropped before re-binding. The
backend exposes a public `invalidateFile(fileName)` API; the worker's
`--fix` path calls it right after stashing the rewritten text in
`fileTextOverrides`, so the next `prepareFile` rebinds against the
post-fix content. Without this, scope queries on edited files would
return symbols from stale declarations.

(3) Kind remap robustness

`tsgo SyntaxKind` and `ts SyntaxKind` enum names overlap ~98% by name
but a handful diverge (tsgo-only `JSImportDeclaration`,
`JSTypeAliasDeclaration`, etc.). The previous remap returned the
unmapped tsgo value as-is, making position-key lookups silently miss.
Added a parallel position-only map (pos→first-node-at-span) consulted
when the kind key misses, so resolution falls through to "best-effort
node at this span" instead of "no answer at all". Recall on Dify
unchanged (2983/1621/50): the affected node kinds happen to not be
Identifier-positioned in the rules under test, but the safety net is
in place.

Test additions

- New test 6 in `tsgo-backend.test.ts`: invalidate + re-prepare on
  unchanged file, verify identifier still resolves and returns
  equivalent symbol. Exercises the change-detection short-circuit
  and the post-invalidate rebind path.

Bench (3 runs, --force):

  Dify web/ tsgo default   9.2-9.6s   2983/1621/50 (unchanged)
  Dify web/ tsgo --fix     9.4s       2983/1621/50 (no JS errors)

Tests: tsgo-backend 15/15 (was 12), program-host 17/17,
cache-flow 50/50, meta-frameworks 8/8.
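The fallback described under (3) amounts to a two-level lookup. The following is a minimal sketch of that idea, not the adapter's actual code: `IndexedNode`, `NodeIndex`, and the numeric kind values are hypothetical stand-ins, and the only behaviour taken from the commit message is that a position-only map (first node at a span) is consulted when the kind-qualified key misses.

```typescript
// Hypothetical stand-in for a bound node; the real adapter works on ts.Node.
interface IndexedNode {
  kind: number; // ts-side SyntaxKind value (arbitrary numbers in this sketch)
  pos: number;
  end: number;
  label: string;
}

class NodeIndex {
  // Primary map: key includes the kind. Fallback map: span only.
  private byKindAndSpan = new Map<string, IndexedNode>();
  private bySpan = new Map<string, IndexedNode>();

  add(node: IndexedNode): void {
    this.byKindAndSpan.set(`${node.kind}:${node.pos}:${node.end}`, node);
    const spanKey = `${node.pos}:${node.end}`;
    // Keep only the first node seen at a span.
    if (!this.bySpan.has(spanKey)) this.bySpan.set(spanKey, node);
  }

  // `kind` may be an unmapped tsgo value that no ts-side node carries.
  lookup(kind: number, pos: number, end: number): IndexedNode | undefined {
    const exact = this.byKindAndSpan.get(`${kind}:${pos}:${end}`);
    if (exact) return exact;
    // Kind key missed: fall through to "best-effort node at this span".
    return this.bySpan.get(`${pos}:${end}`);
  }
}

const index = new NodeIndex();
index.add({ kind: 80, pos: 10, end: 13, label: "foo" });

const exactHit = index.lookup(80, 10, 13);      // kind key hits
const fallbackHit = index.lookup(9999, 10, 13); // unmapped kind, span still resolves
const miss = index.lookup(80, 50, 53);          // nothing at this span
```

The trade-off is that the fallback can return a node of a different kind than the caller asked for, which is why the commit message calls it best-effort.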
The per-backend JsSymbolResolver was accumulating all linted files'
bound SFs + position maps in Node memory until backend.close(). For
Dify (5000 files × ~30KB bound SF + position maps), that pinned
~520 MB unnecessarily.
After lint() returns for a file, the JS-side bind serves no further
purpose — symbols for that file's identifiers have already been
queried, and rule code from later files queries against THEIR own
bound SFs. The bound SF can be released for GC.

TsgoBackend.releaseFile(name)
  Drops the bound SourceFile + position maps from the per-backend
  JsSymbolResolver. Distinct from invalidateFile (which is for
  --fix rewrites and re-binds against new text); release simply
  discards because we're done.

worker.lint(fileName, ...) → returns diagnostics
  Calls tsgoBackend?.releaseFile(fileName) just before return.
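
The cache lifecycle across these two commits (per-backend caches, text-compare invalidation, `invalidateFile`, `releaseFile`, `clear()`) can be sketched roughly as below. This is an illustration under assumptions, not the code in `tsgo-js-symbols.ts`: `createResolverSketch`, `BoundFile`, and `getBound` are hypothetical names, and binding is reduced to a counter so the block runs without dependencies.

```typescript
// Hypothetical stand-in for a bound SourceFile plus its position maps.
interface BoundFile {
  text: string;
  bindId: number; // increments on every (re)bind so rebinding is observable
}

function createResolverSketch() {
  // Caches live in this closure, so each backend instance gets its own.
  const cache = new Map<string, BoundFile>();
  let binds = 0;

  return {
    // Returns the cached bind unless the text changed (e.g. after --fix).
    getBound(fileName: string, text: string): BoundFile {
      const cached = cache.get(fileName);
      if (cached && cached.text === text) return cached;
      const fresh: BoundFile = { text, bindId: ++binds };
      cache.set(fileName, fresh); // replaces any stale entry
      return fresh;
    },
    // Explicit drop after a --fix rewrite; the next getBound rebinds.
    invalidateFile(fileName: string): void {
      cache.delete(fileName);
    },
    // Drop once lint() is done with the file; same effect here, different intent.
    releaseFile(fileName: string): void {
      cache.delete(fileName);
    },
    // Called when the backend closes.
    clear(): void {
      cache.clear();
    },
    size(): number {
      return cache.size;
    },
  };
}

const resolverA = createResolverSketch();
const resolverB = createResolverSketch();

const first = resolverA.getBound("a.ts", "let x = 1");
const again = resolverA.getBound("a.ts", "let x = 1");      // same text: cache hit
const afterFix = resolverA.getBound("a.ts", "const x = 1"); // text changed: rebind
resolverA.releaseFile("a.ts");                              // done linting a.ts
const sizes = [resolverA.size(), resolverB.size()];
```

In this sketch `invalidateFile` and `releaseFile` do the same thing mechanically; they are kept separate only to mirror the distinction the commit message draws between "text changed, rebind next time" and "finished with this file".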
Memory bench (Dify web/, ~5000 .tsx, --force):
                Node RSS   Go subprocess   Total
  ts.Program    2.27 GB    -               2.27 GB
  tsgo before   2.68 GB    1.59 GB         4.27 GB (1.9×)
  tsgo after    2.16 GB    1.42 GB         3.58 GB (1.58×)
Node-side dropped 524 MB; tsgo Node now smaller than ts.Program.
Total still higher than ts.Program (Go subprocess holds AST+types
independently), but the Node-side fat from accumulating bound SFs is
gone.
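
As a quick check on the multipliers in the table, the arithmetic can be reproduced from the rounded GB figures. This is only a worked calculation over the numbers quoted above; the variable names are illustrative, and the result for the Node-side drop is in rounded GB rather than the measured 524 MB.

```typescript
// Figures copied from the memory bench table, in GB.
interface Mem {
  node: number;
  go: number;
}

const baselineTotal = 2.27;                    // ts.Program (no Go subprocess)
const before: Mem = { node: 2.68, go: 1.59 };  // tsgo before releaseFile
const after: Mem = { node: 2.16, go: 1.42 };   // tsgo after releaseFile

// Total footprint relative to the ts.Program baseline.
const ratio = (m: Mem): number => (m.node + m.go) / baselineTotal;

const beforeRatio = ratio(before).toFixed(2);             // "1.88", shown as 1.9× in the table
const afterRatio = ratio(after).toFixed(2);               // "1.58"
const nodeDropGb = (before.node - after.node).toFixed(2); // "0.52"
const nodeBelowBaseline = after.node < baselineTotal;     // true
```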
Multi-project scenario (tsslint own repo, 7 projects) NOT helped by
this — there the accumulation is in the tsgo client SourceFileCache
across snapshot boundaries, not in our per-file bind. That's an
upstream `@typescript/native-preview` cache lifecycle issue.
Speed unchanged (Dify 8.9-9.7s, 2983/1621/50). Tests still 15/15.
TL;DR

Spike for tsgo (`@typescript/native-preview`) backend integration via `--tsgo`. It doesn't currently win: it loses on memory and recall on every workload, and loses on speed except marginally on large single-project codebases. The architecture is sound; the gap is upstream tsgo maturity. Recommend hold-on-branch.

What this is

Opt-in `--tsgo` flag that swaps tsslint's lint backend from `ts.Program` to a hybrid built on `@typescript/native-preview`'s sync API.

Architecture, after several pivots:

- `bindSourceFile` + scope walker. No IPC.

The pivot away from "tsgo for everything" came after measurement showed that cross-process IPC dominates per-file lint cost. Symbol queries via tsgo's batched `getSymbolAtPosition` cost 11s on Dify (~5000 files). The same questions answered by real-ts `bindSourceFile` + scope walker took 1.8s: Symbol is binder output and doesn't need IPC.

Numbers (full)

Median of 3 runs, `--force`. Memory = peak RSS via `/usr/bin/time -l` (Node) + sampled `ps` (tsgo subprocess).

Dify web/: 5000 .tsx, single project, one rule (`react-x/no-leaked-conditional-rendering`)

(table: `ts.Program` (baseline) vs `--tsgo`)

`--tsgo` wins speed by 12%. Loses memory by 58%. Misses 246 true positives. Reports 41 more crash-class diagnostics (mostly tsgo upstream `Type.Types` panics surfaced by ts-api-utils).

tsslint own repo: 50 files across 7 projects, full ts-eslint via `importESLintRules`

(table: `ts.Program` (baseline) vs `--tsgo`)

Catastrophic on small/multi-project. Each project setup pays a fixed `updateSnapshot` cost (~0.8s) plus a tsgo client `SourceFileCache` allocation that doesn't release across snapshot boundaries. Half the files don't complete due to upstream tsgo edge-case panics that the larger Dify file sample happens to avoid.

Upstream tsgo blockers (not addressable here)

These are why recall is below `ts.Program` and why this can't ship as a user-facing feature today.

1. `Checker.getResolvedSignature` panics on generic method calls (`Map.get`, `Array.pop`). A minimal repro reproduces with raw tsgo + 30 LOC, no adapter. CLI compile of the same fixture works, so the failure is specific to the IPC API path.
2. `Checker.getTypeAtLocation` returns wrong types for PropertyAccess in some real-codebase contexts. Reproduces on Dify but not on hand-built minimal fixtures. `getTypeAtPosition` returns correct types for the same expressions, so `513d7d7` routes via that. The underlying API divergence remains.
3. `isTypeAssignableTo` / `getApparentType` not in the API. The adapter has structural-cover shims (sound but incomplete; the long tail returns `false` / input). Tracked in microsoft/typescript-go#3610. Source of most missed true positives.
4. No Volar host injection. `--tsgo` rejects `--vue-project` / `--mdx-project` / `--astro-project` / `--vue-vine-project` with a clear error. Meta-frameworks stay on `ts.Program`.
5. No BuilderProgram JS API. Layer-2 incremental cache (affected-file classification) is disabled on `--tsgo`; it treats every file as affected.
6. `SourceFileCache` doesn't release across snapshot boundaries. Drives the 5.8× memory blowup on multi-project lint. Per-file `releaseFile` (commit `3e161c2`) only releases what tsslint owns; the tsgo client's hydrated AST + Type/Symbol object cache stays pinned until snapshot dispose.

What this PR does own

Within tsslint's adapter layer, every issue we can address has been addressed:

- Per-backend caches, with `clear()` on `backend.close()`
- `invalidateFile(name)` for `--fix` rewrites; worker wires it
- `releaseFile(name)` after each file's lint pass: drops bound SF + position maps for GC
- Kind-remap fallback for tsgo-only kinds (e.g. `JSImportDeclaration`)

Tests:

- `tsgo-backend.test.ts`: 15 cases (adapter, JS-symbols, invalidation, Symbol identity)
- `program-host.test.ts`: 17/17 unchanged
- `cache-flow.test.ts`: 50/50 unchanged
- `meta-frameworks.test.ts`: 8/8 unchanged

Recommendation: hold on branch

`--tsgo` doesn't beat `ts.Program` on any workload that matters today.

Architecturally the work is sound and reusable. When upstream addresses any of (1)/(3)/(6) above, this branch becomes immediately useful. Rebasing onto master should remain cheap because we changed almost nothing outside the new `lib/tsgo-*.ts` files.

The right call is to leave this open as a draft and revisit when upstream tsgo ships:

- `getResolvedSignature` for generic method calls
- `isTypeAssignableTo` / `getApparentType`
- `SourceFileCache.releaseSourceFile(path)` (or in-process FFI binding)

Test plan

- `node packages/cli/test/tsgo-backend.test.js`: 15/15
- `node packages/cli/test/program-host.test.js`: 17/17
- `node packages/cli/test/cache-flow.test.js`: 50/50
- `node packages/cli/test/meta-frameworks.test.js`: 8/8
- `--fix` exercise on Dify: no JS errors, parity with non-fix run
- Default path (no `--tsgo`) regression check: no change in output or perf