Fix ground truth for inheritance/MRO benchmarks (Liskov substitution)#14

Open
jaltmayerpizzorno wants to merge 21 commits into secure-software-engineering:main from plasma-umass:fix-inheritance-ground-truth

jaltmayerpizzorno (Contributor) commented Mar 13, 2026

Hi! Thanks again for creating and maintaining TypeEvalPy — it has been an invaluable resource for our work evaluating type inference tools.

While running the benchmarks, we noticed that the ground truth annotations in 5 inheritance/MRO benchmarks (6 methods in total) record only the return type of each method's own body, without accounting for the Liskov substitution principle (LSP): an override's return type must be compatible with the return type declared on the parent method. When the code is annotated as given, mypy --strict reports incompatible-override errors on all 5 benchmarks. Widening each parent method's return type to include the return types of its subclass overrides resolves these errors and makes the annotations consistent with what a type-safe program requires.
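To illustrate the issue, here is a minimal sketch using hypothetical classes (the names Child and describe and the method bodies are assumptions for illustration, not the actual benchmark source). If MyClass.func were annotated as returning only str, mypy would report the int-returning override as incompatible; widening the parent to int|str makes the override compatible:

```python
from typing import Union


class MyClass:
    # Annotating this as "-> str" (the body's own type only) would make
    # mypy reject Child.func below as an incompatible override.
    def func(self) -> Union[int, str]:
        return "base"


class Child(MyClass):
    # int is a subtype of Union[int, str], so this override is compatible.
    def func(self) -> int:
        return 1


def describe(obj: MyClass) -> str:
    # A caller holding a MyClass must be prepared for either return type.
    result = obj.func()
    return type(result).__name__
```

The widened annotation describes what a caller of MyClass.func can actually receive once subclasses are taken into account, which is the behaviour the corrected ground truth is meant to capture.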

Affected benchmarks

| Benchmark | Function | Before | After |
| --- | --- | --- | --- |
| classes/inheritance_overriding | MyClass.func | str | int\|str |
| mro/parents_same_superclass | A.func | str | int\|str |
| mro/self_assignment | B.func | int | int\|str |
| mro/two_parents | B.func | str | int\|str |
| mro/two_parents_method_defined | A.func | float | float\|str |
| mro/two_parents_method_defined | B.func | int | int\|str |

Note: B.func in two_parents_method_defined is widened to int|str (not float|int|str), because float comes from A.func and A is not in B's class hierarchy — they are unrelated sibling co-parents of C. The LSP widening should only include overrides from B's own subclass chain.
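The rule in this note can be sketched in a few lines. The class bodies below are hypothetical stand-ins for the benchmark (only the return types float, int, and str are taken from this PR), and widened_return_types is an illustrative helper, not part of TypeEvalPy. It collects the return type of func for a class and for every class below it, and never looks sideways at unrelated co-parents:

```python
class A:
    def func(self):
        return 1.5


class B:
    def func(self):
        return 1


class C(A, B):
    def func(self):
        return "c"


def widened_return_types(cls):
    """Names of func's return types over cls and all of its subclasses."""
    names = {type(cls().func()).__name__}
    for sub in cls.__subclasses__():
        # Only descend into subclasses; sibling parents are never visited.
        names |= widened_return_types(sub)
    return names


print("|".join(sorted(widened_return_types(A))))  # float|str
print("|".join(sorted(widened_return_types(B))))  # int|str
```

Because A is not a subclass of B, float never enters B's set, which matches the int|str annotation chosen for B.func.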

We verified with mypy --strict that the original annotations produce override errors and the corrected ones pass cleanly.

Thanks for considering this!

…meration;

- made test more interesting by substituting <value1> with more than just "int";
…able definition.

  Corresponds to change a40d4db in the templates;
The previous ground truth annotated each method with only its body's
return type, ignoring that subclass overrides must have compatible
return types per the Liskov substitution principle. When annotated as
given, mypy --strict reports override errors on every affected
benchmark. The corrected annotations widen parent method return types
to include subclass override types, making all benchmarks pass mypy.

Affected benchmarks:
- classes/inheritance_overriding: MyClass.func str -> int|str
- mro/parents_same_superclass: A.func str -> int|str
- mro/self_assignment: B.func int -> int|str
- mro/two_parents: B.func str -> int|str
- mro/two_parents_method_defined: A.func float -> float|str,
  B.func int -> float|int|str

Copilot AI left a comment


Pull request overview

Updates TypeEvalPy micro-benchmark ground-truth annotations for inheritance/MRO cases so method return types respect Liskov substitution (i.e., base method types are widened to accommodate override return types), aligning the benchmarks with what a type-safe program requires under strict type checking.

Changes:

  • Widen base-class method return types in inheritance overriding to include subclass override return types.
  • Adjust MRO/multiple-inheritance ground truth return types to avoid incompatible override relationships.
  • Expand affected main_gt.json entries to represent unions of valid polymorphic return types.
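As a rough sketch of the kind of edit involved, assume a ground-truth entry is a JSON object whose "type" field holds a list of type names. The field names and values below are illustrative assumptions, not copied from the changed files, and widen_entry is a hypothetical helper. Under that assumption, widening an entry amounts to taking a union:

```python
import json
from typing import Dict, List


def widen_entry(entry: Dict, override_types: List[str]) -> Dict:
    """Return a copy of entry whose "type" list also includes override_types."""
    widened = dict(entry)
    widened["type"] = sorted(set(entry["type"]) | set(override_types))
    return widened


# Hypothetical entry for a parent method whose own body returns str.
before = {"file": "main.py", "function": "MyClass.func", "type": ["str"]}
after = widen_entry(before, ["int"])
print(json.dumps(after))
```

Sorting the union keeps the result deterministic, so repeated runs produce identical files and clean diffs.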

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| micro-benchmark/python_features/mro/two_parents_method_defined/main_gt.json | Widens A.func and B.func return types to unions to account for overrides in the hierarchy. |
| micro-benchmark/python_features/mro/two_parents/main_gt.json | Widens B.func return type to include the type introduced via MRO in subclass C. |
| micro-benchmark/python_features/mro/self_assignment/main_gt.json | Widens B.func return type to include the subclass override type. |
| micro-benchmark/python_features/mro/parents_same_superclass/main_gt.json | Widens A.func return type to include the overriding subclass return type. |
| micro-benchmark/python_features/classes/inheritance_overriding/main_gt.json | Widens MyClass.func return type to include the subclass override type. |


B.func should not include float in its return type: float comes from
A.func, but A is not in B's class hierarchy (they are sibling co-parents
of C). The LSP-widened type for B.func is int|str (B's own int plus
C's override str), not float|int|str.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>