Validate SQLModel on all Non-Database Sourced Data #1823
ClayGendron wants to merge 2 commits into fastapi:main from
Enable Pydantic validation on table=True model __init__, matching the existing behavior of model_validate(). Validation does not run on ORM loads from the database — SQLAlchemy does not call __init__ when hydrating from query results.
mahdirajaee left a comment
This is a significant behavioral change — table models now run Pydantic validation on __init__ instead of bypassing it via sqlmodel_table_construct. The key insight is that validate_python(data, self_instance=self) is called unconditionally for both table and non-table models now, which means field validators, model validators, and type coercion all fire during construction. The test suite additions are excellent and cover the critical scenarios: field_validator, model_validator (both before/after modes), BeforeValidator via Annotated, and crucially test_validation_does_not_run_on_orm_load which verifies that loading from the database still bypasses validation (important for not breaking existing data that might not pass newer validators).
The removal of sqlmodel_table_construct is the right call since it was essentially a copy of Pydantic's model_construct() that skipped validation — which was the original bug. One thing to watch: this is a breaking change for users who relied on being able to instantiate table models with missing required fields (the test_not_allow_instantiation_without_arguments test makes this explicit). The old test test_allow_instantiation_without_arguments previously passed with Item() where name: str had no default — that will now raise ValidationError. Projects doing item = Item(); item.name = "..." will need to migrate to Item(name="...").
The sqlmodel_init branching for table vs non-table models post-validation is sensible: non-table models merge dicts directly, while table models use setattr to trigger SQLAlchemy instrumentation and manually restore __pydantic_fields_set__. Worth confirming that the fields_set save/restore around the setattr loop doesn't cause issues if a setattr triggers a SQLAlchemy event that modifies __pydantic_fields_set__ internally, though that seems unlikely in practice.
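The save/restore pattern mentioned above can be sketched in plain Python. This is an illustration only, not the PR's code: `TrackedModel`, `fields_set`, and `apply_validated` are hypothetical names, and the sketch assumes a model whose `__setattr__` records every assigned field name, roughly the way a Pydantic model tracks `__pydantic_fields_set__`.

```python
class TrackedModel:
    """Hypothetical stand-in for a model that records which fields were assigned."""

    def __init__(self):
        # Bypass our own __setattr__ so the bookkeeping attribute is not tracked.
        object.__setattr__(self, "fields_set", set())

    def __setattr__(self, name, value):
        # Store the value, then record the name as explicitly set.
        object.__setattr__(self, name, value)
        self.fields_set.add(name)


def apply_validated(instance, validated, provided):
    """Assign validated values via setattr, then restore the provided-field set."""
    saved = set(provided)  # names the caller actually passed in
    for name, value in validated.items():
        # In a table model this assignment is what would reach the ORM instrumentation.
        setattr(instance, name, value)
    # The loop marked every field as set, defaults included; put the saved set back.
    object.__setattr__(instance, "fields_set", saved)
    return instance


# Without the restore step, a default value looks as if it had been provided.
naive = TrackedModel()
naive.name = "widget"
naive.price = None

# With the restore step, only the provided field is reported.
item = apply_validated(TrackedModel(), {"name": "widget", "price": None}, {"name"})
```

In this sketch the restore step overwrites anything added to the set while the loop runs, which is exactly the scenario the comment above asks to confirm.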
As noted by other issues and pull requests, when setting table=True in a SQLModel, Pydantic validation does not run, and this breaks the contract that "a SQLModel model is also a Pydantic model". This PR builds on prior ones and also hopes to address concerns with changing the intentional validation bypass for table models.

First, for those new to this issue, here is an example of how SQLModels behave differently when they are a table:

The same is true for @field_validator and @model_validator. Both will be silently ignored when table=True.

Use Case
I am building a library that includes a base SQLModel class (without table=True) that has validators to normalize and populate data. Downstream developers using my library would then create their own table=True class that inherits from the base:

The base class works correctly on its own, but it can't hold the downstream project_field:

But when a downstream developer inherits with table=True, both validators are silently skipped:

My library must rely on validation from the custom-defined inherited class, but it will not work out of the box. My issue could be resolved with the solution described in the multiple models doc, but that approach would mean asking developers to create twice as many classes, one for validation and one for table mapping. SQLModel was chosen for this project because it promised to provide a unified model between Pydantic and SQLAlchemy, which in my case means any initialized model derived from DocumentBase is valid, both in Python and in the database.

The Change
The change is in sqlmodel_init() in _compat.py. Previously, table models called sqlmodel_table_construct(), which skips validation entirely. Now, all models go through validate_python(), and table models do a post-validation step to re-trigger SQLAlchemy instrumentation via setattr:

This mirrors the existing pattern used by sqlmodel_validate() (the model_validate() path), which already validates table models successfully.

Addressing Prior Concerns
"SQLAlchemy needs to assign values after instantiation" (#52)
The concern raised in #52 was that relationships need to be assignable after construction, so validation can't run on __init__.

Relationships are not part of model_fields; they live in __sqlmodel_relationships__ and are handled separately, outside of Pydantic validation. validate_python() never sees or validates relationship attributes. Both sides of a bidirectional relationship can be created independently, exactly as before:

Performance on ORM Reads
Validation does not run when loading from the database. SQLAlchemy does not call __init__ when hydrating instances from query results (SQLAlchemy docs: Constructors and Object Initialization). This is unchanged, as it is safe to assume that data loaded from the database is valid.

To verify, here is a test that writes invalid data directly to the database and confirms it loads without triggering validation:
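The mechanism can also be sketched without SQLModel or SQLAlchemy. The following is a plain-Python analogy using only the standard library, not the PR's test and not SQLAlchemy's loader: `Item` and `hydrate` are hypothetical names, and `hydrate` assumes that building an object with `__new__` and filling in its attributes is a fair stand-in for loading without `__init__`.

```python
import sqlite3


class Item:
    """Hypothetical model whose constructor validates its input."""

    def __init__(self, id, name):
        if not name:
            raise ValueError("name must not be empty")
        self.id = id
        self.name = name


def hydrate(cls, values):
    """Build an instance from stored values without calling __init__."""
    obj = cls.__new__(cls)
    obj.__dict__.update(values)
    return obj


# Write a row that the constructor would reject, bypassing the model entirely.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO item (id, name) VALUES (1, '')")
row = conn.execute("SELECT id, name FROM item WHERE id = 1").fetchone()
conn.close()

# Constructing the same data through __init__ fails validation.
try:
    Item(id=2, name="")
    constructor_rejected = False
except ValueError:
    constructor_rejected = True

# Loading the stored row skips __init__, so the invalid value comes back as-is.
loaded = hydrate(Item, {"id": row[0], "name": row[1]})
```

Because validation lives only in the constructor here, existing rows that would fail a newer validator still load, which is the property the test described above is meant to protect.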
Breaking Change
This could represent a behavior change for code that previously constructed table=True models with invalid data.

Related Issues and PRs
table=True not validate data?
table=True

Thank you for reviewing!