swev-id: django__django-15563#514
Conversation
Test & Lint Summary
Result: 4 tests passed; linting via compileall succeeded with no errors. |
|
Requesting review for this fix. References Issue #504. |
noa-lucent
left a comment
There was a problem hiding this comment.
Thanks for tackling the MTI update bug. I found two places where we silently skip work when the update invariants break: (1) in the compiler we when returns None, and (2) in we fall back to an empty list when a model key is missing. Both cases mask real errors and leave parent tables stale. Please make these code paths fail loudly instead of papering over them.
django/db/models/sql/compiler.py
Outdated
| for model in self.query.related_updates: | ||
| parent_link = self.query.get_meta().get_ancestor_link(model) | ||
| if parent_link is None: | ||
| continue |
There was a problem hiding this comment.
[major] returning would mean we somehow enqueued a model in that isn’t actually an ancestor of the base model. That’s an invariant violation, and silently -ing leaves the parent table stale without any signal. In this internal code we should fail loudly instead of hiding the problem. Please turn this into an assertion (or otherwise raise) so that we surface the bug rather than silently skipping the update.
django/db/models/sql/subqueries.py
Outdated
| query.values = values | ||
| if self.related_ids is not None: | ||
| query.add_filter("pk__in", self.related_ids) | ||
| ids = self.related_ids.get(model, []) |
There was a problem hiding this comment.
[major] now populates per ancestor, so by the time we get here every in should have an entry. Falling back to masks cases where we failed to collect IDs and silently turns the parent update into a no-op. Please require the key (e.g. ) or assert that it exists so we fail loudly when invariants break.
noa-lucent
left a comment
There was a problem hiding this comment.
The assertions you added address the invariants I was worried about and prevent silent skips in the parent updates. All previous concerns are resolved from my side.
noa-lucent
left a comment
There was a problem hiding this comment.
Thanks for the fix and coverage. I left one blocking note on the new SQL assertion test: as written it still passes against the buggy implementation because the condition accepts the child PKs. Please tighten the assertions so the test fails if we regress.
| continue | ||
| self.assertIn(" WHERE ", statement) | ||
| self.assertIn(" IN ", statement) | ||
| self.assertTrue( |
There was a problem hiding this comment.
[major] This regression test still passes against the broken implementation. When the UPDATE on incorrectly uses the child PKs (the original bug), the SQL still contains and , so both succeed. To prove we really constrain by the parent-link IDs, the assertions need to check each table separately: e.g. require for the secondary-parent UPDATE and for the primary-parent UPDATE (and ideally assert the wrong IDs are absent). That way the test will fail if we regress to selecting the wrong key. Can you tighten the assertions accordingly?
| continue | ||
| self.assertIn(" WHERE ", statement) | ||
| self.assertIn(" IN ", statement) | ||
| self.assertTrue( |
There was a problem hiding this comment.
[major] The regression still passes against the buggy implementation. When the SecondaryParent UPDATE mistakenly uses the child PKs (the original bug), the SQL still contains self.child_one.pk/self.child_two.pk, so both assertTrue(... or ...) succeed. To prove we really constrain by each parent-link ID, the test should branch per table — e.g. require child_*_secondary_id for the secondary-parent UPDATE and child_*.pk for the primary-parent UPDATE (and ideally assert that the wrong IDs are absent). Without that tightening, this test won’t fail on a regression.
|
Tests & Lint Summary
|
noa-lucent
left a comment
There was a problem hiding this comment.
The tightened SQL assertions now force the secondary-parent UPDATE to use the parent-link IDs and fail against the original bug. My previous concern is resolved.
noa-lucent
left a comment
There was a problem hiding this comment.
Thanks for digging into the MTI update regression. The fix currently adds a second query per ancestor and broadens the UpdateQuery API. To keep the patch focused, please revert the related_ids dict changes (and the new MTI test module) and instead extend the existing preselect so it returns the ancestor pk columns in the same round-trip. Once those columns are available we can rebuild the per-parent id lists without cloning new queries or touching the UpdateQuery surface. Happy to take another look once that’s in place.
django/db/models/sql/compiler.py
Outdated
| parent_link is not None | ||
| ), "Related update requested for non-ancestor model" | ||
| parent_query = query.clone() | ||
| parent_query.select = [] |
There was a problem hiding this comment.
[major] This loop is doing a second pass over the update candidates for every ancestor. That gives us 1+n round trips per update and forces the rest of the compiler to grow the dict plumbing. The regression we need to cover is only that the existing preselect wasn’t including the ancestor PK columns. We can fix that by having the first select the child PK plus any parent-link attnames that aren’t already on the primary-key chain, then build the per-parent id lists from that single result set. Could you revert the extra dict machinery and fold the ancestor IDs into the existing preselect instead?}
There was a problem hiding this comment.
Updated pre_sql_setup() to fold ancestor IDs into the primary preselect and raise if we ever miss a parent link. Ancestor queries now reuse that single result set without any extra round-trips.
django/db/models/sql/subqueries.py
Outdated
| """ | ||
| self.values = [] | ||
| self.related_ids = None | ||
| self.related_ids = {} |
There was a problem hiding this comment.
[major] Let’s keep UpdateQuery.related_ids as the original list of primary keys. Once pre_sql_setup() returns the ancestor IDs from its single preselect we can fan them out per ancestor without changing the query object’s public surface. The dict cloning/copying here is extra churn that we don’t need once the compiler collects all columns up front. Could you drop this type change and keep the old related_ids storage?
There was a problem hiding this comment.
Restored UpdateQuery.related_ids to lists populated by the compiler pass. We now error out if the compiler fails to supply IDs instead of materialising empty dict entries.
| if self.related_ids is not None: | ||
| query.add_filter("pk__in", self.related_ids) | ||
| try: | ||
| ids = self.related_ids[model] |
There was a problem hiding this comment.
[major] Once the compiler hands us all the ancestor PKs via a single preselect, we can populate the per-model id lists without any KeyError plumbing or extra assertions here. Please revert this catch/raise block and keep the old pk__in call that reused the preselected ids.
There was a problem hiding this comment.
Dropped the fallback plumbing—get_related_updates() now expects the compiler to hand back the ancestor PK lists and will raise immediately if any model is missing.
| @@ -0,0 +1,259 @@ | |||
| import re | |||
There was a problem hiding this comment.
[major] The regression coverage shouldn’t need its own 250-line test module with dynamically constructed apps. We can exercise this bug inside the existing suite by extending the current multi-table inheritance cases. Please drop this new file and keep the coverage focused in the existing test module.
There was a problem hiding this comment.
Moved the regression coverage into tests/model_inheritance/tests.py so it extends the existing MTI suite instead of living in a standalone module. The tightened SQL assertions now distinguish parent-link IDs from child PKs.
Summary
Test & Lint
|
noa-lucent
left a comment
There was a problem hiding this comment.
The compiler now folds ancestor parent-link columns into the single preselect and reuses those values to build per-parent pk lists, so the related updates no longer drift onto unrelated rows. The regression coverage exercises mixed parent updates and verifies the emitted SQL constrains each ancestor with the right ids. Looks good to me.
Summary
Testing
Pre-fix failure
Resolves #504