Fix cast inversion for integer-to-numeric casts#35888
Draft
Conversation
Contributor
|
Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone. PR title guidelines
Pre-merge checklist
|
74fe1e5 to
ee8b6f1
Compare
The `invert_casts_on_expr_eq_literal` mechanism moves casts from the column side to the literal side in equality predicates, enabling index lookups. It wasn't firing for integer-to-numeric casts because: 1. The 6 `CastIntNToNumeric` / `CastUintNToNumeric` EagerUnaryFunc impls didn't override `preserves_uniqueness` (defaulting to false), even though integer-to-numeric is injective. 2. Even with that fix, the inverse (`CastNumericToIntN`) doesn't preserve uniqueness (it rounds), and the old condition required both func AND inverse to preserve uniqueness. Fix (1) by adding `preserves_uniqueness() -> true` to all 6 impls. Fix (2) by adding a round-trip verification: when the inverse doesn't preserve uniqueness, check that `func(inverse(literal)) == literal`. This safely confirms the inverse produced the exact pre-image without any lossy rounding. Apply the same logic to `impossible_literal_equality_because_types` so that predicates like `uint_col = -1` or `uint_col = 3.14` are detected as impossible. https://claude.ai/code/session_01GH1nqb2vmcQTRhXQ1c8FY8
ee8b6f1 to
642637a
Compare
The cast inversion logic added in the previous commit only fired during LiteralConstraints index matching — it didn't rewrite the predicates themselves. Add the inversion to MirScalarExpr::reduce so that `Cast(col, Numeric) = Literal(Numeric(0))` is simplified to `col = Literal(0::uint64)` directly in the expression tree. Also fold impossible equalities (e.g., uint_col = -1, uint_col = 3.14) to false during reduction. https://claude.ai/code/session_01GH1nqb2vmcQTRhXQ1c8FY8
Fix an infinite loop in `MirScalarExpr::reduce()` caused by the cast inversion added in the previous commit. The issue: `invert_casts_on_expr_eq_literal` always places the literal in the `expr1` position when returning, even when no cast inversion occurs. This conflicts with the canonical ordering (which may swap operands so the "smaller" expression is first), causing the two transformations to fight indefinitely in the fixed-point loop. The fix guards the cast inversion call so it only fires when at least one side of the equality is a `CallUnary` (i.e., an actual cast is present). When neither side has a function call, there's nothing to invert and skipping avoids the operand-order oscillation. Also update 8 builtin view descriptors that now correctly infer keys after the cast inversion removes casts from `WHERE worker_id = 0` predicates in per-worker views. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert the reduce() integration of cast inversion (from a7c140f) and the associated builtin view key updates. Doing cast inversion in reduce() is too early in the optimizer pipeline: it strips casts before LiteralConstraints can match them against indexes. For example, an index on `a::integer` would no longer be used for `WHERE a = 0` because reduce() already removed the `::integer` cast. Additionally, the reduce() integration caused an infinite loop because `invert_casts_on_expr_eq_literal` flips operand order even when no inversion occurs, fighting with the canonical ordering in the fixed-point loop. The first commit (642637a) with preserves_uniqueness + round-trip verification remains — it correctly enables cast inversion for integer-to-numeric casts in LiteralConstraints and CanonicalizeMfp where index matching has already been resolved. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the optimizer so that filters like
uint_col = 0can use indexes, instead of casting both sides to Numeric.When a
UInt64column is compared to an integer literal, the planner resolves to(Numeric, Numeric)equality (since there's no implicit cast fromInt32toUInt64). This insertsCastUint64ToNumericon the column side, preventing index usage. The existinginvert_casts_on_expr_eq_literalmechanism should move the cast from the column to the literal, but it wasn't firing because:CastIntNToNumeric/CastUintNToNumericEagerUnaryFuncimpls didn't overridepreserves_uniqueness(defaulting tofalse), even though integer-to-numeric is injective.CastNumericToIntN) doesn't preserve uniqueness (it rounds), and the old condition required both func and inverse to preserve uniqueness.Changes
preserves_uniqueness() -> trueto all 6 integer-to-numeric cast impls (CastInt16ToNumeric,CastInt32ToNumeric,CastInt64ToNumeric,CastUint16ToNumeric,CastUint32ToNumeric,CastUint64ToNumeric).invert_casts_on_expr_eq_literal_inner: when the inverse doesn't preserve uniqueness, check thatfunc(inverse(literal)) == literal. This safely confirms the inverse produced the exact pre-image without lossy rounding.impossible_literal_equality_because_typesso predicates likeuint_col = -1oruint_col = 3.14are detected as impossible and constant-folded to empty.Motivation
This particularly affects introspection views like
mz_dataflow_addressesthat filter onworker_id = 0whereworker_idisuint8. Without this fix, the Numeric cast prevents index usage. With this fix, the optimizer inverts the cast and producesworker_id = 0::uint8, enabling index lookups.Test plan
cargo test -p mz-expr— unit tests passtest/sqllogictest/transform/literal_constraints.sltcovering:uint8column with integer literalINlist with integer literalsConstant <empty>)https://claude.ai/code/session_01GH1nqb2vmcQTRhXQ1c8FY8