Fix cast inversion for integer-to-numeric casts#35888

Draft
antiguru wants to merge 4 commits into main from claude/fix-int8-filter-casting-9ARC9
@antiguru antiguru commented Apr 7, 2026

Summary

Fixes the optimizer so that filters like uint_col = 0 can use indexes, instead of casting both sides to Numeric.

When a UInt64 column is compared to an integer literal, the planner resolves to (Numeric, Numeric) equality (since there's no implicit cast from Int32 to UInt64). This inserts CastUint64ToNumeric on the column side, preventing index usage. The existing invert_casts_on_expr_eq_literal mechanism should move the cast from the column to the literal, but it wasn't firing because:

  1. The 6 CastIntNToNumeric / CastUintNToNumeric EagerUnaryFunc impls didn't override preserves_uniqueness (defaulting to false), even though integer-to-numeric is injective.
  2. The inverse (CastNumericToIntN) doesn't preserve uniqueness (it rounds), and the old condition required both func and inverse to preserve uniqueness.
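
The two blockers above can be sketched with a toy model. This is only an illustration: `ToyUnaryFunc`, the three structs, `old_gate`, and `new_gate` are hypothetical names, and the exact shape of the new condition (forward cast injective, plus either an injective inverse or a successful round trip) is an assumption based on the description, not the actual Materialize code.

```rust
/// Hypothetical stand-in for the unary-function trait; only the flag matters here.
trait ToyUnaryFunc {
    /// Conservative default: assume two distinct inputs may produce the same output.
    fn preserves_uniqueness(&self) -> bool {
        false
    }
}

/// Integer-to-numeric cast before the change: no override, so the default applies.
struct IntToNumericBefore;
impl ToyUnaryFunc for IntToNumericBefore {}

/// Integer-to-numeric cast after the change: explicitly marked injective.
struct IntToNumericAfter;
impl ToyUnaryFunc for IntToNumericAfter {
    fn preserves_uniqueness(&self) -> bool {
        true
    }
}

/// Numeric-to-integer cast: it rounds, so it keeps the default `false`.
struct NumericToInt;
impl ToyUnaryFunc for NumericToInt {}

/// Old condition as described: both directions must preserve uniqueness.
fn old_gate(func: &dyn ToyUnaryFunc, inverse: &dyn ToyUnaryFunc) -> bool {
    func.preserves_uniqueness() && inverse.preserves_uniqueness()
}

/// Assumed shape of the new condition: the forward cast must be injective,
/// and a lossy inverse is accepted only if the round trip reproduced the literal.
fn new_gate(func: &dyn ToyUnaryFunc, inverse: &dyn ToyUnaryFunc, round_trip_ok: bool) -> bool {
    func.preserves_uniqueness() && (inverse.preserves_uniqueness() || round_trip_ok)
}
```

In this model `old_gate` rejects the integer-to-numeric case both before and after the override is added, because the rounding inverse always fails the second half of the condition. Only `new_gate` combined with the override lets the inversion through.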

Changes

  • Add preserves_uniqueness() -> true to all 6 integer-to-numeric cast impls (CastInt16ToNumeric, CastInt32ToNumeric, CastInt64ToNumeric, CastUint16ToNumeric, CastUint32ToNumeric, CastUint64ToNumeric).
  • Add round-trip verification to invert_casts_on_expr_eq_literal_inner: when the inverse doesn't preserve uniqueness, check that func(inverse(literal)) == literal. This safely confirms the inverse produced the exact pre-image without lossy rounding.
  • Apply the same logic to impossible_literal_equality_because_types so predicates like uint_col = -1 or uint_col = 3.14 are detected as impossible and constant-folded to empty.
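
The round-trip check can be illustrated with a small self-contained sketch. Everything here is a toy: `Numeric` is a hypothetical fixed-point stand-in with three fractional digits (not Materialize's numeric type), and `Rewrite` and `invert_cast_on_eq_literal` are illustrative names, not the functions touched by this PR.

```rust
/// Toy decimal: the value scaled by 1000 (three fractional digits).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Numeric(i128);

const SCALE: i128 = 1000;

/// Forward cast (injective): every u64 maps to a distinct Numeric.
fn cast_uint64_to_numeric(v: u64) -> Numeric {
    Numeric(v as i128 * SCALE)
}

/// Inverse cast (lossy): rounds half away from zero, fails when out of range.
fn cast_numeric_to_uint64(n: Numeric) -> Option<u64> {
    let half = SCALE / 2;
    let rounded = if n.0 >= 0 {
        (n.0 + half) / SCALE
    } else {
        (n.0 - half) / SCALE
    };
    u64::try_from(rounded).ok()
}

#[derive(Debug, PartialEq, Eq)]
enum Rewrite {
    /// `cast(col) = lit` can be rewritten to `col = value`.
    Inverted(u64),
    /// No u64 casts to the literal, so the predicate can never be true.
    Impossible,
}

/// Invert the cast only if casting the candidate forward reproduces the literal.
fn invert_cast_on_eq_literal(lit: Numeric) -> Rewrite {
    match cast_numeric_to_uint64(lit) {
        Some(candidate) if cast_uint64_to_numeric(candidate) == lit => {
            Rewrite::Inverted(candidate)
        }
        // Either the inverse failed (e.g. negative) or it rounded (e.g. 3.14 -> 3).
        _ => Rewrite::Impossible,
    }
}
```

With this model, `Numeric(0)` inverts to `0`, while `Numeric(-1000)` (that is, -1) and `Numeric(3140)` (that is, 3.14) are both classified as impossible: the first because the inverse cast fails, the second because `3` cast forward gives 3.000 rather than 3.140.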

Motivation

This particularly affects introspection views like mz_dataflow_addresses that filter on worker_id = 0, where worker_id is uint8 (Materialize's 8-byte unsigned integer, UInt64 internally). Without this fix, the Numeric cast prevents index usage. With this fix, the optimizer inverts the cast and produces worker_id = 0::uint8, enabling index lookups.

Test plan

  • cargo test -p mz-expr — unit tests pass
  • SLT tests added in test/sqllogictest/transform/literal_constraints.slt covering:
    • Index lookup on uint8 column with integer literal
    • IN list with integer literals
    • Negative literal detected as impossible (Constant <empty>)
    • Fractional literal detected as impossible via round-trip check

https://claude.ai/code/session_01GH1nqb2vmcQTRhXQ1c8FY8

github-actions bot commented Apr 7, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: compute: , storage: , adapter: , sql:

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

@antiguru antiguru force-pushed the claude/fix-int8-filter-casting-9ARC9 branch from 74fe1e5 to ee8b6f1 April 7, 2026 09:48
The `invert_casts_on_expr_eq_literal` mechanism moves casts from the
column side to the literal side in equality predicates, enabling index
lookups. It wasn't firing for integer-to-numeric casts because:

1. The 6 `CastIntNToNumeric` / `CastUintNToNumeric` EagerUnaryFunc impls
   didn't override `preserves_uniqueness` (defaulting to false), even
   though integer-to-numeric is injective.

2. Even with that fix, the inverse (`CastNumericToIntN`) doesn't preserve
   uniqueness (it rounds), and the old condition required both func AND
   inverse to preserve uniqueness.

Fix (1) by adding `preserves_uniqueness() -> true` to all 6 impls.

Fix (2) by adding a round-trip verification: when the inverse doesn't
preserve uniqueness, check that `func(inverse(literal)) == literal`.
This safely confirms the inverse produced the exact pre-image without
any lossy rounding. Apply the same logic to
`impossible_literal_equality_because_types` so that predicates like
`uint_col = -1` or `uint_col = 3.14` are detected as impossible.

https://claude.ai/code/session_01GH1nqb2vmcQTRhXQ1c8FY8
@antiguru antiguru force-pushed the claude/fix-int8-filter-casting-9ARC9 branch from ee8b6f1 to 642637a April 7, 2026 10:41
@antiguru antiguru changed the title from "Fix cast inversion for unsigned integer comparisons with literals" to "Fix cast inversion for integer-to-numeric casts" Apr 7, 2026
claude and others added 3 commits April 7, 2026 11:44
The cast inversion logic added in the previous commit only fired during
LiteralConstraints index matching — it didn't rewrite the predicates
themselves. Add the inversion to MirScalarExpr::reduce so that
`Cast(col, Numeric) = Literal(Numeric(0))` is simplified to
`col = Literal(0::uint64)` directly in the expression tree.

Also fold impossible equalities (e.g., uint_col = -1, uint_col = 3.14)
to false during reduction.

https://claude.ai/code/session_01GH1nqb2vmcQTRhXQ1c8FY8
Fix an infinite loop in `MirScalarExpr::reduce()` caused by the cast
inversion added in the previous commit. The issue:
`invert_casts_on_expr_eq_literal` always places the literal in the
`expr1` position when returning, even when no cast inversion occurs.
This conflicts with the canonical ordering (which may swap operands so
the "smaller" expression is first), causing the two transformations to
fight indefinitely in the fixed-point loop.

The fix guards the cast inversion call so it only fires when at least
one side of the equality is a `CallUnary` (i.e., an actual cast is
present). When neither side has a function call, there's nothing to
invert and skipping avoids the operand-order oscillation.
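
The oscillation and the guard can be reproduced in a toy rewrite loop. This is a sketch under stated assumptions: `Expr`, `canonicalize`, `invert_casts`, and `reduce` are hypothetical simplifications, the toy cast is treated as an identity on the literal, and the loop is assumed to re-run whenever any single rule changed the expression. None of this is the actual `MirScalarExpr::reduce` code.

```rust
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Expr {
    Column(usize),
    Literal(i64),
    Cast(Box<Expr>),
}

/// An equality predicate `lhs = rhs`.
type Equality = (Expr, Expr);

/// Canonical ordering: smaller operand first (derived order: Column < Literal < Cast).
fn canonicalize((a, b): Equality) -> Equality {
    if b < a { (b, a) } else { (a, b) }
}

/// Toy inversion: strips a cast from the non-literal side, and always returns
/// the literal in the first position, even when there was no cast to strip.
fn invert_casts((a, b): Equality) -> Equality {
    let (lit, other) = if matches!(a, Expr::Literal(_)) {
        (a, b)
    } else if matches!(b, Expr::Literal(_)) {
        (b, a)
    } else {
        return (a, b);
    };
    let other = match other {
        Expr::Cast(inner) => *inner,
        e => e,
    };
    (lit, other)
}

/// Apply both rules until neither changes the expression.
/// Returns `None` if no fixed point is reached within `max_rounds`.
fn reduce(mut eq: Equality, guarded: bool, max_rounds: usize) -> Option<Equality> {
    for _ in 0..max_rounds {
        let mut changed = false;

        let next = canonicalize(eq.clone());
        changed |= next != eq;
        eq = next;

        // The guard: only attempt inversion when a cast is actually present.
        let has_cast = matches!(eq.0, Expr::Cast(_)) || matches!(eq.1, Expr::Cast(_));
        if !guarded || has_cast {
            let next = invert_casts(eq.clone());
            changed |= next != eq;
            eq = next;
        }

        if !changed {
            return Some(eq);
        }
    }
    None
}
```

In this model, the unguarded loop on `(Column(0), Literal(0))` never settles because the two rules keep swapping the operands, while the guarded loop converges on `(Column(0), Literal(0))` whether or not the input started with a cast.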

Also update 8 builtin view descriptors that now correctly infer keys
after the cast inversion removes casts from `WHERE worker_id = 0`
predicates in per-worker views.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert the reduce() integration of cast inversion (from a7c140f) and
the associated builtin view key updates.

Doing cast inversion in reduce() is too early in the optimizer pipeline:
it strips casts before LiteralConstraints can match them against indexes.
For example, an index on `a::integer` would no longer be used for
`WHERE a = 0` because reduce() already removed the `::integer` cast.

Additionally, the reduce() integration caused an infinite loop because
`invert_casts_on_expr_eq_literal` flips operand order even when no
inversion occurs, fighting with the canonical ordering in the
fixed-point loop.

The first commit (642637a) with preserves_uniqueness + round-trip
verification remains — it correctly enables cast inversion for
integer-to-numeric casts in LiteralConstraints and CanonicalizeMfp
where index matching has already been resolved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>