Wren Engine Bug: Composite Join Condition Produces Incorrect SQL
Describe the bug
When a relationship condition uses a composite key (two or more column pairs joined with AND), the Wren Engine's query planner generates an incorrect ON clause in the expanded SQL. Both sides of the join condition are resolved to the same table alias, producing a self-referencing condition like ON model_a.col1 = model_a.col2 instead of the correct cross-table ON model_a.col1 = model_b.col1 AND model_a.col2 = model_b.col2.
Single-column join conditions work correctly. The bug is triggered specifically by the AND-separated composite condition.
To Reproduce
- Create
orders and customers tables in PostgreSQL with a composite key (tenant_id + custkey):
CREATE TABLE customers (
tenant_id VARCHAR(50) NOT NULL,
custkey INTEGER NOT NULL,
name VARCHAR(100) NOT NULL,
PRIMARY KEY (tenant_id, custkey)
);
CREATE TABLE orders (
tenant_id VARCHAR(50) NOT NULL,
orderkey INTEGER NOT NULL,
custkey INTEGER NOT NULL,
price NUMERIC(10,2),
PRIMARY KEY (tenant_id, orderkey),
FOREIGN KEY (tenant_id, custkey) REFERENCES customers(tenant_id, custkey)
);
INSERT INTO customers (tenant_id, custkey, name) VALUES
('ACME', 1, 'Alice Johnson'), ('ACME', 2, 'Bob Smith'),
('GLOBEX', 1, 'Carol Williams'), ('GLOBEX', 2, 'Dave Brown');
INSERT INTO orders (tenant_id, orderkey, custkey, price) VALUES
('ACME', 101, 1, 250.00), ('ACME', 102, 2, 175.50),
('GLOBEX', 201, 1, 320.00), ('GLOBEX', 202, 2, 89.99);
- Create this MDL with a composite join condition (
tenant_id + custkey):
{
"catalog": "wren_poc",
"schema": "public",
"models": [
{
"name": "orders",
"tableReference": { "catalog": "wren_poc", "schema": "public", "table": "orders" },
"primaryKey": "orderkey",
"columns": [
{ "name": "tenant_id", "type": "VARCHAR" },
{ "name": "orderkey", "type": "INTEGER" },
{ "name": "custkey", "type": "INTEGER" },
{ "name": "price", "type": "FLOAT" },
{ "name": "customers", "type": "customers", "relationship": "orders_customers" },
{
"name": "customer_name",
"type": "VARCHAR",
"isCalculated": true,
"expression": "customers.name"
}
]
},
{
"name": "customers",
"tableReference": { "catalog": "wren_poc", "schema": "public", "table": "customers" },
"primaryKey": "custkey",
"columns": [
{ "name": "tenant_id", "type": "VARCHAR" },
{ "name": "custkey", "type": "INTEGER" },
{ "name": "name", "type": "VARCHAR" }
]
}
],
"relationships": [
{
"name": "orders_customers",
"models": ["orders", "customers"],
"joinType": "MANY_TO_ONE",
"condition": "orders.tenant_id = customers.tenant_id AND orders.custkey = customers.custkey"
}
]
}
- Query the calculated field:
SELECT orderkey, customer_name, price FROM orders LIMIT 5
Actual behavior
The Wren Engine generates incorrect SQL where the ON clause references the same table on both sides:
-- Simplified from actual dry-plan output:
... RIGHT OUTER JOIN (...) AS orders
ON orders.tenant_id = orders.custkey -- BUG: should be customers.tenant_id = orders.tenant_id
-- AND customers.custkey = orders.custkey
The condition orders.tenant_id = orders.custkey compares two different columns on the same table (tenant_id vs custkey), which is semantically wrong and returns 0 rows.
Expected behavior
The expanded SQL should produce a correct cross-table join:
... RIGHT OUTER JOIN (...) AS orders
ON customers.tenant_id = orders.tenant_id
AND customers.custkey = orders.custkey
Workaround
Use a single-column join condition. If both tables have RLAC rules that filter by tenant_id, the tenant isolation is already enforced independently on each table — the join only needs the business key:
{
"name": "orders_customers",
"models": ["orders", "customers"],
"joinType": "MANY_TO_ONE",
"condition": "orders.custkey = customers.custkey"
}
This produces correct SQL:
ON customers.custkey = orders.custkey
Analysis
The bug appears to be in how the query planner resolves the composite condition string. With a single-column condition (orders.custkey = customers.custkey), the planner correctly maps each side to the appropriate table alias. With a composite condition containing AND, the planner appears to collapse or mismap the table references.
This was discovered in a multi-tenant healthcare analytics system where models use composite keys (client + personuuid) for tenant-isolated joins. The RLAC rule client = @client_id already enforces tenant isolation on each table independently, so the workaround (single-column join on personuuid only) is functionally correct. However, the composite join should work as documented.
Environment
- Wren Engine fork based on
ghcr.io/canner/wren-engine-ibis:latest
- PostgreSQL 15 (Docker)
- Linux (Oracle Linux 8, x86_64)
- Tested via ibis-server v3
/dry-plan and /query endpoints
Relationship to previous bug report
This is a separate issue from the "relationship handle column name must match model name" bug documented in wren-engine-bug-report.md. That bug was about expression resolution; this bug is about composite join condition expansion in the query planner.
Documentation gap
The Advanced Tutorial only shows single-column join conditions (Orders.custkey = Customer.custkey). No examples of composite/multi-column join conditions exist in the documentation. It's unclear whether composite conditions are a supported feature or an undocumented edge case.
Wren Engine Bug: Composite Join Condition Produces Incorrect SQL
Describe the bug
When a relationship
conditionuses a composite key (two or more column pairs joined with AND), the Wren Engine's query planner generates an incorrect ON clause in the expanded SQL. Both sides of the join condition are resolved to the same table alias, producing a self-referencing condition likeON model_a.col1 = model_a.col2instead of the correct cross-tableON model_a.col1 = model_b.col1 AND model_a.col2 = model_b.col2.Single-column join conditions work correctly. The bug is triggered specifically by the AND-separated composite condition.
To Reproduce
ordersandcustomerstables in PostgreSQL with a composite key (tenant_id+custkey):tenant_id+custkey):{ "catalog": "wren_poc", "schema": "public", "models": [ { "name": "orders", "tableReference": { "catalog": "wren_poc", "schema": "public", "table": "orders" }, "primaryKey": "orderkey", "columns": [ { "name": "tenant_id", "type": "VARCHAR" }, { "name": "orderkey", "type": "INTEGER" }, { "name": "custkey", "type": "INTEGER" }, { "name": "price", "type": "FLOAT" }, { "name": "customers", "type": "customers", "relationship": "orders_customers" }, { "name": "customer_name", "type": "VARCHAR", "isCalculated": true, "expression": "customers.name" } ] }, { "name": "customers", "tableReference": { "catalog": "wren_poc", "schema": "public", "table": "customers" }, "primaryKey": "custkey", "columns": [ { "name": "tenant_id", "type": "VARCHAR" }, { "name": "custkey", "type": "INTEGER" }, { "name": "name", "type": "VARCHAR" } ] } ], "relationships": [ { "name": "orders_customers", "models": ["orders", "customers"], "joinType": "MANY_TO_ONE", "condition": "orders.tenant_id = customers.tenant_id AND orders.custkey = customers.custkey" } ] }Actual behavior
The Wren Engine generates incorrect SQL where the ON clause references the same table on both sides:
The condition
orders.tenant_id = orders.custkeycompares two different columns on the same table (tenant_id vs custkey), which is semantically wrong and returns 0 rows.Expected behavior
The expanded SQL should produce a correct cross-table join:
Workaround
Use a single-column join condition. If both tables have RLAC rules that filter by
tenant_id, the tenant isolation is already enforced independently on each table — the join only needs the business key:{ "name": "orders_customers", "models": ["orders", "customers"], "joinType": "MANY_TO_ONE", "condition": "orders.custkey = customers.custkey" }This produces correct SQL:
Analysis
The bug appears to be in how the query planner resolves the composite condition string. With a single-column condition (
orders.custkey = customers.custkey), the planner correctly maps each side to the appropriate table alias. With a composite condition containing AND, the planner appears to collapse or mismap the table references.This was discovered in a multi-tenant healthcare analytics system where models use composite keys (
client + personuuid) for tenant-isolated joins. The RLAC ruleclient = @client_idalready enforces tenant isolation on each table independently, so the workaround (single-column join onpersonuuidonly) is functionally correct. However, the composite join should work as documented.Environment
ghcr.io/canner/wren-engine-ibis:latest/dry-planand/queryendpointsRelationship to previous bug report
This is a separate issue from the "relationship handle column name must match model name" bug documented in
wren-engine-bug-report.md. That bug was about expression resolution; this bug is about composite join condition expansion in the query planner.Documentation gap
The Advanced Tutorial only shows single-column join conditions (
Orders.custkey = Customer.custkey). No examples of composite/multi-column join conditions exist in the documentation. It's unclear whether composite conditions are a supported feature or an undocumented edge case.