Describe the bug
A deserialization error occurs when a query combines an IN list with another filter, as shown below:
Error: Internal("PhysicalExpr Column references column 'p_size' at index 1 (zero-based) but input schema only has 1 columns: [\"p_size\"]")
The query is:
SELECT p_size FROM part WHERE p_size IN (14, 6, 5, 31) and p_partkey > 1000
To Reproduce
A reproduction is added in PR https://github.com/apache/datafusion/pull/17224/files
A second, standalone reproduction is available at https://github.com/haohuaijin/inlist-reproduce
The code is as follows:
use std::sync::Arc;

use arrow::datatypes::{DataType, Field, Schema, SchemaRef};
use datafusion::{
    datasource::{
        file_format::parquet::ParquetFormat,
        listing::{ListingOptions, ListingTableUrl},
    },
    prelude::SessionContext,
};
use datafusion_proto::{
    physical_plan::{AsExecutionPlan, DefaultPhysicalExtensionCodec},
    protobuf::PhysicalPlanNode,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let ctx = SessionContext::new();
    let listing_options = ListingOptions::new(Arc::new(ParquetFormat::default()));
    let table_path = ListingTableUrl::parse("data.parquet")?;
    ctx.register_listing_table(
        "default",
        &table_path,
        listing_options.clone(),
        Some(get_schema()),
        None,
    )
    .await?;

    let plan = ctx
        .sql("select message from default where message in ('a', 'b', 'c', 'd') and timestamp >= 1")
        .await
        .unwrap()
        .create_physical_plan()
        .await
        .unwrap();

    let node: PhysicalPlanNode =
        PhysicalPlanNode::try_from_physical_plan(plan, &DefaultPhysicalExtensionCodec {}).unwrap();
    let plan = node
        .try_into_physical_plan(&ctx, &ctx.runtime_env(), &DefaultPhysicalExtensionCodec {})
        .unwrap();
    println!("{:?}", plan);
    Ok(())
}

fn get_schema() -> SchemaRef {
    SchemaRef::new(Schema::new(vec![
        Field::new("timestamp", DataType::Int64, false),
        Field::new("message", DataType::Utf8, true),
    ]))
}
Expected behavior
Deserialization of the physical plan succeeds.
Additional context
It works fine in DataFusion v47 and v48.
This looks related to #16665 and #16744.