-
Notifications
You must be signed in to change notification settings - Fork 993
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
https://docs.rs/parquet/6.1.0/parquet/file/reader/trait.ChunkReader.html The ChunkReader
trait is designed to hide the details of the underlying storage, which however may use the async
feature, rusto-s3 for example:
pub async fn get_object_range
https://durch.github.io/rust-s3/s3/bucket/struct.Bucket.html#method.get_object_range
That means every get_read
call has to be implemented by blocking on an async function.
impl parquet::file::reader::ChunkReader for ParquetChunkReader {
type T = SliceableCuror;
fn get_read(&self, start: u64, length: usize) -> parquet::errors::Result<Self::SliceableCuror> {
futures::executor::block_on(async move {
bucket.get_object_range(...)
})
}
}
The non-async limitation of the ChunkReader trait will result in some problems, especially when the query execution engines are using async, tokio, specifically, it will have to use tokio::spawn_blocking
to perform a parquet read.
Describe the solution you'd like
A clear and concise description of what you want to happen.
We can add async
to ChunkReader::get_read
.
#[async_trait::async_trait]
pub trait ChunkReader: Length {
type T: Read;
async fn get_read(&self, start: u64, length: usize) -> Result<Self::T>;
}
Many of the API will definitely need also to be changed to async. Until now I have no very clear idea and plan
for moving them to the async version with backward compability.
One possible solution is to provide a totally aync API, and the non-async API wraps it via async_std::block_on
.
pub trait RowGroupReaderAsync {
async fn get_column_reader(&self, i: usize) -> Result<ColumnReader> { ... }
}
pub struct RowGroupReader {
fn get_column_reader() { async_std::block_on(self.get_column_reader_async()) }
}
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.