Enhance AggregationFuzzer to verify results against Presto #6595

@mbasmanova

Description

Currently, the Aggregation Fuzzer verifies results against DuckDB. However, not all functions are available in DuckDB, and sometimes the semantics don't match. It would be better to verify against Presto Java.

We could launch a Presto Java process and talk to it via the REST API: https://prestodb.io/docs/current/develop/client-protocol.html
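For illustration, the client protocol linked above can be sketched as follows: POST the SQL text to `/v1/statement`, then keep issuing GET requests to each `nextUri` until the response no longer contains one. This is a minimal Python sketch, not the prototype's code. The coordinator address, the `fuzzer` user name, and the injectable `transport` parameter are assumptions made for the example, and retries and back-off are omitted.

```python
import json
import urllib.request
from typing import Callable, Optional

# Transport signature: (method, url, headers, body) -> parsed JSON response.
# Injectable so the polling loop can be exercised without a running server.
Transport = Callable[[str, str, dict, Optional[bytes]], dict]


def http_transport(method: str, url: str, headers: dict, body: Optional[bytes]) -> dict:
    """Send one HTTP request with urllib and parse the JSON response body."""
    request = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_presto_query(
    sql: str,
    coordinator: str = "http://localhost:8080",  # assumed address
    user: str = "fuzzer",  # assumed user name
    transport: Transport = http_transport,
):
    """Run `sql` and return (column_names, rows) by following nextUri links."""
    headers = {"X-Presto-User": user}
    result = transport("POST", coordinator + "/v1/statement", headers, sql.encode("utf-8"))
    columns, rows = None, []
    while True:
        if "error" in result:
            raise RuntimeError(result["error"].get("message", "query failed"))
        if columns is None and "columns" in result:
            columns = [column["name"] for column in result["columns"]]
        # Default JSON format: each response may carry a batch of rows.
        rows.extend(result.get("data", []))
        next_uri = result.get("nextUri")
        if next_uri is None:
            # No nextUri means the query has finished.
            return columns, rows
        result = transport("GET", next_uri, headers, None)
```

With a coordinator running locally, `run_presto_query("SELECT 1")` would return the column names and the rows as nested lists.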

I put together a prototype of a PrestoQueryRunner that can execute Presto queries:

https://github.com/prestodb/presto/compare/master...mbasmanova:presto:native-query-runner?expand=1

There are a few things to figure out still.

(1) By default, Presto returns results in JSON format. This is slow and hard to parse. Hence, I hacked Presto to return results in PrestoPage format (base64-encoded). Perhaps we could introduce a new HTTP header that a client would specify to request PrestoPage format instead of the default JSON format.

(2) The PrestoQueryRunner needs an HTTP client. I'm using Proxygen as it is already available in Prestissimo. However, we need this code in Velox, so we can run the fuzzer on each PR. Hence, we need to figure out how to add a Proxygen dependency to Velox.

(3) PrestoQueryRunner code needs to be hooked into the Aggregation Fuzzer.
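To make item (1) concrete, here is a rough Python sketch of what the client side of such a header could look like. Everything specific in it is hypothetical: Presto does not define an `X-Presto-Result-Format` header or a `presto-page` value, and the assumption that each `data` entry is one base64 string holding a serialized page is made up for the example, not taken from the prototype.

```python
import base64

# HYPOTHETICAL header name and value, invented for illustration only.
RESULT_FORMAT_HEADER = "X-Presto-Result-Format"
PRESTO_PAGE_FORMAT = "presto-page"


def request_headers(user: str, want_presto_page: bool) -> dict:
    """Build request headers, optionally asking for PrestoPage results."""
    headers = {"X-Presto-User": user}
    if want_presto_page:
        headers[RESULT_FORMAT_HEADER] = PRESTO_PAGE_FORMAT
    return headers


def decode_pages(response: dict) -> list:
    """Base64-decode each entry of `data` into raw page bytes.

    ASSUMPTION: `data` is a list of base64 strings, one per serialized page.
    Deserializing the page bytes themselves is out of scope for this sketch.
    """
    return [base64.b64decode(encoded, validate=True) for encoded in response.get("data", [])]
```

A client that omits the header would keep receiving JSON, so existing clients would be unaffected by this kind of opt-in.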

CC: @amitkdutta @aditi-pandit @kgpai @laithsakka @pedroerp @spershin

Labels: enhancement (New feature or request)
