Add async iterator on result #234
Conversation
Thanks for the PR! This is a very cool idea. To make it even better, and to fit in with the rest of the API, it should allow iterating over either row arrays or row objects, and support the raw or converted (to JS, JSON, or custom) variants. To make that maintainable, we'd likely need an async chunk iterator as a building block. If you'd like to give that a shot, go ahead, or I can try to outline the API I have in mind when I get some time.
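To make the "chunk iterator as a building block" idea concrete, here is a minimal sketch. The `Chunk` and `ChunkSource` interfaces, the `fakeSource` helper, and the "empty chunk means exhausted" contract are assumptions made for illustration, not the library's actual types or the API the maintainer has in mind.

```typescript
// Hypothetical shapes for illustration only; not the library's actual types.
type Row = unknown[];

interface Chunk {
  rowCount: number;
  getRows(): Row[];
}

interface ChunkSource {
  // Assumed contract: resolves to an empty chunk once the result is exhausted.
  fetchChunk(): Promise<Chunk>;
}

// Building block: yields one chunk at a time until the source is exhausted.
async function* iterateChunks(source: ChunkSource): AsyncGenerator<Chunk> {
  while (true) {
    const chunk = await source.fetchChunk();
    if (chunk.rowCount === 0) return;
    yield chunk;
  }
}

// Row-array iteration layered on top of the chunk iterator.
async function* iterateRows(source: ChunkSource): AsyncGenerator<Row> {
  for await (const chunk of iterateChunks(source)) {
    yield* chunk.getRows();
  }
}

// In-memory stand-in for a streamed result, used only to exercise the sketch.
function fakeSource(batches: Row[][]): ChunkSource {
  let next = 0;
  return {
    async fetchChunk(): Promise<Chunk> {
      const rows = batches[next++] ?? [];
      return { rowCount: rows.length, getRows: () => rows };
    },
  };
}
```

Other variants (row objects, converted values) could be written as further thin generators over `iterateChunks`, which is what keeps the duplication low in this sketch.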
I tried wiring up support for all the variants, but it adds a lot of code to the codebase, and I feel that's the kind of call that's yours to make. This is just a minimal version that could serve as a base. This binding is already a blessing compared to the first one, and I'd rather not mess it up.

Performance-wise, I was surprised how much per-row object creation adds up. With a template object plus Object.create for each row I got a ~10% improvement, though it's hard to benchmark. But yeah, at this level it's best to let the consumer choose whether to eat the cost or not.

I'm working on a more experimental, fully typed high-level DuckDB TypeScript runtime, and this is the UX I've landed on based on the select return value:
Yes, the reason for the variants is to provide a choice between convenience and performance. Generally the column-oriented ones are going to perform better than the row-oriented ones, and raw arrays will perform better than objects, but for small results it doesn't matter, and rows and objects can be convenient at times. Supporting all the variants without a lot of code duplication that's hard to maintain took some iteration. I think it could be done while also supporting async iterators, but it will take some experimentation, which I haven't had time for yet. (I still hope to, though probably not very soon.)

That library/runtime you're building looks interesting. How are you ensuring the results are correctly typed? I'd like to provide better typing for results, but I haven't discovered a good way yet. (See #140.)
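As a rough illustration of the convenience/performance trade-off between the variants, the sketch below derives row arrays and row objects from column-oriented data. `ColumnarResult`, `toRowArrays`, and `toRowObjects` are hypothetical names invented for this example, not part of the library; the point is only that each step away from the columnar form allocates more per row.

```typescript
// Hypothetical column-oriented result: one array of values per column.
interface ColumnarResult {
  columnNames: string[];
  columns: unknown[][];
}

// Row arrays: one array allocation per row.
function toRowArrays(result: ColumnarResult): unknown[][] {
  const rowCount = result.columns[0]?.length ?? 0;
  const rows: unknown[][] = [];
  for (let r = 0; r < rowCount; r++) {
    rows.push(result.columns.map((col) => col[r]));
  }
  return rows;
}

// Row objects: one object per row plus a keyed write per column.
function toRowObjects(result: ColumnarResult): Record<string, unknown>[] {
  return toRowArrays(result).map((row) => {
    const obj: Record<string, unknown> = {};
    result.columnNames.forEach((name, i) => {
      obj[name] = row[i];
    });
    return obj;
  });
}

const sample: ColumnarResult = {
  columnNames: ["id", "name"],
  columns: [
    [1, 2],
    ["a", "b"],
  ],
};
```

Here `toRowArrays(sample)` gives `[[1, "a"], [2, "b"]]` and `toRowObjects(sample)` gives `[{ id: 1, name: "a" }, { id: 2, name: "b" }]`.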
I follow a similar approach to convex.dev, where intermediate schemas are written to a local .buckdb/ directory. Either it inspects .columnTypes() dynamically on first execution, or, if you're in a live environment, it can describe the schema ahead of time (e.g. https://buckdb.pages.dev). It also codegens phantom types from duckdb_functions() and duckdb_types() to produce full method signatures and static type info for function calls. Then it uses TS generics to handle joins, CTEs, name aliases, etc. to infer the return value.

By the way, are you guys hiring?
Summary
This provides a high-level abstraction for result streaming that matches JavaScript language idioms alongside existing chunk-based APIs.
It lets you iterate over query results using for await loops.
Usage Example
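A minimal sketch of what iterating with for await can look like, assuming a class that implements `[Symbol.asyncIterator]()`. `MockResult` and `takeRows` are hypothetical stand-ins written for this illustration; they are not the `DuckDBResult` class or the code from this PR.

```typescript
// MockResult is a hypothetical stand-in; it is NOT the DuckDBResult class.
class MockResult implements AsyncIterable<unknown[]> {
  private position = 0;
  private readonly batches: unknown[][][];

  constructor(batches: unknown[][][]) {
    this.batches = batches;
  }

  // Simulates an asynchronous fetch of the next batch of rows.
  private async nextBatch(): Promise<unknown[][] | undefined> {
    return this.batches[this.position++];
  }

  // Makes instances usable directly in `for await` loops.
  async *[Symbol.asyncIterator](): AsyncGenerator<unknown[]> {
    let batch: unknown[][] | undefined;
    while ((batch = await this.nextBatch()) !== undefined) {
      yield* batch;
    }
  }
}

// Consumes at most `limit` rows; breaking out of the loop ends the generator,
// so no further batches are requested after that point.
async function takeRows(
  result: AsyncIterable<unknown[]>,
  limit: number,
): Promise<unknown[][]> {
  const rows: unknown[][] = [];
  for await (const row of result) {
    if (rows.length >= limit) break;
    rows.push(row);
  }
  return rows;
}
```

With a result built as `new MockResult([[[1, "a"], [2, "b"]], [[3, "c"]]])`, `takeRows(result, 2)` resolves to the first two rows, and a plain `for await (const row of result)` loop visits all three.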
Features Added
[Symbol.asyncIterator]() method to the DuckDBResult class
Technical Details
DuckDBResult API
Testing
The tests verify: