feat: Consider providing a human-readable / configuration-file format to express a subset of the Ibis syntax for transparently shareable data-preprocessing pipelines. #11451

@mmcdermott

Description

Is your feature request related to a problem?

I'm not an Ibis user, so not precisely.

What is the motivation behind your request?

I do a lot of data pre-processing work for ML / AI workflows. Many of these workflows are very similar and require only a small subset of SQL/dataframe manipulation operations (plus a few that are a bit more esoteric in typical SQL settings): basic arithmetic, resolving timestamps from strings, adding time resolution to a date or date resolution to a year, regex manipulations, string interpolation, and re-typing.

Often, these operations are written in code, in the user's custom style, against a custom db backend, etc., and then shared with people who may or may not have expertise in that style or backend (and sometimes even with people without strong coding backgrounds, such as the local data owners).

Describe the solution you'd like

I would like a tool that lets me specify these operations clearly and concisely in a simple configuration file (e.g., in YAML format) that is largely human-readable without coding expertise. That file could then be translated into backend-specific dataframe operations that transform an input set of dataframes into an output set of dataframes directly. If done well, it should be easier to write this configuration file than it would be to write the custom code required for the target transformations, and such a language could be shared with local data owners who have subject matter expertise but not Python expertise.

I think it'd be great to use Ibis as the backend for this functionality, so that this simple configuration file could map to Ibis transformations, which in turn map to backend operations. Would such a feature be of interest to the Ibis community? Are there any plans to support such a feature?

What version of ibis are you running?

I'm not

What backend(s) are you using, if any?

I'm not

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Assignees: No one assigned
Labels: feature (Features or general enhancements)
Type: No type
Projects: Status: backlog
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests