[Feature]: Custom columns in view created by vectorizer

### What problem does the new feature solve?

I went through the documentation looking for this feature, but it doesn't appear to exist yet.
I'm looking for a vectorizer parameter that lets a user define which columns from the metadata table should appear in the view.

Currently, my project has multiple PDFs exceeding 200 pages. As a result, each of those PDFs produces hundreds of rows in the embeddings table, and the view returns the same text redundantly for every one of those rows. The client side of the application therefore consumes a lot of bandwidth. I know this is not a storage problem; rather, the client has to load far more data than it needs, most of it redundant, which leads to a spike in our database read metrics.

I found two workarounds. The first is to avoid querying the unwanted columns from the view and to join with the metadata table for the actual text only at the very last stage, once the filters have reduced the row count.

The second is to manually drop the extracted_text column from the view and join the data into the final result set.

But each time my data changes and the vectorizer runs, that column is created again in the view, so I have to drop it manually every week.
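
For anyone hitting the same problem, here is a minimal sketch of the first workaround (filter on the narrow columns, fetch the full text last). It uses Python's built-in `sqlite3` only so that it runs anywhere; the table names, the column names other than `extracted_text`, and the `score` column (a stand-in for a vector distance) are all assumptions for illustration, not the vectorizer's actual schema.

```python
import sqlite3

# Hypothetical schema: "documents" stands in for the metadata table and
# "documents_embedding_store" for the chunk/embedding table. "score" is a
# stand-in for a vector distance, since SQLite has no vector type.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    title TEXT,
    extracted_text TEXT
);
CREATE TABLE documents_embedding_store (
    document_id INTEGER REFERENCES documents(id),
    chunk_seq INTEGER,
    chunk TEXT,
    score REAL
);
INSERT INTO documents VALUES
    (1, 'big.pdf', 'full text of a 200-page PDF'),
    (2, 'small.pdf', 'full text of a short PDF');
INSERT INTO documents_embedding_store VALUES
    (1, 0, 'chunk a', 0.1),
    (1, 1, 'chunk b', 0.2),
    (1, 2, 'chunk c', 0.9),
    (2, 0, 'chunk d', 0.3);
""")


def search(conn, limit):
    """Filter on the narrow columns first, then fetch each full text once."""
    hits = conn.execute(
        "SELECT document_id, chunk_seq, chunk FROM documents_embedding_store "
        "ORDER BY score LIMIT ?",
        (limit,),
    ).fetchall()
    doc_ids = sorted({row[0] for row in hits})
    if not doc_ids:
        return hits, {}
    placeholders = ",".join("?" * len(doc_ids))
    # One row per document, so the large text is never repeated per chunk.
    texts = dict(
        conn.execute(
            f"SELECT id, extracted_text FROM documents WHERE id IN ({placeholders})",
            doc_ids,
        ).fetchall()
    )
    return hits, texts


hits, texts = search(conn, limit=3)
print(len(hits), "chunk rows,", len(texts), "full texts transferred")
```

The point of the two-step shape is that the large text column is read once per document rather than once per chunk row; the same pattern should carry over to Postgres with the real table names.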

I feel this should be a feature, or maybe one of you can suggest a better workaround.


PS: I'm a Data Engineer at a startup with less than one year of experience, so any help would mean a lot to me.

### What does the feature do?

It would give the user the option to configure which columns appear in the view created by the vectorizer.

The user would pass an array of columns from the main table that should also be present in the view.

This would be passed as an optional parameter to the `ai.create_vectorizer` function.
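
To make the idea concrete, here is a rough sketch of how such an optional column list could shape the view definition. This is not the vectorizer's actual code: `build_view_sql`, its arguments, and the `include` parameter are hypothetical names, and the four store columns in the select list are an assumption about what the view exposes.

```python
from typing import Optional, Sequence


def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_view_sql(
    view: str,
    source: str,
    store: str,
    pk_columns: Sequence[str],
    source_columns: Sequence[str],
    include: Optional[Sequence[str]] = None,
) -> str:
    """Build a CREATE VIEW statement (hypothetical helper).

    include=None keeps every source column, which is the behaviour described
    in this issue; otherwise only the listed source columns are selected.
    """
    if include is None:
        selected = list(source_columns)
    else:
        unknown = [c for c in include if c not in source_columns]
        if unknown:
            raise ValueError(f"columns not in source table: {unknown}")
        selected = list(include)

    # Assumed chunk/embedding columns contributed by the store table.
    select_list = ["s.embedding_uuid", "s.chunk_seq", "s.chunk", "s.embedding"]
    select_list += [f"t.{quote_ident(c)}" for c in selected]
    join_cond = " AND ".join(
        f"t.{quote_ident(c)} = s.{quote_ident(c)}" for c in pk_columns
    )
    return (
        f"CREATE VIEW {quote_ident(view)} AS\n"
        f"SELECT {', '.join(select_list)}\n"
        f"FROM {quote_ident(store)} AS s\n"
        f"JOIN {quote_ident(source)} AS t ON {join_cond}"
    )


# Example: keep id and title in the view, leave extracted_text out.
print(
    build_view_sql(
        view="documents_embedding",
        source="documents",
        store="documents_embedding_store",
        pk_columns=["id"],
        source_columns=["id", "title", "extracted_text"],
        include=["id", "title"],
    )
)
```

Defaulting `include` to `None` would keep existing behaviour unchanged for users who don't pass the parameter.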

### Implementation challenges

_No response_

### Are you going to work on this feature?

🦸 Yes, I will submit a PR soon!

[Feature]: Custom columns in view created by vectorizer #874
