This repository was archived by the owner on Mar 16, 2022. It is now read-only.

Model Compositionality #6

Closed
@parmeet

Description


One of the dominant scenarios for text is to take a pre-trained encoder (RoBERTa, BERT, XLM-R, etc.) and attach a task-specific head on top of it (classification head, language modeling head, POS tagging head, Q&A head, etc.). I believe this is also true for vision (and for audio as well, @mthrok?). To the best of my knowledge (please correct me if I am mistaken), vision currently provides a factory function for every possible combination thereof. This approach is somewhat limiting in terms of scalability and the boilerplate overhead that comes with it. Versioning could also become a bit redundant if we replicate the same weights class for the encoder part across each combination.

I wonder what folks think about extending this framework to support model composition?
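
To make the idea concrete, here is a minimal, framework-agnostic sketch in plain Python. Every name in it (`EncoderWeights`, `ToyEncoder`, `ClassificationHead`, `TaggingHead`, `ComposedModel`) is hypothetical and is not an existing API in any of the domain libraries; in practice the encoder and heads would be `torch.nn.Module`s and the weights entry would carry a checkpoint URL. It only illustrates the shape of the proposal: the encoder weights are versioned once, and any head can be attached without a dedicated factory function per combination.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = List[float]


@dataclass(frozen=True)
class EncoderWeights:
    """Hypothetical versioned weights entry, defined once per encoder."""
    name: str
    version: str
    embed_dim: int


class ToyEncoder:
    """Stand-in for a pre-trained encoder; values are placeholders."""

    def __init__(self, weights: EncoderWeights) -> None:
        self.weights = weights

    def __call__(self, tokens: Sequence[int]) -> List[Vector]:
        # One vector of size embed_dim per input token.
        dim = self.weights.embed_dim
        return [[float(t)] * dim for t in tokens]


class ClassificationHead:
    """Pools all token vectors and emits one score per class."""

    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes

    def __call__(self, features: List[Vector]) -> Vector:
        pooled = sum(sum(v) for v in features) / max(len(features), 1)
        return [pooled] * self.num_classes


class TaggingHead:
    """Emits one list of tag scores per token (e.g. POS tagging)."""

    def __init__(self, num_tags: int) -> None:
        self.num_tags = num_tags

    def __call__(self, features: List[Vector]) -> List[Vector]:
        return [[sum(v)] * self.num_tags for v in features]


class ComposedModel:
    """Generic composition: any encoder followed by any head."""

    def __init__(self, encoder, head) -> None:
        self.encoder = encoder
        self.head = head

    def __call__(self, tokens: Sequence[int]):
        return self.head(self.encoder(tokens))


# A single weights entry is reused by both task models.
weights = EncoderWeights(name="toy-encoder", version="v1", embed_dim=4)
encoder = ToyEncoder(weights)
classifier = ComposedModel(encoder, ClassificationHead(num_classes=3))
tagger = ComposedModel(encoder, TaggingHead(num_tags=5))
```

With M encoders and N heads, this layout needs M + N building blocks rather than M × N factory functions, and the encoder weights are declared in one place only.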

As a reference, HF also explicitly provides classes for every combination. Here is one example for the RoBERTa encoder + Q&A task.
