Post-processing transforms and transforms configuration #5
Description
I share the concern @mthrok raised in #1.
It is not uncommon to have post-processing, or more precisely decoding, schemes in text. This is typically the case in text generation tasks (translation, summarization, etc.) where we need to convert predicted token ids back into the corresponding tokens. I wonder what the recommended way of doing this is.
Could there be value in encapsulating this inside a transform class whose __call__ method implements pre-processing, and which also exposes a dedicated method for decoding/post-processing?
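To make the idea concrete, here is a minimal sketch of what such a class could look like. All names (`TextTransform`, `decode`, the toy vocab) are hypothetical and only illustrate the shape of the API, not any existing implementation:

```python
from typing import Dict, List


class TextTransform:
    """Hypothetical transform pairing encoding (__call__) with decoding."""

    def __init__(self, vocab: Dict[str, int]):
        self.vocab = vocab  # token -> id
        self.itos = {i: t for t, i in vocab.items()}  # id -> token

    def __call__(self, tokens: List[str]) -> List[int]:
        # pre-processing: convert tokens to ids before feeding the model
        return [self.vocab[t] for t in tokens]

    def decode(self, ids: List[int]) -> List[str]:
        # post-processing: convert predicted ids back into tokens
        return [self.itos[i] for i in ids]


# usage
transform = TextTransform({"hello": 0, "world": 1})
ids = transform(["hello", "world"])   # -> [0, 1]
tokens = transform.decode([1, 0])     # -> ["world", "hello"]
```

The benefit is that the encoding and decoding schemes always travel together, so they cannot drift out of sync.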
Also, in the case of translation, the transforms may not be fixed w.r.t. the model but may require some configuration input from the user. For example, we now have universal models that can translate back and forth among 100 different languages, but when it comes to transforms, the user would need to explicitly specify which language pair they want to work with, so that the corresponding encoding/decoding schemes can be instantiated. My understanding so far is that these transforms are static w.r.t. the corresponding model. If so, how can the proposed API be extended to accommodate user-configurable transforms?
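One possible shape for this is a factory that accepts the user's configuration (here, the language pair) and returns a transform bound to that choice, while the model itself stays fixed. Everything below is an illustrative sketch; `build_transform`, `TranslationTransform`, the language-tag convention, and the tiny language set are all assumptions, not an existing API:

```python
# Stand-in for the ~100 languages a universal model might support.
SUPPORTED_LANGS = {"en", "de", "fr"}


class TranslationTransform:
    """Hypothetical transform configured for one language pair."""

    def __init__(self, src_lang: str, tgt_lang: str):
        if src_lang not in SUPPORTED_LANGS or tgt_lang not in SUPPORTED_LANGS:
            raise ValueError(f"unsupported pair: {src_lang}-{tgt_lang}")
        self.src_lang = src_lang
        self.tgt_lang = tgt_lang

    def __call__(self, text: str) -> str:
        # Encoding step: prepend a target-language tag, a convention
        # some multilingual models use to select the output language.
        return f"<{self.tgt_lang}> {text}"


def build_transform(src_lang: str = "en", tgt_lang: str = "de") -> TranslationTransform:
    # The factory is the single point where user configuration enters;
    # a real version would also load the pair-specific vocab/tokenizer here.
    return TranslationTransform(src_lang, tgt_lang)


# usage
transform = build_transform("en", "fr")
tagged = transform("hello")  # -> "<fr> hello"
```

The factory keeps the model static while letting the transform vary per user configuration, which seems compatible with the proposed API.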