Optimizing batch processing for transformer-based Word Embeddings on GPU #9267

maziyarpanahi · 2022-06-13T12:32:23Z

Transformer-based word embedding models benefit massively from running on GPU devices, both locally and in a cluster. The worst case is a dataset with only 1 sentence per row: in that scenario, a local GPU device performs far worse than it does in cluster mode or with multiple sentences per row.
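To make the one-sentence-per-row problem concrete, here is a minimal sketch in plain Python. It is not the code from this PR: `embed_row_by_row`, `embed_across_rows`, `CountingEncoder`, and the `batch_size` argument are hypothetical names used only for illustration, and the encoder is a stand-in for a real model call. It only shows why grouping several rows into a single model invocation reduces the number of calls while returning the same per-row results.

```python
from typing import Callable, Iterator, List

Sentence = str
Vector = List[float]
Row = List[Sentence]
Encoder = Callable[[List[Sentence]], List[Vector]]


def chunks(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive groups of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def embed_row_by_row(rows: List[Row], encode: Encoder) -> List[List[Vector]]:
    """Baseline: one encoder call per row, however few sentences it holds."""
    return [encode(list(row)) for row in rows]


def embed_across_rows(
    rows: List[Row], encode: Encoder, batch_size: int
) -> List[List[Vector]]:
    """Group several rows into one encoder call, then split the output per row."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: List[List[Vector]] = []
    for group in chunks(rows, batch_size):
        # Flatten the sentences of every row in this group into one batch.
        flat = [sentence for row in group for sentence in row]
        vectors = encode(flat)
        # Hand each row back exactly the vectors of its own sentences.
        offset = 0
        for row in group:
            results.append(vectors[offset:offset + len(row)])
            offset += len(row)
    return results


class CountingEncoder:
    """Stand-in for a model: counts calls and returns a fake 1-d vector."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, sentences: List[Sentence]) -> List[Vector]:
        self.calls += 1
        return [[float(len(sentence))] for sentence in sentences]


if __name__ == "__main__":
    rows = [["a"], ["bb"], ["ccc"], ["dddd"], ["eeeee"]]  # 1 sentence per row
    per_row, grouped = CountingEncoder(), CountingEncoder()
    same = embed_row_by_row(rows, per_row) == embed_across_rows(rows, grouped, 4)
    print("identical results:", same)
    print("encoder calls:", per_row.calls, "vs", grouped.calls)
```

With a real model each encoder call carries a fixed overhead on the GPU, so fewer, larger calls are usually what helps in this scenario. How the annotators actually group rows is defined by the PR's code, not by this sketch.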

This PR follows the work done for the BERT annotators (word and sentence embeddings), which improves local GPU utilization for the following model architectures: #6462

Done previously:

BertEmbeddings
BertSentenceEmbeddings
CamemBertEmbeddings

Description


- this improves performance on GPU when there is only a single sentence present in each row

maziyarpanahi added 2 commits June 9, 2022 07:54

Fix rare serialization issue with SP model in XLM-RoBERTa

f3aa7c8

Improve transformers on GPU device

e8a25da

- this improves performance on GPU when there is only a single sentence present in each row

maziyarpanahi added the enhancement and DON'T MERGE labels Jun 13, 2022

maziyarpanahi self-assigned this Jun 13, 2022

maziyarpanahi changed the base branch from master to release/400-release-candidate June 13, 2022 12:32

maziyarpanahi merged commit ceb44ff into release/400-release-candidate Jun 13, 2022

maziyarpanahi mentioned this pull request Jun 13, 2022

Release/400 release candidate #8320

Merged

8 tasks

KshitizGIT deleted the feature/batch-opt-gpu-transformers branch March 2, 2023 10:25
