🚀 Feature
Change the default value of the batch_first argument in Torchtext from False to True.
Motivation
I'm new to Torch, coming predominantly from Keras and TF. From a newcomer's perspective, many layers and loss functions in Torch proper expect the batch dimension to come first, whereas in Torchtext the default is batch_first=False, and that creates a disconnect between the main library and this extension. Making True the default would align Torchtext more closely with Torch and reduce friction when writing models that use Torchtext as the data loader.
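To make the disconnect concrete, here is a minimal sketch, assuming the legacy torchtext.data API (Field / BucketIterator) that was current at the time; the fake_batch tensor is an illustrative stand-in for what an iterator yields under the current default:

```python
import torch
import torch.nn as nn
from torchtext.data import Field  # legacy API, assumed for illustration

TEXT = Field()  # batch_first defaults to False,
                # so batches arrive shaped (seq_len, batch_size)

# Much of torch proper assumes the batch dimension first, e.g.
# nn.Conv1d wants (batch, channels, length) and nn.CrossEntropyLoss
# wants (batch, num_classes, ...), so a sequence-first batch needs
# a transpose before it can flow through the model:
fake_batch = torch.randint(0, 100, (25, 32))  # (seq_len=25, batch=32)
embed = nn.Embedding(100, 16)
conv = nn.Conv1d(16, 8, kernel_size=3)

x = embed(fake_batch.t())        # transpose to (batch, seq) first
x = conv(x.transpose(1, 2))      # then (batch, emb_dim, seq) for Conv1d
```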
Pitch
Switch the default value of the batch_first argument from False to True.
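In practice this means every model author currently has to opt in explicitly, while under the proposal the bare constructor would do the same thing (again assuming the legacy Field API):

```python
from torchtext.data import Field  # legacy API, assumed for illustration

TEXT = Field(batch_first=True)  # what you have to write today
TEXT = Field()                  # what the proposal would make equivalent
```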
Alternatives
Additional context
This may already be addressed in the major overhaul you're planning/working on, but I'm not sure what's changing in terms of the datasets and iterators, so this may or may not be a relevant suggestion.