@@ -3,19 +3,64 @@ Examples
3
3
4
4
.. currentmodule:: examples
5
5
6
- Vision
6
+ In this section, you will find the data loading implementations (using DataPipes) of various
7
+ popular datasets across different research domains.
8
+
9
+ Audio
7
10
-----------
8
11
12
+ LibriSpeech
13
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^
14
+
15
+ The `LibriSpeech dataset <https://www.openslr.org/12/>`_ is a corpus of approximately 1000 hours of 16 kHz read
16
+ English speech. Here is the
17
+ `DataPipe implementation of LibriSpeech <https://github.com/pytorch/data/blob/main/examples/audio/librispeech.py>`_
18
+ to load the data.
19
+
9
20
Text
10
21
-----------
11
22
12
- Audio
23
+ IMDB
24
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^
25
+ This is a `large movie review dataset <http://ai.stanford.edu/~amaas/data/sentiment/>`_ for binary sentiment
26
+ classification containing 25,000 highly polar movie reviews for training and 25,000 for testing. Here is the
27
+ `DataPipe implementation to load the data <https://github.com/pytorch/data/blob/main/examples/text/imdb.py>`_.
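The general shape of such a loader can be sketched in plain Python, without the DataPipe classes themselves. This is only an illustration under stated assumptions: it assumes the archive's ``aclImdb/<split>/<label>/<file>.txt`` layout, and ``SAMPLE_FILES``, ``select_split``, ``attach_label``, and ``imdb_like_pipeline`` are hypothetical names that do not come from the linked implementation.

```python
from pathlib import PurePosixPath
from typing import Iterable, Iterator, Tuple

# Hypothetical in-memory stand-in for (path, contents) pairs read from the archive.
SAMPLE_FILES = [
    ("aclImdb/train/pos/0_9.txt", "A wonderful film."),
    ("aclImdb/train/neg/1_2.txt", "A dull, lifeless film."),
    ("aclImdb/train/unsup/2_0.txt", "An unlabeled review."),
    ("aclImdb/test/pos/3_8.txt", "Great fun."),
]


def select_split(
    files: Iterable[Tuple[str, str]], split: str
) -> Iterator[Tuple[str, str]]:
    """Keep only labeled files ('pos' or 'neg') that belong to the requested split."""
    for path, text in files:
        parts = PurePosixPath(path).parts
        if len(parts) >= 4 and parts[-3] == split and parts[-2] in ("pos", "neg"):
            yield path, text


def attach_label(files: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str]]:
    """Replace each path with the label taken from its parent directory name."""
    for path, text in files:
        yield PurePosixPath(path).parts[-2], text


def imdb_like_pipeline(
    files: Iterable[Tuple[str, str]], split: str
) -> Iterator[Tuple[str, str]]:
    """Chain the two lazy stages: filter by split, then map paths to labels."""
    return attach_label(select_split(files, split))


train = list(imdb_like_pipeline(SAMPLE_FILES, "train"))
```

Each stage is a lazy iterator that wraps the previous one. This is the same filter-then-map composition idea that DataPipes express through chained operations.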
28
+
29
+
30
+ SQuAD
31
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^
32
+ `SQuAD (Stanford Question Answering Dataset) <https://rajpurkar.github.io/SQuAD-explorer/>`_ is a dataset for
33
+ reading comprehension. It consists of a list of questions posed by crowdworkers on a set of Wikipedia articles. Here are the
34
+ DataPipe implementations for `version 1.1 <https://github.com/pytorch/data/blob/main/examples/text/squad1.py>`_
35
+ and `version 2.0 <https://github.com/pytorch/data/blob/main/examples/text/squad2.py>`_.
36
+
37
+ Additional Datasets in TorchText
38
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
39
+ In a separate PyTorch domain library, `TorchText <https://github.com/pytorch/text>`_, you will find some of the most
40
+ popular datasets in the NLP field implemented as loadable datasets using DataPipes. You can find
41
+ all of those `NLP datasets here <https://github.com/pytorch/text/tree/main/torchtext/datasets>`_.
42
+
43
+
44
+ Vision
13
45
-----------
14
46
15
- Module contents
16
- ---------------
47
+ Caltech 101
48
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^
49
+ The `Caltech 101 dataset <http://www.vision.caltech.edu/Image_Datasets/Caltech101/>`_ contains pictures of objects
50
+ belonging to 101 categories. Here is the
51
+ `DataPipe implementation of Caltech 101 <https://github.com/pytorch/data/blob/main/examples/vision/caltech101.py>`_.
52
+
53
+ Caltech 256
54
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^
55
+ The `Caltech 256 dataset <http://www.vision.caltech.edu/Image_Datasets/Caltech256/>`_ contains 30,607 images
56
+ from 256 categories. Here is the
57
+ `DataPipe implementation of Caltech 256 <https://github.com/pytorch/data/blob/main/examples/vision/caltech256.py>`_.
58
+
59
+ Additional Datasets in TorchVision
60
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
61
+ In a separate PyTorch domain library, `TorchVision <https://github.com/pytorch/vision>`_, you will find some of the most
62
+ popular datasets in the computer vision field implemented as loadable datasets using DataPipes. You can find all of
63
+ those `vision datasets here <https://github.com/pytorch/vision/tree/main/torchvision/prototype/datasets/_builtin>`_.
17
64
18
- .. automodule:: examples
19
- :members:
20
- :undoc-members:
21
- :show-inheritance:
65
+ Note that these implementations are currently in the prototype phase, but they should be fully supported
66
+ in the coming months. Nonetheless, they demonstrate the different ways DataPipes can be used for data loading.