Commit d15c8f9

NivekT authored and facebook-github-bot committed
Categorizing IterDataPipes (#219)
Summary: Pull Request resolved: #219

Categorizing IterDataPipes into more specific categories. Still need to add category descriptions.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D34221378

Pulled By: NivekT

fbshipit-source-id: 98d833bee75c1d99f8b1c574b3266d5ef29cf75a
1 parent bd51d9c commit d15c8f9

1 file changed

+132
-46
lines changed

docs/source/torchdata.datapipes.iter.rst

Lines changed: 132 additions & 46 deletions
@@ -15,18 +15,109 @@ This is an updated version of ``IterableDataset`` in ``torch``.
1515
.. autoclass:: IterDataPipe
1616

1717

18-
We have three types of Iterable DataPipes:
18+
We have different types of Iterable DataPipes:
1919

20-
1. Load - help you interact with the file systems or online databases (e.g. FileOpener, GDriveReader)
20+
1. Archive - open and decompress archive files of different formats.
2121

22-
2. Transform - transform elements within DataPipes (e.g. batching, shuffling)
22+
2. Augmenting - augment your samples (e.g. adding index, or cycle through indefinitely).
2323

24-
3. Utility - utility functions (e.g. caching, CSV parsing, filtering)
24+
3. Combinatorial - perform combinatorial operations (e.g. sampling, shuffling).
2525

26-
Load DataPipes
26+
4. Combining/Splitting - interact with multiple DataPipes by combining them or splitting one to many.
27+
28+
5. Grouping - group samples within a DataPipe.
29+
30+
6. IO - interact with file systems or remote servers (e.g. downloading, opening,
31+
saving files, and listing the files in directories).
32+
33+
7. Mapping - apply a given function to each element in the DataPipe.
34+
35+
8. Others - perform a miscellaneous set of operations.
36+
37+
9. Selecting - select specific samples within a DataPipe.
38+
39+
10. Text - parse, read, and transform text files and data.
40+
41+
Archive DataPipes
42+
-------------------------
43+
44+
These DataPipes help you open and decompress archive files of different formats.
45+
46+
.. autosummary::
47+
:nosignatures:
48+
:toctree: generated/
49+
:template: datapipe.rst
50+
51+
Extractor
52+
RarArchiveLoader
53+
TarArchiveReader
54+
XzFileReader
55+
ZipArchiveReader
56+
57+
Augmenting DataPipes
58+
-----------------------------
59+
These DataPipes help to augment your samples.
60+
61+
.. autosummary::
62+
:nosignatures:
63+
:toctree: generated/
64+
:template: datapipe.rst
65+
66+
Cycler
67+
Enumerator
68+
IndexAdder
69+
70+
Combinatorial DataPipes
71+
-----------------------------
72+
These DataPipes help to perform combinatorial operations.
73+
74+
.. autosummary::
75+
:nosignatures:
76+
:toctree: generated/
77+
:template: datapipe.rst
78+
79+
Sampler
80+
Shuffler
81+
82+
Combining/Splitting DataPipes
83+
-----------------------------
84+
These tend to involve multiple DataPipes, combining them or splitting one to many.
85+
86+
.. autosummary::
87+
:nosignatures:
88+
:toctree: generated/
89+
:template: datapipe.rst
90+
91+
Concater
92+
Demultiplexer
93+
Forker
94+
IterKeyZipper
95+
MapKeyZipper
96+
Multiplexer
97+
SampleMultiplexer
98+
UnZipper
99+
Zipper
100+
101+
Grouping DataPipes
102+
-----------------------------
103+
These DataPipes help you group samples within a DataPipe.
104+
105+
.. autosummary::
106+
:nosignatures:
107+
:toctree: generated/
108+
:template: datapipe.rst
109+
110+
Batcher
111+
BucketBatcher
112+
Collator
113+
Grouper
114+
UnBatcher
115+
116+
IO DataPipes
27117
-------------------------
28118

29-
These DataPipes help you interact with the file systems or online databases (e.g. FileOpener, GDriveReader).
119+
These DataPipes help you interact with file systems or remote servers (e.g. downloading, opening,
120+
saving files, and listing the files in directories).
30121

31122
.. autosummary::
32123
:nosignatures:
@@ -42,73 +133,68 @@ These DataPipes help you interact with the file systems or online databases (e.g
42133
HttpReader
43134
IoPathFileLister
44135
IoPathFileOpener
136+
IoPathSaver
45137
OnlineReader
46138
ParquetDataFrameLoader
139+
Saver
47140

48-
49-
Transform DataPipes
141+
Mapping DataPipes
50142
-------------------------
51143

52-
These DataPipes transform elements within DataPipes (e.g. batching, shuffling).
144+
These DataPipes apply a given function to each element in the DataPipe.
53145

54146
.. autosummary::
55147
:nosignatures:
56148
:toctree: generated/
57149
:template: datapipe.rst
58150

59-
Batcher
60-
BucketBatcher
61-
Shuffler
151+
FlatMapper
152+
Mapper
62153

63-
Utility DataPipes
154+
Other DataPipes
64155
-------------------------
65-
66-
These DataPipes provide utility functions (e.g. caching, CSV parsing, filtering).
156+
A miscellaneous set of DataPipes with different functionalities.
67157

68158
.. autosummary::
69159
:nosignatures:
70160
:toctree: generated/
71161
:template: datapipe.rst
72162

73-
CSVDictParser
74-
CSVParser
75-
Collator
76-
Concater
77-
Cycler
78163
DataFrameMaker
79-
Demultiplexer
80164
EndOnDiskCacheHolder
81-
Enumerator
82-
Extractor
83-
Filter
84-
FlatMapper
85-
Forker
86-
Grouper
87165
HashChecker
88-
Header
89166
InMemoryCacheHolder
90-
IndexAdder
91-
IoPathSaver
92-
IterKeyZipper
93167
IterableWrapper
168+
OnDiskCacheHolder
169+
ShardingFilter
170+
171+
Selecting DataPipes
172+
-------------------------
173+
174+
These DataPipes help you select specific samples within a DataPipe.
175+
176+
.. autosummary::
177+
:nosignatures:
178+
:toctree: generated/
179+
:template: datapipe.rst
180+
181+
Filter
182+
Header
183+
184+
Text DataPipes
185+
-----------------------------
186+
These DataPipes help you parse, read, and transform text files and data.
187+
188+
.. autosummary::
189+
:nosignatures:
190+
:toctree: generated/
191+
:template: datapipe.rst
192+
193+
CSVDictParser
194+
CSVParser
94195
JsonParser
95196
LineReader
96-
MapKeyZipper
97-
Mapper
98-
Multiplexer
99-
OnDiskCacheHolder
100197
ParagraphAggregator
101-
RarArchiveLoader
102198
RoutedDecoder
103199
Rows2Columnar
104-
SampleMultiplexer
105-
Sampler
106-
Saver
107-
ShardingFilter
108200
StreamReader
109-
TarArchiveReader
110-
UnBatcher
111-
UnZipper
112-
XzFileReader
113-
ZipArchiveReader
114-
Zipper
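
To make the category names in this change more concrete, here is a rough plain-Python sketch of what the Mapping, Selecting, and Grouping categories mean for a stream of samples. It does not use the torchdata API: the helpers ``map_samples``, ``select_samples``, and ``batch_samples`` are hypothetical stand-ins written with generators, loosely mirroring the roles that ``Mapper``, ``Filter``, and ``Batcher`` are listed under above. Their real signatures and behavior may differ.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_samples(source: Iterable[T], fn: Callable[[T], U]) -> Iterator[U]:
    """Mapping: apply a given function to each element (hypothetical helper)."""
    for sample in source:
        yield fn(sample)


def select_samples(source: Iterable[T], predicate: Callable[[T], bool]) -> Iterator[T]:
    """Selecting: keep only the samples for which the predicate holds (hypothetical helper)."""
    for sample in source:
        if predicate(sample):
            yield sample


def batch_samples(source: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Grouping: collect consecutive samples into lists of up to batch_size (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(source)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Chain the three stages lazily: square each number, keep even squares, batch in pairs.
squares = map_samples(range(10), lambda x: x * x)
even_squares = select_samples(squares, lambda x: x % 2 == 0)
result = list(batch_samples(even_squares, 2))
print(result)  # [[0, 4], [16, 36], [64]]
```

Each stage consumes an iterable and yields lazily, so stages compose without materializing intermediate lists. That is the general idea behind chaining iterable-style pipes, shown here only as an illustration.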
