@@ -138,302 +138,6 @@ of releases [here](https://github.com/tensorflow/io/releases).
138
138
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
139
139
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
140
140
141
- ## Development
142
-
143
- ### IDE Setup
144
-
145
- For instructions on how to configure Visual Studio Code for developing TensorFlow I/O, please refer to
146
- https://github.com/tensorflow/io/blob/master/docs/vscode.md
147
-
148
- ### Lint
149
-
150
- TensorFlow I/O's code conforms to Bazel Buildifier, Clang Format, Black, and Pyupgrade.
151
- Please use the following command to check the source code and identify lint issues:
152
- ```
153
- $ bazel run //tools/lint:check
154
- ```
155
-
156
- For Bazel Buildifier and Clang Format, the following command will automatically identify
157
- and fix any lint errors:
158
- ```
159
- $ bazel run //tools/lint:lint
160
- ```
161
-
162
- Alternatively, if you only want to run the lint check with individual linters,
163
- then you can selectively pass `black`, `pyupgrade`, `bazel`, or `clang` to the above commands.
164
-
165
- For example, a `black`-specific lint check can be done using:
166
- ```
167
- $ bazel run //tools/lint:check -- black
168
- ```
169
-
170
- Lint fix using Bazel Buildifier and Clang Format can be done using:
171
- ```
172
- $ bazel run //tools/lint:lint -- bazel clang
173
- ```
174
-
175
- Lint check using `black` and `pyupgrade` for an individual Python file can be done using:
176
- ```
177
- $ bazel run //tools/lint:check -- black pyupgrade -- tensorflow_io/core/python/ops/version_ops.py
178
- ```
179
-
180
- Lint fix for an individual Python file with `black` and `pyupgrade` can be done using:
181
- ```
182
- $ bazel run //tools/lint:lint -- black pyupgrade -- tensorflow_io/core/python/ops/version_ops.py
183
- ```
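The argument pattern used by the lint commands above (linter names after a first `--`, file paths after a second `--`) can be sketched as a small Python helper. This is only an illustration under stated assumptions: `build_lint_command` is a hypothetical name that does not exist in the repository, it covers only the invocation forms shown above, and it assembles the argument list without running Bazel.

```python
from typing import List, Sequence

# Linter names listed in this README. The helper below is hypothetical
# and is not part of the tensorflow-io repository.
KNOWN_LINTERS = ("black", "pyupgrade", "bazel", "clang")


def build_lint_command(
    mode: str = "check",
    linters: Sequence[str] = (),
    files: Sequence[str] = (),
) -> List[str]:
    """Assemble the argv for `bazel run //tools/lint:<mode>`."""
    if mode not in ("check", "lint"):
        raise ValueError(f"unknown mode: {mode!r}")
    unknown = [name for name in linters if name not in KNOWN_LINTERS]
    if unknown:
        raise ValueError(f"unknown linters: {unknown}")
    if files and not linters:
        # The README only shows file paths together with explicit linters.
        raise ValueError("pass at least one linter when listing files")

    command = ["bazel", "run", f"//tools/lint:{mode}"]
    if linters:
        command += ["--", *linters]
    if files:
        command += ["--", *files]
    return command


if __name__ == "__main__":
    # Prints the command line only; nothing is executed.
    print(" ".join(build_lint_command("check", ["black"])))
```

The returned list could be handed to `subprocess.run` from the repository root; keeping it as a list avoids shell quoting issues with file paths.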
184
-
185
- ### Notebooks/Tutorials
186
- If you are updating or creating a notebook, please refer to the tutorials and instructions mentioned [here](https://github.com/tensorflow/io/tree/master/docs/tutorials).
187
-
188
- ### Python
189
-
190
- #### macOS
191
-
192
- ## Performance Benchmarking
193
- ```sh
194
- # Show macOS's default python3
195
- python3 --version
196
-
197
- # Install Bazel version specified in .bazelversion
198
- curl -OL https://github.com/bazelbuild/bazel/releases/download/$(cat .bazelversion)/bazel-$(cat .bazelversion)-installer-darwin-x86_64.sh
199
- sudo bash -x -e bazel-$(cat .bazelversion)-installer-darwin-x86_64.sh
200
-
201
- # Install tensorflow and configure bazel
202
- sudo ./configure.sh
203
-
204
- # Build shared libraries
205
- bazel build -s --verbose_failures //tensorflow_io/...
206
-
207
- # Once build is complete, shared libraries will be available in
208
- # `bazel-bin/tensorflow_io/core/python/ops/` and it is possible
209
- # to run tests with `pytest`, e.g.:
210
- sudo python3 -m pip install pytest
211
- TFIO_DATAPATH=bazel-bin python3 -m pytest -s -v tests/test_serialization_eager.py
212
- ```
213
-
214
- NOTE: When running pytest, `TFIO_DATAPATH=bazel-bin` has to be passed so that Python can load the shared libraries generated by the build.
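The same requirement applies when tests are launched from a script instead of a shell. The following is a minimal sketch, assuming the test runner is started as a child process; `pytest_invocation` is a hypothetical helper name, and only the `TFIO_DATAPATH` variable and the pytest flags come from the text above.

```python
import os
import sys
from typing import Dict, List, Mapping, Optional, Tuple


def pytest_invocation(
    test_path: str,
    datapath: str = "bazel-bin",
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build the command and environment for running one test file.

    The environment is a copy of the base environment with TFIO_DATAPATH
    added, so the caller's own environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TFIO_DATAPATH"] = datapath
    command = [sys.executable, "-m", "pytest", "-s", "-v", test_path]
    return command, env


if __name__ == "__main__":
    command, env = pytest_invocation("tests/test_serialization_eager.py")
    # Printed for inspection; pass both to subprocess.run(command, env=env)
    # from the repository root to actually run the test.
    print("TFIO_DATAPATH=" + env["TFIO_DATAPATH"], " ".join(command))
```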
215
-
216
- ##### Troubleshooting
217
-
218
- If Xcode is installed but `$ xcodebuild -version` does not display the expected output, you might need to enable the Xcode command line tools with the command:
219
-
220
- `$ xcode-select -s /Applications/Xcode.app/Contents/Developer`.
221
-
222
- A terminal restart might be required for the changes to take effect.
223
-
224
- Sample output:
225
-
226
- ```
227
- $ xcodebuild -version
228
- Xcode 11.6
229
- Build version 11E708
230
- ```
231
-
232
-
233
- #### Linux
234
-
235
- Development of tensorflow-io on Linux is similar to macOS. The required packages
236
- are gcc, g++, git, bazel, and Python 3. Newer versions of gcc or Python than the default system-installed
237
- versions might be required though.
238
-
239
- ##### Ubuntu 20.04
240
-
241
- Ubuntu 20.04 requires gcc/g++, git, and python 3. The following will install dependencies and build
242
- the shared libraries on Ubuntu 20.04:
243
- ```sh
244
- #!/usr/bin/env bash
245
-
246
- # Install gcc/g++, git, unzip/curl (for bazel), and python3
247
- sudo apt-get -y -qq update
248
- sudo apt-get -y -qq install gcc g++ git unzip curl python3-pip
249
-
250
- # Install Bazel version specified in .bazelversion
251
- curl -sSOL https://github.com/bazelbuild/bazel/releases/download/$(cat .bazelversion)/bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
252
- sudo bash -x -e bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
253
-
254
- # Upgrade pip
255
- sudo python3 -m pip install -U pip
256
-
257
- # Install tensorflow and configure bazel
258
- sudo ./configure.sh
259
-
260
- # Build shared libraries
261
- bazel build -s --verbose_failures //tensorflow_io/...
262
-
263
- # Once build is complete, shared libraries will be available in
264
- # `bazel-bin/tensorflow_io/core/python/ops/` and it is possible
265
- # to run tests with `pytest`, e.g.:
266
- sudo python3 -m pip install pytest
267
- TFIO_DATAPATH=bazel-bin python3 -m pytest -s -v tests/test_serialization_eager.py
268
- ```
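The `curl` line in the script above builds the installer URL by substituting the contents of `.bazelversion` twice. A rough Python equivalent is sketched below; `bazel_installer_url` is a hypothetical helper, and the version string used in the demo is an example value only, not the version pinned by the repository.

```python
def bazel_installer_url(version: str, platform: str = "linux") -> str:
    """Return the Bazel installer URL for a version and platform.

    Mirrors the URL pattern used in the shell scripts of this README.
    `version` is the text found in .bazelversion (surrounding whitespace
    and the trailing newline are ignored).
    """
    version = version.strip()
    if not version:
        raise ValueError("empty Bazel version")
    if platform not in ("linux", "darwin"):
        raise ValueError(f"unsupported platform: {platform!r}")
    installer = f"bazel-{version}-installer-{platform}-x86_64.sh"
    return (
        "https://github.com/bazelbuild/bazel/releases/download/"
        f"{version}/{installer}"
    )


if __name__ == "__main__":
    # "3.7.2" is an illustrative value; in the repository the version
    # would be read from the .bazelversion file instead.
    print(bazel_installer_url("3.7.2"))
```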
269
-
270
- ##### CentOS 8
271
-
272
- The steps to build shared libraries on CentOS 8 are similar to those for Ubuntu 20.04 above,
273
- except that
274
- ```
275
- sudo yum install -y python3 python3-devel gcc gcc-c++ git unzip which make
276
- ```
277
- should be used instead to install gcc/g++, git, unzip/which (for bazel), and python3.
278
-
279
- ##### CentOS 7
280
-
281
- On CentOS 7, the default Python and gcc versions are too old to build tensorflow-io's shared
282
- libraries (.so). The gcc provided by Developer Toolset and rh-python36 should be used instead.
283
- Also, libstdc++ has to be linked statically to avoid a mismatch between the libstdc++ installed on
284
- CentOS and the one shipped with the newer gcc from devtoolset.
285
-
286
- Furthermore, a special flag `--//tensorflow_io/core:static_build` has to be passed to Bazel
287
- in order to avoid duplication of symbols in statically linked libraries for file system
288
- plugins.
289
-
290
- The following will install bazel, devtoolset-9, rh-python36, and build the shared libraries:
291
- ```sh
292
- #!/usr/bin/env bash
293
-
294
- # Install centos-release-scl, then install gcc/g++ (devtoolset), git, and python 3
295
- sudo yum install -y centos-release-scl
296
- sudo yum install -y devtoolset-9 git rh-python36 make
297
-
298
- # Install Bazel version specified in .bazelversion
299
- curl -sSOL https://github.com/bazelbuild/bazel/releases/download/$(cat .bazelversion)/bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
300
- sudo bash -x -e bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
301
-
302
- # Upgrade pip
303
- scl enable rh-python36 devtoolset-9 \
304
-   'python3 -m pip install -U pip'
305
-
306
- # Install tensorflow and configure bazel with rh-python36
307
- scl enable rh-python36 devtoolset-9 \
308
-   './configure.sh'
309
-
310
- # Build shared libraries, notice the passing of --//tensorflow_io/core:static_build
311
- BAZEL_LINKOPTS="-static-libstdc++ -static-libgcc" BAZEL_LINKLIBS="-lm -l%:libstdc++.a" \
312
- scl enable rh-python36 devtoolset-9 \
313
-   'bazel build -s --verbose_failures --//tensorflow_io/core:static_build //tensorflow_io/...'
314
-
315
- # Once build is complete, shared libraries will be available in
316
- # `bazel-bin/tensorflow_io/core/python/ops/` and it is possible
317
- # to run tests with `pytest`, e.g.:
318
- scl enable rh-python36 devtoolset-9 \
319
-   'python3 -m pip install pytest'
320
-
321
- TFIO_DATAPATH=bazel-bin \
322
- scl enable rh-python36 devtoolset-9 \
323
-   'python3 -m pytest -s -v tests/test_serialization_eager.py'
324
- ```
325
-
326
- #### Python Wheels
327
-
328
- It is possible to build python wheels after bazel build is complete with the following command:
329
- ```
330
- $ python3 setup.py bdist_wheel --data bazel-bin
331
- ```
332
- The .whl file will be available in the `dist` directory. Note that the Bazel binary directory `bazel-bin`
333
- has to be passed with the `--data` argument in order for setup.py to locate the necessary shared objects,
334
- as `bazel-bin` is outside of the `tensorflow_io` package directory.
335
-
336
- Alternatively, a source install can be done with:
337
- ```
338
- $ TFIO_DATAPATH=bazel-bin python3 -m pip install .
339
- ```
340
- with `TFIO_DATAPATH=bazel-bin` passed for the same reason.
341
-
342
- Note that installing with `-e` is different from the above. The command
343
- ```
344
- $ TFIO_DATAPATH=bazel-bin python3 -m pip install -e .
345
- ```
346
- will not install the shared objects automatically, even with `TFIO_DATAPATH=bazel-bin`. Instead,
347
- `TFIO_DATAPATH=bazel-bin` has to be passed every time the program is run after the install:
348
- ```
349
- $ TFIO_DATAPATH=bazel-bin python3
350
-
351
- >>> import tensorflow_io as tfio
352
- >>> ...
353
- ```
354
-
355
- #### Docker
356
-
357
- For Python development, a reference Dockerfile [here](tools/docker/devel.Dockerfile) can be
358
- used to build the TensorFlow I/O package (`tensorflow-io`) from source. Additionally, the
359
- pre-built devel images can be used as well:
360
- ```sh
361
- # Pull (if necessary) and start the devel container
362
- $ docker run -it --rm --name tfio-dev --net=host -v ${PWD}:/v -w /v tfsigio/tfio:latest-devel bash
363
-
364
- # Inside the docker container, ./configure.sh will install TensorFlow or use an existing install
365
- (tfio-dev) root@docker-desktop:/v$ ./configure.sh
366
-
367
- # Clean up existing bazel builds (if any)
368
- (tfio-dev) root@docker-desktop:/v$ rm -rf bazel-*
369
-
370
- # Build TensorFlow I/O C++. For compilation optimization flags, the default (-march=native)
371
- # optimizes the generated code for your machine's CPU type.
372
- # Reference: https://www.tensorflow.org/install/source#configuration_options
373
-
374
- # NOTE: Based on the available resources, please change the number of job workers to:
375
- # -j 4/8/16 to prevent bazel server terminations and resource-related build errors.
376
-
377
- (tfio-dev) root@docker-desktop:/v$ bazel build -j 8 --copt=-msse4.2 --copt=-mavx --compilation_mode=opt --verbose_failures --test_output=errors --crosstool_top=//third_party/toolchains/gcc7_manylinux2010:toolchain //tensorflow_io/...
378
-
379
-
380
- # Run tests with pytest. Note: some tests require launching additional containers to run (see below)
381
- (tfio-dev) root@docker-desktop:/v$ pytest -s -v tests/
382
- # Build the TensorFlow I/O package
383
- (tfio-dev) root@docker-desktop:/v$ python setup.py bdist_wheel
384
- ```
385
-
386
- A package file `dist/tensorflow_io-*.whl` will be generated after a build is successful.
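To confirm that the build produced a package, the `dist` directory can be checked for files matching that pattern. The snippet below is a minimal sketch; `find_built_wheels` is a hypothetical helper name, and it assumes it is run from the repository root (or given the path to `dist` explicitly).

```python
from pathlib import Path
from typing import List


def find_built_wheels(dist_dir: str = "dist") -> List[Path]:
    """Return files matching <dist_dir>/tensorflow_io-*.whl, sorted by name.

    An empty list is returned when the directory does not exist yet,
    for example before the first build.
    """
    directory = Path(dist_dir)
    if not directory.is_dir():
        return []
    return sorted(directory.glob("tensorflow_io-*.whl"))


if __name__ == "__main__":
    wheels = find_built_wheels()
    if wheels:
        for wheel in wheels:
            print(wheel)
    else:
        print("no wheel found in dist/; has the build finished?")
```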
387
-
388
- NOTE: When working in the Python development container, an environment variable
389
- `TFIO_DATAPATH` is automatically set to point tensorflow-io to the shared C++
390
- libraries built by Bazel to run `pytest` and build the `bdist_wheel`. Python
391
- `setup.py` can also accept `--data [path]` as an argument, for example
392
- `python setup.py --data bazel-bin bdist_wheel`.
393
-
394
- NOTE: While the tfio-dev container gives developers an easy-to-work-with
395
- environment, the released whl packages are built differently due to manylinux2010
396
- requirements. Please check the [Build Status and CI] section for more details
397
- on how the released whl packages are generated.
398
-
399
- #### Starting Test Containers
400
-
401
- Some tests require launching a test container before running. In order
402
- to run all tests, execute the following commands:
403
-
404
- ```sh
405
- $ bash -x -e tests/test_ignite/start_ignite.sh
406
- $ bash -x -e tests/test_kafka/kafka_test.sh
407
- $ bash -x -e tests/test_kinesis/kinesis_test.sh
408
- ```
409
-
410
- ### R
411
-
412
- We provide a reference Dockerfile [here](R-package/scripts/Dockerfile) for you
413
- so that you can use the R package directly for testing. You can build it via:
414
- ```sh
415
- $ docker build -t tfio-r-dev -f R-package/scripts/Dockerfile .
416
- ```
417
-
418
- Inside the container, you can start your R session, instantiate a `SequenceFileDataset`
419
- from an example [Hadoop SequenceFile](https://wiki.apache.org/hadoop/SequenceFile)
420
- [string.seq](R-package/tests/testthat/testdata/string.seq), and then use any [transformation functions](https://tensorflow.rstudio.com/tools/tfdatasets/articles/introduction.html#transformations) provided by the [tfdatasets package](https://tensorflow.rstudio.com/tools/tfdatasets/) on the dataset like the following:
421
-
422
- ```r
423
- library(tfio)
424
- dataset <- sequence_file_dataset("R-package/tests/testthat/testdata/string.seq") %>%
425
-   dataset_repeat(2)
426
-
427
- sess <- tf$Session()
428
- iterator <- make_iterator_one_shot(dataset)
429
- next_batch <- iterator_get_next(iterator)
430
-
431
- until_out_of_range({
432
-   batch <- sess$run(next_batch)
433
-   print(batch)
434
- })
435
- ```
436
-
437
141
## Contributing
438
142
439
143
TensorFlow I/O is a community-led open source project. As such, the project