Commit 686ce72

kvignesh1420 authored and i-ony committed
[docs] Restructure README.md content (tensorflow#1257)
* Refactor README.md content
* bump to run ci jobs
1 parent 8644501 commit 686ce72

3 files changed: +12 -320 lines changed

.github/workflows/build.yml

Lines changed: 2 additions & 2 deletions
@@ -76,7 +76,7 @@ jobs:
7676
- run: |
7777
set -x -e
7878
bash -x -e .github/workflows/build.space.sh
79-
python3 .github/workflows/build.instruction.py README.md "##### Ubuntu 20.04" > source.sh
79+
python3 .github/workflows/build.instruction.py docs/development.md "##### Ubuntu 20.04" > source.sh
8080
cat source.sh
8181
docker run -i --rm -v $PWD:/v -w /v --net=host ubuntu:20.04 \
8282
bash -x -e source.sh
@@ -89,7 +89,7 @@ jobs:
8989
- run: |
9090
set -x -e
9191
bash -x -e .github/workflows/build.space.sh
92-
python3 .github/workflows/build.instruction.py README.md "##### CentOS 7" > source.sh
92+
python3 .github/workflows/build.instruction.py docs/development.md "##### CentOS 7" > source.sh
9393
cat source.sh
9494
docker run -i --rm -v $PWD:/v -w /v --net=host \
9595
-e BAZEL_OPTIMIZATION="${BAZEL_OPTIMIZATION}" \
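
The workflow steps above call `build.instruction.py` with a Markdown file and a heading, and redirect its output to `source.sh`. The script itself is not part of this diff, so the following is only a sketch of how such an extractor could work, assuming it prints the first fenced code block found under the given heading. The function name `extract_commands` is hypothetical.

```python
import sys


def extract_commands(markdown_text, heading):
    """Return the lines of the first fenced code block that follows `heading`.

    Hypothetical helper: the real build.instruction.py may behave differently.
    """
    lines = markdown_text.splitlines()
    # Locate the line that matches the heading exactly.
    try:
        start = next(i for i, line in enumerate(lines) if line.strip() == heading)
    except StopIteration:
        raise ValueError(f"heading not found: {heading!r}")

    commands = []
    in_fence = False
    for line in lines[start + 1:]:
        if line.startswith("```"):
            if in_fence:
                break  # closing fence: the block is complete
            in_fence = True  # opening fence, e.g. ```sh
            continue
        if in_fence:
            commands.append(line)
        elif line.startswith("#"):
            break  # reached the next heading without finding a code block
    return commands


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python3 extract.py docs/development.md "##### Ubuntu 20.04" > source.sh
    with open(sys.argv[1], encoding="utf-8") as f:
        print("\n".join(extract_commands(f.read(), sys.argv[2])))
```

Under this assumption, moving the build instructions from `README.md` to `docs/development.md` only requires changing the file argument, as long as the headings keep the same text.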

README.md

Lines changed: 0 additions & 296 deletions
@@ -138,302 +138,6 @@ of releases [here](https://github.com/tensorflow/io/releases).
138138
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
139139
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
140140

141-
## Development
142-
143-
### IDE Setup
144-
145-
For instructions on how to configure Visual Studio Code for developing TensorFlow I/O, please refer to
146-
https://github.com/tensorflow/io/blob/master/docs/vscode.md
147-
148-
### Lint
149-
150-
TensorFlow I/O's code conforms to Bazel Buildifier, Clang Format, Black, and Pyupgrade.
151-
Please use the following command to check the source code and identify lint issues:
152-
```
153-
$ bazel run //tools/lint:check
154-
```
155-
156-
For Bazel Buildifier and Clang Format, the following command will automatically identify
157-
and fix any lint errors:
158-
```
159-
$ bazel run //tools/lint:lint
160-
```
161-
162-
Alternatively, if you only want to perform a lint check using individual linters,
163-
then you can selectively pass `black`, `pyupgrade`, `bazel`, or `clang` to the above commands.
164-
165-
For example, a `black` specific lint check can be done using:
166-
```
167-
$ bazel run //tools/lint:check -- black
168-
```
169-
170-
Lint fix using Bazel Buildifier and Clang Format can be done using:
171-
```
172-
$ bazel run //tools/lint:lint -- bazel clang
173-
```
174-
175-
Lint check using `black` and `pyupgrade` for an individual python file can be done using:
176-
```
177-
$ bazel run //tools/lint:check -- black pyupgrade -- tensorflow_io/core/python/ops/version_ops.py
178-
```
179-
180-
Lint fix using `black` and `pyupgrade` for an individual python file can be done using:
181-
```
182-
$ bazel run //tools/lint:lint -- black pyupgrade -- tensorflow_io/core/python/ops/version_ops.py
183-
```
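
The lint commands above share one shape: the Bazel target, then `--`, then the linter names, then optionally a second `--` followed by file paths. As a rough sketch of that shape (the helper `lint_command` is hypothetical and not part of the repository), the argument list could be assembled like this:

```python
def lint_command(action, linters=(), files=()):
    """Build the argv for `bazel run //tools/lint:<action>`.

    Hypothetical helper that mirrors the command forms shown in the README.
    """
    if action not in ("check", "lint"):
        raise ValueError("action must be 'check' or 'lint'")
    if files and not linters:
        # The README only shows file paths together with explicit linters.
        raise ValueError("pass at least one linter when selecting files")

    argv = ["bazel", "run", f"//tools/lint:{action}"]
    if linters:
        argv.append("--")  # separates Bazel's own arguments from the linters
        argv.extend(linters)
    if files:
        argv.append("--")  # separates the linters from the file paths
        argv.extend(files)
    return argv


# Example: check one file with black and pyupgrade.
print(" ".join(lint_command(
    "check",
    ["black", "pyupgrade"],
    ["tensorflow_io/core/python/ops/version_ops.py"],
)))
```

The resulting list can be passed to `subprocess.run` from the repository root.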
184-
185-
### Notebooks/Tutorials
186-
If you are updating or creating a notebook, please refer to the tutorials and instructions mentioned [here](https://github.com/tensorflow/io/tree/master/docs/tutorials).
187-
188-
### Python
189-
190-
#### macOS
191-
192-
```sh
193-
194-
# Show macOS's default python3
195-
python3 --version
196-
197-
# Install Bazel version specified in .bazelversion
198-
curl -OL https://github.com/bazelbuild/bazel/releases/download/$(cat .bazelversion)/bazel-$(cat .bazelversion)-installer-darwin-x86_64.sh
199-
sudo bash -x -e bazel-$(cat .bazelversion)-installer-darwin-x86_64.sh
200-
201-
# Install tensorflow and configure bazel
202-
sudo ./configure.sh
203-
204-
# Build shared libraries
205-
bazel build -s --verbose_failures //tensorflow_io/...
206-
207-
# Once build is complete, shared libraries will be available in
208-
# `bazel-bin/tensorflow_io/core/python/ops/` and it is possible
209-
# to run tests with `pytest`, e.g.:
210-
sudo python3 -m pip install pytest
211-
TFIO_DATAPATH=bazel-bin python3 -m pytest -s -v tests/test_serialization_eager.py
212-
```
213-
214-
NOTE: When running pytest, `TFIO_DATAPATH=bazel-bin` has to be passed so that python can utilize the generated shared libraries after the build process.
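
To illustrate why the variable matters, here is a minimal sketch of an environment-variable override for a library lookup. It is not tensorflow-io's actual loader; the function `resolve_library`, its parameters, and the file name `libexample.so` are assumptions made for illustration only.

```python
import os


def resolve_library(relative_path, package_dir, environ=os.environ):
    """Return the path where a shared library would be looked up.

    Hypothetical sketch: if TFIO_DATAPATH is set, look under that directory
    (for example bazel-bin); otherwise fall back to the installed package.
    """
    base = environ.get("TFIO_DATAPATH")
    if base:
        return os.path.join(base, relative_path)
    return os.path.join(package_dir, relative_path)


# With the variable set, the Bazel output directory is used.
print(resolve_library(
    "tensorflow_io/core/python/ops/libexample.so",
    package_dir="/path/to/site-packages",
    environ={"TFIO_DATAPATH": "bazel-bin"},
))
```

Without `TFIO_DATAPATH`, a lookup like this would only search the installed package directory, which does not contain the freshly built libraries.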
215-
216-
##### Troubleshoot
217-
218-
If Xcode is installed, but `$ xcodebuild -version` is not displaying the expected output, you might need to enable Xcode command line with the command:
219-
220-
`$ xcode-select -s /Applications/Xcode.app/Contents/Developer`.
221-
222-
A terminal restart might be required for the changes to take effect.
223-
224-
Sample output:
225-
226-
```
227-
$ xcodebuild -version
228-
Xcode 11.6
229-
Build version 11E708
230-
```
231-
232-
233-
#### Linux
234-
235-
Development of tensorflow-io on Linux is similar to macOS. The required packages
236-
are gcc, g++, git, bazel, and python 3. Versions of gcc or python newer than the system-installed
237-
defaults might be required, though.
238-
239-
##### Ubuntu 20.04
240-
241-
Ubuntu 20.04 requires gcc/g++, git, and python 3. The following will install dependencies and build
242-
the shared libraries on Ubuntu 20.04:
243-
```sh
244-
#!/usr/bin/env bash
245-
246-
# Install gcc/g++, git, unzip/curl (for bazel), and python3
247-
sudo apt-get -y -qq update
248-
sudo apt-get -y -qq install gcc g++ git unzip curl python3-pip
249-
250-
# Install Bazel version specified in .bazelversion
251-
curl -sSOL https://github.com/bazelbuild/bazel/releases/download/$(cat .bazelversion)/bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
252-
sudo bash -x -e bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
253-
254-
# Upgrade pip
255-
sudo python3 -m pip install -U pip
256-
257-
# Install tensorflow and configure bazel
258-
sudo ./configure.sh
259-
260-
# Build shared libraries
261-
bazel build -s --verbose_failures //tensorflow_io/...
262-
263-
# Once build is complete, shared libraries will be available in
264-
# `bazel-bin/tensorflow_io/core/python/ops/` and it is possible
265-
# to run tests with `pytest`, e.g.:
266-
sudo python3 -m pip install pytest
267-
TFIO_DATAPATH=bazel-bin python3 -m pytest -s -v tests/test_serialization_eager.py
268-
```
269-
270-
##### CentOS 8
271-
272-
The steps to build shared libraries for CentOS 8 are similar to Ubuntu 20.04 above
273-
except that
274-
```
275-
sudo yum install -y python3 python3-devel gcc gcc-c++ git unzip which make
276-
```
277-
should be used instead to install gcc/g++, git, unzip/which (for bazel), and python3.
278-
279-
##### CentOS 7
280-
281-
On CentOS 7, the default python and gcc versions are too old to build tensorflow-io's shared
282-
libraries (.so). The gcc provided by Developer Toolset and rh-python36 should be used instead.
283-
Also, libstdc++ has to be linked statically to avoid a mismatch between the libstdc++ installed on
284-
CentOS and the newer gcc version provided by devtoolset.
285-
286-
Furthermore, a special flag `--//tensorflow_io/core:static_build` has to be passed to Bazel
287-
in order to avoid duplication of symbols in statically linked libraries for file system
288-
plugins.
289-
290-
The following will install bazel, devtoolset-9, rh-python36, and build the shared libraries:
291-
```sh
292-
#!/usr/bin/env bash
293-
294-
# Install centos-release-scl, then install gcc/g++ (devtoolset), git, and python 3
295-
sudo yum install -y centos-release-scl
296-
sudo yum install -y devtoolset-9 git rh-python36 make
297-
298-
# Install Bazel version specified in .bazelversion
299-
curl -sSOL https://github.com/bazelbuild/bazel/releases/download/$(cat .bazelversion)/bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
300-
sudo bash -x -e bazel-$(cat .bazelversion)-installer-linux-x86_64.sh
301-
302-
# Upgrade pip
303-
scl enable rh-python36 devtoolset-9 \
304-
'python3 -m pip install -U pip'
305-
306-
# Install tensorflow and configure bazel with rh-python36
307-
scl enable rh-python36 devtoolset-9 \
308-
'./configure.sh'
309-
310-
# Build shared libraries, notice the passing of --//tensorflow_io/core:static_build
311-
BAZEL_LINKOPTS="-static-libstdc++ -static-libgcc" BAZEL_LINKLIBS="-lm -l%:libstdc++.a" \
312-
scl enable rh-python36 devtoolset-9 \
313-
'bazel build -s --verbose_failures --//tensorflow_io/core:static_build //tensorflow_io/...'
314-
315-
# Once build is complete, shared libraries will be available in
316-
# `bazel-bin/tensorflow_io/core/python/ops/` and it is possible
317-
# to run tests with `pytest`, e.g.:
318-
scl enable rh-python36 devtoolset-9 \
319-
'python3 -m pip install pytest'
320-
321-
TFIO_DATAPATH=bazel-bin \
322-
scl enable rh-python36 devtoolset-9 \
323-
'python3 -m pytest -s -v tests/test_serialization_eager.py'
324-
```
325-
326-
#### Python Wheels
327-
328-
It is possible to build python wheels after bazel build is complete with the following command:
329-
```
330-
$ python3 setup.py bdist_wheel --data bazel-bin
331-
```
332-
The .whl file will be available in dist directory. Note the bazel binary directory `bazel-bin`
333-
has to be passed with `--data` args in order for setup.py to locate the necessary shared objects,
334-
as `bazel-bin` is outside of the `tensorflow_io` package directory.
335-
336-
Alternatively, source install could be done with:
337-
```
338-
$ TFIO_DATAPATH=bazel-bin python3 -m pip install .
339-
```
340-
with `TFIO_DATAPATH=bazel-bin` passed for the same reason.
341-
342-
Note installing with `-e` is different from the above. The
343-
```
344-
$ TFIO_DATAPATH=bazel-bin python3 -m pip install -e .
345-
```
346-
will not install shared objects automatically even with `TFIO_DATAPATH=bazel-bin`. Instead,
347-
`TFIO_DATAPATH=bazel-bin` has to be passed every time the program is run after the install:
348-
```
349-
$ TFIO_DATAPATH=bazel-bin python3
350-
351-
>>> import tensorflow_io as tfio
352-
>>> ...
353-
```
354-
355-
#### Docker
356-
357-
For Python development, a reference Dockerfile [here](tools/docker/devel.Dockerfile) can be
358-
used to build the TensorFlow I/O package (`tensorflow-io`) from source. Additionally, the
359-
pre-built devel images can be used as well:
360-
```sh
361-
# Pull (if necessary) and start the devel container
362-
$ docker run -it --rm --name tfio-dev --net=host -v ${PWD}:/v -w /v tfsigio/tfio:latest-devel bash
363-
364-
# Inside the docker container, ./configure.sh will install TensorFlow or use existing install
365-
(tfio-dev) root@docker-desktop:/v$ ./configure.sh
366-
367-
# Clean up existing bazel builds (if any)
368-
(tfio-dev) root@docker-desktop:/v$ rm -rf bazel-*
369-
370-
# Build TensorFlow I/O C++. For compilation optimization flags, the default (-march=native)
371-
# optimizes the generated code for your machine's CPU type.
372-
# Reference: https://www.tensorflow.org/install/source#configuration_options
373-
374-
# NOTE: Based on the available resources, please change the number of job workers to:
375-
# -j 4/8/16 to prevent bazel server terminations and resource oriented build errors.
376-
377-
(tfio-dev) root@docker-desktop:/v$ bazel build -j 8 --copt=-msse4.2 --copt=-mavx --compilation_mode=opt --verbose_failures --test_output=errors --crosstool_top=//third_party/toolchains/gcc7_manylinux2010:toolchain //tensorflow_io/...
378-
379-
380-
# Run tests with PyTest, note: some tests require launching additional containers to run (see below)
381-
(tfio-dev) root@docker-desktop:/v$ pytest -s -v tests/
382-
# Build the TensorFlow I/O package
383-
(tfio-dev) root@docker-desktop:/v$ python setup.py bdist_wheel
384-
```
385-
386-
A package file `dist/tensorflow_io-*.whl` will be generated after a build is successful.
387-
388-
NOTE: When working in the Python development container, an environment variable
389-
`TFIO_DATAPATH` is automatically set to point tensorflow-io to the shared C++
390-
libraries built by Bazel to run `pytest` and build the `bdist_wheel`. Python
391-
`setup.py` can also accept `--data [path]` as an argument, for example
392-
`python setup.py --data bazel-bin bdist_wheel`.
393-
394-
NOTE: While the tfio-dev container gives developers an easy to work with
395-
environment, the released whl packages are built differently due to manylinux2010
396-
requirements. Please check [Build Status and CI] section for more details
397-
on how the released whl packages are generated.
398-
399-
#### Starting Test Containers
400-
401-
Some tests require launching a test container before running. In order
402-
to run all tests, execute the following commands:
403-
404-
```sh
405-
$ bash -x -e tests/test_ignite/start_ignite.sh
406-
$ bash -x -e tests/test_kafka/kafka_test.sh
407-
$ bash -x -e tests/test_kinesis/kinesis_test.sh
408-
```
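
The three scripts above can also be started from Python, for example in a local test helper. This is a sketch only: `start_test_containers` is a hypothetical function, and it assumes the scripts are run from the repository root with `bash -x -e`, as in the commands shown.

```python
import subprocess

# Script paths as listed in the README section above.
TEST_CONTAINER_SCRIPTS = [
    "tests/test_ignite/start_ignite.sh",
    "tests/test_kafka/kafka_test.sh",
    "tests/test_kinesis/kinesis_test.sh",
]


def start_test_containers(scripts=TEST_CONTAINER_SCRIPTS, run=subprocess.run):
    """Run each start-up script in order, stopping at the first failure."""
    for script in scripts:
        # check=True raises CalledProcessError if a script exits non-zero.
        run(["bash", "-x", "-e", script], check=True)
```

Passing a different `run` callable makes it possible to dry-run the helper without launching any containers.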
409-
410-
### R
411-
412-
We provide a reference Dockerfile [here](R-package/scripts/Dockerfile) for you
413-
so that you can use the R package directly for testing. You can build it via:
414-
```sh
415-
$ docker build -t tfio-r-dev -f R-package/scripts/Dockerfile .
416-
```
417-
418-
Inside the container, you can start your R session, instantiate a `SequenceFileDataset`
419-
from an example [Hadoop SequenceFile](https://wiki.apache.org/hadoop/SequenceFile)
420-
[string.seq](R-package/tests/testthat/testdata/string.seq), and then use any [transformation functions](https://tensorflow.rstudio.com/tools/tfdatasets/articles/introduction.html#transformations) provided by [tfdatasets package](https://tensorflow.rstudio.com/tools/tfdatasets/) on the dataset like the following:
421-
422-
```r
423-
library(tfio)
424-
dataset <- sequence_file_dataset("R-package/tests/testthat/testdata/string.seq") %>%
425-
dataset_repeat(2)
426-
427-
sess <- tf$Session()
428-
iterator <- make_iterator_one_shot(dataset)
429-
next_batch <- iterator_get_next(iterator)
430-
431-
until_out_of_range({
432-
batch <- sess$run(next_batch)
433-
print(batch)
434-
})
435-
```
436-
437141
## Contributing
438142

439143
Tensorflow I/O is a community led open source project. As such, the project
