Add support for dynamically loaded DAOS libraries #4

Merged
merged 1 commit into from
Jun 22, 2022

@krehm krehm commented Jun 18, 2022

Currently, if the DAOS libraries are not installed on a node,
libtensorflow_io_plugins.so fails to load because of unresolved
external symbols, and all modular filesystems become unusable, not
just DFS. This PR changes the DFS plugin to load the DAOS libraries
dynamically, so that the DFS filesystem is available if DAOS is
installed and the other modular filesystems remain available if
it is not.
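
As a rough illustration of the dynamic-loading idea, the sketch below opens a shared library at run time with `dlopen` and looks up `daos_init` with `dlsym`, instead of linking against DAOS at build time. The `DynamicLibrary` class and `DaosAvailable` function are hypothetical names chosen for this example, and the library name passed in is an assumption; the actual plugin code may be organized quite differently.

```cpp
#include <dlfcn.h>

#include <string>

// Hypothetical helper (not the plugin's actual code): owns a handle to a
// shared library that is opened at run time rather than linked at build time.
class DynamicLibrary {
 public:
  explicit DynamicLibrary(const std::string& name)
      : handle_(dlopen(name.c_str(), RTLD_NOW | RTLD_LOCAL)) {}
  ~DynamicLibrary() {
    if (handle_ != nullptr) dlclose(handle_);
  }
  DynamicLibrary(const DynamicLibrary&) = delete;
  DynamicLibrary& operator=(const DynamicLibrary&) = delete;

  bool loaded() const { return handle_ != nullptr; }

  // Returns the symbol's address, or nullptr if the library failed to load
  // or does not export the symbol.
  void* Symbol(const std::string& symbol) const {
    return loaded() ? dlsym(handle_, symbol.c_str()) : nullptr;
  }

 private:
  void* handle_;
};

// Assumed signature: daos_init() takes no arguments and returns an int.
using DaosInitFn = int (*)();

// Returns true only if the library can be opened and exports daos_init.
// A missing library is reported as "unavailable" instead of a load failure.
bool DaosAvailable(const std::string& lib_name) {
  DynamicLibrary lib(lib_name);
  auto init = reinterpret_cast<DaosInitFn>(lib.Symbol("daos_init"));
  return init != nullptr;
}
```

With this shape, a call such as `DaosAvailable("libdaos.so")` (library name assumed) simply returns `false` on a node without DAOS, and the enclosing plugin library still loads.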

The checks for the DAOS libraries and the daos_init() call are
now done at filesystem registration time, not as part of each
function call in the filesystem API. If the libraries are not
installed then the DFS filesystem will not be registered,
and no calls into DFS functions will ever occur. In this case
TensorFlow will just report
"File system scheme 'dfs' not implemented"
when a "dfs://" path is used.
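
The registration-time check can be pictured with a small stand-in registry. This is only a sketch: `SchemeRegistry`, `RegisterIf`, and `Open` are hypothetical names, and TensorFlow's real plugin registration API is not shown. The point is that the availability probe runs once, at registration, and an unregistered scheme produces the error quoted above.

```cpp
#include <functional>
#include <set>
#include <string>

// Hypothetical stand-in for a filesystem registry, used for illustration only.
class SchemeRegistry {
 public:
  // Registers `scheme` only when `probe` succeeds. The probe runs once here,
  // not on every filesystem call.
  bool RegisterIf(const std::string& scheme,
                  const std::function<bool()>& probe) {
    if (!probe()) return false;
    schemes_.insert(scheme);
    return true;
  }

  // Returns an empty string if the path's scheme is registered, otherwise
  // an error message in the form quoted in the PR description.
  std::string Open(const std::string& path) const {
    const auto pos = path.find("://");
    const std::string scheme =
        pos == std::string::npos ? std::string() : path.substr(0, pos);
    if (schemes_.count(scheme) == 0) {
      return "File system scheme '" + scheme + "' not implemented";
    }
    return "";
  }

 private:
  std::set<std::string> schemes_;
};
```

In this sketch the probe would be whatever loads the DAOS libraries and calls daos_init(); if it fails, "dfs" is never registered and the other schemes are unaffected.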

Several separate functions existed, each of which was called only
once as part of DFS destruction; these were combined into the DFS
destructor for simplicity. Similar recombinations were done to
simplify DFS construction.
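
A minimal sketch of that consolidation, assuming a hypothetical `DfsSession` class: setup steps that were each called once move into the constructor, and the matching teardown steps move into the destructor in reverse order. The step names are placeholders written to a log; no DAOS functions are called here.

```cpp
#include <string>
#include <vector>

// Hypothetical illustration: each step is recorded in a log instead of
// calling into DAOS, so the ordering is easy to inspect.
class DfsSession {
 public:
  explicit DfsSession(std::vector<std::string>* log) : log_(log) {
    // Formerly separate one-shot setup helpers, now inlined.
    log_->push_back("init");
    log_->push_back("connect");
    log_->push_back("mount");
  }

  ~DfsSession() {
    // Formerly separate one-shot teardown helpers, run in reverse order.
    log_->push_back("unmount");
    log_->push_back("disconnect");
    log_->push_back("fini");
  }

  DfsSession(const DfsSession&) = delete;
  DfsSession& operator=(const DfsSession&) = delete;

 private:
  std::vector<std::string>* log_;
};
```

Keeping teardown in the destructor means it cannot be skipped or run out of order by a caller, which is one plausible reading of "for simplicity" above.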

Signed-off-by: Kevan Rehm [email protected]

@omar-hmarzouk omar-hmarzouk merged commit eb90e69 into daos-stack:devel Jun 22, 2022