Accessing and using NASA's Earth observing data with fsspec, or the software built on top of it like the Pangeo stack, is harder than it should be. This package aims to abstract those complications away, and provide a convenient Python filesystem interface to NASA's Earth observing data.
The challenges this package aims to overcome are detailed in our overview document, and briefly restated here:
- Most NASA Earth observing datasets require authenticated HTTP access via NASA's Earthdata Login (EDL). However, fsspec does not support EDL/OAuth2 out of the box.
- NASA supports different access patterns for cloud-based and on-prem datasets hosted at the various Distributed Active Archive Centers (DAACs), where each DAAC may support only certain access patterns and auth mechanisms.
- Handling the above two challenges for large-scale, distributed workflows with tools like Dask adds further complications.
edlfs is being developed to hide those complications from users so that interacting with NASA's Earth observing data, even at global scale, is straightforward, much like how s3fs hides the complications of working with cloud data.
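To make the first challenge concrete, here is a minimal sketch of one piece of EDL plumbing that users otherwise handle by hand: looking up Earthdata Login credentials in a netrc file. This is only an illustration of a widely used convention, not a description of how edlfs works internally. The hostname `urs.earthdata.nasa.gov` and the helper name `read_edl_credentials` do not come from this README; the helper is hypothetical, and the demo uses a throwaway file rather than any real credentials.

```python
import netrc
import os
import tempfile

# Hostname of NASA's Earthdata Login service (assumption: not stated in this README).
EDL_HOST = "urs.earthdata.nasa.gov"


def read_edl_credentials(netrc_path):
    """Return (username, password) for EDL from a netrc file, or None if absent.

    Hypothetical helper for illustration; it is not part of edlfs.
    """
    try:
        entry = netrc.netrc(netrc_path).authenticators(EDL_HOST)
    except (OSError, netrc.NetrcParseError):
        # Missing, unreadable, or malformed netrc file.
        return None
    if entry is None:
        # The file exists but has no entry for the EDL host.
        return None
    login, _account, password = entry
    return login, password


# Demonstrate with a temporary netrc file so no real credentials are touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    with open(demo_path, "w") as fh:
        fh.write(f"machine {EDL_HOST} login demo_user password demo_pass\n")
    demo_credentials = read_edl_credentials(demo_path)
    missing_credentials = read_edl_credentials(os.path.join(tmp, "absent"))

print(demo_credentials)
```

Even with credentials in hand, they still have to be passed to the HTTP client underneath fsspec and carried through the login redirects, which is the kind of detail a dedicated filesystem layer can hide.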
import edlfs
print(edlfs.__version__)
In order to easily manage dependencies, we recommend using dedicated project environments via Anaconda/Miniconda or Python virtual environments.
NOTE: edlfs will be available on PyPI and conda-forge with the v0.1.0 release, which is coming soon! Until then, use the Development install.
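Because the package may not be installed in a given environment yet, it can be handy to check for it without triggering an ImportError. The following is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of edlfs.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report on edlfs without importing it.
version = installed_version("edlfs")
if version is None:
    print("edlfs is not installed in this environment")
else:
    print(f"edlfs {version} is installed")
```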
edlfs can be installed into a conda environment with
conda install -c conda-forge edlfs
or into a virtual environment with
python -m pip install edlfs
edlfs provides a docker container image with all the necessary dependencies pre-installed. To get the latest released version:
docker pull ghcr.io/nasa-openscapes/edlfs:latest
a specific release version (>=v0.1.0 only):
docker pull ghcr.io/nasa-openscapes/edlfs:0.1.0
or the current development version:
docker pull ghcr.io/nasa-openscapes/edlfs:test
To run the container and jump into a bash shell inside:
docker run -it --rm ghcr.io/nasa-openscapes/edlfs:latest
To mount your current directory inside the container so that files will be written back to your local machine:
docker run -it -v ${PWD}:/home/conda/work --rm ghcr.io/nasa-openscapes/edlfs:latest
cd work
For more docker run options, see: https://docs.docker.com/engine/reference/run/.
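If you prefer to launch the container from a script, the same docker run invocations can be assembled in Python. The sketch below only builds and prints the command; it reuses the image name and the /home/conda/work mount point from the commands above, while the helper name `docker_run_command` is hypothetical.

```python
import shlex
from pathlib import Path

# Image name taken from the docker commands in this README.
IMAGE = "ghcr.io/nasa-openscapes/edlfs"


def docker_run_command(tag="latest", mount_dir=None):
    """Build the argv list for an interactive, self-removing edlfs container.

    If mount_dir is given, it is mounted at /home/conda/work inside the container.
    """
    command = ["docker", "run", "-it", "--rm"]
    if mount_dir is not None:
        host_dir = Path(mount_dir).resolve()
        command += ["-v", f"{host_dir}:/home/conda/work"]
    command.append(f"{IMAGE}:{tag}")
    return command


# Build the command that mounts the current directory, and show it.
command = docker_run_command(mount_dir=Path.cwd())
print(shlex.join(command))
# To actually start the container: subprocess.run(command, check=True)
```

Resolving the directory in Python avoids relying on the shell to expand ${PWD}, which not every shell does.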
Found a bug? Want to request a feature? Open an issue
General questions? Suggestions? Or anything else? Start a discussion
Don't hesitate to reach out; we would love to hear from you!
To contribute to edlfs, first check out our Code of Conduct and our contributing guide.
To create a development environment for edlfs, we recommend using conda/mamba. First fork the repo and then:
git clone https://github.com/[OWNER]/edlfs.git
cd edlfs
mamba env update -f environment.yml # will create if env. doesn't already exist
mamba activate edlfs
python -m pip install -e .
Note: Each time you go to make new changes/create new feature branches, you may want to ensure the environment and install are up-to-date by running:
# from the repository root
mamba env update -f environment.yml
mamba deactivate && mamba activate edlfs
python -m pip install -e .
Feel free to add your name here, or if you want to sign up to be a maintainer, in the package authors.