Commit 57a6c64

Author: dcherian
Merge branch 'refactor-plot-utils' into yohai-ds_scatter
* refactor-plot-utils: (22 commits)
  review comment.
  small rename
  stale requires a label (pydata#2701)
  Update indexing.rst (pydata#2700)
  add line break to message posted (pydata#2698)
  Config for closing stale issues (pydata#2684)
  to_dict without data (pydata#2659)
  Update asv.conf.json (pydata#2693)
  try no rasterio in py36 env (pydata#2691)
  Detailed report for testing.assert_equal and testing.assert_identical (pydata#1507)
  Hotfix for pydata#2662 (pydata#2678)
  Update README.rst (pydata#2682)
  Fix test failures with numpy=1.16 (pydata#2675)
  lint
  Back to map_dataarray_line
  Refactor out cmap_params, cbar_kwargs processing
  Refactor out colorbar making to plot.utils._add_colorbar
  flake8
  facetgrid refactor
  Refactor out utility functions.
  ...
2 parents 1d939af + 351a466 commit 57a6c64

28 files changed: 734 additions and 376 deletions

.github/stale.yml

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+# Configuration for probot-stale - https://github.com/probot/stale
+
+# Number of days of inactivity before an Issue or Pull Request becomes stale
+daysUntilStale: 700 # start with a large number and reduce shortly
+
+# Number of days of inactivity before an Issue or Pull Request with the stale label is closed.
+# Set to false to disable. If disabled, issues still need to be closed manually, but will remain marked as stale.
+daysUntilClose: 30
+
+# Issues or Pull Requests with these labels will never be considered stale. Set to `[]` to disable
+exemptLabels:
+  - pinned
+  - security
+  - "[Status] Maybe Later"
+
+# Set to true to ignore issues in a project (defaults to false)
+exemptProjects: false
+
+# Set to true to ignore issues in a milestone (defaults to false)
+exemptMilestones: false
+
+# Set to true to ignore issues with an assignee (defaults to false)
+exemptAssignees: true
+
+# Label to use when marking as stale
+staleLabel: stale
+
+# Comment to post when marking as stale. Set to `false` to disable
+markComment: |
+  In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity
+  If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically
+
+# Comment to post when removing the stale label.
+# unmarkComment: >
+# Your comment here.
+
+# Comment to post when closing a stale Issue or Pull Request.
+# closeComment: >
+# Your comment here.
+
+# Limit the number of actions per hour, from 1-30. Default is 30
+limitPerRun: 1 # start with a small number
+
+
+# Limit to only `issues` or `pulls`
+# only: issues
+
+# Optionally, specify configuration settings that are specific to just 'issues' or 'pulls':
+# pulls:
+# daysUntilStale: 30
+# markComment: >
+# This pull request has been automatically marked as stale because it has not had
+# recent activity. It will be closed if no further activity occurs. Thank you
+# for your contributions.
+
+# issues:
+# exemptLabels:
+# - confirmed

.travis.yml

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@ script:
   - python --version
   - python -OO -c "import xarray"
   - if [[ "$CONDA_ENV" == "docs" ]]; then
-      conda install -c conda-forge sphinx sphinx_rtd_theme sphinx-gallery numpydoc;
+      conda install -c conda-forge --override-channels sphinx sphinx_rtd_theme sphinx-gallery numpydoc "gdal>2.2.4";
       sphinx-build -n -j auto -b html -d _build/doctrees doc _build/html;
     elif [[ "$CONDA_ENV" == "lint" ]]; then
       pycodestyle xarray ;

README.rst

Lines changed: 26 additions & 60 deletions
@@ -9,49 +9,47 @@ xarray: N-D labeled arrays and datasets
    :target: https://coveralls.io/r/pydata/xarray
 .. image:: https://readthedocs.org/projects/xray/badge/?version=latest
    :target: http://xarray.pydata.org/
-.. image:: https://img.shields.io/pypi/v/xarray.svg
-   :target: https://pypi.python.org/pypi/xarray/
-.. image:: https://zenodo.org/badge/13221727.svg
-   :target: https://zenodo.org/badge/latestdoi/13221727
 .. image:: http://img.shields.io/badge/benchmarked%20by-asv-green.svg?style=flat
    :target: http://pandas.pydata.org/speed/xarray/
-.. image:: https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A
-   :target: http://numfocus.org
+.. image:: https://img.shields.io/pypi/v/xarray.svg
+   :target: https://pypi.python.org/pypi/xarray/

 **xarray** (formerly **xray**) is an open source project and Python package
 that makes working with labelled multi-dimensional arrays simple,
 efficient, and fun!

-Multi-dimensional (a.k.a. N-dimensional, ND) arrays (sometimes called
-"tensors") are an essential part of computational science.
-They are encountered in a wide range of fields, including physics, astronomy,
-geoscience, bioinformatics, engineering, finance, and deep learning.
-In Python, NumPy_ provides the fundamental data structure and API for
-working with raw ND arrays.
-However, real-world datasets are usually more than just raw numbers;
-they have labels which encode information about how the array values map
-to locations in space, time, etc.
+Xarray introduces labels in the form of dimensions, coordinates and
+attributes on top of raw NumPy_-like arrays, which allows for a more
+intuitive, more concise, and less error-prone developer experience.
+The package includes a large and growing library of domain-agnostic functions
+for advanced analytics and visualization with these data structures.

-By introducing *dimensions*, *coordinates*, and *attributes* on top of raw
-NumPy-like arrays, xarray is able to understand these labels and use them to
-provide a more intuitive, more concise, and less error-prone experience.
-Xarray also provides a large and growing library of functions for advanced
-analytics and visualization with these data structures.
 Xarray was inspired by and borrows heavily from pandas_, the popular data
 analysis package focused on labelled tabular data.
-Xarray can read and write data from most common labeled ND-array storage
-formats and is particularly tailored to working with netCDF_ files, which were
-the source of xarray's data model.
+It is particularly tailored to working with netCDF_ files, which were the
+source of xarray's data model, and integrates tightly with dask_ for parallel
+computing.

-.. _NumPy: http://www.numpy.org/
+.. _NumPy: http://www.numpy.org
 .. _pandas: http://pandas.pydata.org
+.. _dask: http://dask.org
 .. _netCDF: http://www.unidata.ucar.edu/software/netcdf

 Why xarray?
 -----------

-Adding dimensions names and coordinate indexes to numpy's ndarray_ makes many
-powerful array operations possible:
+Multi-dimensional (a.k.a. N-dimensional, ND) arrays (sometimes called
+"tensors") are an essential part of computational science.
+They are encountered in a wide range of fields, including physics, astronomy,
+geoscience, bioinformatics, engineering, finance, and deep learning.
+In Python, NumPy_ provides the fundamental data structure and API for
+working with raw ND arrays.
+However, real-world datasets are usually more than just raw numbers;
+they have labels which encode information about how the array values map
+to locations in space, time, etc.
+
+Xarray doesn't just keep track of labels on arrays -- it uses them to provide a
+powerful and concise interface. For example:

 - Apply operations over dimensions by name: ``x.sum('time')``.
 - Select values by label instead of integer location:
@@ -65,42 +63,10 @@ powerful array operations possible:
 - Keep track of arbitrary metadata in the form of a Python dictionary:
   ``x.attrs``.

-pandas_ provides many of these features, but it does not make use of dimension
-names, and its core data structures are fixed dimensional arrays.
-
-Why isn't pandas enough?
-------------------------
-
-pandas_ excels at working with tabular data. That suffices for many statistical
-analyses, but physical scientists rely on N-dimensional arrays -- which is
-where xarray comes in.
-
-xarray aims to provide a data analysis toolkit as powerful as pandas_ but
-designed for working with homogeneous N-dimensional arrays
-instead of tabular data. When possible, we copy the pandas API and rely on
-pandas's highly optimized internals (in particular, for fast indexing).
-
-Why netCDF?
------------
-
-Because xarray implements the same data model as the netCDF_ file format,
-xarray datasets have a natural and portable serialization format. But it is also
-easy to robustly convert an xarray ``DataArray`` to and from a numpy ``ndarray``
-or a pandas ``DataFrame`` or ``Series``, providing compatibility with the full
-`PyData ecosystem <http://pydata.org/>`__.
-
-Our target audience is anyone who needs N-dimensional labeled arrays, but we
-are particularly focused on the data analysis needs of physical scientists --
-especially geoscientists who already know and love netCDF_.
-
-.. _ndarray: http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html
-.. _pandas: http://pandas.pydata.org
-.. _netCDF: http://www.unidata.ucar.edu/software/netcdf
-
 Documentation
 -------------

-The official documentation is hosted on ReadTheDocs at http://xarray.pydata.org/
+Learn more about xarray in its official documentation at http://xarray.pydata.org/

 Contributing
 ------------
@@ -148,7 +114,7 @@ __ http://climate.com/
 License
 -------

-Copyright 2014-2018, xarray Developers
+Copyright 2014-2019, xarray Developers

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
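
Note on the README change: the "Why xarray?" section now leads with label-based operations (reduction by dimension name, selection by label, metadata in ``x.attrs``). A minimal Python sketch of those operations follows. The array contents, the ``space`` dimension, and the ``units`` attribute are made up for illustration and are not part of the commit.

```python
import numpy as np
import xarray as xr

# Hypothetical 3x2 array with named dimensions, coordinate labels and metadata.
x = xr.DataArray(
    np.arange(6.0).reshape(3, 2),
    dims=("time", "space"),
    coords={"time": [2000, 2001, 2002], "space": ["IA", "IL"]},
    attrs={"units": "degC"},
)

# Reduce over a dimension by name instead of by axis number.
total = x.sum("time")

# Select by coordinate label instead of integer position.
iowa = x.sel(space="IA")

# Arbitrary metadata lives in a plain Python dictionary.
units = x.attrs["units"]
```

Here ``x.sum("time")`` plays the role of ``x.sum(axis=0)`` in plain NumPy, but it keeps working if the dimension order changes.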

asv_bench/asv.conf.json

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@

     // The Pythons you'd like to test against. If not provided, defaults
     // to the current version of Python used to run `asv`.
-    "pythons": ["2.7", "3.6"],
+    "pythons": ["3.6"],

     // The matrix of dependencies to test. Each key is the name of a
     // package (in PyPI) and the values are version numbers. An empty

ci/requirements-py36.yml

Lines changed: 3 additions & 3 deletions
@@ -20,14 +20,14 @@ dependencies:
   - scipy
   - seaborn
   - toolz
-  - rasterio
+  # - rasterio # xref #2683
   - bottleneck
   - zarr
   - pseudonetcdf>=3.0.1
   - eccodes
   - cdms2
-  - pynio
-  - iris>=1.10
+  # - pynio # xref #2683
+  # - iris>=1.10 # xref #2683
   - pydap
   - lxml
   - pip:

doc/faq.rst

Lines changed: 8 additions & 6 deletions
@@ -18,8 +18,9 @@ pandas is a fantastic library for analysis of low-dimensional labelled data -
 if it can be sensibly described as "rows and columns", pandas is probably the
 right choice. However, sometimes we want to use higher dimensional arrays
 (`ndim > 2`), or arrays for which the order of dimensions (e.g., columns vs
-rows) shouldn't really matter. For example, climate and weather data is often
-natively expressed in 4 or more dimensions: time, x, y and z.
+rows) shouldn't really matter. For example, the images of a movie can be
+natively represented as an array with four dimensions: time, row, column and
+color.

 Pandas has historically supported N-dimensional panels, but deprecated them in
 version 0.20 in favor of Xarray data structures. There are now built-in methods
@@ -39,9 +40,8 @@ if you were using Panels:
   xarray ``Dataset``.

 You can :ref:`read about switching from Panels to Xarray here <panel transition>`.
-Pandas gets a lot of things right, but scientific users need fully multi-
-dimensional data structures.
-
+Pandas gets a lot of things right, but many science, engineering and complex
+analytics use cases need fully multi-dimensional data structures.

 How do xarray data structures differ from those found in pandas?
 ----------------------------------------------------------------
@@ -65,7 +65,9 @@ multi-dimensional data-structures.

 That said, you should only bother with xarray if some aspect of data is
 fundamentally multi-dimensional. If your data is unstructured or
-one-dimensional, stick with pandas.
+one-dimensional, pandas is usually the right choice: it has better performance
+for common operations such as ``groupby`` and you'll find far more usage
+examples online.


 Why don't aggregations return Python scalars?
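
Note on the FAQ change: the new movie example (time, row, column, color) also illustrates the claim that dimension order "shouldn't really matter". A minimal Python sketch, assuming a made-up 10-frame, 4x6-pixel RGB movie (the sizes and the ``r``/``g``/``b`` labels are illustrative only):

```python
import numpy as np
import xarray as xr

# Hypothetical movie: 10 frames of 4x6 pixels with 3 color channels.
movie = xr.DataArray(
    np.zeros((10, 4, 6, 3), dtype="uint8"),
    dims=("time", "row", "column", "color"),
    coords={"color": ["r", "g", "b"]},
)

# Average over the color channels by name, giving a grayscale movie.
gray = movie.mean("color")

# Reordering the dimensions does not change how the operation is written.
flipped = movie.transpose("color", "time", "row", "column")
gray_from_flipped = flipped.mean("color")

# Pick the green channel of the first frame by position and by label.
first_green = movie.isel(time=0).sel(color="g")
```

Because the reduction names its dimension, the same line works on both layouts; with a bare NumPy array the ``axis`` argument would have to change.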

doc/index.rst

Lines changed: 11 additions & 19 deletions
@@ -5,29 +5,21 @@ xarray: N-D labeled arrays and datasets in Python
 that makes working with labelled multi-dimensional arrays simple,
 efficient, and fun!

-Multi-dimensional (a.k.a. N-dimensional, ND) arrays (sometimes called
-"tensors") are an essential part of computational science.
-They are encountered in a wide range of fields, including physics, astronomy,
-geoscience, bioinformatics, engineering, finance, and deep learning.
-In Python, NumPy_ provides the fundamental data structure and API for
-working with raw ND arrays.
-However, real-world datasets are usually more than just raw numbers;
-they have labels which encode information about how the array values map
-to locations in space, time, etc.
-
-By introducing *dimensions*, *coordinates*, and *attributes* on top of raw
-NumPy-like arrays, xarray is able to understand these labels and use them to
-provide a more intuitive, more concise, and less error-prone experience.
-Xarray also provides a large and growing library of functions for advanced
-analytics and visualization with these data structures.
+Xarray introduces labels in the form of dimensions, coordinates and
+attributes on top of raw NumPy_-like arrays, which allows for a more
+intuitive, more concise, and less error-prone developer experience.
+The package includes a large and growing library of domain-agnostic functions
+for advanced analytics and visualization with these data structures.
+
 Xarray was inspired by and borrows heavily from pandas_, the popular data
 analysis package focused on labelled tabular data.
-Xarray can read and write data from most common labeled ND-array storage
-formats and is particularly tailored to working with netCDF_ files, which were
-the source of xarray's data model.
+It is particularly tailored to working with netCDF_ files, which were the
+source of xarray's data model, and integrates tightly with dask_ for parallel
+computing.

-.. _NumPy: http://www.numpy.org/
+.. _NumPy: http://www.numpy.org
 .. _pandas: http://pandas.pydata.org
+.. _dask: http://dask.org
 .. _netCDF: http://www.unidata.ucar.edu/software/netcdf

 Documentation

doc/indexing.rst

Lines changed: 1 addition & 1 deletion
@@ -371,7 +371,7 @@ Vectorized indexing also works with ``isel``, ``loc``, and ``sel``:
     ind = xr.DataArray([['a', 'b'], ['b', 'a']], dims=['a', 'b'])
     da.loc[:, ind] # same as da.sel(y=ind)

-These methods may and also be applied to ``Dataset`` objects
+These methods may also be applied to ``Dataset`` objects

 .. ipython:: python

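
Note on the indexing.rst fix: the corrected sentence says the vectorized indexing methods may also be applied to ``Dataset`` objects. The hunk cuts off before the documentation's own example, so the following is only a minimal Python sketch. It reuses ``ind`` from the context lines above; the 3x4 ``da`` array and the variable name ``bar`` are assumptions, not taken from this diff.

```python
import numpy as np
import xarray as xr

# Assumed setup: a small labelled array wrapped in a Dataset.
da = xr.DataArray(
    np.arange(12).reshape(3, 4),
    dims=["x", "y"],
    coords={"x": [0, 1, 2], "y": ["a", "b", "c", "d"]},
)
ds = da.to_dataset(name="bar")

# Same 2-D label indexer as in the diff context above.
ind = xr.DataArray([["a", "b"], ["b", "a"]], dims=["a", "b"])

# Vectorized label-based selection on the Dataset: the indexed dimension "y"
# is replaced by the indexer's own dimensions "a" and "b".
picked = ds.sel(y=ind)
first_row = picked["bar"].isel(x=0)
```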
doc/io.rst

Lines changed: 11 additions & 1 deletion
@@ -81,6 +81,16 @@ require external libraries and dicts can easily be pickled, or converted to
 json, or geojson. All the values are converted to lists, so dicts might
 be quite large.

+To export just the dataset schema, without the data itself, use the
+``data=False`` option:
+
+.. ipython:: python
+
+    ds.to_dict(data=False)
+
+This can be useful for generating indices of dataset contents to expose to
+search indices or other automated data discovery tools.
+
 .. _io.netcdf:

 netCDF
@@ -665,7 +675,7 @@ To read a consolidated store, pass the ``consolidated=True`` option to
 :py:func:`~xarray.open_zarr`::

     ds = xr.open_zarr('foo.zarr', consolidated=True)
-
+
 Xarray can't perform consolidation on pre-existing zarr datasets. This should
 be done directly from zarr, as described in the
 `zarr docs <https://zarr.readthedocs.io/en/latest/tutorial.html#consolidating-metadata>`_.
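
Note on the io.rst addition: the new paragraph documents ``ds.to_dict(data=False)`` for exporting only the schema. A minimal Python sketch of what that might look like in practice, assuming a made-up dataset (the variable ``foo``, its coordinates, and the ``title`` attribute are illustrative only):

```python
import json

import numpy as np
import xarray as xr

# Hypothetical dataset; only its structure matters here.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.rand(4, 5))},
    coords={"x": [10, 20, 30, 40], "y": list("abcde")},
    attrs={"title": "example"},
)

# Export dimensions, attributes, dtypes and shapes, but not the values.
schema = ds.to_dict(data=False)
foo_schema = schema["data_vars"]["foo"]

# The schema holds no array data, so it stays small and can be serialized
# for a catalogue or search index.
serialized = json.dumps(schema)
```

With ``data=False`` each variable entry carries ``dtype`` and ``shape`` in place of the ``data`` list, which is what keeps the result compact.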

doc/related-projects.rst

Lines changed: 12 additions & 6 deletions
@@ -3,7 +3,7 @@
 Xarray related projects
 -----------------------

-Here below is a list of several existing libraries that build
+Here below is a list of existing open source projects that build
 functionality upon xarray. See also section :ref:`internals` for more
 details on how to build xarray extensions.

@@ -39,11 +39,16 @@ Geosciences

 Machine Learning
 ~~~~~~~~~~~~~~~~
-- `cesium <http://cesium-ml.org/>`_: machine learning for time series analysis
+- `ArviZ <https://arviz-devs.github.io/arviz/>`_: Exploratory analysis of Bayesian models, built on top of xarray.
 - `Elm <https://ensemble-learning-models.readthedocs.io>`_: Parallel machine learning on xarray data structures
 - `sklearn-xarray (1) <https://phausamann.github.io/sklearn-xarray>`_: Combines scikit-learn and xarray (1).
 - `sklearn-xarray (2) <https://sklearn-xarray.readthedocs.io/en/latest/>`_: Combines scikit-learn and xarray (2).

+Other domains
+~~~~~~~~~~~~~
+- `ptsa <https://pennmem.github.io/ptsa_new/html/index.html>`_: EEG Time Series Analysis
+- `pycalphad <https://pycalphad.org/docs/latest/>`_: Computational Thermodynamics in Python
+
 Extend xarray capabilities
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 - `Collocate <https://github.com/cistools/collocate>`_: Collocate xarray trajectories in arbitrary physical dimensions
@@ -61,9 +66,10 @@ Visualization
 - `hvplot <https://hvplot.pyviz.org/>`_ : A high-level plotting API for the PyData ecosystem built on HoloViews.
 - `psyplot <https://psyplot.readthedocs.io>`_: Interactive data visualization with python.

-Other
-~~~~~
-- `ptsa <https://pennmem.github.io/ptsa_new/html/index.html>`_: EEG Time Series Analysis
-- `pycalphad <https://pycalphad.org/docs/latest/>`_: Computational Thermodynamics in Python
+Non-Python projects
+~~~~~~~~~~~~~~~~~~~
+- `xframe <https://github.com/QuantStack/xframe>`_: C++ data structures inspired by xarray.
+- `AxisArrays <https://github.com/JuliaArrays/AxisArrays.jl>`_ and
+  `NamedArrays <https://github.com/davidavdav/NamedArrays.jl>`_: similar data structures for Julia.

 More projects can be found at the `"xarray" Github topic <https://github.com/topics/xarray>`_.