From adcfa87c019a0b15f2139ee0bd0a096080587055 Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Sun, 1 Jan 2023 15:39:23 -0500
Subject: [PATCH 01/32] why hierarchical data

---
 docs/source/hierarchical-data.rst | 139 ++++++++++++++++++++++++++++++
 1 file changed, 139 insertions(+)
 create mode 100644 docs/source/hierarchical-data.rst

diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst
new file mode 100644
index 00000000..d20aa576
--- /dev/null
+++ b/docs/source/hierarchical-data.rst
@@ -0,0 +1,139 @@
+.. _hierarchical data:
+
+Working With Hierarchical Data
+==============================
+
+Why Hierarchical Data?
+----------------------
+
+Many real-world datasets are composed of multiple differing components,
+and it can often be useful to think of these in terms of a hierarchy of related groups of data.
+Examples of data which one might want to organise in a grouped or hierarchical manner include:
+
+- Simulation data at multiple resolutions,
+- Observational data about the same system but from multiple different types of sensors,
+- Mixed experimental and theoretical data,
+- A systematic study recording the same experiment but with different parameters,
+- Heterogeneous data, such as demographic and meteorological data,
+
+or even any combination of the above.
+
+Often datasets like this cannot easily fit into a single ``xarray.Dataset`` object,
+or are more usefully thought of as groups of related ``xarray.Dataset`` objects.
+For this purpose we provide the ``DataTree`` class.
+
+This page explains in detail how to understand and use the different features of the ``DataTree`` class for your own heirarchical data needs.
+
+.. _creating a family tree:
+
+Creating a Family Tree
+----------------------
+
+The three main ways of creating a ``DataTree`` object are described briefly in :ref:`creating a datatree`.
+Here we go into more detail about how to create a tree node-by-node, using a family tree as an example.
+
+This could perhaps go in a tutorial?
+
+(i.e. how to create and manipulate a tree structure from scratch node-by-node, with no data in it).
+
+Create Simpson's family tree
+
+Start with Homer, Bart and Lisa
+
+Add Maggie by setting children on homer
+
+check that this also set's Maggie's parent
+
+Add long-lost relations
+
+add Abe by setting
+
+(Abe's father, Homer's cousin?)
+
+add Herbert by setting
+
+.. _navigating trees:
+
+Navigating Trees
+----------------
+
+Node Relationships
+~~~~~~~~~~~~~~~~~~
+
+Root, ancestors, parent, children, leaves
+
+Tree of life?
+
+leaves are either currently living or died out with no descendants
+Root is beginning of life
+ancestors are evolutionary history
+
+find common ancestor
+
+Alien life not in same tree?
+
+Filesystem-like Paths
+~~~~~~~~~~~~~~~~~~~~~
+
+file-like access via paths
+
+
+.. _manipulating trees:
+
+Manipulating Trees
+------------------
+
+Altering Tree Branches
+~~~~~~~~~~~~~~~~~~~~~~
+
+pruning, grafting
+
+Tree of life?
+
+Graft new discoveries onto the tree?
+
+Prune when we realise something is in the wrong place?
+
+Save our updated tree out with ``to_dict``
+
+Subsetting Tree Nodes
+~~~~~~~~~~~~~~~~~~~~~
+
+subset, filter
+
+Filter the Simpsons by age?
+
+Subset only the living leaves of the evolutionary tree?
+
+
+.. _tree computation:
+
+Computation
+-----------
+
+Operations on Trees
+~~~~~~~~~~~~~~~~~~~
+
+Mapping of methods
+
+
+Mapping Custom Functions Over Trees
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.subtree, map_over_subtree
+
+
+.. _multiple trees:
+
+Operating on Multiple Trees
+---------------------------
+
+Comparing trees
+~~~~~~~~~~~~~~~
+
+isomorphism
+
+Mapping over Multiple Trees
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+map_over_subtree with binary function

From 54907757926baedfd824407cc3b01a056bf8d68d Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Sun, 1 Jan 2023 15:42:54 -0500
Subject: [PATCH 02/32] add hierarchical data page to index

---
 docs/source/hierarchical-data.rst | 2 +-
 docs/source/index.rst             | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst
index d20aa576..019834f8 100644
--- a/docs/source/hierarchical-data.rst
+++ b/docs/source/hierarchical-data.rst
@@ -1,4 +1,4 @@
-.. _hierarchical data:
+.. _hierarchical-data:
 
 Working With Hierarchical Data
 ==============================
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 9448e232..e0e39de7 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -12,6 +12,7 @@ Datatree
    Quick Overview
    Tutorial
    Data Model
+   Hierarchical Data
    Reading and Writing Files
    API Reference
    Terminology

From be81f7872e107269c944a978ed45245cd4f079a6 Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Sun, 1 Jan 2023 18:10:11 -0500
Subject: [PATCH 03/32] Simpsons family tree

---
 docs/source/hierarchical-data.rst | 115 +++++++++++++++++++++++++++---
 1 file changed, 104 insertions(+), 11 deletions(-)

diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst
index 019834f8..318ca310 100644
--- a/docs/source/hierarchical-data.rst
+++ b/docs/source/hierarchical-data.rst
@@ -3,6 +3,17 @@
 Working With Hierarchical Data
 ==============================
 
+.. ipython:: python
+    :suppress:
+
+    import numpy as np
+    import pandas as pd
+    import xarray as xr
+    from datatree import DataTree
+
+    np.random.seed(123456)
+    np.set_printoptions(threshold=10)
+
 Why Hierarchical Data?
 ----------------------
 
@@ -30,27 +41,101 @@ Creating a Family Tree
 ----------------------
 
 The three main ways of creating a ``DataTree`` object are described briefly in :ref:`creating a datatree`.
-Here we go into more detail about how to create a tree node-by-node, using a family tree as an example.
+Here we go into more detail about how to create a tree node-by-node, using a famous family tree from the Simpsons cartoon as an example.
+
+Let's start by defining nodes representing the two siblings, Bart and Lisa Simpson:
+
+.. ipython:: python
+
+    bart = DataTree(name="Bart")
+    lisa = DataTree(name="Lisa")
+
+Each of these node objects knows their own ``.name``, but they currently have no relationship to one another.
+We can connect them by creating another node representing a common parent, Homer Simpson:
+
+.. ipython:: python
+
+    homer = DataTree(name="Homer", children={"Bart": bart, "Lisa": lisa})
+
+Here we set the children of Homer in the node's constructor.
+We now have a small family tree
+
+.. ipython:: python
+
+    homer
+
+where we can see how these individual Simpson family members are related to one another.
+The nodes representing Bart and Lisa are now connected - we can confirm their sibling rivalry by examining the ``.siblings`` property:
+
+.. ipython:: python
+
+    list(bart.siblings)
+
+But oops, we forgot Homer's third daughter, Maggie! Let's add her by updating Homer's ``.children`` property to include her:
+
+.. ipython:: python
+
+    maggie = DataTree(name="Maggie")
+    homer.children = {"Bart": bart, "Lisa": lisa, "Maggie": maggie}
+    homer
 
-This could perhaps go in a tutorial?
+Let's check that Maggie knows who her Dad is:
 
-(i.e. how to create and manipulate a tree structure from scratch node-by-node, with no data in it).
+.. ipython:: python
 
-Create Simpson's family tree
+    maggie.parent.name
 
-Start with Homer, Bart and Lisa
+That's good - updating the properties of our nodes does not break the internal consistency of our tree, as changes of parentage are automatically reflected on both nodes.
 
-Add Maggie by setting children on homer
+    These children obviously have another parent, Marge Simpson, but ``DataTree`` nodes can only have a maximum of one parent.
+    Genealogical `family trees are not even technically trees `_ in the mathematical sense -
+    the fact that distant relatives can mate makes it a directed acyclic graph.
+    Trees of ``DataTree`` objects cannot represent this.
 
-check that this also set's Maggie's parent
+Homer is currently listed as having no parent (the so-called "root node" of this tree), but we can update his ``.parent`` property:
 
-Add long-lost relations
+.. ipython:: python
 
-add Abe by setting
+    abe = DataTree(name="Abe")
+    homer.parent = abe
 
-(Abe's father, Homer's cousin?)
+Abe is now the "root" of this tree, which we can see by examining the ``.root`` property of any node in the tree
+
+.. ipython:: python
+
+    maggie.root.name
+
+We can see the whole tree by printing Abe's node or just part of the tree by printing Homer's node:
+
+.. ipython:: python
+
+    abe
+    homer
+
+We can see that Homer is aware of his parentage, and we say that Homer and his children form a "subtree" of the larger Simpson family tree.
+
+In episode 28, Abe Simpson reveals that he had another son, Herbert "Herb" Simpson.
+We can add Herbert to the family tree without displacing Homer by ``.assign``-ing another child to Abe:
+
+# TODO write the ``assign`` or ``assign_nodes`` method on ``DataTree`` so that this example works
+
+.. ipython:: python
+    :okexcept:
+
+    herb = DataTree(name="Herb")
+    abe.assign({"Herbert": herb})
+
+# TODO Name permanence of herb versus herbert (or abe versus abraham)
+
+Certain manipulations of our tree are forbidden, if they would create an inconsistent result.
+In episode 51 of the show Futurama, Philip J. Fry travels back in time and accidentally becomes his own Grandfather.
+If we try similar time-travelling hijinks with Homer, we get a ``InvalidTreeError`` raised:
+
+.. ipython:: python
+    :okexcept:
+
+    abe.parent = homer
 
-add Herbert by setting
 
 .. _navigating trees:
 
@@ -77,6 +162,8 @@ Filesystem-like Paths
 
 file-like access via paths
 
+see relative to of bart to herbert
+
 
 .. _manipulating trees:
 
@@ -103,6 +190,8 @@ subset, filter
 
 Filter the Simpsons by age?
 
+Need to first recreate tree with age data in it
+
 Subset only the living leaves of the evolutionary tree?
 
 
@@ -116,6 +205,10 @@ Operations on Trees
 
 Mapping of methods
 
+Arithmetic
+
+cause all Simpsons to age simultaneously
+
 
 Mapping Custom Functions Over Trees
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From e05fe6d1ff2b083e08de10bd9a3f80e96d1e681f Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Sun, 1 Jan 2023 19:21:01 -0500
Subject: [PATCH 04/32] evolutionary tree

---
 docs/source/hierarchical-data.rst | 92 ++++++++++++++++++++++++++-----
 1 file changed, 79 insertions(+), 13 deletions(-)

diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst
index 318ca310..9c5d15cf 100644
--- a/docs/source/hierarchical-data.rst
+++ b/docs/source/hierarchical-data.rst
@@ -35,10 +35,15 @@ For this purpose we provide the ``DataTree`` class.
 
 This page explains in detail how to understand and use the different features of the ``DataTree`` class for your own heirarchical data needs.
 
+.. _node relationships:
+
+Node Relationships
+------------------
+
 .. _creating a family tree:
 
 Creating a Family Tree
-----------------------
+~~~~~~~~~~~~~~~~~~~~~~
 
 The three main ways of creating a ``DataTree`` object are described briefly in :ref:`creating a datatree`.
 Here we go into more detail about how to create a tree node-by-node, using a famous family tree from the Simpsons cartoon as an example.
@@ -136,26 +141,78 @@ If we try similar time-travelling hijinks with Homer, we get a ``InvalidTreeErro
 
     abe.parent = homer
 
+.. _evolutionary tree:
 
-.. _navigating trees:
+Ancestry in an Evolutionary Tree
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Navigating Trees
-----------------
+Let's use a different example of a tree to discuss more complex relationships between nodes - the phylogenetic tree, or tree of life.
 
-Node Relationships
-~~~~~~~~~~~~~~~~~~
+.. ipython:: python
 
-Root, ancestors, parent, children, leaves
+    vertebrates = DataTree.from_dict(
+        name="Vertebrae",
+        d={
+            "/Sharks": None,
+            "/Bony Skeleton/Ray-finned Fish": None,
+            "/Bony Skeleton/Four Limbs/Amphibians": None,
+            "/Bony Skeleton/Four Limbs/Amniotic Egg/Hair/Primates": None,
+            "/Bony Skeleton/Four Limbs/Amniotic Egg/Hair/Rodents & Rabbits": None,
+            "/Bony Skeleton/Four Limbs/Amniotic Egg/Two Fenestrae/Crocodiles": None,
+            "/Bony Skeleton/Four Limbs/Amniotic Egg/Two Fenestrae/Dinosaurs": None,
+            "/Bony Skeleton/Four Limbs/Amniotic Egg/Two Fenestrae/Birds": None,
+        },
+    )
+
+    primates = vertebrates["/Bony Skeleton/Four Limbs/Amniotic Egg/Hair/Primates"]
+    dinosaurs = vertebrates[
+        "/Bony Skeleton/Four Limbs/Amniotic Egg/Two Fenestrae/Dinosaurs"
+    ]
+
+We have used the ``.from_dict`` constructor method as an alternate way to quickly create a whole tree,
+and file-like syntax (to be explained shortly) to select two nodes of interest.
+
+This tree shows various families of species, grouped by their common features (making it technically a `"Cladogram" `_,
+rather than an evolutionary tree).
+
+Here both the species and the features used to group them are represented by ``DataTree`` node objects - there is no distinction in types of node.
+We can however get a list of only the nodes we used to represent species by using the fact that all those nodes have no children - they are "leaf nodes".
+We can check if a node is a leaf with ``.is_leaf``, and get a list of all leaves with the ``.leaves`` property:
 
-Tree of life?
+.. ipython:: python
+    :okexcept
 
-leaves are either currently living or died out with no descendants
-Root is beginning of life
-ancestors are evolutionary history
+    primates.is_leaf
+    [node.name for node in vertebrates.leaves]
+
+Pretending that this is a true evolutionary tree for a moment, we can find the features of the evolutionary ancestors (so-called "ancestor" nodes),
+the distinguishing feature of the common ancestor of all vertebrate life (the root node),
+and even the distinguishing feature of the common ancestor of any two species (the common ancestor of two nodes):
+
+.. ipython:: python
 
-find common ancestor
+    [node.name for node in primates.ancestors]
+    primates.root.name
+    primates.find_common_ancestor(dinosaurs).name
+
+We can only find a common ancestor between two nodes that lie in the same tree.
+If we try to find the common evolutionary ancestor between primates and an Alien species that has no relationship to Earth's evolutionary tree,
+an error will be raised.
+
+.. ipython:: python
+    :okexcept:
+
+    alien = DataTree(name="Xenomorph")
+    primates.find_common_ancestor(alien)
+
+
+.. _navigating trees:
+
+Navigating Trees
+----------------
+
+Can move around trees using properties, but there are also neater ways to access nodes.
 
-Alien life not in same tree?
 
 Filesystem-like Paths
 ~~~~~~~~~~~~~~~~~~~~~
@@ -165,6 +222,12 @@ file-like access via paths
 see relative to of bart to herbert
 
 
+Attribute-like access
+~~~~~~~~~~~~~~~~~~~~~
+
+# TODO attribute-like access is not yet implemented, see issue #98
+
+
 .. _manipulating trees:
 
 Manipulating Trees
@@ -192,6 +255,7 @@ Filter the Simpsons by age?
 
 Need to first recreate tree with age data in it
 
+leaves are either currently living or died out with no descendants
 Subset only the living leaves of the evolutionary tree?
 
 
@@ -209,6 +273,8 @@ Arithmetic
 
 cause all Simpsons to age simultaneously
 
+Find total number of species
+Find total biomass
 
 Mapping Custom Functions Over Trees
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From f9ae6fded52c47865e6bf1527acb3ec949be7136 Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Mon, 2 Jan 2023 13:19:51 -0500
Subject: [PATCH 05/32] WIP rearrangement of creating trees

---
 docs/source/data-structures.rst   | 38 ++++---------
 docs/source/hierarchical-data.rst | 92 ++++++++++++++++++++++++++++---
 2 files changed, 95 insertions(+), 35 deletions(-)

diff --git a/docs/source/data-structures.rst b/docs/source/data-structures.rst
index 67e0e608..7a0cca60 100644
--- a/docs/source/data-structures.rst
+++ b/docs/source/data-structures.rst
@@ -71,7 +71,10 @@ Again these are not normally used unless explicitly accessed by the user.
 Creating a DataTree
 ~~~~~~~~~~~~~~~~~~~
 
-There are three ways to create a ``DataTree`` from scratch. The first is to create each node individually,
+There are three ways to create a ``DataTree`` from scratch.
+
+
+One way to create a create a ``DataTree`` from scratch is to create each node individually,
 specifying the nodes' relationship to one another as you create each one.
 
 The ``DataTree`` constructor takes:
@@ -122,37 +125,18 @@ Is is at tree construction time that consistency checks are enforced. For instan
 
     node0.parent = node2
 
-The second way is to build the tree from a dictionary of filesystem-like paths and corresponding ``xarray.Dataset`` objects.
-
-This relies on a syntax inspired by unix-like filesystems, where the "path" to a node is specified by the keys of each intermediate node in sequence,
-separated by forward slashes. The root node is referred to by ``"/"``, so the path from our current root node to its grand-child would be ``"/Oak/Bonsai"``.
-A path specified from the root (as opposed to being specified relative to an arbitrary node in the tree) is sometimes also referred to as a
-`"fully qualified name" `_.
+Alternatively you can also create a ``DataTree`` object from
 
-If we have a dictionary where each key is a valid path, and each value is either valid data or ``None``,
-we can construct a complex tree quickly using the alternative constructor ``:py:func::DataTree.from_dict``:
-
-.. ipython:: python
-
-    d = {
-        "/": xr.Dataset({"foo": "orange"}),
-        "/a": xr.Dataset({"bar": 0}, coords={"y": ("y", [0, 1, 2])}),
-        "/a/b": xr.Dataset({"zed": np.NaN}),
-        "a/c/d": None,
-    }
-    dt = DataTree.from_dict(d)
-    dt
-
-Notice that this method will also create any intermediate empty node necessary to reach the end of the specified path
-(i.e. the node labelled `"c"` in this case.)
-
-Finally the third way is from a file. if you have a file containing data on disk (such as a netCDF file or a Zarr Store), you can also create a datatree by opening the
-file using ``:py:func::~datatree.open_datatree``. See the page on :ref:`reading and writing files ` for more details.
+- An ``xarray.Dataset`` using ``Dataset.to_node()`` (not yet implemented),
+- A dictionary mapping directory-like paths to either ``DataTree`` nodes or data, using ``DataTree.from_dict()``,
+- A netCDF or Zarr file on disk with ``open_datatree()``. See :ref:`reading and writing files `.
 
 
 DataTree Contents
 ~~~~~~~~~~~~~~~~~
 
+TODO create this example datatree but without using ``from_dict``
+
 Like ``xarray.Dataset``, ``DataTree`` implements the python mapping interface, but with values given by either ``xarray.DataArray`` objects or other ``DataTree`` objects.
 
 .. ipython:: python
@@ -187,8 +171,6 @@ Like with ``Dataset``, you can access the data and coordinate variables of a nod
 Dictionary-like methods
 ~~~~~~~~~~~~~~~~~~~~~~~
 
-We can update the contents of the tree in-place using a dictionary-like syntax.
-
 We can update a datatree in-place using Python's standard dictionary syntax, similar to how we can for Dataset objects.
 For example, to create this example datatree from scratch, we could have written:
 
diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst
index 9c5d15cf..c1648006 100644
--- a/docs/source/hierarchical-data.rst
+++ b/docs/source/hierarchical-data.rst
@@ -170,7 +170,7 @@ Let's use a different example of a tree to discuss more complex relationships be
     ]
 
 We have used the ``.from_dict`` constructor method as an alternate way to quickly create a whole tree,
-and file-like syntax (to be explained shortly) to select two nodes of interest.
+and :ref:`filesystem-like syntax `_ (to be explained shortly) to select two nodes of interest.
 
 This tree shows various families of species, grouped by their common features (making it technically a `"Cladogram" `_,
 rather than an evolutionary tree).
@@ -180,7 +180,7 @@ We can however get a list of only the nodes we used to represent species by usin
 We can check if a node is a leaf with ``.is_leaf``, and get a list of all leaves with the ``.leaves`` property:
 
 .. ipython:: python
-    :okexcept
+    :okexcept:
 
     primates.is_leaf
     [node.name for node in vertebrates.leaves]
@@ -211,22 +211,100 @@ an error will be raised.
 Navigating Trees
 ----------------
 
-Can move around trees using properties, but there are also neater ways to access nodes.
+There are various ways to access the different nodes in a tree.
 
+Properties
+~~~~~~~~~~
 
-Filesystem-like Paths
-~~~~~~~~~~~~~~~~~~~~~
+We can navigate trees using the ``.parent`` and ``.children`` properties of each node, for example:
 
-file-like access via paths
+.. ipython:: python
 
-see relative to of bart to herbert
+    lisa.parent.children["Bart"].name
 
+but there are also more convenient ways to access nodes.
+
+Dictionary-like interface
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Children are stored on each node as a key-value mapping from name to child node or variable.
+They can be accessed and altered via the ``__getitem__`` and ``__setitem__`` syntax.
+In general ``DataTree`` objects support almost the entire set of dict-like methods,
+including ``keys``, ``values``, ``items``, ``__delitem__`` and ``update``.
+
+Note that the dict-like interface combines access to child ``DataTree`` nodes and stored ``DataArray``s,
+so if we have a node that contains both children and data, calling ``.keys()`` will list both names of child nodes and
+names of data variables:
+
+.. ipython:: python
+
+    dt = DataTree.from_dict(
+        {"/": xr.Dataset({"foo": 0, "bar": 1}), "/a": None, "/b": None}
+    )
+    print(dt)
+    list(dt.keys())
+
+This means that the names of variables and of child nodes must be different to one another.
 
 Attribute-like access
 ~~~~~~~~~~~~~~~~~~~~~
 
 # TODO attribute-like access is not yet implemented, see issue #98
 
+.. _filesystem paths:
+
+Filesystem-like Paths
+~~~~~~~~~~~~~~~~~~~~~
+
+Hierarchical trees can be thought of as analogous to file systems.
+Each node is like a directory, and each directory can contain both more sub-directories and data.
+
+Datatree objects support a syntax inspired by unix-like filesystems,
+where the "path" to a node is specified by the keys of each intermediate node in sequence,
+separated by forward slashes.
+
+.. ipython:: python
+
+    abe["Homer/Bart"].name
+
+The root node is referred to by ``"/"``, so the path from the root node to its grand-child would be ``"/child/grandchild"``, e.g.
+
+EXAMPLE of path from root
+
+A path specified from the root (as opposed to being specified relative to an arbitrary node in the tree) is sometimes also referred to as a
+`"fully qualified name" `_.
+
+file-like access via paths
+
+set something using a relative path
+
+example of finding relative path, from bart to herbert?
+
+
+Create a node with intermediates via ``__setitem__``
+
+You can use this feature to build a nested tree from a dictionary of filesystem-like paths and corresponding ``xarray.Dataset`` objects in a single step.
+If we have a dictionary where each key is a valid path, and each value is either valid data or ``None``,
+we can construct a complex tree quickly using the alternative constructor ``:py:func::DataTree.from_dict``:
+
+.. ipython:: python
+
+    d = {
+        "/": xr.Dataset({"foo": "orange"}),
+        "/a": xr.Dataset({"bar": 0}, coords={"y": ("y", [0, 1, 2])}),
+        "/a/b": xr.Dataset({"zed": np.NaN}),
+        "a/c/d": None,
+    }
+    dt = DataTree.from_dict(d)
+    dt
+
+Notice that this method will also create any intermediate empty node necessary to reach the end of the specified path
+(i.e. the node labelled `"c"` in this case.)
+
+.. note::
+
+    You can even make the filesystem analogy concrete by using ``open_mfdatatree`` or ``save_mfdatatree`` # TODO not yet implemented - see GH issue 51
+
 
 .. _manipulating trees:
 

From f625b9528f718f488c48de9876451007eba15c7d Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Tue, 3 Jan 2023 21:19:42 -0500
Subject: [PATCH 06/32] fixed examples in data structures page

---
 docs/source/data-structures.rst | 39 +++++++++++++++------------------
 1 file changed, 18 insertions(+), 21 deletions(-)

diff --git a/docs/source/data-structures.rst b/docs/source/data-structures.rst
index 7a0cca60..98ad5af2 100644
--- a/docs/source/data-structures.rst
+++ b/docs/source/data-structures.rst
@@ -71,9 +71,6 @@ Again these are not normally used unless explicitly accessed by the user.
 Creating a DataTree
 ~~~~~~~~~~~~~~~~~~~
 
-There are three ways to create a ``DataTree`` from scratch.
-
-
 One way to create a create a ``DataTree`` from scratch is to create each node individually,
 specifying the nodes' relationship to one another as you create each one.
 
@@ -84,16 +81,16 @@ The ``DataTree`` constructor takes:
 - ``children``: The various child nodes (if there are any), given as a mapping from string keys to ``DataTree`` objects.
 - ``name``: A string to use as the name of this node.
 
-Let's make a datatree node without anything in it:
+Let's make a single datatree node with some example data in it:
 
 .. ipython:: python
 
     from datatree import DataTree
 
-    # create root node
-    node1 = DataTree(name="Oak")
+    ds1 = xr.Dataset({"foo": "orange"})
+    dt = DataTree(name="root", data=ds1)  # create root node
 
-    node1
+    dt
 
 At this point our node is also the root node, as every tree has a root node.
 
@@ -101,29 +98,32 @@ We can add a second node to this tree either by referring to the first node in t
 
 .. ipython:: python
 
+    ds2 = xr.Dataset({"bar": 0}, coords={"y": ("y", [0, 1, 2])})
     # add a child by referring to the parent node
-    node2 = DataTree(name="Bonsai", parent=node1)
+    node2 = DataTree(name="a", parent=dt, data=ds2)
 
 or by dynamically updating the attributes of one node to refer to another:
 
 .. ipython:: python
 
-    # add a grandparent by updating the .parent property of an existing node
-    node0 = DataTree(name="General Sherman")
-    node1.parent = node0
+    # add a second child by first creating a new node ...
+    ds3 = xr.Dataset({"zed": np.NaN})
+    node3 = DataTree(name='b', data=ds3)
+    # ... then updating its .parent property
+    node3.parent = dt
 
-Our tree now has three nodes within it, and one of the two new nodes has become the new root:
+Our tree now has three nodes within it:
 
 .. ipython:: python
 
-    node0
+    dt
 
 Is is at tree construction time that consistency checks are enforced. For instance, if we try to create a `cycle` the constructor will raise an error:
 
 .. ipython:: python
     :okexcept:
 
-    node0.parent = node2
+    dt.parent = node3
 
 Alternatively you can also create a ``DataTree`` object from
 
@@ -135,8 +135,6 @@ Alternatively you can also create a ``DataTree`` object from
 DataTree Contents
 ~~~~~~~~~~~~~~~~~
 
-TODO create this example datatree but without using ``from_dict``
-
 Like ``xarray.Dataset``, ``DataTree`` implements the python mapping interface, but with values given by either ``xarray.DataArray`` objects or other ``DataTree`` objects.
 
 .. ipython:: python
@@ -178,11 +176,10 @@ For example, to create this example datatree from scratch, we could have written
 
 .. ipython:: python
 
-    dt = DataTree()
+    dt = DataTree(name="root")
     dt["foo"] = "orange"
     dt["a"] = DataTree(data=xr.Dataset({"bar": 0}, coords={"y": ("y", [0, 1, 2])}))
     dt["a/b/zed"] = np.NaN
-    dt["a/c/d"] = DataTree()
     dt
 
 To change the variables in a node of a ``DataTree``, you can use all the standard dictionary
@@ -191,6 +188,6 @@ methods, including ``values``, ``items``, ``__delitem__``, ``get`` and
 
 Note that assigning a ``DataArray`` object to a ``DataTree`` variable using ``__setitem__`` or ``update`` will :ref:`automatically align` the array(s) to the original node's indexes.
 
-If you copy a ``DataTree`` using the ``:py:func::copy`` function or the :py:meth:`~xarray.DataTree.copy` it will copy the entire tree,
-including all parents and children.
-Like for ``Dataset``, this copy is shallow by default, but you can copy all the data by calling ``dt.copy(deep=True)``.
+If you copy a ``DataTree`` using the ``:py:func::copy`` function or the :py:meth:`~xarray.DataTree.copy` it will copy the subtree,
+meaning that node and children below it, but no parents above it.
+Like for ``Dataset``, this copy is shallow by default, but you can copy all the underlying data arrays by calling ``dt.copy(deep=True)``.
From c0ea814ce652cf6b93b64bd99212c7012023148e Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Tue, 3 Jan 2023 21:34:04 -0500
Subject: [PATCH 07/32] dict-like navigation

---
 docs/source/hierarchical-data.rst | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst
index c1648006..e0651388 100644
--- a/docs/source/hierarchical-data.rst
+++ b/docs/source/hierarchical-data.rst
@@ -172,6 +172,10 @@ Let's use a different example of a tree to discuss more complex relationships be
 We have used the ``.from_dict`` constructor method as an alternate way to quickly create a whole tree,
 and :ref:`filesystem-like syntax `_ (to be explained shortly) to select two nodes of interest.
 
+.. ipython:: python
+
+    vertebrates
+
 This tree shows various families of species, grouped by their common features (making it technically a `"Cladogram" `_,
 rather than an evolutionary tree).
 
@@ -180,7 +184,6 @@ We can however get a list of only the nodes we used to represent species by usin
 We can check if a node is a leaf with ``.is_leaf``, and get a list of all leaves with the ``.leaves`` property:
 
 .. ipython:: python
-    :okexcept:
 
     primates.is_leaf
     [node.name for node in vertebrates.leaves]
@@ -227,24 +230,29 @@ but there are also more convenient ways to access nodes.
 Dictionary-like interface
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Children are stored on each node as a key-value mapping from name to child node or variable.
+Children are stored on each node as a key-value mapping from name to child node.
 They can be accessed and altered via the ``__getitem__`` and ``__setitem__`` syntax.
 In general ``DataTree`` objects support almost the entire set of dict-like methods,
 including ``keys``, ``values``, ``items``, ``__delitem__`` and ``update``.
 
-Note that the dict-like interface combines access to child ``DataTree`` nodes and stored ``DataArray``s,
+.. ipython:: python
+
+    vertebrates["Bony Skeleton"]["Ray-finned Fish"]
+
+Note that the dict-like interface combines access to child ``DataTree`` nodes and stored ``DataArrays``,
 so if we have a node that contains both children and data, calling ``.keys()`` will list both names of child nodes and
 names of data variables:
 
 .. ipython:: python
 
-    dt = DataTree.from_dict(
-        {"/": xr.Dataset({"foo": 0, "bar": 1}), "/a": None, "/b": None}
+    dt = DataTree(
+        data=xr.Dataset({"foo": 0, "bar": 1}),
+        children={"a": DataTree(), "b": DataTree()}
     )
     print(dt)
     list(dt.keys())
 
-This means that the names of variables and of child nodes must be different to one another.
+This also means that the names of variables and of child nodes must be different to one another.
 
 Attribute-like access
 ~~~~~~~~~~~~~~~~~~~~~

From 016531251b3e4e611d8dff029b7bbe04aeb3c83c Mon Sep 17 00:00:00 2001
From: Thomas Nicholas
Date: Tue, 3 Jan 2023 22:12:35 -0500
Subject: [PATCH 08/32] filesystem-like paths explained

---
 docs/source/hierarchical-data.rst | 45 ++++++++++++++++++++++---------
 1 file changed, 32 insertions(+), 13 deletions(-)

diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst
index e0651388..95f35519 100644
--- a/docs/source/hierarchical-data.rst
+++ b/docs/source/hierarchical-data.rst
@@ -158,7 +158,6 @@ Let's use a different example of a tree to discuss more complex relationships be
             "/Bony Skeleton/Four Limbs/Amphibians": None,
             "/Bony Skeleton/Four Limbs/Amniotic Egg/Hair/Primates": None,
             "/Bony Skeleton/Four Limbs/Amniotic Egg/Hair/Rodents & Rabbits": None,
-            "/Bony Skeleton/Four Limbs/Amniotic Egg/Two Fenestrae/Crocodiles": None,
             "/Bony Skeleton/Four Limbs/Amniotic Egg/Two Fenestrae/Dinosaurs": None,
             "/Bony Skeleton/Four Limbs/Amniotic Egg/Two Fenestrae/Birds": None,
         },
@@ -267,29 +266,47 @@ Filesystem-like Paths
 Hierarchical trees can be thought of as analogous to file systems.
 Each node is like a directory, and each directory can contain both more sub-directories and data.
 
+.. note::
+
+    You can even make the filesystem analogy concrete by using ``open_mfdatatree`` or ``save_mfdatatree`` # TODO not yet implemented - see GH issue 51
+
 Datatree objects support a syntax inspired by unix-like filesystems,
 where the "path" to a node is specified by the keys of each intermediate node in sequence,
 separated by forward slashes.
+This is an extension of the conventional dictionary ``__getitem__`` syntax to allow navigation across multiple levels of the tree.
+
+Like with filepaths, paths within the tree can either be relative to the current node, e.g.
 
 .. ipython:: python
 
     abe["Homer/Bart"].name
+    abe["./Homer/Bart"].name  # alternative syntax
 
+or relative to the root node.
+A path specified from the root (as opposed to being specified relative to an arbitrary node in the tree) is sometimes also referred to as a
+`"fully qualified name" `_,
+or as an "absolute path".
 The root node is referred to by ``"/"``, so the path from the root node to its grand-child would be ``"/child/grandchild"``, e.g.
 
-EXAMPLE of path from root
+.. ipython:: python
 
-A path specified from the root (as opposed to being specified relative to an arbitrary node in the tree) is sometimes also referred to as a
-`"fully qualified name" `_.
+    # absolute path will start from root node
+    lisa["/Homer/Bart"].name
 
-file-like access via paths
+Relative paths between nodes also support the ``"../"`` syntax to mean the parent of the current node.
+We can use this with ``__setitem__`` to add a missing entry to our evolutionary tree, but add it relative to a more familiar node of interest:
 
-set something using a relative path
+.. ipython:: python
 
-example of finding relative path, from bart to herbert?
+ primates["../../Two Fenestrae/Crocodiles"] = DataTree() + print(vertebrates) +Given two nodes in a tree, we can find their relative path: -Create a node with intermediates via ``__setitem__`` +.. ipython:: python + :okexcept: + + bart.find_relative_path(herbert) You can use this feature to build a nested tree from a dictionary of filesystem-like paths and corresponding ``xarray.Dataset`` objects in a single step. If we have a dictionary where each key is a valid path, and each value is either valid data or ``None``, @@ -306,13 +323,11 @@ we can construct a complex tree quickly using the alternative constructor ``:py: dt = DataTree.from_dict(d) dt -Notice that this method will also create any intermediate empty node necessary to reach the end of the specified path -(i.e. the node labelled `"c"` in this case.) - .. note:: - You can even make the filesystem analogy concrete by using ``open_mfdatatree`` or ``save_mfdatatree`` # TODO not yet implemented - see GH issue 51 - + Notice that using the path-like syntax will also create any intermediate empty nodes necessary to reach the end of the specified path + (i.e. the node labelled `"c"` in this case.) + This is to help avoid lots of redundant entries when creating deeply-nested trees using ``.from_dict``. .. _manipulating trees: @@ -341,6 +356,10 @@ Filter the Simpsons by age? Need to first recreate tree with age data in it +.. ipython:: + + simpsons.filter(node.age > 18) + leaves are either currently living or died out with no descendants Subset only the living leaves of the evolutionary tree? 
From 2de37ecb03c01dddd361d36abd8be0f490da6b82 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Wed, 4 Jan 2023 11:13:06 -0500 Subject: [PATCH 09/32] split PR into parts --- docs/source/hierarchical-data.rst | 73 ------------------------------- 1 file changed, 73 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 95f35519..ac29cae1 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -328,76 +328,3 @@ we can construct a complex tree quickly using the alternative constructor ``:py: Notice that using the path-like syntax will also create any intermediate empty nodes necessary to reach the end of the specified path (i.e. the node labelled `"c"` in this case.) This is to help avoid lots of redundant entries when creating deeply-nested trees using ``.from_dict``. - -.. _manipulating trees: - -Manipulating Trees ------------------- - -Altering Tree Branches -~~~~~~~~~~~~~~~~~~~~~~ - -pruning, grafting - -Tree of life? - -Graft new discoveries onto the tree? - -Prune when we realise something is in the wrong place? - -Save our updated tree out with ``to_dict`` - -Subsetting Tree Nodes -~~~~~~~~~~~~~~~~~~~~~ - -subset, filter - -Filter the Simpsons by age? - -Need to first recreate tree with age data in it - -.. ipython:: - - simpsons.filter(node.age > 18) - -leaves are either currently living or died out with no descendants -Subset only the living leaves of the evolutionary tree? - - -.. _tree computation: - -Computation ------------ - -Operations on Trees -~~~~~~~~~~~~~~~~~~~ - -Mapping of methods - -Arithmetic - -cause all Simpsons to age simultaneously - -Find total number of species -Find total biomass - -Mapping Custom Functions Over Trees -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.subtree, map_over_subtree - - -.. 
_multiple trees: - -Operating on Multiple Trees ---------------------------- - -Comparing trees -~~~~~~~~~~~~~~~ - -isomorphism - -Mapping over Multiple Trees -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -map_over_subtree with binary function From 15fa84a8e9e95c76d351c898151c223c44307c5c Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Wed, 4 Jan 2023 11:17:00 -0500 Subject: [PATCH 10/32] plan --- docs/source/hierarchical-data.rst | 73 +++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index ac29cae1..95f35519 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -328,3 +328,76 @@ we can construct a complex tree quickly using the alternative constructor ``:py: Notice that using the path-like syntax will also create any intermediate empty nodes necessary to reach the end of the specified path (i.e. the node labelled `"c"` in this case.) This is to help avoid lots of redundant entries when creating deeply-nested trees using ``.from_dict``. + +.. _manipulating trees: + +Manipulating Trees +------------------ + +Altering Tree Branches +~~~~~~~~~~~~~~~~~~~~~~ + +pruning, grafting + +Tree of life? + +Graft new discoveries onto the tree? + +Prune when we realise something is in the wrong place? + +Save our updated tree out with ``to_dict`` + +Subsetting Tree Nodes +~~~~~~~~~~~~~~~~~~~~~ + +subset, filter + +Filter the Simpsons by age? + +Need to first recreate tree with age data in it + +.. ipython:: + + simpsons.filter(node.age > 18) + +leaves are either currently living or died out with no descendants +Subset only the living leaves of the evolutionary tree? + + +.. 
_tree computation: + +Computation +----------- + +Operations on Trees +~~~~~~~~~~~~~~~~~~~ + +Mapping of methods + +Arithmetic + +cause all Simpsons to age simultaneously + +Find total number of species +Find total biomass + +Mapping Custom Functions Over Trees +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.subtree, map_over_subtree + + +.. _multiple trees: + +Operating on Multiple Trees +--------------------------- + +Comparing trees +~~~~~~~~~~~~~~~ + +isomorphism + +Mapping over Multiple Trees +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +map_over_subtree with binary function From 12d209c7030f0881490351b9d68825b2c6f42775 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Wed, 4 Jan 2023 16:18:34 +0000 Subject: [PATCH 11/32] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/source/data-structures.rst | 2 +- docs/source/hierarchical-data.rst | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/source/data-structures.rst b/docs/source/data-structures.rst index 98ad5af2..6bf1fe1a 100644 --- a/docs/source/data-structures.rst +++ b/docs/source/data-structures.rst @@ -108,7 +108,7 @@ or by dynamically updating the attributes of one node to refer to another: # add a second child by first creating a new node ... ds3 = xr.Dataset({"zed": np.NaN}) - node3 = DataTree(name='b', data=ds3) + node3 = DataTree(name="b", data=ds3) # ... then updating its .parent property node3.parent = dt diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 95f35519..e2d4384d 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -236,7 +236,7 @@ including ``keys``, ``values``, ``items``, ``__delitem__`` and ``update``. .. 
ipython:: python - vertebrates["Bony Skeleton"]["Ray-finned Fish"] + vertebrates["Bony Skeleton"]["Ray-finned Fish"] Note that the dict-like interface combines access to child ``DataTree`` nodes and stored ``DataArrays``, so if we have a node that contains both children and data, calling ``.keys()`` will list both names of child nodes and @@ -246,7 +246,7 @@ names of data variables: dt = DataTree( data=xr.Dataset({"foo": 0, "bar": 1}), - children={"a": DataTree(), "b": DataTree()} + children={"a": DataTree(), "b": DataTree()}, ) print(dt) list(dt.keys()) From ebc5b7585fa54b02d432fe2b4193c9e64fa4a8f7 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Thu, 5 Jan 2023 10:28:15 -0500 Subject: [PATCH 12/32] fix ipython bug --- docs/source/hierarchical-data.rst | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 6ef1886c..08b5a97a 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -358,14 +358,14 @@ Filter the Simpsons by age? Need to first recreate tree with age data in it -.. ipython:: +.. ipython:: python + :okexcept: - simpsons.filter(node.age > 18) + simpsons.filter(node.age > 18) leaves are either currently living or died out with no descendants Subset only the living leaves of the evolutionary tree? - .. 
_tree computation: Computation From 2a4286f978a9091bf0bf26db21c04e4e37f03813 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Thu, 5 Jan 2023 15:30:39 +0000 Subject: [PATCH 13/32] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/source/hierarchical-data.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 08b5a97a..a3b8da06 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -359,9 +359,9 @@ Filter the Simpsons by age? Need to first recreate tree with age data in it .. ipython:: python - :okexcept: + :okexcept: - simpsons.filter(node.age > 18) + simpsons.filter(node.age > 18) leaves are either currently living or died out with no descendants Subset only the living leaves of the evolutionary tree? From 7605fe88fd1d9ab60ce50dd7d435465a8ac793d7 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Thu, 5 Jan 2023 16:19:12 -0500 Subject: [PATCH 14/32] filter simpsons family tree by age --- docs/source/hierarchical-data.rst | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index db19d81f..38f5242f 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -361,9 +361,32 @@ Filter the Simpsons by age? Need to first recreate tree with age data in it .. ipython:: python - :okexcept: - simpsons.filter(node.age > 18) + simpsons = DataTree.from_dict( + d={ + "/": xr.Dataset({"age": 83}), + "/Herbert": xr.Dataset({"age": 40}), + "/Homer": xr.Dataset({"age": 39}), + "/Homer/Bart": xr.Dataset({"age": 10}), + "/Homer/Lisa": xr.Dataset({"age": 8}), + "/Homer/Maggie": xr.Dataset({"age": 1}), + }, + name="Abe", + ) + simpsons + +.. 
ipython:: python + + def filter(dt, filterfunc): + filtered_nodes = {node.path: node.ds for node in dt.subtree if filterfunc(node)} + return DataTree.from_dict(filtered_nodes, name=dt.root.name) + +.. ipython:: python + + filter(simpsons, lambda node: node["age"] > 18) + + + leaves are either currently living or died out with no descendants Subset only the living leaves of the evolutionary tree? From d4772e38365690c4d66a9dbcc7c70767ffd7f1e8 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Fri, 6 Jan 2023 11:45:36 -0500 Subject: [PATCH 15/32] use new filter method --- docs/source/hierarchical-data.rst | 10 +--------- 1 file changed, 1 insertion(+), 9 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 38f5242f..5fb33551 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -377,15 +377,7 @@ Need to first recreate tree with age data in it .. ipython:: python - def filter(dt, filterfunc): - filtered_nodes = {node.path: node.ds for node in dt.subtree if filterfunc(node)} - return DataTree.from_dict(filtered_nodes, name=dt.root.name) - -.. 
ipython:: python - - filter(simpsons, lambda node: node["age"] > 18) - - + simpsons.filter(lambda node: node["age"] > 18) leaves are either currently living or died out with no descendants From e633b8133b8710fcd4d4b9dc078c7fce5d29d92f Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Fri, 6 Jan 2023 14:02:56 -0500 Subject: [PATCH 16/32] test about filter --- docs/source/hierarchical-data.rst | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 5fb33551..7fbf1d67 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -338,8 +338,8 @@ we can construct a complex tree quickly using the alternative constructor :py:me Manipulating Trees ------------------ -Altering Tree Branches -~~~~~~~~~~~~~~~~~~~~~~ +Moving Tree Branches +~~~~~~~~~~~~~~~~~~~~ pruning, grafting @@ -354,11 +354,11 @@ Save our updated tree out with ``to_dict`` Subsetting Tree Nodes ~~~~~~~~~~~~~~~~~~~~~ -subset, filter - -Filter the Simpsons by age? +We can subset our tree to select only nodes of interest in various ways. -Need to first recreate tree with age data in it +The :py:meth:`DataTree.filter` method can be used to retain only the nodes of a tree that meet a certain condition. +For example, we could recreate the Simpsons' family tree with the ages of each individual, then filter for only the adults. +First let's recreate the tree but with an ``age`` data variable in every node: .. ipython:: python @@ -375,14 +375,22 @@ Need to first recreate tree with age data in it ) simpsons +Now let's filter out the minors: + .. ipython:: python simpsons.filter(lambda node: node["age"] > 18) +The result is a new tree, containing only the nodes matching the condition. leaves are either currently living or died out with no descendants Subset only the living leaves of the evolutionary tree?
+Collapsing Subtrees +~~~~~~~~~~~~~~~~~~~ + +Merge all nodes in one subtree into a single dataset + .. _tree computation: Computation From 487df12ee99234364ca8f771545e84ce8ec25e9c Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Fri, 6 Jan 2023 14:12:00 -0500 Subject: [PATCH 17/32] simple example of mapping over a subtree --- docs/source/hierarchical-data.rst | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 7fbf1d67..6eed6ccf 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -413,6 +413,17 @@ Mapping Custom Functions Over Trees .subtree, map_over_subtree +.. ipython:: python + + def fast_forward(ds: xr.Dataset, years: float) -> xr.Dataset: + """Add some years to the age""" + new_ds = ds.copy() + new_ds["age"] = ds["age"] + years + return new_ds + +.. ipython:: python + + simpsons.map_over_subtree(fast_forward, years=10) .. _multiple trees: From c1bd68c99a5860bb64e1d965e05ea004452b8fc3 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Fri, 6 Jan 2023 14:17:10 -0500 Subject: [PATCH 18/32] ideas for docs on iterating over trees --- docs/source/hierarchical-data.rst | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 6eed6ccf..77f035ed 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -333,6 +333,13 @@ we can construct a complex tree quickly using the alternative constructor :py:me (i.e. the node labelled `"c"` in this case.) This is to help avoid lots of redundant entries when creating deeply-nested trees using :py:meth:`DataTree.from_dict`. +Iterating over trees +~~~~~~~~~~~~~~~~~~~~ + +for loops over ``.subtree`` +rebuilding trees using ``.subtree``, ``.path``, and ``.from_dict`` + + .. 
_manipulating trees: Manipulating Trees From 8b4f705c40a82ad372950cc3bc2b32e89e9e3174 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Fri, 6 Jan 2023 15:00:54 -0500 Subject: [PATCH 19/32] add section on iterating over subtree --- docs/source/hierarchical-data.rst | 37 ++++++++++++++++++++++++++----- 1 file changed, 31 insertions(+), 6 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index a6503e4e..f0576d4a 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -175,7 +175,7 @@ Let's use a different example of a tree to discuss more complex relationships be ] We have used the :py:meth:`~DataTree.from_dict` constructor method as an alternate way to quickly create a whole tree, -and :ref:`filesystem-like syntax `_ (to be explained shortly) to select two nodes of interest. +and :ref:`filesystem paths` (to be explained shortly) to select two nodes of interest. .. ipython:: python @@ -335,12 +335,34 @@ we can construct a complex tree quickly using the alternative constructor :py:me (i.e. the node labelled `"c"` in this case.) This is to help avoid lots of redundant entries when creating deeply-nested trees using :py:meth:`DataTree.from_dict`. +.. _iterating over trees: + Iterating over trees ~~~~~~~~~~~~~~~~~~~~ -for loops over ``.subtree`` -rebuilding trees using ``.subtree``, ``.path``, and ``.from_dict`` +You can iterate over every node in a tree using the :py:class:`~DataTree.subtree` property. +This returns an iterable of nodes, which yields them in depth-first order. + +.. ipython:: python + + for node in vertebrates.subtree: + print(node.path) + +A very useful pattern is to use :py:class:`~DataTree.subtree` in conjunction with the :py:class:`~DataTree.path` property to manipulate the nodes however you wish, +then rebuild a new tree using :py:meth:`DataTree.from_dict()`.
+ +For example, we could keep only the nodes containing data by looping over all nodes, +checking if they contain any data using :py:class:`~DataTree.has_data`, +then rebuilding a new tree using only the paths of those nodes: + +.. ipython:: python + + non_empty_nodes = {node.path: node.ds for node in dt.subtree if node.has_data} + DataTree.from_dict(non_empty_nodes) + +You can see this tree is similar to the ``dt`` object above, except that it is missing the empty nodes ``a/c`` and ``a/c/d``. +(If you want to keep the name of the root node, you will need to add the ``name`` kwarg to :py:class:`from_dict`, i.e. ``DataTree.from_dict(non_empty_nodes, name=dt.root.name)``.) .. _manipulating trees: @@ -360,6 +382,9 @@ Prune when we realise something is in the wrong place? Save our updated tree out with ``to_dict`` +leaves are either currently living or died out with no descendants +Subset only the living leaves of the evolutionary tree? + Subsetting Tree Nodes ~~~~~~~~~~~~~~~~~~~~~ @@ -392,8 +417,8 @@ Now let's filter out the minors: The result is a new tree, containing only the nodes matching the condition. -leaves are either currently living or died out with no descendants -Subset only the living leaves of the evolutionary tree? +(Yes, under the hood :py:meth:`~DataTree.filter` is just syntactic sugar for the pattern we showed you in :ref:`iterating over trees` !) + Collapsing Subtrees ~~~~~~~~~~~~~~~~~~~ @@ -420,7 +445,7 @@ Find total biomass Mapping Custom Functions Over Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.subtree, map_over_subtree +map_over_subtree .. 
ipython:: python From cf5c2c0f05156472c0e65f9d3e98a960bd1301cc Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 17:54:53 -0400 Subject: [PATCH 20/32] text to accompany Simpsons family aging example --- docs/source/hierarchical-data.rst | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 073c59a6..43b76810 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -450,13 +450,22 @@ Find total biomass Mapping Custom Functions Over Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -map_over_subtree +You can map custom computation over the nodes in a tree using :py:func:`map_over_subtree`. +You can map any function, so long as it takes `xarray.Dataset` objects as one (or more) of the input arguments, +and returns one (or more) xarray datasets. + +.. note:: + + Functions passed to :py:func:`map_over_subtree` cannot alter nodes in-place. + Instead they must return new `xarray.Dataset` objects. + +For example, we can alter the ages of the entire Simpson family at once .. 
ipython:: python def fast_forward(ds: xr.Dataset, years: float) -> xr.Dataset: - """Add some years to the age""" - new_ds = ds.copy() + """Add some years to the age variable""" + new_ds = ds.copy() # (necessary because we cannot alter dt.ds in-place) new_ds["age"] = ds["age"] + years return new_ds From 23b9d2108e399c111b121329eb268ca4ac3d35d0 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 18:54:06 -0400 Subject: [PATCH 21/32] add voltage dataset --- docs/source/hierarchical-data.rst | 89 ++++++++++++++++++++++++++++--- 1 file changed, 81 insertions(+), 8 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 43b76810..331ec911 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -430,22 +430,85 @@ Collapsing Subtrees Merge all nodes in one subtree into a single dataset +Find total number of species +Find total biomass + .. _tree computation: Computation ----------- -Operations on Trees -~~~~~~~~~~~~~~~~~~~ +`DataTree` objects are also useful for performing computations, not just for organizing data. -Mapping of methods +Operations and Methods on Trees +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Arithmetic +To show how applying operations across a whole tree at once can be useful, +let's first create a example scientific dataset. -cause all Simpsons to age simultaneously +.. 
ipython:: python -Find total number of species -Find total biomass + def time_stamps(n_samples, T): + """Create an array of evenly-spaced time stamps""" + return xr.DataArray(data=np.linspace(0, 2 * np.pi * T, n_samples), dims=['time']) + + def signal_generator(t, f, A, phase): + """Generate an example electrical-like waveform""" + return A * np.sin(f * t.data + phase) + + time_stamps1 = time_stamps(n_samples=15, T=1.5) + time_stamps2 = time_stamps(n_samples=10, T=1.0) + + readings = DataTree.from_dict( + { + "/oscilloscope1": xr.Dataset( + { + "potential": ('time', signal_generator(time_stamps1, f=2, A=1.2, phase=0.5)), + "current": ('time', signal_generator(time_stamps1, f=2, A=1.2, phase=1)), + }, + coords={'time': time_stamps1}, + ), + "/oscilloscope2": xr.Dataset( + { + "potential": ('time', signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.2)), + "current": ('time', signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.7)), + }, + coords={'time': time_stamps2}, + ), + } + ) + readings + +Most xarray computation methods also exist as methods on datatree objects, +so you can for example take the mean value of these two timeseries at once: + +.. ipython:: python + + readings.mean(dim='time') + +This works by mapping the standard :py:meth:`xarray.Dataset.mean()` method over the dataset stored in each node of the +tree one-by-one. + +The arguments passed to the method are used for every node, so the values of the arguments you pass might be valid for one node and invalid for another + +.. ipython:: python + :okexcept + + readings.isel(time=12) + +Notice that the error raised helpfully indicates which node of the tree the operation failed on. + +Arithmetic Methods on Trees +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Arithmetic methods are also implemented, so you can e.g. add a scalar to every dataset in the tree at once. +For example, we can advance the timeline of the Simpsons by a decade just by + +.. 
ipython:: python + + simpsons + 10 + +See that the same change (fast-forwarding by adding 10 years to the age of each character) has been applied to every node. Mapping Custom Functions Over Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -459,7 +522,9 @@ and returns one (or more) xarray datasets. Functions passed to :py:func:`map_over_subtree` cannot alter nodes in-place. Instead they must return new `xarray.Dataset` objects. -For example, we can alter the ages of the entire Simpson family at once +RMS voltage + +For example, we could have altered the ages of the entire Simpson family at once using a custom function instead: .. ipython:: python @@ -482,8 +547,16 @@ Comparing trees ~~~~~~~~~~~~~~~ isomorphism +:py:class:`IsomorphismError` + +Arithmetic Between Multiple Trees +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +P = VI Mapping over Multiple Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~ map_over_subtree with binary function +example? +meter readings? \ No newline at end of file From 997ed416664ec7fb0a4bd5910f1465231c11afd3 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 18:58:40 -0400 Subject: [PATCH 22/32] RMS as example of mapping custom computation --- docs/source/hierarchical-data.rst | 19 ++++++------------- 1 file changed, 6 insertions(+), 13 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 331ec911..76f355df 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -513,7 +513,7 @@ See that the same change (fast-forwarding by adding 10 years to the age of each Mapping Custom Functions Over Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -You can map custom computation over the nodes in a tree using :py:func:`map_over_subtree`. +You can map custom computation over each node in a tree using :py:func:`map_over_subtree`. You can map any function, so long as it takes `xarray.Dataset` objects as one (or more) of the input arguments, and returns one (or more) xarray datasets. 
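The mapping behaviour described in this hunk can be sketched without the library. The following is only an illustrative pure-Python model, not the ``datatree`` implementation: the tree is assumed to be a plain dict from node path to a dict of variables, and ``map_over_nodes`` is a hypothetical helper invented for this sketch. It shows the one constraint the docs state, namely that the mapped function returns new data for each node instead of altering the node in place.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a tree: maps a node path to that node's variables.
Tree = Dict[str, Dict[str, float]]


def map_over_nodes(func: Callable[..., Dict[str, float]], tree: Tree, **kwargs) -> Tree:
    """Apply ``func`` to a copy of every node's data and build a new tree."""
    return {path: func(dict(data), **kwargs) for path, data in tree.items()}


def fast_forward(data: Dict[str, float], years: float) -> Dict[str, float]:
    """Return a new mapping with ``years`` added to the ``age`` variable."""
    new_data = dict(data)  # copy, so the input node is left untouched
    new_data["age"] = data["age"] + years
    return new_data


simpsons_ages: Tree = {
    "/": {"age": 83},
    "/Homer": {"age": 39},
    "/Homer/Bart": {"age": 10},
}

# Every node is processed with the same keyword arguments.
older = map_over_nodes(fast_forward, simpsons_ages, years=10)
```

Because the helper builds a fresh dict, ``simpsons_ages`` is unchanged after the call, which mirrors the "must return new objects" rule in the note above.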
@@ -522,21 +522,14 @@ and returns one (or more) xarray datasets. Functions passed to :py:func:`map_over_subtree` cannot alter nodes in-place. Instead they must return new `xarray.Dataset` objects. -RMS voltage +For example, we can calculate the Root Mean Square value of these signals: -For example, we could have altered the ages of the entire Simpson family at once using a custom function instead: +.. ipython:: python -.. ipython:: python - - def fast_forward(ds: xr.Dataset, years: float) -> xr.Dataset: - """Add some years to the age variable""" - new_ds = ds.copy() # (necessary because we cannot alter dt.ds in-place) - new_ds["age"] = ds["age"] + years - return new_ds - -.. ipython:: python + def rms(signal): + return np.sqrt(np.mean(signal**2)) - simpsons.map_over_subtree(fast_forward, years=10) + readings.map_over_subtree(rms) .. _multiple trees: From 0dbe2183a42fe7f6dd5187b8be245880073f6a5c Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 19:16:37 -0400 Subject: [PATCH 23/32] isomorphism --- docs/source/hierarchical-data.rst | 32 ++++++++++++++++++++++++++----- 1 file changed, 27 insertions(+), 5 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 76f355df..f4493257 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -492,7 +492,7 @@ tree one-by-one. The arguments passed to the method are used for every node, so the values of the arguments you pass might be valid for one node and invalid for another .. ipython:: python - :okexcept + :okexcept: readings.isel(time=12) @@ -536,11 +536,33 @@ For example, we can calculate the Root Mean Square value of these signals: Operating on Multiple Trees --------------------------- -Comparing trees -~~~~~~~~~~~~~~~ +The examples so far have involved mapping functions or methods over the nodes of a single tree, +but we can generalize this to mapping functions over multiple trees at once.
+ +Comparing Trees for Isomorphism +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For it to make sense to map a single non-unary function over the nodes of multiple trees at once, +each tree needs to have the same structure. Specifically, two trees can only be considered similar, or "isomorphic", +if they have the same number of nodes, and each corresponding node has the same number of children. +We can check if any two trees are isomorphic using the :py:meth:`DataTree.isomorphic` method. + +.. ipython:: python + :okexcept: + + dt1 = DataTree.from_dict({'a': None, 'a/b': None}) + dt2 = DataTree.from_dict({'a': None}) + dt1.isomorphic(dt2) + + dt3 = DataTree.from_dict({'a': None, 'b': None}) + dt1.isomorphic(dt3) + + dt4 = DataTree.from_dict({'A': None, 'A/B': xr.Dataset({'foo': 1})}) + dt1.isomorphic(dt4) + + -isomorphism -:py:class:`IsomorphismError` +If the trees are not isomorphic a :py:class:`~TreeIsomorphismError` will be raised. +Notice that corresponding tree nodes do not need to have the same name or contain the same data in order to be considered isomorphic. Arithmetic Between Multiple Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From e2aa95fa317aebe0449ee4e8f7f2bfe1b4791068 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 19:22:17 -0400 Subject: [PATCH 24/32] P=IV example of binary multiplication --- docs/source/hierarchical-data.rst | 42 ++++++++++++++++++++++++++----- 1 file changed, 36 insertions(+), 6 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index f4493257..dced066f 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -459,7 +459,7 @@ let's first create a example scientific dataset. time_stamps1 = time_stamps(n_samples=15, T=1.5) time_stamps2 = time_stamps(n_samples=10, T=1.0) - readings = DataTree.from_dict( + voltages = DataTree.from_dict( { "/oscilloscope1": xr.Dataset( { @@ -477,14 +477,14 @@ let's first create a example scientific dataset.
), } ) - readings + voltages Most xarray computation methods also exist as methods on datatree objects, so you can for example take the mean value of these two timeseries at once: .. ipython:: python - readings.mean(dim='time') + voltages.mean(dim='time') This works by mapping the standard :py:meth:`xarray.Dataset.mean()` method over the dataset stored in each node of the tree one-by-one. @@ -494,7 +494,7 @@ The arguments passed to the method are used for every node, so the values of the .. ipython:: python :okexcept: - readings.isel(time=12) + voltages.isel(time=12) Notice that the error raised helpfully indicates which node of the tree the operation failed on. @@ -560,14 +560,44 @@ We can check if any two trees are isomorphic using the :py:meth:`DataTree.isomor dt4 = DataTree.from_dict({'A': None, 'A/B': xr.Dataset({'foo': 1})}) dt1.isomorphic(dt4) - If the trees are not isomorphic a :py:class:`~TreeIsomorphismError` will be raised. Notice that corresponding tree nodes do not need to have the same name or contain the same data in order to be considered isomorphic. Arithmetic Between Multiple Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -P = VI +Arithmetic operations like multiplication are binary operations, so as long as we have two isomorphic trees, +we can do arithmetic between them. + +.. ipython:: python + + currents = DataTree.from_dict( + { + "/oscilloscope1": xr.Dataset( + { + "current": ('time', signal_generator(time_stamps1, f=2, A=1.2, phase=1)), + }, + coords={'time': time_stamps1}, + ), + "/oscilloscope2": xr.Dataset( + { + "current": ('time', signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.7)), + }, + coords={'time': time_stamps2}, + ), + } + ) + currents + + currents.isomorphic(voltages) + +We could use this feature to quickly calculate the electrical power in our signal, P=IV. + +.. 
ipython:: ipython + + power = currents * voltages + power + Mapping over Multiple Trees ~~~~~~~~~~~~~~~~~~~~~~~~~~~ From ccc9957ce12121804cc6bd45ebfb8b007942d35c Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Mon, 23 Oct 2023 23:25:07 +0000 Subject: [PATCH 25/32] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/source/hierarchical-data.rst | 34 +++++++++++++++++++++++-------- 1 file changed, 25 insertions(+), 9 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index dced066f..1775dc3f 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -450,12 +450,16 @@ let's first create a example scientific dataset. def time_stamps(n_samples, T): """Create an array of evenly-spaced time stamps""" - return xr.DataArray(data=np.linspace(0, 2 * np.pi * T, n_samples), dims=['time']) + return xr.DataArray( + data=np.linspace(0, 2 * np.pi * T, n_samples), dims=["time"] + ) + def signal_generator(t, f, A, phase): """Generate an example electrical-like waveform""" return A * np.sin(f * t.data + phase) + time_stamps1 = time_stamps(n_samples=15, T=1.5) time_stamps2 = time_stamps(n_samples=10, T=1.0) @@ -463,17 +467,29 @@ let's first create a example scientific dataset. 
{ "/oscilloscope1": xr.Dataset( { - "potential": ('time', signal_generator(time_stamps1, f=2, A=1.2, phase=0.5)), - "current": ('time', signal_generator(time_stamps1, f=2, A=1.2, phase=1)), + "potential": ( + "time", + signal_generator(time_stamps1, f=2, A=1.2, phase=0.5), + ), + "current": ( + "time", + signal_generator(time_stamps1, f=2, A=1.2, phase=1), + ), }, - coords={'time': time_stamps1}, + coords={"time": time_stamps1}, ), "/oscilloscope2": xr.Dataset( { - "potential": ('time', signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.2)), - "current": ('time', signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.7)), + "potential": ( + "time", + signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.2), + ), + "current": ( + "time", + signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.7), + ), }, - coords={'time': time_stamps2}, + coords={"time": time_stamps2}, ), } ) @@ -484,7 +500,7 @@ so you can for example take the mean value of these two timeseries at once: .. ipython:: python - voltages.mean(dim='time') + voltages.mean(dim="time") This works by mapping the standard :py:meth:`xarray.Dataset.mean()` method over the dataset stored in each node of the tree one-by-one. @@ -604,4 +620,4 @@ Mapping over Multiple Trees map_over_subtree with binary function example? -meter readings? \ No newline at end of file +meter readings? 
From 3c6418b9235204156427287ed2f1c837bb676115 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 19:27:04 -0400 Subject: [PATCH 26/32] remove unfinished sections --- docs/source/hierarchical-data.rst | 33 ------------------------------- 1 file changed, 33 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index dced066f..d59c7eab 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -374,22 +374,6 @@ You can see this tree is similar to the ``dt`` object above, except that it is m Manipulating Trees ------------------ -Moving Tree Branches -~~~~~~~~~~~~~~~~~~~~ - -pruning, grafting - -Tree of life? - -Graft new discoveries onto the tree? - -Prune when we realise something is in the wrong place? - -Save our updated tree out with ``to_dict`` - -leaves are either currently living or died out with no descendants -Subset only the living leaves of the evolutionary tree? - Subsetting Tree Nodes ~~~~~~~~~~~~~~~~~~~~~ @@ -424,15 +408,6 @@ The result is a new tree, containing only the nodes matching the condition. (Yes, under the hood :py:meth:`~DataTree.filter` is just syntactic sugar for the pattern we showed you in :ref:`iterating over trees` !) - -Collapsing Subtrees -~~~~~~~~~~~~~~~~~~~ - -Merge all nodes in one subtree into a single dataset - -Find total number of species -Find total biomass - .. _tree computation: Computation @@ -597,11 +572,3 @@ We could use this feature to quickly calculate the electrical power in our signa power = currents * voltages power - - -Mapping over Multiple Trees -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -map_over_subtree with binary function -example? -meter readings? 
\ No newline at end of file From e102fd96d7457e735f9629da93b4f09359abd7a8 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 19:35:11 -0400 Subject: [PATCH 27/32] fix --- docs/source/hierarchical-data.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index d59c7eab..64cd55eb 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -504,6 +504,9 @@ For example, can calculate the Root Mean Square value of these signals: def rms(signal): return np.sqrt(np.mean(signal**2)) + +.. ipython:: ipython + rms(readings) .. _multiple trees: From 14f366d21e184a0ad62680f346dea1376485c645 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Mon, 23 Oct 2023 23:36:42 +0000 Subject: [PATCH 28/32] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/source/hierarchical-data.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 94ee70f0..d60f7617 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -591,4 +591,3 @@ We could use this feature to quickly calculate the electrical power in our signa power = currents * voltages power - From 7c51b7ba1ff8f8c87d28150662eca7603f926101 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 19:37:31 -0400 Subject: [PATCH 29/32] whatsnew --- docs/source/whats-new.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/source/whats-new.rst b/docs/source/whats-new.rst index 8e62adce..20bd63e3 100644 --- a/docs/source/whats-new.rst +++ b/docs/source/whats-new.rst @@ -38,6 +38,9 @@ Bug fixes Documentation ~~~~~~~~~~~~~ +- Added new sections to page on ``Working with Hierarchical Data`` (:pull:`180`) + By `Tom Nicholas `_. 
+ Internal Changes ~~~~~~~~~~~~~~~~ From 4d3dab50ed0f4c62c2b4cc59d95b33e4ffeeb990 Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 19:54:48 -0400 Subject: [PATCH 30/32] fix2 --- docs/source/hierarchical-data.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index d60f7617..3652b14a 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -513,13 +513,14 @@ and returns one (or more) xarray datasets. Functions passed to :py:func:`map_over_subtree` cannot alter nodes in-place. Instead they must return new `xarray.Dataset` objects. -For example, can calculate the Root Mean Square value of these signals: +For example, we can define a function to calculate the Root Mean Square of a timeseries .. ipython:: ipython def rms(signal): return np.sqrt(np.mean(signal**2)) +Then calculate the RMS value of these signals: .. ipython:: ipython From 96bc0d4966bd47ec1db836620463d098d2f2028f Mon Sep 17 00:00:00 2001 From: Thomas Nicholas Date: Mon, 23 Oct 2023 21:50:54 -0400 Subject: [PATCH 31/32] fix3 --- docs/source/hierarchical-data.rst | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index 3652b14a..dc1e50a4 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -515,14 +515,14 @@ and returns one (or more) xarray datasets. For example, we can define a function to calculate the Root Mean Square of a timeseries -.. ipython:: ipython +.. ipython:: python def rms(signal): return np.sqrt(np.mean(signal**2)) Then calculate the RMS value of these signals: -.. ipython:: ipython +.. ipython:: python rms(readings) @@ -542,7 +542,7 @@ each tree needs to have the same structure. Specifically two trees can only be c if they have the same number of nodes, and each corresponding node has the same number of children. 
We can check if any two trees are isomorphic using the :py:meth:`DataTree.isomorphic` method. -.. ipython:: ipython +.. ipython:: python :okexcept: dt1 = DataTree.from_dict({'a': None, 'a/b': None}) @@ -564,7 +564,7 @@ Arithmetic Between Multiple Trees Arithmetic operations like multiplication are binary operations, so as long as we have two isomorphic trees, we can do arithmetic between them. -.. ipython:: ipython +.. ipython:: python currents = DataTree.from_dict( { @@ -588,7 +588,7 @@ we can do arithmetic between them. We could use this feature to quickly calculate the electrical power in our signal, P=IV. -.. ipython:: ipython +.. ipython:: python power = currents * voltages power From bffdb7bdb4e078f872e0b3fc246c5653b54baae8 Mon Sep 17 00:00:00 2001 From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com> Date: Tue, 24 Oct 2023 01:51:15 +0000 Subject: [PATCH 32/32] [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --- docs/source/hierarchical-data.rst | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/docs/source/hierarchical-data.rst b/docs/source/hierarchical-data.rst index dc1e50a4..7795c9e2 100644 --- a/docs/source/hierarchical-data.rst +++ b/docs/source/hierarchical-data.rst @@ -545,14 +545,14 @@ We can check if any two trees are isomorphic using the :py:meth:`DataTree.isomor .. 
ipython:: python :okexcept: - dt1 = DataTree.from_dict({'a': None, 'a/b': None}) - dt2 = DataTree.from_dict({'a': None}) + dt1 = DataTree.from_dict({"a": None, "a/b": None}) + dt2 = DataTree.from_dict({"a": None}) dt1.isomorphic(dt2) - dt3 = DataTree.from_dict({'a': None, 'b': None}) + dt3 = DataTree.from_dict({"a": None, "b": None}) dt1.isomorphic(dt3) - dt4 = DataTree.from_dict({'A': None, 'A/B': xr.Dataset({'foo': 1})}) + dt4 = DataTree.from_dict({"A": None, "A/B": xr.Dataset({"foo": 1})}) dt1.isomorphic(dt4) If the trees are not isomorphic a :py:class:`~TreeIsomorphismError` will be raised. @@ -570,15 +570,21 @@ we can do arithmetic between them. { "/oscilloscope1": xr.Dataset( { - "current": ('time', signal_generator(time_stamps1, f=2, A=1.2, phase=1)), + "current": ( + "time", + signal_generator(time_stamps1, f=2, A=1.2, phase=1), + ), }, - coords={'time': time_stamps1}, + coords={"time": time_stamps1}, ), "/oscilloscope2": xr.Dataset( { - "current": ('time', signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.7)), + "current": ( + "time", + signal_generator(time_stamps2, f=1.6, A=1.6, phase=0.7), + ), }, - coords={'time': time_stamps2}, + coords={"time": time_stamps2}, ), } )