Description
In some cases, analysis or training needs the global domain of a simulation or workload. When the SmartRedis clients are used in an MPI application, each rank sends its own data to the Orchestrator (Redis/KeyDB) database. A method such as Client.aggregate could be very useful for combining that data back into a global field.
Justification
Online analysis is a primary use case here. For example, let's say we are running a weather simulation and would like to view a global snapshot of the domain every 50 timesteps. Currently, one must write a helper function that retrieves the data from every rank that sent data to the database and recombines it into a global field before it can be analyzed.
Also, future methods like to_xarray become much more useful if a domain can be aggregated first.
Implementation Strategy
This ticket is meant to provide a place for feedback on this issue. The result of this ticket is a design document that will be shared with the community to collect feedback.
I've been thinking about this, however, and have in mind something like the following method:
from smartredis import Client
client = Client(cluster=True)
dataset = client.aggregate("hello_world_rank_*")
This example would collect all the tensors in the database whose names start with hello_world_rank_. Essentially, this is glob syntax for specifying which tensors to retrieve.
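To make the glob idea concrete, here is a minimal sketch of the matching and recombination step. It is not the SmartRedis API: a plain dict stands in for the database, aggregate_tensors is a hypothetical helper name, and it assumes a simple concatenation along one axis, ordered by the integer at the end of each key.

```python
import fnmatch
import re

import numpy as np


def aggregate_tensors(store, pattern, axis=0):
    """Concatenate every tensor in `store` whose key matches a glob pattern.

    `store` is a plain dict standing in for the database (an assumption
    made for illustration; it is not the SmartRedis API).
    """
    keys = [k for k in store if fnmatch.fnmatchcase(k, pattern)]
    if not keys:
        raise KeyError(f"no tensors match {pattern!r}")

    # Sort by the trailing integer so that rank_10 comes after rank_2.
    def rank_of(key):
        match = re.search(r"(\d+)$", key)
        return int(match.group(1)) if match else -1

    keys.sort(key=rank_of)
    return np.concatenate([store[k] for k in keys], axis=axis)


# Three "ranks", each holding a 2x4 slab of a 6x4 global field.
store = {f"hello_world_rank_{r}": np.full((2, 4), r) for r in range(3)}
store["unrelated"] = np.zeros((2, 4))  # ignored: does not match the pattern
global_field = aggregate_tensors(store, "hello_world_rank_*")
```

Sorting on the numeric suffix rather than the raw string matters once there are ten or more ranks, since a lexicographic sort would place rank_10 before rank_2.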
Open questions:
- Does a simple concat of all tensors work for most use cases?
- How does the user specify how the domain should be recombined in more complex cases? (unstructured mesh?)
- Multiple fields in one dataset/aggregate call?
- How can we use the metadata fields of the DataSet object to make this easier for users?
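One possible answer to the metadata question, offered only as a sketch for discussion: if each rank stored the offset of its block within the global domain alongside its tensor, recombination would not have to assume a plain concat. The RankChunk class and assemble function below are hypothetical stand-ins, not part of SmartRedis, and the example covers only a structured, block-decomposed domain.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class RankChunk:
    """Hypothetical stand-in for one rank's DataSet: a tensor plus metadata."""

    data: np.ndarray
    offset: tuple  # index of this chunk's first element in the global field


def assemble(chunks, global_shape):
    """Place each chunk into a global array at the offset its metadata records."""
    # NaN marks any part of the domain that no rank has filled in.
    field = np.full(global_shape, np.nan)
    for chunk in chunks:
        region = tuple(
            slice(start, start + size)
            for start, size in zip(chunk.offset, chunk.data.shape)
        )
        field[region] = chunk.data
    return field


# A 4x4 domain decomposed into four 2x2 blocks on a 2x2 process grid.
chunks = [
    RankChunk(np.full((2, 2), float(rank)), (2 * (rank // 2), 2 * (rank % 2)))
    for rank in range(4)
]
global_field = assemble(chunks, (4, 4))
```

An unstructured mesh would need different metadata (for example, global node indices per rank), so this does not settle that open question.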
When we design this method, it's important to keep in mind that the goal is to be able to convert the result quickly to something like an xarray dataset.
For example:
from smartredis import Client
client = Client(cluster=True)
dataset = client.aggregate("hello_world_rank_*").to_xarray()
dataset.plot() # use xarray methods now.
Acceptance Criteria
- Create a design document with how this aggregation method will work (discussion should take place here)
- Post link to the design document here
- Incorporate feedback and open an issue to implement Client.aggregate
Tagging some people who I think would provide good feedback on this issue.
@ashao @rabernat @mellis13 @nbren12