docs: new sharding docs #1370
a9ac5dd to 09a1ff7
09a1ff7 to b06c89d
### 4-way Batch & 2-way Model Parallelism

## Related links
I'm not sure this is relevant here, since it's about hardware rather than software (although it's hardware that is closely related to XLA), but I recently came across this very good overview of TPUs: https://henryhmko.github.io/posts/tpu/tpu.html
fc927e0 to 6a2c54f
bf868d3 to 5784970
b5f8b3a to 5b3e7d7
11a85b5 to ea4367d
<!--
TODO describe how arrays are the "global" data arrays, even though the data itself is only stored
on the relevant devices and computation is performed only on devices with the required data
(effectively showing under the hood how execution occurs)
-->

<!--
TODO make a simple Conway's Game of Life or heat equation simulation example using sharding,
to show how a "typical MPI" simulation can be written using sharding.
-->
None of the TODOs, including the ones below, are actually hidden; they show up in the rendered docs: https://enzymead.github.io/Reactant.jl/stable/tutorials/sharding
No description provided.