Merged

129 commits
d672b84
add jit-friendly dropout w rate in call
priyakasimbeg Apr 17, 2025
aa25e20
remove nan_to_num conversion
priyakasimbeg May 5, 2025
85a3578
update models with custom dropout layer
priyakasimbeg May 5, 2025
9354079
add functional dropout for criteo, fastmri, and vit
priyakasimbeg May 5, 2025
feb9cc5
add functional dropout for ogbg
priyakasimbeg May 5, 2025
9bba078
modify wmt model for dropout passing
priyakasimbeg May 15, 2025
31f6019
modify wmt model for dropout passing
priyakasimbeg May 15, 2025
e36d294
reformatting and dropout fixes to fastmri and vit
priyakasimbeg May 29, 2025
363da8a
dropout fix for criteo1tb jax
priyakasimbeg May 29, 2025
341bf89
dropout fix for criteo1tb jax
priyakasimbeg May 29, 2025
f0c385b
remove aux dropout option from conformer and from init_model_fn signa…
priyakasimbeg May 29, 2025
7af5c94
add dropout piping for conformer and deepspeech
priyakasimbeg May 31, 2025
cbd065b
pipe dropout through model_fn
priyakasimbeg May 31, 2025
31babfd
fix syntax
priyakasimbeg Jun 4, 2025
95d67db
dropout changes wmt jax
priyakasimbeg Jun 4, 2025
2c96b88
modify dockerfile
priyakasimbeg Jun 4, 2025
54786a6
modify docker build script
priyakasimbeg Jun 4, 2025
9c189ad
Update metrics.py - fix for ogbg pytorch
davidtweedle Jun 5, 2025
246d68e
small fixes
priyakasimbeg Jun 5, 2025
0c8dd14
change docker base image to 12.1.1
priyakasimbeg Jun 6, 2025
a78fa66
update base image
priyakasimbeg Jun 6, 2025
4beef49
add slurm instructions
priyakasimbeg Jun 7, 2025
2b782dd
formatting docs
priyakasimbeg Jun 7, 2025
dfe4fb4
update instructions
priyakasimbeg Jun 7, 2025
f0019ac
small fix
priyakasimbeg Jun 7, 2025
3cb012e
remove aux_dropout from submission_runner.py
priyakasimbeg Jun 7, 2025
fdc956b
Update metrics.py
davidtweedle Jun 9, 2025
6c888df
Update metrics.py
davidtweedle Jun 9, 2025
e4a55ab
Update metrics.py
davidtweedle Jun 9, 2025
07f89a2
Update metrics.py
davidtweedle Jun 9, 2025
3e436c7
Update metrics.py
davidtweedle Jun 9, 2025
b306076
dropout fix criteo, fastmri, vit, conf
Niccolo-Ajroldi Jun 10, 2025
3e7a396
dropout fix deepspeech, ogbg
Niccolo-Ajroldi Jun 11, 2025
e80add4
remove attention_dropout_rate from wmt
Niccolo-Ajroldi Jun 11, 2025
84b1bd1
dropout fix on wmt
Niccolo-Ajroldi Jun 11, 2025
af08bb9
fix dropout, ALL tested
Niccolo-Ajroldi Jun 11, 2025
7a6651a
add dropout equivalence tests
Niccolo-Ajroldi Jun 11, 2025
a7ff3d1
moved custom dropout to pytorch_utils
Niccolo-Ajroldi Jun 11, 2025
f26ab02
remove aux_dropout from pytorch workloads
Niccolo-Ajroldi Jun 11, 2025
8723937
Update submission.py
priyakasimbeg Jun 11, 2025
e0a0e62
criteo rm dropout from init
Niccolo-Ajroldi Jun 12, 2025
1e2f379
criteo rm dropout from init
Niccolo-Ajroldi Jun 12, 2025
f10e3dc
criteo rm dropout from init
Niccolo-Ajroldi Jun 12, 2025
027b053
criteo rm dropout from init
Niccolo-Ajroldi Jun 12, 2025
74c43aa
fastmri rm dropout from init
Niccolo-Ajroldi Jun 12, 2025
64276ef
vit rm dropout at init
Niccolo-Ajroldi Jun 12, 2025
44029d2
vit rm dropout at init
Niccolo-Ajroldi Jun 12, 2025
44ffec1
add default dropout test
Niccolo-Ajroldi Jun 12, 2025
9d12fa6
add default dropout test
Niccolo-Ajroldi Jun 12, 2025
ac45a9f
conformer: rm dropout_rate from init
Niccolo-Ajroldi Jun 12, 2025
31d64f6
rm dropout_rate at init from all workloads
Niccolo-Ajroldi Jun 12, 2025
5e192dd
remove dropout_rate from init_model_fn for all jax workloads
priyakasimbeg Jun 12, 2025
23828cd
remove dropout from model initialization call in submission_runner.py
priyakasimbeg Jun 12, 2025
86b8624
remove dropout check for None and use default instead if not passed
priyakasimbeg Jun 12, 2025
0128c9f
pipe dropout to model_fn, set default in workload
Niccolo-Ajroldi Jun 13, 2025
a7cba1a
remove aux_dropout from pytorch workloads
Niccolo-Ajroldi Jun 13, 2025
05bff91
fix to model_fn default dropout value
priyakasimbeg Jun 13, 2025
d8e39b0
fix to model_fn default dropout_rate
Niccolo-Ajroldi Jun 15, 2025
7a00158
rm models_dropout torch files
Niccolo-Ajroldi Jun 15, 2025
f7d99a6
fixes
priyakasimbeg Jun 17, 2025
4f9a4b3
Merge branch 'dev' into dropout_jax
priyakasimbeg Jun 17, 2025
3a41559
fix reference_algorithm_tests.py
priyakasimbeg Jun 18, 2025
6b6f2a6
Merge pull request #873 from Niccolo-Ajroldi/dropout_pytorch
priyakasimbeg Jun 18, 2025
7c43022
fixes to ogbg and fastmri
priyakasimbeg Jun 18, 2025
894f4fb
fixes to fastmri and deepspeech
priyakasimbeg Jun 18, 2025
0bcf484
fixes to conformer vit
priyakasimbeg Jun 18, 2025
73c2276
conformer and vit fix for dropout refactor
priyakasimbeg Jun 18, 2025
5ff94d2
wmt fixes
priyakasimbeg Jun 18, 2025
9090e43
fix linting
priyakasimbeg Jun 18, 2025
4e69255
formatting
priyakasimbeg Jun 18, 2025
3ac97ae
fix formatting
priyakasimbeg Jun 18, 2025
badf124
fix test
priyakasimbeg Jun 18, 2025
eff3ea1
fix lint errors
priyakasimbeg Jun 18, 2025
f7fd6c7
formatting
priyakasimbeg Jun 18, 2025
8fc4cc5
fix spacing issues
priyakasimbeg Jun 18, 2025
99c3111
formatting
priyakasimbeg Jun 18, 2025
c2f4ed0
formatting
priyakasimbeg Jun 18, 2025
ae8ca68
Merge pull request #864 from mlcommons/dropout_jax
priyakasimbeg Jun 18, 2025
b20f49d
formatting
priyakasimbeg Jun 19, 2025
0ea37ee
fix
priyakasimbeg Jun 19, 2025
594f285
pylint fixes
priyakasimbeg Jun 19, 2025
f14ff8f
isort fixes
priyakasimbeg Jun 19, 2025
2a8586a
pylint fixes
priyakasimbeg Jun 19, 2025
ad36a7c
add dropout tests
priyakasimbeg Jun 21, 2025
d3f25d8
add tests
priyakasimbeg Jun 21, 2025
caacb84
add tests
priyakasimbeg Jun 21, 2025
8b0a125
fix wmt test
priyakasimbeg Jun 21, 2025
6c7d695
remove dropout fix tests
priyakasimbeg Jun 24, 2025
66f5ed3
fix formatting
priyakasimbeg Jun 25, 2025
62b1cc9
remove reference model implementations used for testing
priyakasimbeg Jun 25, 2025
161c264
lint fix
priyakasimbeg Jun 25, 2025
ac76d4f
formatting fixes
priyakasimbeg Jun 25, 2025
e4eacea
fix linting
priyakasimbeg Jun 25, 2025
0f43049
fix linting
priyakasimbeg Jun 25, 2025
a151382
pylint fix
priyakasimbeg Jun 25, 2025
f9fbbab
Merge pull request #875 from mlcommons/dropout_support
priyakasimbeg Jun 25, 2025
78a1409
Formatting, ignore `.eggs/` in yapf
fsschneider Jun 18, 2025
462e8b7
Replace yapf, pylint, isort with ruff
fsschneider Jun 18, 2025
383db7a
Replace pre-commit with ruff
fsschneider Jun 18, 2025
2c28136
Use extend-select instead, and reduce lint rules
fsschneider Jun 18, 2025
999f7a2
Replace linting GH actions with ruff
fsschneider Jun 18, 2025
7b245d6
Add ruff badge
fsschneider Jun 18, 2025
277674c
Update style testing with ruff
fsschneider Jun 18, 2025
830d2c2
Format submission_runner
fsschneider Jun 23, 2025
d84eddf
Format submissions/
fsschneider Jun 23, 2025
fbbeafa
Format scoring/
fsschneider Jun 23, 2025
f4ae9be
Format reference_algorithms/
fsschneider Jun 23, 2025
e5209d1
Format tests/
fsschneider Jun 23, 2025
f026711
Format prize_qualification_baselines/
fsschneider Jun 23, 2025
531c99e
Format datasets/
fsschneider Jun 23, 2025
c34af17
Format algoperf/
fsschneider Jun 23, 2025
9725554
Format docker/
fsschneider Jun 23, 2025
7b18fff
Lint tests/
fsschneider Jun 23, 2025
f34bb6d
Lint submissions/
fsschneider Jun 23, 2025
0aeb545
Remove perf. profile tests as it is only a placeholder
fsschneider Jun 23, 2025
4802dfb
Lint scoring/
fsschneider Jun 23, 2025
5e97e78
Lint prize_qualification_baselines/
fsschneider Jun 23, 2025
4ae5418
Lint datasets/
fsschneider Jun 23, 2025
e3f1b74
Lint reference_algorithms/
fsschneider Jun 23, 2025
566b6d9
Lint algoperf/
fsschneider Jun 23, 2025
e846648
Remove unnecessary isort=off commands
fsschneider Jun 23, 2025
3e425e0
Update Ruff linting rules in pyproject.toml to include additional opt…
fsschneider Jun 23, 2025
8bca401
Add pylint errors to linting rules
fsschneider Jun 23, 2025
09aca7f
Fix formatting
fsschneider Jun 25, 2025
2f3c23c
Merge pull request #871 from davidtweedle/ogbg_fix
priyakasimbeg Jun 26, 2025
f2b4feb
Merge pull request #874 from fsschneider/ruff
priyakasimbeg Jun 26, 2025
631f959
add example sbatch script
priyakasimbeg Jul 31, 2025
db69fce
remove index url for jax installation
priyakasimbeg Jul 31, 2025
51eb65f
revert change in dockerfile on dev
priyakasimbeg Jul 31, 2025
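A large share of the commits above implement one refactor: `dropout_rate` is no longer fixed at model construction (`init_model_fn`) but passed at call time through `model_fn`. The following is an illustrative sketch of call-time inverted dropout in plain Python; the function name and signature are simplified stand-ins, not the repository's actual JAX/PyTorch implementations.

```python
import random


def dropout(x, dropout_rate, rng, train=True):
  """Inverted dropout with the rate supplied per call, not at init."""
  if not train or dropout_rate == 0.0:
    return list(x)
  keep = 1.0 - dropout_rate
  # Scale surviving activations by 1/keep so the expected value
  # matches evaluation mode (no rescaling needed at eval time).
  return [xi / keep if rng.random() < keep else 0.0 for xi in x]
```

Because the rate is an ordinary argument, a submission can change dropout between training steps without rebuilding the model, which is the point of removing `dropout_rate` from `init_model_fn` across workloads.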
39 changes: 10 additions & 29 deletions .github/workflows/linting.yml
@@ -3,54 +3,35 @@ name: Linting
 on: [push, pull_request]

 jobs:
-  pylint:
+  ruff-linting:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
       - name: Set up Python 3.11.10
         uses: actions/setup-python@v2
         with:
           python-version: 3.11.10
-      - name: Install pylint
+      - name: Install ruff
         run: |
           python -m pip install --upgrade pip
-          pip install pylint==2.16.1
-      - name: Run pylint
+          pip install ruff==0.12.0
+      - name: Run ruff linter
         run: |
-          pylint algoperf
-          pylint reference_algorithms
-          pylint prize_qualification_baselines
-          pylint submission_runner.py
-          pylint tests
+          ruff check

-  isort:
+  ruff-formatter:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
       - name: Set up Python 3.11.10
         uses: actions/setup-python@v2
         with:
           python-version: 3.11.10
-      - name: Install isort
+      - name: Install ruff
         run: |
           python -m pip install --upgrade pip
-          pip install isort==5.12.0
-      - name: Run isort
+          pip install ruff==0.12.0
+      - name: Run ruff formatter
         run: |
-          isort . --check --diff
+          ruff format --check

-  yapf:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v2
-      - name: Set up Python 3.11.10
-        uses: actions/setup-python@v2
-        with:
-          python-version: 3.11.10
-      - name: Install yapf
-        run: |
-          python -m pip install --upgrade pip
-          pip install yapf==0.32 toml
-      - name: Run yapf
-        run: |
-          yapf . --diff --recursive
20 changes: 8 additions & 12 deletions .pre-commit-config.yaml
@@ -1,14 +1,10 @@
 repos:
-  - repo: https://github.com/google/yapf
-    rev: v0.32.0
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    # Ruff version.
+    rev: v0.12.0
     hooks:
-      - id: yapf
-        args: ["--in-place", "--parallel", "--verbose", "--recursive"]
-  - repo: https://github.com/pycqa/isort
-    rev: 5.10.1
-    hooks:
-      - id: isort
-  - repo: https://github.com/pycqa/pylint
-    rev: v2.16.1
-    hooks:
-      - id: pylint
+      # Run the linter (don't change files).
+      - id: ruff-check
+      # Run the formatter (don't change files).
+      - id: ruff-format
+        args: ["--check"]
39 changes: 18 additions & 21 deletions README.md
@@ -14,11 +14,11 @@
<a href="https://arxiv.org/abs/2306.07179" target="_blank">Benchmark</a>/<a href="https://openreview.net/forum?id=CtM5xjRSfm" target="_blank">Results</a> Paper
</p>

[![CI](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/CI.yml/badge.svg)](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/CI.yml)
[![Lint](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/linting.yml/badge.svg)](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/linting.yml)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://github.com/mlcommons/algorithmic-efficiency/blob/main/LICENSE.md)
[![Code style: yapf](https://img.shields.io/badge/code%20style-yapf-orange)](https://github.com/google/yapf)
[![Discord](https://dcbadge.vercel.app/api/server/5FPXK7SMt6?style=flat)](https://discord.gg/5FPXK7SMt6)
[![CI Status](https://img.shields.io/github/actions/workflow/status/mlcommons/algorithmic-efficiency/CI.yml?style=flat-square&logo=github&label=CI)](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/CI.yml)
[![Linting Status](https://img.shields.io/github/actions/workflow/status/mlcommons/algorithmic-efficiency/linting.yml?style=flat-square&logo=github&label=Linting)](https://github.com/mlcommons/algorithmic-efficiency/actions/workflows/linting.yml)
[![Code Style Ruff](https://img.shields.io/badge/Code%20Style-Ruff-brightgreen?style=flat-square&logo=ruff)](https://github.com/astral-sh/ruff)
[![GitHub License](https://img.shields.io/github/license/mlcommons/algorithmic-efficiency?style=flat-square&label=License)](LICENSE.md)
[![Discord](https://dcbadge.limes.pink/api/server/5FPXK7SMt6?style=flat-square)](https://discord.gg/5FPXK7SMt6)

---

@@ -28,11 +28,12 @@ Submissions are evaluated based on their "time-to-result", i.e., the wall-clock

---

-> This is the repository for the *AlgoPerf: Training Algorithms benchmark* measuring neural network training speedups due to algorithmic improvements.
+> This is the repository for the _AlgoPerf: Training Algorithms benchmark_ measuring neural network training speedups due to algorithmic improvements.
> It is developed by the [MLCommons Algorithms Working Group](https://mlcommons.org/en/groups/research-algorithms/).
> This repository holds the benchmark code, the benchmark's [**technical documentation**](/docs/DOCUMENTATION.md) and [**getting started guides**](/docs/GETTING_STARTED.md). For a detailed description of the benchmark design, see our [**introductory paper**](https://arxiv.org/abs/2306.07179), for the results of the inaugural competition see our [**results paper**](https://openreview.net/forum?id=CtM5xjRSfm).
>
> **See our [AlgoPerf Leaderboard](https://github.com/mlcommons/submissions_algorithms) for the latest results of the benchmark and to submit your algorithm.**

---

> [!IMPORTANT]
@@ -50,22 +51,21 @@ Submissions are evaluated based on their "time-to-result", i.e., the wall-clock

## Installation

-> [!TIP]
-> **If you have any questions about the benchmark competition or you run into any issues, please feel free to contact us.** Either [file an issue](https://github.com/mlcommons/algorithmic-efficiency/issues), ask a question on [our Discord](https://discord.gg/5FPXK7SMt6) or [join our weekly meetings](https://mlcommons.org/en/groups/research-algorithms/).
+> [!TIP] > **If you have any questions about the benchmark competition or you run into any issues, please feel free to contact us.** Either [file an issue](https://github.com/mlcommons/algorithmic-efficiency/issues), ask a question on [our Discord](https://discord.gg/5FPXK7SMt6) or [join our weekly meetings](https://mlcommons.org/en/groups/research-algorithms/).

You can install this package and dependencies in a [Python virtual environment](/docs/GETTING_STARTED.md#python-virtual-environment) or use a [Docker/Singularity/Apptainer container](/docs/GETTING_STARTED.md#docker) (recommended).
We recommend using a Docker container (or alternatively, a Singularity/Apptainer container) to ensure a similar environment to our scoring and testing environments.
Both options are described in detail in the [**Getting Started**](/docs/GETTING_STARTED.md) document.

-*TL;DR to install the Jax version for GPU run:*
+_TL;DR to install the Jax version for GPU run:_

```bash
pip3 install -e '.[pytorch_cpu]'
pip3 install -e '.[jax_gpu]' -f 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html'
pip3 install -e '.[full]'
```

-*TL;DR to install the PyTorch version for GPU run:*
+_TL;DR to install the PyTorch version for GPU run:_

```bash
pip3 install -e '.[jax_cpu]'
@@ -77,7 +77,7 @@ pip3 install -e '.[full]'

For detailed instructions on developing your own algorithm in the benchmark see the [Getting Started](/docs/GETTING_STARTED.md) document.

-*TL;DR running a JAX workload:*
+_TL;DR running a JAX workload:_

```bash
python3 submission_runner.py \
@@ -89,7 +89,7 @@ python3 submission_runner.py \
--tuning_search_space=reference_algorithms/paper_baselines/adamw/tuning_search_space.json
```

-*TL;DR running a PyTorch workload:*
+_TL;DR running a PyTorch workload:_

```bash
python3 submission_runner.py \
@@ -117,17 +117,15 @@ Our [**Contributing**](/docs/CONTRIBUTING.md) document provides further MLCommon

## License

-The *AlgoPerf* codebase is licensed under the [Apache License 2.0](/LICENSE.md).
+The _AlgoPerf_ codebase is licensed under the [Apache License 2.0](/LICENSE.md).

## Paper and Citing the AlgoPerf Benchmark

-In our paper ["Benchmarking Neural Network Training Algorithms"](http://arxiv.org/abs/2306.07179) we motivate, describe, and justify the *AlgoPerf: Training Algorithms* benchmark.
+In our paper ["Benchmarking Neural Network Training Algorithms"](http://arxiv.org/abs/2306.07179) we motivate, describe, and justify the _AlgoPerf: Training Algorithms_ benchmark.

-If you are using the *AlgoPerf benchmark*, its codebase, baselines, or workloads, please consider citing our paper:
+If you are using the _AlgoPerf benchmark_, its codebase, baselines, or workloads, please consider citing our paper:

-> [Dahl, Schneider, Nado, et al.<br/>
-> **Benchmarking Neural Network Training Algorithms**<br/>
-> *arXiv 2306.07179*](http://arxiv.org/abs/2306.07179)
+> [Dahl, Schneider, Nado, et al.<br/> > **Benchmarking Neural Network Training Algorithms**<br/> > _arXiv 2306.07179_](http://arxiv.org/abs/2306.07179)

```bibtex
@Misc{Dahl2023AlgoPerf,
@@ -139,10 +137,9 @@ If you are using the results from the first *AlgoPerf competition*, please consider citing the results paper, as well as the relevant submissions:
}
```

-If you use the results from the first *AlgoPerf competition*, please consider citing the results paper, as well as the relevant submissions:
+If you use the results from the first _AlgoPerf competition_, please consider citing the results paper, as well as the relevant submissions:

-> [Kasimbeg, Schneider, Eschenhagen, et al.<br/>
-> **Accelerating neural network training: An analysis of the AlgoPerf competition**<br/>
+> [Kasimbeg, Schneider, Eschenhagen, et al.<br/> > **Accelerating neural network training: An analysis of the AlgoPerf competition**<br/>
> ICLR 2025](https://openreview.net/forum?id=CtM5xjRSfm)

```bibtex
2 changes: 1 addition & 1 deletion algoperf/__init__.py
@@ -2,4 +2,4 @@

from ._version import version as __version__

-__all__ = ["__version__"]
+__all__ = ['__version__']