
Benchmarks

Rahmad Afandi edited this page Jun 7, 2026 · 1 revision


Measured by running python benchmark.py on a single machine: rustpy-xlsxwriter is compared against XlsxWriter for Excel output and against the stdlib csv module for CSV output. Absolute times vary with hardware and load; the speedup ratio is the more stable metric.

Input                   | Rows      | RustPy     | Baseline   | Speedup
------------------------|-----------|------------|------------|--------
Records (list of dicts) | 1,000,000 | ~10–14 s   | ~86–119 s  | ~8×
pandas DataFrame        | 1,000,000 | ~4–6 s     | ~30–45 s   | ~7×
polars DataFrame        | 1,000,000 | ~4–6 s     | ~29–39 s   | ~7×
CSV                     | 1,000,000 | ~0.5–0.6 s | ~2.4–3.7 s | ~5–6×
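
The Speedup column can be sanity-checked against the time ranges in the table. The sketch below takes the midpoint of each range and divides the baseline time by the RustPy time. Using midpoints is a choice made here for illustration and is not necessarily how the published ratios were derived.

```python
# Time ranges in seconds, copied from the benchmark table: (low, high).
RESULTS = {
    "Records (list of dicts)": {"rustpy": (10, 14), "baseline": (86, 119)},
    "pandas DataFrame": {"rustpy": (4, 6), "baseline": (30, 45)},
    "polars DataFrame": {"rustpy": (4, 6), "baseline": (29, 39)},
    "CSV": {"rustpy": (0.5, 0.6), "baseline": (2.4, 3.7)},
}


def midpoint(bounds):
    """Return the middle of a (low, high) range."""
    low, high = bounds
    return (low + high) / 2


def speedup(entry):
    """Baseline midpoint divided by RustPy midpoint."""
    return midpoint(entry["baseline"]) / midpoint(entry["rustpy"])


# Rounded to one decimal place for comparison with the table.
SPEEDUPS = {name: round(speedup(entry), 1) for name, entry in RESULTS.items()}

for name, ratio in SPEEDUPS.items():
    print(f"{name}: ~{ratio}x")
```

The midpoints give roughly 8.5×, 7.5×, 6.8×, and 5.5×, which is consistent with the rounded figures in the table.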

Why it's fast

  • Rust core via PyO3 — the row/cell loop runs in native code.
  • Arrow zero-copy for pandas/polars — column data read straight from Arrow buffers, no per-value Python conversion.
  • First-row type caching for records — column types detected once from row 1, then a fast path skips the full type cascade.
  • ryu/itoa/zmij for number→string formatting.
  • Constant-memory mode — Excel is written row-by-row, so memory stays flat even for millions of rows (with generator input).
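
To make the first-row type caching idea concrete, here is a minimal pure-Python sketch. It is not the library's Rust implementation, which is not shown on this page: the names `CASCADE`, `detect_converter`, and `convert_rows` are hypothetical, and the cell formats are placeholders. The point is only the shape of the technique: detect each column's type once from the first row, reuse the cached converter while the type matches, and fall back to the full cascade on a mismatch.

```python
from datetime import date, datetime

# Ordered type cascade: the first matching check wins.
# bool must come before int because bool is a subclass of int in Python.
CASCADE = [
    (lambda v: v is None, lambda v: ""),
    (lambda v: isinstance(v, bool), lambda v: "TRUE" if v else "FALSE"),
    (lambda v: isinstance(v, int), str),
    (lambda v: isinstance(v, float), repr),
    (lambda v: isinstance(v, (date, datetime)), lambda v: v.isoformat()),
    (lambda v: isinstance(v, str), lambda v: v),
]


def detect_converter(value):
    """Slow path: walk the full cascade and return (exact type, converter)."""
    for matches, convert in CASCADE:
        if matches(value):
            return type(value), convert
    return type(value), str  # unknown types fall back to str()


def convert_rows(rows):
    """Yield lists of converted cells, caching column types from the first row.

    `rows` may be any iterable of dicts, including a generator, so rows are
    processed one at a time rather than materialized up front.
    """
    rows = iter(rows)
    try:
        first = next(rows)
    except StopIteration:
        return  # empty input: nothing to yield

    columns = list(first)
    # Type detection happens once per column, using the first row only.
    cache = {col: detect_converter(first[col]) for col in columns}

    def convert_row(row):
        out = []
        for col in columns:
            value = row.get(col)
            cached_type, convert = cache[col]
            if type(value) is not cached_type:
                # Cache miss (e.g. None in a numeric column): use the cascade.
                _, convert = detect_converter(value)
            out.append(convert(value))
        return out

    yield convert_row(first)
    for row in rows:
        yield convert_row(row)
```

For example, `list(convert_rows(records))` on a list of dicts with `id`, `name`, and `score` keys runs the cascade for the first row only; later rows hit the cached converter unless a value's type differs from the one seen in row 1.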

Reproduce

pip install rustpy-xlsxwriter XlsxWriter pandas polars faker
python benchmark.py
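
benchmark.py itself is not reproduced on this page. As a rough idea of how the CSV baseline side of such a comparison could be timed, here is a minimal sketch that uses only the standard library. The function names and the synthetic row shape are assumptions made for illustration, not taken from the actual script.

```python
import csv
import io
import time


def generate_rows(n):
    """Yield n synthetic records lazily, so the full dataset is never held in memory."""
    for i in range(n):
        yield {"id": i, "name": f"user{i}", "score": i * 0.5}


def time_csv_baseline(n_rows, out=None):
    """Write n_rows with the stdlib csv module.

    Returns (elapsed seconds, rows written). Writes to an in-memory buffer
    unless a file-like object is passed as `out`.
    """
    out = out if out is not None else io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id", "name", "score"])
    start = time.perf_counter()
    writer.writeheader()
    written = 0
    for row in generate_rows(n_rows):
        writer.writerow(row)
        written += 1
    return time.perf_counter() - start, written


if __name__ == "__main__":
    elapsed, written = time_csv_baseline(100_000)
    print(f"csv baseline: {written} rows in {elapsed:.3f} s")
```

Timing the same rows through the library under test and dividing the two elapsed times gives the speedup ratio reported in the table.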
