Help wanted
One of the goals of rust_xlsxwriter is fidelity with the xlsx file format generated by Excel. This is achieved using integration tests that take files created in Excel 2007 and compare them, file by file and element by element, with files created using rust_xlsxwriter.
Here is a typical test file, the associated xlsx file and the test runner code.
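For illustration, here is a minimal sketch of the shape of one of these tests. The helper name and paths are hypothetical placeholders, not the actual test runner code linked above; the real comparison unzips both workbooks and diffs the XML:

```rust
use rust_xlsxwriter::{Workbook, XlsxError};

// Hypothetical helper: unzips both workbooks and compares them
// element by element, ignoring insignificant differences such as
// creation timestamps. The real runner's logic is elided here.
fn assert_xlsx_eq(_generated: &str, _reference: &str) {}

#[test]
fn bootstrap01_hello_world() -> Result<(), XlsxError> {
    // Create a file with rust_xlsxwriter...
    let mut workbook = Workbook::new();
    let worksheet = workbook.add_worksheet();
    worksheet.write_string(0, 0, "Hello")?;
    workbook.save("bootstrap01_out.xlsx")?;

    // ...and compare it with the same file created in Excel 2007.
    assert_xlsx_eq("bootstrap01_out.xlsx", "tests/input/bootstrap01.xlsx");
    Ok(())
}
```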
This approach has a number of advantages from a maintenance point of view:
- It allows incremental test-driven development of Excel features.
- It allows bug reports to be replicated quickly in Excel and compared with rust_xlsxwriter.
- It avoids subjective arguments about whether rust_xlsxwriter or some other third-party Excel reading software is correct in its implementation/interpretation of the XLSX file specification, since it uses Excel as the standard.
For the end user, the benefit of output files that are effectively identical to files produced by Excel is the maximum possible interoperability with Excel and with other applications that read XLSX files.
The test suite contains an individual test for each file (although there is sometimes more than one test against the same input file). Each of these tests is compiled into, and run as, its own crate, which means the test suite is slow. For usability reasons I don't want to test more than one xlsx file per file/crate (apart from maybe the grouping scheme outlined below).
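To make the cost concrete: Cargo compiles and links every top-level .rs file under tests/ as its own crate, so the layout is roughly the following (file names are illustrative):

```
tests/
├── bootstrap01.rs    # compiled and linked as its own crate
├── bootstrap02.rs    # compiled and linked as its own crate
└── ...               # ~540 files → ~540 separate test binaries
```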
There are currently ~540 test files and it takes 8+ minutes to run on a 3.2 GHz 6-Core Intel Core i7 with 32GB of fast RAM:
```
$ time cargo test
real    8m36.340s
user    30m34.062s
sys     9m0.802s
```
In the GitHub Actions CI this is currently taking around 18 minutes.
There will eventually be around 800 test files, so the runtime will be ~50% longer.
nextest is a bit faster but not significantly so. The timing also doesn't include the doc tests:
```
$ time cargo nextest run
real    7m45.029s
user    26m44.624s
sys     6m59.271s
```
A few months ago, when the test suite took around 4 minutes, I tried to consolidate the tests into one crate using the advice in the article Delete Cargo Integration Tests. This was significantly faster, by around 5-10x, but didn't allow me to run individual tests (I'm 99% sure). I tried to replicate that setup again to redo the performance testing and to verify the running of individual tests, but failed for reasons related to test refactoring since then.
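For reference, the technique in that article is to replace the per-file test crates with a single integration-test binary that pulls each test file in as a module. A minimal sketch, with illustrative module names:

```rust
// tests/it/main.rs -- the single integration-test binary.
// Cargo compiles and links one crate instead of one per test file.
mod bootstrap01;
mod bootstrap02;
// ... one `mod` declaration per former test file.
```

In principle individual tests can then still be selected with a name filter, e.g. `cargo test --test it bootstrap01`, although as noted I wasn't able to verify this when I last tried the approach.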
For comparison, the Python version's test suite runs 1600 integration and unit tests in 18 seconds. The Perl test suite takes around 3 minutes and the C test suite takes around 5 minutes.
Anyway, to the help wanted: if anyone has any ideas on how the test runtime might be improved, or if you can get the above "Delete Cargo Integration Tests" approach to work again for comparison, let me know. I might be able to come up with a hybrid approach where the tests under development or debug are in their own crates and are moved back to an overall test crate/folder afterwards, roughly as sketched below.
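Such a hybrid layout might look roughly like this (names illustrative):

```
tests/
├── it/
│   ├── main.rs          # mod bootstrap01; mod bootstrap02; ...
│   ├── bootstrap01.rs   # stable tests consolidated into one crate
│   └── bootstrap02.rs
└── new_feature.rs       # test under active development, own crate
```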