Skip to content

Initialialize AABB Rectangle #467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 1, 2019

Conversation

mzgoddard
Copy link
Contributor

@mzgoddard mzgoddard commented Jun 6, 2019

Resolves

Render related VM performance, such as fencing a sprite inside the scratch canvas.

Proposed Changes

  • Initialize AABB Rectangle with a matrix and a radius
  • Pass bounds as destination parameters to getBounds and like functions

Reason for Changes

  • Initialize AABB Rectangle with a matrix and a radius

Current getAABB creates 4 3D vectors, transforms them by a 4x4 matrix and compares their axes to find the left, right, top, bottom extremes for the AABB. We can break that down into something like 2 function calls, 21 object member reads, 6 writes, 15 mulitplications, 13 additions, 2 created objects, and 5 comparisons per vectors. That is 64 operations per vector. Or 256 operations to construct the AABB.

This version by comparisons uses 6 object members reads, 6 multiplications, 6 additions, and 4 absolute value operations. In total that is 22 operations to construct the AABB.

The main cause in the difference in operations is that the current version transforms 4 vectors, and this version transform the 1 vector into the absolute value of its transformed components. Then the work is further reduced by contraining the transformation to 2 dimensions.

  • Pass bounds as destination parameters to getBounds and like functions

Passing a Rectangle to getBounds functions lets us reuse Rectangles to avoid creating lot of garbage that needs to be cleaned up. By chance we can't reuse a rectangle the functions can create their own to return.

Test Coverage

The existing Drawable units that test getAABB.

Benchmark Data

Chrome inspector

This table was collected from the Chrome Inspector's Javascript Profiler tab running the 14844969 benchmark. The values in this table are the percent of time spent under getFastBounds, which is used by SVGSkin to get the getFenceBounds to calculate the fence position when moving a sprite with motion_changex and other blocks. The values do not add up to 100 as many of the functions are nested in another.

function before after
Drawable.getFastBounds 100.00% 100.00%
Drawable.getAABB 53.12% 0.00%
twgl.m4.transformPoint 50.47% 0.00%
twgl.m4.create 22.87% 0.00%
Drawable.updateMatrix 46.79% 100.00%
Drawable._calculateTransform 17.11% 61.47%
twgl.m4.inverse 14.74% 32.47%
twgl.m4.copy 12.29% 0.00%

Some columns do not report any time spent. getAABB and copy are still being called, but time spent there was not captured by the profiler.

scratch-vm benchmark

The following values were collected the scratchfoundation/scratch-vm#2196 change on scratch-vm.

Highlights

These highlighted benchmarks heavily use motion blocks that depend on the changed code.

edit: The before times in this table by accident do not reflect develop. This means the comparison has increased noise. I am leaving the table here for recording repurposes. This does not affect the chrome inspector table.

project id vm initial state device before (bps) after (bps) difference change
14844969 warm up chrome 1822162 2053080 230918 12.67%
safari 2208468 2426242 217774 9.86%
firefox 1483425 1482271 -1154 -0.08%
chromebook 388260 403689 15429 3.97%
ipad mini 373459 387565 14106 3.78%
pi3b 14287 14944 657 4.60%
ready chrome 2035920 2434038 398118 19.55%
safari 2703129 2740529 37400 1.38%
firefox 1548140 1673696 125556 8.11%
chromebook 540785 553688 12903 2.39%
ipad mini 689922 677317 -12605 -1.83%
pi3b 69637 78292 8655 12.43%
187694931 warm up chrome 2331011 2422272 91261 3.92%
safari 2921810 2874617 -47193 -1.62%
firefox 1212903 1247634 34731 2.86%
chromebook 273176 291418 18242 6.68%
ipad mini 243227 294849 51622 21.22%
ready chrome 2386901 2487692 100791 4.22%
safari 2972850 2940244 -32606 -1.10%
firefox 1276710 1270102 -6608 -0.52%
chromebook 306268 340126 33858 11.06%
ipad mini 249506 333149 83643 33.52%
Full Benchmark Suite

This is a table of the benchmarks with one run each for project id, vm state, device, before and after. As such the noise to signal ratio is high here, so there is a high margin of error in the amount performance changed. Most of the identifiable correlation is in the above highlights table.

edit: The before times in this table by accident do not reflect develop. This means the comparison has increased noise. I am leaving the table here for recording repurposes. This does not affect the chrome inspector table.

project id vm initial state device before (bps) after (bps) difference change
130041250 warm up chrome 1399803 1419712 19909 1.42%
safari 1732517 1690767 -41750 -2.41%
firefox 1320150 1328215 8065 0.61%
chromebook 218574 246146 27572 12.61%
ipad mini 112293 112055 -238 -0.21%
pi3b 3283 3540 257 7.83%
ready chrome 1717275 1836876 119601 6.96%
safari 1900025 2350880 450855 23.73%
firefox 1706225 1689923 -16302 -0.96%
chromebook 431992 430162 -1830 -0.42%
ipad mini 342573 353192 10619 3.10%
pi3b 48458 43920 -4538 -9.36%
14844969 warm up chrome 1822162 2053080 230918 12.67%
safari 2208468 2426242 217774 9.86%
firefox 1483425 1482271 -1154 -0.08%
chromebook 388260 403689 15429 3.97%
ipad mini 373459 387565 14106 3.78%
pi3b 14287 14944 657 4.60%
ready chrome 2035920 2434038 398118 19.55%
safari 2703129 2740529 37400 1.38%
firefox 1548140 1673696 125556 8.11%
chromebook 540785 553688 12903 2.39%
ipad mini 689922 677317 -12605 -1.83%
pi3b 69637 78292 8655 12.43%
173918262 warm up chrome 1905545 1865995 -39550 -2.08%
safari 1843594 1828518 -15076 -0.82%
firefox 895659 911689 16030 1.79%
chromebook 368340 358761 -9579 -2.60%
ipad mini 161303 150228 -11075 -6.87%
ready chrome 1710787 1757966 47179 2.76%
safari 1451462 1384875 -66587 -4.59%
firefox 843498 849241 5743 0.68%
chromebook 473729 468929 -4800 -1.01%
ipad mini 212543 230708 18165 8.55%
155128646 warm up chrome 2206972 2319157 112185 5.08%
safari 2477491 2508047 30556 1.23%
firefox 927701 965063 37362 4.03%
chromebook 235365 241038 5673 2.41%
ipad mini 90149 89574 -575 -0.64%
ready chrome 857572 923222 65650 7.66%
safari 777014 710031 -66983 -8.62%
firefox 565102 589029 23927 4.23%
chromebook 268654 274338 5684 2.12%
ipad mini 125298 130533 5235 4.18%
89811578 warm up chrome 2505224 2358625 -146599 -5.85%
safari 2692738 2792763 100025 3.71%
firefox 1291667 1346687 55020 4.26%
chromebook 649607 649992 385 0.06%
ipad mini 768176 779387 11211 1.46%
ready chrome 2734624 2778133 43509 1.59%
safari 2865553 2866529 976 0.03%
firefox 1338340 1395615 57275 4.28%
chromebook 747803 748825 1022 0.14%
ipad mini 718363 763516 45153 6.29%
139193539 warm up chrome 2511984 2469733 -42251 -1.68%
safari 3200181 3072939 -127242 -3.98%
firefox 1950797 1984399 33602 1.72%
chromebook 646954 622788 -24166 -3.74%
ipad mini 455522 446547 -8975 -1.97%
ready chrome 2636520 2605132 -31388 -1.19%
safari 3096980 3211043 114063 3.68%
firefox 2116272 2031181 -85091 -4.02%
chromebook 775236 739014 -36222 -4.67%
ipad mini 670051 683807 13756 2.05%
187694931 warm up chrome 2331011 2422272 91261 3.92%
safari 2921810 2874617 -47193 -1.62%
firefox 1212903 1247634 34731 2.86%
chromebook 273176 291418 18242 6.68%
ipad mini 243227 294849 51622 21.22%
ready chrome 2386901 2487692 100791 4.22%
safari 2972850 2940244 -32606 -1.10%
firefox 1276710 1270102 -6608 -0.52%
chromebook 306268 340126 33858 11.06%
ipad mini 249506 333149 83643 33.52%
219313833 warm up chrome 182974 189544 6570 3.59%
safari 120902 131059 10157 8.40%
firefox 123623 123318 -305 -0.25%
chromebook 33198 31862 -1336 -4.02%
ipad mini 20226 21454 1228 6.07%
ready chrome 186524 195019 8495 4.55%
safari 124610 121296 -3314 -2.66%
firefox 129242 136102 6860 5.31%
chromebook 35059 36306 1247 3.56%
ipad mini 22227 23515 1288 5.79%
236115215 warm up chrome 5860 5798 -62 -1.06%
safari 3795 3788 -7 -0.18%
firefox 4712 4700 -12 -0.25%
chromebook 1689 1705 16 0.95%
ipad mini 1327 1514 187 14.09%
ready chrome 5994 5823 -171 -2.85%
safari 3909 3957 48 1.23%
firefox 4818 4873 55 1.14%
chromebook 1656 1709 53 3.20%
ipad mini 1633 1703 70 4.29%
238750909 warm up chrome 91077 91508 431 0.47%
safari 79338 77245 -2093 -2.64%
firefox 110566 122549 11983 10.84%
chromebook 26546 24669 -1877 -7.07%
ipad mini 9096 10238 1142 12.55%
ready chrome 95192 97301 2109 2.22%
safari 76866 79563 2697 3.51%
firefox 127337 130869 3532 2.77%
chromebook 33367 34648 1281 3.84%
ipad mini 7582 7626 44 0.58%

- pass bounds as a destination parameter
- add initFromMatrixRadius
- use initFromMatrixRadius in getAABB
@adroitwhiz
Copy link
Contributor

How does this compare to what I did in #445?

@mzgoddard
Copy link
Contributor Author

@adroitwhiz I think #467 and #445 go together. We won't need to cache vectors for getAABB after #467, but we'll still need to cache vectors for getTransformedHullPoints. Probably still use #467 results arg and default to the cached _AABB instead of creating fresh Rectangles..

@mzgoddard
Copy link
Contributor Author

I've moved the math proof documentation into a jsdoc tutorial at docs/Rectangle-AABB-Matrix.md. It needs some verb tense normalization but it has the proof steps I like.

Copy link
Contributor

@cwillisf cwillisf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
Extra thanks for documenting the math behind this!

@cwillisf cwillisf removed the blocked label Jul 1, 2019
@cwillisf cwillisf merged commit 32063a2 into scratchfoundation:develop Jul 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants