Skip to content

GS: Merge assembly files in GS#3577

Merged
refractionpcsx2 merged 9 commits intoPCSX2:masterfrom
TellowKrinkle:GSdxMergeASM
Nov 4, 2021
Merged

GS: Merge assembly files in GS#3577
refractionpcsx2 merged 9 commits intoPCSX2:masterfrom
TellowKrinkle:GSdxMergeASM

Conversation

@TellowKrinkle
Copy link
Copy Markdown
Member

@TellowKrinkle TellowKrinkle commented Aug 2, 2020

Replaces 6 copies of the same file ([x86, x64] * [SSE4, AVX, AVX2]) with one slightly more complicated one. I think it'll be easier to maintain this way, but I'm curious as to what other people think

Pros:

  • Only have to edit one file when changing the SW renderer
  • x64 now supports everything x86 does

Cons:

  • Naming the registers with variables can make it less obvious that some registers are the same. (e.g. in a bunch of spots x86 uses one register where x64 uses two, which requires two variables to be able to be different in x64, but makes it less obvious that they're actually the same in x86. An alternative would be to use more split code instead of trying to keep everything on one code path, though I personally would rather not if possible)
  • Lots of macros used to generate boilerplate in xbyak_smart.h as well as to abstract over common differences between our use of assembly in x86 vs x64. I personally don't mind, but I know macros can make things extra hard to understand so please at least look and decide if you think it's OK

Outstanding Issues:

  • While SSE and AVX's comments of C code were all the same, AVX2 is based off of different C code and therefore a lot of the comments are different. A lot of the code with comments from two different C sources, however, is now the same piece of asm for both, so I'm not sure how we want to deal with that. At the moment I haven't transferred over the AVX2 comments at all.
  • I did a terrible job of naming things, especially when they were referring to PS2 hardware things, as I still don't know much about how that actually works (yay for blindly transcribing things). In particular, _rip_local_d is supposed to refer to either m_local.d8 or m_local.d4 (depending on AVX register size), but after realizing m_local also has a d field, that's clearly not a great name. I don't, however, have any clue what would be as I don't know why it's called d in the first place.
  • Windows x64 is still broken, having issues with the display of a bunch of the bios (e.g. the "Sony Computer Entertainment" comes up as a bunch of white squares). I tried a bunch of things to make win64 and macOS 64 closer (e.g. switching all the ifdefs and adding movs to transfer the inputs to the other platform's calling convention) but I both couldn't get Windows x64 to work or macOS x64 to break, so I'm thinking the issue might be somewhere else in GSdx (Windows x64 just crashes in the current version of GSdx so it's never been tested). If anyone has any ideas as to where this issue might be please let me know Fixed

Also, since this is a full rewrite of the SW renderer's assembly routines, lots of testing is needed

@TellowKrinkle
Copy link
Copy Markdown
Member Author

Wait are we building GSdx in C++03 mode? That's the only way to make this error... https://gcc.godbolt.org/z/c8oj9W

@TellowKrinkle
Copy link
Copy Markdown
Member Author

TellowKrinkle commented Aug 2, 2020

Never mind this is the issue https://gcc.godbolt.org/z/a4rvYz

@rspretty
Copy link
Copy Markdown

Works almost perfect with my win8.1 x64 machine! Great work!
Tested both x32 and x64 cmake build.
I've found minor graphics gritch in SOTC(NTSC-J). Color of water surface gets pruple.
Sorry for this post becomes just a report, ASM part of the code is far more difficult than I can read.
sotc-werid-palette

@TellowKrinkle
Copy link
Copy Markdown
Member Author

GSdump please

@rspretty
Copy link
Copy Markdown

Thank you.
This is a dump of above scene.
https://sidenet.ddo.jp/pcsx2dumps/mergeasm-sotc-purple-texture.gs.gz
Same scene dump running on master branch.
https://sidenet.ddo.jp/pcsx2dumps/master-sotc-same-scene.gs.gz

@TellowKrinkle
Copy link
Copy Markdown
Member Author

@rspretty please check and confirm that the issue is fixed

@rspretty
Copy link
Copy Markdown

Confirmed!
Works like a charm. Awesome!!

@Ziemas
Copy link
Copy Markdown
Contributor

Ziemas commented Oct 15, 2021

Yup, this is working fine for me as well now.

@JordanTheToaster
Copy link
Copy Markdown
Member

Checked through my game collection on 64 bit software AVX2 Windows and all seems good to me no issues present on any of the games I played.

@Ziemas
Copy link
Copy Markdown
Contributor

Ziemas commented Oct 23, 2021

I found out this doesn't build with GCC 11.1.0.

/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp: In lambda function:
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:1774:105: error: ‘i’ is not a constant expression
 1774 |                                 return ptr[_g_const + offsetof(GSScanlineConstantData, m_log2_coef_128b[i])];
      |                                                                                                         ^
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:1774:64: warning: ‘offsetof’ within non-standard-layout type ‘GSScanlineConstantData’ is conditionally-supported [-Winvalid-offsetof]
 1774 |                                 return ptr[_g_const + offsetof(GSScanlineConstantData, m_log2_coef_128b[i])];
      |                                                                ^
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:1776:105: error: ‘i’ is not a constant expression
 1776 |                                 return ptr[_g_const + offsetof(GSScanlineConstantData, m_log2_coef_256b[i])];
      |                                                                                                         ^
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:1776:64: warning: ‘offsetof’ within non-standard-layout type ‘GSScanlineConstantData’ is conditionally-supported [-Winvalid-offsetof]
 1776 |                                 return ptr[_g_const + offsetof(GSScanlineConstantData, m_log2_coef_256b[i])];
      |                                                                ^
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp: In member function ‘void GSDrawScanlineCodeGenerator2::ReadTexelImplLoadTexLOD(int, int)’:
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:3366:62: error: ‘lod’ is not a constant expression
 3366 |         Address lod_addr = m_sel.lcm ? _rip_global(lod.i.u32[lod]) : _rip_local(temp.lod.i.u32[lod]);
      |                                                              ^~~
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:3366:40: note: in expansion of macro ‘_rip_global’
 3366 |         Address lod_addr = m_sel.lcm ? _rip_global(lod.i.u32[lod]) : _rip_local(temp.lod.i.u32[lod]);
      |                                        ^~~~~~~~~~~
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:3366:96: error: ‘lod’ is not a constant expression
 3366 |         Address lod_addr = m_sel.lcm ? _rip_global(lod.i.u32[lod]) : _rip_local(temp.lod.i.u32[lod]);
      |                                                                                                ^~~
/home/ziemas/Development/pcsx2/pcsx2/GS/Renderers/SW/GSDrawScanlineCodeGenerator.all.cpp:3366:70: note: in expansion of macro ‘_rip_local’
 3366 |         Address lod_addr = m_sel.lcm ? _rip_global(lod.i.u32[lod]) : _rip_local(temp.lod.i.u32[lod]);

@RedPanda4552
Copy link
Copy Markdown
Contributor

Tested on Dragon Quest 8, Gran Turismo 4, Ratchet and Clank, NCAA 08, and Star Wars Battlefront 2. No issues, working on 64 bit.

@gtgamer468
Copy link
Copy Markdown

I tested Gran Turismo 3, Gran Turismo 4 Prologue, Midnight Club 3, Tekken 5, Sonic Heroes and NASCAR '09 and all of them are working as intended in x64 SW mode.

@refractionpcsx2
Copy link
Copy Markdown
Member

Tested about 10 games of various graphical effects + BIOS, all looks good :)

@JordanTheToaster
Copy link
Copy Markdown
Member

Tested 12 games including bass master fishing, looks good to me.

@TellowKrinkle TellowKrinkle force-pushed the GSdxMergeASM branch 2 times, most recently from 76f4e34 to a6ca0de Compare October 24, 2021 04:46
@TellowKrinkle
Copy link
Copy Markdown
Member Author

I found out this doesn't build with GCC 11.1.0.

Try now

@Ziemas
Copy link
Copy Markdown
Contributor

Ziemas commented Oct 24, 2021

I found out this doesn't build with GCC 11.1.0.

Try now

Yup, works now.

@lightningterror lightningterror changed the title GSdx: Merge assembly files in GSdx GS: Merge assembly files in GSdx Oct 25, 2021
Copy link
Copy Markdown
Contributor

@lightningterror lightningterror left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright headers are outdated compared to the new ones we use in GS.

@lightningterror lightningterror changed the title GS: Merge assembly files in GSdx GS: Merge assembly files in GS Oct 25, 2021
If codegen throws an exception, it ends up just crashing when you jump to the incompletely-generated code which is kind of useless
@lightningterror
Copy link
Copy Markdown
Contributor

@lightningterror
Copy link
Copy Markdown
Contributor

I see some new warnings in gcc, maybe would be nice to take care of them.
https://github.com/PCSX2/pcsx2/runs/4100022508?check_suite_focus=true#step:9:566

No point, and made it not a standard layout type
@TellowKrinkle
Copy link
Copy Markdown
Member Author

I see some new warnings in gcc, maybe would be nice to take care of them

Fixed, but test and look for new memory leaks

Hopefully no one was relying on GSAlignedClass having a virtual destructor, I made the new destructor protected to catch the obvious issue (someone trying to delete a GSAlignedClass<32>*), but it won't catch a subclass relying that declaration to implicitly make its own destructor virtual (example). I checked the obvious two (GSRenderer and GSDevice) but there could be others. So far it seems like no one else was aware that GSAlignedClass had a virtual destructor, as everyone has been virtual-ing the destructors of their subclasses...

Copy link
Copy Markdown
Contributor

@lightningterror lightningterror left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time for the testers to shine now.

@TellowKrinkle TellowKrinkle marked this pull request as ready for review November 4, 2021 09:50
@JordanTheToaster
Copy link
Copy Markdown
Member

JordanTheToaster commented Nov 4, 2021

Tested on 13 games and showed no issues in any of them. Memory behavior was the same as master and I have found no leaks in my time testing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants