-
Notifications
You must be signed in to change notification settings - Fork 649
fix(win): Address Windows crashes from issue 4641 #4914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fixes AcademySoftwareFoundation#4641 Has something to do with mixing Windows compiler versions, there's a subtle ABI compatibility issue that this sidesteps. Signed-off-by: Larry Gritz <[email protected]>
zachlewis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really equipped to tell if this actually fixes the reported issues, but it can't hurt to try...
I mean, I did copy it verbatim from YOUR FORK. |
|
Who can really say why I do the things I do? 🤷 I think I was pursuing a handful of different possible solutions, based on suggestions I was receiving in the thread and on Slack; but it was never clear to me what would or wouldn't work, since I couldn't reproduce the issue, and didn't have the means to further investigate. |
|
I don't see how it can harm things, and I've seen three places (you, the Issue referenced, and a thread on vfx-platform discussion email) that all suggest this change. |
|
I don't think it can harm things either, but I suspect it's a red herring. Apparently, manually updating the VC runtime solves the issue for those who are having it; so I suspect what we really want to do is link the static MSVC runtime instead of the dynamic: set(CMAKE_MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:DebugDLL>" CACHE STRING "")
cmake_policy(SET CMP0091 NEW) # ensures that CMake obeys the above property |
|
I'll chime in to say that, in the absence of OIIO enforcing a strict MSVC/VC runtime build matrix in some way (which would be a huge and I suspect largely thankless exercise), I think this is the "most correct" solution to the problem. |
|
@nrusch "this is the most correct solution" -- do you mean this PR? |
|
Sorry, yes, I mean this PR.
This would be another good option if it's acceptable from a licensing standpoint and reasonably straightforward, as long as the OIIO wheel doesn't end up shipping any VC runtime DLLs. I'm less familiar with Windows link semantics/quirks than with Linux, but I know there is a flavor of "static linking" on Windows that effectively uses a static shim in front of a DLL, so the end result is that what you end up embedding in your library are sort of proxy symbols to load the "real" symbols from the DLL (which you must also ship). Shipping a runtime DLL with the wheel is something I would try and avoid at all costs, because it basically creates a potential future version of this same sort of binary compatibility issue for any users of the wheel. This is something we've run into internally with e.g. PySide2. |
|
The fix is correct as far as it can be, we ran into this with blender+otio a while ago and this is the best i was able to disect the issue. At the core of the problem is this nugget of wisdom as outlined on the C++ binary compatibility between Visual Studio versions page
we (As in developers in general, not blender specifically) all seemingly have been playing fast and lose with this rule, and have gotten away with it for the most part, this is the first time it actually came around to bite us in the rear, and Microsoft rightfully went "told you so" and pointed to their documentation. using the Linking the crt statically may sound like a appealing option, but you'll have to be a 100000% sure no crt objects will escape (ie no transferring FILE* handles in/out of any public api) as while it is absolutely fine to have two different statically linked crt versions loaded into a single process on windows, when those 2 crt's encounter each others objects it panics and and shoots the process in the face in the most brutal way possible that not even your debugger will have a chance to attach. |
|
I really appreciate the insight, guys!
Ah, now I remember what's going on... I created zachlewis:wheels_static_windows to generate wheels build artifacts for @MrLixm to test, to see if the Here's Liam's test command:
So... if we don't want to do anything too contrived with linking the CRT, I think we need to dig deeper for another escape hatch. |
|
Ah, thanks for mentioning that. That shook another latent memory loose in my brain... Quoting myself from my previous message:
The PySide2 wheel that's available from PyPI does include a copy of the VC runtime and, as you can probably imagine, it's not up to date. That's actually how I originally discovered this mutex compatibility issue... but with OpenColorIO, not OIIO. Because of this, in the test command you posted, the fact that it's importing Now, you tested applying this compiler define to OIIO, but assuming the OIIO wheel includes OCIO, you would need to include the A simple test to confirm this theory is:
|
|
@zachlewis, @LazyDodo, @nrusch: Where are we on this? Is it helpful at least a little and I should merge? Or is it unrelated to the real problem and I should abandon? |
|
I'm so sorry, I must've missed Nathan's reply. Let's definitely give this a shot! It sounds like we may also need to do this for OpenColorIO as well (perhaps in our OCIO recipe for now, and/or in the actual OCIO codebase later). Does that sound reasonable? |
|
I have no wisdom here, the real solution is to upgrade the msvc runtime on the systems #4641 rears its ugly head to whatever oiio was build against, but depending on the runtime environment, that may not always be possible. Given there is no easy solution to the "what msvc runtime version will i be running with"-problem, i'd say lean towards the option the end user loses the least amount of work with and build oiio when targeting pipy with the hatch and hope an incompatibility like this doesn't happen again (but it likely will....) That being said i wouldn't hard-code it like it currently is, there's clearly a performance benefit to be had, otherwise ms would not have broken this for funsies, i'd enable it for the builds that feed into pipy and leave this choice to make for other downstream users of oiio them selves. I mean if you're in control of the runtime environment there's no reason not to ship with this optimization enabled. |
I literally don't know which option you're mean, whether you are telling me to merge or not. Assume that I haven't had substantial Windows experience since the mid 90s, and am basically a child who doesn't understand any of these tradeoffs. I need to be told exactly what to do in the simplest possible language:
|
|
I'm suggesting to build the pipy distribution with |
|
From what I have read, the issue is that fairly recently, MSVS added a new constexpr version of the constructor for std::mutex. Which I guess is important if you must use a mutex in a constexpr context, and maybe it could lead to some minor performance improvement in some cases? But the downside is that if you do it and end up in an environment with mixed runtime library choices, you can get crashes. I feel like making a CMake variable to control it and expecting downstream users to notice that and set it properly is a pretty big ask, unless there is a really compelling advantage to not disabling it when you don't need to. Because if they make the choice wrong, of course they open themselves to the crash, depending on what else it's linked with, right? The thing is, I'm not aware that OIIO ever depends on mutex ctr being constexpr. (Strong supporting evidence: everything was just fine before the constexpr declaration was added to MSVS.) And even if there is a possible small performance improvement, it's a newly added improvement, so the worst case is we run at the same speed as we always did. And maybe the improvement is not even measurable or doesn't apply at all in any of the places we use it. So I'm inclined to just disable it always, to guarantee not getting the crash and save downstream users from guessing wrong. |
|
Fair enough, you can merge it as is then. |
…ndation#4914) Fixes AcademySoftwareFoundation#4641 Has something to do with mixing Windows compiler versions, there's a subtle ABI compatibility issue that this sidesteps. Signed-off-by: Larry Gritz <[email protected]>
…ndation#4914) Fixes AcademySoftwareFoundation#4641 Has something to do with mixing Windows compiler versions, there's a subtle ABI compatibility issue that this sidesteps. Signed-off-by: Larry Gritz <[email protected]>
Fixes #4641
Has something to do with mixing Windows compiler versions, there's a subtle ABI compatibility issue that this sidesteps.