Aws::SQS::SQSClient::ReceiveMessage reacts to SIGINT in unexpected ways. #1682

@MrMoose

Describe the bug
I maintain a plugin for the Unreal Engine that uses SQS to receive messages. Normally I spawn a thread that uses long polling with a 4-second delay and calls ReceiveMessage() in a loop until the thread is interrupted.

This can happen for various reasons, but this report is about the user pressing Alt+F4. When this happens, the engine and my threads start their teardown routine. This involves setting an atomic interruption flag on the SQS polling thread, waiting for it to return from the 4-second long-poll delay, and then joining it.
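For illustration, the teardown pattern described above looks roughly like the following. This is a minimal sketch, not the plugin's actual code: `FakeLongPoll` and `PollRunner` are hypothetical names, and the sleep only stands in for the blocking `ReceiveMessage()` call so the example runs without the SDK.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a long-polling ReceiveMessage() call.
// It blocks for the whole wait time and reports zero messages.
inline int FakeLongPoll(std::chrono::milliseconds waitTime) {
    std::this_thread::sleep_for(waitTime);
    return 0;
}

// Hypothetical runner: polls in a loop until an atomic flag is set.
class PollRunner {
public:
    explicit PollRunner(std::chrono::milliseconds waitTime)
        : m_waitTime(waitTime), m_thread([this] { Run(); }) {}

    ~PollRunner() { Stop(); }

    // Set the interruption flag, then wait for the in-flight poll to
    // return. This can block for up to one full wait time.
    void Stop() {
        m_interrupted.store(true);
        if (m_thread.joinable()) {
            m_thread.join();
        }
    }

    // Number of polls that have completed so far.
    int Polls() const { return m_polls.load(); }

private:
    void Run() {
        while (!m_interrupted.load()) {
            FakeLongPoll(m_waitTime);  // real code would call ReceiveMessage() here
            ++m_polls;
        }
    }

    std::chrono::milliseconds m_waitTime;
    std::atomic<bool> m_interrupted{false};
    std::atomic<int> m_polls{0};
    std::thread m_thread;  // declared last so the other members exist before Run() starts
};
```

The join in `Stop()` only completes if the blocking call returns on its own, either with a result or with an error. That is the part that no longer happens in the scenario reported here.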

This worked well for almost a year, but now, for reasons still unknown (an engine upgrade or an AWS SDK upgrade, perhaps), it no longer does. Instead, ReceiveMessage appears to be killed somehow, triggering this assertion:

        void OnRequestSucceeded(const Aws::String& serviceName, const Aws::String& requestName, const std::shared_ptr<const Aws::Http::HttpRequest>& request,
                const Aws::Client::HttpResponseOutcome& outcome, const CoreMetricsCollection& metricsFromCore, const Aws::Vector<void*>& contexts)
        {
            assert(s_monitors);
            assert(contexts.size() == s_monitors->size());
            size_t index = 0;
            for (const auto& interface: *s_monitors)
            {
                interface->OnRequestSucceeded(serviceName, requestName, request, outcome, metricsFromCore, contexts[index++]);
            }
        }

This prevents me from joining the thread gracefully. Sadly, I am not sure what caused this behavior. I can't even rule out that it has been like this for a long time and I just did not notice.

SDK version number
1.9.40

Platform/OS/Hardware/Device
Win64 / Windows 10 / various Windows PCs

To Reproduce (observed behavior)

Feel free to look at the code in question here:
https://github.com/MrMoose/CloudConnector/blob/184fb1d664e1335cc33e22fad85e8d052b2c7f01/Source/CloudConnector/Private/AWSPubsubImpl.cpp#L118

When this SQSRunner thread is joined, the join fails because ReceiveMessage() does not return gracefully when the process receives SIGINT. Instead, the assertion above fires.

Expected behavior
ReceiveMessage returns either gracefully after the 4-second long-poll delay or with an error outcome.

Additional context
The SDK is built from source using CMake 3.20.x and Visual Studio 2019 (latest patch). It is a Release DLL build with LTO, because other configurations generally didn't work in the Unreal context. C++17 is set as the standard, as it is in Unreal. The shared runtime is used.
I have tried various versions.

Labels: bug (This issue is a bug.), p2 (This is a standard priority issue)