Skip to content
This repository was archived by the owner on Feb 25, 2025. It is now read-only.

Synchronize main thread and gpu thread for first render frame #9506

Merged
merged 32 commits into from
Jul 10, 2019

Conversation

gaaclarke
Copy link
Member

This PR makes the main thread flush out a frame synchronously to make sure the iOS UI and the GPU rendering are in sync.

Related issue:
flutter/flutter#32937

More work should be done in addition to this to make sure the first frame renders more quickly since there is a noticeable difference between the first and subsequent frames. On an iPhone SE I was experience ~45ms for synchronization on the first frame with the standard flutter project.

flutter::Shell& shell = [_engine.get() shell];
fml::TimeDelta waitTime =
#if NDEBUG
fml::TimeDelta::FromMilliseconds(100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we introducing a new race condition by putting these timeouts here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tvolkert The current state of affairs is we issue the rendering of a frame asynchronously and we don't pay attention to when it is finished and plow right along on the main thread. That is what is causing the first frame to be black. We should be issuing the async render and actually paying attention to when it is done before the main thread continues on. That's what this is doing.

The reason for the timeout is not because I'm afraid of deadlock, it's because our renderer is inefficient to the point where I'm saying "If rendering the first frame takes over 100ms it's better to show a black frame than to make the main thread wait any longer"

We have outstanding work to do to make the first frame render faster. Right now we've gated all our initialization logic on knowing the dimensions of the surface we are rendering to, which is arbitrary and inefficient. In addition to this synchronization we should decouple that initialization that needs dimensions vs initialization that doesn't need dimensions so we can make the first frame even faster. Right now first frame is ~2.5x slower that the second frame. Even when we optimize that, we will still need to synchronize. We can't just fire off an event and not care about when it is finished.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the difference between:

  1. Showing a black frame at 100ms when this times out
  2. Blocking everything until the first frame is rendered (say in 200ms)

To me, they seem equivalent. In both cases, the user won't be able to see any meaningful pixels or interact with the app at 100ms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liyuqian Both aren't desirable. Having the timeout is safer since in iOS if you occupy the main thread for too long the operating system will kill your app. Also, in the case where there is a bug that makes the attempt to request a frame ignored we won't get deadlocked.

Copy link
Contributor

@liyuqian liyuqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch! It would be nice if we can have some unit test that would fail our old code, and pass with this patch. BTW, here's a related first-frame fix in the framework for tracing and drive tests: flutter/flutter#35297

flutter::Shell& shell = [_engine.get() shell];
fml::TimeDelta waitTime =
#if NDEBUG
fml::TimeDelta::FromMilliseconds(100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the difference between:

  1. Showing a black frame at 100ms when this times out
  2. Blocking everything until the first frame is rendered (say in 200ms)

To me, they seem equivalent. In both cases, the user won't be able to see any meaningful pixels or interact with the app at 100ms.

Copy link
Member

@chinmaygarde chinmaygarde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Request Changes" feedback is specifically for the deadlock in the single threaded configuration and the lack of tests (I suggest adding the same to the shell_unittests target).

But, I also have an objection to blocking the main thread on behalf of the application (for what you have already mentioned is an inordinate amount of time). Blocking the main thread is bad practice in almost all scenarios and I wouldn’t want the Flutter engine responsible for blocking the users main thread unnecessarily. In the add-to-app scenarios, if the Flutter application is part of a scrolling view as a table view item, blocking the main thread would make the other non-Flutter UI component stutter for example.

The ideal non-blocking solution would be to put up a placeholder before the Flutter application is launched, set a callback in PlatformView::SetNextFrameCallback and remove (via a fade) the placeholder when the callback is fired.

If the user absolutely does not blocking behavior, they can introduce the synchronization themselves (via the same callbacks exposed via the embedder API). That way, it is not Flutter blocking the main thread, but the user making a conscious decision about what is best in their specific situation.

@gaaclarke
Copy link
Member Author

@chinmaygarde Anytime any thread synchronization happens with the main thread (mutex, atomic, gcd), the main thread is paused. If we have a hard-fast rule we can't pause the main thread, we'll have to remove all thread synchronization we are already doing.

I think you are thinking about the full app case. For add2app we can't put splash screens or superfluous animations in the middle of our clients apps.

We shouldn't ever take more than 100ms to render a frame. If we do, that's a different bug. Even if we can guarantee the first frame will be 16ms we'd still need thread synchronization to make sure we don't see black. Yuquian is just profiling the render times of first frames, we can't say something is too long to wait for before we have any sense of how long it actually takes.

The timeout says it gated on max 100ms, but from my measurements it's around 50ms for first frame.

gaaclarke added 5 commits July 1, 2019 11:18
necessary (we can shelve that conversation for another day).
of forcing a vsync.  Forcing a vsync didn't help us because we didn't
know the dimensions so init wasn't happening.
@gaaclarke
Copy link
Member Author

@chinmaygarde @liyuqian I refactored this change so it just touches 3 files now. It doesn't schedule any rendering and makes the waiting predicate just wait on the first frame. This should be much easier to consider.

@gaaclarke
Copy link
Member Author

I talked with @chinmaygarde and @liyuqian offline. We decided this is the best default since in the worst case it behaves like it currently does, in the best case it behaves like we wished it always could.

There is work on the roadmap to implement a splash screen similar to Android (which is important since we can't guarantee the first frame will render under a specific time). That way the client can chose to either:

  1. Wait for the first frame to render
  2. Display a splash screen

@gaaclarke gaaclarke requested a review from chinmaygarde July 2, 2019 18:54
Copy link
Contributor

@liyuqian liyuqian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new patch looks a lot nicer and cleaner!

However, a test is still missing to cover the new code, and I think it's too risky to merge this without a proper test. Don't worry if you can't make it today. If I were you, I'll enjoy the vacation before figuring out this non-trivial test.

@gaaclarke gaaclarke requested a review from liyuqian July 2, 2019 23:16
[& waiting_for_first_frame = waiting_for_first_frame_] {
return !waiting_for_first_frame.load();
});
return !success;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks to the tests added, I finally realized that this returns !success instead of success.

This is a little counter-intuitive as most wait_for_something returns success. Maybe create an enum FirstFrameWaitingResult {timedout, successful}; and make that as the return type. That will make it much clearer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree, but I was being consistent with the rest of our API:

bool WaitWithTimeout(TimeDelta timeout);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good finding. I think we probably should change that return type to enum too to reduce confusions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, i've added the Status class as discussed.

fml/status.h Outdated
/// Creates an 'ok' status.
Status();

Status(fml::StatusCode code, fml::StringView message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::string_view here please. We should get rid of fml::StringView now that we can actually use C++17.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added deprecated note to StringView as well.

fml/status.h Outdated
fml::StringView message_;
};

inline Status::Status()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these can be inlined directly in the class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged.

fml/status.h Outdated

/// Google's canonical list of errors, should be in sync with:
/// https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
enum class StatusCode : int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't going to be stuffing these inside messages sent over the wire. This enum doesn't need to be typed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec for Status specifically requires these are int's because of the contract of things like raw_code()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need raw_code either do we? We aren't going to be using these in messages over the wire.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well I was thinking it would be helpful if someone wanted to define their own error code. Maybe we don't want that, I removed it.

fml/status.h Outdated
kUnavailable = 14,
kDataLoss = 15,
kUnauthenticated = 16,
kDoNotUseReservedForFutureExpansionUseDefaultInSwitchInstead_ = 20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not apply to us and none of them require values.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason to go out of our way to be incompatible with absl. These values are just as good as anything else, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not have FML become an agglomeration of things from absl. This was already how some stuff from Base and FXL got in this library with (in some cases) limited use. Let's build and use only the things we need.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

K, done.

fml/status.h Outdated
fml::StringView message() const;

private:
int code_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the enum here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
}

// /// Makes sure that WaitForFirstFrame works if we rendered a frame with the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent single line comment prefixes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// /// single-thread setup.
TEST_F(ShellTest, WaitForFirstFrameInlined) {
Settings settings = CreateSettingsForFixture();
ThreadHost thread_host("io.flutter.test." + GetCurrentTestName() + ".",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You can get rid of the thread host and use GetThreadTaskRunner directly in this unit-test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems inconsistent with the other tests but done.

@@ -271,6 +281,9 @@ class Shell final : public PlatformView::Delegate,
uint64_t next_pointer_flow_id_ = 0;

bool first_frame_rasterized_ = false;
std::atomic<bool> waiting_for_first_frame_ = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: std::atomic_bool

Copy link
Member Author

@gaaclarke gaaclarke Jul 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::atomic<bool> is used later in the file.

fml/status.h Outdated

/// Google's canonical list of errors, should be in sync with:
/// https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
enum class StatusCode : int {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need raw_code either do we? We aren't going to be using these in messages over the wire.

fml/status.h Outdated
kUnavailable = 14,
kDataLoss = 15,
kUnauthenticated = 16,
kDoNotUseReservedForFutureExpansionUseDefaultInSwitchInstead_ = 20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not have FML become an agglomeration of things from absl. This was already how some stuff from Base and FXL got in this library with (in some cases) limited use. Let's build and use only the things we need.

fml/status.h Outdated
namespace fml {

/// Google's canonical list of errors, should be in sync with:
/// https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there should be any expectation that these should be in sync with grpc. We don't want folks converting the raw code from the status objects we use in the engine and send them over the wire.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@gaaclarke gaaclarke requested a review from chinmaygarde July 10, 2019 00:14
@gaaclarke gaaclarke merged commit 9776043 into flutter:master Jul 10, 2019
engine-flutter-autoroll added a commit to flutter/flutter that referenced this pull request Jul 11, 2019
flutter/engine@75387db...49445ce

git log 75387db..49445ce --no-merges --oneline
49445ce FLEViewController/Engine API changes (flutter/engine#9750)
2a79462 Add Fuchsia build CI presubmit steps (flutter/engine#9736)
67cebdb Roll fuchsia/sdk/core/linux-amd64 from KGmm_RIJoXS19zTm2crjM3RYpmp5Y03-fLUeVdylbTYC to ehWVT9QJbC-vFMM6SkkQM9HJ9oITFCws7FC9JnrFq2gC (flutter/engine#9765)
089c740 Roll fuchsia/sdk/core/mac-amd64 from lCQWEeR_Ert7t_qAbMRycwrRyZC-dIprYPyPJzwPmg4C to EYnRdXFT9l-d8Qkz4zeTRXnqfV3KQzpQhoPs1r0-740C (flutter/engine#9759)
b22410e Include SkParagraph headers only when the enable-skshaper flag is on (flutter/engine#9758)
2cd650d Minimal integration with the Skia text shaper module (flutter/engine#9556)
f775f5e Re-enable the Wuffs GIF decoder (flutter/engine#9466)
aca0482 Make all shell unit tests use the OpenGL rasterizer. (flutter/engine#9746)
bc57291 Make FLEViewController&#39;s view an internal detail (flutter/engine#9741)
9776043 Synchronize main thread and gpu thread for first render frame (flutter/engine#9506)
f600ae8 Use libc&#43;&#43; variant of string view and remove the FML variant. (flutter/engine#9737)
564f53f Revert &#34;Improve caching limits for Skia (#9503)&#34; (flutter/engine#9740)
b453d3c libtxt: fix reference counting of SkFontStyleSets held by font asset providers (flutter/engine#9561)
fa7627d Fix backspace crash on Chinese devices (flutter/engine#9734)
56885f7 Let pushColorFilter accept all types of ColorFilters (flutter/engine#9641)
6dccb21 Roll src/third_party/skia 96fdfe0fe88e..af4e7b6cf616 (1 commits) (flutter/engine#9735)
8511d9b Roll fuchsia/sdk/core/mac-amd64 from byM-kyxL4bemlTYNqhKUfJfZoIUrCSzS6XzsFr4n9-MC to lCQWEeR_Ert7t_qAbMRycwrRyZC-dIprYPyPJzwPmg4C (flutter/engine#9742)
b3bf0a1 Roll fuchsia/sdk/core/linux-amd64 from I2Qe1zxgckzIzMBTztvzeWYsDgcb9Fw-idSI16oIlx8C to KGmm_RIJoXS19zTm2crjM3RYpmp5Y03-fLUeVdylbTYC (flutter/engine#9743)
7e56823 Fix windows test by not attempting to open a directory as a file. (flutter/engine#9745)
6cf0d13 Roll src/third_party/skia a3ffaabcc4f2..96fdfe0fe88e (5 commits) (flutter/engine#9731)
49a00ae Fix Fuchsia build. (flutter/engine#9730)
b3bb39b Roll src/third_party/dart 06c3d7ad3a...09fc76bc51 (flutter/engine#9728)
2284210 Make the license script compatible with recently changed Dart I/O stream APIs (flutter/engine#9725)
ad582b5 Rework image &amp; texture management to use concurrent message queues. (flutter/engine#9486)
1dcd5f5 Roll src/third_party/skia 6b82cf638682..a3ffaabcc4f2 (24 commits) (flutter/engine#9726)
129979c Revert &#34;Roll src/third_party/dart 06c3d7ad3a..7acecda2cc (12 commits)&#34; (flutter/engine#9724)
8020d7e Roll src/third_party/skia 56065d9b875f..6b82cf638682 (3 commits) (flutter/engine#9718)
e24bd78 Roll src/third_party/dart 06c3d7ad3a..7acecda2cc (12 commits)
3d2668c Reland isolate group changes
802bd15 iOS platform view opacity (flutter/engine#9667)
3b6265b Roll src/third_party/dart b5aeaa6796..06c3d7ad3a (44 commits)
887e052 Refactor ColorFilter to have a native wrapper (flutter/engine#9668)

The AutoRoll server is located here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff ([email protected]), and stop
the roller if necessary.
johnsonmh pushed a commit to johnsonmh/flutter that referenced this pull request Jul 30, 2019
flutter/engine@75387db...49445ce

git log 75387db..49445ce --no-merges --oneline
49445ce FLEViewController/Engine API changes (flutter/engine#9750)
2a79462 Add Fuchsia build CI presubmit steps (flutter/engine#9736)
67cebdb Roll fuchsia/sdk/core/linux-amd64 from KGmm_RIJoXS19zTm2crjM3RYpmp5Y03-fLUeVdylbTYC to ehWVT9QJbC-vFMM6SkkQM9HJ9oITFCws7FC9JnrFq2gC (flutter/engine#9765)
089c740 Roll fuchsia/sdk/core/mac-amd64 from lCQWEeR_Ert7t_qAbMRycwrRyZC-dIprYPyPJzwPmg4C to EYnRdXFT9l-d8Qkz4zeTRXnqfV3KQzpQhoPs1r0-740C (flutter/engine#9759)
b22410e Include SkParagraph headers only when the enable-skshaper flag is on (flutter/engine#9758)
2cd650d Minimal integration with the Skia text shaper module (flutter/engine#9556)
f775f5e Re-enable the Wuffs GIF decoder (flutter/engine#9466)
aca0482 Make all shell unit tests use the OpenGL rasterizer. (flutter/engine#9746)
bc57291 Make FLEViewController&flutter#39;s view an internal detail (flutter/engine#9741)
9776043 Synchronize main thread and gpu thread for first render frame (flutter/engine#9506)
f600ae8 Use libc&flutter#43;&flutter#43; variant of string view and remove the FML variant. (flutter/engine#9737)
564f53f Revert &flutter#34;Improve caching limits for Skia (flutter#9503)&flutter#34; (flutter/engine#9740)
b453d3c libtxt: fix reference counting of SkFontStyleSets held by font asset providers (flutter/engine#9561)
fa7627d Fix backspace crash on Chinese devices (flutter/engine#9734)
56885f7 Let pushColorFilter accept all types of ColorFilters (flutter/engine#9641)
6dccb21 Roll src/third_party/skia 96fdfe0fe88e..af4e7b6cf616 (1 commits) (flutter/engine#9735)
8511d9b Roll fuchsia/sdk/core/mac-amd64 from byM-kyxL4bemlTYNqhKUfJfZoIUrCSzS6XzsFr4n9-MC to lCQWEeR_Ert7t_qAbMRycwrRyZC-dIprYPyPJzwPmg4C (flutter/engine#9742)
b3bf0a1 Roll fuchsia/sdk/core/linux-amd64 from I2Qe1zxgckzIzMBTztvzeWYsDgcb9Fw-idSI16oIlx8C to KGmm_RIJoXS19zTm2crjM3RYpmp5Y03-fLUeVdylbTYC (flutter/engine#9743)
7e56823 Fix windows test by not attempting to open a directory as a file. (flutter/engine#9745)
6cf0d13 Roll src/third_party/skia a3ffaabcc4f2..96fdfe0fe88e (5 commits) (flutter/engine#9731)
49a00ae Fix Fuchsia build. (flutter/engine#9730)
b3bb39b Roll src/third_party/dart 06c3d7ad3a...09fc76bc51 (flutter/engine#9728)
2284210 Make the license script compatible with recently changed Dart I/O stream APIs (flutter/engine#9725)
ad582b5 Rework image &amp; texture management to use concurrent message queues. (flutter/engine#9486)
1dcd5f5 Roll src/third_party/skia 6b82cf638682..a3ffaabcc4f2 (24 commits) (flutter/engine#9726)
129979c Revert &flutter#34;Roll src/third_party/dart 06c3d7ad3a..7acecda2cc (12 commits)&flutter#34; (flutter/engine#9724)
8020d7e Roll src/third_party/skia 56065d9b875f..6b82cf638682 (3 commits) (flutter/engine#9718)
e24bd78 Roll src/third_party/dart 06c3d7ad3a..7acecda2cc (12 commits)
3d2668c Reland isolate group changes
802bd15 iOS platform view opacity (flutter/engine#9667)
3b6265b Roll src/third_party/dart b5aeaa6796..06c3d7ad3a (44 commits)
887e052 Refactor ColorFilter to have a native wrapper (flutter/engine#9668)

The AutoRoll server is located here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff ([email protected]), and stop
the roller if necessary.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants